Professional Development

News Direct – Number 56 | June 2007

Meeting Call Center Forecasting Challenges In A Direct Marketing Operation

By Peter Varisco

BenchmarkPortal and the Center for Customer-Driven Quality at Purdue University have recognized the New York Life Call Center in Tampa (NYL Tampa Operations) as a "Certified Center of Excellence." The call center has been certified for the past three consecutive years, ranking in the top 20 percent among all participating call centers with similar volume and size characteristics.

NYL Tampa Operations operates in a highly dynamic environment. It is a leading direct marketer of life insurance and annuities in the senior market, and it constantly tests new products, new channels of communication, and new markets.

Anticipate the Peaks

Forecasting for a call center like ours, which supports a direct marketing operation, poses special challenges. As our advertisements arrive in home, call volume tends to peak abruptly and then fall off sharply in subsequent weeks. Unless we can anticipate the peaks and valleys, call center managers will struggle to schedule telephone agents to meet fluctuating demand. Adding to the volatility of our call volume, our mail plan has become increasingly complex.

To see this, consider figure 1, which shows weekly inbound call patterns over a three-year period. For 2003, the data consists of a series of peaks and valleys of about equal amplitude, with peaks coming at regular intervals. The overall level of calls was stable throughout the year, except for a few dips during the holiday season. In 2004, our mail plan increased in complexity as we found new direct marketing opportunities. As a result, the overall call volume increased over the 2003 level. Within 2004, the weekly pattern became more erratic and there were level shifts within the year. In 2005, this trend toward a more complex mail plan and more erratic call patterns continued. It has not abated since.

In spite of this, we will demonstrate a simple forecasting approach that effectively anticipates response patterns to direct marketing campaigns, even with a complex mail plan. As we will see, the key to the simple approach is capturing the right data on each response. The challenge for our call center forecasting has been that we lack the key information on calls required for simple methods. So, we've had to step up the complexity of our methods in order to keep up with the complexity of our mail plan. In this article, I will show how NYL Tampa Operations has been meeting these challenges.

Figure 1 Weekly Call Patterns

Weekly call volume 2003-2005. Since the data is proprietary, the vertical axis labels were removed.

Telemarketing Call Patterns are Volatile

Prior to 2004, we forecasted weekly call volume by using workforce management software. The tool allowed us to look back at the calls received in recent weeks, the same week in recent months, and the same week in recent years. All this data was smoothed to produce an extrapolative forecast. Our call center experts could then adjust this forecast as needed. This worked well until 2004. But as we've seen, our call patterns became much more erratic in 2004. A simple look back at previous values was now of limited value. We needed a way to use our mail plan as a direct input into the forecast. We began by decomposing our calls. Here is what we saw.

In our operation, total inbound calls come from two distinct sources. The first source is customer service (CS) calls. In these calls, we provide service to existing customers by answering billing and product questions and by providing general assistance. However, the high degree of volatility in our weekly call pattern is not due to customer service calls.

The second source of total inbound call volume is our telemarketing calls. Each advertising piece we send to prospective customers contains a phone number so that the recipient can contact our Tampa call center for additional information. The resulting calls comprise our telemarketing (TM) call volume. That is, our TM calls are not telephone solicitations, but are inbound calls that are in response to our direct marketing campaigns. Figure 2 shows total call volume decomposed into weekly CS and TM volume for the same period as in figure 1. Clearly, the more volatile series is the TM calls; this is the primary source of our peaks and valleys in total weekly call volume. So, to anticipate the peaks in weekly call volume, we must correctly anticipate the peaks and valleys in our TM calls.

Figure 2 The Volatile Inbound Call Pattern is Primarily from Telemarketing Calls

Left: Weekly customer service call volume 2003-2005. Right: Weekly telemarketing call volume 2003-2005. Since the data is proprietary, the vertical axis labels were removed.

Data Capture Affects Forecasting Method

Let us now focus on methods for forecasting inbound TM calls. In choosing a method our first question is, can we establish a link between a phone call and the specific campaign that generated it? One way to link calls to campaigns is to print a source code on each advertising piece, and then capture that source code on all calls. You may have had the experience where you call to inquire about some direct mail advertisement, and the telephone agent instructs you to read off a sequence of letters and numbers printed on your advertisement before answering your question or taking your order. Although this can inconvenience the caller, it ensures source code capture for every call. As long as the source code is specific to the marketing campaign, the resulting calls can be linked back to that campaign. This is very powerful information for forecasting.

Here is a relatively simple, yet effective method for forecasting weekly responses to direct marketing campaigns when we can link each response back to the specific campaign that generated it. We refer to this method as "building a curve." It applies to forecasting any type of "response" to a campaign, including phone calls, applications for insurance, and the like.

How To Build a Curve

Assumptions:

We know the week that campaigns begin.
Responses carry a time stamp. So, we can determine when they arrived relative to the start of a campaign.
Each response can be linked back to the specific campaign that generated it.

Preliminaries:

First, choose an advertising channel or segment that you want to forecast. For instance, suppose we want to forecast responses from advertisements in a selected publication.
Next, select a specific campaign for that channel that is complete, that is, all the responses for that campaign have arrived. For instance, we might choose the January issue of our publication from the previous year. In that case, call the January issue campaign X.

Step 1: Isolate responses to campaign X using source codes

Let week 1 denote the week that campaign X began. Suppose we received 1,000 responses to campaign X, in total. How do we know that there were exactly 1,000 responses to campaign X? Since we capture a source code on each response, this information links the responses back to campaign X. Without this link, we could not complete step 1.

Step 2: Build a curve

Assume these responses arrived over a four-week period. Since the responses have a time stamp, we know the week in which each response arrived. The top portion of figure 3 shows the frequency of responses for each week of the campaign. Next, convert the data on responses by week into percentages. The sequence of percents comprises the "curve." If we know the number of responses to expect from a campaign, the curve tells us the arrival pattern of those responses.
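Step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name build_curve, the campaign start date, and the ten sample responses are hypothetical, and the shares for weeks 3 and 4 are made up (the text gives only 40 percent and 30 percent for weeks 1 and 2).

```python
from collections import Counter
from datetime import date


def build_curve(campaign_start, response_dates):
    """Return the share of responses arriving in each week of a campaign.

    Week 1 is the week the campaign began, so days 0-6 after the start
    fall in week 1, days 7-13 in week 2, and so on.
    """
    if not response_dates:
        raise ValueError("need at least one response to build a curve")
    weeks = [(d - campaign_start).days // 7 + 1 for d in response_dates]
    counts = Counter(weeks)
    total = len(weeks)
    # Weeks with no responses still get an entry (a share of zero).
    return [counts.get(w, 0) / total for w in range(1, max(counts) + 1)]


# Hypothetical time-stamped responses, all linked to campaign X by source code.
start = date(2006, 1, 2)
responses = (
    [date(2006, 1, 3)] * 4     # week 1
    + [date(2006, 1, 10)] * 3  # week 2
    + [date(2006, 1, 17)] * 2  # week 3 (assumed share)
    + [date(2006, 1, 25)] * 1  # week 4 (assumed share)
)
curve = build_curve(start, responses)
print(curve)
```

The same function works for any response type (calls, applications), as long as each response carries a time stamp and a link to its campaign.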

Step 3: Build a forecast

The bottom portion of figure 3 shows the layout for the forecast. Our forecast horizon is seven weeks. During this time, there will be three campaigns similar to campaign X for which we want a forecast. The bottom portion of figure 3 shows the starting week for each campaign. Two quantities need to be forecasted. First, we must forecast the number of responses to each of the three campaigns. We can produce this forecast by analyzing historical data on responses for campaigns of this type. Second, we must forecast the arrival pattern of those responses. For this, we use our curve.

We project that campaign 1 will produce 1,000 responses. Since our curve tells us that 40 percent occur in week 1 of the campaign, we place 1,000 * 40 percent = 400 responses under the starting week for campaign 1. Again referring to our curve, 30 percent of responses occur in week 2, so we place 1,000 * 30 percent = 300 responses in week 2 of campaign 1, and so on. Turning to campaign 2, we project 2,000 responses in total. Since it starts in Wk2, we place 2,000 * 40 percent = 800 responses in Wk2, the starting week of campaign 2. We then use the curve to complete the arrival of the 2,000 responses for campaign 2 over a four-week period. We do the same for campaign 3. To produce our weekly forecast, we sum the responses across all three campaigns for each of the seven weeks. The forecast is shown in the bottom row of the bottom portion of figure 3.
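The spreadsheet layout described above can be sketched as follows. Several values are assumptions made only for illustration, since figure 3 is not reproduced here: the curve shares for weeks 3 and 4 (20 and 10 percent), the start of campaign 1 in Wk1, and the size and start week of campaign 3. The function name spread is also hypothetical.

```python
def spread(total, start_week, curve_pct, horizon):
    """Place a campaign's projected responses into weeks using the curve.

    start_week is 1-based; responses that would land beyond the horizon
    are dropped from the forecast window.
    """
    row = [0.0] * horizon
    for offset, pct in enumerate(curve_pct):
        week = start_week + offset
        if week <= horizon:
            row[week - 1] = total * pct / 100
    return row


curve_pct = [40, 30, 20, 10]  # weeks 1-2 from the text; weeks 3-4 assumed
horizon = 7
campaigns = [
    (1000, 1),  # campaign 1: 1,000 responses, assumed to start in Wk1
    (2000, 2),  # campaign 2: 2,000 responses, starts in Wk2
    (1500, 4),  # campaign 3: hypothetical size and start week
]

rows = [spread(total, start, curve_pct, horizon) for total, start in campaigns]
# The weekly forecast is the column sum across all campaigns.
forecast = [sum(column) for column in zip(*rows)]

for (total, start), row in zip(campaigns, rows):
    print(f"{total} responses from Wk{start}:", row)
print("Weekly forecast:", forecast)
```

Each row corresponds to one campaign line in the layout, and the final list corresponds to the bottom row of sums.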

Figure 3 How to Build a Curve

The forecast from our curve is plotted in figure 4. The solid line shows our forecast on the left vertical axis. The dashed line shows the projected responses for each campaign on the right vertical axis, plotted in the week the campaign starts. If there is no campaign starting in a given week, we plot zero. We see that the forecast consists of a series of peaks and valleys placed to correspond with the timing of our campaigns, and the number of responses each campaign is expected to produce.

This approach can be carried out in a spreadsheet and is simple enough to be transparent to all users. Moreover, it is well suited to forecasting responses to direct marketing campaigns in that it can anticipate the peaks and valleys, and so managers can adjust resources to meet fluctuating demand. Even if the mail plan is complex, we can decompose all responses into separate channels and segments, forecast each separately, and then perform a bottom-up sum to get our aggregate forecast.
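The bottom-up sum mentioned above amounts to adding the separate channel forecasts week by week. A minimal sketch, with hypothetical channel names and volumes:

```python
def bottom_up(channel_forecasts):
    """Sum per-channel weekly forecasts into one aggregate weekly forecast."""
    lengths = {len(weeks) for weeks in channel_forecasts.values()}
    if len(lengths) != 1:
        raise ValueError("all channel forecasts must cover the same weeks")
    return [sum(week) for week in zip(*channel_forecasts.values())]


# Hypothetical four-week forecasts, each built from its own curve.
channel_forecasts = {
    "direct_mail": [400, 1100, 800, 1100],
    "print": [150, 150, 200, 150],
    "inserts": [50, 40, 60, 50],
}
aggregate = bottom_up(channel_forecasts)
print(aggregate)
```

Keeping the channel forecasts separate until the last step also makes it easier to see which channel is responsible when the aggregate forecast misses.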

Figure 4 Weekly Forecast Using Curves

Curves provide a relatively simple solution to forecasting responses to direct marketing campaigns if we can link every response back to the campaign that generated it. But, what if there is no link between responses and campaigns? At NYL Tampa Operations, we decided not to capture a source code on every TM call. If source codes are so useful for forecasting, why not capture them on each call?

Although useful for forecasting, the primary use of source codes is to credit sales activity to specific channels and specific campaigns. But many of our TM calls do not produce any immediate sales activity. Often, TM calls only involve answering questions about our offers. We refer to these calls as "inquiries." Since we do not do sales reporting on inquiries, there is no need to capture a source code on inquiry calls. Moreover, insisting on source code capture for inquiries requires that the caller read off the source code letters and numbers, and that we verify them, before we answer their questions. Not only would this inconvenience the caller, it would also add to the time it takes to complete the call. So, it would take more staff to handle the same number of calls with no tangible benefit, save easier forecasting.

So, how do we forecast TM calls if we cannot link individual calls to campaigns? The next best thing is to link the weekly flow of calls to campaigns.

Dynamic Regression Provides The Link

Next, we will consider an effective method to link the weekly flow of calls to campaigns without using source codes. Consider figure 5. The solid line shows the TM calls received by week for 2004 along the left vertical axis. This includes TM calls from several channels: direct mail, print publications and collateral inserts. The dashed line shows the total applications generated by each direct mail campaign during the same period on the right vertical axis, plotted in the week the kits arrived in home. If no direct mail kits arrived in home in a given week, we plot zero. So, similar to figure 4, this is a plot of "responses" to each campaign.

Figure 5 Weekly TM Calls and Applications per Direct Mail Campaign for 2004

Weekly telemarketing call volume 2004 (left vertical axis) and applications produced by each direct mail campaign of 2004 (right vertical axis). Since the data is proprietary, the vertical axis labels were removed.

I've chosen to plot the applications from direct mail campaigns because, for our operation, the weekly flow of TM calls is most strongly linked to this type of campaign. This is because our marketing department predetermines the in-home date for each direct mail campaign. This means that all across the country people are receiving our direct mail kits on about the same day. These kits contain a phone number so that recipients can call our Tampa call center with inquiries. The result is that the direct mail kits arrive in home and the phones light up, producing peaks in TM phone volume. Generally, this is not true of other channels, in which the arrival of advertisements in home is often much more spread out. So, it makes sense that the campaigns that are most responsible for our weekly TM peaks and valleys are our direct mail campaigns.

Unlike the previous chart illustrating our curve, here there is no direct link between calls and campaigns. The best we can say is that the time series of applications per direct mail campaign explains much of the variation in the time series of TM calls. For those with statistical training, this notion of "variance explained" suggests regression modeling. Thus, it is dynamic regression, a form of regression suited to time series data, which provides the link.

The dynamic regression models we use are an extension of the Box-Jenkins (ARIMA) forecasting methodology. The cited references provide an authoritative discussion of the theory of ARIMA modeling and dynamic regression and provide case studies of successful applications. I will only touch briefly on topics of direct relevance to us.

In the context of modeling telemarketing calls in response to direct marketing campaigns, a simple formulation of a dynamic regression model could look like equation 1 below. We will use this as our starting point and then describe enhancements.

Y_t = C + n₀X_t + n₁X_{t-1} + ... + n_kX_{t-k} + N_t (1)

Where t indexes the week
Y_t = the forecast variable = TM calls in week t
X_t = the regressor variable = quantity of direct mail kits arriving in home in week t

N_t = regression error = Y_t - (C + n₀X_t + n₁X_{t-1} + ... + n_kX_{t-k}) (2)

N_t is assumed to follow an ARIMA process, which is modeled using Box-Jenkins methods

The dynamic regression model accounts for the lagged effect of direct marketing campaigns on TM call patterns.

Many people who respond to direct marketing offers respond right away. Thus, direct marketing campaigns have the greatest positive effect on TM calls in the week that the offers arrive in home. If offers are scheduled to arrive in home in week t, X_t measures the quantity of kits mailed. (If no offers are scheduled to arrive in home in week t, X_t equals zero.) The effect of this quantity of kits on TM calls in week t is measured by the coefficient n₀. Since n₀ is a positive coefficient in our model, the forecast shows an elevation in call volume in week t by the amount n₀X_t.

Some people do not respond right away to direct marketing offers, but wait a week before responding. So, one week after the offers arrive in home we still experience an elevation in call volume. The quantity of kits scheduled to arrive in home one week ago is measured by X_{t-1}. The effect of this quantity of kits on TM calls in week t is measured by the coefficient n₁. Thus, the cumulative elevation in call volume due to the kits arriving in home in weeks t and t-1 is n₀X_t + n₁X_{t-1}.

We continue in this fashion until we have included all past weeks that have a significant impact on call volume in the current week. In this way, the model can account for all recent campaigns and the effect they have on TM calls in the current week. The result is that the terms up to X_{t-k} produce a forecast pattern of peaks and valleys, just as we saw from our curve.
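To make the lagged terms of equation 1 concrete, here is a small sketch that computes the elevation in calls for each week from a series of mailings. The kit quantities and the coefficients are hypothetical; in practice the coefficients would be estimated from historical data rather than chosen by hand.

```python
def lagged_effect(kits, coefs):
    """Elevation in calls per week: coefs[0]*X_t + coefs[1]*X_{t-1} + ...

    kits[t] is the regressor X_t (zero when nothing arrives in home);
    coefs[j] is the weight on the mailing that arrived j weeks earlier.
    """
    effect = []
    for t in range(len(kits)):
        total = 0.0
        for j, weight in enumerate(coefs):
            if t - j >= 0:  # no mailing history before the first week
                total += weight * kits[t - j]
        effect.append(total)
    return effect


kits = [10000, 0, 0, 20000, 0, 0, 0]  # hypothetical in-home quantities by week
coefs = [0.02, 0.01, 0.005]           # hypothetical weights for lags 0, 1, 2
effect = lagged_effect(kits, coefs)
print([round(e) for e in effect])
```

Each mailing produces a peak in its in-home week that tapers off over the following weeks, and overlapping mailings add together, which is the same shape a curve produces.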

The dynamic regression model accounts for other marketing channels not explicitly included as regressor inputs.

In dynamic regression, the regressor variables often do not account for all sources of variation in the forecast variable. Sources of variation not explicitly included in the model as regressors deposit their effect in the regression error N_t. The model accounts for this variation as well by including an ARIMA model for the N_t series. Using this model for the error term, the dynamic regression model can adjust the overall level of the forecast up or down to align with the current level of the TM calls. In this way, our model accounts for those other channels that affect the overall level of calls, but have little impact on peaks and valleys.
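The level adjustment can be illustrated with the simplest possible error model. The sketch below assumes the error N_t follows a first-order autoregression with coefficient phi; this is an assumption made for illustration only, since the article says just that N_t follows an ARIMA process. All numbers and names are hypothetical.

```python
def forecast_next_week(regression_next, last_actual, last_regression, phi):
    """One-step forecast: regression part plus a carried-over share of the error.

    last_actual - last_regression is the most recent regression error
    (equation 2). Under the assumed AR(1) error model, the error expected
    next week is phi times that value.
    """
    last_error = last_actual - last_regression
    return regression_next + phi * last_error


# Last week the regression terms implied 1,300 calls but 1,500 arrived,
# so calls are running above the level the mail plan alone explains.
forecast = forecast_next_week(
    regression_next=1200.0,  # constant plus lagged campaign terms for next week
    last_actual=1500.0,
    last_regression=1300.0,
    phi=0.5,                 # hypothetical AR(1) coefficient
)
print(forecast)
```

With phi between 0 and 1, part of last week's unexplained volume carries into the next forecast, which is how the error model shifts the overall level toward the current level of TM calls.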

Some Enhancements

Equation 1 represents the basic dynamic regression model formulation that we've used at NYL Tampa Operations. However, several enhancements have proven useful.

The volume of calls we receive in response to direct marketing campaigns is not just a function of the quantity of kits mailed, but also of the responsiveness of the households selected for the mailing. The marketing group provides us with projected applications returned from the kits mailed, and we use this measure as X_t, in place of kits mailed.
Our model contains additional inputs to account for those print channels that most contribute to peaks and valleys in the weekly pattern.
We have added regression terms to account for the decrease in calls associated with certain holiday weeks like Thanksgiving and Christmas.

Take It To the Next Level

We decomposed total call volume into CS and TM calls so that we could use our mail plan as a direct input to our TM forecast. As our mail plan has continued to become more complex, we've found that further decompositions are now necessary. With data enhancements, we can now decompose all TM volume into calls we get from distinct advertising channels, based on the phone numbers dialed. Soon we will forecast calls from each channel by linking the weekly TM calls to the campaigns for that same channel. This will produce a stronger link, since the calls and campaigns are from matching channels. Another benefit is that when errors occur, we can better isolate the source of the errors by looking at each channel's forecast separately.

Peter Varisco is Director of Research and Analysis for New York Life Tampa Operations. He can be contacted at pvarisco@newyorklife.com

References

Call center forecasting

Call Center Management Review (Ed) (2000). Call Center Forecasting and Scheduling, Annapolis: ICMI Inc.

Minnucci, Jay (2006). Nano Forecasting: Forecasting Techniques for Short-Time Intervals. Foresight 4, 6-10.

Varisco, Peter (2006). Forecasting Call Flow in a Direct Marketing Environment. Foresight 4, 11-15.

Rickwalder, Dan (2006). Forecasting Weekly Effects of Recurring Irregular Occurrences. Foresight 4, 16-18.

Theory and application of Box-Jenkins (ARIMA) models and dynamic regression models

Box, George E.P., Jenkins, Gwilym M., Reinsel, Gregory C. (1994). Time Series Analysis: Forecasting and Control, Third Edition, Englewood Cliffs: Prentice-Hall.

Liu, L.-M. (2005). Time Series Analysis and Forecasting, River Forest, IL: Scientific Computing Associates Corporation.

Makridakis, S., Wheelwright, S. C., & Hyndman, R. J. (1998). Forecasting: Methods and Applications, 3rd Edition, Hoboken: John Wiley and Sons.

Pankratz, A. (1991). Forecasting with Dynamic Regression Models, New York: John Wiley & Sons.