News Direct – Number 56 | June 2007
Meeting Call Center Forecasting Challenges In A Direct Marketing Operation
By Peter Varisco
BenchmarkPortal and the Center for Customer-Driven Quality at Purdue University have recognized the New York Life Call Center in Tampa (NYL Tampa Operations) as a "Certified Center of Excellence." The call center has been certified for the past three consecutive years, ranking in the top 20 percent among all participating call centers with similar volume and size characteristics.
NYL Tampa Operations operates in a highly dynamic environment. It is a leading direct marketer of life insurance and annuities in the senior market, and it constantly tests new products, new channels of communication, and new markets.
Anticipate the Peaks
Forecasting for a call center like ours that supports a direct marketing operation poses special challenges. As our advertisements arrive in home, call volume tends to peak abruptly and then fall off sharply in subsequent weeks. Unless we can anticipate the peaks and valleys, call center managers will struggle with scheduling telephone agents to meet fluctuating demand. To add to the volatility of our call volume, we have had an increasingly complex mail plan.
To see this, consider figure 1, which shows weekly inbound call patterns over a three-year period. Let's examine this figure closely. For 2003, the data consist of a series of peaks and valleys of about equal amplitude, with peaks coming at regular intervals. The overall level of calls was stable throughout the year, except for a few dips during the holiday season. In 2004, our mail plan increased in complexity as we found new direct marketing opportunities. As a result, the overall call volume increased over the 2003 level. Within 2004, the weekly pattern became more erratic and there were level shifts within the year. In 2005 this trend toward a more complex mail plan and more erratic call patterns continued. It has not abated since.
In spite of this, we will demonstrate a simple forecasting approach that effectively anticipates response patterns to direct marketing campaigns, even with a complex mail plan. As we will see, the key to the simple approach is capturing the right data on each response. The challenge for our call center forecasting has been that we lack the key information on calls required for simple methods. So, we've had to step up the complexity of our methods in order to keep up with the complexity of our mail plan. In this article, I will show how NYL Tampa Operations has been meeting these challenges.
Figure 1 Weekly Call Patterns
Weekly call volume 2003-2005. Since the data is proprietary, the vertical axis labels were removed.
Telemarketing Call Patterns are Volatile
Prior to 2004, we forecasted weekly call volume by using workforce management software. The tool allowed us to look back at the calls received in recent weeks, the same week in recent months, and the same week in recent years. All this data was smoothed to produce an extrapolative forecast. Our call center experts could then adjust this forecast as needed. This worked well until 2004. But as we've seen, our call patterns became much more erratic in 2004. A simple look back at previous values was now of limited value. We needed a way to use our mail plan as a direct input into the forecast. We began by decomposing our calls. Here is what we saw.
In our operation, total inbound calls come from two distinct sources. The first source is customer service (CS) calls. In these calls, we provide service to existing customers by answering billing questions, product questions, and providing general assistance. However, the high degree of volatility in our weekly call pattern is not due to customer service calls.
The second source of total inbound call volume is our telemarketing calls. Each advertising piece we send to prospective customers contains a phone number so that the recipient can contact our Tampa call center for additional information. The resulting calls comprise our telemarketing (TM) call volume. That is, our TM calls are not telephone solicitations, but are inbound calls that are in response to our direct marketing campaigns. Figure 2 shows total call volume decomposed into weekly CS and TM volume for the same period as in figure 1. Clearly, the more volatile series is the TM calls; this is the primary source of our peaks and valleys in total weekly call volume. So, to anticipate the peaks in weekly call volume, we must correctly anticipate the peaks and valleys in our TM calls.
Figure 2 The Volatile Inbound Call Pattern is Primarily from Telemarketing Calls
Left: Weekly customer service call volume 2003-2005. Right: Weekly telemarketing call volume 2003-2005. Since the data is proprietary, the vertical axis labels were removed.
Data Capture Affects Forecasting Method
Let us now focus on methods for forecasting inbound TM calls. In choosing a method our first question is, can we establish a link between a phone call and the specific campaign that generated it? One way to link calls to campaigns is to print a source code on each advertising piece, and then capture that source code on all calls. You may have had the experience where you call to inquire about some direct mail advertisement, and the telephone agent instructs you to read off a sequence of letters and numbers printed on your advertisement before answering your question or taking your order. Although this can inconvenience the caller, it ensures source code capture for every call. As long as the source code is specific to the marketing campaign, the resulting calls can be linked back to that campaign. This is very powerful information for forecasting.
Here is a relatively simple, yet effective method for forecasting weekly responses to direct marketing campaigns when we can link each response back to the specific campaign that generated it. We refer to this method as "building a curve." It applies to forecasting any type of "response" to a campaign, including phone calls, applications for insurance, and the like.
How To Build a Curve
Step 1: Isolate responses to campaign X using source codes
Let week 1 denote the week that campaign X began. Suppose we received 1,000 responses to campaign X, in total. How do we know that there were exactly 1,000 responses to campaign X? Since we capture a source code on each response, this information links the responses back to campaign X. Without this link, we could not complete step 1.
Step 2: Build a curve
Assume these responses arrived over a four-week period. Since the responses have a time stamp, we know the week in which each response arrived. The top portion of figure 3 shows the frequency of responses for each week of the campaign. Next, convert the data on responses by week into percentages. The sequence of percents comprises the "curve." If we know the number of responses to expect from a campaign, the curve tells us the arrival pattern of those responses.
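As a minimal sketch of step 2, the Python snippet below converts responses per week into a curve. The weekly counts of 400, 300, 200, and 100 are hypothetical: they were chosen only to agree with the 1,000 total responses and with the 40 and 30 percent shares used in the worked example of step 3, while the actual values sit in figure 3. The function name build_curve is likewise an assumption made for illustration.

```python
def build_curve(weekly_counts):
    """Convert responses per campaign week into shares of the campaign total."""
    total = sum(weekly_counts)
    if total == 0:
        raise ValueError("campaign has no responses, so no curve can be built")
    return [count / total for count in weekly_counts]


# Hypothetical responses to campaign X in campaign weeks 1 through 4.
campaign_x_counts = [400, 300, 200, 100]

curve = build_curve(campaign_x_counts)
for week, share in enumerate(curve, start=1):
    print(f"campaign week {week}: {share:.0%} of responses")
```

Because the shares sum to one, the curve can be applied to any projected response total for a similar campaign.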
Step 3: Build a forecast
The bottom portion of figure 3 shows the layout for the forecast. Our forecast horizon is seven weeks. During this time, there will be three campaigns similar to campaign X for which we want a forecast. The bottom portion of figure 3 shows the starting week for each campaign. Two quantities need to be forecasted. First, we must forecast the number of responses to each of the three campaigns. We can produce this forecast by analyzing historical data on responses for campaigns of this type. Second, we must forecast the arrival of the response. For this, we use our curve.
We project that campaign 1 will produce 1,000 responses. Since our curve tells us that 40 percent occur in week 1 of the campaign, we place 1,000 * 40 percent = 400 responses under the starting week for campaign 1. Again referring to our curve, 30 percent of responses occur in week 2, so we place 1,000 * 30 percent = 300 responses in week 2 of campaign 1, and so on. Turning to campaign 2, we project 2,000 responses in total. Since it starts in Wk2, we place 2,000 * 40 percent = 800 responses in Wk2, the starting week of campaign 2. We then use the curve to complete the arrival of the 2,000 responses for campaign 2 over a four-week period. We do the same for campaign 3. To produce our weekly forecast, we sum the responses across all three campaigns for each of the seven weeks. The forecast is shown in the bottom row of the bottom portion of figure 3.
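The layering described above can be sketched in a few lines of Python. Only some of the inputs come from the text: the 40 and 30 percent shares, the 1,000 and 2,000 projected responses, and the start weeks of campaigns 1 and 2. The shares for campaign weeks 3 and 4, and the start week and projected total for campaign 3, are assumptions made so the example runs; the actual values are in figure 3. The function name build_forecast is also hypothetical.

```python
def build_forecast(curve, campaigns, horizon):
    """Spread each campaign's projected responses over weeks, then sum by week.

    campaigns is a list of (start_week, projected_responses) pairs,
    with start_week counted from 1.
    """
    forecast = [0.0] * horizon
    for start_week, projected in campaigns:
        for offset, share in enumerate(curve):
            week_index = start_week - 1 + offset
            if week_index < horizon:  # ignore responses beyond the horizon
                forecast[week_index] += projected * share
    return forecast


# 40% and 30% are from the text; 20% and 10% are assumed.
curve = [0.40, 0.30, 0.20, 0.10]

# Campaigns 1 and 2 follow the text; campaign 3 (week 4, 1,500 responses) is assumed.
campaigns = [(1, 1000), (2, 2000), (4, 1500)]

weekly = build_forecast(curve, campaigns, horizon=7)
for week, responses in enumerate(weekly, start=1):
    print(f"Wk{week}: {responses:.0f}")
```

Under these assumed inputs, the forecast peaks at 1,100 responses in Wk2 and again in Wk4, the weeks in which campaigns 2 and 3 start.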
Figure 3 How to Build a Curve
The forecast from our curve is plotted in figure 4. The solid line shows our forecast on the left vertical axis. The dashed line shows the projected responses for each campaign on the right vertical axis, plotted in the week the campaign starts. If there is no campaign starting in a given week, we plot zero. We see that the forecast consists of a series of peaks and valleys placed to correspond with the timing of our campaigns, and the number of responses each campaign is expected to produce.
This approach can be carried out in a spreadsheet and is simple enough to be transparent to all users. Moreover, it is well suited to forecasting responses to direct marketing campaigns in that it can anticipate the peaks and valleys, and so managers can adjust resources to meet fluctuating demand. Even if the mail plan is complex, we can decompose all responses into separate channels and segments, forecast each separately, and then perform a bottom-up sum to get our aggregate forecast.
Figure 4 Weekly Forecast Using Curves
Curves provide a relatively simple solution to forecasting responses to direct marketing campaigns if we can link every response back to the campaign that generated it. But, what if there is no link between responses and campaigns? At NYL Tampa Operations, we decided not to capture a source code on every TM call. If source codes are so useful for forecasting, why not capture them on each call?
Although useful for forecasting, the primary use of source codes is to credit sales activity to specific channels and specific campaigns. But many of our TM calls do not produce any immediate sales activity. Often, TM calls only involve answering questions about our offers. We refer to these calls as "inquiries." Since we do not do sales reporting on inquiries, there is no need to capture a source code on inquiry calls. Moreover, insisting on source code capture for inquiries requires that the caller read off the source code letters and numbers and that we verify this, before we answer their questions. Not only would this inconvenience the caller, it adds to the time it takes to complete the call. So, it would take more staff to handle the same number of calls with no tangible benefit, save easier forecasting.
So, how to forecast TM calls if you cannot link individual calls to campaigns? The next best thing is to link the weekly flow of calls to campaigns.
Dynamic Regression Provides The Link
Next, we will consider an effective method to link the weekly flow of calls to campaigns without using source codes. Consider figure 5. The solid line shows the TM calls received by week for 2004 along the left vertical axis. This includes TM calls from several channels: direct mail, print publications and collateral inserts. The dashed line shows the total applications generated by each direct mail campaign during the same period on the right vertical axis, plotted in the week the kits arrived in home. If no direct mail kits arrived in home in a given week, we plot zero. So, similar to figure 4, this is a plot of "responses" to each campaign.
Figure 5 Weekly TM Calls and Applications per Direct Mail Campaign for 2004
Weekly telemarketing call volume 2004 (left vertical axis) and applications produced by each direct mail campaign of 2004 (right vertical axis). Since the data is proprietary, the vertical axis labels were removed.
I've chosen to plot the applications from direct mail campaigns because, for our operation, the weekly flow of TM calls is most strongly linked to this type of campaign. This is because our marketing department predetermines the in-home date for each direct mail campaign. This means that all across the country people are receiving our direct mail kits on about the same day. These kits contain a phone number so that recipients can call our Tampa call center with inquiries. The result is that when the direct mail kits arrive in home, the phones light up, producing peaks in TM phone volume. Generally, this is not true of other channels, in which the arrival of advertisements in home is often much more spread out. So, it makes sense that the campaigns that are most responsible for our weekly TM peaks and valleys are our direct mail campaigns.
Unlike the previous chart illustrating our curve, here there is no direct link between calls and campaigns. The best we can say is that the time series of applications per direct mail campaign explains much of the variation in the time series of TM calls. For those with statistical training, this notion of "variance explained" suggests regression modeling. Thus, it is dynamic regression, a form of regression suited to time series data, which provides the link.
The dynamic regression models we use are an extension of the Box-Jenkins (ARIMA) forecasting methodology. The cited references provide an authoritative discussion of the theory of ARIMA modeling and dynamic regression and provide case studies of successful applications. I will only touch briefly on topics of direct relevance to us.
In the context of modeling telemarketing calls in response to direct marketing campaigns, a simple formulation of a dynamic regression model could look like equation 1 below. We will use this as our starting point and then describe enhancements.
Y_t = C + n_0 X_t + n_1 X_{t-1} + ... + n_k X_{t-k} + N_t    (1)
N_t = regression error = Y_t - (C + n_0 X_t + n_1 X_{t-1} + ... + n_k X_{t-k})    (2)
N_t is assumed to follow an ARIMA process, which is modeled using Box-Jenkins methods.
The dynamic regression model accounts for the lagged effect of direct marketing campaigns on TM call patterns.
Many people who respond to direct marketing offers respond right away. Thus, direct marketing campaigns have the greatest positive effect on TM calls in the week that the offers arrive in home. If offers are scheduled to arrive in home in week t, X_t measures the quantity of kits mailed. (If no offers are scheduled to arrive in home in week t, X_t equals zero.) The effect of this quantity of kits on TM calls in week t is measured by the coefficient n_0. Since n_0 is a positive coefficient in our model, the forecast shows an elevation in call volume in week t by the amount n_0 X_t.
Some people do not respond right away to direct marketing offers, but wait a week before responding. So, one week after the offers arrive in home we still experience an elevation in call volume. The quantity of kits scheduled to arrive in home one week ago is measured by X_{t-1}. The effect of this quantity of kits on TM calls in week t is measured by the coefficient n_1. Thus, the cumulative elevation in call volume due to the kits arriving in home in weeks t and t-1 is n_0 X_t + n_1 X_{t-1}.
We continue in this fashion until we have included all past weeks that have a significant impact on call volume in the current week. In this way, the model can account for all recent campaigns and the effect they have on TM calls in the current week. The result is that the terms up to X_{t-k} produce a forecast pattern of peaks and valleys, just like the one we saw from our curve.
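To make the lagged campaign terms of equation 1 concrete, here is a minimal Python sketch that computes their sum for each week. The kit quantities and the three lag coefficients are hypothetical numbers chosen for illustration, not estimates from our data, and the function name campaign_effect is an assumption.

```python
def campaign_effect(kits_by_week, lag_coefficients, week):
    """Sum coefficient * kits over lags 0..k for one week (0-based index)."""
    effect = 0.0
    for lag, coefficient in enumerate(lag_coefficients):
        source_week = week - lag
        if source_week >= 0:  # no mail history before the first week
            effect += coefficient * kits_by_week[source_week]
    return effect


# Hypothetical mail plan: kits arriving in home in weeks 0 and 3 only.
kits_by_week = [100_000, 0, 0, 50_000, 0, 0]

# Hypothetical coefficients for lags 0, 1, and 2 (largest in the arrival week).
lag_coefficients = [0.010, 0.006, 0.003]

effects = [
    campaign_effect(kits_by_week, lag_coefficients, week)
    for week in range(len(kits_by_week))
]
for week, effect in enumerate(effects):
    print(f"week {week}: about {effect:.0f} extra calls")
```

Each mailing lifts call volume most in its arrival week and by smaller amounts in the two weeks after, which is the same peak-and-decay shape as the curve.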
The dynamic regression model accounts for other marketing channels not explicitly included as regressor inputs.
In dynamic regression, the regressor variables often do not account for all sources of variation in the forecast variable. Sources of variation not explicitly included in the model as regressors deposit their effect in the regression error Nt. The model accounts for this variation as well by including an ARIMA model for the Nt series. Using this model for the error term, the dynamic regression model can adjust the overall level of the forecast up or down to align with the current level of the TM calls. In this way, our model accounts for those other channels that affect the overall level of calls, but have little impact on peaks and valleys.
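The level adjustment can be illustrated with a deliberately simplified Python sketch. It assumes the regression error follows a first-order autoregressive process, one of the simplest ARIMA forms, so that the expected error in a given week is phi times the previous week's error. This article does not state which ARIMA form our models use, so the AR(1) choice, the value of phi, and all of the numbers below are assumptions for illustration only.

```python
def forecast_with_ar1_error(constant, effects_ahead, last_error, phi):
    """Forecast each future week as constant + campaign effect + predicted error."""
    forecasts = []
    predicted_error = last_error
    for effect in effects_ahead:
        # AR(1) assumption: the expected error decays by the factor phi each week.
        predicted_error = phi * predicted_error
        forecasts.append(constant + effect + predicted_error)
    return forecasts


constant = 2000.0                      # hypothetical C from equation 1
effects_ahead = [500.0, 300.0, 150.0]  # hypothetical campaign effects, next 3 weeks
last_error = 400.0                     # last week ran 400 calls above the regression part
phi = 0.5                              # hypothetical AR(1) coefficient

forecasts = forecast_with_ar1_error(constant, effects_ahead, last_error, phi)
print(forecasts)
```

Because last week's calls ran above what the campaign terms alone explain, the forecast is lifted by 200, 100, and 50 calls over the next three weeks, with the lift fading as the error decays.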
Equation 1 represents the basic dynamic regression model formulation that we've used at NYL Tampa Operations. However, several enhancements have proven useful.
Take It To the Next Level
We decomposed total call volume into CS and TM calls so that we could use our mail plan as a direct input to our TM forecast. As our mail plan has continued to become more complex, we've found that further decompositions are now necessary. With data enhancements, we can now decompose all TM volume into calls we get from distinct advertising channels, based on the phone numbers dialed. Soon we will forecast calls from each channel by linking the weekly TM calls to the campaigns for that same channel. This will produce a stronger link, since the calls and campaigns are from matching channels. Another benefit is that when errors occur, we can better isolate the source of the errors by looking at each channel's forecast separately.
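As a rough sketch of decomposing TM calls by the phone number dialed, the Python snippet below tallies weekly calls per channel. The phone numbers, the mapping of numbers to channels, and the call log are all hypothetical; it assumes only that each channel's advertisements carry their own number.

```python
from collections import Counter

# Hypothetical mapping from the number printed on an advertisement to its channel.
NUMBER_TO_CHANNEL = {
    "800-555-0101": "direct mail",
    "800-555-0102": "print publications",
    "800-555-0103": "collateral inserts",
}


def weekly_calls_by_channel(call_log):
    """Count calls per (week, channel) from (week, dialed_number) records."""
    counts = Counter()
    for week, dialed_number in call_log:
        channel = NUMBER_TO_CHANNEL.get(dialed_number, "unknown")
        counts[(week, channel)] += 1
    return counts


# Hypothetical call log: (week number, number dialed).
call_log = [
    (1, "800-555-0101"),
    (1, "800-555-0101"),
    (1, "800-555-0102"),
    (2, "800-555-0101"),
    (2, "800-555-0103"),
]

counts = weekly_calls_by_channel(call_log)
for (week, channel), calls in sorted(counts.items()):
    print(f"week {week}, {channel}: {calls}")
```

Each channel's weekly series can then be forecast against that channel's own campaigns and the results summed.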
Peter Varisco is Director of Research and Analysis for New York Life Tampa Operations. He can be contacted at firstname.lastname@example.org
Call center forecasting
Call Center Management Review (Ed) (2000). Call Center Forecasting and Scheduling, Annapolis: ICMI Inc.
Minnucci, Jay (2006). Nano Forecasting: Forecasting Techniques for Short-Time Intervals. Foresight 4, 6-10
Varisco, Peter (2006). Forecasting Call Flow in a Direct Marketing Environment. Foresight 4, 11-15
Rickwalder, Dan (2006). Forecasting Weekly Effects of Recurring Irregular Occurrences. Foresight 4, 16-18
Theory and application of Box-Jenkins (ARIMA) models and dynamic regression models
Box, George E.P., Jenkins, Gwilym M., Reinsel, Gregory C. (1994). Time Series Analysis: Forecasting and Control, Third Edition, Englewood Cliffs: Prentice-Hall.
Liu, L.-M. (2005). Time Series Analysis and Forecasting, River Forest, IL: Scientific Computing Associates Corporation.
Makridakis, S., Wheelwright, S. C., & Hyndman, R. J. (1998). Forecasting: Methods and Applications, 3rd Edition, Hoboken: John Wiley and Sons.
Pankratz, A. (1991). Forecasting with Dynamic Regression Models, New York: John Wiley & Sons.