Zacks

Professional Services


 


Data and universe quality in backtesting

Backtesting can be a very powerful analytical tool, but one must proceed with caution, for there are a number of pitfalls that can be encountered. Fortunately, we can address them by adjusting our processes, so long as we know what to look for and what to avoid. Data mining and transaction costs are issues in their own right; here, let's focus on data and universe quality.

Possibly the most common problem area involves survivor bias. This occurs when companies that were once active are dropped from a variable universe once they are no longer publicly traded. A backtest run only on the companies that survived ignores those that failed or were acquired along the way, which can overstate historical results. In order to avoid survivor bias, Zacks includes "research" or "dead" companies in its backtesting databases.
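As a rough illustration of the difference, the sketch below builds a universe as of a past date from a master list that keeps "dead" companies, and compares it with a survivors-only list. The tickers, dates, field names and both function names are hypothetical and are not taken from any Zacks database.

```python
from datetime import date

# Hypothetical master list: every company ever covered, including "dead" ones.
# A delisted value of None means the company is still publicly traded.
COMPANIES = [
    {"ticker": "AAA", "listed": date(1990, 1, 2), "delisted": None},
    {"ticker": "BBB", "listed": date(1992, 6, 1), "delisted": date(1998, 3, 31)},
    {"ticker": "CCC", "listed": date(1999, 5, 3), "delisted": None},
]

def universe_as_of(companies, as_of):
    """Tickers that were publicly traded on `as_of`, dead or alive today."""
    return [
        c["ticker"]
        for c in companies
        if c["listed"] <= as_of
        and (c["delisted"] is None or as_of <= c["delisted"])
    ]

def survivors_only(companies, as_of):
    """Biased version: only companies that are still active today."""
    return [
        c["ticker"]
        for c in companies
        if c["delisted"] is None and c["listed"] <= as_of
    ]

as_of = date(1995, 12, 29)
print(universe_as_of(COMPANIES, as_of))   # includes BBB, which later went dead
print(survivors_only(COMPANIES, as_of))   # silently drops BBB
```

A backtest for 1995 that used the second list would never hold BBB, even though an investor at the time could have.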

The other common and often-discussed data bias is look-ahead bias. Look-ahead bias exists when data is included in a time period in which, in reality, it would not yet have been known. Earnings are a perfect example. A company with a December fiscal year end reports its earnings some time after December 31. In our historical databases, however, those earnings are stored in the December 31 time slot, because that is the period to which they are tied. Because our clients use our historical databases for a number of different purposes, Zacks has made the active decision not to adjust data for look-ahead bias. This task is left to the backtester. Earnings, earnings-related items and balance sheet items are those most commonly adjusted for the purposes of backtesting.
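To make the timing problem concrete, here is a minimal sketch that shifts a per-period series into later time slots, so that each value appears only in the period when it could have been known. The `lag` helper, the sample EPS figures and the one-quarter delay are all assumptions made for illustration; the right delay depends on the data item and the backtest.

```python
def lag(values, periods):
    """Shift per-period values `periods` slots later.

    Slots with no value known yet are filled with None.
    """
    if periods < 0:
        raise ValueError("periods must be non-negative")
    n = len(values)
    keep = max(n - periods, 0)
    return [None] * (n - keep) + list(values[:keep])

# Hypothetical quarterly EPS, stored in the period to which it is tied.
quarters = ["1999Q3", "1999Q4", "2000Q1", "2000Q2"]
eps_stored = [0.40, 0.55, 0.45, 0.50]

# Assume, for illustration only, that EPS becomes known one quarter later.
eps_known = lag(eps_stored, 1)

for quarter, stored, known in zip(quarters, eps_stored, eps_known):
    print(quarter, "stored:", stored, "usable in backtest:", known)
```

Using `eps_stored` directly would let a 1999Q4 portfolio act on earnings that were not reported until some time in 2000.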

The data is adjusted by lagging specific items in a custom database, that is, by calculating data items into later time periods. Care must be taken. In some instances, some components of a calculated item need to be lagged while others do not. A P/E ratio for a given period, for example, requires earnings to be lagged but prices to be used as they exist for the period being calculated.
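The P/E case can be sketched as follows: the price comes from the period being calculated, while earnings come from an earlier period. The function name, the sample figures and the one-period lag are hypothetical, and treating non-positive earnings as "no ratio" is a choice made here for simplicity rather than a rule from the text.

```python
def pe_ratios(prices, eps, eps_lag):
    """P/E per period: same-period price over EPS lagged by `eps_lag` periods."""
    if eps_lag < 0:
        raise ValueError("eps_lag must be non-negative")
    if len(prices) != len(eps):
        raise ValueError("prices and eps must cover the same periods")
    ratios = []
    for i, price in enumerate(prices):
        j = i - eps_lag                      # period whose EPS was known at i
        earnings = eps[j] if j >= 0 else None
        if earnings is None or earnings <= 0:
            ratios.append(None)              # no usable earnings for this slot
        else:
            ratios.append(price / earnings)  # price itself is NOT lagged
    return ratios

# Hypothetical per-period prices and trailing EPS as stored in the database.
prices = [20.0, 22.0, 27.0, 24.0]
eps = [2.0, 2.2, 2.5, 2.4]

print(pe_ratios(prices, eps, eps_lag=1))
```

Lagging the price along with earnings would be a mistake in the other direction: the ratio would then describe a valuation that no longer existed in the period being tested.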

Depending upon the universe being explored in the backtest, the frequency of rebalancing, and the nature of the data items, the appropriate lag may differ from user to user. Data items may be lagged at annual, quarterly, monthly or weekly frequencies and for different lengths of time.
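One way to keep such choices explicit is a small table of lag rules per data item. The sketch below is only an illustration: the item names, the rules in `LAG_RULES` and the `available_on` helper are hypothetical, and it assumes that annual, quarterly and monthly periods end on the last day of a month.

```python
import calendar
from datetime import date, timedelta

MONTHS_PER_UNIT = {"annual": 12, "quarterly": 3, "monthly": 1}

# Hypothetical per-item rules: (frequency, number of periods to lag).
LAG_RULES = {
    "eps": ("quarterly", 1),
    "book_value": ("monthly", 4),
    "price": ("weekly", 0),   # prices are used as of the period itself
}

def available_on(period_end, frequency, periods):
    """Date from which an item stored at `period_end` is treated as known."""
    if frequency == "weekly":
        return period_end + timedelta(weeks=periods)
    months = MONTHS_PER_UNIT[frequency] * periods
    total = period_end.year * 12 + (period_end.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Assumption: snap to the last day of the target month.
    return date(year, month, calendar.monthrange(year, month)[1])

period_end = date(1999, 12, 31)
for item, (frequency, periods) in LAG_RULES.items():
    print(item, "usable from", available_on(period_end, frequency, periods))
```

Two users backtesting the same universe could reasonably fill in `LAG_RULES` differently, for example with a longer lag for small companies that report late.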

 

You can E-mail your questions to: comments@zacks.com

 

 

Copyright © 2001 Zacks Investment Research, Inc.