Data and universe quality in backtesting
Backtesting can be a very powerful analytical tool, but one must proceed with caution, for there are a number of pitfalls that can be encountered. Fortunately, we can easily adjust our processes to address them, so long as we know what to look for and what to avoid. Beyond data mining and transaction cost issues, data and universe quality deserve attention, and they are the focus here.
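Possibly the most common problem area involves survivor bias. This occurs when companies that were once active are dropped from a variable universe once they are no longer publicly traded. A backtest run only on the survivors tends to overstate historical performance, because the companies that failed or were acquired are missing from the results. In order to avoid survivor bias, Zacks includes "research" or "dead" companies in its backtesting databases.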
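As a rough illustration of the idea, the sketch below builds a universe as of a given date from listing and delisting dates, so that companies that later went "dead" are still included for the periods in which they traded. The `Listing` record, the `universe_as_of` helper, and the tickers and dates are all hypothetical; this is not the layout of the Zacks databases.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Listing:
    """Hypothetical record: when a company started and stopped trading."""
    ticker: str
    listed: date
    delisted: Optional[date] = None  # None means still publicly traded


def universe_as_of(listings: List[Listing], as_of: date) -> List[str]:
    """Tickers that were publicly traded on `as_of`, dead companies included."""
    return sorted(
        item.ticker
        for item in listings
        if item.listed <= as_of
        and (item.delisted is None or as_of < item.delisted)
    )


# Made-up tickers and dates, for illustration only.
LISTINGS = [
    Listing("AAA", date(1995, 1, 3)),
    Listing("BBB", date(1998, 6, 1), delisted=date(2003, 9, 30)),
    Listing("CCC", date(2005, 2, 15)),
]

# Universe a backtest should see at the end of 2001: BBB is still trading.
universe_2001 = universe_as_of(LISTINGS, date(2001, 12, 31))

# What a survivors-only database would show for the same date: BBB is gone.
survivors_2001 = sorted(
    item.ticker
    for item in LISTINGS
    if item.delisted is None and item.listed <= date(2001, 12, 31)
)
```

Here `universe_2001` contains both AAA and BBB, while `survivors_2001` contains only AAA. The difference between the two lists is exactly the set of companies that survivor bias would hide.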
The other most common and often discussed data bias is look-ahead bias. Look-ahead bias exists when data is included in a time period in which, in reality, it would not yet have been known. Earnings are a perfect example. A company with a December fiscal year end reports its earnings some time after December 31. In our historical databases, however, those earnings are stored in the December 31 time slot, because that is the period to which they are tied. Because our clients use our historical databases for a number of different purposes, Zacks has made the active decision not to adjust data for look-ahead bias. This task is left to the backtester. Earnings, earnings-related items and balance sheet items are those most commonly adjusted for the purposes of backtesting.
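A minimal sketch of such an adjustment follows. It assumes a quarterly earnings series stored by fiscal period and a one-quarter reporting lag; the `lag_series` helper, the period labels and the EPS values are made up for illustration, and the right lag depends on the data and the backtest.

```python
from typing import List, Optional, Sequence


def lag_series(values: Sequence[float], periods: int) -> List[Optional[float]]:
    """Shift values `periods` slots later; the earliest slots become None."""
    if periods < 0:
        raise ValueError("periods must be non-negative")
    kept = max(len(values) - periods, 0)
    return [None] * (len(values) - kept) + list(values[:kept])


# Hypothetical quarterly EPS, stored in the period to which it is tied.
quarters = ["2001Q3", "2001Q4", "2002Q1", "2002Q2"]
eps_as_stored = [0.40, 0.55, 0.35, 0.45]

# Assumed one-quarter lag: in the 2001Q4 slot the backtest sees only the
# 2001Q3 figure, since the 2001Q4 figure had not yet been reported.
eps_lagged = lag_series(eps_as_stored, 1)

for quarter, stored, usable in zip(quarters, eps_as_stored, eps_lagged):
    print(quarter, stored, usable)
```

The first slot of `eps_lagged` is `None` because nothing earlier exists in the sample. A backtest would normally skip such periods rather than fill them.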
The data is adjusted by lagging specific items in a custom database, that is, by calculating data items into later time periods. Care must be taken. In some instances, some components of a calculated item may need to be lagged while others will not. A P/E ratio for a given period, for example, would require earnings to be lagged but prices to be used as they exist for the particular time period being calculated.
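The P/E case can be sketched as follows, with earnings lagged and prices left in place. The monthly prices, the EPS values, the two-month lag and the helper names `lag` and `pe_series` are assumptions for illustration only. Returning `None` when earnings are missing or not positive is a design choice of this sketch, not something the text prescribes.

```python
from typing import List, Optional, Sequence


def lag(values: Sequence[float], periods: int) -> List[Optional[float]]:
    """Shift values `periods` slots later, padding the front with None."""
    if periods < 0:
        raise ValueError("periods must be non-negative")
    kept = max(len(values) - periods, 0)
    return [None] * (len(values) - kept) + list(values[:kept])


def pe_series(
    prices: Sequence[float], eps: Sequence[float], eps_lag: int
) -> List[Optional[float]]:
    """P/E per period using the current price and EPS lagged by `eps_lag`."""
    lagged_eps = lag(eps, eps_lag)
    result: List[Optional[float]] = []
    for price, earnings in zip(prices, lagged_eps):
        # No usable earnings yet, or non-positive earnings: leave a gap.
        if earnings is None or earnings <= 0:
            result.append(None)
        else:
            result.append(price / earnings)
    return result


# Hypothetical monthly data; EPS is stored by the period it belongs to.
prices = [20.0, 22.0, 24.0, 27.0, 30.0]
eps = [2.0, 2.0, 2.0, 3.0, 3.0]

pe_lagged = pe_series(prices, eps, eps_lag=2)    # earnings lagged two months
pe_unlagged = pe_series(prices, eps, eps_lag=0)  # carries look-ahead bias
```

With these numbers, `pe_unlagged` drops to 9.0 in the fourth month because it already uses the higher EPS, while `pe_lagged` shows 13.5 for the same month because that EPS figure is not yet treated as known.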
Depending upon the universe being explored in the backtest, the frequency of rebalancing, and the nature of the data items, the appropriate lag may differ from user to user. Data items may be lagged at annual, quarterly, monthly or weekly frequencies and for different lengths of time.
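One way to keep these choices consistent is to state the lag as a length of time and convert it to a number of slots for whatever frequency the data uses. The sketch below is only an assumption about how that might be done: the `lag_in_periods` helper, the 52-week year and the choice to round up to a whole period are all illustrative.

```python
# Approximate number of data slots per year for each frequency (assumed).
PERIODS_PER_YEAR = {"annual": 1, "quarterly": 4, "monthly": 12, "weekly": 52}


def lag_in_periods(lag_months: int, frequency: str) -> int:
    """Convert a lag in months to whole slots at `frequency`, rounding up.

    Rounding up is the conservative choice: the backtest may wait slightly
    longer than necessary, but it never uses data before the lag has passed.
    """
    if lag_months < 0:
        raise ValueError("lag_months must be non-negative")
    per_year = PERIODS_PER_YEAR[frequency]
    # Integer ceiling of lag_months * per_year / 12.
    return (lag_months * per_year + 11) // 12


# The same three-month lag expressed at each frequency.
three_month_lag = {
    freq: lag_in_periods(3, freq) for freq in PERIODS_PER_YEAR
}
```

Under these assumptions a three-month lag becomes 13 weekly slots, 3 monthly slots, 1 quarterly slot or 1 annual slot.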
You can E-mail your questions to:
comments@zacks.com