1. Data Mining
Classic data mining involves using multiple models or variables
until something seems to "fit" the data. Stepwise
regression, pattern recognition and neural networks
are often used in data mining expeditions. However,
simply running a lot of analyses with different
variables or combinations of variables will have
the same effect. Models developed by data mining
frequently fit past data very well but have poor
predictive quality outside the time span used in
deriving and fitting the model.
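The gap between in-sample fit and out-of-sample prediction can be illustrated with a small simulation. This is a minimal sketch, not a description of any particular backtest: the returns and all candidate "signals" are pure random noise, and the function names, the 200 candidates and the 60 periods are assumptions chosen only for illustration.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mine_best_signal(n_candidates=200, n_periods=60, seed=0):
    """Pick the noise signal that best fits in-sample returns.

    Returns (in_sample_corr, out_of_sample_corr) for the winner.
    Hypothetical setup: nothing here has any real predictive power.
    """
    rng = random.Random(seed)
    returns_in = [rng.gauss(0, 1) for _ in range(n_periods)]
    returns_out = [rng.gauss(0, 1) for _ in range(n_periods)]

    best = None
    for _ in range(n_candidates):
        signal_in = [rng.gauss(0, 1) for _ in range(n_periods)]
        signal_out = [rng.gauss(0, 1) for _ in range(n_periods)]
        r_in = correlation(signal_in, returns_in)
        # Keep the candidate with the strongest in-sample fit.
        if best is None or abs(r_in) > abs(best[0]):
            best = (r_in, correlation(signal_out, returns_out))
    return best


in_sample, out_of_sample = mine_best_signal()
print(f"best in-sample correlation: {in_sample:+.2f}")
print(f"same signal out-of-sample:  {out_of_sample:+.2f}")
```

Because the winner is chosen for its in-sample fit, its in-sample correlation is selected to look strong, while its out-of-sample correlation is just another random draw around zero.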
A single test
with a t-statistic that is significant at the 5%
level means that, if no true relationship existed,
there would be only a 1-in-20 probability of
observing a relationship this strong by chance.
The flip side of
this is that if you run 100 tests of variables with
no underlying relationship, about five will, on
average, have t-statistics that appear significant
at the 5% level.
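The arithmetic behind this can be checked directly. The sketch below is an illustration under stated assumptions: it runs 100 independent tests on unrelated normal random data, with 62 observations per test so that the two-sided 5% critical t-value for 60 degrees of freedom (about 2.00) applies. The function names and the seed are hypothetical.

```python
import math
import random

# Two-sided 5% critical value of Student's t with 60 degrees of freedom.
T_CRITICAL = 2.000


def t_statistic(xs, ys):
    """t-statistic for the correlation between xs and ys (df = n - 2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / math.sqrt(var_x * var_y)
    return r * math.sqrt((n - 2) / (1 - r * r))


def count_false_positives(n_tests=100, n_obs=62, seed=1):
    """Count tests on unrelated noise that look significant at 5%."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        xs = [rng.gauss(0, 1) for _ in range(n_obs)]
        ys = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(t_statistic(xs, ys)) > T_CRITICAL:
            hits += 1
    return hits


expected_hits = 100 * 0.05            # average number of spurious results
prob_at_least_one = 1 - 0.95 ** 100   # chance of one or more spurious results

print(f"expected spurious results: {expected_hits:.0f}")
print(f"probability of at least one: {prob_at_least_one:.3f}")
print(f"spurious results in this simulated run: {count_false_positives()}")
```

With independent tests, the chance that at least one of the 100 looks significant is above 99%, so finding a "significant" variable after an extensive search is close to guaranteed.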
2. Transaction Costs
If a strategy had
actually been used in a substantive way, some
stocks' prices would have been different from the
prices that were recorded. How much buying or
selling pressure would have eliminated the inefficiency
is unknowable but might have been quite small in
small or mid cap stocks. Investors typically
underestimate price impact, which accounts for a
large part of transaction costs. A good summary of
the issues appeared in the Wall Street Journal on
June 9, 1997: "Trading Costs Rising Along With the
Market."
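To see how quickly trading costs can erode a backtested return, consider the rough sketch below. The function name, the simple linear cost model and every number in the example are assumptions for illustration only. In practice price impact is not a fixed percentage and tends to grow with trade size relative to a stock's liquidity.

```python
def net_annual_return(gross_return, annual_turnover, one_way_cost):
    """Backtested return after a flat per-trade cost (hypothetical model).

    annual_turnover: 1.0 means the whole portfolio is replaced once a year.
    one_way_cost: commission + half the spread + price impact, as a
                  fraction of the amount traded.
    Replacing a position means one sale and one purchase, so the
    one-way cost is paid twice per unit of turnover.
    """
    return gross_return - 2 * annual_turnover * one_way_cost


# Hypothetical example: 15% gross return, 200% turnover, 1% one-way cost.
print(f"{net_annual_return(0.15, 2.0, 0.01):.2%}")
```

In this made-up case, four percentage points of the 15% backtested return go to trading costs, and an underestimate of price impact would make the shortfall larger.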