How can we avoid curve fitting when designing a trading strategy? Are there any solid parameters one can use as a guide? It seems very easy to adjust the trading signals to the data. This leads to a perfect backtested system, and to a crash tomorrow. What is the line that separates sound trading strategy optimization from curve fitting? The worry is arriving at a model that explains everything and predicts nothing. (And a further question: What is the NATURE of the predictive value of a system? What, philosophically speaking, confers on a model its ability to predict future market behavior?)

James Sogi writes:

KISS. Keep parameters simple and robust.

Newton Linchen replies:

You have to agree that it's easier said than done. There is always the desire to "improve" results, to avoid drawdown, to boost profitability…

Is there a "wise speculator's" to-do list on, for example, how many parameters a system requires or can handle?

Nigel Davies offers:

Here's an offbeat view:

Curve fitting isn't the only problem; there's also the issue of whether one takes into account contrary evidence. And there will usually be some kind of contrary evidence, unless and until a feeding frenzy occurs (i.e., a segment of market participants start to lose their heads).

So for me the whole thing boils down to inner mental balance and harmony - when someone is under stress or has certain personality issues, they're going to find a way to fit some curves somehow. On the other hand, those who are relaxed (even when the external situation is very difficult) and have stable characters will tend towards objectivity even in the most trying circumstances.

I think this way of seeing things provides a couple of important insights: a) True non randomness will tend to occur when most market participants are highly emotional. b) A good way to avoid curve fitting is to work on someone's ability to withstand stress - if they want to improve they should try green vegetables, good water and maybe some form of yoga, meditation or martial art (tai chi and yiquan are certainly good).

Newton Linchen replies:

The word that I found most important in your e-mail was "objectivity".

I kind of agree with the rest, but I'm referring mostly to curve fitting while developing trading ideas, not when trading them. That's why a scale to measure curve fitting (if it is possible at all) is in order: at what point does curve fitting enter the data modeling process?

And what would be the chess player's point of view on this issue?

Nigel Davies replies:

Well what we chess players do is essentially try to destroy our own ideas because if we don't then our opponents will. In the midst of this process 'hope' is the enemy, and unless you're on top of your game he can appear in all sorts of situations. And this despite our best intentions.

Markets don't function in the same way as chess opponents; they act more as a mirror for our own flaws (mainly hope) rather than a malevolent force that's there to do you in. So the requirement to falsify doesn't seem quite so urgent, especially when one is winning games with a particular 'system'.

Out of sample testing can help simulate the process of falsification but not with the same level of paranoia, and also what's built into it is an assumption that the effect is stable.

This brings me to the other difference between chess and markets; the former offers a stable platform on which to experiment and test one's ideas, the latter only has moments of stability. How long will they last? Who knows. But I suspect that subliminal knowledge about the out of sample data may play a part in system construction, not to mention the fact that other people may be doing the same kind of thing and thus competing for the entrees.

An interesting experiment might be to see how the real time application of a system compares to the out of sample test. I hypothesize that it will be worse, much worse.

Kim Zussman adds:

Markets demonstrate repeating patterns over irregularly spaced intervals. It's one thing to find those patterns in the current regime, but how to determine when your precious pattern has failed vs. simply statistical noise?

The answers given here before include money-management and control analysis.

But if you manage your money so carefully as to not go bust when the patterns do, on the whole can you make money (beyond, say, B/H, net of vig, opportunity cost, day job)?

If control analysis and similar quantitative methods work, why aren't engineers rich? (OK some are, but more lawyers are and they don't understand this stuff)

The point will be made that systematic approaches fail, because all patterns get uncovered and you need to be alert to this, and adapt faster and bolder than other agents competing for mating rights. Which should result in certain runners at the top of the distribution (of smarts, guts, determination, etc) far out-distancing the pack.

And it seems there are such, in the infinitesimally small proportion predicted by the curve.

That is curve fitting.

Legacy Daily observes:

"I hypothesize that it will be worse, much worse." If it was so easy, I doubt this discussion would be taking place.

I think human judgment (+ the emotional balance Nigel mentions) are the elements that make multiple regression statistical analysis work. I am skeptical that past price history of a security can predict its future price action but not as skeptical that past relationships between multiple correlated markets (variables) can hold true in the future. The number of independent variables that you use to explain your dependent variable, which variables to choose, how to lag them, and interpretation of the result (why are the numbers saying what they are saying and the historical version of the same) among other decisions are based on so many human decisions that I doubt any system can accurately perpetually predict anything. Even if it could, the force (impact) of the system itself would skew the results rendering the original analysis, premises, and decisions invalid. I have heard of "learning" systems but I haven't had an opportunity to experiment with a model that is able to choose independent variables as the cycles change.

The system has two advantages over us humans. It takes emotion out of the picture and it can perform many computations quickly. If one gives it any more credit than that, one learns some painful lessons sooner or later. The solution many people implement is "money management" techniques to cut losses short and let the winners take care of themselves (which again are based on judgment). I am sure there are studies out there that try to determine the impact of quantitative models on the markets. Perhaps fading those models by a contra model may yield more positive (dare I say predictable) results…

One last comment: check out how a system generates random numbers (if you haven't already looked into this). While the number appears random to us, it is anything but random, unless the generator is based on external random phenomena.
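
To illustrate the remark about random number generators, here is a minimal Python sketch. It assumes Python's standard `random` module (a seeded, deterministic generator) and the standard `secrets` module (which draws on the operating system's entropy source); the variable names are only for illustration.

```python
import random
import secrets

# Two generators started from the same seed (42 is an arbitrary choice).
gen_a = random.Random(42)
gen_b = random.Random(42)

seq_a = [gen_a.random() for _ in range(5)]
seq_b = [gen_b.random() for _ in range(5)]

# The numbers look random, yet both sequences are identical:
# the output is fully determined by the seed.
print(seq_a == seq_b)

# By contrast, this value comes from the operating system's entropy
# source and cannot be reproduced by re-running with a known seed.
os_value = secrets.randbits(32)
print(os_value)
```

The practical consequence for backtesting is that a simulation run with a fixed seed is repeatable, which is useful for debugging but should not be mistaken for independent evidence.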

Bill Rafter adds:

Research to identify a universal truth to be used going either forward or backward (out of sample or in-sample) is not curvefitting. An example of that might be the implications of higher levels of implied volatility to future asset price levels.

Research of past data to identify a specific value to be used going forward (out of sample) is not curvefitting, but used backward (in-sample) is curvefitting. If you think of the latter as look-ahead bias it becomes a little more clear. Optimization would clearly count as curvefitting.

Sometimes (usually because of insufficient history) you have no ability to divide your data into two tranches – one for identifying values and the second for testing. In such a case you had best limit your research to identifying universal truths rather than specific values.

Scott Brooks comments:

If the past is not a good measure of today and we only use the present data, then isn't that really just short term trend following? As has been said on this list many times, trend following works great until it doesn't. Therefore, using today's data doesn't really work either.

Phil McDonnell comments:

Curve fitting is one of those things market researchers try NOT to do. But as Mr. Linchen suggests, it is difficult to know when we are approaching the slippery slope of curve fitting. What is curve fitting and what is wrong with it?

A simple example of curve fitting may help. Suppose we had two variables that could not possibly have any predictive value. Call them x1 and x2. They are random numbers. Then let's use them to 'predict' two days worth of market changes m. We have the following table:

m     x1    x2
+4    2     1
+20   8     6

Can our random numbers predict the market with a model like this? In fact they can. We know this because we can set up 2 simultaneous equations in two unknowns and solve it. The basic equation is:

m = a * x1 + b * x2

The solution is a = 1 and b = 2. You can check this by back substituting. Multiply x1 by 1 and add two times x2 and each time it appears to give you a correct answer for m. The reason is that it is almost always possible (*) to solve two equations in two unknowns.
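
The back substitution can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper `solve_2x2` that applies Cramer's rule to the two-row table above. It also refits the same "predictors" to a second, arbitrary pair of market changes to show that the fit is perfect no matter what the market did.

```python
from fractions import Fraction

def solve_2x2(rows, targets):
    """Solve m = a*x1 + b*x2 for (a, b) from two observations (Cramer's rule)."""
    (x11, x12), (x21, x22) = rows
    m1, m2 = targets
    det = x11 * x22 - x12 * x21
    if det == 0:
        # The exceptional case: the two rows are proportional.
        raise ValueError("singular system, no unique solution")
    a = Fraction(m1 * x22 - x12 * m2, det)
    b = Fraction(x11 * m2 - m1 * x21, det)
    return a, b

rows = [(2, 1), (8, 6)]   # the x1, x2 columns of the table
targets = [4, 20]         # the m column of the table

a, b = solve_2x2(rows, targets)
print(a, b)

# Any other pair of market changes is "explained" just as perfectly.
# The values -7 and 3 are arbitrary, chosen only for illustration.
a2, b2 = solve_2x2(rows, [-7, 3])
residuals = [2 * a2 + 1 * b2 - (-7), 8 * a2 + 6 * b2 - 3]
print(residuals)
```

A fit with zero residuals on two points therefore says nothing about whether x1 and x2 carry any information.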

So this gives us one rule to consider when we are fitting. The rule is: Never fit n data points with n parameters.

The reason is that you will generally get a 'too good to be true' fit as Larry Williams suggests. This rule generalizes. For example best practices include getting much more data than the number of parameters you are trying to fit. There is a statistical concept called degrees of freedom involved here.

Degrees of freedom is how much wiggle room there is in your model. Each variable you add is a chance for your model to wiggle to better fit the data. The rule of thumb is that you take the number of data points you have and subtract the number of variables. Another way to say this is the number of data points should be MUCH more than the number of fitted parameters.

It is also good to mention that the number of parameters can be tricky to count. Looking at intraday patterns, a parameter could be something like today's high was lower than yesterday's high. Even though it is a true/false criterion, it is still an independent variable. Choice of the length of a moving average is a parameter. Whether one is above or below is another parameter. Some people use thresholds in moving average systems. Each is a parameter. Adding a second moving average may add four more parameters and the comparison between the two averages yet another. A system involving a 200 day and 50 day average that showed 10 buy/sell signals might have as many as 10 parameters and thus be nearly useless.
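
The degrees-of-freedom rule of thumb is simple arithmetic, sketched below. `residual_dof` follows the subtraction described above. `adjusted_r2` is not from this discussion; it is the standard adjusted R-squared formula for a regression with an intercept, brought in here only to show how the same raw fit is penalized when data are scarce. The R-squared value of 0.90 and the sample sizes are made-up inputs.

```python
def residual_dof(n_points, n_params):
    """Rule of thumb from the text: data points minus fitted parameters."""
    return n_points - n_params

def adjusted_r2(r2, n_points, n_params):
    """Standard adjusted R^2 for a regression with an intercept."""
    dof = n_points - n_params - 1
    if dof <= 0:
        raise ValueError("no degrees of freedom left to judge the fit")
    return 1.0 - (1.0 - r2) * (n_points - 1) / dof

# The moving average example: 10 signals, as many as 10 parameters.
ma_system_dof = residual_dof(10, 10)
print(ma_system_dof)

# The same raw R^2 of 0.90 with 10 parameters, on little vs. plenty of data.
small = adjusted_r2(0.90, 12, 10)
large = adjusted_r2(0.90, 1000, 10)
print(small, large)
```

With 12 data points the adjusted figure turns negative, while with 1000 points it barely moves, which is the sense in which the number of data points should be MUCH more than the number of fitted parameters.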

Steve Ellison mentioned the two sample data technique. Basically you can fit your model on one data set and then use the same parameters to test out of sample. What you cannot do is refit the model or system parameters to the new data.
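
A minimal sketch of the two sample technique follows. Everything in it is an assumption for illustration: the price series is a synthetic random walk, `sma_rule_pnl` is a hypothetical toy rule (hold for one day whenever the close is above its trailing average), and the candidate lengths are arbitrary. The point is only the workflow: choose the parameter on the first half, then score the second half with that same parameter and no refitting.

```python
import random

def sma_rule_pnl(prices, length):
    """Sum of next-day changes on days the close is above its trailing mean."""
    pnl = 0.0
    for t in range(length, len(prices) - 1):
        avg = sum(prices[t - length + 1 : t + 1]) / length
        if prices[t] > avg:
            pnl += prices[t + 1] - prices[t]
    return pnl

# Synthetic random walk, so any "edge" found in-sample is pure luck.
rng = random.Random(0)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] + rng.gauss(0, 1))

half = len(prices) // 2
in_sample, out_sample = prices[:half], prices[half:]

# Fit: pick the best moving average length on the first half only.
candidates = [5, 10, 20, 50, 100]
in_scores = {n: sma_rule_pnl(in_sample, n) for n in candidates}
best_n = max(in_scores, key=in_scores.get)

# Test: apply the SAME length to the second half, without refitting.
out_score = sma_rule_pnl(out_sample, best_n)
print(best_n, in_scores[best_n], out_score)
```

Re-running the selection step on `out_sample` after seeing `out_score` would turn the second half into in-sample data, which is exactly what the technique forbids.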

Another caveat here is the data mining slippery slope. This means you need to keep track of how many other variables you tried and rejected. This is also called the multiple comparison problem. It can be as insidious as trying to know how many variables someone else tried before coming up with their idea. For example how many parameters did Welles Wilder try before coming up with his 14 day RSI index? There is no way 14 was his first and only guess.
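
The multiple comparison problem can be seen in a small simulation. This sketch assumes made-up numbers throughout: 250 days of random market changes and 200 coin-flip "signals" that have no edge by construction. The Bonferroni adjustment at the end is a standard correction that was not named in this discussion; it is included as one common way to account for the number of variables tried.

```python
import random

rng = random.Random(1)
n_days, n_trials = 250, 200

# Random market changes: nothing here is predictable.
market = [rng.gauss(0, 1) for _ in range(n_days)]

def random_signal_pnl():
    """P&L of a coin-flip long/short signal applied to the market changes."""
    signal = [rng.choice((-1, 1)) for _ in range(n_days)]
    return sum(s * m for s, m in zip(signal, market))

pnls = [random_signal_pnl() for _ in range(n_trials)]
best = max(pnls)
mean = sum(pnls) / n_trials

# The best of many worthless signals looks far better than the average one.
print(best, mean)

# Bonferroni: to keep an overall 5% false-positive rate across all trials,
# each individual test must clear a much stricter threshold.
bonferroni_alpha = 0.05 / n_trials
print(bonferroni_alpha)
```

Reporting only `best` without mentioning `n_trials` is the trap described above, and it is why the number of rejected variables has to be tracked.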

Another bad practice is when you have a system that has picked say 20 profitable trades and you look for rules to weed out those pesky few bad trades to get the perfect system. If you find yourself adding a rule or variable to rule out one or two trades you are well into data mining territory.

Bruno's suggestion to use the BIC or AIC is a good one. If one is doing a multiple regression, one should look at the individual t-stats for the coefficients AND look at the F test for the overall quality of the fit. Any variables with t-stats that are not above 2 should be tossed. Also, when any variables are highly correlated with each other, the weaker one should be tossed.
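
The two tossing rules can be written as a small filter. This is a sketch under stated assumptions: `prune_variables` is a hypothetical helper, the t-stats and correlations are made-up inputs that would normally come from a regression package, "highly correlated" is taken to mean an absolute correlation above 0.8, and "weaker" is taken to mean the smaller absolute t-stat. None of those thresholds beyond the t-stat of 2 come from the discussion.

```python
def prune_variables(t_stats, correlations, t_min=2.0, corr_max=0.8):
    """Keep variables with |t| above t_min; of correlated pairs, drop the weaker."""
    kept = {name for name, t in t_stats.items() if abs(t) > t_min}
    for (u, v), r in correlations.items():
        if u in kept and v in kept and abs(r) > corr_max:
            weaker = u if abs(t_stats[u]) < abs(t_stats[v]) else v
            kept.discard(weaker)
    return sorted(kept)

# Made-up regression output for four candidate variables.
t_stats = {"x1": 3.1, "x2": 2.4, "x3": 0.7, "x4": -2.9}
correlations = {("x1", "x2"): 0.93, ("x1", "x4"): -0.2}

survivors = prune_variables(t_stats, correlations)
print(survivors)
```

Here x3 is tossed for its low t-stat and x2 is tossed as the weaker member of the correlated pair with x1. The F test and the AIC or BIC comparison would still need to be checked on the refitted model.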

George Parkanyi reminds us:

Yeah but you guys are forgetting that without curve-fitting we never would have invented the bra.

Say, has anybody got any experience with vertical drop fitting? I just back-tested some oil data and …

Larry Williams writes:

If it looks like it works real well it is curve fitting.

Newton Linchen reiterates:

My point is: what is the degree of system optimization that turns into curve fitting? In other words, how is one able to recognize curve fitting while modeling data? Perhaps returns too good to believe?

What I mean is to get a general rule that would tell: "Hey, man, from THIS point on you are curve fitting, so step back!"

Steve Ellison proffers:

I learned from Dr. McDonnell to divide the data into two halves and do the curve fitting on only the first half of the data, then test a strategy that looks good on the second half of the data.

Yishen Kuik writes:

The usual out of sample testing says, take price series data, break it into 2, optimize on the 1st piece, test on the 2nd piece, see if you still get a good result.

If you get a bad result you know you've curve fitted. If you get a good result, you know you have something that works.

But what if you get a mildly good result? Then what do you "know" ?

Jim Sogi adds:

This reminds me of the three blind men each touching one part of the elephant and describing what the elephant was like. Quants are often like the blind men, each touching say the 90's bull run tranche, others sampling recent data, others sample the whole. Each has their own description of the market, which like the blind men, are all wrong.

The most important data tranche is the most recent as that is what the current cycle is. You want your trades to work there. Don't try to make the reality fit the model.

Also, why not break it into 3 pieces and have 2 out of sample pieces to test it on.

We can go further. If each discrete trade is of limited length, then why not slice up the price series into 100 pieces, reassemble all the odd numbered time slices chronologically into sample A, the even ones into sample B.

Then optimize on sample A and test on sample B. This can address to some degree concerns about regime shifts that might differently characterize your two samples in a simple break of the data.
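
The slicing scheme can be sketched as follows. `interleaved_split` is a hypothetical helper that assumes the series is a plain Python list with at least as many points as slices; any leftover points are appended to the last slice, which is one arbitrary way to handle a length that does not divide evenly.

```python
def interleaved_split(series, n_slices=100):
    """Cut a series into n_slices chronological pieces and interleave them.

    Odd-numbered pieces (1st, 3rd, ...) go to sample A and even-numbered
    pieces (2nd, 4th, ...) to sample B, each kept in chronological order.
    """
    size = len(series) // n_slices
    if size == 0:
        raise ValueError("series is shorter than the number of slices")
    slices = [series[i * size : (i + 1) * size] for i in range(n_slices)]
    # Attach any remainder to the final slice so no data is dropped.
    slices[-1] = slices[-1] + series[n_slices * size :]
    sample_a = [x for piece in slices[0::2] for x in piece]
    sample_b = [x for piece in slices[1::2] for x in piece]
    return sample_a, sample_b

# Stand-in for a price series: the integers 0..999 make the split visible.
series = list(range(1000))
sample_a, sample_b = interleaved_split(series)
print(sample_a[:12], sample_b[:12])
```

One caution with this design: the stitched samples contain artificial jumps at every slice boundary, so returns or signals are probably better computed inside each slice before the pieces are concatenated, and a trade that spans a boundary leaks between the two samples.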






9 Comments so far

  1. Gavin Chait on March 5, 2009 5:39 pm

    Building algorithms while looking at the data is going to lead you into temptation.

    When I’m designing data forecasting systems, I always ensure that I research how an industry works in order to build an algorithmic model, but that the data research component is being performed in parallel and by a different team.

    That way, it is almost impossible for me to bias my equations with any prior “expectations”.

  2. Matt Johnson on March 5, 2009 11:05 pm

    You ask “How can we avoid curve fitting when designing a trading strategy?”
    It’s my experience that if you’re not curve fitting the entrance parameters or the exit parameters, you’ll end up curve fitting your portfolio selection module.
    In the end, I believe all mechanical systems eventually fail (i.e. make new equity DD’s).
    Mechanical systems do show us that trends do indeed persist, a lot longer than we (me) might forecast, both trending and non-trending. When I boil it down I find, cutting losses quickly (easy), holding winners (hard), and managing risk (fun); keep a trader out of trouble. I’ve found the silicon carbide system works best for me.

  3. Daily Speculations « Random Thoughts Of A Trader’s Toil on March 6, 2009 8:12 am

    […] Daily Speculations Filed under: Daily Speculations, Quantitative Trading, Research, Trading — newtonlinchen @ 11:10 pm I invite you to read a very interesting discussion on “Curve Fitting” at […]

  4. Albert Jann on March 6, 2009 11:37 am

    If you toss a coin 1000 times, of course you will get about 500 heads and about 500 tails, but in the process you can have 15 consecutive heads. So if you would like to trade in this kind of system, the issue becomes how to identify special patterns like that. One already mentioned trend following (go with the last), which is not such a bad idea. In my opinion a real time long range pattern identification learning system may give the best predictions.

  5. douglas roberts dimick on March 7, 2009 12:15 pm

    Based on the range of commentary here, “selective applications” is the phrase that comes to my mind - to minimize optimization as a way to preclude curve fitting. For example, as Newton considers drawdowns, account protocol may be optimized based on account externals, instead of market internals, as a distinct, separate module within an order execution component.

    Note that limiting optimization here is to this third of three components, order execution, following market situation (input) and market strategy (states). I found curve fitting to be rife from applications within the prior two parts.

    Contrary evidence may be correlated in the market strategy component. Nigel’s “balance and harmony” observation may imply that metacircular dynamic that we have discussed on this site; as a result, optimizing one module could skew corresponding modules.

    It took me six years to recognize – actually to be told by a Tradestation engineer – what Kim and others refer to here as – the nature of a “current regime.” Given exchange systematic diversities, intervals are irregularly spaced; collectively, operators use (non)sequential time data, others involve trade and volume tick data. Accordingly, patterns are interspersed partially among those corresponding but assorted market actor regimes.

    Kim mentions money-management and control analysis. As risk is particular to individual account characteristics and objectives, program trading architecture may emphasize controls.

    What and where then become primary concerns. As noted, optimizing may apply to stop-loss, also possibly hedging strategies (splitting, spitting, spacing orders for instance).

    Legacy’s second paragraph presents a strong argument for closed loop systems. The “variables’ variables” conundrum has presented ample proof to me when designing market situation processing for input generation. Phil’s caveat to Steve’s two-sample-data model expands upon what he calls a “slippery slope” – I have fallen down it all too often.

    Phil also alludes to what may be the real challenge: “variables highly correlated with each other.” Again, the Madoff video squares here; (head trader) Josh observes how the highly correlated funds (or big animals of “the herd”) had not correlated the correlation of their positions relative among themselves (or other funds).

    As Victor pointed out in his Recent Moves at Close, many a steed have quickened their pace, all seemingly headed in a singly envisioned – perhaps correlated (?) – direction. Some weeks ago, I asked if 5500 was the next major support level; likewise, the pace getting there as well appears to be breaking from a collected canter into a gallop.

    The point – how exactly might one correlate that “herd” phenomenon for quantification purposes? I defer to the experts on this site, only noting that I have found neither the mechanistic transparencies nor data assimilations from which such formulations may be devised.

    Newton, you restate your issue, and for good reason, I think. The “degree” that perhaps we all have sought at times, given a survey of the responses, does not appear to be either a brightline or even a rule – in that exceptions may consume any given rule due to those issues of (non)patterning.

    Jim’s quant-ocular-oddity is the basis for my Theory of Quantitative Relativity. Because patterning is irregular, both in correspondence with interval and regime change, answering your question may become rules-based predicated (or systematic of program architecture).

    Consider Jim’s “price series” approach. What I find interesting is that he appears to be slicing time, thereby addressing the interval issue. The question may then become how to do it so as to correlate output present-backward instead of generating optimal, prospective price action models?

    I am still awaiting my opportunity to follow that path. It is fascinating, and I thank you for both reminding us of as well as provoking such a collective of wisdom to comment on the subject. I have learned.


  6. Newton Linchen on March 11, 2009 9:01 pm

    Thanks to all who made comments. I certainly learned a lot from you.

    Best wishes to all.


  7. Late Review of “Education Of A Speculator” - Introduction « Random Thoughts Of A Trader’s Toil on March 17, 2009 6:13 am

    […] Late Review of “Education Of A Speculator” - Introduction Filed under: Daily Speculations, Niederhoffer, Quantitative Trading, Reading, Research, Trading — Tags: laurel kenner, Niederhoffer, Quantitative Trading, speculation, Trading — newtonlinchen @ 10:42 am I’m starting this week a series of posts about the two books by Victor Niederhoffer: “Education Of A Speculator” and “Practical Speculation” (the last with Laurel Kenner). […]

  8. David Whitesel on March 22, 2009 3:26 pm

    Appreciate all these comments. What happens in these varying systems when your models bump into a triangle immersed in a saddle-shape plane (a hyperbolic paraboloid), as well as two diverging ultraparallel lines?

  9. John Chesnut on June 22, 2010 4:06 pm

    Bruce Babcock always rejected optimization of any kind. But that implies that he believed that there is a fixed, underlying truth to all markets. In particular, it implies a belief that the statistics of the markets are stationary, which may not be justifiable.

    I once subscribed to a newsletter where the author could always explain 98 percent of the variance in the past, but he could never predict the future.

    I also subscribed to a newsletter that used out of sample testing extensively, but I think the author was fooling himself. He always used the same in and out samples. But non-stationary conditions can extend over very long time periods.

    Marty Zweig developed a system of market timing that was very effective as long as the Fed was fighting inflation. Even though inflation has been a very long term characteristic of our economy, it could be questioned that this system would work in disinflationary or deflationary times. Inflation is a bad thing in an inflationary environment, but it is a good thing in a deflationary environment.

    So I don’t have an answer to the curve fitting question.

    What I do know is that when I dream up a really creative way-out system that works perfectly that always means that it is curve-fitted. Far-out systems never work outside of sample.

    If market statistics are not stationary, then I don’t see how optimization can be avoided, but I suspect that a system ought to be able to pass a reasonableness test.

