# Non-Linear Relationships, from Philip McDonnell

I would like to offer some simple thoughts on non-linear relationships. The usual way to study non-linear correlations is to transform one or more of the variables in question. For example, if we have reason to believe that the underlying process is multiplicative, then we can use a log function to model our data. When we do a correlation or regression of y~x we can just take the transformed variables ln(y)~ln(x) as our new data set. We are still doing a linear correlation or a linear regression, but now we are doing it on the transformed variables.
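As a quick sketch of the log-log trick (my own illustration, with made-up data following a power law y = c·x^b with multiplicative noise):

```python
import numpy as np

# Hypothetical multiplicative data: y = c * x^b with lognormal noise.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, 500)
y = 2.0 * x**1.5 * np.exp(rng.normal(0.0, 0.1, 500))

# Ordinary linear regression on the transformed variables ln(y) ~ ln(x):
# the fitted slope recovers the exponent b, the intercept recovers ln(c).
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
print(slope, np.exp(intercept))  # close to b = 1.5 and c = 2.0
```

The fit itself is perfectly linear; all the non-linearity lives in the transform.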

Ideally we would know the form of the non-linear relationship from some theory. Absent that, we could use a general functional form such as a polynomial, so our transform could be something like X^2, X^3, or X^4. Using one of these terms is usually pretty safe. But combining them in a multiple regression can be problematic. The reason is that the terms x^2 and x^3 are about 67% correlated. Using highly correlated variables to model or predict some third variable is a bad idea because you cannot trust the statistics you get.
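A quick numerical check of this multicollinearity (my own illustration; note that the exact correlation between x^2 and x^3 depends on the range of x: on a purely positive range the two terms are nearly perfectly collinear, while on a range symmetric about zero they are nearly uncorrelated):

```python
import numpy as np

rng = np.random.default_rng(1)

# The correlation between x^2 and x^3 depends entirely on where x lives.
x_pos = rng.uniform(0.0, 1.0, 10_000)    # positive range: nearly collinear
x_sym = rng.uniform(-1.0, 1.0, 10_000)   # symmetric range: nearly uncorrelated

corr_pos = np.corrcoef(x_pos**2, x_pos**3)[0, 1]
corr_sym = np.corrcoef(x_sym**2, x_sym**3)[0, 1]
print(corr_pos, corr_sym)
```

Either way, whenever the regressors are strongly correlated, the individual coefficient statistics become untrustworthy.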

One way around that is to use orthogonal polynomials or functions. We have previously discussed Fourier transforms and Chebyshev polynomials. Both of these classes are orthogonal, which also means that we can fit a few terms and add or delete terms at will: the fitted coefficients will not change if we truncate or extend the series, provided the fit respects the basis's orthogonality (the right weighting or sample points). Each term is guaranteed to be linearly independent of the others.
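To see the truncation-stability property numerically, here is a minimal sketch using NumPy's Chebyshev routines. The caveat is the node choice: the discrete orthogonality of the Chebyshev polynomials holds at the Chebyshev-Gauss points, and it is that orthogonality that keeps the low-order coefficients fixed when higher terms are added:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Sample a test function at the Chebyshev-Gauss nodes on [-1, 1], where
# the Chebyshev polynomials satisfy a discrete orthogonality relation.
N = 32
k = np.arange(N)
x = np.cos(np.pi * (k + 0.5) / N)
y = np.exp(x)

# Fit degree-3 and degree-6 Chebyshev series to the same data.
c3 = C.chebfit(x, y, 3)
c6 = C.chebfit(x, y, 6)

# Because the basis is orthogonal at these nodes, the low-order
# coefficients do not move when higher-order terms are added.
print(np.allclose(c3, c6[:4]))
```

With ordinary powers x, x^2, x^3 the analogous experiment fails: every coefficient shifts when a term is added or dropped.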

I have a question.

One of the reasons for adding regressors is to take into account all possible reasons behind a move in the variable we are trying to explain. However, multicollinearity is prevalent in finance, and it is a source of headaches.

If we could randomize and/or design experiments for our empirical studies, as is done in biology, we could get rid of part of the problem.

Is it possible to randomize ex post? Let's say I want to study Y = aX + b + e. If, instead of taking the full history of observed (Y,X), I take a random sample of (Y,X), it creates some kind of post-randomization, which should reduce the impact of other factors.

Does this make sense? Of course, we would lose all the information contained in the non-sampled (Y,X). That means even less data to work with, which is not nice with ever-changing cycles.
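For what it is worth, a small simulation (my own, with made-up numbers) suggests random subsampling does not remove the influence of an omitted factor: the confounder travels with every sampled row, so the subsample slope carries the same bias as the full-history slope, only with more noise:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical linear data y = a*x + b + e, plus a confounding factor z
# that also moves y and is itself correlated with x.
n = 5_000
x = rng.normal(size=n)
z = 0.8 * x + 0.6 * rng.normal(size=n)          # z correlated with x
y = 2.0 * x + 1.0 + 0.5 * z + 0.3 * rng.normal(size=n)

def ols_slope(xv, yv):
    """OLS slope of yv on xv."""
    return np.cov(xv, yv)[0, 1] / np.var(xv, ddof=1)

# Slope on the full history vs. on a random 10% subsample.
idx = rng.choice(n, size=n // 10, replace=False)
full_slope = ols_slope(x, y)
sub_slope = ols_slope(x[idx], y[idx])
print(full_slope, sub_slope)
```

Both slopes converge to about 2.4 rather than the true a = 2.0, because the omitted z is correlated with x; drawing a random subset changes the variance of the estimate, not its bias. True randomization works in biology because the experimenter assigns x, breaking its link with everything else.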

## Rich Ghazarian mentions:

And of course if you want a more powerful model, you fit a copula to your processes, and now you are in a more realistic dependence structure. Engle has a nice paper on Dynamic Conditional Correlation that may interest dependence modelers on the list. The use of Excel correlation, Pearson correlation, linear correlation … these must be the biggest flaws in quant finance today.
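As a concrete (and much simplified) illustration of the copula point, here is the rank-based "normal scores" step of fitting a Gaussian copula. This is my own sketch, not Engle's DCC model, and the marginals below are invented for the example:

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(3)

# Two hypothetical series with a Gaussian dependence structure (rho = 0.6)
# but very different marginals (lognormal and heavy-tailed).
n, rho = 4_000, 0.6
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
a = np.exp(z[:, 0])          # lognormal marginal
b = z[:, 1] ** 3             # heavy-tailed marginal

# Gaussian-copula fit in two steps: map each series to uniforms via
# ranks, then to normal scores, and correlate the scores.
u = rankdata(a) / (n + 1)
v = rankdata(b) / (n + 1)
rho_hat = np.corrcoef(norm.ppf(u), norm.ppf(v))[0, 1]

# Ordinary Pearson correlation on the raw series understates the link,
# because it is distorted by the non-linear marginals.
pearson = np.corrcoef(a, b)[0, 1]
print(rho_hat, pearson)
```

The rank step makes the dependence estimate invariant to any monotonic distortion of the marginals, which is exactly what raw Pearson correlation is not.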

With linear functions we can compute the eigenvectors to get an orthogonal representation. One problem that gets in the way of non-linear models is that it isn't clear what the appropriate "distance" measurement is. You need a formal metric of distance to model, compare, or optimize anything. How far apart are these points?

With linear axes, distance is determined by Pythagoras. But what is suggested for the underlying measure of distance if the axes aren't linear?
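The linear eigenvector route can be sketched as follows (my own illustration). Because the eigenvector matrix is an orthogonal rotation, the Pythagorean distance survives the change of representation, which is exactly what stops being automatic once the axes are non-linear:

```python
import numpy as np

rng = np.random.default_rng(4)

# Correlated 2-D data; the covariance eigenvectors give an orthogonal
# (decorrelated) representation -- this is just PCA.
X = rng.multivariate_normal([0, 0], [[2.0, 1.2], [1.2, 1.0]], size=1_000)
Xc = X - X.mean(axis=0)
evals, evecs = np.linalg.eigh(np.cov(Xc.T))
Y = Xc @ evecs                         # rotated coordinates

# The rotated coordinates are uncorrelated ...
print(np.round(np.cov(Y.T), 3))

# ... and, because the rotation is orthogonal, Pythagorean distances
# between points are unchanged.
d_old = np.linalg.norm(Xc[0] - Xc[1])
d_new = np.linalg.norm(Y[0] - Y[1])
print(np.isclose(d_old, d_new))
```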

These remarks about correlation resonate with me, especially in the case of the stock market.

## From Vincent Andres:

If you replace your original axes X and Y by new axes X'=fx(X) and Y'=fy(Y), this is a transformation of the kind P=(x,y) -> P'=f(P)=(x',y')=(fx(x), fy(y)).

This transformation can be inverted without worry: P'=(x',y') -> P=(x,y), where x and y are the antecedents of x' and y' through the inverse functions fx^-1 and fy^-1.

A "natural" suggested distance measure in this new universe is thus: dist(P1, P2) = dist(ant(P1), ant(P2)), where ant denotes the antecedent.

This works for all monotonic functions fx and fy (e.g., ln(x), or x^2 on a positive domain) because there is a strict bijection between the two universes. It could even do something for a larger class of functions.
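A small sketch of this antecedent-based distance, with hypothetical axis maps fx = ln and fy = y^2 (restricted to positive y so the maps stay bijective):

```python
import numpy as np

# Vincent's suggestion, sketched with hypothetical monotonic axis maps:
# fx = ln on the x-axis, fy = squaring on the (positive) y-axis.
fx_inv = np.exp            # antecedent of x' under fx = ln
fy_inv = np.sqrt           # antecedent of y' under fy = y^2 (y > 0)

def dist(p1, p2):
    """Distance between transformed points, measured between antecedents."""
    a1 = np.array([fx_inv(p1[0]), fy_inv(p1[1])])
    a2 = np.array([fx_inv(p2[0]), fy_inv(p2[1])])
    return float(np.linalg.norm(a1 - a2))

# Two points in the transformed (x', y') universe:
p1 = (np.log(1.0), 4.0)    # antecedent (1, 2)
p2 = (np.log(4.0), 9.0)    # antecedent (4, 3)
print(dist(p1, p2))        # Euclidean distance between (1, 2) and (4, 3)
```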

Sorry for the difficult notation, but I hope the idea is clear.

# A Zero Sum Question, from Agustin Gonzalez

I am a big follower of your writings and philosophical thoughts. I have a question that I have never gotten a good answer to, so I decided to pose it to your brilliant minds!

Are trading gains and losses considered a zero-sum situation? For example, when Amaranth lost \$6 billion in less than one week, does that mean that investors on the other side of the trade made \$6 billion?

This might be a very simple question but I can’t really seem to figure it out, nor do I get a consistent answer from any of the people that I ask.

I found a paper, "The Winners and Losers of the Zero-Sum Game," by Larry Harris (author of Trading and Exchanges).

I have always believed that trading futures is a zero-sum game. If this is incorrect, please be kind enough to clarify, and thank you.

## Steve Leslie offers:

Something can only be zero-sum if it is frictionless. There is no perfect machine; they all expend energy of some sort.

If, in a private transaction, I sell you something and you buy it, then it is zero sum: 100% of the money changes hands.

Einstein said all matter in the universe remains constant. That is not to say that it does not take intermediate forms.

Although I do not trade futures, the Chair and others are the experts there. I believe in his book he mentions that futures have the lowest costs attached to them; in the world of intangibles it is the “cleanest of transactions,” as it eliminates the spreads. Please feel free to correct me if I am wrong. The big boys screw the little guys by manipulating the markets, as described eloquently once again by the Chair, when the Bank of Japan would put in buy programs and sell programs on currencies. My guess is that the Federal Reserve can do the same by adding money to the system and taking it out.

Forex has its costs in the form of pips.

In securities, of course, there is a transaction cost. You pay commissions, and in stocks there is a bid and an ask. Spreads are killers in options: a 2.4 bid and a 2.6 ask is a spread of roughly 8% of the price, right there. Tack on handling fees and the math is rough.

Forget real estate, seemingly everybody in the world gets a piece of that action, be it from title searches, broker fees, impact fees, etc.

Also, do not forget taxes! You sell something for a profit and governments, state and Fed. want a piece of the action.

The rules of engagement are against the player from the start. That is why the investor needs to be wary and not overtrade, to control costs and taxes.

Exchanges are like poker games in casinos. For every hand there is a “rake”; for example, in a \$100 pot the house may drag \$5 of it off the table. Put in a dealer tip of \$3 and the player who wins the pot gets \$92 of the \$100 that was in play. If no new money is added to the table, the game will eventually fold due to lack of funds; it will have all ended in the house’s coffers. In a private home game, if you win a \$100 pot you keep the full amount, “no rake, no toke.” There you have to guard against team play, cheats, and slippage due to betting mistakes.