# Longitudinal Analysis, from Victor Niederhoffer

April 8, 2013

The book Applied Longitudinal Data Analysis for Epidemiology by Jos Twisk is a useful and accessible review for anyone who studies series of repeated measurements on the same subject, whether a person, a stock price, a company's earnings, or a market, over time. The methods discussed take into account that a high value in one period is not followed by an independently drawn high or low value in subsequent periods, i.e. the observations on a subject are not independent. Such studies are usually found in the medical field, where patients are given some treatment and the effects are measured over time. A good example would be how the various components of diet affect the health of a group over time.

There are numerous examples in our field that spring to mind. How do the fortunes of companies evolve over time based on their earnings? How do the fortunes of different markets develop over time? What are the factors that influence the standings and consistency of performance of baseball teams across a season? The book contains methods that are accessible to anyone who has had a basic statistics course and is interested in time series. It starts each chapter with several easily understood examples of longitudinal studies with a few measurements of an outcome, like weight gain, for each subject. It then shows how to analyze the data when the outcome is continuous or dichotomous, when it depends on time or on other predictor variables, and when the measurements are spaced equally or unequally. It gives a graphical example of each technique, then the model used to analyze the data, then the statistical output from a standard software program, then an explanation of the results.

The techniques used are almost always simple extensions of regression using three or four basic computations: the sum of squares within a subject, the sum of squares between subjects, the sum of squares between groups of subjects at different time periods, and the simple linear regression of how these variations are related to other variables.
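To make the first two of these computations concrete, here is a minimal sketch in Python. The data are made up for illustration and the function name `sums_of_squares` is my own, not the book's; it simply splits the total variation into a between-subject part and a within-subject part.

```python
from statistics import mean

# Hypothetical data: each subject has three repeated measurements.
data = {
    "A": [10.0, 12.0, 14.0],
    "B": [20.0, 21.0, 22.0],
    "C": [15.0, 15.0, 18.0],
}

def sums_of_squares(data):
    """Split the total sum of squares into between- and within-subject parts."""
    all_values = [v for values in data.values() for v in values]
    grand_mean = mean(all_values)
    # Total variation of every observation around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Variation of subject means around the grand mean, weighted by count.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in data.values()
    )
    # Variation of each observation around its own subject's mean.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in data.values()
        for v in values
    )
    return ss_total, ss_between, ss_within

ss_total, ss_between, ss_within = sums_of_squares(data)
print(ss_total, ss_between, ss_within)
```

When most of the total sits in the between-subject part, as it does in this toy data, observations on the same subject resemble each other far more than observations on different subjects, which is exactly the dependence these methods are built to handle.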

The notation is very accessible, with hardly any nested subscripts or unusual Greek letters.

There are chapters and sections on the design of experiments, nonparametric analysis of the observations, relations with other variables, how to define change and ferret out causality, dealing with missing data, tracking the observations over time, and calculating the sample size needed for a reasonable likelihood of finding a significant difference. The techniques used are generally those that appeal to one's common sense. They are simple extensions of the t test for measurements between a few groups, and of measures of the stability or time dependence of the subjects over time. Rank correlations of the Spearman and Kendall kind are frequently used.
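As one illustration of tracking with a rank correlation, the sketch below computes Spearman's coefficient between each subject's first and last measurement. The numbers are invented and the helper names are mine; this is one simple way to ask whether subjects keep their relative position over time, not necessarily the book's exact procedure.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical first and last measurements for five subjects.
first = [10.0, 20.0, 15.0, 30.0, 25.0]
last = [12.0, 17.0, 18.0, 35.0, 22.0]
rho = spearman(first, last)
print(rho)
```

A coefficient near 1 says that subjects who started high tended to finish high, which is the sense of "tracking" used here.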

The way to do all these analyses with the software packages Stata, SPSS, SPIDA, and MLwiN is shown for all examples. There is a nice discussion of how to use two relatively new techniques, GEE (generalized estimating equations) and random coefficient analysis. Both take account of the dependence among repeated observations on the same subject; random coefficient analysis does so by allowing different intercepts and slopes for the subjects under study.
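The idea of different slopes and intercepts can be written down in one line. The notation below is my own shorthand for a generic random coefficient model and may differ from the book's: $Y_{it}$ is the outcome for subject $i$ at time $t$, $\beta_0$ and $\beta_1$ are the average intercept and slope, and $b_{0i}$ and $b_{1i}$ are subject $i$'s deviations from those averages.

$$
Y_{it} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t + \varepsilon_{it}
$$

Setting $b_{1i} = 0$ for every subject gives the simpler random intercept model, in which subjects differ in level but share a common slope.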

I found the book very educational as a review of how to use simple statistical techniques for the study of change over time. The analyses are very different from those used for survival studies. They have great applicability to the kinds of things we study in markets or psychology but are rarely used. All of the techniques and analyses could also be carried out with repeated random samplings from the data. I highly recommend the book as a learning tool for students of change, and a model for teaching modern and accessible methods of statistical analysis.
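A rough sketch of the repeated random sampling idea follows. It resamples whole subjects with replacement, so each subject's measurements stay together, and reports a percentile interval for the mean change from first to last measurement. The data, the function name `bootstrap_mean_change`, and the choice of statistic are all my own assumptions for illustration.

```python
import random
from statistics import mean

# Hypothetical data: each subject has three repeated measurements.
series = {
    "A": [10.0, 12.0, 14.0],
    "B": [20.0, 21.0, 22.0],
    "C": [15.0, 15.0, 18.0],
    "D": [30.0, 28.0, 27.0],
    "E": [8.0, 9.0, 13.0],
}

def change(values):
    """Change from the first to the last measurement of one subject."""
    return values[-1] - values[0]

def bootstrap_mean_change(series, n_resamples=2000, seed=0):
    """Percentile bootstrap interval for the mean first-to-last change."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    subjects = list(series)
    observed = mean(change(series[s]) for s in subjects)
    estimates = []
    for _ in range(n_resamples):
        # Draw whole subjects with replacement; their series stay intact.
        sample = rng.choices(subjects, k=len(subjects))
        estimates.append(mean(change(series[s]) for s in sample))
    estimates.sort()
    low = estimates[int(0.025 * n_resamples)]
    high = estimates[int(0.975 * n_resamples) - 1]
    return observed, low, high

observed, low, high = bootstrap_mean_change(series)
print(observed, low, high)
```

Resampling subjects rather than individual observations is the design choice that matters: shuffling single measurements would destroy the within-subject dependence that the whole book is about.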