# Coming back from behind

## Alex Castaldo writes:

Here's the skinny, from Math Puzzles Volume 1 by Presh Talwalkar (doc here). It came up on a nature walk, originally to stretch Aubrey's mind: the odds of a comeback victory.

Consider two teams, A and B, that are completely evenly matched. Given that a team is behind in score at half time, what is the probability that it will overcome the deficit and win the game? Assume the first half and the second half are independent events. Presh solves it logically as follows:

Since the two teams are evenly matched, it is equally likely that the trailing team will score enough points to overcome the deficit or that it will not. For example, the event of falling behind by 6 points in a half happens with the same probability as gaining 6 points in a half. He concludes the probability is 0.25.
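One way to spell out that symmetry argument (a sketch added here, under the extra assumptions that the two half margins are independent, identically distributed, continuous, and symmetric about zero, so ties can be ignored): let $M_1$ and $M_2$ be the trailing team's point margins in the two halves. Being behind at the half means $M_1 < 0$, and a comeback win means $M_1 + M_2 > 0$. By symmetry and exchangeability, the four events $\{M_2 > |M_1|\}$, $\{M_2 < -|M_1|\}$, $\{M_1 > |M_2|\}$, $\{M_1 < -|M_2|\}$ each have probability $1/4$, and given $M_2 > |M_1|$ the sign of $M_1$ is still a fair coin flip, so

$$
P(\text{comeback win} \mid \text{behind at half})
  = \frac{P(M_2 > -M_1,\ M_1 < 0)}{P(M_1 < 0)}
  = \frac{1/8}{1/2} = \frac{1}{4}.
$$

The Monte Carlo below takes a slightly different cut: it conditions on a specific point deficit rather than on merely being behind, so its numbers need not equal 0.25.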

Now we posted the empirical results from basketball games, many others have given the empirical results for football games, and I gave some results for the markets. This seems to be of interest to everyone; it had the most views of any post, and it was good for 7 or 8 points today. Let's have your discussion and solution of this problem. Presh says the answer is 0.25 both empirically (NFL in 1995) and logically.

## Jared Albert writes:

In a game with two teams, where team 1's first-round advantage varies from zero up to all the points available in the second round, the probabilities of team 0 coming from behind to win are in the following array, with 20 available points in the second round:

[0.49, 0.306, 0.22, 0.129, 0.09, 0.03, 0.018, 0.011, 0.004, 0.002, 0.002, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

For example, if the teams are even going into the second round with 20 available points, there is a .490 chance that team 0 wins; with a one-point advantage to team 1 at the start of round 2, team 0 wins .306 of the time; with 2 points to team 1, team 0 wins .220 of the time, etc.

Here's the Monte Carlo:

```python
import numpy as np

np.random.seed(10)

out_list = []
count = 1000        # simulated games per starting deficit
win = 1
lose = 0
team0_start = 0     # team 0 always starts the second half with no points
team1_start = 0
size = 20           # points available in the second half

def runs():
    # Points a team scores in the second half: 20 fair 0/1 "point flips".
    return np.sum(np.random.choice([win, lose], size=size, replace=True))

def outcome(team1_start, count=count, team0_start=team0_start):
    # In how many of `count` games does team 0 finish strictly ahead of team 1?
    results = []
    for _ in range(count):
        team0_end = runs() + team0_start
        team1_end = runs() + team1_start
        came_from_behind = team0_end > team1_end
        results.append(came_from_behind)
    return sum(results)

# Sweep team 1's starting advantage from 0 up to 19 points.
for i in range(size):
    out_list.append(outcome(team1_start=i) / count)

print(f'outlist: {out_list}')
```

## Victor Niederhoffer writes:

Up your alley, I think. We have done something similar for the market with real empirical results. The unconditional probability is much less than 20%.

## Stephen Stigler writes:

I am sure you know but I repeat anyway:

1) The simple calculations ignore correlation between teams.

2) They also ignore information on the distribution of changes.

3) Calculations using the distribution of changes are not hard (a sketch follows this list).

4) But the information about the probability of extreme events is not well determined, so such calculations can be inaccurate.

5) In any case, markets, unlike sports, are not zero-sum games.
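A minimal sketch of point (3), assuming one already has samples of half-time margins and second-half changes from actual games (the arrays below are made-up normal placeholders, not real data), and deliberately ignoring the correlation caveat in point (1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder samples; in a real calculation these would be empirical
# half-time margins and second-half changes taken from game data.
halftime_margins = rng.normal(0, 7, size=5000)      # team's lead (+) or deficit (-) at the half
second_half_changes = rng.normal(0, 7, size=5000)   # change in that margin over the second half

# Condition on being behind at the half, then pair each deficit with an
# independently drawn second-half change and see how often the game is won.
deficits = halftime_margins[halftime_margins < 0]
changes = rng.choice(second_half_changes, size=len(deficits), replace=True)
p_comeback = np.mean(deficits + changes > 0)

print(f"estimated P(come back and win | behind at the half) = {p_comeback:.3f}")
```

With real samples, the same few lines would reflect whatever asymmetry or fat tails the data actually have, which is where the caveat in point (4) about extreme events comes in.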

# Chez Galton, from Steve Stigler

October 1, 2018

Took a pilgrimage to a hallowed place.

Best,

Steve

[The plaque reads: Sir Francis Galton, 1822-1911, explorer, statistician, founder of eugenics, lived here for fifty years.]

# Explanation of the Hot Hand from Stigler, from Dan Grossman

Can this be explained in words so that a reader like me can understand it? The question arises: how to explain this to a normal person, not a statistician?

"The ‘Hot Hand’ Debate Gets Flipped on Its Head: A new paper shows how a simple coin toss may prove that basketball players really can get hot"

Prof. Stigler?

## Steve Stigler writes in:

Here's one take.

It comes from averaging relative frequencies over different numbers of trials.
Here are the possibilities for n=4 and the relative frequency of H following directly after H:

HHHH 3/3=1
HHHT 2/3=.67
HHTH 1/2=.5
HTHH 1/2=.5
THHH 2/2=1
HHTT 1/2=.5
HTHT 0/2=0
THHT 1/2=.5
HTTH 0/1=0
THTH 0/1=0
TTHH 1/1=1
HTTT 0/1=0
THTT 0/1=0
TTHT 0/1=0
TTTH 0/0 undefined
TTTT 0/0 undefined

Total relative frequency = 5.67; the average over the 14 cases that give data is 5.67/14 = .40, even though the number of successes is 12 out of the 24 cases of looking at the result after an H.
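Here is a brute-force check of that enumeration (a short sketch added here): it averages, over all 2^n equally likely sequences, the within-sequence relative frequency of an H immediately following an H, skipping the sequences where that frequency is undefined.

```python
from itertools import product

def hot_hand_average(n=4):
    """Average, over all equally likely length-n coin sequences, of the relative
    frequency of H immediately following an H (undefined cases skipped)."""
    rel_freqs = []
    for seq in product("HT", repeat=n):
        # Opportunities: an H in any of the first n-1 positions.
        opportunities = sum(1 for i in range(n - 1) if seq[i] == "H")
        if opportunities == 0:
            continue  # e.g. TTTH, TTTT: no H is ever followed by anything
        successes = sum(1 for i in range(n - 1) if seq[i] == "H" and seq[i + 1] == "H")
        rel_freqs.append(successes / opportunities)
    return sum(rel_freqs) / len(rel_freqs)

print(round(hot_hand_average(4), 3))  # about 0.405, matching the .40 above
```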

But isn't there a conditional probability explanation for this from ordinary statistics, of a Bayesian or simply conditional nature? It seems like a simple math-team problem.

# The Parable of Google Flu: Traps in Big Data Analysis, from Steve Stigler

It seems like computer science people are finally discovering ever-changing cycles.

Do you believe ever changing cycles are related to the regression fallacy in any systematic way?

## Steve Stigler writes:

No. Ever-changing cycles are a real change, a reaction. But regression is only a selection effect. Google Flu works. Google changes to add ads around it and draws more clicks. Google Flu stops working. Analysts adjust. Etc.

# Great Example of the Regression Fallacy, from Victor Niederhoffer

October 26, 2011

A Congressional Budget Office report released today shows that from 1979 to 2007, after-tax income grew by 275 percent for the top 1 percent of households, compared with 18 percent for the bottom 20 percent. Bloomberg News.

Great example of the regression fallacy. Those who happen to be in the top 1% in 2007 necessarily grew more in percentage terms than the typical person, and much more than those in the bottom 20%.

And isn't that also true for trees, the top ones pulling nutrition from the bottom ones and growing from the inside out and higher simultaneously? Hence why utilities, as regulated monopolies, are the most stable: they have eaten all the competition.

## Stephen Stigler writes:

A reaction to the report which says "A Congressional Budget Office report released today shows that from 1979 to 2007, after-tax income grew by 275 percent for the top 1 percent of households, compared with 18 percent for the bottom 20 percent."

First, it is not clear what they actually did, but you can be pretty sure they did not find what the item says. That would involve following a very large number of people and their individual after-tax income over a 28 year period, and I do not believe such data are available - and even if they were available there would be a serious problem of definition - do they mean top 1% in 1979? Or top 1% in 2007? These were not the same people - in fact, some of the top 1% in 2007 were in the bottom 20% in 1979, including probably Steve Jobs.

So what they probably did was just take the average after-tax income of the top 1% in 1979 and compare to that of the different group of people who were top 1% in 2007. Now, as I say, these were different people by and large, with an unreported overlap to be sure.

Some in the 2007 group would be testimony to the opportunities that allowed them to improve from even the bottom 20% in 1979, a change that some might think possibly admirable and certainly non-discriminatory.

Other problems are that the bottom 20% includes not just the undeniably poor, but also the young and not yet successful, and the comparison at the top compares the pre-Reagan tax cut era (when there were huge incentives to keep income out of the tax calculation) to the peak boom year, when rates were relatively lower and the incentives to hide income much less.

This is not really a regression effect, but a different type of selection fallacy. Had they really taken the top 1% in 2007 and followed them individually back to 1979, similarly with the bottom 20%, that would have produced a regression effect. But I doubt they could do that. Anyway, who cares what Mark Zuckerberg was making in 1979? He was born in 1984!
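A small synthetic sketch of the distinction (a toy model with made-up parameters, not the CBO data): incomes here have a persistent component plus independent year-to-year luck and do not grow at all, yet the "snapshot" comparison of two differently selected top-1% groups looks flat while following the same people forward shows the regression effect Stigler describes.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Toy incomes: persistent "ability" plus independent yearly luck; no real growth.
ability = rng.normal(0.0, 1.0, n)
income_1979 = np.exp(10 + ability + rng.normal(0.0, 0.5, n))
income_2007 = np.exp(10 + ability + rng.normal(0.0, 0.5, n))

top1_1979 = income_1979 >= np.quantile(income_1979, 0.99)
top1_2007 = income_2007 >= np.quantile(income_2007, 0.99)
bottom20_1979 = income_1979 <= np.quantile(income_1979, 0.20)

# Snapshot: compare two *different* selected groups (what the report likely did).
snapshot = income_2007[top1_2007].mean() / income_1979[top1_1979].mean() - 1

# Longitudinal: follow the *same* 1979 groups forward (a true regression effect).
followed_top = income_2007[top1_1979].mean() / income_1979[top1_1979].mean() - 1
followed_bottom = income_2007[bottom20_1979].mean() / income_1979[bottom20_1979].mean() - 1

print(f"snapshot top-1% 'growth':        {snapshot:+.1%}")
print(f"followed 1979 top-1% growth:     {followed_top:+.1%}")
print(f"followed 1979 bottom-20% growth: {followed_bottom:+.1%}")
```

The two comparisons answer different questions even when nothing changes at all: the snapshot shows roughly no change, while the followed 1979 top 1% regresses down and the followed bottom 20% regresses up.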

# Galton and the History of Counting, shared by Bill Rafter

Very interesting article on Galton:

One, two, many: The prehistory of counting

The Victorian idea that "primitive" tribes can't count has cast a long shadow over efforts to understand the origins of mathematics

LOOKING back, Francis Galton would call it "our most difficult day". It was 4 March 1851, and the young English explorer was beginning to appreciate the obstacles confronting his attempts to map out the Lake Ngami region of south-western Africa. Struggling to navigate a narrow ridge of jagged rock, his wagon had "crashed and thundered and thumped" while his oxen "charged like wild buffaloes".

To make matters worse, Galton had little faith in his local guides from the Damara tribe, who appeared to lack even an understanding of basic arithmetic - a situation Galton found "very annoying". He recounts that having established an exchange rate of one sheep for two sticks of tobacco, he handed four sticks to a local herdsman in the expectation of purchasing two sheep. Having put two sticks in front of the first sheep, the man seemed surprised that two sticks remained to pay for the second. "His mind got hazy and confused," Galton reported, and the transaction had to be abandoned and the sheep purchased separately.

As further evidence of the apparent ignorance of the Damara, Galton wrote that they "use no numeral greater than three" and that they managed to keep track of their oxen only by recognising their faces, rather than by counting them. At a most inopportune time for his expedition, Galton seemed to have stumbled into a world without numbers.

To a modern reader, these tales in Galton's 1853 Narrative of an Explorer in Tropical South Africa seem little more than pithy anecdotes that reflect his prejudices as a gentleman of the growing Victorian empire. (His preoccupation with the supposed inferiority of other peoples persisted in his later work in eugenics.) Within 10 years, however, those same reports of primitive innumeracy were being used by the finest scientific minds of Victorian Britain to glimpse the savage condition of prehistoric humans.

## Victor Niederhoffer writes:

It seems wrong to criticize Galton like this. What am I missing?

## Steve Stigler writes:

Vic,

The author is a 1st year PhD student at Princeton who isn't even working on Galton, and writes carelessly without knowledge. See his bio. He looks bright but has a lot to learn.

# Correspondence With Steve Stigler, from Victor Niederhoffer

An article with highly defective statistical reasoning appears in the WSJ purporting to show that there is tremendous turnover among the top 25 companies from 10 years ago to today. It is best to consider this by noting how many of the top 25 baseball players by batting average in 1999 are still there today; that is the best way to get a handle on the regression bias implicit in such studies. A small simulation sketch of that bias appears below, and the correspondence between Vic and Steve Stigler that follows puts it in perspective.
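As a rough illustration of the regression bias (a synthetic sketch with made-up parameters, not the WSJ data): give every company a fixed underlying quality plus independent decade-to-decade noise, so that nothing real changes, and look at how much turnover the top-25 list still shows.

```python
import numpy as np

rng = np.random.default_rng(7)
n_companies, top_k = 500, 25

# Fixed quality plus independent noise each decade; no company actually changes.
quality = rng.normal(0.0, 1.0, n_companies)
metric_then = quality + rng.normal(0.0, 1.0, n_companies)
metric_now = quality + rng.normal(0.0, 1.0, n_companies)

top_then = set(np.argsort(metric_then)[-top_k:])
top_now = set(np.argsort(metric_now)[-top_k:])

survivors = len(top_then & top_now)
print(f"{survivors} of the top {top_k} a decade ago are still in the top {top_k}")
```

Even with nothing changing underneath, a noisy ranking turns over a large part of its top 25, so turnover by itself says little; that is the point of the batting-average comparison above.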

Dear Steve,

Hope you and family are well. Merry Christmas. Here's an unusual aspect of the regression bias. I wonder to what extent this is consistent with properties of random numbers and to what extent it represents a change in the level of skill, or in this case price change. It would be an interesting study. I was pleased that my daughter Kira got into Columbia Engineering School on early admission, and another daughter had a grandson, Wilder Niederhoffer, named after a Libertarian. Best, Vic

Dear Vic,

This is the same as Horace Secrist's "The Triumph of Mediocrity in Business" (1933), covered in Chapter 8 of my book Statistics on the Table.

Congrats to all your successful avoidance of and tendency towards mediocrity in your descendants!

I attach my remarks from a memorial for Rose Friedman Dec 12.

Have a happy new year! Steve

Market cap 10 year free market survivors: old fashioned energy (big oil), food (Walmart), and technology (human invention) lead the list and boring consumer Gillette (razor blades for everyman). Things the human race needs for survival and to thrive will always be investments that will endure.

Big government wants to bleed big oil, keep food out of banking (probably best that Walmart was denied a key to the club after all), litigate tech for the halibut. And what can be done to Gillette? Maybe the healthcare bill can attach itself to some personal care products and bleed a little off. Also, energy, food, and the stuff everyone buys is what the government tries to melt out of the inflation indexes. So what we need is what they attack and thus make more expensive to us, and then they tell us that it's not more expensive and to just substitute chicken for meat.

GE? What do they do other than derivatives and green initiatives, which are starting to turn a little brown?

## Kim Zussman replies:

From the "getting little things right but big ones wrong" department:

What were dinosaurs long at the K-T boundary?

Prior decade's math/science PhDs lemming into finance.

New decade definition of "lemming": the process of the best and brightest young people flowing en masse to the latest promising nascent bubble.