In 1827, Scottish botanist Robert Brown reported the seemingly random jittering of tiny particles ejected from pollen. These particles wiggled under Brown's microscope, meandering in no particular direction. This random movement came to be known as "Brownian motion." In 1905, a Swiss patent clerk named Albert Einstein published a mathematical theory that describes Brownian motion. Einstein's theory of Brownian motion contains many characteristics of a random walk. Later models and theories would preserve this link between particle motion and randomness.
There is an interesting philosophical point about introducing the concept of randomness into the modeling of the physical world. Randomness was not always seen as an essential aspect of nature. In classical physics, the motions of atoms and molecules can be viewed as deterministic. If one knows the location and velocity of each particle in a system, and understands in detail the forces between these particles, it ought to be possible to predict all future particle configurations.
There are several problems with this idea. First, it is very difficult to understand at the required level of detail all of the interactions between molecules. Second, even if you knew these forces exactly, working out an exact solution to the motion of all the molecules in the system is mathematically intractable. So even if you view the world as an entirely deterministic one, you have no hope of modeling it in this way. Nineteenth-century scientists such as Maxwell, Boltzmann, and Gibbs applied probability concepts to physics as a way of getting around this problem. It was foolish to try to predict the trajectory of a single particle, but if you considered many different particles with a range of trajectories, you could devise models that predicted the likely range of outcomes for the group. Imagining microscopic processes as random led to macroscopic order. This was the foundation of the kinetic theory of gases and the field of statistical mechanics. It was a powerful idea. The use of randomness to understand and model the physical world played a critical role in the course of science and technology in the 20th century.
But what does this have to do with sports? Individual events in sports seem deterministic, and yet are quite difficult to predict. Attempting to understand and describe sports without a healthy respect for random processes is hopeless. And yet a large amount of sports commentary ignores randomness.
The random walk provides a powerful way to visualize the role of randomness. When we view events through the lens of probability theory, outcomes that look puzzling start to make more sense. Let's take a peek at what randomness looks like, to try to better understand its influence in sports.
Visualizing randomness: flipping coins
Let's consider a simple random process -- the flipping of a coin. Imagine a game where a coin is flipped 20 times. Each time the coin lands heads, player A is awarded a point. Each time the coin lands tails, player B is awarded a point. It is a simple and completely random game.
The figure below depicts a single game consisting of 20 coin flips between players A and B. The vertical axis shows how many points player A is ahead (or behind); positive numbers correspond to situations where player A is ahead. The horizontal axis represents the number of flips. Following from left to right allows us to track the progress of a game. The blue line shows the trajectory of a single game. (More about the black lines in a minute.) Following the blue line, we see that player B gets off to an early lead. After nine flips, player B is ahead by three points. Starting from the tenth flip, player A goes on a run, scoring on all but two of the remaining flips, and winning easily.
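A trajectory like the one described can be generated in a few lines. This is a minimal sketch, not the code behind the original figure: the function name `play_game` and the seed are assumptions chosen for illustration, and a different seed gives a different game.

```python
import random

def play_game(n_flips=20, rng=None):
    """Return player A's lead after each flip, starting from 0 before any flips."""
    rng = rng or random.Random()
    lead = 0
    trajectory = [lead]
    for _ in range(n_flips):
        # Heads: player A scores (+1). Tails: player B scores (-1).
        lead += 1 if rng.random() < 0.5 else -1
        trajectory.append(lead)
    return trajectory

# The seed is arbitrary; it only makes the example repeatable.
trajectory = play_game(20, random.Random(1827))
print(trajectory)
```

Plotting `trajectory` against the flip number gives one colored line of the kind shown in the figures; calling `play_game` repeatedly gives a family of them.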
We can easily imagine players A and B play this same game many times. With each new game, each player has an equal chance to win before the game starts. The graph below illustrates the results for ten games, where each game is depicted by a different colored line. As in the first graph, the black lines do not represent games, and will be explained below. The graph shows a range of outcomes for the games. Sometimes the games are close. Other times a player wins easily.
There are a few important characteristics of random processes of the types illustrated in the figures above. With 20 coin flips, there are approximately one million possible game trajectories that can take place. Each of these game trajectories has an equal probability of occurring. Only one trajectory can end with a score of 20-0, while many can end with a score of 10-10. If we average over all of the possible games that players A and B could have, the expected point difference is zero. This is true whether the game consists of twenty coin flips or one thousand.
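These counts can be checked exactly rather than by simulation. The sketch below assumes only that each flip is independent and fair; the variable names are illustrative.

```python
from math import comb

n_flips = 20
total_trajectories = 2 ** n_flips      # each flip doubles the number of paths
ending_20_0 = comb(n_flips, 20)        # player A wins every flip
ending_10_10 = comb(n_flips, 10)       # player A wins exactly 10 of 20 flips

# Expected margin: weight each final margin (A's points minus B's)
# by the number of trajectories that produce it.
expected_margin = sum(
    (2 * heads - n_flips) * comb(n_flips, heads)
    for heads in range(n_flips + 1)
) / total_trajectories

print(total_trajectories)  # 1048576
print(ending_20_0)         # 1
print(ending_10_10)        # 184756
print(expected_margin)     # 0.0
```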
Just because zero is the expected point differential, it does not mean that all of the individual games will have a point difference of zero. Non-zero scores are quite common. One way to think of this point difference is as a distance. The theory behind random walks tells us something important about this distance. On average, the magnitude of this distance from zero is proportional to the square root of the number of coin flips. The black lines in the figures above plot the positive and negative square root of the number of coin flips, illustrating how this changes as the number of coin flips increases.
Without having to get too deep into the math, we can think of the black lines as enclosing a typical range of results for the coin flipping game. Games can fall within this range. Games can also end up outside of this range. But games ending up within this range are very common, and should be seen as typical and unsurprising. For games consisting of 20 flips, one player winning by four points is not unusual, as the square root of 20 is approximately 4.5. (With an even number of flips, the final margin is always an even number.)
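For readers who do want a little of the math, the square-root rule can be checked exactly for the 20-flip game. The sketch below reads "on average" in the root-mean-square sense, which is an assumption about the intended meaning; `margin_distribution` is an illustrative helper, not something from the original analysis.

```python
from math import comb, sqrt

def margin_distribution(n_flips):
    """Map each final margin (A's points minus B's) to its exact probability."""
    total = 2 ** n_flips
    return {
        2 * heads - n_flips: comb(n_flips, heads) / total
        for heads in range(n_flips + 1)
    }

n_flips = 20
dist = margin_distribution(n_flips)

# Root-mean-square margin across all possible games.
rms_margin = sqrt(sum(m * m * p for m, p in dist.items()))

# Share of games whose final margin lies between the two black lines.
inside = sum(p for m, p in dist.items() if abs(m) <= sqrt(n_flips))

print(round(rms_margin, 3))  # 4.472
print(round(inside, 3))      # 0.737
```

Under these assumptions, roughly three games in four finish inside the black lines, and about one in four finishes outside them.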
Luck doesn't really even out
Fans of a team on the wrong side of a few bad breaks often comfort themselves with the phrase, "luck has a way of evening out." At least some people must actually believe this. Where does this belief come from? For some, it may come from a belief in outside forces that alter fate. But for many, it probably comes from a poor understanding of probability theory. At least for random processes, luck shouldn't be expected to even out.
Let's again use as an example our coin flipping game, a game where the outcome is completely determined by luck. At the start of the game, each player has an equal chance to win, and the expected point difference for the game is zero. Now let's imagine that player A starts off on a hot streak, scoring points on the first five coin flips. After the first five flips, the score is 5-0. While player B may comfort himself by saying, "luck has a way of evening out," more likely than not he is going to lose. If the score is 5-0 after the first five flips, the expected point difference after the remaining fifteen flips is five. Player A is expected to win. He will not always win, but he will be more likely to win than to lose.
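The 5-0 scenario can be worked out exactly by enumerating how many of the remaining fifteen flips land heads. This is a minimal sketch assuming fair, independent flips; the variable names are illustrative.

```python
from math import comb

lead = 5         # player A's lead after the first five flips
remaining = 15   # flips still to come
total = 2 ** remaining

p_win = p_tie = p_lose = 0.0
expected_final_lead = 0.0
for heads in range(remaining + 1):
    p = comb(remaining, heads) / total
    # A gains a point per head, B gains a point per tail.
    final_lead = lead + heads - (remaining - heads)
    expected_final_lead += final_lead * p
    if final_lead > 0:
        p_win += p
    elif final_lead == 0:
        p_tie += p
    else:
        p_lose += p

print(round(expected_final_lead, 6))  # 5.0
print(round(p_win, 3))                # 0.849
print(round(p_tie, 3))                # 0.092
print(round(p_lose, 3))               # 0.059
```

The remaining flips are still fair, so they neither add to nor subtract from the lead on average. The early luck is simply kept.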
It is important to keep this in mind: random processes do not divide positive outcomes between the participants. Quite often, luck will look more favorably on one side than the other.
Sports as a weighted random outcome generator
(Image source: xkcd.com.)
The coin flipping game is a useful way to relate to randomness, but there is a limit as to how far we can take this idea when trying to describe the outcomes of sporting events. There are only a handful of things in sports that are truly random 50/50 propositions - actual coin flips, which team falls on a fumble, Shaquille O'Neal free throws - whereas most other events do not work like this. For many events, the odds are something other than one to one. But changing the odds doesn't remove randomness from the picture.
In a somewhat more general case, the proper metaphor is a biased coin. A biased coin is the type of coin used by someone who is crooked. However it might be constructed, a biased coin is a coin that lands heads with a probability that is something other than 50%. Rather than going to the trouble of constructing a biased coin, it is much easier to use a computer to construct some sort of weighted outcome generator, or to use the techniques of probability theory to analyze these sorts of situations.
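A weighted outcome generator of this kind takes only a line or two on a computer. The sketch below is one common way to build it, not a reference to any particular tool: the name `weighted_outcome`, the 60% bias, and the seed are all hypothetical.

```python
import random

def weighted_outcome(p_success, rng):
    """Return True with probability p_success, like one flip of a biased coin."""
    return rng.random() < p_success

rng = random.Random(2024)  # arbitrary seed, used only for repeatability
p_heads = 0.6              # hypothetical bias
flips = [weighted_outcome(p_heads, rng) for _ in range(100_000)]
observed = sum(flips) / len(flips)
print(observed)  # close to 0.6, though rarely exactly 0.6
```

The same function covers any event with fixed odds: pass in a different `p_success` and each call is one attempt.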
One situation that can be reasonably described as a weighted random outcome generator is shooting free throws in basketball. Some free throw shooters are better than others. On average, a better shooter makes more free throws than one that is not as good. But a good free throw shooter will still miss, and these misses will appear to be random. Missed free throws for a good free throw shooter are often due to some small and almost undetectable error. We can probably view these sorts of errors as almost completely random, without missing much of importance. Bad free throw shooters are often inconsistent in their approach, making the results subject to randomness as well.
Another situation where this thinking clearly applies is in baseball, in the batter-pitcher matchup. Here there is more to consider: a batter who hits .300 will not hit for this batting average in every situation. Perhaps the pitcher is not a good matchup for him. Perhaps the park is not good for hitters. Perhaps the wind is in his face, the sun is in his eyes, and the mother-in-law is in his house. These combined effects turn him into a .260 hitter, so we put .260 into the weighted random outcome generator, sit back, and see what happens.
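One way to "see what happens" is to compute the exact spread of outcomes instead of simulating them. The sketch below assumes every at-bat is an independent try with the same .260 chance of a hit, which is a simplification; the 100 at-bat sample and the helper name `prob_at_least` are hypothetical.

```python
from math import comb

def prob_at_least(hits, at_bats, p_hit):
    """Exact probability of `hits` or more successes in `at_bats` independent tries."""
    return sum(
        comb(at_bats, k) * p_hit ** k * (1 - p_hit) ** (at_bats - k)
        for k in range(hits, at_bats + 1)
    )

at_bats = 100   # hypothetical sample size
p_hit = 0.260   # the adjusted hitting probability from the text

# Chance that a true .260 hitter bats .300 or better over this stretch.
p_looks_like_300 = prob_at_least(30, at_bats, p_hit)
print(round(p_looks_like_300, 3))
```

Under these assumptions, a true .260 hitter posts a .300 stretch fairly often, purely by chance.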
Noise in the data
One effect of all of this randomness is that events are hard to interpret. When a field goal kicker misses the winning kick, we often attribute it to lack of guts. With so much randomness involved in individual events, it is hard to know if this is true. Randomness places a lot of noise in the data, while the signal measuring the kicker's ability to rise to the occasion is rather weak.
Randomness makes it really hard to draw conclusions without a lot of data. You may not like this -- as someone who analyzes sports in his free time, I don't like this -- but that is beside the point. Randomness is simply a fact of life.
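The need for a lot of data can be made concrete with the standard error of an observed success rate, which shrinks only with the square root of the number of attempts. The sketch below assumes independent attempts with a fixed success probability; the 80% kicker is a hypothetical figure, not one taken from real data.

```python
from math import sqrt

def standard_error(p, n):
    """Standard error of an observed success rate over n independent attempts."""
    return sqrt(p * (1 - p) / n)

p_true = 0.8  # hypothetical kicker who makes 80% of his kicks
for attempts in (10, 100, 1000):
    print(attempts, round(standard_error(p_true, attempts), 3))
# 10 0.126
# 100 0.04
# 1000 0.013
```

With ten kicks, the noise is about 13 percentage points, far larger than any plausible effect of pressure. A hundred times as many kicks shrinks it only tenfold.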
But while randomness takes something away from the analyst, it gives so much more to the sports fan. It gives us moments like this.
"THE SHOT" Tate George from Burrell full court pass 1 second left (via chiefroc1)
And like this.
Flutie's Miracle in Miami (via CBS)
So we ought to accept, understand, and embrace randomness. It is hard to imagine that sports would be as much fun if everything were preordained and predictable. Without the possibility of surprise, it is hard to imagine that we would care.