This is volume two of Off the Podium. If you have not read about statistical inference in Volume 1 (its definition and broader interpretation), it is recommended as the lens through which to see this post.
In complete honesty I’m using dice to teach about statistical inference, not in the sense the phrase alea iacta est is commonly taken - a point of no return. Though once students see, that is understand, statistical inference, there is no returning to a life blind to what they now know.
We roll dice in class because we generally accept that the fate (and therefore outcome) of a dice roll is random (assuming fair dice are being rolled), and because the outcomes follow a population (universal) probability distribution. Each set of rolls is a sample, and each sample testifies to the population probability distribution. The larger the sample, the greater the testimony. But even several small samples can testify to the population distribution.
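The claim that a larger sample gives greater testimony can be sketched with a quick simulation - a minimal illustration in Python, where the seed and sample sizes are arbitrary choices of mine, not anything prescribed by the post:

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the illustration is reproducible

def frequencies(n_rolls):
    """Roll a fair die n_rolls times and return each face's relative frequency."""
    counts = Counter(random.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

small = frequencies(60)       # a small sample: rough testimony
large = frequencies(60_000)   # a large sample: strong testimony

# Worst-case deviation of each sample from the true per-face probability of 1/6
dev_small = max(abs(p - 1/6) for p in small.values())
dev_large = max(abs(p - 1/6) for p in large.values())
print(dev_small, dev_large)  # the large sample hugs 1/6 far more tightly
```

Even the small sample points toward the uniform distribution; the large sample simply points there with far less scatter.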
An important thought on the randomness of a dice roll, though: it is an epistemological randomness, meaning it appears random because we don’t have all the knowledge we would need to know the outcome of a roll. We don’t have all the information we would need about the rather deterministic underlying mechanics - the initial position and all the forces acting on the dice, first in the hand and then in the throw. And even if we did, depending on the number of dice, we don’t have the math to calculate those outcomes. In contrast, ontological randomness would describe a truly random die - one for which, despite identical initial conditions and mechanics, the outcome could still go one way or the other. When we throw dice in class we accept that the result is random and perhaps even that each die is fair (more below). Whether the randomness is epistemological or ontological, it produces a probability distribution as a universal characteristic of the die that we may be interested in learning and thus knowing.
A note on die vs. dice: According to the Grammarist, in modern English dice can be used as plural or singular, but originally die is singular and dice is plural. I’ll try to stick with the original die (singular) and dice (plural).
The proclamation that a die is fair is an interesting phrase. There are two ways to think about this fairness. The first coincides with our sense of justice, a true value judgement (but not on the inanimate object itself - not ontological but epistemological). This use of the phrase comes from game play or gambling. It is fair (or just) that everyone knows the probability distribution of the die or dice being utilized. It does not need to be the case that each face of the cube turns up, on average with repeated rolls, with equal probability. Fairness in this sense is not in the outcome per se, but in the expectations of the people using the dice based on their knowledge of that probability distribution. If I am rolling three dice for a game and I expect that all faces have equal probability, then I accept that 10.5 is the central tendency (mean, median, mode) of a sample of sums - the larger the sample, the more likely it is that the sample mean converges on 10.5 (with 10 and 11 equally probable as individual sums). Based on my understanding of that probability distribution I expect 18 to occur rarely, but at least as frequently (with the same probability) as 3 - the extremes of the sample space creating the range of the entire probability distribution. However, if I believe the three dice have that probability distribution but they actually have a different one (say, weighted to roll a six 83% of the time), then my game play or bets will be badly misled by the actual distribution of the dice. That is not just; it is not fair. But if I know what the probability distribution of the dice actually is, then I can base my game play and my bets on it. That is just; it is fair.
The second sense of fairness of a die is not a value judgement on an inanimate object (it is not about justice). It uses the word fairness to communicate the probability distribution: this die is constructed in such a way that it has equal likelihood of landing on each face - equal probability of showing a 1, 2, 3, 4, 5 or 6. When you roll two fair dice, the central tendency of the sum is 7; with three fair dice the central tendency is 10.5 (a 10 and an 11 are equally likely). And the distribution of the sum is symmetric and approximately normal - for a roll of three dice there’s an equal chance of rolling 11 or more as there is of rolling 10 or less.
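These claims about three fair dice don’t need to be taken on faith - they can be checked by exact enumeration of all 216 equally likely outcomes. A small Python sketch:

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# Enumerate all 6**3 = 216 equally likely outcomes of three fair dice
counts = Counter(sum(roll) for roll in product(range(1, 7), repeat=3))
total = 6 ** 3

p = {s: Fraction(n, total) for s, n in counts.items()}

# 10 and 11 are the equally probable modes, straddling the mean of 10.5
assert p[10] == p[11] == Fraction(27, 216)

# The distribution is symmetric: 11-or-more is exactly as likely as 10-or-less
p_high = sum(prob for s, prob in p.items() if s >= 11)
p_low = sum(prob for s, prob in p.items() if s <= 10)
assert p_high == p_low == Fraction(1, 2)

# The extremes of the sample space are rare but equally probable
assert p[3] == p[18] == Fraction(1, 216)
```

Using Fraction keeps the probabilities exact, so the symmetry shows up as strict equality rather than floating-point near-equality.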
My guess is that all of this makes complete sense to you. Which is simply to say that statistical inference makes complete sense to you. Each roll of the dice (a particular) will sum to one number. The probability of any particular number is determined by the probability distribution of the dice being rolled which is a characteristic of the population created by continued rolls of these dice (including other dice constructed the same way). If you have a die and you don’t know what its probability distribution is, you simply need to observe many rolls, record the number rolled, and then compute the distribution as a histogram (frequency plot) or make a frequency table (the number of times each number was rolled divided by the total number of rolls). You need to gain experience with that die by rolling it, over and over, making note of the outcome, logging it, analyzing it, and thinking about it (reflecting on it). That is both induction and statistical inference all at once. That is learning - you are learning the probability distribution of that die. If it was constructed in a way to be fair (the second sense of that phrase) then the distribution will be uniform (each face has an equal probability of 1/6).
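The roll-log-tally procedure described above can be sketched directly. Here is a minimal Python version; the weighted die (a six ten times as likely as any other face) and the seed are purely hypothetical choices for illustration - the point is that the frequency table is the inductive estimate of the die’s distribution:

```python
import random
from collections import Counter

random.seed(42)  # reproducible illustration

# A hypothetical die whose distribution we pretend not to know,
# secretly weighted so a 6 is ten times as likely as any other face.
faces = [1, 2, 3, 4, 5, 6]
weights = [1, 1, 1, 1, 1, 10]  # assumed weighting, purely illustrative

def roll():
    return random.choices(faces, weights=weights)[0]

# Gain experience with the die: roll it over and over, logging each outcome
n_rolls = 10_000
log = [roll() for _ in range(n_rolls)]

# The frequency table is the estimate of the probability distribution
freq_table = {face: count / n_rolls
              for face, count in sorted(Counter(log).items())}
for face, freq in freq_table.items():
    print(f"{face}: {freq:.3f}")
```

Run this and the table exposes the die immediately: the observed frequency of a 6 sits near its true probability of 2/3, nowhere near the 1/6 a fair die would show.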
Two years ago I started having students roll dice in class. It’s not that it took me 22 years to realize that dice rolling was a great example of statistical inference. It’s that it took me 22 years to realize that actually doing dice rolling would help a student interact with, and come to terms with, the intuitions they already possessed regarding statistical inference. I learned that it was worthwhile to spend class time rolling dice, analyzing that data and interpreting it, as opposed to simply saying “dice rolling” and expecting everyone to immediately see that this is something they understand fundamentally, and that this is statistical inference, and that therefore they understand statistical inference, fundamentally. Shame on me.
May this post be read by past students who did not have the chance to cast dice with me, as a way for me to redeem myself and make good on their education in statistical inference! (And for a limited time I can give you dice that I have cast for free, just send me lots of money for shipping - enough to break the inertia required for me to pack them up, add your address, and bring them to the UPS store :)
An important point about statistical inference for a clinician is the need to think in terms of observations (particulars) becoming samples, samples having means, populations having distributions, and then, …, back to particulars. The samples, sample means and populations are an accumulation of particulars. That’s the challenge - that last bit, particulars. There’s no question (or at least I’ve been convinced) that induction and statistical inference are how we learn (I was convinced long ago reading Holland et al., Induction: Processes of Inference, Learning, and Discovery). Incidentally, I came to read Holland’s books and papers after reading Crichton’s two books in the Jurassic Park series (not the movies, the books). Crichton gives credit to Holland in the acknowledgements of “The Lost World”, the second book. Holland, along with many other scholars, was foundational to Crichton’s understanding of adaptation in natural and artificial systems. I could digress further at this point on adaptation as a fundamental concept for physical therapy (and many areas of inquiry), or on the caveat that, despite my belief and claim that induction is how we learn, I still don’t consider myself an empiricist overall. Noting these digressions here is mostly a seed planted for future posts…. For now I move on.
Particulars are our practical point of focus when we engage with the world - for the physical therapist, when we practice. An accumulation of particulars subjected to inductive inference is how we learn about particulars, which is how we make sense of, or interpret, particulars. The probability of rolling a 1 with one die is the same as rolling a 4 - so no surprise when either particular is observed. The probability of rolling a 2 when two dice are rolled is less than the probability of rolling an 8, so a 2 is less common - but certainly possible. Rolling a 1 when two dice are rolled is impossible. If we accept a die balancing on its edge as a roll of 0, then rolling a 1 with two dice becomes highly improbable rather than impossible, and a 0 (both dice on edge) even less probable.
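The two-dice claims above follow from counting the 36 equally likely outcomes - a short Python check (the edge-balancing case is left out, since its probability isn’t something we can enumerate):

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# All 36 equally likely outcomes of two fair dice
counts = Counter(sum(roll) for roll in product(range(1, 7), repeat=2))

def p(total):
    """Exact probability that two fair dice sum to `total`."""
    return Fraction(counts.get(total, 0), 36)

assert p(2) == Fraction(1, 36)   # only (1, 1) sums to 2
assert p(8) == Fraction(5, 36)   # (2,6), (3,5), (4,4), (5,3), (6,2)
assert p(8) > p(2)               # an 8 is five times as common as a 2
assert p(1) == 0                 # a sum of 1 is impossible with two dice
```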
The entire process of examination - not only in physical therapy but examining anything - is based on applying what you know about probability distributions to the particular you are examining. When I’m doing a word study of a Greek word in the New Testament, the first thing I learn is how frequently that word is utilized in the NT. How rare is it to see that word? Next, how frequently it is used this same way (noun, adjective, with the same accents, context, etc). When you read a story, you’re confronting a particular and examining whether it is coherent, whether it is surprising, whether it seems plausible or possible. Those are all judgements based on a prior process of induction, of statistical inference.
When it comes to statistical inference and induction in your thinking - I’m reminded of an anonymous quote on what was my favorite bench to sit on in the South Campus Quad of UMass Lowell - “Does a fish know it’s wet?”
Inferential statistics is a process whereby statistical inference separates from induction. It involves the use of probability to quantify induction so that an understanding of (albeit estimated) probability distributions can then be used to quantify the probability of events having occurred, given some assumptions.
Inferential statistics are often utilized to test the assumption that there is no difference in the probability distribution of events that are systematically observed under controlled conditions that include only one known difference between samples (i.e. an experiment). Let’s say one set of observations includes the particular outcomes of rolling 5 dice, and the second set includes the particular outcomes of rolling 6 dice. With one roll of 5 dice and one roll of 6 dice, there is a nontrivial chance that the single roll of 5 dice will sum to a larger number than the single roll of 6 dice. However, with repeated rolls, the sample mean of the sums of the 5 dice will eventually prove to be smaller than the sample mean of the sums of the 6 dice. The roll of 5 dice has a different probability distribution than the roll of 6 dice; the difference is subtle enough that it takes many rolls to see reliably. Having recently done this experiment twice with 50 rolls (well, having students in two sections of my class do this experiment) - I can say with confidence that 50 rolls is enough to demonstrate a difference of approximately 4 in the sum between rolling 5 dice and rolling 4 dice, with a 95% confidence interval between approximately 2 (low estimate) and 6 (high estimate). The more samples, the more accurate the estimate of 4 (the true expected difference is 3.5, one die’s mean), and the narrower those confidence intervals get. The point is, we’re pretty confident that there is a difference in the effect (the outcome of the sum) of rolling 5 dice versus 4 dice, and we’re pretty confident that the magnitude of that effect is about 4. Not a very revolutionary finding. But the insights, which are the same insights required to understand any experiment written up in any journal, were revolutionary when they took root in these past 100 years or so. And the ability to understand this concept can lead to revolutionary insights.
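The classroom experiment can be simulated in a few lines of Python. This is a sketch, not a record of the class data: the seed is arbitrary, and the 95% confidence interval uses a simple normal approximation for the difference between two independent sample means:

```python
import random
import statistics

random.seed(7)  # reproducible illustration

def roll_sum(n_dice):
    """Sum of one roll of n_dice fair dice."""
    return sum(random.randint(1, 6) for _ in range(n_dice))

n_rolls = 50
sums_4 = [roll_sum(4) for _ in range(n_rolls)]
sums_5 = [roll_sum(5) for _ in range(n_rolls)]

# Estimated effect: difference between the two sample means
diff = statistics.mean(sums_5) - statistics.mean(sums_4)

# Standard error of the difference between two independent sample means
se = (statistics.variance(sums_4) / n_rolls
      + statistics.variance(sums_5) / n_rolls) ** 0.5

# Approximate 95% confidence interval (normal approximation, z = 1.96)
low, high = diff - 1.96 * se, diff + 1.96 * se
print(f"difference: {diff:.2f}, 95% CI: ({low:.2f}, {high:.2f})")
```

With 50 rolls per group the estimate lands near the true difference of 3.5, with an interval wide enough to remind you that 50 rolls is evidence, not certainty; rerun with a larger `n_rolls` and watch the interval narrow.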
One more insight before closing. Did you ever question that rolling 5 dice would sum to more than rolling 4 dice, on average, with more and more rolls? If not - and I suspect not - doesn’t that testify to you already having used induction, from your limited dice rolling experience, to know something about the probability distribution of dice rolling? And did you ever consider the possibility that we could have been describing 4 weighted dice being rolled vs. 5 “fair” dice? Doesn’t that testify to the power of assumptions and their ability to create bias? But had I told you that the 4 dice in our experiments had a higher sum than the 5 dice after 50 rolls, wouldn’t you have suspected that there was something strange - some alternative explanation - about these particular 4 dice or 5 dice? Doesn’t that testify to the power of induction and statistical inference as a corrective to the power of assumption and bias in our reasoning?
So we’ll keep rolling the dice in clinical inquiry. The exact outcome of a roll is random. But we can create systematic differences in the situation and context of rolling that simulate experimental (or even observational) investigations - such as the number of dice being rolled, or dice with different underlying probability distributions, or some combination of these. My hope is that everyone’s understanding of induction and statistical inference will be more than random, that it will be systematic and a useful approach to understanding the world in which they operate and a corrective to their reasoning.
Alea iacta est.
So my takeaway: avoid the craps tables in Vegas and reread Gladwell's "Blink"