by Trey Pruitt
I recently enjoyed reading the 1983 classic Adventures in the Screen Trade by William Goldman (1931-2018), who wrote many screenplays including Butch Cassidy and the Sundance Kid1. In the book, Goldman famously observed that when it comes to predicting the success of a movie, "nobody knows anything." I don't think he was implying that studio executives are stupid; it’s just that it’s very difficult to foresee whether a movie will be a hit or a flop. While Butch Cassidy and the Sundance Kid is one of the highest-grossing movies of all time2, many of the other scripts that Goldman wrote were box-office failures. They seemed like promising projects at the start but didn’t turn out well, a situation that often frustrates artists and businesspeople alike.
"Nobody knows anything... Not one person in the entire motion picture field knows for a certainty what's going to work. Every time out it's a guess and, if you're lucky, an educated one."
More than thirty-five years after Goldman's statement, accurate predictions of box office success still seem to elude Hollywood. Consider the releases of the western-themed movies Django Unchained and The Lone Ranger. Django was a hit, whereas The Lone Ranger lost more than $100 million, despite a proven team including no less a star than Johnny Depp. All of this has made me wonder: given the difficulty of accurately predicting the future, is it worth the effort? And if so, are we any better at making predictions than we were three decades ago?
I believe the answer to both questions -- at least from a business perspective -- is "yes". Will a start-up investment turn into a billion-dollar company? Will next year’s revenue be more than $100 million? Are users of a website more likely to click a green or orange button? Developing good answers to questions like these is critical for business success. Two things are different thirty years later: our ability to collect and analyze data that will improve our chances of getting a prediction right and, perhaps less well-known but equally important, our ability to rapidly gather new data to test hypotheses.
My appreciation for good predictions is influenced by my perspective as a business consultant who helps companies use data and analysis to make better decisions. Most of the predictions we encounter in day-to-day life are really just for entertainment purposes. As stats-head and political blogger Nate Silver points out in his book The Signal and the Noise, the prognostications of talking heads on shows like The McLaughlin Group or Fox News are actually worse than flipping a coin. Perhaps it's because these commentators have little to gain by being right and are not held accountable for being wrong. But for the rest of us, predictions are critical to making decisions. And decisions have real consequences.
So how can we make educated guesses about the future that are more accurate than the flip of a coin? The answer can be "in the data", but businesspeople must tread cautiously here or risk getting swept up in the so-called big data euphoria. More data is often helpful, but it is important to realize that bigger isn't always better. I often see companies spending so much time collecting, managing and wrangling data that they lose sight of the original purpose. More important than the sheer volume of the data are the questions: Will this new data help us make better predictions? Will it change our decisions?
When it comes to gaining insight about the future, I will always choose precious data over big data. What do I mean by "precious data"? It is data that is finite (and sometimes non-obvious) but gets to the heart of the matter. I'm talking about causal data, which is discovered by asking smart questions that reveal the true drivers of a variable of interest. For most business questions, even a small amount of causal data can make the difference between good predictions and bad, whereas adding "big data" can sometimes just serve to increase the noise. In fact, most of my firm's work in predictive analytics uses less than a terabyte of data (typically less than a billion rows), something that appears to be true for other data scientists as well.
Another advantage we have over the studio executives of thirty years ago is that we are now able to test our hypotheses more cheaply and easily than ever. In fact, the cost of conducting experiments and learning something about what might happen in the future is dramatically lower than it was even five years ago2. In the past, conducting a basic survey took weeks and cost thousands of dollars; now, we can use readily available tools to test and revise hypotheses based on new information, thus improving our predictions by reducing uncertainty about key variables. In fact, at this point, website optimization tools3 are so cheap and easy to use that there is really no excuse not to use them to inform any major e-commerce business decision.
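To make the green-versus-orange button question concrete, here is a minimal sketch of one common way to read the result of such a test: a two-proportion z-test on click rates. This is only an illustration under stated assumptions, not a description of how any particular optimization tool works. The function name is hypothetical and the visitor and click counts are invented for the example.

```python
import math


def two_proportion_z_test(clicks_a, visitors_a, clicks_b, visitors_b):
    """Compare two click rates; return (z statistic, two-sided p-value).

    Assumes each visitor saw exactly one button variant and that
    samples are large enough for the normal approximation to hold.
    """
    rate_a = clicks_a / visitors_a
    rate_b = clicks_b / visitors_b
    # Pooled click rate under the hypothesis that the variants do not differ.
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Invented numbers for illustration only: green is variant A, orange is B.
    z, p = two_proportion_z_test(clicks_a=120, visitors_a=2400,
                                 clicks_b=156, visitors_b=2400)
    print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference in click rates is unlikely to be chance alone. Whether that difference is large enough to change a decision is still a business judgment.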
So what can you do to improve predictions in your organization?
Data-driven forecasting isn't a panacea, but in a world where nobody still knows anything, better predictions can make a real difference.
1 Goldman also wrote the novel and screenplay The Princess Bride, one of my favorites.
2 Butch Cassidy and the Sundance Kid is #34 on the list of "all-time" highest box-office grossing movies in the U.S., after adjusting for ticket price inflation. Gone with the Wind and Star Wars are #1 and #2. See the entire list at [Box Office Mojo](http://boxofficemojo.com/alltime/adjusted.htm).