Predicting The Oscars

We’re putting our Oscar prediction streak on the line – will we be right again this year?

“Essentially, all models are wrong, but some are useful.” - George Edward Pelham Box

 

Models are built to understand things that happen in the real world, and then to use that understanding to make a decision or predict an outcome. They are estimates of processes that can’t be fully understood and, as such, will miss the mark from time to time. Useful models built on big data are already a part of our everyday lives. If you’ve found a movie or book you loved through a Netflix or Amazon.com recommendation, then you have benefited from a useful model.

 

At Compass Point Media, our analytics team builds models to predict outcomes that are important to our clients: foot traffic at retail locations, website conversions, and viable test markets for new marketing initiatives. When building these models, we often leverage machine-learning techniques that take a given data set, make it larger and more diverse through sampling, and then build an ensemble (a crowd) of predictive models whose outputs are aggregated into the final prediction. Many of our models have been very useful, helping us predict critical business outcomes based on media mix assumptions, or helping us decide which ad to serve a prospective customer based on their historical purchase behavior.

 

Last year, we demonstrated the power of this approach by correctly predicting the winners of the four major categories of the Academy Awards – Best Picture, Best Director, Best Actor and Best Actress. This year, we’re putting that winning streak on the line with four new predictions for the 2016 Awards:

 

BEST PICTURE = THE BIG SHORT

Model probability of winning – 65%

Las Vegas probability – 60%

Key Indicators:

Winning Best Picture at the Producers Guild Awards, winning the Golden Globe for Best Drama, nomination for Best Director, IMDb user rating

 

BEST DIRECTOR = ALEJANDRO GONZÁLEZ IÑÁRRITU
(THE REVENANT)

Model probability of winning – 90%

Las Vegas probability – 71%

Key Indicators:

Winning Best Director at the Directors Guild Awards, total Oscar nominations, previous Oscar wins

 

BEST ACTOR = LEONARDO DICAPRIO
(THE REVENANT)

Model probability of winning – 83%

Las Vegas probability – 98%

Key Indicators:

Winning Best Actor (Drama) at the Golden Globes, winning Best Actor at the Screen Actors Guild Awards, age of actor in release year

 

BEST ACTRESS = BRIE LARSON
(ROOM)

Model probability of winning – 67%

Las Vegas probability – 91%

Key Indicators:

Winning Best Actress (Drama) at the Golden Globes, winning Best Actress at the Screen Actors Guild Awards, age of actress in release year

 

The Science

To generate our predictions, we used machine-learning algorithms to land on the probable outcomes for each major nomination category.

[Figure: static1.squarespace.png]

As pictured above, the algorithms allow us to optimize model complexity and error by balancing bias and variance. An ensemble of simpler models is aggregated into the best possible estimate.  
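To make the idea of resampling and aggregating concrete, here is a minimal sketch of a bagged ensemble. It is not our production model. The training data is synthetic, the three indicator names (won_guild_award, won_golden_globe, nominated_director) are hypothetical stand-ins for the kinds of key indicators listed above, and plain logistic regression is assumed as the simple base model purely for illustration.

```python
import numpy as np

def add_bias(X):
    """Prepend a column of ones so the first weight acts as an intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_logistic(X, y, steps=500, lr=0.1):
    """Fit a simple logistic regression by gradient descent."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_logistic(w, X):
    """Return each row's predicted probability of winning."""
    return 1.0 / (1.0 + np.exp(-add_bias(X) @ w))

def bagged_probabilities(X_train, y_train, X_new, n_models=50, seed=0):
    """Train one simple model per bootstrap sample and average their outputs."""
    rng = np.random.default_rng(seed)
    n = len(y_train)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        w = fit_logistic(X_train[idx], y_train[idx])
        preds.append(predict_logistic(w, X_new))
    avg = np.mean(preds, axis=0)
    # Only one nominee can win, so rescale the scores to sum to 1.
    return avg / avg.sum()

# Synthetic history (NOT real Oscar data). Columns are hypothetical 0/1 flags:
# won_guild_award, won_golden_globe, nominated_director.
rng = np.random.default_rng(42)
X_train = rng.integers(0, 2, size=(200, 3)).astype(float)
logits = -2.0 + 2.5 * X_train[:, 0] + 1.0 * X_train[:, 1] + 0.5 * X_train[:, 2]
y_train = (rng.random(200) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

# Five made-up nominees in one category, described by the same three flags.
nominees = np.array(
    [[1, 1, 1], [0, 1, 0], [0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=float
)
probs = bagged_probabilities(X_train, y_train, nominees)
print(np.round(probs, 3))
```

Each individual model is simple and sees a slightly different resampled version of the history. Averaging them is what smooths out the variance of any single fit, which is the balance between bias and variance described above.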

 

Although the science works hard, the Oscars often provide surprises. This year, the Best Picture category is a toss-up. The Revenant, The Big Short, and Spotlight are all strong contenders, with our models favoring The Big Short slightly over The Revenant. Las Vegas shows almost equal odds for all three pictures as of February 9th, with The Revenant moving up fast.

Sooner or later, our model will be wrong. Will it be this year? The 88th Academy Awards are on Sunday, February 28, 2016. Tune in to find out who will bring home the hardware and check back to see how accurate our predictions were!
 

DATA SOURCES:

Academy Awards Database:  http://awardsdatabase.oscars.org/ampas_awards/BasicSearchInput.jsp

Internet Movie Database:  http://www.imdb.com/

Roger Ebert Reviews:  http://www.rogerebert.com/

Screen Actors Guild:  http://www.sagaftra.org/

Producers Guild:  http://www.producersguild.org/

Directors Guild:  http://www.dga.org/

 

References:

Pardoe, I. and Simonton, D. K. (2008) Applying Discrete Choice Models to Predict Academy Award Winners, J. R. Statist. Soc. A, 171, Part 2, pp. 375-394

http://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff

http://www.statsoft.com/Textbook/Multivariate-Adaptive-Regression-Splines

https://en.wikipedia.org/wiki/George_E._P._Box

https://en.wikipedia.org/wiki/All_models_are_wrong

 

Big Surprises by category:

BEST PICTURE:

The Departed – 2006

Won with only 7% probability
 

BEST ACTOR

Denzel Washington – Training Day 2001

Won with 9% probability
 

BEST ACTRESS

Marion Cotillard – La Vie en Rose 2007

Won with 8% probability
 

BEST DIRECTOR

Roman Polanski – The Pianist 2002

Won with 7% probability