How do we gauge diversity in the film industry? A movie has many facets: its onscreen performers, its offscreen creative and technical talent, the story it tells, its budget, box-office returns, MPAA rating, and the list goes on.

There's enough movie data that we can start to comprehensively quantify the state of diversity in the film industry, past and present. The figures above reflect some of the major factors: gender, race, and ethnicity metrics in cast, crew, and script. Movie, poster, and cast/crew name data are from TMDB via themoviedb-javascript-library. Reload this page, and maybe you'll see how your favorite movie fares in its diversity.

The "Gender Count" in the figures above indicates the estimated number of women among the top-billed cast and crew names. For this prototype, "top-billed" means the first ten members of the cast, or the entire cast if it has fewer than ten members. The same rule applies to the crew. Gender data is based on calculations from Genderize. Note that we have not localized this data yet, so gender approximations may be off for foreign films.
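The counting rule above can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the lookup table GENDER_BY_FIRST_NAME is a hypothetical stand-in for per-name estimates such as those Genderize provides, and the sample names and the function names top_billed and gender_count are made up for the example.

```python
# Hypothetical stand-in for per-name gender estimates (for example, cached
# results from a service such as Genderize). Names and values are made up.
GENDER_BY_FIRST_NAME = {
    "maria": "female",
    "john": "male",
    "aiko": "female",
}

TOP_BILLED_LIMIT = 10  # "top-billed" = first ten names, or fewer if the list is short


def top_billed(names, limit=TOP_BILLED_LIMIT):
    """Return the first `limit` names, or all of them if the list is shorter."""
    return names[:limit]


def gender_count(names, gender_lookup=GENDER_BY_FIRST_NAME):
    """Count top-billed names whose first name is estimated to be female."""
    count = 0
    for full_name in top_billed(names):
        parts = full_name.split()
        if not parts:
            continue  # skip empty name strings
        # Assumption: the first whitespace-separated token is the given name.
        if gender_lookup.get(parts[0].lower()) == "female":
            count += 1
    return count


cast = ["Maria Lopez", "John Smith", "Aiko Tanaka", "Sam Unknown"]
print(gender_count(cast))  # 2
```

Names missing from the lookup (like "Sam" here) are simply not counted, which mirrors the caveat above that approximations may be off when the name data does not cover a film's cast.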

Gender (stats forthcoming):

The "Minority Count" of racial and ethnic minorities is trickier. For this prototype, we used demographic data for surnames that occurred more than 100 times in the most recently available U.S. Census. We then counted the top-billed cast and crew names whose surnames are found in this dataset. Finally, of this subset, we counted the number of surnames that have a greater than 50% chance of belonging to a non-white person. Many surnames, particularly for foreign films, are not in this dataset.
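As a rough sketch of the three steps just described, assuming the census data has been loaded into a dictionary: the table SURNAME_STATS, its made-up numbers, the helper surname_of, and the choice to treat the last word of a name as the surname are all assumptions for illustration, not the prototype's real data or code.

```python
# Hypothetical surname table: surname -> (occurrences, share of bearers who
# are non-white). The numbers are made up and are NOT real census values.
SURNAME_STATS = {
    "alvarez": (5000, 0.90),
    "miller": (8000, 0.20),
    "okafor": (150, 0.97),
    "rarename": (40, 0.99),  # too rare: does not occur more than 100 times
}


def surname_of(full_name):
    """Assumption: the last whitespace-separated token is the surname."""
    parts = full_name.split()
    return parts[-1].lower() if parts else ""


def minority_count(names, stats=SURNAME_STATS, limit=10,
                   min_occurrences=100, threshold=0.5):
    """Return (matched, minority) for the top-billed names.

    matched:  top-billed names whose surname is in the table and occurs
              more than `min_occurrences` times.
    minority: matched names whose surname has a greater than `threshold`
              chance of belonging to a non-white person.
    """
    matched = 0
    minority = 0
    for full_name in names[:limit]:
        row = stats.get(surname_of(full_name))
        if row is None:
            continue  # surname not in the dataset
        occurrences, share_nonwhite = row
        if occurrences <= min_occurrences:
            continue  # surname too rare to be included
        matched += 1
        if share_nonwhite > threshold:
            minority += 1
    return matched, minority


crew = ["Rosa Alvarez", "Dan Miller", "Chidi Okafor",
        "Lee Rarename", "Ingrid Bergstrom"]
print(minority_count(crew))  # (3, 2)
```

Returning both numbers makes the coverage gap visible: here only three of the five names could be matched at all.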

Race and Ethnicity (stats forthcoming):

The "Bechdel Test," named for Alison Bechdel, whose comic series Dykes To Watch Out For introduced it in 1985, applies to a movie's script. A movie passes the Bechdel Test if its script contains at least two female characters who have a conversation with each other that is not about a male character. Data is based on the more than 6,000 movies rated by BechdelTest.com.

Bechdel Test:

How we calculate the letter grade: We calculate a raw score (a sum between 0 and 1) from the cast and crew's minority percentages. We then add 0.2 bonus points if the film passes the Bechdel Test. If there is no Bechdel data for the film, we instead add the 0.2 bonus points when the raw score is at least 0.40.
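The bonus rule can be written out as a small function. This sketch takes the raw score as an input rather than guessing how the cast and crew percentages are combined, and it stops at the numeric score because the mapping from score to letter grade is not specified here. The function name diversity_score and the use of None to mean "no Bechdel data" are assumptions for the example.

```python
BECHDEL_BONUS = 0.2        # bonus points stated above
NO_DATA_THRESHOLD = 0.40   # raw score needed for the bonus when Bechdel data is missing


def diversity_score(raw_score, passes_bechdel=None):
    """Add the Bechdel bonus to a raw score between 0 and 1.

    passes_bechdel: True or False when the film has a Bechdel rating,
    None when there is no Bechdel data (an assumed convention).
    """
    if not 0.0 <= raw_score <= 1.0:
        raise ValueError("raw_score must be between 0 and 1")

    if passes_bechdel is True:
        bonus = BECHDEL_BONUS
    elif passes_bechdel is None and raw_score >= NO_DATA_THRESHOLD:
        bonus = BECHDEL_BONUS
    else:
        bonus = 0.0
    return raw_score + bonus


print(round(diversity_score(0.5, True), 2))   # 0.7
print(round(diversity_score(0.5, False), 2))  # 0.5
print(round(diversity_score(0.4, None), 2))   # 0.6
print(round(diversity_score(0.3, None), 2))   # 0.3
```

Note that with the bonus the result can exceed 1 when the raw score is above 0.8; how the prototype handles that case is not stated, so the sketch leaves the value uncapped.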

By Arlene Ducao for DAEME.