Movie Project

From Santa Fe Institute Events Wiki

Quantifying the evolution of cultural ideals through film

Summary

Films provide an unparalleled view into cultural ideals of social interaction. We have enough data to look at how the structure of movies changed between 1960 and 2012. This task is related to the metadata-prediction and time-dynamics tasks below, but it focuses more on identifying important cultural shifts.

Some concrete questions

  • How has the global interaction structure shifted over time, e.g. do movies today have more complicated interaction networks? (Essentially part of the metadata task below).
  • How has the role of gender changed?
  • Are genres more "well-defined" today than they were in the past? I.e., has the within-genre variance between networks decreased?
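
One possible way to make the within-genre variance question concrete is sketched below. It assumes each movie network has already been reduced to a numeric feature vector and tagged with a period and a genre; the function name, the tuple layout and the toy numbers are all made up for illustration and are not part of the project data.

```python
from collections import defaultdict
from statistics import mean


def within_genre_variance(movies):
    """Mean squared distance to the genre centroid, per (period, genre).

    `movies` is an iterable of (period, genre, feature_vector) tuples.
    This layout is an assumption made for the sketch.
    """
    groups = defaultdict(list)
    for period, genre, features in movies:
        groups[(period, genre)].append(features)

    result = {}
    for key, vectors in groups.items():
        dim = len(vectors[0])
        # Centroid of all networks in this (period, genre) group.
        centroid = [mean(v[i] for v in vectors) for i in range(dim)]
        # Average squared Euclidean distance of each network to the centroid.
        result[key] = mean(
            sum((v[i] - centroid[i]) ** 2 for i in range(dim)) for v in vectors
        )
    return result


# Toy feature vectors (density, mean degree); invented numbers, not real data.
toy_movies = [
    ("1960s", "western", [0.2, 3.0]),
    ("1960s", "western", [0.4, 5.0]),
    ("2000s", "western", [0.3, 4.0]),
    ("2000s", "western", [0.3, 4.0]),
]
variances = within_genre_variance(toy_movies)
```

A smaller value for a later period would point towards a more tightly defined genre, although the features probably need rescaling first so that one of them does not dominate the distance.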

Interested

  • Will Hamilton
  • Michael Schaub
  • Catriona Sissons

Predicting Metadata from Network Structure

Summary

This is a 'meta' task. The idea is to use machine learning, or any other suitable technique, to predict properties of a movie such as its success or genre from its network structure.

First Few Tasks

  • Script to download all movie galaxies (MS -- done; see post by Andrew in slack)
  • Conversion from gephi to useful format (MS -- done; note that there is a broken file and two movies with 1 and 0 nodes!)
  • Network comparison (MS -- running atm; note that you need to get orca for graphlet counting)
  • Get DigitalSmiths data in usable format (Will -- almost done; tons of good metadata like Rotten Tomatoes scores etc.)
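
As a starting point for the prediction task, here is a minimal sketch of turning one network into a few structural features that a classifier could use. It assumes the converted networks are available as plain lists of (character, character) pairs; that format, the function name and the choice of features are assumptions, not the actual output of the conversion script.

```python
def network_features(edges):
    """Simple structural features of an undirected character network.

    `edges` is a list of (a, b) pairs (an assumed format). Self-loops are
    ignored and repeated pairs are counted once.
    """
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    n = len(neighbours)
    # Every edge appears in two neighbour sets, hence the division by 2.
    m = sum(len(s) for s in neighbours.values()) // 2
    return {
        "nodes": n,
        "edges": m,
        # Guards cover degenerate inputs, e.g. movies with no usable edges.
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "mean_degree": 2 * m / n if n else 0.0,
        "max_degree": max((len(s) for s in neighbours.values()), default=0),
    }


# Toy network with placeholder character names.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_features = network_features(toy_edges)
empty_features = network_features([])
```

Graphlet counts from orca could be appended to the same dictionary once they are available.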

Interested

  • Michael Schaub
  • Andrew Meller
  • Xiao (Thomas) Zhang
  • Lu Liu
  • Harrison Smith
  • Will Hamilton

Network Construction and Time Dynamics

Summary

The main goal here will be to look at the time dynamics of the movie character networks, with a particular focus on how characters are introduced to the network. We can use this analysis to see how stories develop through the network construction. This can be compared between movies to see how similar network construction and dynamics are across movies.
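
A rough sketch of how character introduction could be tracked is given below. It assumes each movie is available as an ordered list of scenes, each scene being a list of the characters present; this input format and the function name are assumptions for illustration only.

```python
def introduction_curve(scenes):
    """Track when characters first appear.

    `scenes` is an ordered list of character lists (assumed format).
    Returns (first_scene, curve): the index of the scene in which each
    character is introduced, and the number of distinct characters seen
    after each scene.
    """
    first_scene = {}
    curve = []
    for index, cast in enumerate(scenes):
        for name in cast:
            # setdefault keeps the earliest scene index for each character.
            first_scene.setdefault(name, index)
        curve.append(len(first_scene))
    return first_scene, curve


# Toy movie with placeholder character names.
toy_scenes = [["A", "B"], ["A", "C"], ["B", "D", "A"]]
toy_first, toy_curve = introduction_curve(toy_scenes)
```

To compare movies of different lengths, the curve could be rescaled so that both axes run from 0 to 1 (fraction of scenes against fraction of characters introduced).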

Interested

  • Moriah Echlin (moriah.echlin@gmail.com)
  • Dan Biro (daniel.biro@med.einstein.yu.edu)
  • Will Hamilton (wleif@stanford.edu)
  • Michael Schaub

Trope network

Summary

There is another dataset from TV Tropes (http://tvtropes.org) that I would be happy to bring into this project. Tropes are storytelling elements (if you go to http://tvtropes.org/pmwiki/pmwiki.php/Main/Tropes and read a few entries, you will quickly get a sense of them). The dataset contains ~3,500 movies and a list of tropes for each, as well as the movie's year, IMDB rating, and box office.

I am interested in studying story archetypes (typical plots). From a network perspective, it may be possible to build a directed network of "narrative" tropes (identified in http://tvtropes.org/pmwiki/pmwiki.php/Main/NarrativeTropes , but may need more inspection), where the edge directions represent temporal order. The time sequence of tropes is not represented in the TV Tropes data, so I am wondering whether any of Will's datasets could shed some light on it. If the network construction is successful, extracting the backbone of the network would show us which story arcs are most commonly used in movies.
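
To make the idea a bit more tangible, here is a small sketch of the network construction. It assumes that an ordered trope sequence per movie can somehow be obtained, which is exactly the piece the TV Tropes data does not provide. The "backbone" here is a plain edge-weight threshold, swapped in for simplicity; it is not a statistical backbone-extraction method. All names and toy sequences are hypothetical.

```python
from collections import Counter


def transition_counts(sequences):
    """Count directed trope-to-trope transitions.

    `sequences` is a list of ordered trope lists, one per movie
    (assumed input; the ordering is not in the TV Tropes data).
    """
    counts = Counter()
    for seq in sequences:
        # Pair each trope with the one that follows it in the same movie.
        for earlier, later in zip(seq, seq[1:]):
            counts[(earlier, later)] += 1
    return counts


def threshold_backbone(counts, min_count):
    """Keep only edges seen at least `min_count` times (simple threshold)."""
    return {edge: w for edge, w in counts.items() if w >= min_count}


# Toy sequences with invented trope labels.
toy_sequences = [
    ["call", "refusal", "mentor"],
    ["call", "mentor"],
    ["call", "refusal", "ordeal"],
]
toy_counts = transition_counts(toy_sequences)
toy_backbone = threshold_backbone(toy_counts, min_count=2)
```

On real data a fixed threshold will favour tropes that are simply common, so some normalisation by trope frequency is likely needed before reading the surviving edges as story arcs.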

This is only a half-baked idea, and I would love to hear any ideas/comments. If anyone is interested, please let me (Elise) know.

Interested

Natural Language Processing of Dialogues

Summary

Data

This subproject works with the Cornell Movie-Dialogs Corpus (www.mpi-sws.org/~cristian/Cornell_Movie-Dialogs_Corpus.html). The data is already clean.

Objective

The aim is to explore the semantic information contained in dialogues (dynamic and static), and ideally to complement the other subprojects (on the films they have in common) by bringing new features for data mining.

Ideas

Put your ideas here

  • (Juste) Use sentiment analysis to establish temporal profiles of sentiment evolution in movies. Try to find typical profiles, e.g. by time-series clustering, and check whether they correspond to movie classification.
  • (Lu) Study differences between male and female characters by sentiment analysis, and how these differences evolve over time and across genres.
  • (Harrison) Compare language usage in movies vs books over the decades for which data for both are available. Are there any trends predicting how language is used in books vs movies?
  • (Marius) Can we create a static network from dialogues for each movie (with refined features like negative/positive interactions, strength of the interactions, etc.) that predicts the success of the movie better than a network based simply on screen co-occurrence?
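
As a rough illustration of the temporal sentiment profile idea, the sketch below scores each dialogue line with a tiny word list and averages the scores over equal-sized segments of the movie. The lexicon is a made-up placeholder, not a real sentiment resource, and the function names and input format (an ordered list of line strings) are assumptions.

```python
# Placeholder lexicon for illustration only; a real analysis would use an
# established sentiment lexicon or classifier.
TOY_LEXICON = {"love": 1, "happy": 1, "good": 1, "hate": -1, "dead": -1, "bad": -1}


def line_score(line, lexicon=TOY_LEXICON):
    """Sum the lexicon values of the words in one dialogue line."""
    words = [w.strip(".,!?;:\"'").lower() for w in line.split()]
    return sum(lexicon.get(w, 0) for w in words)


def sentiment_profile(lines, n_bins):
    """Average line score in each of `n_bins` consecutive segments."""
    totals = [0.0] * n_bins
    counts = [0] * n_bins
    for i, line in enumerate(lines):
        # Map the line position to a segment index between 0 and n_bins - 1.
        b = i * n_bins // len(lines)
        totals[b] += line_score(line)
        counts[b] += 1
    # Segments without any lines get a neutral score of 0.0.
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Invented dialogue lines, in movie order.
toy_lines = ["I love this town.", "What a good day!", "I hate goodbyes.", "He is dead."]
toy_profile = sentiment_profile(toy_lines, n_bins=2)
```

Profiles computed with the same number of bins have the same length for every movie, so they can be fed directly into a clustering step.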

Interested

  • Marius Somveille (marius.somveille@zoo.ox.ac.uk)
  • Lu Liu
  • Juste Raimbault
  • Will Hamilton (wleif@stanford.edu)
  • Harrison Smith