Comedy and Tragedy in Shakespeare
From Santa Fe Institute Events Wiki
Project page for A Midsummer Night's Project: Comedy and Tragedy in Shakespeare
Previous project description here: [1].
Data and code currently in the Dropbox folder.
Notes on what is in Dropbox
For those who don't want to use the XML (shakespeare.xml in the Dropbox): the Dropbox folder plain_text has all the plays in plain-text form. Act.Scene numbers appear between hash marks on their own line, like #1.1#, and each speaker's name appears between hash marks before their speech. Words and punctuation are all separated by spaces.
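A minimal sketch of reading one of the plain_text files, in case it helps anyone get started. The exact layout is an assumption here: scene markers like #1.1# on their own line, and a speaker tag like #HAMLET# either on its own line or at the start of the speech. The function name parse_play is hypothetical and is not part of classify.py or network.py.

```python
import re
from collections import defaultdict

# Assumed layout (not confirmed by the page): "#1.1#" alone on a line marks a
# scene, and "#NAME#" opens a speech, either on its own line or inline.
SCENE_RE = re.compile(r"^#(\d+)\.(\d+)#$")
SPEAKER_RE = re.compile(r"^#([^#]+)#\s*(.*)$")


def parse_play(text):
    """Return {(act, scene): [(speaker, [tokens]), ...]} for one play."""
    scenes = defaultdict(list)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        scene = SCENE_RE.match(line)
        if scene:
            current = (int(scene.group(1)), int(scene.group(2)))
            continue
        if current is None:
            continue  # skip anything before the first scene marker
        speaker = SPEAKER_RE.match(line)
        if speaker:
            # Start a new speech; tokens are already space-separated.
            scenes[current].append(
                (speaker.group(1).strip(), speaker.group(2).split())
            )
        elif scenes[current]:
            # Continuation line: append tokens to the most recent speech.
            scenes[current][-1][1].extend(line.split())
    return dict(scenes)
```

The ordered list of speakers in a scene, which is all the speaking-turn networks need, is then [s for s, _ in parsed[(act, scene)]].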
The folder edge_lists contains the edges in the network with weights (currently how many times each pair of characters in each play talks to each other).
classify.py is set up to load pickled (yodelay hee hop!) objects that have been created. For instance, if you run classify.py, you will get an object called networks which contains weighted networks for each play.
full_plays.pdf and all_networks.pdf contain plots of the networks (generated by network_plots.py)
character_counts.txt contains title, play, and number of speeches, lines, and words spoken by each character.
Related work and links
Papers
The small world of Shakespeare’s plays
James Stiller, Daniel Nettle, Robin I. M. Dunbar
Human Nature, December 2003, Volume 14, Issue 4, pp 397-408
This combines an early network science spin (note "small worlds" in the title and the publication year) with an evolutionary psychology/biology spin, focusing particularly on group size. Notably, one of the authors is Robin Dunbar, of Dunbar's number [2]. This is network analysis circa 2003, so they're mostly interested in connectance, average path length, and clustering coefficient. The paper is available in the Dropbox and here [3].
Blog posts, etc.
There have been a few informal explorations of this so far that I've found.
There is a set of animations of the dynamics of communication in Shakespeare plays [4], using PieSpy [5], a heuristic-based tool to infer social networks from IRC (PieSpy paper here [6]). There isn't any analysis here and the data isn't available, but we could check out PieSpy's rules for inferring relationships. (Our methods so far: co-occurrence and before- and after- speaking turns; easy next steps could use weighted co-occurrence, as in the next set of blog posts, and longer-memory speaking turns.)
Prosody & social networks in Shakespeare: [7]. This class does a few projects on social networks of "frequency of communication" between characters. They have a cool (but not pretty) visualization of communication links between characters, colored by type of communication - verse or prose.
Wordseer has a number of explorations of language in Shakespeare plays: on men and women [8] and on "beauty" [9]. They look at grammatical relationships, word order, word frequency by type of play, collections of words, etc. They do a preliminary look at gender and love in Shakespeare. (A nice sample result: "A picture emerges: women’s most commonly-mentioned possessions are their male relatives and their bodies.")
Set of blog posts using co-occurrence to construct social networks in Shakespeare, starting here: [10]. These are co-occurrence networks, where co-occurrence is measured within scenes, and the networks are broken down by acts. Edges are weighted (and sometimes thresholded) by the number of scenes with dyad co-occurrence. They point out that you can start to see the plot of The Tempest, though I think our networks showed that more clearly.
The second blog post of the set, comparing social network density (E/V^2) in tragedies and comedies: [11]. They make a cute observation that these networks are highly connected in Act 5 of comedies, which often end in weddings (and thus high co-occurrence).
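For comparison with our own networks, here is a small sketch of the density measure as written above (E/V^2). The second function is the common normalization for undirected simple graphs, 2E/(V(V-1)); it is included only as an alternative and is not taken from the blog post. Both function names are hypothetical.

```python
def density(num_edges, num_nodes):
    """Density as written above: E / V^2."""
    if num_nodes == 0:
        return 0.0
    return num_edges / num_nodes ** 2


def simple_graph_density(num_edges, num_nodes):
    """Alternative for undirected simple graphs: 2E / (V (V - 1))."""
    if num_nodes < 2:
        return 0.0
    return 2 * num_edges / (num_nodes * (num_nodes - 1))
```

With E/V^2 a fully connected undirected network never reaches 1 (four characters with all six pairs connected gives 6/16), so the two measures should not be mixed when comparing acts or plays.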
Some other fun places to look: titles at the humanities workshop at NetSci, a networks conference held this week [12]; the Six Degrees of Francis Bacon [13].
Possible and previously discussed project directions
A non-exclusive, non-exhaustive and otherwise rough list of ideas that have come up recently. We could use this page to structure some of our results or sub-projects (see next section).
Predicting tragedy/comedy(/history)
Gender
Gender and character roles
Dynamics of the network(s)
Probabilistic modeling of the networks (consider also tragedy/comedy, gender, dynamics and evolution, etc.)
Language in Shakespeare (consider also tragedy, gender, dynamics, etc.) - more here.
Hierarchical/social roles in Shakespeare
Story structure in Shakespeare
Narrative style and structure (focus on main characters vs. more contextual/observed scenes)
etc.
Possible uses for this page
Structuring or collecting results
Collecting information about the data or code
Sharing analyses
Nothing after this week
Showing off our results at the end of the summer school
etc.
Linear Mixed Effects Model
I've Dropboxed prop_frame.Rdata, adj_mat.RData and shakes_v1_1.R to run some simple analyses on network dynamics. The list adj.mat holds one interaction matrix per play per act; access them with adj.mat[[i]], where i is a number from 1 to 135. The data frame props holds network properties (simple properties for now) and has 135 rows. The play and act in row i of props correspond to the weighted adjacency matrix adj.mat[[i]].
-Ashkaan
Types of Network Construction
To be included in make_network_of_scene in network.py
Types/wishlist:
Co-occurrence within scene (i <-> j if i and j appear in scene s)
Speaker/listener (i -> j if i speaks before j), i.e., directed before/after
Before/after speaking (number of times, i <-> j if i speaks before j)
Binary before/after (i <-> j if i speaks before j)
Before/after speaking, longer memory (i <-> j if i speaks one or two or m turns before j)
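The wishlist above could be sketched roughly as follows. This is not the code in network.py: the function names are hypothetical, the input is assumed to be the ordered list of speakers in one scene, and co-occurrence here only sees characters who speak (anyone on stage but silent is missed). Within a memory window, a repeated earlier speaker is counted once per turn, which is one possible choice among several.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(speakers):
    """Undirected, unweighted: i <-> j if both speak in the scene."""
    return {frozenset(p) for p in combinations(sorted(set(speakers)), 2)}


def turn_edges(speakers, memory=1, directed=True):
    """Count how often i speaks within `memory` turns before j.

    memory=1 gives plain before/after; larger values give the
    longer-memory variant. directed=False merges (i, j) and (j, i).
    """
    counts = Counter()
    for pos, listener in enumerate(speakers):
        seen = set()
        for prev in speakers[max(0, pos - memory):pos]:
            if prev == listener or prev in seen:
                continue  # no self-loops; count each earlier speaker once
            seen.add(prev)
            if directed:
                key = (prev, listener)
            else:
                key = tuple(sorted((prev, listener)))
            counts[key] += 1
    return counts


def binarize(counts):
    """Binary before/after: keep an edge if it occurred at least once."""
    return set(counts)
```

For example, turn_edges(speakers) gives the directed speaker/listener counts, turn_edges(speakers, directed=False) the undirected before/after counts, binarize(...) the binary version, and turn_edges(speakers, memory=m) the longer-memory version.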