Complex Systems Summer School 2018-Projects & Working Groups
From Santa Fe Institute Events Wiki
- 1 Projects
- 1.1 Estimating the true number of malaria cases in Venezuela
- 1.2 Characterizing the spatiotemporal transmission dynamics of smallpox in the United States prior to eradication
- 1.3 Understanding and creating music
- 1.3.1 Understanding music from a complex system point of view
- 1.3.2 Neural style transfer in music styles via interacting agents
- 1.3.3 Potential Data
- 1.3.4 Packages to handle MIDI/music (based on python)
- 1.3.5 Thoughts?
- 1.3.6 Interested Participants
- 1.4 Optimal representations of high dimensional data in deep learning and biological systems:
- 1.5 The Emergence and Evolution of Legal Systems as Pertaining to Water Distribution
- 1.6 Academic hiring networks
- 1.7 Make deep neural networks more biologically accurate by including inter-neural travel times
- 1.8 Evolution of social norms as a process within or between societies
- 1.9 Topological features of neutral networks in evolution
- 1.10 Networks from thresholded normally distributed data
- 1.11 The Evolution of Beliefs in Abrahamic Religions
- 1.12 City as a Complex System: Clustering/Mobility Network Effects
- 1.13 The Evolution of Water Narratives in US Newspapers
- 1.14 Reproducibility and Underdeterminacy in Mathematical Modeling
- 1.15 Classifying language by grammatical motifs
- 1.16 Structures in Open Source Software Communities
- 1.17 Measuring information distortion in networks (rumors/fake news)
- 1.18 Measuring epigenetic effect of stress at a macro scale
- 1.19 Topology of natural conversations
- 1.20 Scaling of information requirements in living things
- 1.21 Modeling Utility of Cryptocurrency in a Failing Economy
- 1.22 Twitractors: What kind of non-linear dynamic attractors exist across OSM discussions
- 1.23 Fluctuations in correlated data, random variables or models
- 1.24 Understanding Cardiac Dynamics in Health and Disease (#cardio)
- 1.25 Multi-scale Adaptive Systems
- 1.26 Evolution of trade networks
- 1.27 Exploring Income Inequality From a Game Theoretic (or Other) Perspective:
- 1.28 Understanding/Optimizing the features of social network structure to reach a quick but fair consensus
- 1.29 Searching for patterns and narratives in the SFI Complex Systems Summer Schools
- 1.30 Emergence of sustainable development contradictions
- 1.31 Metabolic rates and the collapse/transformation/adaptation of societies
- 1.32 Mean First Saturation Time (Random walks on networks)
- 1.33 The effects of changing relative timescales on complex systems
- 1.34 Dance Improvisation and Complex Systems
- 1.35 NIH Brain Activity Analysis
- 1.36 Looking for unparalleled biological innovations
- 2 Archived Projects ("Parking Lot")
- 2.1 Using Principles from Complex Systems in Thinking about AGI Development
- 2.2 Robustness of the presidential information cascade on Twitter
- 2.3 Peer-review process
- 2.4 Distribution of water resources on a national scale
Estimating the true number of malaria cases in Venezuela
In 2016, Venezuela experienced one of the worst economic collapses in Latin America, resulting in unprecedented inflation and food insecurity. The economic collapse has also caused the collapse of the medical and public health infrastructure, producing a surge of malaria, a mosquito-borne disease that had been eliminated from Venezuela in 1977. However, due to a lack of governmental transparency and under-reporting of malaria cases by the government, it is challenging to know the true magnitude of the malaria outbreak or to locate its epicenter within Venezuela. Understanding the actual factors driving the increase and spread of malaria within Venezuela, and the spillover of cases to other countries resulting from out-migration of Venezuelans, is needed to inform prevention and control measures for the outbreak.
Our project aims to use publicly available data sources, such as Pan American Health Organization malaria reports from Venezuela and bordering countries, migration flows from Venezuela into bordering and nearby countries, news reports and social media, and economic/medical indicators from previous years (such as the cost of antimalarials), to reconstruct the time series of the malaria outbreak, quantify the true number of malaria cases occurring in Venezuela, and identify factors contributing to the outbreak.
Meeting: Monday June 18, 1:15 pm
Characterizing the spatiotemporal transmission dynamics of smallpox in the United States prior to eradication
Smallpox is a highly contagious infectious disease that was eradicated through vaccination and social-distancing interventions. However, its city-to-city spatial transmission is not well characterized. Understanding how smallpox moved between cities has important implications for understanding how re-emerging vaccine-preventable infections, such as measles, can spread and subsequently be controlled in the future.
This project aims to apply a metapopulation model to weekly case data from a number of cities in the US to estimate the rate of transmission between cities, determine whether certain (i.e. larger) cities seeded epidemics in others (traveling waves), characterize any synchrony of epidemics across geographic regions, and examine the effects of vaccination on transmission.
Grenfell BT, Bjornstad ON, Kappey J. Travelling waves and spatial hierarchies in measles epidemics. Nature 2001;414:716-23.
Project Tycho (data repository of MMWR notifiable diseases: https://www.tycho.pitt.edu/dataset/US.67924001/)
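As a sketch of the modeling approach, here is a minimal two-city discrete-time SIR metapopulation model. Everything here is illustrative: the parameter values, the coupling form, and the city setup are assumptions, not estimates from the Tycho data.

```python
def metapop_sir(beta=0.3, gamma=0.1, coupling=0.01, steps=300):
    """Discrete-time SIR in two coupled cities (population fractions).

    beta: within-city transmission rate, gamma: recovery rate,
    coupling: fraction of contacts made with the other city.
    All parameter values are illustrative, not fitted.
    """
    S = [0.999, 1.0]   # city A is seeded, city B starts fully susceptible
    I = [0.001, 0.0]
    R = [0.0, 0.0]
    history = []
    for _ in range(steps):
        new_I = []
        for c in (0, 1):
            other = 1 - c
            # force of infection mixes local and cross-city prevalence
            foi = beta * ((1 - coupling) * I[c] + coupling * I[other])
            new_inf = foi * S[c]
            recovered = gamma * I[c]
            S[c] -= new_inf
            R[c] += recovered
            new_I.append(I[c] + new_inf - recovered)
        I = new_I
        history.append(tuple(I))
    return history

hist = metapop_sir()
peak_A = max(range(len(hist)), key=lambda t: hist[t][0])
peak_B = max(range(len(hist)), key=lambda t: hist[t][1])
```

With weak coupling, city B's epidemic lags city A's (peak_B > peak_A), which is the traveling-wave signature the project would look for in the data.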
Meetings: Friday @ SFI after the 1st lecture (10:00 am); Monday June 18, 6:45-7:30 pm, 2nd floor residence hall
Understanding and creating music
This project has two directions:
- 1) Understanding music from a complex system point of view
- 2) Creating new music via neural style transformation
The two directions are not separate; with luck, we hope to see them feed into each other :)
Understanding music from a complex system point of view
Music is undeniably complex. It is a combination of time (e.g. melody) and space (e.g. harmonic structure across instruments). Yet for all the beautiful music in the world, from the profound, almost mathematical works of Bach and the inspiring ones of Beethoven, to rock and roll and electronic music, we still understand relatively little about it.
In this project, we aim to understand music from a complex systems point of view: can we define the "style" of each genre, era, or composer, and can we quantitatively analyze the structure of a piece of music? Music is composed of note sequences on different "layers", carrying temporal information as well as notes interacting with each other in time. Although only a finite number of notes is available, the set of sequences they can generate is infinite. Mathematically, music could potentially be described as a "network", but a very complex one: temporal, multilayer, and higher-order (dyads may not be the best representation here).
One more detailed idea/question: using network theory, including multilayer, higher-order, and temporal networks, can we figure out how each music genre differs from the others, and what makes each composer distinctive?
Novelty: representing music as a network is not new; however, little of the literature represents music as a network that is temporal, multilayer, and potentially higher-order, which would add a whole new level of complexity to the study.
- A friend and I did a very simple course project related to this, in which we clustered 330 classical music pieces and found that the clusters correspond to musical eras. We also found that Bach's fugues look distinctive under a certain matrix representation: link to paper
- Someone in Italy did this; the thing I don't like is that he discarded the temporal information in the music, which is vital: link to paper
- Complex network structure of musical compositions: Algorithmic generation of appealing music
- There are also work done on relationship between music and psychology: link to paper
- Scaling in music! Multiple scaling behaviour and nonlinear traits in music scores
- A Music-generating System Based on Network Theory
- Complex Networks of Harmonic Structure in Classical Music
- Complex network approach to classifying classical piano compositions
- Musical rhythmic pattern extraction using relevance of communities in networks
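As a concrete starting point for the network representation, here is a minimal sketch that builds the simplest, time-collapsed "layer": a weighted, directed note-transition network from a pitch sequence. The melody is a made-up fragment; a real analysis would extract note events from MIDI files (e.g. via music21) and keep the temporal and per-instrument layers.

```python
from collections import Counter

def transition_network(notes):
    """Build a weighted directed note-transition network from a pitch
    sequence: edge (a, b) counts how often note b follows note a.
    This collapses time entirely; a temporal/multilayer version would
    keep onset times and use one layer per instrument.
    """
    return Counter(zip(notes, notes[1:]))

# Hypothetical melody fragment (note names only; durations dropped)
melody = ["C4", "E4", "G4", "E4", "C4", "E4", "G4", "C5"]
edges = transition_network(melody)
```

Edge weights like `edges[("E4", "G4")]` are then the raw material for comparing genres or composers via network statistics.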
Neural style transfer in music styles via interacting agents
- A) learn generative models of different music styles using neural networks.
- B) let these networks ('agents') interact and see what 'fusion' music styles result.
The novelty lies in a) having multiple agents learn multiple styles independently and then letting them exchange information in a meaningful way (probably the trickiest bit), and b) letting these fusion styles evolve on a network etc. to see, for example, what "world music" results at the end.
Details will come...
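In the meantime, here is a deliberately crude stand-in for steps A) and B): two first-order Markov "agents", each trained on a different toy corpus, fused by pooling their transition statistics. The corpora and the fusion rule are invented for illustration; the actual project would replace the Markov chains with neural generative models.

```python
import random
from collections import defaultdict

def train_markov(sequence):
    """First-order Markov 'agent': note -> list of observed successors."""
    model = defaultdict(list)
    for a, b in zip(sequence, sequence[1:]):
        model[a].append(b)
    return model

def fuse(model_a, model_b):
    """Naive 'style fusion': pool both agents' successor lists."""
    fused = defaultdict(list)
    for m in (model_a, model_b):
        for note, successors in m.items():
            fused[note].extend(successors)
    return fused

def generate(model, start, length, seed=0):
    """Sample a note sequence from a (fused) model."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        successors = model.get(out[-1])
        if not successors:
            break
        out.append(rng.choice(successors))
    return out

# Hypothetical corpora standing in for two styles (major vs. minor triad)
style_1 = ["C", "E", "G", "E", "C"] * 4
style_2 = ["C", "Eb", "G", "Eb", "C"] * 4
fused = fuse(train_markov(style_1), train_markov(style_2))
tune = generate(fused, "C", 12)
```

The generated tune mixes notes from both "styles", which is the toy analogue of the fusion idea in B).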
- neural style transfer for images (make images look like Van Gogh paintings etc.): https://tinyurl.com/ybpq5agm
- neural nets for music: https://tinyurl.com/yb2qdqbq and http://imanmalik.com/cs/2017/06/05/neural-style.html, https://magenta.tensorflow.org/performance-rnn
- bunch of theories of how music styles are results of combination: https://tinyurl.com/y723ugyo
- music recommendation using neural networks (from Spotify): http://benanne.github.io/2014/08/05/spotify-cnns.html#predicting, https://papers.nips.cc/paper/5004-deep-content-based-music-recommendation
- MusicNet: lots of information for each piece, but only 330 pieces, and biased toward certain composers
- MIDI corpus
Packages to handle MIDI/music (based on python)
- Python-based toolkit for computer-aided musicology: music21
- Mido is a library for working with MIDI messages and ports
- Python MIDI, though not well maintained...
- For the generation direction, we could also use text corpora instead: Shakespeare etc., creating Shakespeare + Tolstoy, for example :D
- You guys might be interested in checking the MusicMap project. Link: https://musicmap.info
(Please note your background and the direction you are interested in, or provide a ranking if interested in both: A. understanding, B. generating)
- Xindi (good at network science, data mining, a little bit machine learning, ranking: 1.A 2.B)
- R Maria
- Ricky (multilayer networks, machine learning, data mining. ranking: 1A 2B)
- Nam Le (Neural Networks, ML, Music lover. ranking: 1A, 2B)
- Xiaoyu (Background in control theory, electrical system, ranking: 1A 2B)
- Ana (music lover, good at synthesizing research)
- Josefine (Networks, ABMs, plays music and knows some music theory. Ranking: 1A 2B)
Optimal representations of high dimensional data in deep learning and biological systems:
What is the best way for a system to represent very high dimensional data? For example, how should the retina encode visual stimuli in neuron firing patterns? How does the immune system encode the space of antigens it might encounter? In each case, it would not be feasible (or efficient) to create a unique tag for each input. Rather, the systems in question must decide which features in the stimuli are most relevant, and trade off between specificity and generality.
Along these lines, there are two more specific questions to investigate:
-It has recently been conjectured that the success of deep learning networks is related to their optimization of a specific informational quantity in each layer https://arxiv.org/abs/1710.11324. Unfortunately this paper is not very clearly written, but basically the idea is that when binning inputs into representations, the distribution of bin sizes should be given by a specific power law, which optimizes the aforementioned information measure. Do biological systems employ the same strategy? With access to the right data, this idea should be straightforward to test. For example, if we have a list of antibodies together with the set of antigens they react to, we can compute this quantity and see whether the antigen "bins" are indeed distributed according to the predicted power law.
-A diverse collection of biological systems that are faced with this task seem to be well-modeled by maximum entropy distributions, with a constraint on pairwise correlations and parameters (i.e. Lagrange multipliers) set near a critical point https://arxiv.org/pdf/1012.2242.pdf. This has been applied to the previously given examples of the retina and the immune system, as well as flocking in birds. As far as I know, it is not yet known with certainty whether this kind of encoding scheme is optimal in some sense (like in the previous bullet), or if it is an artifact of our own inference methods, but I think the answer is interesting either way. An immediate question is, if these maximum entropy models are a powerful tool for humans to model high dimensional systems, might biological systems also be producing their own maximum entropy models of environmental variables? That is, are maximum entropy models with constraints on pairwise correlations optimal in some information-theoretic sense, which can be made precise? For example, would this be a particularly useful way to model the distribution of natural images one might encounter? While less straightforward than the previous bullet, I think these are questions well-suited to the skills of the people here, and I think we could make significant progress!
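As a sketch of how the test in the first bullet might start, the snippet below estimates a power-law exponent from a list of "bin" sizes using the continuous maximum-likelihood estimator of Clauset, Shalizi & Newman (2009), applied here to synthetic data. Real antibody/antigen data would also need a goodness-of-fit test before claiming the predicted power law holds.

```python
import math
import random

def powerlaw_mle(sizes, x_min=1.0):
    """MLE exponent for a continuous power law p(x) ~ x^(-alpha) on
    [x_min, inf):  alpha_hat = 1 + n / sum(ln(x_i / x_min)).
    A rough diagnostic for whether bin sizes look power-law-ish."""
    tail = [s for s in sizes if s >= x_min]
    return 1.0 + len(tail) / sum(math.log(s / x_min) for s in tail)

def sample_powerlaw(alpha, n, seed=0):
    """Inverse-transform samples from the same continuous power law."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

# Synthetic "bin sizes" with a known exponent, to check the estimator
sizes = sample_powerlaw(2.5, 20000)
alpha_hat = powerlaw_mle(sizes)
```

On real data one would replace `sizes` with, e.g., the number of antigens each antibody reacts to, and compare the fitted exponent to the value predicted by the optimization argument.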
If anyone has expertise to offer, your feedback/participation would be very much appreciated! In particular, I think this project would greatly benefit from those of you that have knowledge in machine learning and biology (my own area is physics and information theory). Feel free to email me at firstname.lastname@example.org
Thoughts? Recommended Papers?
A genetic model with the resulting protein products could also be useful here (e.g. looking at expression levels and/or variants in a particular gene or set of genes as they pertain to the protein(s) coded by those genes). In sum, can we find/demonstrate an algorithmic basis for gene expression and/or protein coding? - Kofi
1. Castillo et al. "The Network Structure of Cancer Ecosystems." SFI WORKING PAPER: (2017)
- Sarah B. (experience with sequencing data/gene expression)
The Emergence and Evolution of Legal Systems as Pertaining to Water Distribution
There are numerous legal systems that have been identified, broadly categorized into large families: Common Law (Anglosphere and Commonwealth nations), Civil Law (Romance-language nations, Germany, China), Islamic law (most Muslim nations), and Customary Law (India, sub-Saharan Africa). More importantly, most nations do not lie purely in one category, but tend to combine elements of multiple systems, either through merging (e.g. German law combining Germanic tradition with Civil traditions) or through subsidiarity (e.g. Louisiana having Napoleonic law, despite being in a Common Law nation). We are interested in determining how these legal systems of nations and states emerged, influenced each other, and interact across national boundaries.
This is an immense task, so to scope it, one idea has been to limit this project to laws pertaining to water distribution. This is of particular interest when looking at states of nations that have different legal systems, such as Louisiana in the U.S., Quebec in Canada, and Scotland in the U.K. For international interactions, sub-Saharan African nations might also be worth assessing, as many nations border nations with different legal systems, and water is often a scarce resource in these areas.
If anyone has interest in this topic, and/or expertise in either legal systems or water distribution, feel free to sign up or discuss.
Energy and Efficiency in the Realignment of Common-Law Water Rights, Carol M. Rose, The Journal of Legal Studies 1990 19:2, 261-296
Theories of Water Law, Samuel C. Wiel, Harvard Law Review, Vol. 27, No. 6 (Apr., 1914), pp. 530-544
1. Kevin Comer
2. Cedric Perret 3. Chris Fussner 4. Jared Edgerton
Academic hiring networks
Project discontinued due to data availability issues.
Make deep neural networks more biologically accurate by including inter-neural travel times
Make deep neural networks more biologically accurate by including inter-neural travel times. Train with some normal task like digit-recognition.
- Currently, deep neural networks only share some similarity to actual neurons: threshold behavior and hierarchical representations.
- However, in real neural networks, signals travel with finite speed and activations are integrated over time
- This ignored aspect could be one reason why real neuronal networks/brains are superior
- Further connecting the two fields of neuroscience and deep learning would be pretty cool
- We could use the "regular" neural network machinery to optimize weights etc for tasks like forecasting/image recognition and then see whether we find neural avalanches and chaotic behavior etc.
Details (first ideas):
- In artificial neural networks, different neurons are connected by weights. To this, we add another connection between the neurons: the inter-neuron travel time.
- The inter-neuron travel time is computed by an RNN
- Inference works by letting the network oscillate/come to an equilibrium
- activation of neuron i at time t: a_i(t) = sum over connected neurons j [ f(a_j(t)) * delta(rnn(j->i) - t) + exp(-lambda*t) * f(a_i(t)) ], where delta is the Kronecker delta.
- I.e. the signal from connected neurons arrives at the time specified by the RNN and then slowly decays with exponent lambda
- if the RNN just gives t=1 for all travel times, this essentially reduces to the normal deep neural net.
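Here is one possible discrete-time reading of the update rule above, with two simplifications for the sketch: edge delays are fixed integers rather than RNN outputs, and tanh stands in for the activation f. Signals emitted by neuron j arrive at neuron i after delays[j][i] steps, and each neuron keeps an exponentially decaying trace of its own previous activation.

```python
import math

def run_delayed_net(weights, delays, x0, steps=10, lam=2.0):
    """Simulate a network where edge j->i has weight weights[j][i] and
    integer travel time delays[j][i] (>= 1); lam is the decay rate of
    a neuron's own past activation. Returns the activation history."""
    n = len(weights)
    f = math.tanh  # illustrative nonlinearity
    history = [list(x0)]
    for t in range(1, steps + 1):
        current = []
        for i in range(n):
            a = math.exp(-lam) * history[-1][i]  # decaying self term
            for j in range(n):
                d = delays[j][i]
                if weights[j][i] != 0 and t - d >= 0:
                    # signal emitted by j at time t-d arrives at i now
                    a += weights[j][i] * f(history[t - d][j])
            current.append(a)
        history.append(current)
    return history

# Tiny 3-neuron chain 0 -> 1 -> 2, with delay 1 on 0->1 and delay 2 on 1->2
W = [[0, 0.5, 0], [0, 0, 0.8], [0, 0, 0]]
D = [[1, 1, 1], [1, 1, 2], [1, 1, 1]]
hist = run_delayed_net(W, D, [1.0, 0.0, 0.0], steps=5)
```

In this chain the input reaches neuron 2 only at step 3 (delay 1 plus delay 2), which is exactly the finite-travel-time effect the project wants to add; with all delays equal to 1 the update reduces to an ordinary recurrent pass.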
Evolution of social norms as a process within or between societies
Currently, there are two ideas floating around:
- How do social norms evolve *within* a society? Method-wise this is perhaps related to the spread of ideas/information on a social network, where the agents are people. Potentially relevant models: opinion formation, infectious (disease) spread, and/or games on social networks.
- Think of a whole society (group/tribe/nation/etc.) as an agent. A society may adopt or discard various social norms over time. If one of the chosen social norms (or a combination of them) is woefully impractical, it decreases the "fitness" of the whole society, and the society loses members/power/resources/territory to competing societies. Potentially relevant models: models from evolutionary game theory.
The project can focus on (1), (2), or both.
Branch: Agent Based Models and System Dynamics
This branch seeks to use the two tools of ABMs and system dynamics (SD) to further understand how social norms emerge through individual interaction from the bottom up (ABM) and how governing mechanisms then influence and shape those norms from the top down (SD). Ideally, individual agents will even be able to select between emergent social norms and governing institutions, which then further influences the feedbacks and system behavior.
The current challenge is finding a parsimonious construct and identifying the key elements of the model needed to create the desired dynamics and analyze the resulting behavior.
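As a minimal starting point for the ABM side, here is a voter-model sketch: agents on a ring copy a random neighbour's norm (0 or 1), so local interactions gradually produce shared norms. It is a stand-in only; the SD branch would add a top-down term biasing which norm gets copied.

```python
import random

def voter_model(n=100, steps=5000, seed=1):
    """Minimal bottom-up norm dynamics: n agents on a ring, each step
    one random agent copies the norm of a random neighbour. Returns
    the final norm configuration (a list of 0s and 1s)."""
    rng = random.Random(seed)
    norms = [rng.randint(0, 1) for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)
        neighbour = (i + rng.choice((-1, 1))) % n  # ring topology
        norms[i] = norms[neighbour]
    return norms

final = voter_model()
```

Over time the configuration coarsens into large blocks of shared norms; a governing-institution term (e.g. a small probability of adopting an "official" norm instead of a neighbour's) would be the simplest way to couple in the top-down SD mechanism.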
Interested in Branch : Tom, Carlos Marino, Duy Huynh
Branch: Emergence of institutions on trade networks
The medieval period saw the emergence of institutions that affected or controlled long-distance trade. In short, these institutions could provide information on potential trade partners in exchange for resources. Can we explain which mechanisms led to their emergence? To do this, we could use models from game theory (to simulate the trading), evolution, network theory, and economics. Of course, we can also explore different institutions or trading networks.
Something related: Avner Greif. "Reputation and Coalitions in Medieval Trade: Evidence on the Maghribi Traders". The journal of Economic History. (1989)
To model institutions in a game theory form: Leonid Hurwicz, "Institutions as families of game forms", (1996) The Japanese Economic Review
Interested in Branch:
- Ostrom, Elinor. "Collective action and the evolution of social norms." Journal of economic perspectives 14.3 (2000): 137-158.
- Sethi, Rajiv, and Eswaran Somanathan. "The evolution of social norms in common property resource use." The American Economic Review (1996): 766-788.
- Centola, Damon, et al. "Experimental evidence for tipping points in social convention." Science 360.6393 (2018): 1116-1119.
- Powers et al, "How institutions shaped the last major evolutionary transition to large-scale human societies" Phil. Trans. R. Soc. (2016)
- Daniels, B. C., Krakauer, D. C., & Flack, J. C. (2017). Control of finite critical behaviour in a small-scale social system. Nature communications, 8, 14301. https://www.nature.com/articles/ncomms14301.pdf
- Lorini, G., & Marrosu, F. (2018). How individual habits fit/unfit social norms: from the historical perspective to a neurobiological repositioning of an unresolved problem. Frontiers in Sociology, 3, 14. https://www.frontiersin.org/articles/10.3389/fsoc.2018.00014/full
- Martin, R., & Sunley, P. (2006). Path dependence and regional economic evolution. Journal of economic geography, 6(4), 395-437.
- Cioffi-Revilla, C. (2005). A canonical theory of origins and development of social complexity. Journal of Mathematical Sociology, 29(2), 133-153. https://www.researchgate.net/publication/233820732_A_Canonical_Theory_of_Origins_and_Development_of_Social_Complexity
Alice, Vandana, Alan, Xindi, Jenn, Matt, Sandra, Kevin, Alex, Cedric, Subash, Josefine, Tom, Carlos
Topological features of neutral networks in evolution
In a genotype network, nodes are genotypes and a link from genotype A to genotype B indicates that they are separated by a single mutation. Each genotype has a phenotype associated with it. In a fixed environment, a phenotype is associated with a fixed fitness value. So for every node, one has:
GENOTYPE -> PHENOTYPE -> FITNESS VALUE
The fitness values form a "fitness landscape", in which one can embed the genotype network. The set of nodes in a genotype network that correspond to the same fitness value forms a *neutral network*. These networks have received little or no attention from network scientists. Let's change that!
Depending on the interest of participants, this project could focus on (1) data analysis or (2) network theory.
(1) Szendro et al. mention that empirical data for genotype networks and their neutral networks are available. This is a somewhat recent development (< 10 years). One could scout for one or several available data sets and study the topology of the networks. For example,
- what are topological characteristics of genotype networks? Can these characteristics be explained by constraints of embedding on a curved manifold? (One could compare data to random graph models, e.g. Erdos-Renyi, small world, or geometric random graph models.)
- how are neutral networks for high or low fitness values different?
- one could also think of the genotype network as a multilayer network with a lot of layers ... and analyse its topology from a multilayer perspective.
(2) A neutral network is a "level-set network" in the genotype network. The genotype network is a network that is embedded in a curved manifold in a high-dimensional space. There is so much cool math/physics/topology that one could do with this!!
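As a toy version of the objects above, the sketch below builds the full genotype network for binary genotypes of length 4 (a hypercube: nodes at Hamming distance 1 are linked), assigns a made-up two-valued fitness map, and extracts the neutral networks as level sets. The fitness assignment is random and purely illustrative.

```python
from itertools import product
import random

def genotype_network(L=4):
    """All binary genotypes of length L; edges join genotypes that
    differ by a single mutation (Hamming distance 1)."""
    nodes = ["".join(g) for g in product("01", repeat=L)]
    edges = {(a, b) for a in nodes for b in nodes
             if a < b and sum(x != y for x, y in zip(a, b)) == 1}
    return nodes, edges

def neutral_networks(nodes, edges, fitness):
    """Level sets of GENOTYPE -> FITNESS: for each fitness value, the
    edges of the induced subgraph on genotypes sharing that value."""
    levels = {}
    for g in nodes:
        levels.setdefault(fitness[g], set()).add(g)
    return {f: {(a, b) for (a, b) in edges if a in gs and b in gs}
            for f, gs in levels.items()}

rng = random.Random(0)
nodes, edges = genotype_network(4)
# Toy genotype -> fitness map: two fitness values, assigned at random
fitness = {g: rng.choice((0.2, 0.9)) for g in nodes}
neutral = neutral_networks(nodes, edges, fitness)
```

Swapping the random fitness map for one of the empirical landscapes in Szendro et al. would turn this toy into the data-analysis direction (1).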
- Szendro, Ivan G., et al. "Quantitative analyses of empirical fitness landscapes." Journal of Statistical Mechanics: Theory and Experiment 2013.01 (2013): P01005.
- De Visser, J. Arjan Gm, and Joachim Krug. "Empirical fitness landscapes and the predictability of evolution." Nature Reviews Genetics 15.7 (2014): 480.
- Kondrashov, Dmitry A., and Fyodor A. Kondrashov. "Topological features of rugged fitness landscapes in sequence space." Trends in Genetics 31.1 (2015): 24-33.
- Reza Rezazadegan, Chris Barretta, Christian Reidys. "Multiplicity of phenotypes and RNA evolution". Journal of Theoretical Biology (2018). Paper on percolation of neutral space in 100-base-pair RNAs given energetic minimum folding.
Kofi K. (background in cancer genomics, data mining, and bioinformatics tools) email@example.com
Networks from thresholded normally distributed data
- real-world networks are often created by thresholding dyadic interactions;
- lots of things are approximately normally distributed.
- Suppose for each pair of nodes, i and j, there is a normally distributed interaction: x_ij ~ Normal(0,1);
- Then, we place edges between nodes i and j whenever x_ij>threshold;
- Edge correlations could be controlled by a single parameter, i.e. Cov(x, y) = beta .
- The resulting degree distributions have two limiting forms, approximately Poisson or power-law(ish) (with an exponential cut-off), with something intermediate in between (log-normal?)
Things to look at:
- Can we solve for the degree distribution of this model?
- Does this degree distribution look like real networks? Can we fit the model easily (e.g. maximum likelihood or method of moments)?
- What about the giant component phase transition?
- Does clustering vanish in the limit of large network size?
This would be a more mathematical/theoretical project, and less about real world data.
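For intuition, here is a quick simulation sketch of the model, using one simple (assumed) construction for the correlated dyadic variables: a shared latent value z_i per node plus independent dyadic noise, which gives Var(x_ij) = 1 and Cov(x_ij, x_ik) = beta for dyads sharing node i (valid for beta <= 1/2). The theoretical project would of course work with the distribution directly rather than samples.

```python
import math
import random

def thresholded_network(n=400, beta=0.2, threshold=1.0, seed=0):
    """Sample the model: each node i gets a latent z_i ~ N(0,1), and
        x_ij = sqrt(beta)*(z_i + z_j) + sqrt(1 - 2*beta)*eps_ij
    so each x_ij is standard normal with Cov(x_ij, x_ik) = beta.
    Place an edge whenever x_ij > threshold."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    a, b = math.sqrt(beta), math.sqrt(1 - 2 * beta)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            x = a * (z[i] + z[j]) + b * rng.gauss(0, 1)
            if x > threshold:
                edges.add((i, j))
    return edges

def expected_edge_prob(threshold):
    """Marginal edge probability 1 - Phi(threshold) for x_ij ~ N(0,1)."""
    return 0.5 * (1 - math.erf(threshold / math.sqrt(2)))

edges = thresholded_network()
p_hat = len(edges) / (400 * 399 / 2)  # compare to 1 - Phi(1) ~ 0.159
```

The empirical degree distribution from such samples is a useful check on any analytical answer to the first question above.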
- George (background in physics and networks)
The Evolution of Beliefs in Abrahamic Religions
One commonality across the Abrahamic faiths (Judaism, Christianity, Islam, and others) is their reliance on the written word to solidify and codify beliefs, even centuries after the texts were documented. Because of the large gap in time between when the core documents (Torah, New Testament, Qur'an) were written and when these beliefs continue to grow and evolve, decisions are often linked to other texts as justification. For instance, when Ecumenical Councils declare a new testament of faith, they often point to earlier texts from church fathers for justification (or sometimes from non-believers, like pre-Christian Greek philosophers). Similarly, when imams declare testaments of faith, these are often linked to the hadiths and sirahs as justification. Canon law and Islamic law are based on these two dynamics, respectively. Religions also influence each other, both as attractors (Islam prompted Iconoclasm in Eastern Christianity) and repulsors (early Christianity set itself in opposition to Judaic practices, despite beginning as a Jewish sect).
2. Carlos Marino
3. Pete K.
4. Xiaoyu (Background in control theory, Interested in Chinese Taoism)
City as a Complex System: Clustering/Mobility Network Effects
Cities are complex systems within which many sub-systems develop, interact and evolve. Understanding how the different systems within a city interact and connect with each other can help inform better urban planning decisions that support different communities and ecosystems.
Through this study, we aim to gain insight into human choices, the development of ecosystems, and spatial distribution in a city. Data from Singapore is available as a case study.
Some possible research questions are listed below, but feel free to add on any ideas related to this topic and we can discuss how to go from there!
Possible research questions (open to more ideas!):
a. Business Clustering & Flow of Capital (human and/or monetary) between Industries
Motivation: To bring jobs closer to homes, a polycentric structure can be developed to establish multiple employment nodes in different areas of a city. Understanding how business ecosystems develop, and which factors support successful business/industrial clusters, can help inform strategies to establish and facilitate the growth of polycentricity.
- Are there clustering effects for businesses/industries across different sectors, and how can this be measured/analyzed?
- What are the implications of clustering on the performance of businesses and industries?
- What are the drivers to facilitate a sustainable business ecosystem?
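One standard way to quantify the clustering asked about in a. is the location quotient: a sector's employment share in a district divided by its share citywide, with LQ > 1 indicating over-representation. The sketch below uses invented district and employment numbers, not Singapore data.

```python
def location_quotients(employment):
    """employment[district][sector] -> job count.
    Returns LQ[district][sector]: the sector's share of district
    employment divided by its share of citywide employment."""
    sectors = {s for d in employment.values() for s in d}
    city_total = sum(sum(d.values()) for d in employment.values())
    city_share = {s: sum(d.get(s, 0) for d in employment.values()) / city_total
                  for s in sectors}
    lq = {}
    for name, d in employment.items():
        d_total = sum(d.values())
        lq[name] = {s: (d.get(s, 0) / d_total) / city_share[s]
                    for s in sectors if city_share[s] > 0}
    return lq

# Hypothetical employment counts for two districts
emp = {"CBD":    {"finance": 800, "manufacturing": 100},
       "Jurong": {"finance": 100, "manufacturing": 700}}
lq = location_quotients(emp)
```

With the input-output tables linked below, the same calculation could run per sector and planning area, and the resulting LQ matrix is a natural input for clustering analysis.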
b. Intra-city Public Transport (PT) Mobility Patterns
Motivation: The study of people's PT mobility patterns within a city allows us to understand human movement, choices and interactions. By relating mobility patterns to the built environment and demographic make-up of different areas, we can gain insight into the human-environment relationship. This supports the formulation of more informed policy decisions and urban planning strategies that cater to the needs of society at both the local and macro level.
- To study how PT mobility patterns within/across towns differ
- Relationship between PT mobility and factors such as a town's demographic profile, job/worker ratio, and land use mix
Concept Plan: https://www.ura.gov.sg/Corporate/Planning/Concept-Plan/Past-Concept-Plans
New employment districts:
Industry Input-Output Raw Files: https://www.singstat.gov.sg/find-data/search-by-theme/economy/national-accounts/latest-data
Industry Input-Output 2010 Summary: https://www.singstat.gov.sg/-/media/files/publications/economy/io_tables_2010_publication.pdf
Industry Input-Output Explanation: https://www.singstat.gov.sg/-/media/files/publications/economy/ssnmar15-pg9-14.pdf
OECD Input-Output (for reference): https://www.dartmouth.edu/~rstaiger/OECD%20Input-Output%20Database.pdf
Shantal, Alex, Jared, Sanna, Kevin, Chris, Sarah B.
The Evolution of Water Narratives in US Newspapers
The complex interactions between physical and social factors in water management have led to the emergence of a new field, socio-hydrology. Socio-hydrologists study various dynamics, including the influence of economics, culture, and institutions on behaviors related to water. This study focuses on improving our understanding of the evolution of social narratives around water in local US newspapers.
We have access to ~2 million newspaper articles (across 37 newspapers/34 states over 15 years) from the LexisNexis database that touch on water in some shape or form.
The data provides a lot of opportunities for play! Given the textual nature of the dataset, we will draw heavily on natural language processing techniques (http://mschoonvelde.com/assets/pdf/Syllabus_CEU.pdf). Currently, we are thinking of exploring a variety of natural language processing techniques (e.g., word2vec and sentiment analysis) together with network evolution techniques to help us characterize and understand the evolution of narratives. We can also try to understand the impact of these social narratives on local behavior by looking at:
- legal behavior (cases through LexisNexis)
- water conservation policies (Gilligan, J. G., Wold, C. A., Worland, S. C., Nay, J. J., Hess, D. J., & Hornberger, G. M. (2018). Urban water conservation policies in the United States. Earth's Future. https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2017EF000797)
There are of course other directions the project can evolve. If you are interested, please put your name down below and join us on slack (#waternewspapers) to be a part of the conversation!
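As a minimal sketch of the simplest narrative-tracking step, the snippet below computes keyword rates per year over a toy corpus; the actual pipeline would apply word2vec/sentiment models to the LexisNexis articles. The corpus snippets here are invented stand-ins, not real articles.

```python
from collections import Counter

def keyword_trend(articles, keywords):
    """articles: list of (year, text) pairs.
    Returns {year: {keyword: occurrences per 1000 words}}."""
    buckets = {}
    for year, text in articles:
        tokens = text.lower().split()
        counts = Counter(tokens)
        bucket = buckets.setdefault(year, Counter())
        bucket["_words"] += len(tokens)
        for kw in keywords:
            bucket[kw] += counts[kw]
    return {year: {kw: 1000 * b[kw] / b["_words"] for kw in keywords}
            for year, b in buckets.items()}

# Toy stand-in corpus (hypothetical snippets, not LexisNexis data)
corpus = [(2003, "drought fears grow as reservoirs fall"),
          (2003, "city debates water rates"),
          (2017, "conservation rebates cut water use during drought")]
rates = keyword_trend(corpus, ["drought", "conservation"])
```

The same per-year rates, computed over real article text with proper tokenization and preprocessing (see Denny & Spirling below), would give a first picture of how water narratives shift over time and across states.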
- Marelli, B. (2008). Common Pool Resources: the Search for Rationality through Values. Empirical Evidence for the Theory of Collective Action in Northern Italy. https://dlc.dlib.indiana.edu/dlc/bitstream/handle/10535/1344/Marelli_119601.pdf?sequence=1 (Think about how the newspapers and their narratives are affecting the capacity for collective action around shared pool resources)
- Boumans, Jelle W., and Damian Trilling. "Taking stock of the toolkit: An overview of relevant automated content analysis approaches and techniques for digital journalism scholars." Digital Journalism 4.1 (2016): 8-23.
- Denny, M. J., & Spirling, A. (2018). Text preprocessing for unsupervised learning: why it matters, when it misleads, and what to do about it. Political Analysis, 26(2), 168-189. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2849145
- Lewis, S. C., Zamith, R., & Hermida, A. (2013). Content analysis in an era of big data: A hybrid approach to computational and manual methods. Journal of Broadcasting & Electronic Media, 57(1), 34-52.
- Welbers, K., van Atteveldt, W., & Benoit, K. (2017). Text Analysis in R. Communication Methods and Measures, 11(4), 245-265. http://kenbenoit.net/pdfs/text_analysis_in_R.pdf
- Greene, Z., Ceron, A., Schumacher, G., & Fazekas, Z. (2016). The nuts and bolts of automated text analysis. Comparing different document pre-processing techniques in four countries. https://osf.io/ghxj8/
- Azarbonyad, H., Dehghani, M., Beelen, K., Arkut, A., Marx, M., & Kamps, J. (2017, November). Words are Malleable: Computing Semantic Shifts in Political and Media Discourse. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 1509-1518). ACM. https://arxiv.org/abs/1711.05603
- Zhang, P., & Moore, C. (2014). Scalable detection of statistically significant communities and hierarchies, using message passing for modularity. Proceedings of the National Academy of Sciences, 111(51), 18144-18149. http://www.pnas.org/content/pnas/111/51/18144.full.pdf
- I wanted to throw this out as a possible method: https://www.erikgjesfjeld.net/evolution-of-diversity.html -- mapping concepts to keywords and looking for changing frequencies through time; it should be easy to create a spatial component.
Thushara, Jenn, Sandra, Kevin, Matt, Conor
Reproducibility and Underdeterminacy in Mathematical Modeling
The reproducibility crisis has shaken the scientific world as many findings have failed to replicate in new experiments and datasets. At the same time, the rise of highly accurate predictive machine learning methods challenges the notion that we need deep scientific understanding in order to make predictions about the world around us. Will developing scientific theory still be necessary, or practicably justifiable, if we can just get enough data?
There are several dimensions of this tension that we could explore:
- We think of physics as the area that achieves the highest degree of predictive accuracy among all sciences. Can deep learning predict physical scenes more accurately than physical models? If not, perhaps there is still hope for science. If so, perhaps we need to rethink either the practice of physics or the authority of prediction.
- Data is often brought to bear when trying to provide evidence for a mechanistic model. In order to establish firm evidence, strong alternative hypotheses must be specified. Yet in many cases, alternative models are not even compared, or when they are compared, the alternatives are weak strawmen. Can typical datasets actually uniquely identify mechanistic models among alternatives? One approach to answering this question is to generate data according to a known model (e.g., from published papers) and see whether an analyst who does not know the true model can infer it, or whether multiple different models provide equally good accounts of the data.
- If we want to hold out hope that our scientific models are useful for prediction in the face of machine learning, perhaps we can productively combine structured scientific models with less structured machine learning approaches, e.g., by predicting model residuals. Do structured models actually help in such a joint model, above and beyond machine learning alone?
- Can collecting richer data, such as finer-grain neural data or interviews / ethnographies in social science, help resolve any indeterminacy we identify, or would having richer data simply make machine learning more effective as an alternative?
Some additional random thoughts (in defense of science) by Jonas:
- a lot of the flexible (low inductive bias) function approximators need a lot of (labeled) data; is this realistic in all/many/some scientific disciplines? For instance, in psychology there are fundamental limits to how many subjective measurements one can take of an individual on a given day, both in frequency and in the number of variables p; in such situations, adding more inductive bias (e.g. through understandable parametric models) is possibly a good idea
- convenience samples vs. samples from a proper sampling scheme (probably not a big problem in classifying cat pictures, but maybe a bigger problem in more contextualized phenomena)
- observational data vs. experimentation
- predictive models (function approximation) vs. causal models (building a model of the world); Pearl argues for the latter and against the former in his new book (though I haven't read it yet)
- in many situations it is not so interesting to predict variables; rather, we want to come up with useful interventions on them. This is difficult with black-box approximators whose parameters do not map to concepts we have about the real world.
Kleinberg et al. (2017) The Theory Is Predictive, but Is It Complete? An Application to Human Perception of Randomness
Youyou (2015) Computer-based personality judgments are more accurate than those made by humans
Pearl and Mackenzie (2018) The Book of Why
Classifying language by grammatical motifs
Every once in a while we get people from different countries sitting around a table (at CSSS, for example!) and come across words, idioms, or concepts that we can't accurately translate from one language to another. There are lots of words that exist only in one language but not in others. Consequently, there are lots of concepts that exist only in one language but not in others. In this project, let us explore the differences between languages by "higher-order grammatical structure", not just single words.
We can take a sentence and think of its structure as a small network (also called a motif) of words. Nodes are subjects and objects that are linked via verbs or prepositions (see for example https://en.wikipedia.org/wiki/Object_(grammar) ). Taking a text and counting the recurrence of sentence structures, we can get a distribution of motifs. Let us explore whether we can use this distribution of motifs to characterise different texts. Comparisons could be
- between texts in different languages
- between British and American English
- between texts for different purposes (fiction/novels, news, scientific writing, policy, etc.)
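As a rough sketch of what the motif-distribution idea could look like computationally (the grammatical-role tuples below are invented; a real pipeline would obtain them from a dependency parser such as spaCy):

```python
from collections import Counter

# Toy sketch: each sentence is reduced to a structural signature,
# here a tuple of coarse grammatical roles. In practice these would come
# from a dependency parse; the tuples below are made up for illustration.
parsed = [
    ("SUBJ", "VERB", "OBJ"),
    ("SUBJ", "VERB", "OBJ", "PREP", "OBJ"),
    ("SUBJ", "VERB", "OBJ"),
    ("SUBJ", "VERB"),
]

def motif_distribution(sentences):
    """Relative frequency of each sentence-structure motif in a text."""
    counts = Counter(sentences)
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()}

def l1_distance(p, q):
    """Simple distance between two motif distributions, for comparing texts."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

dist = motif_distribution(parsed)
print(dist[("SUBJ", "VERB", "OBJ")])  # 0.5
```

Two texts (or two languages) could then be compared by the distance between their motif distributions.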
Given the diversity of the CSSS crowd, this would be a unique opportunity to work on the comparison of texts in different languages!
The main challenge would be to develop a text mining algorithm that can give us the motif distribution for a text (in a given language). This project could benefit from
- expertise in computational linguistics, text mining, and machine learning
- a diverse team of people who speak different languages.
Mousavi, Hamid, et al. "Mining semantic structures from syntactic structures in free text documents." Semantic Computing (ICSC), 2014 IEEE International Conference on. IEEE, 2014.
Structures in Open Source Software Communities
A lot of open source software projects organize through mailing lists. These mailing list interactions, in combination with, for example, data from GitHub, could give some insight into how those groups organize. Possible interesting questions include:
- How does project size influence the structure?
- Which members collaborate more/less?
- Who collaborates on specific code pieces?
- How does communication behavior influence the position of contributors in the community? (sentiment analysis?)
- ... your ideas ...
Existing Work in this Field?
- linux kernel https://lkml.org/lkml/2016/
- diaspora https://github.com/diaspora/diaspora
- cyanogenmod https://github.com/CyanogenMod (https://github.com/LineageOS)
- apache https://github.com/apache
Maria W Cedric P
Measuring information distortion in networks (rumors/fake news)
Analyzing analytically, numerically and experimentally how information gets distorted in networks when passed between people. The network is layered (people in one layer pass the message to people in the next layer). In-degrees and out-degrees are fixed (1, 2, 3, ...).
Possible parameters: error rate, degree, length of chains, number of agents, speed of news propagation (internet vs newspapers etc.)
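A minimal simulation sketch of this setup (the parameters and the copy-with-noise channel are my assumptions, not a fixed design): messages are bit strings, each node in a layer copies the message from a random node in the previous layer, and each bit flips with a fixed error rate. Distortion relative to the original message grows with depth.

```python
import random

def transmit(msg, error_rate, rng):
    """Each bit flips independently with probability error_rate."""
    return [b ^ (rng.random() < error_rate) for b in msg]

def distortion_by_layer(n_layers=10, width=3, msg_len=100,
                        error_rate=0.05, seed=0):
    """Average Hamming distortion from the original message, per layer."""
    rng = random.Random(seed)
    original = [rng.randint(0, 1) for _ in range(msg_len)]
    layer = [original] * width
    avg = []
    for _ in range(n_layers):
        # each node in the next layer copies (noisily) from a random node above
        layer = [transmit(rng.choice(layer), error_rate, rng)
                 for _ in range(width)]
        hams = [sum(a != b for a, b in zip(original, m)) / msg_len
                for m in layer]
        avg.append(sum(hams) / width)
    return avg

curve = distortion_by_layer()
print(curve[0], curve[-1])  # distortion grows with chain length
```

Varying error rate, width (degree), and chain length would map out the parameter space sketched above.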
11. R Maria
- Safe Cast: Radiation and air quality data primarily for Fukushima and Tokyo https://blog.safecast.org/downloads/
Measuring epigenetic effect of stress at a macro scale
Epigenetic processes describe environmental effects on genome expression/regulation which are transmitted to the next generations. In particular, recent research indicates that stress in humans can have transgenerational effects. Can these epigenetic effects be detected in data at a macro scale, for instance after a global stressful crisis (world war, etc.)?
1. Israel Rosenfield and Edward Ziff. "Epigenetics: The Evolution Revolution" The New York Review of Books (2018)
2. McGuiness et al. "Socio-economic status is associated with epigenetic differences in the pSoBid cohort" International Journal of Epidemiology (2012)
3. Uddin et al. "Epigenetic and immune function profiles associated with posttraumatic stress disorder". Proceedings of the National Academy of Sciences (2010)
4. Borders et al. "Chronic stress and low birth weight neonates in a low-income population of women." (2007) DOI: https://doi.org/10.1097/01.AOG.0000250535.97920.b5
5. Miller GE, Chen E, Parker KJ. Psychological Stress in Childhood and Susceptibility to the Chronic Diseases of Aging: Moving Towards a Model of Behavioral and Biological Mechanisms. Psychological bulletin. (2011). doi:10.1037/a0024768.
6. Jack P. Shonkoff, Andrew S. Garner. "The Lifelong Effects of Early Childhood Adversity and Toxic Stress." Pediatrics. (2012), DOI: 10.1542/peds.2011-2663
1. Cedric P
2. Sarah B.
3. Chathika G.
4. Simon J.
5. Kofi K (background in bioinformatics, data-mining, behavioral psychology, microbiology)
6. Nam Le
Topology of natural conversations
Everyone who belongs to a WhatsApp political discussion group (or any other discussion group on a specific topic) knows that consensus is difficult to reach. People seem to go back and forth in their arguments trying to convince others of their own views. Looks like a dynamical system to me! I would like to use what we learned from Joshua's talk and what we will learn from Simon DeDeo's lectures to represent each text sent as a point along a one-dimensional opinion continuum. The state of the conversation can then be represented as a point moving through the state space composed of every person participating in the conversation. Is there an attractor? Is it a strange attractor? What is its topology? What does that topology look like when people are arguing versus when they are planning or simply chatting? Hit me up if you are interested!
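One standard first step for this kind of analysis is a Takens delay embedding of the one-dimensional opinion series, which turns the conversation into a trajectory in a reconstructed state space whose geometry can then be inspected for attractors. A minimal sketch (the opinion scores below are invented; in practice they would come from a text model):

```python
def delay_embed(series, dim=3, tau=2):
    """Takens-style delay embedding of a 1-D series into dim-dimensional points."""
    n = len(series) - (dim - 1) * tau
    return [tuple(series[i + k * tau] for k in range(dim)) for i in range(n)]

# Toy opinion trajectory: messages scored on a [-1, 1] stance axis.
# These scores are made up; a real pipeline would score actual messages.
opinions = [0.1, 0.4, -0.2, 0.5, -0.4, 0.6, -0.5, 0.7, -0.6, 0.8]
points = delay_embed(opinions, dim=2, tau=1)
print(len(points), points[0])
```

The resulting point cloud is what standard attractor diagnostics (recurrence plots, correlation dimension, etc.) would operate on.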
1. Niccolo (proponent)
Scaling of information requirements in living things
Information about the environment is a resource that organisms must take in and process to survive, just like energy/nutrients. Inspired by West's talk, I wonder how this requirement might scale as a function of mass. Bacteria sense chemical concentrations in their environments, while more advanced organisms process increasingly sophisticated kinds of information (visual, social, and so on). However, we can simply ask how many bits per unit time are required by various creatures. By analogy with the principles underlying metabolic scaling, I would guess that bigger organisms are able to do more with less because larger networks might allow for greater processing power. On another level, innovations in processing like the emergence of nerves and brains might change that picture.
The nice thing about this project is that I think it ought to be relatively easy; if we read enough existing papers I think we should be able to produce reasonable estimates of information requirements, and there will be a story behind the answer one way or another.
Thoughts? Recommended Papers?
1. Elan (proponent)
4. Kofi K (background in bioinformatics, data-mining, microbiology & genomics)
5. Louisa (background in societal metabolism & sustainability)
6. Xiaoyu Wang (background in control theory)
Modeling Utility of Cryptocurrency in a Failing Economy
Since the inception of Bitcoin in 2009 after the financial crisis, cryptocurrency has re-awakened the ideas and dreams of 90s cypherpunks in creating alternative and decentralized monetary systems that are separate from the state monopoly on currency production.
In the cryptocurrency space, a popular idea is that cryptocurrency would work best in a failing economy with hyperinflation and institutional weakness. It has been said that cryptocurrency can be looked at as insurance against politicians (Naval Ravikant).
We are exploring and modeling the possibility of agents switching and/or diversifying the monetary technology they use when macro effects of hyperinflation and weakening institutions are at play. We would composite data sets (CPI, market index, BTC price, GDP) in order to create a susceptibility index that may indicate the likelihood of a person switching monetary systems.
- New P2P Paradigm: https://www.hindawi.com/journals/misy/2018/2159082/
- Metcalfe Law in regards to Network Value: http://novel.ict.ac.cn/zxu/JournalPDF/Zhang_JCST_2015.pdf
- Governance Model Overview: https://blockchainconsultants.io/blockchain-governance-models/
- Governance Article of just one blockchain (Decred): https://www.cryptocompare.com/coins/guides/a-look-at-decreds-governance-system/
- Article on Tokenized Securities: https://medium.com/@apompliano/the-official-guide-to-tokenized-securities-44e8342bb24f
- Example of a decentralized open source coin explorer: http://explorer.threeeyed.info/info
- Does cryptocurrency exist as a pure betting market, outside cash and financial institutions?
- Can cryptocurrency be used as an insurance against weak institutions and failing economies?
1. Yuki Asano
2. Laura Mann
3. Carlos Marino
4. Chris Fussner
5. Eleonora Mavroeidi
Twitractors: What kind of non-linear dynamic attractors exist across OSM discussions
Online social media discussions center around emotion-driven exchanges of information on current topics in which participants often have considerable social and cognitive investment. Typically, the participants in these discussions hold both opposing and supporting views, leading to the emergence of collective effects such as polarization or information cascades. The result is a "heartbeat" of emotion, signifying the global collective emotion of society regarding the topic under discussion.
In this project, we will explore this collective "heartbeat" over many topics on Twitter through non-linear time series analysis.
Join the discussion at #Twitractor on slack
Twitter Firehose data with sentiment analysis.
Social Networks and International Relations
This project draws from the logic of Paul Hooper's research on cooperation dynamics in communities and the fractal and scaling presentations. I think interactions between countries follow social dynamics similar to those of families, hunter-gatherer groups, organizations, and groups within countries. I would be interested in simulating the conditions under which countries cooperate. I think there are clear analogs in the periods of colonization, WWI, and WWII. This approach would also be novel in international relations research.
I imagine this would be modeled with ABMs, referencing historical periods.
1. Jared Edgerton
Fluctuations in correlated data, random variables or models
When estimating observables (e.g. parameters) from datasets, we need to quantify the error associated with our estimates in order to decide whether or not they are statistically significant. In sets of correlated data, the correlations may produce fluctuations that affect the error of our estimators. In this project we are interested in studying how the fluctuations depend on the sample size in different sets of data, simulations or models that the participants bring. In particular, when the fluctuations are anomalously suppressed, the phenomenon is known as hyperuniformity. The fingerprint of these systems is the suppression of fluctuations on large scales, manifesting a regularity that is not apparent on short scales. It can be found in systems of any dimension; examples include jammed packings, crystal-like materials and some biological tissues such as the chicken retina.
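As a concrete illustration of the kind of fluctuation analysis we have in mind, the sketch below (all parameters chosen arbitrarily) compares the number variance in randomly placed windows for an uncorrelated (Poisson) point set versus a jittered lattice, where large-scale fluctuations are suppressed:

```python
import random

def number_variance(points, window, domain, n_samples=2000, seed=1):
    """Variance of the number of points falling in a randomly placed window."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_samples):
        x0 = rng.random() * (domain - window)
        counts.append(sum(x0 <= p < x0 + window for p in points))
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)

rng = random.Random(0)
N, domain, window = 1000, 1000.0, 50.0
poisson = [rng.random() * domain for _ in range(N)]             # uncorrelated
lattice = [i + 0.2 * rng.random() for i in range(int(domain))]  # hyperuniform-like
var_p = number_variance(poisson, window, domain)
var_l = number_variance(lattice, window, domain)
print(var_p > var_l)  # fluctuations are suppressed in the ordered set
```

For the Poisson set the count variance grows like the window size, while for the jittered lattice it stays of order one; studying that scaling is exactly the hyperuniformity diagnostic.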
- hyperuniformity in buses: 
- foundations&examples: Torquato S. and Stillinger F. H., Phys. Rev. E, 68 (2003) 041113.
- hyperuniformity in jammed particle systems: L. Berthier, P. Chaudhuri, C. Coulais, O. Dauchot, and P. Sollich, Phys. Rev. Lett. 106, 120601 (2011).
- hyperuniformity in the chicken retina: Jiao Y., Lau T., Hatzikirou H., Meyer-Hermann M., Corbo J. C. and Torquato S., Phys. Rev. E, 89 (2014) 022721.
- hyperuniformity in an avalanche model: Garcia-Millan, R., Pruessner, G., Pickering, L., & Christensen, K. (2017). Correlations and hyperuniformity in the avalanche size of the Oslo Model, arXiv preprint arXiv:1710.00179.
Understanding Cardiac Dynamics in Health and Disease (#cardio)
Arrhythmias (abnormal electrical activity of the heart) are common cardiac diseases and are amongst the most common causes of impaired quality of life and death. I am particularly interested in two of the most complex cardiac arrhythmias, namely 1. atrial fibrillation (disorganized electrical activity in the upper chambers of the heart, i.e. the atria; not lethal but very disabling) and 2. ventricular fibrillation (disorganized activity in the bottom part of the heart, i.e. the ventricles; lethal). We have a minimal understanding of the mechanisms of these arrhythmias, and our current therapeutic strategies (namely medications, implantable cardiac devices that can deliver electrical therapy, and ablation procedures where we intentionally destroy heart tissue in specific areas of the heart) are relatively ineffective. The lack of effective treatments largely reflects our lack of understanding of the fundamental mechanisms responsible for these arrhythmias.
- 1. I have intracardiac recordings of patients that are in atrial fibrillation before and after a therapeutic procedure. These are spatiotemporal data of simultaneous recordings from 64 locations inside the heart. We could use these data to develop creative ways to either (a) understand the dynamics of the system, specifically phase transitions and changes in spatiotemporal structures, (b) develop markers that predict the success of the procedure, or (c) identify locations inside the heart that would serve as "hot-spots" or would be critical for sustainment of the arrhythmia.
- 2. I have several toy models of cardiac arrhythmias. These models are simulations of reaction diffusion models (specific for cardiac dynamics) that give rise to solutions such as stable periodic activity, spiral waves, or wave breakdown with multiple daughter wavelets. These could be used for a more theoretical assessment of spatiotemporal phase transition.
- 3. Should any of the methods that we come up with end up working, I plan to scale it up to large animal models and clinical (human) studies in the near future, and I would welcome your collaboration.
- 1. Representation of intracardiac recordings as networks using horizontal visibility graphs: we plan to analyze both synthetic (simulation) data as well as real patient data. Our preliminary plan is to develop such networks and compare network characteristics between different states.
- 2. Use Koopman analysis to gain insight into the dominant spatiotemporal patterns that govern the dynamics of healthy and diseased heart rhythms. Similar to the above, we plan to analyze both synthetic (simulation) data as well as real patient data.
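For direction 1, the horizontal visibility graph construction itself is simple: two time points are linked if every intermediate value lies strictly below both endpoints. A minimal sketch (the trace values below are invented, not real electrogram data):

```python
def horizontal_visibility_graph(series):
    """Edges (i, j) where every intermediate value is strictly below both endpoints."""
    n = len(series)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if all(series[k] < min(series[i], series[j]) for k in range(i + 1, j)):
                edges.append((i, j))
    return edges

# Toy "electrogram" trace (values made up for illustration)
trace = [3, 1, 2, 4, 1, 3]
edges = horizontal_visibility_graph(trace)
print(edges)
```

Network characteristics (degree distribution, clustering, etc.) of the resulting graphs could then be compared between pre- and post-procedure recordings.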
- 1. Konstantinos (Cardiology, Translational Research)
- 2. Andrea (Mathematics)
- 3. Anastasya (Physics)
- 4. Conor (Physics / Information Theory)
Multi-scale Adaptive Systems
Many (all?) complex adaptive systems observed in nature seem to have a multi-level / hierarchical / multi-scale structure. Why is that, and what are the generic properties of that hierarchical/multi-scale structure?
- Q1. What is the nature of the relations between different scales of (complex) adaptive systems?
- Q2. What properties of these relations are essential to the dynamics of the system, both globally and at each scale/level?
- Q3. How do structural properties impact qualitative properties of the system, both globally and at each scale/level? (e.g. communication speed, robustness, ...)
- R1 & R2 (to Q1 & Q2): Some of the most interesting aspects seem to be:
- micro-macro abstraction (information loss) --> upward causation;
- macro-micro feedback and adaptation (macro control signals leading to micro adaptations) --> downward causation;
- different time scales between levels (e.g. faster adaptations at the lower levels than at the higher levels)
- R3 (to Q3): study robustness and performance of coordination (e.g. speed of communication and/or of convergence)
Contributions: 3 types
A. Reference to relevant work in different disciplines.
You worked with or know of a system that features some sort of multi-scale feedback-driven structure
--> Please let me know about it and let us discuss, to identify the above properties instantiated in this particular system;
B. Domain-specific Application
Apply and explore the impacts of the above principles on a system or application domain that you are working on / interested in
--> Develop and play with an analytical model, or simulation, of an application-specific hierarchical feedback-driven system-of-systems.
Examples of possible application domains:
- Swarms of Swarms? "Controlled" swarms
- Hierarchical institutions, organisations, politics, rule/norm formation and evolution,...
- Micro-Macro economics, finance, behavioural economics, ...
- Networks of Networks (probably relevant to most/all of the above)
- Multi-level learning
- Multi-scale chemical reactions?
- Multi-scale biological systems
C. General Theory
Extracting general principles, concepts, design patterns that apply across several system types.
Purpose: help understand, analyse and ** design ** complex adaptive systems with desirable properties (e.g. reaching local/global stakeholder goals; robustness; performance; security; reusability; flexibility/adaptability; etc)
Among other tools, we can use this simulator of a holonic cellular automata (HCA):
- videos of two configurations with different outcomes: 
- project: 
- details: see references (ALife 2018)
HCA simulation snapshot:
Please add your name and the contribution(S) you're most interested in: A, B, C, ... all :) or the link to the relevant work or project you'd like to share. Many thanks.
- Ada (A, B, C)
- Louisa (B, C)
- Jordan (A)
- Patricia (A,B,C)
- Josefine (B,C)
-- Herbert A Simon, "The Architecture of Complexity", in Proceedings of the American Philosophical Society, V. 106, No 6, December, 1962, pp.467-482 paper online: e.g., 
-- Jessica C. Flack, "Coarse-graining as a downward causation mechanism", Philosophical Transactions of the Royal Society, Volume 375, issue 2109, Nov 2017 paper online: 
-- S. McGregor and C. Fernando, "Levels of description: A novel approach to dynamical hierarchies" ALife, 11(4), 2005 paper online: 
Some works on synchronization in modular networks. What is not so present here (at least not explicitly) is an analysis in terms of the feedback from macro to micro, although this is implicit in the character of phase coupling (i.e. the force on a phase is given by its difference from the average phase of its neighbors).
-- Garlaschelli, D., den Hollander, F., Meylahn, J., & Zeegers, B. (2017). Synchronization of phase oscillators on the hierarchical lattice, 1–33.
-- Kogan, O., Rogers, J. L., Cross, M. C., & Refael, G. (2009). Renormalization group approach to oscillator synchronization. Physical Review E - Statistical, Nonlinear, and Soft Matter Physics, 80(3), 1–12. 
-- Arenas, A., Díaz-Guilera, A., & Pérez-Vicente, C. J. (2006). Synchronization reveals topological scales in complex networks. Physical Review Letters, 96(11), 1–4. 
Some of my previous work:
-- Ada Diaconescu, Sven Tomforde and Christian Müller-Schloer, " Holonic Cellular Automata: Modelling Multi-level Self-organisation of Structure and Behaviour", ALife 2018, Tokyo, Japan paper: 
-- Ada Diaconescu, Sylvain Frey, Christian Müller-Schloer, Jeremy Pitt, Sven Tomforde, "Goal-oriented Holonics for Complex System (Self-)Integration: Concepts and Case Studies", SASO 2016, Augsburg, DE, pp 100-109 paper: 
Evolution of trade networks
Global economic integration has been a powerful driver of increased efficiency and improved living standards around the world, but has also raised concerns about the costs it has imposed on vulnerable groups and its potential impact on inequality. This project seeks to analyse the evolution of trade networks and examine to what extent increased interconnectedness makes domestic economies more or less resilient to global trade shocks.
We would use a multi-layer network of trading partnerships to capture the different levels of integration in global value chains and examine the evolution of the network dynamics in the presence of an exogenous shock (e.g. an increase in import tariffs).
Friday @ SFI at 3:30pm
Exploring Income Inequality From a Game Theoretic (or Other) Perspective:
Many economic markets are fundamentally unfair and lead to high levels of inequality. This has consequences for how people's opinions of fairness and trust develop and evolve. Data show that an American citizen's likelihood of making their way from the bottom to the top is lower than that of citizens of other advanced countries. Data also show that children born into "rich" families are more likely than not to remain rich. The literature also shows very strong demographic variations.
Thoughts? Recommended Papers?
Here is some relevant literature: https://www.jstor.org/stable/pdf/3088921.pdf?refreqid=excelsior%3A1839833f8090beb4f9e3f37e55cbf6c0
One idea is to consider an evolutionary game theoretic model with a stratified market (stratified into different income levels). Within each stratum, you could have various groups of agents corresponding to different demographics. The model could include systemic barriers that may be unique to certain demographics. Agents could be self-interested, altruistic, spiteful, etc.
A non-game theoretic model could also work, so this is quite an open problem. If anybody else is interested in discussing this further, please contact Priya.
Another approach could be agent based modeling.
- Carlos Marino
The consensus-seeking process is crucial for groups to make coordinated actions, vote for their institutions and react to dynamic environments. Research has shown that hierarchy can help a group reach consensus faster but can also lead to unfair decisions. Could we keep the benefits of hierarchy without its costs? To answer this question, we will use different methods to analyse and optimise the impact of different features of a social network structure on the time to reach consensus and the fairness of the final decision.
So far, people have proposed to explore:
- different distributions of degree and degree correlations
- other mesoscale features of the network (hierarchy, communities, cliques, clustering)
- different voter models; for instance, individuals with highly different opinions slowly influence each other (does homophily then help reach a faster consensus?)
- multiple speakers/listeners
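As a baseline for these explorations, the classic voter model is easy to simulate directly. The sketch below (toy networks and parameters of my own choosing) compares time to consensus on a star, a crude stand-in for hierarchy, versus a ring:

```python
import random

def time_to_consensus(neighbors, seed=0, max_steps=100000):
    """Classic voter model: a random node copies the opinion of a random neighbor."""
    rng = random.Random(seed)
    n = len(neighbors)
    opinion = [rng.randint(0, 1) for _ in range(n)]
    for step in range(max_steps):
        if len(set(opinion)) == 1:
            return step
        i = rng.randrange(n)
        opinion[i] = opinion[rng.choice(neighbors[i])]
    return max_steps

n = 20
star = {0: list(range(1, n))}       # hub-and-spokes, a crude hierarchy
for i in range(1, n):
    star[i] = [0]
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}  # flat structure
t_star = time_to_consensus([star[i] for i in range(n)])
t_ring = time_to_consensus([ring[i] for i in range(n)])
print(t_star, t_ring)
```

Swapping in other networks (communities, degree-correlated graphs) and other update rules would address the variations listed above; fairness could be measured by how often the final opinion matches the initial majority.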
Gavrilets et al. "Convergence to consensus in heterogeneous groups and the emergence of informal leadership". Nature Scientific Reports (2016)
Lu et al. "Consensus over directed static networks with arbitrary finite communication delays". Physical Review E (2009)
Multi-objective evolutionary computing (genetic algorithms, etc...)
Non-linear dynamic analysis
3. Zohar ?
Searching for patterns and narratives in the SFI Complex Systems Summer Schools
Hey guys, this is another project idea:
We can work with data of previous SFI Complex Systems Summer School generations available in the wiki. These include institution, country, working groups, project topics, project outcomes, (maybe not in the wiki but easy to find in google scholar) resulting collaborations post-CSSS, etc., etc.
In the wiki, there is information since 2006.
Some of you have shown interest in this project and have thought of great and interesting ways of searching for the narratives hidden in this social experiment.
Matthew, from Ohio State University, mentioned we could look for network flows. Since many of the participants are directly advised by people from their institutions to apply to the CSSS, we could see which institutions remain predominant throughout the years.
Yuki seeded the idea of analyzing the career paths of CSSS participants. Are they still in academia? Did they end up working in industry? How many of these people became entrepreneurs? (It's good to know our statistical possibilities, guys.)
Guillaume and Amy noted that diversity in teams has been studied as a predictor of success, so we could also play with this idea.
Someone also suggested we could analyze changes or trends in project topics throughout the years. Has interest in understanding online social networks increased among CSSS participants over the years?
Anyways, we can meet soon to talk about this. Please feel free to reach out on slack or directly. I would love to know what you guys think.
Emergence of sustainable development contradictions
Achieving the United Nation's 17 sustainable development goals (SDGs) requires progress along multiple dimensions of human development, and many improvements can be tackled using new or improved technology. However, some technological interventions can lead to contradictory changes in macro-level indicators. As a simple example, building a new factory may increase employment and thereby reduce hunger, but might simultaneously increase greenhouse gas emissions from manufacturing. But how do these and more complex contradictions emerge at the micro-level? Are there combinations of technologies that make them less likely, and if so, why? Does the sequence (of how technologies are introduced) make a difference?
This project takes a technology-focused view at these questions and investigates the effects of introducing a new or improved technology portfolio into an existing network of resources, technologies, and industries. Since technologies require a similar set of resources/industries regardless of where they are being manufactured, we'll likely start by building a location-independent network and studying network changes as new technologies are added. Depending on people's interest and time constraints we can then pick one or multiple locations and incorporate data on resource availability, the rate of resource use (and temporal changes therein), existing industrial capabilities etc.
Additional ideas more than welcome!!! Feel free to indicate your interest here, on Slack, or reach out directly (firstname.lastname@example.org)
Data and industry classification systems
North American Industry Classification System
Sustainable Industry Classification System
World Development Indicators (World Bank)
Eurostat's classification server
- Magdalena Klemun
- Neil Gaikwad
- Chathika Gunaratne
- Amy Schweikert
- Sanna Ojanperä
Next meeting: Lunch meeting on Thursday (meet by tables to the left of the entrance)
Metabolic rates and the collapse/transformation/adaptation of societies
Similar to how organisms have a metabolic rate which is linked to their lifespan, societies can be described by exosomatic metabolic rates (quantified, for example, in MJ/h, where the hours are calculated as the total population size times 8760, the number of hours in a year).
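A quick worked example of that definition, with entirely hypothetical numbers:

```python
# Toy calculation of an exosomatic metabolic rate in MJ/h.
# Both input numbers are invented for illustration, not real data.
population = 10_000_000            # hypothetical society
annual_energy_MJ = 4.0e12          # hypothetical total exosomatic energy per year
human_hours = population * 8760    # hours in a year, per person
emr = annual_energy_MJ / human_hours
print(round(emr, 2))  # MJ per human hour
```

Comparing this rate across societies (or across a society's history) is the starting point for the taxonomy and lifespan questions below.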
The main idea behind this project would be to explore the relation between societies’ exosomatic metabolic rates, and their lifespan/sustainability. Looking at organisms, the higher the metabolic rate the shorter the lifespan – considering societies obviously adds many layers of complexity, but it is a relation which may be interesting to discuss and explore (even if to falsify it and build a critique of applications of biological concepts to social science).
The project is still very much in an open/exploratory phase (and will hopefully remain open and exploratory throughout its evolution). Some possible questions which we could discuss and focus on include, for example:
- Is it possible to define a taxonomy of societies, based on metabolic characteristics (e.g. exosomatic metabolic rates, human activity patterns, level of openness and trade, dependence on non-renewable resources, etc.), from which we can infer something about a society's sustainability (and therefore its lifetime)? Since societies are open systems, this would also mean looking at relations across different types of societies (e.g. resource-rich societies exporting primary resources to resource-poor, capital-rich societies, which then transform them into lucrative secondary products and re-export them).
- How do we define and conceptualize collapse? What does it mean for a society to transform, collapse or adapt? We could explore and conceptualize different types of transformations and define relations between societies, ecosystems and transformations – this could also build on literature that views social systems as autopoietic self-organizing structures (Maturana, Luhmann, ...)
- Focusing on an individual society e.g. the US and seeing what history tells us about the direction in which it is going (is it nearing some form of collapse or radical transformation?)
- Multi-scale integrated assessment of societal metabolism: introducing the approach (Giampietro and Mayumi, 2000)
- Sustainability of complex societies (Tainter, 1995)
- Allometry of human fertility and energy use (Moses and Brown, 2003)
Related fields (but anyone from any field is more than welcome to join! The more diverse the better): history, philosophy, ecological economics, theoretical ecology, anthropology, societal metabolism, energetics, hierarchy theory
Mean First Saturation Time (Random walks on networks)
Random walks on networks have been broadly studied. An interesting measurement is the mean first passage time between two nodes (i, j), which is the expected time a random walker starting from i takes to reach j for the first time. A generalization of the mean first passage time is the mean first saturation time: the expected time at which S (or more) of N random walkers departing from node i have arrived at node j.
The idea is to explore this measurement for different networks and for different distributions of N and S.
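Under the definition above, the mean first saturation time is straightforward to estimate by Monte Carlo simulation. A minimal sketch, assuming unbiased walkers on an undirected graph given as an adjacency dict; the ring graph and all parameter values are illustrative:

```python
import random

def mean_first_saturation_time(adj, source, target, n_walkers, S,
                               n_trials=500, seed=0):
    """Monte Carlo estimate of the expected step at which at least S of
    n_walkers (all starting at `source`) have reached `target`."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        pos = [source] * n_walkers
        reached = [False] * n_walkers
        arrived, t = 0, 0
        while arrived < S:
            t += 1
            for i in range(n_walkers):
                if not reached[i]:
                    pos[i] = rng.choice(adj[pos[i]])  # unbiased step
                    if pos[i] == target:
                        reached[i] = True
                        arrived += 1
        total += t
    return total / n_trials

# A 6-node ring: node i connects to (i - 1) % 6 and (i + 1) % 6
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(mean_first_saturation_time(ring, source=0, target=3, n_walkers=4, S=2))
```

Setting S = 1 recovers a multi-walker first passage time; sweeping S from 1 to N shows how saturation slows as the threshold rises.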
Novelty: Several studies have computed properties of random walks on networks, both numerically and analytically. However, to the best of my knowledge, the mean first saturation time has not been studied.
Real world applications
European countries have a limit to the number of refugees they can take. By using a network of migration flows, we might be able to understand the susceptibility of each country and optimize the flow of migrants.
- Sood, V., Antal, T., & Redner, S. (2008). Voter models on heterogeneous networks. Physical Review E, 77(4), 041121.
- Sood, V., Redner, S., & Ben-Avraham, D. (2004). First-passage properties of the Erdős–Rényi random graph. Journal of Physics A: Mathematical and General, 38(1), 109.
- Suleimenova, D., Bell, D., & Groen, D. (2017). A generalized simulation development approach for predicting refugee destinations. Scientific reports, 7(1), 13377.
- Maier, B. F., & Brockmann, D. (2017). Cover time for random walks on arbitrary complex networks. Physical Review E, 96(4), 042307.
- Schaub, M. T., Lehmann, J., Yaliraki, S. N., & Barahona, M. (2014). Structure of complex networks: Quantifying edge-to-edge relations by failure-induced flow redistribution. Network Science, 2(1), 66-89.
- Asllani, M., Carletti, T., Di Patti, F., Fanelli, D., & Piazza, F. (2018). Hopping in the crowd to unveil network topology. Physical review letters, 120(15), 158301.
- R Maria
The effects of changing relative timescales on complex systems
Most complex systems have multiple processes operating at different speeds. In general, the ratios between these processes can change, whether through evolution, the decisions of individual agents, new technologies, or external factors. In a simple linear system, changing the relative timescales would not qualitatively change the dynamics, but in complex systems it often does. Our goal is to analyze several models of complex systems, across different domains and using different methodologies, to 1) understand how changing the relative timescales in each of these systems changes the dynamics and 2) determine whether anything more general can be said about the effects of changing relative timescales in (a subset of) complex systems. We are looking both at "vertical" relative timescales, for example between fast and slow dynamics, and "horizontal" relative timescales, for example between the growth rates and the death rates in an ecosystem.
Systems being analyzed
(to be fleshed out)
1. Lotka Volterra ecosystem model.
- multiplying death rates by a constant
- slowly changing the parameters over time
2. Institutional change, David Krakauer's model.
- changing the epsilon value that governs the separation of the fast and slow dynamics.
3. Spatial models with diffusion.
- Changing the diffusion rate
- Making diffusion instantaneous, removing space as a factor
4. Evolutionary models.
- changing the rates of mutation (either in genetic or in idealized adaptive models)
5. Cooperative networks.
- removing timescale separation
- changing speed of processes on the network
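As a concrete starting point for item 1, a minimal sketch of the Lotka-Volterra model with the predator death rate multiplied by a constant k (Euler integration; all parameter values are illustrative assumptions):

```python
# Minimal sketch: Euler integration of the classic Lotka-Volterra
# predator-prey model, with the predator death rate scaled by a
# constant k to shift the relative timescales.
def lotka_volterra(alpha=1.0, beta=0.5, delta=0.2, gamma=0.5, k=1.0,
                   x0=2.0, y0=1.0, dt=0.001, steps=20_000):
    """Return prey (x) and predator (y) trajectories."""
    x, y = x0, y0
    xs, ys = [x], [y]
    for _ in range(steps):
        dx = alpha * x - beta * x * y        # prey growth minus predation
        dy = delta * x * y - k * gamma * y   # predator growth minus scaled death
        x, y = x + dx * dt, y + dy * dt
        xs.append(x)
        ys.append(y)
    return xs, ys

xs1, ys1 = lotka_volterra(k=1.0)  # baseline death rate
xs2, ys2 = lotka_volterra(k=2.0)  # predator death rate doubled
```

Scaling k moves the equilibrium prey density (x* = k*gamma/delta) and changes the oscillation period, so even this toy case shows how a single relative-rate parameter reshapes the dynamics.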
- Carlos Marcelo
Dance Improvisation and Complex Systems
According to Wikipedia: "Dance improvisation is the process of spontaneously creating movement. Development of improvised movement material is facilitated through a variety of creative explorations including body mapping through levels, shape and dynamics schema."
Many (not all) choreographers will use "dance improvisation" to generate/invent "new" movements, as a part of their art-making process.
Thoughts on the central question we could consider: Is improvisational dance really improvisational dance? Theorization in Critical Dance Studies exists in this "between-ness" - the interstitial space between bodies - which can be at the membrane level - or encompass the space between bodies across a room. This space can be consumed by movement transmission, cultural transmission, thought transmission, visual transmission - all which have their own sets of cultural constraints.
Lobby (Lecture room building), 2:00 pm - 3:00pm, Tuesday, June 19, 2018 -- Feel free to stop by!
Possible research questions…but open to more!:
1. Questions we'd like to explore.
a. Can we quantify dance improvisation?
i. An emergent property. A task between two people. Interaction between two or more people that requires knowing and predicting your partner. So that you're not literally crashing into each other.
ii. Sharing a common goal -- because that's the common goal of the group.
iii. Ability to create new moves that lie outside the starting alphabet.
iv. Defining dynamics between two people that you wouldn't have with anyone else.
b. Can we define improvisation? Looking to other fields to help us define this term.
c. Simply put, is movement always already spontaneous? Is improvisation truly improvised?
d. How, then, does dance improvisation differ from other fields? (Theater, music, conversation, movement improv, etc.)
e. Are your movement choices more informed by past movement choices?
f. i.e. How predictive are your movements?
g. Is improvisation complex or chaotic?
h. Can we embody something that is random?
i. How do we measure "improvisationality"? Degrees of randomness?!
j. Are we just repeating something that has already been done in the past?
Resources We've Found So Far...
Apologies for the formatting...
Bläsing, B. (2015). Segmentation of Dance Movement: effects of expertise, visual familiarity, motor experience and music. Frontiers in Psychology, 5, doi:10.3389/fpsyg.2014.01500
Cameron, W. (1954). Sociological Notes on the Jam Session. Social Forces, 177-182.
Davis, T. (2010). Complexity as Process: Complexity-inspired approaches to composition. Organised Sound, 15, pp 137-146 doi:10.1017/S135577181000030
Hart Y, Noy L, Feniger-Schaal R, Mayo AE, Alon U (2014) Individuality and Togetherness in Joint Improvised Motion. PLoS ONE 9(2): e87213. doi:10.1371/journal.pone.0087213
Novack, C. (1988). Looking at Movement as Culture: Contact Improvisation to Disco. TDR/The Drama Review. Vol. 32. No. 4. Pp 102-119. https://ais.ku.edu.tr/course/19352/C.J.C.Bull%20-%20Contact%20Improvisation.pdf
Kaeppler, A. (2000). Dance Ethnology and the Anthropology of Dance. Dance Research Journal, 32, 1, 116-125.
Kim, D.; Dong-Hyeon, K.; and Keun-Chang, K. Classification of K-Pop Dance Movements Based on Skeleton Information Obtained by a Kinect Sensor. (2017) Sensors, 17, 1261-1275. doi:10.3390/s17061261
Kloppenberg, A. (2010). Improvisation in Process: "Post-Control" Choreography. Dance Chronicle, 33:180-207.
Nikolai, J.; and Bennett, G. (2016) Stillness, breath and the spine - Dance Performance Enhancement Catalysed by the Interplay Between 3D motion Capture Technology in a Collaborative Improvisational Choreographic Process. Performance Enhancement & Health, 4, 58-66.
Okada, N.; Iwamoto, N.; Tsukasa, F; and Morishima, S. (2015) Dance Motion Segmentation Method Based on Choreographic Primitives. Waseda University, Waseda Research Institute for Science and Engineering. In Proceedings of the 10th International Conference on Computer Graphics Theory and Applications (GRAPP 2015), 47, Berlin, 2015.03.11-14.
Stuart, J., Bradley, E. (1998). Learning the Grammar of Dance. In Proceedings of the Fifteenth International Conference on Machine Learning. Madison, WI.
Zack, J.; Kumar, S.; Abrams, R.; Mehta, R. (2009). Using Movements and Intentions to Understand Human Activity. Cognition, 112, 201-216.
Please feel free to join: #improv-dance
Fun YouTube Videos
Some interesting YouTube videos, either improvisational jams, or choreography inspired by improvisation...
NIH Brain Activity Analysis
"The discovery of neuronal avalanches in superficial layers of cortex in 2001 provided solid experimental evidence that indeed the brain might be critical. The spatio-temporal, synchronized activity patterns of avalanches form a scale-free organization that spontaneously emerges in vitro in slice cultures and acute slices and in vivo in the anesthetized rat. We recently demonstrated that ongoing activity in awake monkeys is composed of neuronal avalanches. Their internal organization forms a small-world topology that combines local diversity with efficient global communication. Neuronal synchronization in the form of avalanches naturally incorporates gamma-oscillations and cascades, e.g., synfire chains. The size and timing of a single avalanche is governed by two fundamental power laws, which are equivalent to those found for other critical systems, e.g. the Gutenberg-Richter law for earthquake sizes and the Omori-law, which describes the occurrences of aftershocks following a main earthquake. These properties constitute a novel framework that allows for a precise quantification of cortex function such as the absolute discrimination of pathological from non-pathological synchronization, and the identification of maximal dynamic range for input-output processing."
Real world applications
Potential diagnosis of pathological dynamics including epilepsy, sleep deprivation, hypoxia, and schizophrenia.
- Will be fleshed out
Proposed Projects (may be merged later)
(to be fleshed out)
1. Spatial Analysis on the Array (Jordan, Konstantinos)
2. Delay Coordinate Embedding (Jacob, Simon?)
(topology of the attractor, Lyapunov exponent, predictive power)
Begin by isolating dominant oscillation bands (5-12 Hz, 25-35 Hz, 75-95 Hz) to see if the dimensionality of the system reduces when we isolate avalanches. Then incrementally reintroduce the full complexity of the time series to determine at what point it becomes a hairball.
3. Complexity Analysis (Carol)
4. Avalanche analysis in other data sets
i) EEG data from Jarno -- 1 subject; hypnosis and baseline; eyes closed; TMS pulse effect only
ii) EEG data from Jarno -- 10 subjects; saccade task; TMS pulse effect and pre-pulse baseline
(data from this study http://cercor.oxfordjournals.org/content/early/2016/06/24/cercor.bhw182.abstract)
5. Great suggestions from Jordan:
A. The spatial structure of avalanches
•An avalanche is a sequence of time bins during which some number of electrodes have a spike above some threshold. We can look at the locations of the electrodes that exhibit spikes during the various time bins.
•Functional network structure fit to a spatial branching process
B. Subthreshold indicators of impending avalanches
•The response of a neuron to a neighbor spiking depends on its own membrane potential. In this way the sub-threshold dynamics should be strongly informative of the possible sequence of spikes in an avalanche.
C. The effect of ambient oscillations on avalanche behavior
•It is known that overall behavioral state (e.g. awake vs. anesthetized) affects the distribution of avalanche sizes. One possible causal factor is the presence or absence of oscillations in the environment of the focal piece of brain tissue.
•This would involve positing a toy model of neural dynamics that is sufficiently detailed to produce avalanches, and simulating it with each of oscillatory and constant boundary conditions. Then we would compare, say, avalanche size distributions and see if the difference qualitatively matches experimental data on awake vs. anesthetized subjects.
•This option would require the most up-front investment, i.e. in developing and calibrating the model.
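Project 2 (delay coordinate embedding) can be sketched directly: reconstruct state vectors from a scalar time series with lag tau. The embedding dimension, lag, and toy sine signal below are illustrative assumptions; on the real recordings they would be chosen with standard heuristics (e.g. mutual information for tau, false nearest neighbours for the dimension):

```python
import math

def delay_embed(series, dim, tau):
    """Return embedded points [x(t), x(t - tau), ..., x(t - (dim-1)*tau)]."""
    start = (dim - 1) * tau
    return [
        tuple(series[t - k * tau] for k in range(dim))
        for t in range(start, len(series))
    ]

# Toy signal: a 10 Hz sine sampled at 1 kHz (illustrative only)
signal = [math.sin(2 * math.pi * 10 * t / 1000) for t in range(1000)]
points = delay_embed(signal, dim=3, tau=25)  # 25 ms lag = quarter period
print(len(points), len(points[0]))  # 950 3
```

The attractor topology, Lyapunov exponent, and predictive power mentioned above would then be estimated from these embedded points.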
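The avalanche definition in 5A (a maximal run of non-empty time bins, with size equal to the total spike count in the run) can be made concrete with a short sketch; the binned counts below are toy data, not from the NIH array:

```python
def avalanche_sizes(bin_counts):
    """Given per-bin spike counts (summed over electrodes), return the
    size of each avalanche: a maximal run of non-empty bins."""
    sizes, current = [], 0
    for c in bin_counts:
        if c > 0:
            current += c           # extend the ongoing avalanche
        elif current > 0:
            sizes.append(current)  # an empty bin closes the avalanche
            current = 0
    if current > 0:
        sizes.append(current)      # avalanche still open at the end
    return sizes

counts = [0, 2, 3, 0, 0, 1, 0, 4, 4, 1, 0]
print(avalanche_sizes(counts))  # [5, 1, 9]
```

Fitting a power law to these sizes (vs. size distributions under oscillatory or constant boundary conditions) is the comparison proposed in 5C.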
(May be full members of this project or only part-time)
Looking for unparalleled biological innovations
Biological and technological innovation processes share a number of common mechanisms, including competition or trial/error. But are there innovation processes (= ways to generate a new and improved way of doing something) or innovations (the product of an innovation process) in nature without technological equivalents? If so, how could such "unparalleled innovations" be translated into technological solutions?
The goal of this project is to map biological to technological innovation mechanisms in order to identify cases without technological equivalents, and to explore whether these cases are useful to think about new forms of technological progress.
So, there's some conceptual work and literature review to do, e.g. finding a useful definition of innovation, deciding if we want to focus on the process of creating something new, or on the new thing itself. The fun part could be to ask how current technologies might change if they were transformed using "unparalleled" innovation mechanisms. The risky part is that we might not find such mechanisms.
1. Magdalena Klemun
2. Sarah Berkemer
3. Alexandra Mikhailova
Let's set up a meeting time tomorrow (Thursday)?
Archived Projects ("Parking Lot")
This section is for projects that we decide not to continue with. Maybe they're ideas that can be picked back up later (hence the "parking lot").
Using Principles from Complex Systems in Thinking about AGI Development
AGI = Artificial General Intelligence, a catchphrase for "smarter-than-human" AI. The phrase is misleading; it basically refers to algorithms that can perform a wide range of tasks with high efficacy without being explicitly programmed for each task.
For now, this is intentionally vague to keep open the various possibilities and gather together those who are interested. The project would move beyond current ML techniques, though, and either build on those techniques in significantly novel ways, propose new techniques, or consider from a theoretical standpoint how to design and train an agent (without specification of the implementation) which can perform a broad range of tasks "intelligently" and is aligned with human interests. An important focus is on ensuring alignment (doing what humans would want it to do), which is for various reasons quite hard to do both technically and philosophically.
There are two ways to use complex systems principles:
- In the design and training process of the algorithm
- In understanding how an algorithm will interact with the world around it
Specific project ideas:
- Building in an adaptive mechanism for an agent to adjust its input-output map as the dynamics of its environment change
- Using insights from various evolutionary processes to design a learning process that can produce an intelligent and aligned agent (either using existing AI techniques, or being implementation-agnostic and considering an arbitrary agent)
Feel free to add your name below, and any project ideas above! If we get a few interested people we can meet tonight or tomorrow.
- Luca Rade
- Nam Le
R1 & R2 (to Q1 & Q2): The most interesting aspects seem to be:
- micro-macro abstraction (information loss) --> upward causation
- macro-micro feedback and adaptation (macro control signals leading to micro adaptations) --> downward causation
- different time scales between levels (e.g. faster adaptations at the lower levels than at the higher levels)
R3 (to Q3): study robustness and performance of coordination (e.g. speed of communication and/or of convergence)
Robustness of the presidential information cascade on Twitter
How does information dissemination change when Trump blocks other users on Twitter?
Investigate the peer review process from the perspectives of gender, institutional prestige, and nationality(?). Also, let's talk bigger picture about how we model and incentivize successful peer review.
- How do institution and gender affect the time between submission and acceptance?
- How does the relationship between the gender and institution of the author and the editor impact submission and acceptance decisions?
- How does single/double blind review affect female author acceptance rate? Is single or double blind faster?
- What is the rate of co-authorship between men/men, women/women, men/women?
- Is the H-index of the last author/first author predictive of time from submission to acceptance (publication?)?
Theme 2: Other Idea by Neil
Rethinking science as an Institution
- Study of incentives in science: What's the role of incentives in the peer review process?
- Experiments: How do we incentivize/re-engineer peer review process? Can we model the different peer review traditions (single blind, double blind, etc.)?
- Design/Engineering interventions
- http://www.pnas.org/content/pnas/114/48/12708.full.pdf (double blind vs single blind)
- https://elifesciences.org/articles/21718 gender bias in peer review (it has anonymised network with about 40k authors https://elifesciences.org/articles/21718/figures#SD3-data)
- https://link.springer.com/article/10.1007/s11192-015-1800-6 calibrated ABM on peer review
- https://www.nature.com/articles/nature12786 subjectivity/objectivity and herding in peer review
- https://f1000research.com/browse (also data on rejected papers)
- PLOS One Data
Distribution of water resources on a national scale
The 2014 Survey of Living Conditions has produced a dataset detailing a variety of markers of quality of life/economic prosperity across over 30,000 households in Ecuador. We focus our analysis on different modes of water accessibility in order to understand how they correlate to, or can predict, other measures of economic prosperity.