Complex Systems Summer School 2018-Projects & Working Groups
From Santa Fe Institute Events Wiki
Using Principles from Complex Systems in Thinking about AGI Development
AGI = Artificial General Intelligence, often glossed as "smarter-than-human" AI. That gloss is misleading: here the term basically means algorithms that are generally capable of performing a wide range of tasks with high efficacy without being explicitly programmed to do each task.
For now, this is intentionally vague to keep open the various possibilities and gather together those who are interested. The project would move beyond current ML techniques, though, and either build on those techniques in significantly novel ways, propose new techniques, or consider from a theoretical standpoint how to design and train an agent (without specification of the implementation) which can perform a broad range of tasks "intelligently" and is aligned with human interests. An important focus is on ensuring alignment (doing what humans would want it to do), which is for various reasons quite hard to do both technically and philosophically.
There are two ways to use complex systems principles:
- In the design and training process of the algorithm
- In understanding how an algorithm will interact with the world around it
Specific project ideas:
- Building in an adaptive mechanism for an agent to adjust its input-output map as the dynamics of its environment change
- Using insights from various evolutionary processes to design a learning process that can produce an intelligent and aligned agent (either using existing AI techniques, or being implementation-agnostic and considering an arbitrary agent)
Feel free to add your name below, and any project ideas above! If we get a few interested people we can meet tonight or tomorrow.
Interested Participants:
- Luca Rade
- Nam Le
Neural style transfer in music styles via interacting agents
General idea: A) learn generative models of different music styles using neural networks. B) let these networks ('agents') interact and see what 'fusion' music styles result.
Relevant papers: 1) neural style transfer for images (make images look like Van Gogh paintings etc.): https://tinyurl.com/ybpq5agm
2) neural nets for music: https://tinyurl.com/yb2qdqbq and http://imanmalik.com/cs/2017/06/05/neural-style.html
3) bunch of theories of how music styles are results of combination: https://tinyurl.com/y723ugyo
4) music recommendation using neural networks (from Spotify): http://benanne.github.io/2014/08/05/spotify-cnns.html#predicting, https://papers.nips.cc/paper/5004-deep-content-based-music-recommendation
Novelty lies in a) having multiple agents learn multiple styles independently and then letting them exchange information in a meaningful way (probably the trickiest bit), and b) letting these fusion music styles evolve in a network and seeing, for example, what "world music" results at the end.
Details will come...
Thoughts?
- we could also use text corpora instead? Shakespeare etc.
- ...
Interested Participants:
- Yuki
- Vandana
- Xindi
- R Maria
- Kevin
- Allie
- Priya
- Ricky
Optimal representations of high dimensional data in deep learning and biological systems:
What is the best way for a system to represent very high dimensional data? For example, how should the retina encode visual stimuli in neuron firing patterns? How does the immune system encode the space of antigens it might encounter? In each case, it would not be feasible (or efficient) to create a unique tag for each input. Rather, the systems in question must decide which features in the stimuli are most relevant, and trade off between specificity and generality.
Along these lines, there are two more specific questions to investigate:
- It has recently been conjectured that the success of deep learning networks is related to their optimization of a specific informational quantity in each layer: https://arxiv.org/abs/1710.11324. Unfortunately this paper is not very clearly written, but basically the idea is that when binning inputs into representations, the distribution of bin sizes should be given by a specific power law, which optimizes the aforementioned information measure. Do biological systems employ the same strategy? With access to the right data, this idea should be straightforward to test. For example, if we have a list of antibodies together with the set of antigens they react to, we can compute this quantity and see whether the antigen "bins" are indeed distributed according to the predicted power law.
- A diverse collection of biological systems that are faced with this task seem to be well modeled by maximum entropy distributions, with a constraint on pairwise correlations and parameters (i.e. Lagrange multipliers) set near a critical point: https://arxiv.org/pdf/1012.2242.pdf. This has been applied to the previously given examples of the retina and the immune system, as well as flocking in birds. As far as I know, it is not yet known with certainty whether this kind of encoding scheme is optimal in some sense (like in the previous bullet), or if it is an artifact of our own inference methods, but I think the answer is interesting either way. An immediate question is: if these maximum entropy models are a powerful tool for humans to model high dimensional systems, might biological systems also be producing their own maximum entropy models of environmental variables? That is, are maximum entropy models with constraints on pairwise correlations optimal in some information-theoretic sense, which can be made precise? For example, would this be a particularly useful way to model the distribution of natural images one might encounter? While less straightforward than the previous bullet, I think these are questions well suited to the skills of the people here, and I think we could make significant progress!
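As a starting point for the test in the first bullet, one could count how many antigens fall into each antibody's "bin" and estimate a power-law exponent from those counts. The sketch below is only an illustration: the `reactivity` mapping and its identifiers are made up, and the continuous maximum-likelihood estimator is a simple stand-in that does not reproduce the specific power law predicted in the paper.

```python
import math
import random


def bin_sizes(reactivity):
    """Return the size of each bin: how many antigens each antibody reacts to."""
    return [len(antigens) for antigens in reactivity.values() if antigens]


def powerlaw_exponent_mle(sizes, x_min=1.0):
    """Continuous maximum-likelihood estimate of alpha for p(x) ~ x**(-alpha), x >= x_min."""
    tail = [x for x in sizes if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one size strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(alpha, n, x_min=1.0, seed=0):
    """Draw n synthetic sizes from a continuous power law by inverse-transform sampling."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Toy reactivity table (hypothetical identifiers, for illustration only).
toy_reactivity = {"ab1": {"ag1", "ag2", "ag3"}, "ab2": {"ag2"}, "ab3": {"ag4", "ag5"}}
toy_sizes = bin_sizes(toy_reactivity)

# Sanity check of the estimator on synthetic data with a known exponent.
synthetic = sample_powerlaw(alpha=2.5, n=20000)
estimated_alpha = powerlaw_exponent_mle(synthetic)
```

With real antibody/antigen data, the estimated exponent could then be compared against the value the paper predicts. Bin sizes are integers, so a discrete estimator and a goodness-of-fit test would be the more careful choice.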
If anyone has expertise to offer, your feedback/participation would be very much appreciated! In particular, I think this project would greatly benefit from those of you that have knowledge in machine learning and biology (my own area is physics and information theory). Feel free to email me at e.stopnitzky@gmail.com
Thoughts?
A genetic model with the resulting protein products could also be useful here (e.g. looking at expression levels and/or variants in a particular gene or set of genes as they pertain to the protein(s) coded by the aforementioned gene(s)). In sum, can we find/demonstrate an algorithmic basis for gene expression and/or protein coding? - Kofi
Interested participants:
- Kofi Khamit-Kush (Background in Biology, specifically Cancer Genomics). kkhamitk@gmail.com
- George
- Jacob
The Emergence and Evolution of Legal Systems as Pertaining to Water Distribution
General Idea
There are numerous legal systems that have been identified, broadly categorized into large families – Common Law (Anglosphere and Commonwealth nations), Civil Law (Romance Language nations, Germany, China), Islamic law (most Muslim nations), Customary Law (India, sub-Saharan Africa). More importantly, most nations do not lie purely in one category, but tend to combine elements of multiple systems, either due to merging (e.g. German law combining Germanic tradition with Civil traditions), or through subsidiarity (e.g. Louisiana having Napoleonic law, despite being in a Common Law nation). We are interested in determining how the legal systems of nations and states emerged, influenced each other, and interact across national boundaries.
This is an immense task, so to scope it, one idea has been to limit this project to laws pertaining to water distribution. This is of particular interest when looking at states or regions within nations that have a different legal system, such as Louisiana in the U.S., Quebec in Canada, and Scotland in the U.K. For international interactions, sub-Saharan African nations might also be worth assessing, as many of them border nations with different legal systems, and water is often a scarce resource in these areas.
If anyone has interest in this topic, and/or expertise in either legal systems or water distribution, feel free to sign up or discuss.
Recommended Papers
Energy and Efficiency in the Realignment of Common-Law Water Rights, Carol M. Rose, The Journal of Legal Studies 1990 19:2, 261-296
Theories of Water Law, Samuel C. Wiel, Harvard Law Review, Vol. 27, No. 6 (Apr., 1914), pp. 530-544
Interested Participants
1. Kevin Comer
Academic hiring networks
General idea:
I am thinking about doing something around academic hiring networks in different disciplines and playing around with the idea of multilevel networks (e.g. looking at the interplay between different institutional norms in various disciplines and hiring dynamics). It would also be cool to have a look at the interplay between publishing and hiring networks.
We could also explore other ideas related to the academia theme, such as factors behind excellence / equality tradeoffs, or factors that promote gender balance in science, etc.
Who would be interested?
I've created the channel #hiring_networks on Slack.
Literature:
* A. Clauset, S. Arbesman and D.B. Larremore. 2015. Systematic inequality and hierarchy in faculty hiring networks. Science Advances 1(1), e1400005.
Interested Participants
1. Evgenia (Background in social network dynamics, psychology and organisation science)
2. Ricky (Background in multilayer networks, network resilience, machine learning)
3. Allie (Background in networks, science of science, gender)
4.
5.
Make deep neural networks more biologically accurate by including inter-neural travel times
General idea:
Make deep neural networks more biologically accurate by including inter-neural travel times. Train with some normal task like digit-recognition.
Motivation:
- Currently, deep neural networks only share some similarity to actual neurons: threshold behavior and hierarchical representations.
- However, in real neural networks, signals travel with finite speed and activations are integrated over time
- This ignored aspect could be one reason why real neuronal networks/brains are superior
- Further connecting the two fields of neuroscience and deep learning would be pretty cool
- We could use the "regular" neural network machinery to optimize weights etc for tasks like forecasting/image recognition and then see whether we find neural avalanches and chaotic behavior etc.
Details (first ideas):
- In artificial neural networks, different neurons are connected by weights. To this, we add another connection between the neurons: the inter-neuron travel time.
- The inter-neuron travel time is computed by an RNN
- Inference works by letting the network oscillate / come to an equilibrium
- Activation of neuron i at time t: a_i(t) = sum_over_connected_neurons_j [f(a_j(t)) * delta(rnn(j->i) - t) + exp(-lambda*t) f(a_i(t))], where delta is the Kronecker delta and rnn(j->i) is the travel time from neuron j to neuron i.
- I.e. the signal from connected neurons arrives at the time specified by the RNN and then slowly decays with rate lambda
- If the RNN just gives t=1 for all travel times, this essentially reduces to the normal deep neural net output.
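The first ideas above can be prototyped in discrete time. The sketch below is one possible reading, not the exact formula in the list: travel times are fixed integers rather than the output of an RNN, the weights are made up, and each neuron's activation is taken to be a decayed copy of its previous value plus the weighted signals that were emitted `delay` steps earlier. All names (`simulate`, `decay_rate`, the connection tuples) are hypothetical.

```python
import math


def relu(x):
    return max(0.0, x)


def simulate(n_neurons, connections, inputs, steps, decay_rate, f=relu):
    """Simulate a network whose connections carry integer travel times.

    connections: list of (src, dst, weight, delay) with delay >= 1.
    inputs: dict mapping input-neuron index -> clamped activation.
    Returns one activation vector per time step (index 0 is the initial state).
    """
    leak = math.exp(-decay_rate)  # per-step decay factor exp(-lambda)
    state = [0.0] * n_neurons
    for idx, value in inputs.items():
        state[idx] = value
    history = [state]
    for t in range(1, steps + 1):
        # Each neuron keeps a decayed copy of its own previous activation ...
        new_state = [leak * a for a in history[t - 1]]
        # ... and adds the signals that were emitted `delay` steps ago.
        for src, dst, weight, delay in connections:
            if t - delay >= 0:
                new_state[dst] += weight * f(history[t - delay][src])
        for idx, value in inputs.items():
            new_state[idx] = value  # inputs stay clamped
        history.append(new_state)
    return history


# Tiny layered example: inputs 0 and 1, hidden neurons 2 and 3, output neuron 4.
unit_delays = [
    (0, 2, 0.5, 1), (1, 2, -1.0, 1),
    (0, 3, 1.0, 1), (1, 3, 1.0, 1),
    (2, 4, 2.0, 1), (3, 4, 1.0, 1),
]
clamped = {0: 1.0, 1: 2.0}

# With all delays equal to 1 and no memory (infinite decay rate), two steps
# reproduce an ordinary two-layer ReLU forward pass for non-negative inputs.
history_unit = simulate(5, unit_delays, clamped, steps=2, decay_rate=math.inf)
output_after_two_steps = history_unit[2][4]

# Slowing down one connection postpones the arrival of its signal at the output.
slow_link = unit_delays[:-1] + [(3, 4, 1.0, 3)]
history_slow = simulate(5, slow_link, clamped, steps=4, decay_rate=math.inf)
```

In this reading the unit-delay case matches the last bullet: the output neuron sees the same value an ordinary feedforward pass would give. A finite `decay_rate` makes activations accumulate over time instead.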
Evolution of social norms as a process within or between societies
General idea
Currently, there are two ideas floating around:
- How do social norms evolve *within* a society? Method-wise this is perhaps related to the spread of ideas/information on a social network. The agents in this network are people. Potentially relevant models: Opinion formation, infectious (disease) spread and/or games on social networks.
- Think of a whole society (group/tribe/nation/etc) as an agent. A society may adopt or discard various social norms over time. If one of the chosen social norms (or a combination of them) is woefully impractical, it decreases the "fitness" of the whole society and it loses members/power/resources/territory to competing societies. Potentially relevant models: Models for evolutionary game theory.
The project can focus on (1), (2), or both.
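For idea (1), a very simple opinion-formation baseline is the voter model, in which an agent adopts the norm of a randomly chosen neighbour. The sketch below is only meant to show the shape of such a simulation. The ring-shaped social network, the binary norm, and all function names are assumptions made for illustration.

```python
import random


def ring_lattice(n, k):
    """Each agent is linked to its k nearest neighbours on each side of a ring."""
    return {i: [(i + d) % n for d in range(-k, k + 1) if d != 0] for i in range(n)}


def voter_model(neighbours, norms, steps, seed=0):
    """Voter-model dynamics: at each step a random agent copies a random neighbour's norm."""
    rng = random.Random(seed)
    norms = list(norms)  # work on a copy; norms[i] is 0 or 1
    agents = list(neighbours)
    share_of_norm_one = []
    for _ in range(steps):
        agent = rng.choice(agents)
        norms[agent] = norms[rng.choice(neighbours[agent])]
        share_of_norm_one.append(sum(norms) / len(norms))
    return norms, share_of_norm_one


# 50 agents on a ring; the first half starts with norm 1, the second half with norm 0.
society = ring_lattice(50, k=2)
initial_norms = [1] * 25 + [0] * 25
final_norms, trajectory = voter_model(society, initial_norms, steps=2000)
```

The `trajectory` list records the share of agents holding norm 1 after every update, which can be plotted to see whether the society drifts toward consensus. Swapping the ring for an empirical social network only requires a different `neighbours` dictionary.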
Recommended Papers
For (1):
- Ostrom, Elinor. "Collective action and the evolution of social norms." Journal of economic perspectives 14.3 (2000): 137-158.
- Sethi, Rajiv, and Eswaran Somanathan. "The evolution of social norms in common property resource use." The American Economic Review (1996): 766-788.
- Centola, Damon, et al. "Experimental evidence for tipping points in social convention." Science 360.6393 (2018): 1116-1119.
For (2):
- ???
Interested Participants
Alice, Vandana, Alan, Xindi, Jenn, Matt, Sandra, Kevin
Topological features of neutral networks in evolution
Introduction
In a genotype network, nodes are genotypes and a link from genotype A to genotype B indicates that they are separated by a single mutation. Each genotype has a phenotype associated with it. In a fixed environment, a phenotype is associated with a fixed fitness value. So for every node, one has:
GENOTYPE -> PHENOTYPE -> FITNESS VALUE
The fitness values form a "fitness landscape", in which one can embed the genotype network. The set of nodes in a genotype network that correspond to the same fitness value is a *neutral network*. These networks have received little or no attention from network scientists. Let's change that!
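To make the definitions concrete, here is a minimal sketch that builds the genotype network for short binary genotypes and extracts neutral networks. The toy fitness function is an assumption chosen for illustration. The sketch also splits each set of equal-fitness genotypes into connected components, which goes one step beyond the definition above.

```python
from collections import defaultdict
from itertools import product


def genotype_network(length, alphabet="01"):
    """Adjacency dict: genotypes are strings; neighbours differ by one point mutation."""
    genotypes = ["".join(g) for g in product(alphabet, repeat=length)]
    neighbours = {}
    for g in genotypes:
        neighbours[g] = [
            g[:i] + a + g[i + 1:]
            for i in range(length)
            for a in alphabet
            if a != g[i]
        ]
    return neighbours


def neutral_networks(neighbours, fitness):
    """Group genotypes by fitness value and split each group into connected components."""
    components = defaultdict(list)
    seen = set()
    for start in neighbours:
        if start in seen:
            continue
        value = fitness(start)
        stack, component = [start], []
        seen.add(start)
        while stack:
            g = stack.pop()
            component.append(g)
            for h in neighbours[g]:
                if h not in seen and fitness(h) == value:
                    seen.add(h)
                    stack.append(h)
        components[value].append(sorted(component))
    return dict(components)


def toy_fitness(genotype):
    """Toy GENOTYPE -> FITNESS map (assumption): 1 if at least two 1s, else 0."""
    return 1 if genotype.count("1") >= 2 else 0


network = genotype_network(3)
neutral = neutral_networks(network, toy_fitness)
```

For genotypes of length 3 this gives 8 nodes with 3 neighbours each, and each of the two fitness values has a single connected neutral network of 4 genotypes. Empirical landscapes would replace `toy_fitness` with a lookup into measured fitness values.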
General idea
Depending on the interest of participants, this project could focus on (1) data analysis or (2) network theory.
(1) Szendro et al. mention that empirical data for genotype networks and their neutral networks are available. This is a somewhat new development (< 10 years). One could scout for one or several available data sets and study the topology of the networks. For example,
- what are topological characteristics of genotype networks? Can these characteristics be explained by constraints of embedding on a curved manifold? (One could compare data to random graph models, e.g. Erdos-Renyi, small world, or geometric random graph models.)
- how are neutral networks for high or low fitness values different?
- one could also think of the genotype network as a multilayer network with a lot of layers ... and analyse its topology from a multilayer perspective.
(2) A neutral network is a "level-set network" in the genotype network. The genotype network is a network that is embedded in a curved manifold in a high-dimensional space. There is so much cool math/physics/topology that one could do with this!!
Recommended Papers
- https://en.wikipedia.org/wiki/Neutral_network_(evolution)
- Szendro, Ivan G., et al. "Quantitative analyses of empirical fitness landscapes." Journal of Statistical Mechanics: Theory and Experiment 2013.01 (2013): P01005.
- De Visser, J. Arjan Gm, and Joachim Krug. "Empirical fitness landscapes and the predictability of evolution." Nature Reviews Genetics 15.7 (2014): 480.
- Kondrashov, Dmitry A., and Fyodor A. Kondrashov. "Topological features of rugged fitness landscapes in sequence space." Trends in Genetics 31.1 (2015): 24-33.
Interested Participants
Alice
Ricky
Carlos
Networks from thresholded normally distributed data
Observations:
- real-world networks are often created by thresholding dyadic interaction;
- lots of things are approximately normally distributed.
Idea:
- Suppose for each pair of nodes, i and j, there is a normally distributed interaction: x_ij ~ Normal(0,1);
- Then, we place edges between nodes i and j whenever x_ij>threshold;
- Edge correlations could be controlled by a single parameter, i.e. Cov(x, y) = beta .
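The idea above is easy to simulate. The original text does not say how the correlations are introduced, so the sketch below assumes one possible construction: each pair score mixes a node-level term with pair-level noise so that scores sharing a node have covariance beta. This works only for beta between 0 and 0.5. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def thresholded_network(n, threshold, beta, seed=0):
    """Sample an undirected graph by thresholding correlated standard-normal pair scores.

    Assumed construction (not from the original text):
        x_ij = sqrt(beta) * (z_i + z_j) + sqrt(1 - 2*beta) * e_ij
    with independent standard normals z_i (per node) and e_ij (per pair), so that
    x_ij ~ Normal(0, 1) and Cov(x_ij, x_ik) = beta for pairs sharing node i.
    """
    if not 0.0 <= beta <= 0.5:
        raise ValueError("beta must lie in [0, 0.5] for this construction")
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    shared, own = math.sqrt(beta), math.sqrt(1.0 - 2.0 * beta)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            x_ij = shared * (z[i] + z[j]) + own * rng.gauss(0.0, 1.0)
            if x_ij > threshold:
                edges.append((i, j))
    return edges


def degree_distribution(n, edges):
    """Map each degree value to the number of nodes that have it."""
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return Counter(degree)


def edge_probability(threshold):
    """P(x_ij > threshold) for a standard normal score."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


independent = thresholded_network(n=300, threshold=2.0, beta=0.0)
correlated = thresholded_network(n=300, threshold=2.0, beta=0.4)
```

With beta = 0 the edges are independent and the model is an Erdos-Renyi graph with edge probability `edge_probability(threshold)`. Comparing `degree_distribution(300, independent)` with `degree_distribution(300, correlated)` gives a first look at how the correlations reshape the degree distribution.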
Conjecture:
- The resulting degree distributions have two limiting forms, and are approximately Poisson or power law(ish) (+ exponential cut-off), with something intermediate in between (log normal?)
Things to look at:
- Can we solve for the degree distribution of this model?
- Does this degree distribution look like real networks? Can we fit the model easily (e.g. maximum likelihood or method of moments)?
- What about the giant component phase transition?
- Does clustering vanish in the limit of large network size?
This would be a more mathematical/theoretical project, and less about real world data.
Interested participants:
- George (background in physics and networks)
- Alice
The Evolution of Beliefs in Abrahamic Religions
General Idea
One commonality across all Abrahamic faiths – Judaism, Christianity, Islam, and others – is their reliance on the written word to solidify and codify beliefs, even centuries after the texts were written. Because of the large gap between when the documents were written – Torah, New Testament, Qur'an – and when these beliefs grow and evolve, decisions are often linked to other texts as justification. For instance, when Ecumenical Councils declare a new testament of faith, they often point to previous texts from church fathers for justification (or sometimes non-believers, like pre-Christian Greek philosophers). Similarly, when imams declare testaments of faith, these are often linked to the hadiths and sirahs as justification. Canon law and Islamic law are based on these two dynamics respectively. Religions often influence each other, both as attractors (Islam prompted Iconoclasm in Eastern Christianity) and repulsors (Early Christianity set itself in opposition to Judaic practices, despite being considered a Jewish sect).
Recommended Papers
Interested Participants
1. Kevin