
Research Experiences for Undergraduates 2011: Project Presentations

"URBAN SCALING IN THE UNITED KINGDOM"

Kevin Carlson, Indiana University

Recently, many scaling laws have been reported relating American urban areas' populations to a broad class of economic metrics. Such techniques improve on traditional per capita measures, which risk confounding the general nonlinear effects fundamental to dense human populations with the deviations from them caused by local influences. We intend to extend this work in an analysis of energy-use and crime data collected in the United Kingdom. We will aggregate reports from Local Administrative Units to the city level, considering all 65 UK centers with populations above ca. 100,000, and apply maximum likelihood estimation to compare various distributions' explanatory power over the observed trends. If power-law scaling is found, we will have a broader domain of support for this theoretical explanation of urban phenomena and the impetus to further generalize our understanding of this fundamental constraint on mass social phenomena. The complement to this work will be measuring cities' deviations from the theoretical predictions derived in this way, information that may find application in a range of further research in urban planning and public policy.

Mentor: Luis Bettencourt
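
As an illustration of the fitting step described above, the following minimal Python sketch estimates a scaling exponent by maximum likelihood under log-normal residuals, which is equivalent to least squares in log space. The city data are invented placeholders, not the UK dataset, and the error model is an assumption for illustration; competing models can be scored with the same log-likelihood and compared directly.

    # Minimal sketch: fit an urban scaling law Y = Y0 * N^beta by least squares
    # in log space (maximum likelihood under log-normal residuals).
    # The data arrays here are illustrative placeholders, not the UK dataset.
    import numpy as np

    population = np.array([4.5e6, 1.1e6, 7.3e5, 5.5e5, 2.1e5, 1.3e5])  # city sizes N
    metric = np.array([2.9e6, 6.5e5, 4.1e5, 3.0e5, 1.0e5, 6.0e4])      # e.g. energy use Y

    logN, logY = np.log(population), np.log(metric)
    beta, logY0 = np.polyfit(logN, logY, 1)            # slope = scaling exponent
    residuals = logY - (logY0 + beta * logN)
    sigma2 = residuals.var()                           # MLE of residual variance

    # Gaussian log-likelihood of the fit in log space, usable for model comparison
    n = len(logN)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

    print(f"beta = {beta:.3f}  (beta > 1: superlinear, beta < 1: sublinear)")
    print(f"log-likelihood = {loglik:.2f}")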


"A Network Approach to Interbank Lending and Systemic Risk"

Thomas Catanach, University of Notre Dame

The recent global financial collapse has highlighted the need for a more complete understanding of systemic risk within financial markets, and particular attention has been devoted to the risk stemming from the interconnectedness of banks and other financial entities. We intend to study models of banks with linked balance sheets forming a weighted directed network whose edges correspond to interbank assets and liabilities. We will first address systemic risk in financial networks using traditional static stress tests, in particular to characterize the emergence of cascading failures stemming from a single bank default. With this approach we will also study the effect of network topology on contagion, focusing on the differences between homogeneous and heterogeneous degree distributions. After exploring the static model, we will build a more realistic model in which consumer deposits, interbank loans, external assets, investments, prices, and interest rates are all dynamic, and bank assets, liabilities, and investment opportunities are heterogeneous. This model will help us understand economic cycles in which bank assets grow through investment until fluctuations in assets and liabilities produce a small number of bank defaults, which cause a liquidity shortage and eventually propagate to systemic collapse. Through these analyses we hope to explain the robust-yet-fragile nature of the global financial system, and the fact that banks seeking to minimize their individual risk can inadvertently increase the risk of failures at the systemic level.

Mentor: Fabio Caccioli
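
As a concrete sketch of the static stress test described above, the Python fragment below propagates a single default through a weighted directed exposure network: a bank fails once its losses on loans to defaulted counterparties exceed its capital. The zero-recovery rule, the balance-sheet numbers, and the three-bank example are illustrative assumptions in the spirit of standard contagion models, not the project's calibration.

    # Minimal sketch of a static contagion stress test on a weighted directed
    # interbank network. exposure[i][j] is bank i's loan to bank j (an asset
    # of i, a liability of j); a defaulting borrower is assumed to repay
    # nothing, so the full exposure becomes a loss to the lender.
    def cascade(exposure, capital, first_default):
        n = len(capital)
        defaulted = [False] * n
        losses = [0.0] * n
        queue = [first_default]
        defaulted[first_default] = True
        while queue:
            b = queue.pop()
            for lender in range(n):
                if defaulted[lender]:
                    continue
                losses[lender] += exposure[lender][b]
                if losses[lender] >= capital[lender]:  # equity wiped out
                    defaulted[lender] = True
                    queue.append(lender)
        return [i for i in range(n) if defaulted[i]]

    # Toy 3-bank example: bank 0's failure wipes out bank 1, which then hits bank 2.
    exposure = [[0, 0, 0],
                [5, 0, 0],   # bank 1 lent 5 to bank 0
                [0, 4, 0]]   # bank 2 lent 4 to bank 1
    capital = [1.0, 3.0, 2.0]
    print(cascade(exposure, capital, first_default=0))  # -> [0, 1, 2]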

"The Onset of Difficulty in A Variant of Graph Coloring and the Jamming Transition"

Ronnie Garduño, University of New Mexico

For random graphs, it has been shown that there is an edge density at which the probability that a 3-coloring of the nodes exists drops to zero: once a certain fraction of the possible edges is present, a random graph almost surely admits no 3-coloring. This phase transition marks the edge of solvability for this class of problem, but questions remain about the relative difficulty of finding solutions. It seems probable that many instances in the solvable range will require a great deal of computation to solve, and a natural conjecture is that this difficulty scales with the density of the random graphs involved. To investigate these questions, we will perform computational experiments measuring the average number of changes needed to correct the coloring of random graphs of various sizes and densities. Rather than working with the original discrete formulation of the coloring problem, we will focus on a continuous variant in which each node is associated with a real number from 0 to 1, with the restriction that any two neighboring nodes must be at least 1/3 (modulo 1) away from each other. This change allows us to focus on moves that affect only one neighbor at a time, whereas in the discrete formulation changing the color of one node can affect every one of its neighbors simultaneously.

The problem of modeling the motion of hard spheres in close proximity to each other has many applications in physics: some systems can profitably be thought of as composed of inelastic homogeneous spheres, whose interacting motions are simulated to study various features of the system in question. Many fundamental problems in statistical mechanics are studied using this simplified model, making the computational difficulty of simulating these interactions a serious barrier to further understanding. Our approach to graph coloring is intended to operate in the same fashion as the event-chain algorithm for performing such simulations, and our experiments should uncover results useful to the design and application of such algorithms.

Mentor: Cris Moore
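
The following is a minimal sketch of the proposed experiment, under assumed choices of graph model (Erdős–Rényi) and repair move (resampling one violated node at a time); the count of repair moves serves as the difficulty measure discussed above.

    # Minimal sketch of the continuous ("circular") coloring variant: each
    # node carries a value in [0,1), and adjacent nodes must sit at circular
    # distance >= 1/3. Violations are repaired by resampling one violated
    # node at a time, and the number of moves is a crude difficulty proxy.
    import random

    def circ_dist(x, y):
        d = abs(x - y)
        return min(d, 1 - d)          # distance on the circle of circumference 1

    def repair(n, edge_prob, max_moves=100_000, rng=random.Random(0)):
        edges = [(i, j) for i in range(n) for j in range(i + 1, n)
                 if rng.random() < edge_prob]
        nbrs = {i: [] for i in range(n)}
        for i, j in edges:
            nbrs[i].append(j)
            nbrs[j].append(i)
        color = [rng.random() for _ in range(n)]
        for moves in range(max_moves):
            bad = [i for i in range(n)
                   if any(circ_dist(color[i], color[j]) < 1/3 for j in nbrs[i])]
            if not bad:
                return moves                       # satisfied: moves used
            color[rng.choice(bad)] = rng.random()  # resample one violated node
        return None                                # gave up

    print(repair(n=60, edge_prob=0.02))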

Jay Garlapati, University of Chicago

Nash equilibria predict the strategies of rational agents playing a game a priori. We aim instead to study the development of game strategies through an evolutionary process. Professor Miller has proposed an evolutionary perspective in which agents are modeled as a population of automata that compete against each other and evolve their strategies through iterations of a genetic algorithm. We will use this approach to study cooperation between agents and environmental effects in n-player games. Krohn and Rhodes noted the equivalence between automata and semigroups, their algebraic counterparts, and used it to prove the Krohn-Rhodes Theorem, which allows us to decompose an automaton into an equivalent hierarchical system of permutation and reset automata. We hope to use the Krohn-Rhodes Theorem to understand the effectiveness of the evolved agents in terms of their constituent parts, as well as to study their level of complexity.

Mentors: Simon DeDeo and John Miller
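
As a sketch of the evolutionary setup described above, with the iterated prisoner's dilemma standing in for the game and the machine size, payoffs, and genetic-algorithm settings chosen purely for illustration, a population of two-state Moore machines can be evolved as follows.

    # Minimal sketch: two-state Moore machines play the iterated prisoner's
    # dilemma; the fitter half survives each generation and the rest are
    # replaced by mutated copies. All parameters are illustrative.
    import random

    rng = random.Random(1)
    PAYOFF = {('C','C'): 3, ('C','D'): 0, ('D','C'): 5, ('D','D'): 1}

    def random_machine():
        # genome: an action per state, and a next state per opponent move
        return {'act': [rng.choice('CD') for _ in range(2)],
                'nxt': [{m: rng.randrange(2) for m in 'CD'} for _ in range(2)]}

    def play(a, b, rounds=50):
        sa = sb = 0                        # both machines start in state 0
        score_a = score_b = 0
        for _ in range(rounds):
            ma, mb = a['act'][sa], b['act'][sb]
            score_a += PAYOFF[(ma, mb)]
            score_b += PAYOFF[(mb, ma)]
            sa, sb = a['nxt'][sa][mb], b['nxt'][sb][ma]
        return score_a, score_b

    def evolve(pop_size=20, generations=100):
        pop = [random_machine() for _ in range(pop_size)]
        for _ in range(generations):
            fitness = [0] * pop_size
            for i in range(pop_size):          # round-robin tournament
                for j in range(i + 1, pop_size):
                    si, sj = play(pop[i], pop[j])
                    fitness[i] += si
                    fitness[j] += sj
            ranked = sorted(range(pop_size), key=lambda i: -fitness[i])
            survivors = [pop[i] for i in ranked[:pop_size // 2]]
            mutants = []
            for m in survivors:                # mutate a copy of each survivor
                c = {'act': m['act'][:], 'nxt': [d.copy() for d in m['nxt']]}
                s = rng.randrange(2)
                if rng.random() < 0.5:
                    c['act'][s] = rng.choice('CD')
                else:
                    c['nxt'][s][rng.choice('CD')] = rng.randrange(2)
                mutants.append(c)
            pop = survivors + mutants
        return pop

    pop = evolve()
    print(pop[0]['act'])   # per-state actions of a surviving machine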

"Applying the Akaike Information Criterion to Degree-Corrected Stochastic Block Models"

Jacob Jensen, Columbia University

The Akaike Information Criterion (AIC) is a powerful and highly principled method of model selection. When working with probabilistic models it is typical to fit them to the data by maximum likelihood, using Markov chain Monte Carlo or belief propagation. The maximized likelihood obtained in this way indicates how well the model fits the data. Unfortunately, models with more parameters tend to fit the data strictly better, simply because they have more degrees of freedom and not necessarily because they describe it better. The AIC counters this by taking the negative maximized log-likelihood −ℓ(θ̂) and adding a penalty k = |θ|, the number of parameters; the best model according to the AIC is then the one that minimizes −ℓ(θ̂) + k. This simple penalty has a powerful justification, but it makes certain assumptions that pose major problems for models used to infer networks. The first of these assumptions, that all parameters in θ are continuous, can be remedied by a Monte Carlo procedure that marginalizes over the discrete variables. The second, which assumes a narrow peak in the model's likelihood around its maximum, is normally easy to satisfy given enough data to estimate the parameters accurately. In the case of the degree-corrected block model, which includes an independently parametrized Poisson distribution for every node, low-degree nodes violate this assumption. We hope to overcome this problem and find an extension of the AIC that can handle this case.

Mentors: Cris Moore and Cosma Shalizi
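
For concreteness, here is a minimal Python sketch of model selection with the AIC in the form used above, scoring each model by −ℓ(θ̂) + k and preferring the smaller score. The two competing Poisson models and the synthetic counts are illustrative assumptions, not the block-model computation itself.

    # Minimal sketch of AIC model selection: score = -loglik(theta_hat) + k,
    # smaller is better. The data and the two Poisson models are illustrative.
    import math

    data = [2, 3, 2, 4, 9, 8, 10, 9]   # counts with an apparent two-group split

    def poisson_loglik(counts, lam):
        return sum(c * math.log(lam) - lam - math.lgamma(c + 1) for c in counts)

    # Model 1: one rate for all counts (k = 1 parameter, MLE = sample mean)
    lam = sum(data) / len(data)
    score1 = -poisson_loglik(data, lam) + 1

    # Model 2: separate rates for each half (k = 2 parameters)
    a, b = data[:4], data[4:]
    score2 = -(poisson_loglik(a, sum(a)/len(a)) + poisson_loglik(b, sum(b)/len(b))) + 2

    print(f"one-rate score: {score1:.2f}, two-rate score: {score2:.2f}")
    # The richer model wins only if its fit gain exceeds the extra-parameter penalty.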