Randomness, Structure and Causality - Abstract

Effective Complexity of Stationary Process Realizations

Ay, Nihat (nay@mis.mpg.de)
SFI & Max Planck Institute

The concept of effective complexity of an object as the minimal description length of its regularities was introduced by Gell-Mann and Lloyd. Building on their work, we gave a precise definition of the effective complexity of finite binary strings in terms of algorithmic information theory in a previous paper. Here we study the effective complexity of strings generated by stationary processes. Sufficiently long typical process realizations turn out to be effectively simple under any linear scaling, with the string's length, of the parameter $\Delta$ that determines the minimization domain. For a class of computable ergodic processes, including i.i.d. and ergodic Markovian processes, a stronger result can be shown: there exist sublinear scalings of $\Delta$ for which typical realizations turn out to be effectively simple. Our results become most transparent in the context of coarse effective complexity, a modification of plain effective complexity in which $\Delta$ appears as a minimization argument. A similar modification of the closely related concept of sophistication was introduced by Antunes and Fortnow as coarse sophistication.

Links: [[1]]

Learning Out of Equilibrium

Bell, Tony (tony@salk.edu)
UC Berkeley

Inspired by new results in non-equilibrium statistical mechanics, we define a new kind of state machine that can be used to model time series. The machine is deterministically coupled to its inputs, unlike stochastic generative models such as the Kalman filter and HMMs. The likelihood in this case is shown to be a sum of local time likelihoods. We introduce a new concept, the second-order-in-time stochastic gradient, which derives from the time derivative of the likelihood, showing that the latter decomposes into a ‘work’ term, a ‘heat’ term and a term describing time asymmetry in the state machine’s dynamics. This motivates the introduction of a new time-symmetric likelihood function for time series. Our central result is that the time derivative of this likelihood is an average sum of forward and backward time ‘work’ terms, in which all partition functions, which plague Dynamic Bayesian Networks, have cancelled out. We can now do tractable time series density estimation with arbitrary models, without sampling. This is a direct result of doing second-order-in-time learning with time-symmetric likelihoods. A model is proposed, based on parameterised energy-based Markovian kinetics, with the goal of learning (bio)chemical networks from data, and taking a step towards understanding molecular-level energy-based self-organisation.

Links:

The Transmission of Sense Information

Bergstrom, Carl (cbergst@u.washington.edu)
SFI & University of Washington

Biologists rely heavily on the language of information, coding, and transmission that is commonplace in the field of information theory developed by Claude Shannon, but there is open debate about whether such language is anything more than facile metaphor. Philosophers of biology have argued that when biologists talk about information in genes and in evolution, they are not talking about the sort of information that Shannon’s theory addresses. First, philosophers have suggested that Shannon’s theory is only useful for developing a shallow notion of correlation, the so-called “causal sense” of information. Second, they typically argue that in genetics and evolutionary biology, information language is used in a “semantic sense,” whereas semantics are deliberately omitted from Shannon’s theory. Neither critique is well-founded. Here we propose an alternative to the causal and semantic senses of information: a transmission sense of information, in which an object X conveys information if the function of X is to reduce, by virtue of its sequence properties, uncertainty on the part of an agent who observes X. The transmission sense not only captures much of what biologists intend when they talk about information in genes, but also brings Shannon’s theory back to the fore. By taking the viewpoint of a communications engineer and focusing on the decision problem of how information is to be packaged for transport, this approach resolves several problems that have plagued the information concept in biology, and highlights a number of important features of the way that information is encoded, stored, and transmitted as genetic sequence.

Links: [[2]]

Optimizing Information Flow in Small Genetic Networks

Bialek, William (wbialek@Princeton.EDU)
Princeton University

Links: [[3]]

To a Mathematical Theory of Evolution and Biological Creativity

Chaitin, Gregory (gjchaitin@gmail.com)
IBM Watson Research Center

We present an information-theoretic analysis of Darwin’s theory of evolution, modeled as a hill-climbing algorithm on a fitness landscape. Our space of possible organisms consists of computer programs, which are subjected to random mutations. We study the random walk of increasing fitness made by a single mutating organism. In two different models we are able to show that evolution will occur and to characterize the rate of evolutionary progress, i.e., the rate of biological creativity.
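The hill-climbing random walk described above can be illustrated with a deliberately simplified toy. This sketch makes strong assumptions that are not part of the abstract: organisms are fixed-length bit strings rather than computer programs, the fitness function is simply the number of 1 bits, and a mutation flips a single random bit. The names `fitness`, `mutate`, and `hill_climb` are hypothetical, chosen only for illustration.

```python
import random


def fitness(genome):
    # Hypothetical stand-in fitness: the number of 1 bits in the genome.
    return sum(genome)


def mutate(genome, rng):
    # Random mutation: flip one uniformly chosen bit, leaving the parent intact.
    i = rng.randrange(len(genome))
    child = list(genome)
    child[i] ^= 1
    return child


def hill_climb(length=32, steps=2000, seed=0):
    """Single mutating organism; a mutant replaces it only if strictly fitter."""
    rng = random.Random(seed)
    organism = [0] * length
    history = [fitness(organism)]
    for _ in range(steps):
        candidate = mutate(organism, rng)
        if fitness(candidate) > fitness(organism):
            organism = candidate
        history.append(fitness(organism))
    return organism, history


if __name__ == "__main__":
    final, history = hill_climb()
    print("final fitness:", fitness(final), "after", len(history) - 1, "mutations tried")
```

The recorded `history` is non-decreasing by construction, so its growth over time gives a crude picture of a "rate of progress" for this toy landscape only.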

Links: File:Darwin.pdf

Framing Complexity

Crutchfield, James (chaos@cse.ucdavis.edu)
SFI & UC Davis

Is there a theory of complex systems? And who should care, anyway?

Links: [[4]]

The Vocabulary of Grammar-Based Codes and the Logical Consistency of Texts

Debowski, Lukasz (ldebowsk@ipipan.waw.pl)
Polish Academy of Sciences

We will present a new explanation for the distribution of words in natural language which is grounded in information theory and inspired by recent research in excess entropy. Namely, we will demonstrate a theorem with the following informal statement: If a text of length $n$ describes $n^{\beta }$ independent facts in a repetitive way then the text contains at least $n^{\beta }/\log n$ different words. In the formal statement, two modeling postulates are adopted. Firstly, the words are understood as nonterminal symbols of the shortest grammar-based encoding of the text. Secondly, the text is assumed to be emitted by a finite-energy strongly nonergodic source whereas the facts are binary IID variables predictable in a shift-invariant way. Besides the theorem, we will exhibit a few stochastic processes to which this and similar statements can be related.

Links: [[5]] and [[6]]

Prediction, Retrodiction, and the Amount of Information Stored in the Present

Ellison, Christopher (cellison@cse.ucdavis.edu)
Complexity Sciences Center, UC Davis

We introduce an ambidextrous view of stochastic dynamical systems, comparing their forward-time and reverse-time representations and then integrating them into a single time-symmetric representation. The perspective is useful theoretically, computationally, and conceptually. Mathematically, we prove that the excess entropy--a familiar measure of organization in complex systems--is the mutual information not only between the past and future, but also between the predictive and retrodictive causal states. Practically, we exploit the connection between prediction and retrodiction to directly calculate the excess entropy. Conceptually, these lead one to discover new system invariants for stochastic dynamical systems: crypticity (information accessibility) and causal irreversibility. Ultimately, we introduce a time-symmetric representation that unifies all these quantities, compressing the two directional representations into one. The resulting compression offers a new conception of the amount of information stored in the present.
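The central identity stated above can be written compactly. The notation here is an assumption made for illustration: $\overleftarrow{X}$ and $\overrightarrow{X}$ denote the past and the future of the process, and $\mathcal{S}^{+}$ and $\mathcal{S}^{-}$ denote the predictive and retrodictive causal states.

```latex
\mathbf{E}
  \;=\; I\bigl[\overleftarrow{X};\, \overrightarrow{X}\bigr]
  \;=\; I\bigl[\mathcal{S}^{+};\, \mathcal{S}^{-}\bigr]
```

The first equality is the usual definition of the excess entropy as a past-future mutual information; the second is the result claimed in the abstract.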

Links: [[7]]

Complexity Measures and Frustration

Feldman, David (dave@hornacek.coa.edu)
College of the Atlantic

In this talk I will present some new results applying complexity measures to frustrated systems, and I will also comment on some frustrations I have about past and current work in complexity measures. I will conclude with a number of open questions and ideas for future research.

I will begin with a quick review of the excess entropy/predictive information and argue that it is a well understood and broadly applicable measure of complexity that allows for a comparison of information processing abilities among very different systems. The vehicle for this comparison is the complexity-entropy diagram, a scatter-plot of the entropy and excess entropy as model parameters are varied. This allows for a direct comparison in terms of the configurations' intrinsic information processing properties. To illustrate this point, I will show complexity-entropy diagrams for: 1D and 2D Ising models, 1D Cellular Automata, the logistic map, an ensemble of Markov chains, and an ensemble of epsilon-machines.
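As a rough illustration of how one point on a complexity-entropy diagram might be computed, the sketch below handles the simplest case only: a two-state, first-order Markov chain whose states are the observed symbols. For such a chain the excess entropy reduces to the single-symbol entropy minus the entropy rate. The function names and the parameterisation by `p` and `q` are assumptions made for this example, not taken from the talk.

```python
import math


def entropy(probs):
    # Shannon entropy in bits; zero-probability terms contribute nothing.
    return -sum(p * math.log2(p) for p in probs if p > 0)


def markov_point(p, q):
    """Return (entropy rate, excess entropy) in bits for a two-state Markov chain.

    p = P(0 -> 1) and q = P(1 -> 0); requires p + q > 0.
    """
    pi0 = q / (p + q)  # stationary probability of symbol 0
    pi1 = p / (p + q)  # stationary probability of symbol 1
    h_mu = pi0 * entropy([p, 1 - p]) + pi1 * entropy([q, 1 - q])
    excess = entropy([pi0, pi1]) - h_mu  # valid for first-order Markov chains
    return h_mu, excess


def diagram_points(grid=10):
    # Sweep the model parameters to collect scatter-plot coordinates.
    values = [(i + 1) / grid for i in range(grid)]
    return [markov_point(p, q) for p in values for q in values]


if __name__ == "__main__":
    for h_mu, excess in diagram_points(grid=4):
        print(f"h_mu = {h_mu:.3f} bits, E = {excess:.3f} bits")
```

Plotting the pairs returned by `diagram_points` with entropy rate on one axis and excess entropy on the other gives a small complexity-entropy diagram for this model class.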

I will then present some new work in which a local form of the 2D excess entropy is calculated for a frustrated spin system. This allows one to see how information and memory are shared unevenly across the lattice as the system enters a glassy state. These results show that localised information-theoretic complexity measures can be usefully applied to heterogeneous lattice systems. I will argue that the development of local complexity measures for higher-dimensional and heterogeneous systems is a particularly fruitful area for future research.

Finally, I will conclude by remarking upon some of the areas of complexity-measure research that have been sources of frustration. These include the persistent notions of a universal "complexity at the edge of chaos," and the relative lack of applications of complexity measures to empirical data and/or multidimensional systems. These remarks are designed to provoke dialog and discussion about interesting and fun areas for future research.

Links: File:Afm.tri.5.pdf and File:CHAOEH184043106 1.pdf

Complexity, Parallel Computation and Statistical Physics

Machta, Jon (machta@physics.umass.edu)
SFI & University of Massachusetts

Links: [[8]]

Crypticity and Information Accessibility

Mahoney, John (jmahoney3@ucmerced.edu)
UC Merced

We give a systematic expansion of the crypticity--a recently introduced measure of the inaccessibility of a stationary process's internal state information. This leads to a hierarchy of k-cryptic processes and allows us to identify finite-state processes that have infinite crypticity--the internal state information is present across arbitrarily long, observed sequences. The crypticity expansion is exact in both the finite- and infinite-order cases. It turns out that k-crypticity is complementary to the Markovian finite-order property that describes state information in processes. One application of these results is an efficient expansion of the excess entropy--the mutual information between a process's infinite past and infinite future--that is finite and exact for finite-order cryptic processes.

Links: [[9]]

Automatic Identification of Information-Processing Structures in Cellular Automata

Mitchell, Melanie (mm@cs.pdx.edu)
SFI & Portland State University

Cellular automata have been widely used as idealized models of natural spatially-extended dynamical systems. An open question is how to best understand such systems in terms of their information-processing capabilities. In this talk we address this question by describing several approaches to automatically identifying the structures underlying information processing in cellular automata. In particular, we review the computational mechanics methods of Crutchfield et al., the local sensitivity and local statistical complexity filters proposed by Shalizi et al., and the information theoretic filters proposed by Lizier et al. We illustrate these methods by applying them to several one- and two-dimensional cellular automata that have been designed to perform the so-called density (or majority) classification task.
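For readers unfamiliar with the density classification task, a minimal sketch follows. It assumes a one-dimensional binary lattice with periodic boundaries and an odd number of cells, so that the majority is always well defined. A cellular automaton solves an instance if it relaxes to all 1s when the initial configuration has more 1s than 0s, and to all 0s otherwise. The local majority rule used here is only an illustrative baseline, not one of the rules discussed in the talk, and all function names are hypothetical.

```python
def step(cells, rule, radius):
    # One synchronous update with periodic boundary conditions.
    n = len(cells)
    return [
        rule(tuple(cells[(i + d) % n] for d in range(-radius, radius + 1)))
        for i in range(n)
    ]


def local_majority(neighborhood):
    # Baseline rule: each cell adopts the majority value of its neighborhood.
    return 1 if 2 * sum(neighborhood) > len(neighborhood) else 0


def classifies_correctly(cells, rule, radius=3, max_steps=200):
    """True if the lattice is uniformly at the initial majority value after max_steps."""
    target = 1 if 2 * sum(cells) > len(cells) else 0
    for _ in range(max_steps):
        cells = step(cells, rule, radius)
    return all(c == target for c in cells)


if __name__ == "__main__":
    blocks = [1] * 11 + [0] * 10  # majority of 1s, arranged in two large blocks
    print(classifies_correctly(blocks, local_majority))
```

With two large blocks the local majority rule freezes immediately, so the lattice never becomes uniform. This shows why the task requires information to be passed over distances larger than the neighborhood.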

Phase Transitions and Computational Complexity

Moore, Cris (moore@cs.unm.edu)
SFI & University of New Mexico

We study EC3, a variant of Exact Cover which is equivalent to Positive 1-in-3 SAT. Random instances of EC3 were recently used as benchmarks for simulations of an adiabatic quantum algorithm. Empirical results suggest that EC3 has a phase transition from satisfiability to unsatisfiability when the number of clauses per variable $r$ exceeds some threshold $r^* \approx 0.62 \pm 0.01$. Using the method of differential equations, we show that if $r \le 0.546$ then, with high probability, a random instance of EC3 is satisfiable. Combined with previous results, this limits the location of the threshold, if it exists, to the range $0.546 < r^* < 0.644$.
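To make the problem concrete, here is a small brute-force sketch of Positive 1-in-3 SAT: every clause contains three distinct unnegated variables and is satisfied when exactly one of them is true. The random-instance model (each clause drawn uniformly and independently) and the function names are assumptions made for illustration. Exhaustive search is feasible only for very small instances, which are far too small to show a sharp threshold.

```python
import itertools
import random


def random_ec3(n_vars, ratio, rng):
    # Draw round(ratio * n_vars) clauses, each a triple of distinct variables.
    n_clauses = round(ratio * n_vars)
    return [tuple(rng.sample(range(n_vars), 3)) for _ in range(n_clauses)]


def is_satisfiable(n_vars, clauses):
    # Exhaustive search: every clause must contain exactly one true variable.
    for assignment in itertools.product((0, 1), repeat=n_vars):
        if all(assignment[a] + assignment[b] + assignment[c] == 1
               for a, b, c in clauses):
            return True
    return False


def satisfiable_fraction(n_vars, ratio, trials, seed=0):
    # Empirical fraction of satisfiable random instances at a given ratio.
    rng = random.Random(seed)
    hits = sum(is_satisfiable(n_vars, random_ec3(n_vars, ratio, rng))
               for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for ratio in (0.3, 0.5, 0.7, 0.9):
        print(ratio, satisfiable_fraction(n_vars=12, ratio=ratio, trials=50))
```

Sweeping `ratio` for increasing `n_vars` is one way to observe empirically how the satisfiable fraction falls off, although the rigorous bounds in the abstract come from analysis rather than simulation.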

Links: [[10]]

Statistical Mechanics of Interactive Learning

Still, Suzanne (sstill@hawaii.edu)
University of Hawaii at Manoa

The principles of statistical mechanics and information theory play an important role in learning and have inspired both theory and the design of numerous machine learning algorithms. The new aspect in this paper is a focus on integrating feedback from the learner. A quantitative approach to interactive learning and adaptive behavior is proposed, integrating model- and decision-making into one theoretical framework. This paper follows simple principles by requiring that the observer’s world model and action policy should result in maximal predictive power at minimal complexity. Classes of optimal action policies and of optimal models are derived from an objective function that reflects this trade-off between prediction and complexity. The resulting optimal models then summarize, at different levels of abstraction, the process’s causal organization in the presence of the learner’s actions. A fundamental consequence of the proposed principle is that the learner’s optimal action policies balance exploration and control as an emerging property. Interestingly, the explorative component is present in the absence of policy randomness, i.e. in the optimal deterministic behavior. This is a direct result of requiring maximal predictive power in the presence of feedback.

Links: [[11]]

Measuring the Complexity of Psychological States

Tononi, Giulio (gtononi@wisc.edu)
University of Wisconsin-Madison

Links:

Ergodic Parameters and Dynamical Complexity

Vilela-Mendes, Rui (vilela@cii.fc.ul.pt)
University of Lisbon

Using a cocycle formulation, old and new ergodic parameters beyond the Lyapunov exponent are rigorously characterized. Dynamical Renyi entropies and fluctuations of the local expansion rate are related by a generalization of the Pesin formula. How the ergodic parameters may be used to characterize the complexity of dynamical systems is illustrated by some examples: Clustering and synchronization, self-organized criticality and the topological structure of networks.

Links: [[12]]

Hidden Quantum Markov Models and Non-adaptive Read-out of Many-body States

Wiesner, Karoline (k.wiesner@bristol.ac.uk)
University of Bristol

Stochastic finite-state generators are compressed descriptions of infinite time series. Alternatively, compressed descriptions are given by quantum finite-state generators [K. Wiesner and J. P. Crutchfield, Physica D 237, 1173 (2008)]. These are based on repeated von Neumann measurements on a quantum dynamical system. Here we generalise the quantum finite-state generators by replacing the von Neumann projections with stochastic quantum operations. In this way we ensure that any time series with a stochastic compressed description has a compressed quantum description. Moreover, we establish a link between our stochastic generators and the sequential readout of many-body states with translationally-invariant matrix product state representations. As an example, we consider the non-adaptive read-out of 1D cluster states. This is shown to be equivalent to a Hidden Quantum Model with two internal states, providing insight into the inherent complexity of the process. Finally, it is proven by example that the quantum description can have a higher degree of compression than the classical stochastic one.

Links: [[13]]

From Santa Fe Institute Events Wiki