

{{Randomness, Structure and Causality}}

 

[[Media:Agenda.pdf|Agenda PDF]]
 == Abstracts ==
 
  
 <br>
 
  
The Vocabulary of Grammar-Based Codes and the Logical Consistency of Texts<br>
 
  
Debowski, Lukasz (ldebowsk@ipipan.waw.pl)<br>
 
 Polish Academy of Sciences<br>
 
 <br>
 
 
 We will present a new explanation for the distribution of words in
 
 natural language which is grounded in information theory and inspired
 
 by recent research in excess entropy. Namely, we will demonstrate a
 
 theorem with the following informal statement: If a text of length $n$
 
 describes $n^\beta$ independent facts in a repetitive way then the
 
 text contains at least $n^\beta/\log n$ different words. In the
 
 formal statement, two modeling postulates are adopted. Firstly, the
 
 words are understood as nonterminal symbols of the shortest
 
grammar-based encoding of the text. Secondly, the text is assumed to

be emitted by a finite-energy strongly nonergodic source whereas the

facts are binary IID variables predictable in a shift-invariant
 
 way. Besides the theorem, we will exhibit a few stochastic processes
 
 to which this and similar statements can be related.
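As a purely numerical illustration of the bound above (an editorial sketch, not part of the abstract; the function name and the example values of $n$ and $\beta$ are invented), the lower bound $n^\beta/\log n$ on the number of distinct words can be evaluated directly:

```python
import math

def vocab_lower_bound(n: float, beta: float) -> float:
    """Evaluate the stated lower bound n^beta / log n on the number of
    distinct words (nonterminal symbols of the shortest grammar-based
    encoding) for a text of length n describing n^beta facts."""
    return n ** beta / math.log(n)

# Hypothetical example: a text of n = 10^6 symbols with beta = 0.5,
# i.e. roughly 1000 independently described facts.
print(round(vocab_lower_bound(1e6, 0.5)))  # → 72
```

So even a modest exponent $\beta$ forces the vocabulary to grow with text length, up to the logarithmic correction.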
 
  
 
  
[http://arxiv.org/abs/0810.3125] and [http://arxiv.org/abs/0911.5318]