Unfolding History
From Santa Fe Institute Events Wiki
Latest revision as of 18:26, 17 June 2013
Trying to give this project a new impulse: let us use this site to share info and tools!
Brainstorming
Some of the ideas after today's meeting:
- History Epistasis:
- How does an event affect previously existing events?
- Can we build a network and check how deep a wave of modifications propagates?
- Reaction time after an event happens.
- Make an agent based model.
- Take empirical data from Wikipedia.
- Agents biasing History.
- Do conflicts in a country show up in its account of History?
- External vs. internal history: the cost of inner encoding vs. relying on the environment to encode important traits.
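The question about how deep a wave of modifications propagates could be explored with a breadth-first search over a network of events. The following is only a minimal sketch: the network `links`, the event names and the function `propagation_depth` are invented for illustration and do not come from any data collected so far.

```python
from collections import deque


def propagation_depth(links, source):
    """Return {event: depth} for every event reachable from `source`.

    `links` maps an event to the events whose articles get modified in
    response to it (a hypothetical, hand-built influence network).
    """
    depth = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in links.get(current, []):
            if neighbour not in depth:  # first visit gives the shortest depth
                depth[neighbour] = depth[current] + 1
                queue.append(neighbour)
    return depth


# Toy network, invented for illustration only.
links = {
    "event_A": ["event_B", "event_C"],
    "event_B": ["event_D"],
    "event_C": ["event_D"],
    "event_D": [],
}
depths = propagation_depth(links, "event_A")
max_depth = max(depths.values())  # how far the wave travels from event_A
```

With real data, the edges would have to be inferred first, for instance from edits to one article that closely follow edits to another; that inference step is the hard part and is not covered here.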
Please, post
Also, Joshua proposed some tools that can be handy for detecting a change in the editing regime of an event. I am just posting the keywords I managed to catch. Could someone add a description or some bibliography on them?
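The tools Joshua mentioned are not recorded here, so as a stand-in, this is a deliberately simple way to look for a single change in editing regime: split a series of edit counts at the point where the two halves are best described by their own means. The function name `best_split` and the numbers in `edits_per_day` are made up for the example; this is a rough sketch, not the method proposed at the meeting.

```python
def best_split(counts):
    """Return the index k (1 <= k < len(counts)) that minimises the summed
    squared deviation of counts[:k] and counts[k:] from their own means."""
    if len(counts) < 2:
        raise ValueError("need at least two observations")

    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    return min(
        range(1, len(counts)),
        key=lambda k: sse(counts[:k]) + sse(counts[k:]),
    )


# Invented daily edit counts: a quiet period followed by a burst.
edits_per_day = [2, 3, 1, 2, 3, 15, 18, 14, 17, 16]
change_day = best_split(edits_per_day)  # index of the first day of the new regime
```

This always returns a split, even when the series has no real change, so some significance check would be needed before trusting the result on actual edit histories.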
Scripts
Parsing files with Python is easy and fast. This links to a Dropbox folder containing a few data samples manually cropped from Wikipedia and three Python scripts to parse them.
Hey Luiño, I wrote something to parse txt data into a dictionary. [https://www.dropbox.com/sh/wigwizffvcmf7l6/xxnew5I1VQ] Please check... (minimal programming experience) Thanks! - Mengsen.
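For anyone who cannot open the Dropbox folder, here is a small sketch of what parsing txt data into a dictionary can look like. It is not one of the scripts in the folder: the line format (timestamp, a tab, then the user name), the function `parse_edits` and the sample rows are all assumptions made for the example.

```python
import io
from collections import defaultdict


def parse_edits(handle):
    """Group edit timestamps by user.

    Assumes one edit per line in the form 'timestamp<TAB>user' (a
    hypothetical format); blank lines and lines starting with '#' are skipped.
    """
    edits_by_user = defaultdict(list)
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        timestamp, user = line.split("\t", 1)
        edits_by_user[user].append(timestamp)
    return dict(edits_by_user)


# Invented sample standing in for a cropped revision history.
sample = io.StringIO(
    "# timestamp\tuser\n"
    "2013-06-17T18:26\tuser_a\n"
    "2013-06-17T18:40\tuser_b\n"
    "\n"
    "2013-06-18T09:05\tuser_a\n"
)
edits = parse_edits(sample)
```

With a file on disk, the same function can be called on an open file handle, for example `parse_edits(open("edits.txt", encoding="utf-8"))`, where `edits.txt` is a placeholder name.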
Tools
Wikipedia has a [https://en.wikipedia.org/wiki/Wikipedia:Statistics large collection of tools] to extract statistics from the site. After a loose search, nothing was found that resembles what I (Luíño) had in mind. There are very interesting models and fits to data, though, of how an article grows in time or how much a given user affects a wiki. If we eventually want to do something with Wikipedia, we should check that what we need has not already been invented. Is anyone up for browsing these tools and telling us about them?
Natural Language Processing tools
Carol did a literature review of Natural Language Processing tools, with the idea that peering into the meaning of the texts could also help with our task. I paste her results here:
Natural Language Processing (NLP) Survey of Tools & Resources [http://emerge.mc.vanderbilt.edu/natural-language-processing-nlp-survey-tools-resources]
Natural Language Toolkit: NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning. (Python) [http://nltk.org]
The Stanford Natural Language Processing Group makes parts of its Natural Language Processing software available to everyone. These are statistical NLP toolkits for various major computational linguistics problems. They can be incorporated into applications with human language technology needs. (Java) [http://nlp.stanford.edu/software/]
CRAN Task View: Natural Language Processing. Natural language processing has come a long way since its foundations were laid in the 1940s and 50s (for an introduction see, e.g., Jurafsky and Martin (2008): Speech and Language Processing, Pearson Prentice Hall). This CRAN task view collects relevant R packages that support computational linguists in conducting analysis of speech and language on a variety of levels, with a focus on words, syntax, semantics, and pragmatics. In recent years, the task view authors have elaborated a framework to be used in packages dealing with the processing of written material: the package tm. Extension packages in this area are highly recommended to interface with tm's basic routines, and developers are cordially invited to join the discussion on further developments of this framework package. (R, package tm) [http://cran.r-project.org/web/views/NaturalLanguageProcessing.html]
The SPECIALIST Natural Language Processing (NLP) Tools have been developed by the Lexical Systems Group of the Lister Hill National Center for Biomedical Communications to investigate the contributions that natural language processing techniques can make to the task of mediating between the language of users and the language of online biomedical information resources. The SPECIALIST NLP Tools facilitate natural language processing by helping application developers with lexical variation and text analysis tasks in the biomedical domain. The NLP Tools are open source resources distributed subject to these terms and conditions. [http://lexsrv3.nlm.nih.gov/Specialist/Home/index.html]
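Basic NLP tools [http://adimen.si.ehu.es/~rigau/research/Doctorat/LSKBs/00-NLP-tools.pdf]
Software Tools for NLP [http://www-a2k.is.tokushima-u.ac.jp/member/kita/NLP/nlp_tools.html]
Natural Language Processing [http://www.cs.cofc.edu/~manaris/ai-education-repository/nlp-tools.html]
Natural Language Processing (NLP) Tools [http://sqnco.com/2012/12/natural-language-processing-nlp-tools/]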
Literature
The following more or less related papers have been posted to the project by different colleagues:
- The Evolution of God -- by Pablo_Galindo.
- A model of religion diversification that could be very handy if we decide to build a model -- by Cesar.
- Network approach to history -- by Andrea.
- A nice project by Carol Strohecker on narrative unfolding: Tired of Giving In.
- Brian Keegan wrote his dissertation on something pretty similar, and has done some other work on Wikipedia edits of current events. --David
- Herman & Chomsky's Propaganda model, developed in Manufacturing Consent (1988). A really great documentary roughly based on the book is also available on YouTube.
Surely more people have approached the problem of History formation from a Complex Systems perspective. It would be interesting to go over the literature and maybe find some insight. Would someone like to do that? Elisa