For Concordance in Macro-historical Datasets - Bios
From Santa Fe Institute Events Wiki
Listed below are two sets of questions for us to discuss during the breakout groups. The first set consists of questions specific to the research we are currently doing; we will start with these. The second set consists of more general questions, which we may or may not get to, given time constraints. Note that these questions generally do not require us to talk much about database structures, thesauri, or other technical issues. I think we first need to determine what we need in order to carry out our research, and then find technical solutions that allow us to do that.
I. Questions specific to the projects represented
A. What are our units of analysis? How might we integrate them for cross-dataset searches?
B. What types of data are we using (e.g. textual, numeric, geospatial)? How might we integrate these across our units of analysis?
C. What are our metadata needs? What information must be accessible alongside each data source (e.g. bibliographic details, coding or collection protocols, dates of collection, error ranges)?
D. What types of analyses do we plan to perform? What data formats and/or conversions are required?
E. Are there copyright, human subjects protection, confidentiality, or other access issues we should consider?
II. Broader questions
A. Why are we doing this? What is the purpose of integrating these data? What can we hope to accomplish?
B. What questions (current and future) can we address with these data that we cannot address otherwise? Can we envision these data being used outside of macrohistory?
C. What access, output, and processing capabilities are necessary for research with these data to proceed beyond the groups represented here? How can we enable research on the future questions envisioned under B?
D. How do we pay for all this?
Finally, we should probably examine whether our answers to the questions in Set II are consistent with our answers to the more immediate concerns in Set I. Where do they differ, and how do we address those differences?