Metadata
From Santa Fe Institute Events Wiki
Here's a simple description of library metadata, in PowerPoint form:
documents/metadata-demystified.pdf
Here's an example of managing a metadata project:
documents/metadata-strategy.pdf
Metadata Standards
Here are the standards.
From Jewel (AISTI consultant who assisted SFI in Feb. 2004)
comm.nsdl.org/download.php/196/PC3Draft1b.pdf
(If you have problems accessing the paper, let me know.)
I'll see if I can get a hold of the final version, but in the meantime, you may wish to look at section 1.2.13.1, which provides a definition of native metadata, and other types of metadata for a collection.
This document may be useful as you redesign your systems/db in the next phase. ;-) I have only skimmed the document so far, but I will read through all of it. Once I have, if you don't mind, I may make some further recommendations for SFI, based on what we have discussed.
Metadata Preservation
Victoria McCargar, Seybold Report: "Since the mid-1990s, it has become increasingly clear that information stored digitally is terribly fragile... The task of identifying all the risk factors and putting preservation solutions in place has barely begun."
http://content.seyboldreport.com/TSR/subs/0421/disappearing_data.php
SFI Reports (chronological order)
Date: Mon, 4 Nov 2002 13:29:19 -0700 (MST)
From: Margaret Alexander <mba@santafe.edu>
Subject: eprint server
To: ellen@pele.santafe.edu, gumerman@pele.santafe.edu, rkbv@pele.santafe.edu, grr@pele.santafe.edu
Cc: cmachado@aisti.org, mba@pele.santafe.edu
Last week George, Ronda, Tim, Ginger, and I had a preliminary meeting about
putting SFI's working papers on a new e-print server that is being developed by
our consortium (called AISTI) of sci-tech libraries. During the course of a
D.C. meeting with other libraries about e-print servers, most speakers concurred
that policy issues took longer than technical ones.
MIT Libraries developed a repository, called DSpace, for the university's electronic scholarship. In the process of creating DSpace, they compiled a list of institutional questions and policies that had to be answered or written before the electronic repository went into operation. Here they are:
1. Who is qualified to submit electronic content? At SFI this question is taken care of by our policies for working paper submission. (In the future, SFI may not want to limit its submissions to just working papers. However, our "first step" in this endeavor should be working papers.)
Ronda: I agree. We can, however, readdress the working paper policies as needed.
2. What is the character of submissions? At SFI this question is taken care of by our policies for working paper submissions. (At other repositories learning objects such as course materials, theses, and dissertations may be submitted.)
3. What forms of submissions are accepted? At SFI this question is taken care of by our policies for working paper submissions.
4. What are our expectations for retention? This question probably applies to learning objects which are obsolete when a course is completed. At SFI we might ask what would happen if we decide to withdraw a working paper--is this covered in our current policy? At UCLA the policy is that the citation (metadata) cannot be removed but the content can be.
Ronda: I agree that we may want to delete the electronic versions of the papers but not the actual citation from the archive. (Currently, we leave the number but delete the title and author to avoid confusion.) I think there needs to be a comments field or option which specifies that this paper has been withdrawn, and maybe by whom, such as "withdrawn by author."
5. How do repository policies interact with other policies? At SFI we may be simple enough administratively to ignore this concern.
Ronda: One possible conflict is that student and nonfaculty members must have their working papers reviewed by an SFI faculty member or Science Board member. If the archive is truly open (anyone can submit), then the designation as an SFI working paper would have to be delayed until this review took place and the individual's connection to SFI verified.
6. Are there privacy issues? At SFI, this question might entail asking each individual working paper author if it's o.k. to submit SFI back files to the new e-print server.
Ronda: Since the authors have already consented to listing the paper as an SFI working paper, and we are simply changing how our papers are presented, I believe we can make the transition to another server by notifying the authors of our intent rather than getting permission in writing from each one. They can contact us if they object. Note that some past authors may be difficult to contact.
If you think about it, these papers have been in the public domain for some time now so they are not proprietary. And as long as the server remains open, we are not violating the original "agreement" with the authors. We just need to remember that we don't own the copyright for any of the papers.
7. What metadata will we require? Is the metadata at SFI consistent, complete, and accurate? If not, who will do the work?
Ronda: For SFI's working papers, we have a great deal of information in a FileMaker database, and this data has been checked. We do not have electronic copies of all papers back to 1989, nor do we monitor the quality of these papers per se (we simply check to see if they print correctly). We will have to decide later whether we want to recreate key historical papers, particularly those that are frequently requested, and what to do about quality control.
Converting our database records into the eprint server's metadata is an open issue.
The following need to be asked of the AISTI eprint server:
8. What if the eprint server fails? At SFI, since we already have a working paper site, we are protected.
Ronda: Not if we defer to the server instead. I thought this is what George said at our meeting--it would replace what we have. We will continue to track key information about each paper and the abstract, but will have to decide whether to, and how to, archive electronic copies. And we would need a plan for how to make this information available should the eprint server fail.
9. What if revenue comes from the eprint server? We also need to make certain
that the eprint server remains open and does not require licensing.
10. Who incurs the costs of running the eprint server over the long term? Initially, the eprint server has been funded by LANL with a grant from AISTI.
11. How will the eprint server be used? For example, can outside researchers use the repository as a "sand box" for their own research?
Ronda: I think having a repository that accepts a variety of papers is fine as long as we protect those papers classified as SFI working papers. Hence, I think it would be helpful to have some classification of articles by broad topics, such as complexity, as well as keywords and specific paper series. For example, if an author submits a paper, it is not listed as an SFI working paper until someone designated at SFI agrees or verifies this.
I am very excited about this project. I hope the researchers are as enthusiastic. Good luck at your presentation.
Let me know your thoughts on how these questions should be answered. On Nov. 13 I will introduce the eprint server at the noontime faculty meeting.
Margaret
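The retention idea raised under question 4 (keep the citation, remove the content, and record who withdrew the paper) could be sketched as a small record structure. This is only an illustration, not a description of SFI's FileMaker database or of the AISTI eprint server: the class name PaperRecord, its fields, the withdraw helper, and the sample values are all hypothetical. It follows the UCLA-style policy mentioned above, where the citation stays in place.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PaperRecord:
    """Hypothetical working paper record; field names are assumptions."""
    number: str                         # working paper number (kept on withdrawal)
    title: str                          # citation metadata (kept on withdrawal)
    authors: List[str]                  # citation metadata (kept on withdrawal)
    content_path: Optional[str] = None  # location of the electronic copy, if any
    withdrawn: bool = False
    comment: str = ""                   # e.g. "withdrawn by author"


def withdraw(record: PaperRecord, by: str) -> PaperRecord:
    """Remove the electronic content but leave the citation in the archive."""
    record.content_path = None
    record.withdrawn = True
    record.comment = f"withdrawn by {by}"
    return record


# Hypothetical sample values, for illustration only.
paper = PaperRecord(
    number="02-11-001",
    title="An Example Working Paper",
    authors=["A. Author"],
    content_path="papers/02-11-001.pdf",
)
withdraw(paper, by="author")
```

After the call, the record still carries its number, title, and authors, while the content path is cleared and the comment field reads "withdrawn by author".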
Crosswalks
2) MARC 21 to DC Crosswalk from the Library of Congress
This covers both qualified and unqualified DC.
MARC to DC: http://www.loc.gov/marc/marc2dc.html This one provides the mapping in the form of an easy-to-read table.
DC to MARC: http://www.loc.gov/marc/dccross.html
As for whether "report number" maps to dc:relation: MARC field 088 is the report number ("rept.#"); therefore, your report numbers would map to dc:identifier rather than dc:relation. See the back page of the copy of the TRI mappings that I gave you.
I have not been able to find a better explanation of each element than what is on the DCMI web site. I did find a page full of links to crosswalks, though: http://www.ukoln.ac.uk/metadata/interoperability/
Regards, Jewel
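A crosswalk like the one Jewel describes can be thought of as a lookup table from MARC field tags to Dublin Core elements. The sketch below is a minimal illustration, not the Library of Congress table itself. The 088 entry follows Jewel's note above; the 245 and 260 entries are an illustrative subset that should be checked against the LoC crosswalk. The names CROSSWALK and marc_to_dc, the (tag, value) input format, and the sample record are all assumptions made for the example.

```python
# Illustrative subset of a MARC-to-Dublin-Core crosswalk (unqualified DC).
# Verify every entry against the Library of Congress table before relying on it.
CROSSWALK = {
    "245": "dc:title",       # title statement
    "260": "dc:publisher",   # publication information
    "088": "dc:identifier",  # report number, per Jewel's note (not dc:relation)
}


def marc_to_dc(marc_fields):
    """Map a list of (MARC tag, value) pairs to a dict of DC element -> values.

    Tags that are missing from CROSSWALK are skipped rather than guessed at.
    """
    dc = {}
    for tag, value in marc_fields:
        element = CROSSWALK.get(tag)
        if element is None:
            continue
        dc.setdefault(element, []).append(value)
    return dc


# Hypothetical record; the report number and title are made up.
record = [
    ("245", "An Example Working Paper"),
    ("088", "02-11-001"),
    ("999", "local note with no mapping"),
]
dc_record = marc_to_dc(record)
```

Keeping the mapping in a plain table makes it easy to review against the published crosswalk and to extend it one field at a time as the database conversion proceeds.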
