Metadata Guidelines for Collections using CONTENTdm
Table of Contents
1. How metadata is used in CONTENTdm
2. Basic decisions about metadata: Thinking about your collection and how it will be used
3. Formatting data: The importance of consistency and standards
4. Setting up CONTENTdm field properties: Includes an introduction to Dublin Core mapping
5. "Flattening complex reality": Keeping it simple
6. Field properties table: Specific advice about choosing field names, mapping to Dublin Core, formatting data, and choosing controlled vocabularies
A CONTENTdm collection contains digitized images, texts, videos, or other formats. Each of these digital resources has a description (or "metadata") attached to it. The description is displayed with the resource, and the data in it can also be used to search your collection, either on its own or in combination with other collections.
Description: What kind of information do you need to describe each resource? What do your users need to know about what the resource is, where it came from, who created it, what its significance is? How much detail do you need to go into?
Retrieval: How will users find resources in your collection? What will they be looking for? What aspects will they be interested in? At what level do you need to distinguish one resource from another, and at what level do you want to bring like resources together?
Using standards for inputting your data is very important. Standards ensure consistency, which
- increases coherence and intelligibility of description
- enhances reliability of retrieval
- enables compatibility with other collections (cross-database searching)
- makes maintenance and possible migration of data easier
Data should be formatted in a standard way. Which format you choose may matter less than always using the same format for data in the same field.
- In a field called "Date" make sure that dates are always formatted in the same way.
- In a field called "Photographer" the same person's name should always appear in the same form.
- Similarly, resources about the same topic should be described with the same term. For example, a user looking for images of retail stores using the field "Subject" should be able to do a single search to find all the relevant images. If different terms are used, the user may not even realize that more than one search is necessary.
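The consistency rule for a field such as "Date" can be checked mechanically before data is loaded. The following is a minimal sketch, not a CONTENTdm feature: the pattern names, the regular expressions, and the sample values are all assumptions chosen for illustration. It counts how many values in one field follow each format, so that a field with more than one format in use stands out.

```python
import re
from collections import Counter

# Hypothetical date formats; adjust to whatever standard your collection adopts.
DATE_PATTERNS = {
    "YYYY-MM-DD": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "YYYY": re.compile(r"^\d{4}$"),
    "Month D, YYYY": re.compile(r"^[A-Z][a-z]+ \d{1,2}, \d{4}$"),
}

def classify_date(value):
    """Return the name of the first pattern the value matches, or 'other'."""
    cleaned = value.strip()
    for name, pattern in DATE_PATTERNS.items():
        if pattern.match(cleaned):
            return name
    return "other"

def date_format_report(values):
    """Count how many values fall under each recognized format."""
    return Counter(classify_date(v) for v in values)

# Illustrative values as they might appear in a "Date" field.
dates = ["1923-05-01", "May 1, 1923", "1923", "1924-11-30"]
report = date_format_report(dates)
print(report)
```

A report with more than one key means the field mixes formats and should be cleaned up before import.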
This is where a "controlled vocabulary" or "authority file" can be useful. A standard list of authorized terms can eliminate the ambiguity that arises from synonymous terms, homonyms, variant spellings and other pitfalls. Controlled vocabularies already exist for many subject areas and disciplines, or you can create your own standardized list of terms if it is reasonably short and you need something very specialized for your collection. Either way, with a controlled vocabulary you don't have to monitor your own consistency as you input metadata: adhering to the list will in itself create the consistency you need. This is especially useful if more than one person will be inputting metadata in your collection.
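One way to picture a controlled vocabulary is as a lookup from every known variant to a single authorized term. The sketch below assumes a small, locally created list. The terms, the variants, and the function names are hypothetical and are not taken from any published vocabulary or from CONTENTdm itself.

```python
# Hypothetical authority file: authorized term -> known variant forms.
AUTHORITY = {
    "Stores, Retail": ["retail stores", "shops", "storefronts"],
    "Bridges": ["bridge", "footbridges"],
}

def build_lookup(authority):
    """Map each authorized term and each variant (lowercased) to the authorized term."""
    lookup = {}
    for term, variants in authority.items():
        lookup[term.lower()] = term
        for variant in variants:
            lookup[variant.lower()] = term
    return lookup

LOOKUP = build_lookup(AUTHORITY)

def authorize(term):
    """Return the authorized form of a term, or None if it is not in the vocabulary."""
    return LOOKUP.get(term.strip().lower())

print(authorize("Shops"))   # a variant resolves to its authorized term
print(authorize("Dogs"))    # an unknown term is flagged with None
```

Returning None for unknown terms, rather than passing them through, forces a decision: either add the term to the list or pick an existing one.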
You can set up your metadata fields in the CONTENTdm Server Administration module under "View/edit collection field properties." CONTENTdm allows you to:
- have as many fields in the description as you want
- create your own field names
- decide whether each field will be searchable or will display
- put the fields in any order you want
- make fields available for cross-database searching
CONTENTdm can search multiple collections at once. To do this, CONTENTdm uses an underlying mapping to simple Dublin Core (DC) elements to create a crosswalk between similar fields that have different field names in different collections. (The Dublin Core is an internationally agreed-upon basic metadata scheme that defines 15 general descriptive elements, for example, Creator, Title, Date, Subject, Publisher.) You may map each field in your collection to a corresponding Dublin Core element. You can also choose not to map certain fields to any DC element, either because the fields do not fit well into the DC schema or because you do not want to make them available for cross-database searching.
Example: The fields in the table below are from different databases, and each represents the name of a person (or organization) involved in the creation of a resource. Because all of these fields have been mapped to the Dublin Core element "Creator", a cross-database search in the field "Creator" will retrieve the appropriate resources from whichever collection they are in, no matter what the collection-specific field name is.
Collection | Collection-Specific Field Name | DC Mapping |
---|---|---|
Collection A | Architect | Creator |
Collection A | Photographer | Creator |
Collection B | Author | Creator |
Collection C | Person Interviewed | Creator |
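The crosswalk in the table can be sketched as a small lookup structure. This is an illustration of the idea only, not a description of how CONTENTdm implements searching internally. The field-to-element mappings come from the table above, while the sample records, names, and the `search_dc` function are hypothetical.

```python
# Collection-specific field names mapped to Dublin Core elements (from the table).
DC_MAPPING = {
    "Collection A": {"Architect": "Creator", "Photographer": "Creator"},
    "Collection B": {"Author": "Creator"},
    "Collection C": {"Person Interviewed": "Creator"},
}

# Hypothetical records; real records would come from each collection.
RECORDS = [
    ("Collection A", {"Title": "City Hall", "Architect": "Smith, Jane",
                      "Photographer": "Lee, Sam"}),
    ("Collection B", {"Title": "Memoirs", "Author": "Smith, Jane"}),
    ("Collection C", {"Title": "Oral history", "Person Interviewed": "Doe, Pat"}),
]

def search_dc(element, query):
    """Find records where any field mapped to the given DC element contains the query."""
    hits = []
    for collection, record in RECORDS:
        mapping = DC_MAPPING.get(collection, {})
        for field, value in record.items():
            # Unmapped fields (here, "Title") are skipped by this comparison.
            if mapping.get(field) == element and query.lower() in value.lower():
                hits.append((collection, field, record["Title"]))
    return hits

for hit in search_dc("Creator", "smith"):
    print(hit)
```

A single query on "Creator" finds the name whether it was entered as an architect in one collection or as an author in another.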
"By 'pretending' that a cross-section of resources is uniformly simple we thereby make it possible to search for them in a simple manner."
--Carl Lagoze, Accommodating Simplicity and Complexity in Metadata, 2000
CONTENTdm's database structure is currently flat. Structurally, there is no way to distinguish between metadata for different physical manifestations of a resource, for example, between the original object, the photograph of the object, and the digitized scan of the photograph.
The UW Libraries has not attempted to follow a strict 1:1 correspondence between metadata and the particular manifestation of the resource. Whatever information seemed important for users of a particular collection was included in the metadata. For example, in a collection of photographs of buildings, both the photographer and the architect are important for searching, so both fields were included and both were mapped to the underlying Dublin Core element "Creator". The name of the person who did the scanning was not considered significant and was completely left out.
To set field properties in CONTENTdm, use the Server Administration module, and select "View/edit collection field properties."
Shown below are the default values for field properties as they appear in the CONTENTdm Server Administration module. Remember, the field properties as they originally appear in the Administration module are just a starting point: you can add, delete, and reorder the fields in any way without affecting searching within the collection or across multiple collections. (It is the DC mapping that controls searching across multiple collections, not the order of the fields.)
We have added extra explanatory information to the sample table below. Click on a field property (headers at the top) or on a field name to see advice about how to use the field.
You can see a basic template for a metadata application profile or data dictionary here. You can also see examples of how other CONTENTdm collections at the UW Libraries have set up their metadata by looking at their data dictionaries. We recommend recording all metadata decisions about your collection in a data dictionary, which would have much more detail than the CONTENTdm field properties table can contain. For instance, in CONTENTdm administration of field properties, there is no place to record decisions about formatting standards, but this can be recorded in your data dictionary.
Field name | DC mapping | Data type | Big field | Searchable | Hidden | ControlVoc |
---|---|---|---|---|---|---|
Title | Title | Text | No | Yes | No | No |
Subject | Subject | Text | No | Yes | No | No |
Description | Description | Text | Yes | Yes | No | No |
Creator | Creator | Text | No | No | No | No |
Publisher | Publisher | Text | No | No | No | No |
Contributors | Contributors | Text | No | No | No | No |
Date | Date | Text | No | No | No | No |
Type | Type | Text | No | No | No | No |
Format | Format | Text | No | No | No | No |
Identifier | Identifier | Text | No | No | No | No |
Source | Source | Text | No | No | No | No |
Language | Language | Text | No | No | No | No |
Relation | Relation | Text | No | No | No | No |
Coverage | Coverage | Text | No | No | No | No |
Rights | Rights | Text | No | No | No | No |
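A data dictionary of the kind recommended above can also be kept in a structured form alongside the field properties. The sketch below is one possible layout, assuming a hypothetical `FieldEntry` record: its first seven attributes mirror the columns of the table, and `formatting_note` holds the formatting decision that the field properties table has no place for. The sample entries and their notes are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class FieldEntry:
    """One row of a hypothetical data dictionary."""
    field_name: str
    dc_mapping: str               # Dublin Core element; "" means unmapped (assumption)
    data_type: str = "Text"
    big_field: bool = False
    searchable: bool = False
    hidden: bool = False
    controlled_vocab: bool = False
    formatting_note: str = ""     # formatting standard chosen for this field

# Illustrative entries; the formatting notes are examples, not recommendations.
DATA_DICTIONARY = [
    FieldEntry("Photographer", "Creator", searchable=True, controlled_vocab=True,
               formatting_note="Last name, First name"),
    FieldEntry("Date", "Date", searchable=True, formatting_note="YYYY-MM-DD"),
    FieldEntry("Subject", "Subject", searchable=True, controlled_vocab=True),
]

def fields_missing_format_note(entries):
    """List field names for which no formatting decision has been recorded."""
    return [entry.field_name for entry in entries if not entry.formatting_note]

print(fields_missing_format_note(DATA_DICTIONARY))
```

Listing the fields without a recorded formatting decision is a quick way to see which parts of the data dictionary still need attention.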