The ECAI Metadata Issues:
a summary
Dr T. Matthew Ciolek,
Research School of Pacific and Asian Studies,
Australian National University, Canberra ACT 0200, Australia
tmciolek@coombs.anu.edu.au
http://www.ciolek.com/PEOPLE/ciolek-tm.html
notes presented at the
3rd Electronic Cultural Atlas Initiative (ECAI) Workshop,
Villa Bosch, Heidelberg, Germany,
29-30 June 1998
Document created: 19 Aug 1998. Last revised: 27 Aug 1998
0. Introduction
The following is a set of notes I used on Mon 29th June 1998
during the ECAI Roundtable Discussion on "Metadata -
Standards and Practices". The session was organised and moderated by Dr
Helen Jarvis (h.jarvis@unsw.edu.au), School of Information,
Library and Archive Studies, U. of New South Wales, Sydney, Australia.
They are presented here, subject to subsequent refinements and
changes, in the hope of aiding and stimulating further
discussion on the ECAI standards and methodologies.
You are warmly invited, therefore, to
send your comments and suggestions to
tmciolek@coombs.anu.edu.au.
1. Metadata Definitions
The term "metadata" in its most abstract form means "a set
of information which remains in some intentional,
hierarchical relationship with another set of information", or better still
"vital intelligence about some piece of information."
In other words, metadata is "summary data about some other
data", or "a map, an overview, a careful
distillation of the essence of some larger unit of information".
Metadata come in many shapes and flavours.
They constitute essential ingredients of multifaceted apparatus
dedicated to efficient distribution, storage, tracing and retrieval of
publicly accessible information. Metadata can be of many types:
- Pointer to the information (e.g. bibliographic record, ISBN, a
review, index)
- by using it you know where to find the required
item of information;
- Door to the information (e.g. URL, a phone number)
- by using it you access, get hold of the required item of information;
- Navigation marker (e.g. chapter name, page number)
- by using it you know where you are within the body of information;
- Information decoder (e.g. scale and a legend on a map; a
glossary of terms, a list of acronyms)
- by using it you know the range and significance of the various items of
information you are working with;
- Content indicator (e.g. keywords list, abstract, TOC)
- by using it you establish in advance what the body of information is most likely to
be about;
- Structure indicator (e.g. TOC)
- by using it you establish the overall organisation as well as
the component elements of the body of information in question;
- Provenance indicator (e.g. author, date, publisher details)
- by using it, you figure out the geographical, social and temporal
context of a given body of information;
- Function indicator (e.g. labels such as "map", "dictionary", "statistical table";
"bibliography"; "album of photographs")
- by using it you determine the kind of data [numbers, text,
images, graphs, sound etc.] you are likely to encounter and the
type of operations you can perform on them [browse, read, view,
analyse, recalculate, compare, chart and so forth];
- Quality indicator (e.g. publisher's logo, author's
institutional affiliation, date of publication, ranking on a
bestsellers' list)
- by using it you can form an opinion with regard to the information's
objectivity, competence, truthfulness, accuracy, timeliness, elegance of presentation
and so forth (see also Ciolek 1996);
- Attention grabber (e.g. recommendations, blurbs, reviews)
- by using it you may get used by it (i.e. do, nolens volens, the opinionmaker's bidding).
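The ten types above can be sketched as a small data structure. The following is only an illustrative model, not part of the original notes: the names MetadataType, LISTED_UNDER, types_of and multi_role_items are hypothetical, and the table holds only a handful of the example items quoted in the list. It does show one thing the list implies, namely that a single item (a TOC, a review) may be listed under more than one type.

```python
from enum import Enum


class MetadataType(Enum):
    """The ten metadata types named in the list above."""
    POINTER = "Pointer"
    DOOR = "Door"
    NAVIGATION_MARKER = "Navigation marker"
    INFORMATION_DECODER = "Information decoder"
    CONTENT_INDICATOR = "Content indicator"
    STRUCTURE_INDICATOR = "Structure indicator"
    PROVENANCE_INDICATOR = "Provenance indicator"
    FUNCTION_INDICATOR = "Function indicator"
    QUALITY_INDICATOR = "Quality indicator"
    ATTENTION_GRABBER = "Attention grabber"


# A few example items taken from the list, each mapped to the type(s)
# under which the list mentions it. The selection is illustrative only.
LISTED_UNDER = {
    "ISBN": {MetadataType.POINTER},
    "URL": {MetadataType.DOOR},
    "page number": {MetadataType.NAVIGATION_MARKER},
    "glossary of terms": {MetadataType.INFORMATION_DECODER},
    "abstract": {MetadataType.CONTENT_INDICATOR},
    "TOC": {MetadataType.CONTENT_INDICATOR, MetadataType.STRUCTURE_INDICATOR},
    "review": {MetadataType.POINTER, MetadataType.ATTENTION_GRABBER},
    "publisher's logo": {MetadataType.QUALITY_INDICATOR},
}


def types_of(item):
    """Return the set of types an example item is listed under (empty if unknown)."""
    return LISTED_UNDER.get(item, set())


def multi_role_items():
    """Return, in sorted order, the items that appear under more than one type."""
    return sorted(item for item, types in LISTED_UNDER.items() if len(types) > 1)


print(multi_role_items())
```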
2. Metadata Functions
The above analysis suggests that there are, in fact, six (6) systematically related
functions metadata can perform in relation to a given set of information:
- Find - metadata help to locate a possibly relevant subset of data within the global information soup;
- Focus - metadata help to eliminate uninteresting alternative materials;
- Preview - metadata help to see a sample, a specimen of the interesting material;
- Access - metadata help to connect to and download the complete information resource;
- Use - metadata help to navigate across and smoothly process the obtained material;
- Interpret - metadata provide the context for drawing correct inferences from the
obtained material; they reveal and establish possible
relationships between a given chunk of information and other informational chunks.
Not all metadata tags and annotations, of course, are capable of
performing all these six functions by themselves. Hence the
existence of multiple, specialised sets of metadata scattered
both within and without the resource in question.
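The point that no single metadata set performs all six functions can be illustrated with a short sketch. Everything in it is an assumption made for the sake of the example: the names Function, METADATA_SETS and missing_functions are hypothetical, and the assignment of functions to each metadata set is a plausible guess rather than something stated in these notes.

```python
from enum import Flag, auto


class Function(Flag):
    """The six metadata functions listed above."""
    FIND = auto()
    FOCUS = auto()
    PREVIEW = auto()
    ACCESS = auto()
    USE = auto()
    INTERPRET = auto()


# ASSUMPTION: these assignments are illustrative only.
METADATA_SETS = {
    "library catalogue record": Function.FIND | Function.FOCUS,
    "abstract": Function.FOCUS | Function.PREVIEW,
    "URL": Function.ACCESS,
    "table of contents": Function.USE,
    "author and date details": Function.INTERPRET,
}


def missing_functions(set_names):
    """Return the names of the functions that the given metadata sets do not cover."""
    covered = Function(0)
    for name in set_names:
        covered |= METADATA_SETS[name]
    return [f.name for f in Function if f not in covered]


# A URL alone only gives access; the five sets together cover all six functions.
print(missing_functions(["URL"]))
print(missing_functions(list(METADATA_SETS)))
```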
3. Metadata Usability
The usability of the metadata is related to the question of the intended primary
'audience':
- Human beings alone (the system relies on the presence of human hand/eye)
- simple metadata annotations - a non-specialist can read and act upon them
(e.g. a book cover, a hypertext link);
- complex metadata annotations - mainly for specialists (e.g. a library card).
Usability by humans is affected by the language (English, Finnish, Arabic etc.)
of the metadata and by the extent to which the metadata are codified and concise.
- Computers (the system relies on the presence of digital technologies)
- machines alone - the metadata are in a format (e.g. a binary code, or a highly structured
query language) meaningless to a human being;
- machines primarily - the metadata are geared to digital processing;
however, they remain meaningful to a human reader.
4. Metadata Locus
There are two major types of metadata in terms of their placement with respect
to the resource they annotate:
- Internal - metadata are an integral part of the resource itself. It is not
possible to retrieve the resource without retrieving the metadata (e.g. a table of contents,
or an index of a book);
- External - metadata point at the resource but are not attached to it. It is
possible to access a resource without ever stumbling upon a metadata set that pertains to it
(e.g. by purchasing in a shop a videotape in a 'plain manila' wrapping).
Usually, information resources have at least one internal set of metadata
and several external ones, the existence of which may or may not be known to the
authors of the resource in question.
On the whole, 'Pointers', 'Doors', 'Content indicators',
'Provenance indicators', 'Function indicators', 'Quality
indicators', and 'Attention grabbers' tend to be located outside
the information resources they are dealing with, whereas
'Navigation markers', 'Information decoders', and 'Structure
indicators' reside mainly within such resources.
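The placement tendencies just described can be written down as a simple lookup table. This is a minimal sketch: the names TYPICAL_LOCUS and by_locus are hypothetical, and the table records only where each type "tends to be located" according to the paragraph above, not a strict rule.

```python
# Typical placement of each metadata type, as stated in the paragraph above.
# "external" = located outside the resource; "internal" = resides within it.
TYPICAL_LOCUS = {
    "Pointer": "external",
    "Door": "external",
    "Navigation marker": "internal",
    "Information decoder": "internal",
    "Content indicator": "external",
    "Structure indicator": "internal",
    "Provenance indicator": "external",
    "Function indicator": "external",
    "Quality indicator": "external",
    "Attention grabber": "external",
}


def by_locus(locus):
    """Return the metadata types whose typical placement matches `locus`."""
    return [name for name, where in TYPICAL_LOCUS.items() if where == locus]


print(by_locus("internal"))
print(len(by_locus("external")))
```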
Metadata live in a symbiotic relationship with the information they
pertain to. The information 'benefits' from being infused with
and surrounded by various chunks of metadata: it gets 'more intensive' usage
or a 'better' (i.e. more relevant) clientele. The metadata,
in turn, cannot 'live' (i.e. be established and used on a regular
basis) without the information they track, distil and herald.
5. Metadata Standards
There are at least three ways of looking at the standards:
- Degree of their formalisation
examples => recommendations/guidelines => requirements => standards => laws & regulations
- Degree of their permanency
ad hoc => formalised => fixed
For instance, the custom of explicitly stating a web page's date of last update,
or its URL, is still an 'ad hoc' arrangement. By contrast, the custom of annotating
a printed book with the ISBN, the date of its publication, as well as the location and name
of the publisher is, in the late 20th century, a well entrenched and 'fixed' arrangement.
- Degree of their ubiquity and acceptance
local site electronic => multiple sites electronic => paper technical => paper popular => oral tradition
6. Metadata Production
Metadata need to be talked about, understood and appreciated. Above all, however,
they need to be put into regular, systematic, daily practice. The production of
metadata sets involves taking the following steps:
- Establishment - the properly structured metadata need to be generated simply, speedily
and almost effortlessly. Creators of documents and data sets need to enjoy adding the metadata descriptors
to their electronic and paper publications;
- Propagation - the news of a new/improved paper/electronic document needs to percolate
very quickly to all nooks and crannies of a busy and information-overloaded scholarly community;
- Registration - there needs to be an authoritative, comprehensive and up-to-date online catalogue
of all resources pertaining to a given project, run effectively (24 hours a day, 7 days a week) and efficiently;
- Updates/Correction - any changes
to once established metadata need to be created, implemented, propagated and
registered equally easily and speedily.
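The four steps above might be sketched, very roughly, as a tiny in-memory register. This is an illustration under stated assumptions, not a description of any existing ECAI software: the names Record, Register, subscribe, register, update and lookup are all hypothetical, and propagation is reduced to calling whatever callbacks have subscribed. The sample record uses this document's own URL, title and dates.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    """A minimal metadata record (Establishment)."""
    identifier: str   # e.g. the URL of the resource
    title: str
    revised: str      # date of creation or last revision, kept as plain text


class Register:
    """A central catalogue of records (Registration) that announces changes (Propagation)."""

    def __init__(self):
        self._records: Dict[str, Record] = {}
        self._subscribers: List[Callable[[Record], None]] = []

    def subscribe(self, callback):
        """Ask to be notified whenever a record is registered or updated."""
        self._subscribers.append(callback)

    def register(self, record):
        """Store the record under its identifier and notify all subscribers."""
        self._records[record.identifier] = record
        for notify in self._subscribers:
            notify(record)

    def update(self, identifier, **changes):
        """Apply changes to an existing record, then re-register it (Updates/Correction)."""
        updated = replace(self._records[identifier], **changes)
        self.register(updated)
        return updated

    def lookup(self, identifier):
        return self._records[identifier]


announcements = []
catalogue = Register()
catalogue.subscribe(announcements.append)

paper = Record(
    identifier="http://www.ciolek.com/PAPERS/heidelberg-jun98.html",
    title="The ECAI Metadata Issues: a summary",
    revised="19 Aug 1998",
)
catalogue.register(paper)
catalogue.update(paper.identifier, revised="27 Aug 1998")
print(catalogue.lookup(paper.identifier).revised)
```

In this sketch an update travels through the same path as a first registration, which is one way of making changes "equally easy and speedy" to propagate and register.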
7. Metadata - Unresolved Issues
8. Recommended ECAI Strategies
[this section still needs to be revised and beefed up - tmc]
The above notes suggest a preferred course of action for the ECAI
group. Clearly, the metadata to be used in the context of the ECAI
documents and data sets need to:
- be a simple system, easy to create/use/modify (could the automated
Dublin Core metadata production services of Lund University
(Koch 1997, Hakala 1997, 1998) be of relevance here?);
- be useful, meaningful and attractive to all ECAI researchers and associates;
- be useful in the context of both paper-based and electronic resources;
- provide full functionality (Find; Focus; Preview; Access; Use; Interpret capabilities);
- be a dual system (easily legible by both machines and human beings);
- be universally legible & decipherable, despite existing cultural and linguistic barriers
(could the automated translation
services of AltaVista (Digital 1998) be of relevance here?);
- take advantage of software for the automated propagation amongst the ECAI community
of a new/modified set of metadata descriptors;
- take advantage of software for the automated registration
of a new/modified set of metadata descriptors with the central ECAI register of
resources.
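As an illustration of a "dual system" legible to both machines and people, the sketch below turns a few Dublin Core elements into HTML meta tags. It is not the Lund University service mentioned above and makes no claim about how that service works. The function name dublin_core_meta is hypothetical, the "DC." name prefix is assumed here as a common convention for embedding Dublin Core in HTML, and the sample values are taken from this document's own header (the date is the last-revised date rewritten as 1998-08-27).

```python
from html import escape


def dublin_core_meta(fields):
    """Render a mapping of Dublin Core element names to values as HTML <meta> tags."""
    lines = []
    for element, value in fields.items():
        # Escape the value so that quotes and ampersands cannot break the attribute.
        lines.append(
            '<meta name="DC.{}" content="{}">'.format(element, escape(value, quote=True))
        )
    return "\n".join(lines)


record = {
    "title": "The ECAI Metadata Issues: a summary",
    "creator": "T. Matthew Ciolek",
    "date": "1998-08-27",
    "identifier": "http://www.ciolek.com/PAPERS/heidelberg-jun98.html",
    "language": "en",
}

print(dublin_core_meta(record))
```

The resulting tags are short enough for a person to read in the page source, yet regular enough for a harvesting program to parse, which is the kind of dual legibility the list above asks for.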
9. References
Maintainer: Dr T.Matthew Ciolek (tmciolek@ciolek.com)
Copyright (c) 1998 by T.Matthew Ciolek. All rights reserved. This Web page may be freely linked
to other Web pages. Contents may not be republished, altered or plagiarized.
URL http://www.ciolek.com/PAPERS/heidelberg-jun98.html