Date:      Wed, 10 Jul 1996 18:25:26 -0500 (EST)
From:      John Fieber <jfieber@indiana.edu>
To:        Wolfram Schneider <wosch@cs.tu-berlin.de>
Cc:        doc@FreeBSD.org
Subject:   Re: FYI: IDML
Message-ID:  <Pine.BSF.3.94.960710174526.4397D-100000@Fieber-John.campusview.indiana.edu>
In-Reply-To: <199607091430.QAA24241@caramba.cs.tu-berlin.de>

On Tue, 9 Jul 1996, Wolfram Schneider wrote:

> http://www.identify.com/welcome/idml-faq.html

<RANT>

EEEeeeeeewwwwww!!  This makes my stomach turn.  Not only is it a
brain-damaged application of SGML, it amounts to nothing more
than a database with a fixed set of fields that are woefully
inadequate for describing much of anything useful.

Just consider the SUBJECT attribute.  First, it specifies "no
more than three (3)".  Well, I'm sorry, but that's a pretty lame
restriction to place on someone categorizing something and, the
way they set it up, there is no way to enforce the rule.  SGML
could enforce it *if* they bothered to use SGML properly.  If
that isn't enough, their pre-defined subject categories are an
utter insult.  The LC subject headings take up 4 large
volumes, each about 4 inches thick with small print, and even they
can't begin to capture the many subtleties required in distinguishing
entities.  Then you have things like the National Library of
Medicine subject headings, a 3 inch thick fine-print listing of
subject categories just within the field of medicine!  And these
identity people think that a couple hundred headings are
sufficient for everything anyone would want to put on the
internet?
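
On the point above that SGML could enforce the three-subject limit:
one way (an assumption here, not something the IDML FAQ describes)
would be to make each subject an element and cap the count in the
content model.  The sketch below imitates such a content model with a
regular expression over the sequence of child element names.  The
element names ID-INFO and SUBJECT, and the function subjects_conform,
are hypothetical.

```python
import re

# Hypothetical SGML content model this sketch imitates (illustrative
# names, not taken from the IDML DTD):
#   <!ELEMENT ID-INFO - - (SUBJECT, SUBJECT?, SUBJECT?)>
# Each child element name is followed by one space in the encoded sequence.
CONTENT_MODEL = re.compile(r"(SUBJECT )(SUBJECT )?(SUBJECT )?")


def subjects_conform(children):
    """Return True if the children are one to three SUBJECT elements."""
    sequence = "".join(name + " " for name in children)
    return CONTENT_MODEL.fullmatch(sequence) is not None


print(subjects_conform(["SUBJECT", "SUBJECT", "SUBJECT"]))
print(subjects_conform(["SUBJECT"] * 4))
```

A validating parser would do this check from the DTD itself; a rule
stated only in prose, as the limit is here, gives the parser nothing
to check.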

Okay, then look at the LOCATION and LANGUAGE attributes.  They
too have severely limited canned lists of countries, place names
and languages.  The US Geological Survey geographic names database
takes up a whole CD-ROM with millions of entries just for the
United States.  The Library of Congress list of language codes for
use in MARC records is much longer than the ISO list they use.  What
about these other languages?

But wait, there is more!  What about the KEYWORDS attribute?
Isn't this somewhat redundant with the SUBJECT field?  I'm not
aware of any study that shows searching an uncontrolled
keyword vocabulary as being any more effective than free text
searching.

If you want to look at some *useful* discussion of metadata
standards, look at http://www.nlc-bnc.ca/ifla/II/metadata.htm.
In particular, the Dublin Core has a proposal for HTML files that
uses a slightly modified <META> tag and is much better thought
out than this IDML crap.  Library science has been researching
this sort of thing for decades and there is plenty of sound
research and literature on the subject.

Just say NO to IDML!

Oh, and by the way, an underscore (_) is NOT permitted in a tag
name (e.g. <ID_INFO>) or attribute name (e.g. STREET_ADDRESS)
according to the reference standard SGML declaration, or the SGML
declaration used by HTML.  So much for being HTML compatible....
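
The underscore complaint can be checked mechanically.  A minimal
sketch, assuming the name rules of the reference concrete syntax are
"a letter first, then letters, digits, '.' or '-'".  The function
name is_valid_sgml_name is made up for illustration, and the sketch
deliberately ignores the name length limit (NAMELEN), which differs
between SGML declarations.

```python
import re

# Assumed name rules: the first character is a letter; later characters
# are letters, digits, "." or "-".  Underscore is not in either set.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9.\-]*")


def is_valid_sgml_name(name):
    """Return True if name uses only the assumed SGML name characters."""
    return NAME_RE.fullmatch(name) is not None


for candidate in ("ID_INFO", "STREET_ADDRESS", "ID-INFO"):
    print(candidate, is_valid_sgml_name(candidate))
```

Swapping the underscore for a hyphen, as in ID-INFO, is enough to
pass this particular check.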

</RANT>

-john

== jfieber@indiana.edu ===========================================
== http://fallout.campusview.indiana.edu/~jfieber ================



