Thursday, June 01, 2006

Major changes in LC cataloging procedures

My cataloger forwarded an eloquent, if alarming, description of changes LC has implemented today, June 1, that may have long-lasting effects on our ability to support high-quality scholarship in the future.

The recent announcement of LC's decision concerning series authority
records has given rise to considerable discussion about a number of issues
which have concerned me for many years. The comments which I wish to make
are, like many recent posts, intended to relate this single issue (series
authorities decisions at one library) to more fundamental issues that
involve us all. These are: 1) nature of cooperation and resource sharing;
2) successful bibliographic (re)search; 3) purpose and goals of cataloging;
4) direction and manner of change.

1) Cooperation. It is a fundamental principle of cooperation that all
parties involved do their share of the work. A cooperative agreement in
which one partner (e.g. LC) or a few is/are expected to do all or most of
the work succeeds only so long as those partners continue doing the work in
a satisfactory manner. The decision to cease providing series records is
simply the latest evidence that the success of our cooperative agreements
depends upon the reliability of the cooperating institutions. The decisions
of OCLC to allow vendor records and Cornell's below minimal level records
should have sparked an even stronger opposition. Since most libraries have
assumed the contributions of LC and all others, staff have been reduced to
minimum (or lower) and now there is simply no possibility of picking up the
extra work load. Predictions concerning shared cataloging led many
libraries for many years to eliminate or not fill cataloger positions in
the expectation that our cataloging will be done by someone at another
institution. I think that everyone knows at this point that some (how
many?) PCC libraries will follow LC regardless of the official PCC stance
because they have no adequate staff to take up what LC formerly did. LC
knows this too.

In order for cooperation to work for all involved, the nature, purpose and
goals of the work must be clearly understood and affirmed by all
participants. These have been clearly stated in a number of public policy
documents and it is toward the satisfaction of these publicly stated
professional goals that most catalogers orient their work. Yet in a shared
database such as OCLC the enormous diversity of participating institutions
creates problems for determining mutually acceptable goals. Vendors need
order records, not research tools. Public libraries cater to a user
community which differs markedly from the academic users of a research
library. Rare books require a significantly greater attention to
description than current academic monographs, for which subject analysis is
more important. Some members of the profession have an overriding concern
with user perceptions of the ease of searching (everything through a single
box Google style) and are satisfied with "getting information"; the model
of the user for these librarians is the high school or undergraduate
student who is required to "include a bibliography of at least 5 sources
excluding encyclopedias". Others more concerned with the thorough and
intelligent searching which serious scholarship requires argue for detailed
subject indexing of all materials which alone will support cross-language
subject searching. In a database created and maintained by participants
with so many conflicting understandings of the goals of the information
created and entered into the database, these local and conflicting goals
must be pursued locally, which of course requires local staff.

In a cooperative database serving such disparate institutional needs, the
option for local control of local demands depends upon local staffing;
where staff are too few in number or lacking the requisite range of skills,
the only options are either following whatever everybody else does or the
abandonment of any evaluation for locally appropriate and adequate information.

2) Research. I use Google. A lot. But Ms. Marcum's insistence on the
sufficiency of and preference of researchers for Google over the library
catalog is tellingly illustrated with an account of how an undergraduate
now goes about writing a paper, even stressing that the student wants an A.
Ms. Calhoun's report to LC was based on a review of a small portion of the
last 5 years' work in the English language. To do research as described by Marcum or
published by Calhoun one does not need a library and certainly not a
librarian, whether cataloger or administrator. Nor does one need a library
catalog. However, some of us younger folk make much more rigorous demands
of scholarship, not only of the argumentation but for the literature search
as well. If everything were free and available with fulltext online, then
perhaps neither a library nor a librarian would be necessary. But
everything is not online and we must live in the actual present rather than
in someone's plans and dreams for the future. In the actual present most of
the world's published material is not available online anywhere, and much
of what is online is only available to subscribers. In this situation,
which is ours, we need to be able to find what is available to us here and
now in the library to which we have access. And when we find citations
anywhere, through whatever means, to relevant material which is not
available online, we need to be able to find out whether or not it is
available in or through our library and if so where and how to obtain it.
We can, of course, digitize everything we have (necessitating many
copyright violations) or at least our bibliographical information and make
it available for web search engines to index, thereby allowing or even
forcing everyone to use Google to find out what is in the library. Yet if
we do the latter (contribute bibliographic information), then someone has to
do that in some intelligible and searchable manner. If one wants to search
by author, the author has to be identified as an author, rather than the
title, and the same goes for series searching and every other kind of
specific (intelligent) searching.

The trouble with Marcum, Calhoun et al. is that they are arguing for
information seeking rather than research, and in this model, any
information found implies a successful search. The Google model of
information seeking is not a model of intelligent research; it is a model
of easy information gathering. For someone who wants "a bibliography",
Google works. For someone who wants "the bibliography", Google searching is
a wonderful starting point and a great last minute source for recent
materials, but if Google alone is used, it fails miserably. No matter what
the subject, one need only compare what a Google search retrieves with what
one can find in a good multilingual bibliography on the same topic to realize
the extent of Google's failure as a one-stop research method. Rather than
taking a survey to determine what "most people" do first or last or find
easiest, professional librarians in academic institutions ought to be
responsible for developing, implementing and teaching a variety of methods,
each of which may be appropriate for several different types of
search/research but not necessarily useful or adequate for others.

3) Cataloging. The simplest argument for the continued necessity of
cataloging is the demand for cataloging born-digital materials available
online. The next simplest argument is that there is no machine of any sort
which can take a book, article, painting, piece of music or computer file
and inform the searcher of the book's (computer file's, score's, etc.)
title, author, publisher, etc. Either these kinds of information about
every document must be entered into the machine in a machine-readable
fashion by someone who can interpret the document, or every item must be so
standardized that the interpretation of the particular significance of
every bit of text (sound, color, etc.) follows from its location in the
item. The vast number of hits in a Google search is due to the fact that
everything has the same significance in a Google search: one cannot search
for authors, composers, titles, publishers etc. in any other way than as
bits of text of no particular significance. Thus a Google search for Mongol
and Java retrieves 326,000 hits, while a search for Mongols and Java
retrieves 116,000. In contrast, a search in OCLC provides 10 hits, 9 of
which are of very limited interest or useless (e.g. maps of the Mongol
empire and general books on the Mongols), while a search in the Regenstein
Library catalog retrieves 2 books, the only two non-fictional books ever
written on the topic of the Mongol invasion of Java, neither of which is
available online, both of which have multilingual bibliographies.
Furthermore, only the more recently published book appears in the OCLC
search due to the poor subject analysis provided for the earlier book.
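
The contrast between undifferentiated text matching and searching on catalogued fields can be sketched in a few lines. This is only an illustration: the three records, the field names and both search functions are hypothetical, and nothing here reproduces how Google, OCLC or any library system actually works.

```python
# Hypothetical catalog records; titles, authors and subjects are invented.
RECORDS = [
    {"title": "The Mongol campaign in Java", "author": "Author A",
     "subjects": ["Mongols", "Java"]},
    {"title": "Programming in Java", "author": "Mongol, B.",
     "subjects": ["Computer programming"]},
    {"title": "Atlas of the Mongol empire", "author": "Author C",
     "subjects": ["Mongols", "Maps"]},
]


def keyword_search(records, terms):
    """Treat every record as one undifferentiated string of text."""
    hits = []
    for record in records:
        text = " ".join(
            [record["title"], record["author"], *record["subjects"]]
        ).lower()
        if all(term.lower() in text for term in terms):
            hits.append(record)
    return hits


def subject_search(records, subjects):
    """Match only against the subject field supplied by a cataloger."""
    wanted = {s.lower() for s in subjects}
    return [
        record for record in records
        if wanted <= {s.lower() for s in record["subjects"]}
    ]


keyword_hits = keyword_search(RECORDS, ["mongol", "java"])
subject_hits = subject_search(RECORDS, ["Mongols", "Java"])
print(len(keyword_hits), len(subject_hits))
```

In this toy data the keyword search also returns the programming manual, because an author's surname and a word in the title happen to match; the subject search can exclude it only because someone recorded what each book is about.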

Information technologies are helpless without information, and worthless if
misinformation is input. The Indiana University white paper The Future of
Cataloging recognizes this while the Calhoun report does not. One of the
chief defects of automated indexing and analysis is the high rate of
production of misinformation. In this respect, I have found that automatic
indexing produces records similar in quality to the below minimal records
produced by Cornell: very bad to totally useless. David Banush of Cornell
has so lovingly suggested in a posting to PCCPOL list that catalogers are
an aging and conservative group of secure and comfortable bureaucrats
"vigorously (sometimes stridently) defending the status quo, or even the
status quo ante" whose "prospects for long-term growth in a very dynamic
global information economy are dim." I would reply that the prospects for
long-term growth of an information economy without high quality information
are even dimmer.

The library and the information economy in which it is embedded is
portrayed in Marcum's papers, the Calhoun report, the California report and
even the Indiana University white paper as a technical system in which
there is no hint of the possibility of error and misinformation and how
that will affect the efficiency of the technical system itself. There is a
desperate need for library administrators to read in depth the ergonomic
literature on error and that on failure in organizations. (I recommend
Bogner, Human error in medicine; Dörner, The logic of failure; Hoc et al.,
Expertise and technology; Hollnagel, Cognitive reliability and error
analysis method; Kerdellant, Le prix de l’incompétence; Lagadec, La
civilisation du risque; Leplat and Terssac (eds.), Les facteurs humains de
la fiabilité dans les systèmes complexes; Merry and Smith, Errors, medicine
and the law; Morel, Les décisions absurdes; Reason, Human error; Silverman,
Critiquing human error; Rasmussen et al., New technology and human error;
Vestrucci, Modelli per la valutazione dell’affidabilità umana; Woods et al.,
Behind human error; Frese and Zapf (eds.), Fehler bei der Arbeit mit dem
Computer; and all of the writings on high reliability organizations of Karl
Weick.) I have dealt with these matters at length in Theory and practice of
bibliographic failure, and in a recent paper of mine ("Colorless green
ideals in the language of bibliographic description", available in the
articles in press section of the online edition of Language &
Communication) I looked at the library catalog as a communication system,
combining linguistic aspects of information retrieval and human error

4) Change. Perhaps the most disastrous and shortsighted aspect of policy
decisions such as minimal level records and the abandonment of series
authorities is the fact that future technological capabilities will
depend, as they do now, on the presence rather than the absence of
information in the record. As many commentators have remarked, automatic
authority checking both for correction of mistakes and for collocation in
the presence of variation ONLY works if the information is both transcribed
from the piece into the bibliographical record and the
correct/authorized/standard form established in an authority file. Without
that double aspect of bibliographic description and control, no automatic
error correction and no collocation will be possible with any software,
presently available or in the future. On these matters library administrators are
almost universally technologically ignorant and have absolutely no idea how
information technologies work and what are the minimum requirements for
their successful implementation. Almost every change demanded by the Karen
Calhouns and Deanna Marcums will ensure failure in the implementation of
any future technological developments.
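
The dependence of collocation on that double aspect can be shown with a minimal sketch. Everything in it is an assumption made for illustration: the series heading, its variant forms, the record layout and the function names are hypothetical, and no real authority file or library system is being described.

```python
# Hypothetical authority file: one authorized series heading and the
# variant forms that might be transcribed from the pieces themselves.
AUTHORITY_FILE = {
    "Studies in Example History": [
        "Studies in example history",
        "Stud. in example hist.",
        "SEH",
    ],
}


def normalize(text):
    """Lower-case and collapse whitespace so trivial differences match."""
    return " ".join(text.lower().split())


def build_variant_index(authority_file):
    """Map every known form, authorized or variant, to the authorized form."""
    index = {}
    for authorized, variants in authority_file.items():
        for form in [authorized, *variants]:
            index[normalize(form)] = authorized
    return index


def collocate(records, index):
    """Group records under authorized headings; report those that cannot be."""
    groups, unmatched = {}, []
    for record in records:
        transcribed = record.get("series")
        authorized = index.get(normalize(transcribed)) if transcribed else None
        if authorized is None:
            unmatched.append(record)
        else:
            groups.setdefault(authorized, []).append(record)
    return groups, unmatched


# Hypothetical bibliographic records.
records = [
    {"title": "Volume one", "series": "Stud. in example hist."},
    {"title": "Volume two", "series": "SEH"},
    {"title": "Volume three", "series": None},               # nothing transcribed
    {"title": "Volume four", "series": "Example studies"},   # no authority entry
]

groups, unmatched = collocate(records, build_variant_index(AUTHORITY_FILE))
```

Volumes one and two are brought together despite their differing transcriptions. Volume three fails because nothing was transcribed into the record, and volume four fails because no authority entry covers its form, so both halves of the work have to exist before the software can do anything.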

-There are vast differences in the expectations of the various users for
what the shared utilities should accept and provide (book vendors vs. PCC
Full Level standards).
-There are at least 2 completely different and mutually exclusive
understandings of what research is and what researchers require (Calhoun
vs. T. Mann).
-There are those who believe that information technologies do not require
structured information or that software and/or the market will provide
whatever is needed efficiently and adequately for research needs, and there
are others who believe that information technologies are marvelous tools
which require intelligent input and use, and that given the needs of
academic research, the necessary production of information for
technological manipulation and exploitation can only be successfully
accomplished by persons who share the intellectual backgrounds,
commitments and research activities of the academic community.
-There are those who believe that current promises of future technological
possibilities must be believed and no alternative futures may be
considered, and there are those who, cognizant of the preceding three
professional debates and dilemmas surrounding cooperation, research
practices and data quality, would reject all institutional action and
forecasting that has been narrowly determined by technological hopes and
David Bade
Joseph Regenstein Library
University of Chicago
