Wednesday, September 30, 2009

Library Catalogs Getting Serious Upgrade

The Chronicle of Higher Education offers a wonderful article in the Technology section of its Sept. 28, 2009 issue, also available online: "After Losing Users in Catalogs, Libraries Find Better Search Software."

...traditional online library catalogs don't tend to order search results by ranked relevance, and they can befuddle users with clunky interfaces. (snip)

That's changing because of two technology trends. First, a growing number of universities are shelling out serious money for sophisticated software that makes exploring their collections more like the easy-to-filter experience you might find in an online Sears catalog.

Second, Virginia and several other colleges, including Villanova University and the University of Rochester, are producing free open-source programs that tackle the same problems with no licensing fees.

A key feature of this software genre is that it helps you make sense of data through "faceted" searching, common when you shop online for a new jacket or a stereo system. Say you type in "Susan B. Anthony." The new system will ask if you want books by her or about her, said Susan L. Gibbons, vice provost and dean of Rochester's River Campus Libraries. Users can also sort by media type, language, and date.

These products can also rank search results by relevance and use prompts of "Did you mean … ?"

"It's sort of our answer to, Why it is you need a library when you have Google?" said Ms. Gibbons. "What this is going to do is show how much you've been missing."

It's a pressing issue. Libraries once had a monopoly on organizing data about content. No longer. And today some users gripe about how libraries present materials online: how scattered they are, how sluggish searches can be, and how often those searches are useful only if you already know exactly what you want.

The worry for Jennifer Bowen, assistant dean of the River Campus Libraries, is that library catalogs could become "marginalized."

"There are people who just cannot find what they need," she said. "And they're just sort of giving up on libraries."

The article points to a survey by Ithaka (the non-profit group referred to in Marie's post, just below this one), which specializes in promoting technology in higher education. The study found faculty members to be decreasingly dependent on the library for their research, and increasingly ambivalent about the value of libraries. Boy, is that bad news for libraries!

According to the article, librarians are now seeking to develop "Web-scale index searching," which is the new holy grail, taking the place of our last goal, "federated searching." Federated searching tried to pull all our collections together for a single-point search, but it skipped the centralized index and searched the individual databases separately. With the new "Web-scale index searching," there is still a single-point search of the entire collection, but, like Google, the system searches a centralized index built from all the databases, books, articles, and other digital objects. At this point, it is a goal, not a reality. But the idea is to break down the silos that make searching for information in the library such a frustration to our users.
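
For the technically inclined, the difference between the two approaches can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not how any of the products in the article actually work: the three "silos," their records, and the function names are all made up for the example.

```python
from collections import defaultdict

# Hypothetical silos: each maps a made-up record ID to a title.
SILOS = {
    "catalog": {"b1": "Cell Division in Plants", "b2": "Guitar and Flute Duets"},
    "articles": {"a1": "Advances in cell division research"},
    "digital": {"d1": "Susan B. Anthony letters"},
}


def federated_search(term):
    """Query every silo separately at search time, then merge the hits."""
    hits = []
    for silo, records in SILOS.items():
        for rec_id, title in records.items():
            if term.lower() in title.lower().split():
                hits.append((silo, rec_id))
    return sorted(hits)


def build_central_index(silos):
    """Harvest all silos once, ahead of time, into one inverted index."""
    index = defaultdict(set)
    for silo, records in silos.items():
        for rec_id, title in records.items():
            for word in title.lower().split():
                index[word].add((silo, rec_id))
    return index


def indexed_search(index, term):
    """Answer a query with a single lookup in the central index."""
    return sorted(index.get(term.lower(), set()))


INDEX = build_central_index(SILOS)
print(federated_search("division"))
print(indexed_search(INDEX, "division"))
```

Both searches find the same records here. The difference is where the work happens: the federated version visits every silo on every query, while the indexed version does the harvesting once and then answers each query from one place.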

The new products are coming online, and are called "next-generation catalogs" or "discovery interfaces." Some librarians consider them dumbed-down versions of the traditional catalog. The new search interfaces don't require the user to understand the difference between monographs and serials, for instance.

The article names several commercial products:

Encore, from Innovative Interfaces, adopted by at least 44 academic libraries in the United States, according to Mr. [Marshall] Breeding's tally [Mr. Breeding is director of innovative technology and research at the Vanderbilt University library]; AquaBrowser, from Medialab Solutions, used by 23 libraries; and Primo, from Ex Libris, adopted by 13 libraries.

How much institutions will have to pay for new commercial systems will vary depending on both what comes with the software and the size and complexity of the library. That could mean a price as low as $10,000 for a small academic library to one in the $100,000 range for a much larger one, Mr. Breeding said.

A 'Shift of Power'

In the open-source world, at least 10 academic libraries have turned to VuFind, which originated at Villanova. Virginia's Blacklight, with Stanford University as a development partner, is in a beta phase. And Rochester's eXtensible Catalog, or XC, backed by $1.2-million from the Andrew W. Mellon Foundation, will be rolled out in the spring.

The shift from commercial products to open-source ones is about more than money, though.

Bess Sadler, chief architect of the online library environment at the University of Virginia, sees the open-source Blacklight project as a "shift of power," as she wrote recently in the journal Library Hi Tech. The idea is that libraries, which know their local needs, should control the technology that patrons use to gain access to their collections. That's a change from the one-size-is-good-enough-for-everybody, commercially managed model that has prevailed in the industry.

The ability to customize is important when it comes to something like a music collection. A librarian might get this question: "I play the guitar. My boyfriend plays the flute. What duets can we play together?" In the past, even though Virginia had cataloged the instruments used in all of its sheet music, a search of that information was impossible because the fields that were indexed were maintained by a vendor, Ms. Sadler said.

"The problem with a vendor solution is that it's hard for vendors to tailor that solution for different collections, for different user populations, for different specializations," she said.

With an open-source system, a library can set its own relevance rankings and adjust them based on what users want. By maintaining the system itself, Virginia is now able to search by musical instrument.

The downside is libraries need someone on staff who can install and maintain the open-source program. So far, vendors aren't supporting products like VuFind the way they support established open-source products like Koha and Evergreen, both integrated library systems, said Mr. Breeding. Vendors will install software like Evergreen, host it on their own servers, and provide a help desk that you can call if something breaks. Not so for the newer software. Another barrier is going to be trusting that an open-source project is sustainable. There is always a concern that there will not be a community of users to keep developing it.

Also, the open-source systems have been slower to fold in article-level data, Mr. Breeding said. Most of that action is on the commercial side.

With Blacklight, you won't be able to get individual journal articles. If you're doing research on cell division, for example, a search will tell you that Virginia subscribes to the journal about cell division, but you'll have to go to a journal database for the article.

"That's going to be true for a very long time," Ms. Sadler said. "For the foreseeable future, you're going to need to go to separate interfaces in order to search licensed content."

But commercial vendors, smelling a new market, are stepping in. Serials Solutions, a subsidiary of ProQuest, released a software product in July called Summon. The company has been negotiating deals with publishers and content providers to create a searchable index of their content. It's like Google, except what Summon provides is an index of the "deep Web" of paid content. So now university libraries that pay for a subscription to Summon can let their users search their licensed content as well as locally owned stuff, together. Summon has 17 customers so far, including Arizona State University and Dartmouth College.

The catch? It can be expensive.

The hope is that libraries can recapture the markets that we are losing to Google and other search engines on the Web. We hope the better interfaces will help us make the most of the expensive databases and materials that we purchase for our users, making them more accessible and more searchable. But the problems to be solved are very tricky. We have to work with vendors and/or our own programmers, and whatever the solution, it will certainly cost a good deal. Either we will have to pay for a turn-key operation, or for in-house maintenance and tweaking. Perhaps we will have to pay for some of both.

But if we truly are able to make our catalogs as attractive and easy to use as Google, and as intuitive, that might be worth a good deal! It would be a reinvestment in the money already spent on the collections, and in the man-hours devoted over the decades to organizing the metadata that is our vast array of cataloging records. If we can afford it... I think it would be worth every penny.

1 comment:

Marie S. Newman said...

Thanks for blogging this extremely thought-provoking article, Betsy. You beat me to it!