Friday, September 07, 2012

Trawling The Invisible Web

Here is something I try to teach my students about searching the Internet. Librarians know that:

* It's not all on the Internet. AND

* You can't even find everything on the Internet with search engines.

Google (and Bing, Yahoo, etc.) generally can't search things that are in databases or tables on websites. That's a LOT of valuable data! There are also websites that don't allow crawling at all. This vast trove of really rich stuff is called the Invisible Web.
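For the curious, here is roughly how a site tells search engines to stay out. Well-behaved crawlers read a small file called robots.txt before they index anything. This is only a minimal sketch using Python's standard library: the robots.txt content, the example.org addresses, and the crawler name "ExampleBot" are all made up for illustration, and real sites and real search engines are more complicated.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only).
# It asks every crawler to stay out of the /reports/ folder.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /reports/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for address in ("https://example.org/about",
                    "https://example.org/reports/2012.pdf"):
        print(address, "->", is_crawlable(SAMPLE_ROBOTS_TXT, address))
```

Anything under a disallowed folder stays out of a polite search engine's index, even though a person can still visit the page directly.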

Here are a few of my favorite tips for searching those depths:

1. Think about organizations that might generate reports, maps, or statistics on your area. Government agencies, NGOs, and not-for-profit organizations all generate huge amounts of very worthwhile data! Occasionally it may be biased, but if you keep the bias in mind, the information can still be worth using, warily!

1.a. To locate organizations in your topic area, a good method is to go to Wikipedia and search for a keyword or for "N.G.O." For example:

"environmentalism" or


"non-governmental organization" (pulls up a list of search results)

Either search will pull up at least one article, and sometimes a list of articles. Each search offered several choices when I picked the links for this blog post, so you may want to enter the term and search yourself rather than simply follow my link. Wikipedia's strength is the list of links and references that you can use to locate organizations and websites. The list may help you locate important organizations in the field that you would never have thought of, or perhaps never have heard of, without the links.
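If you would rather script the Wikipedia search than type it, the sketch below builds a search address for Wikipedia's public MediaWiki search interface. It only constructs the URL and does not contact Wikipedia. The function name and the default of ten results are my own assumptions for illustration.

```python
from urllib.parse import urlencode

# English Wikipedia's MediaWiki API endpoint.
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"


def wikipedia_search_url(term: str, limit: int = 10) -> str:
    """Build a full-text search URL for the given term (hypothetical helper)."""
    params = {
        "action": "query",   # ask the API to run a query
        "list": "search",    # specifically, a full-text search
        "srsearch": term,    # the search term itself
        "srlimit": limit,    # how many results to return
        "format": "json",    # machine-readable output
    }
    return f"{WIKIPEDIA_API}?{urlencode(params)}"


if __name__ == "__main__":
    for term in ("environmentalism", "non-governmental organization"):
        print(wikipedia_search_url(term))
```

Pasting one of the printed addresses into a browser should show a list of matching article titles, which is a starting point for finding the organizations the articles link to.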

You should use your judgement about the quality of the website. Look for an "About Us" link, or something like that. Read about the organization, its mission, and who founded it. Sometimes you will discover wonderful organizations that you never heard of that are doing important work and generating fabulous, reliable information while they are doing it. For U.S.-based organizations, you can look at the materials they file with the I.R.S. to support tax-exempt claims, which may help you decide how legitimate they are. For international and foreign organizations, you won't have that sort of form to rely upon.

So, for instance, using this technique, one locates at the bottom of the Environmentalism article a list of Environmental Organizations and Conferences. One finds there many United States-based groups, in addition to many United Nations links. One can flip through these fairly quickly, looking always for signs that there are reports, publications, or a database on the site. Many of the organizations are set up solely as advocacy groups, and so the websites simply trumpet the dangers, tout the organizations' claimed results, and solicit donations. The researcher can skip over these. By skipping down, one finds the Wildlife Conservation International, and looking at that website, one notices a link to ICCF, the International Conservation Caucus Foundation, a U.S.-based, bi-partisan foundation that lobbies Congress on behalf of conservation issues. It has what looks like some excellent briefing material that it produces to give to Congress. I recognize the logos of the Smithsonian Institution and the Audubon Society, among others, as sponsors or authors of some of these materials. It's an interesting and deep-looking collection.

2. Do look for governmental information. Don't forget to look at the end of the website's address for that .gov, which shows you it's a government agency of some sort. The U.S. federal government and many state governments publish helpful materials. There are foreign government materials that can be helpful as well. Thus, the U.S. Federal Trade Commission, to pull an agency out of the hat, currently provides a good amount of primary law on its website. Look at the tabs along the top of the page for an easy way to navigate the site. I chose the General Counsel's office as an example simply because it includes amicus briefs, statutes, policy hearings, and more. Here is a web page from the British Foreign Office, their "Working For Us" page, where you can find out about jobs there.
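The ".gov" check can be sketched in a few lines of Python for anyone sorting through a long list of links. This is a rough first pass, not a verdict on a site's reliability. The list of endings is my own assumption: ".gov" covers U.S. sites, and I added ".gov.uk" for British ones; other countries use other endings. The sample addresses are only illustrations.

```python
from urllib.parse import urlparse

# Assumed list of government domain endings; extend it for other countries.
GOVERNMENT_SUFFIXES = (".gov", ".gov.uk")


def looks_governmental(url: str) -> bool:
    """Return True if the URL's host name ends in a known government suffix."""
    # hostname is None when the address has no scheme such as https://
    host = (urlparse(url).hostname or "").lower()
    return host.endswith(GOVERNMENT_SUFFIXES)


if __name__ == "__main__":
    samples = (
        "https://www.ftc.gov/",
        "http://www.fco.gov.uk/",
        "https://www.example.com/gov/",
    )
    for address in samples:
        print(address, "->", looks_governmental(address))
```

Note that the check looks only at the host name, so a commercial site with "gov" somewhere in the rest of the address does not pass.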

3. Look at multi-national, consortial, and non-governmental bodies like the United Nations (and all of its many, many subsidiary organizations), which often generate databases, reports, maps, and wonderful data. Here is their Environment Programme page for Climate Change. Examine the many tabs and then the index at the bottom. They have so many links there, some of which lead to publications, databanks, and rich statistics. You have to take your time to explore.

4. There are lots of wonderful quasi-governmental bodies, like the National Conference of State Legislatures, a bi-partisan N.G.O. that serves the legislators and their staffs in all 50 states. It helps set up agreements between states and does research on shared issues as well, so there is a lot of information at this website.

5. And do not overlook the huge variety of not-for-profit and other organizations. These range from the National Rifle Association, whose "News and Politics" section focuses on the Second Amendment and offers their analysis and commentary, to the National College for DUI Defense (I don't think this is not-for-profit), which trains defense lawyers and provides a very handy list of the DUI laws state-by-state. Sometimes you stumble over these things. I found the DUI college through Wikipedia again, while preparing a worksheet for students!

The image is an underwater cave, courtesy of a scuba diving website:

