Monday, October 23, 2006

How do search engines rank results?

The flip side of searching is designing websites and search engines so that what you need shows up. The link in the title above is to a terrific post on a blog by a search engine optimization guy. He lists 20 ways that search engines reorder rank, with links to a number of patents. The list is in nice plain language and is very interesting even for the casual web surfer. The patents are more technical, but it's terrific to have the varying ranking methods pulled together in one place. Visit SEO by the Sea, above, to read his Oct. 14, 2006 post.

If you've ever wondered why a search turns up different results even using the same search engine on different days, look at this!

Some of the more amazing rank-ordering methods include:

3. Personalizing search results based on your prior searching. That's part of what those cursed cookies are for.

5 & 6. Re-ordering rank based on country or language preference.

8. Re-ranking based on history -- age of website, age of documents, and links, among other things.

9. Ranking based on reading levels -- wow!

Looking at pages for things like reading levels, use of stop words, and other textual features. A patent filing from Yahoo! that describes one way to do this allows searchers to use an interface to choose results that are introductory, ones that are advanced, and a few degrees between.

10 & 11. Ranking based on mobile-device or screen-reader accessibility.

14. Creating a new automatic query based on generated terms and similarity of language (more like this, I think):

This Google/Berkeley document describes reranking of results for a news search by considering and adding additional query terms, and by looking at document similarities.

* Query-Free News Search

15. Ranking based on visitor behavior. Boosting or busting rank based on the behavior of prior visitors to a website, and looking at their click-throughs.

18. Ranking based on creation of "story lines!"

This document from IBM takes search results and reorganizes them into storylines, which it expands upon in some ways and filters in others, before presenting those storylines to a searcher.

* System for identifying storylines that emerge from highly ranked web search results

19. Reranking by looking at blogs, news, and web pages as infectious disease

An analogy is used to disease-propagation models in this IBM patent application to describe how segmentation into topics, paying attention to time-based changes and additions to those topics in the blogosphere and on bulletin boards, might tell a search engine which topics and terms are popular, and where information about those might be located. While the process is described in the context of providing news-based alerts, the concept could be expanded to help with the reordering of search results based upon measures of popularity and burstiness (for instance, in the next section).

20. Reranking based upon conceptually related information including time-based and use-based factors

In a number of ways, this next patent application describes a process similar to the last two methods listed. It involves grouping together concepts, and looking at how those change over time and how different people participate in those changes. One of the co-inventors listed is Apostolos Gerasoulis, from Ask.
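
To make one of these a little more concrete, here is a small Python sketch of the idea behind method 15, ranking based on visitor behavior. It is not taken from any of the patents. The function name, the sample URLs, the click counts, and the 50/50 weighting are all made up for illustration. It simply blends a search engine's original score with each page's click-through rate and re-sorts the list.

```python
def rerank_by_clicks(results, clicks, impressions, weight=0.5):
    """Blend each result's base score with its click-through rate, then re-sort.

    results: list of (url, base_score) pairs, with base_score between 0 and 1
    clicks, impressions: dicts mapping url -> counts (hypothetical data)
    weight: how much click-through rate counts (an arbitrary assumption)
    """
    rescored = []
    for url, base_score in results:
        shown = impressions.get(url, 0)
        # Click-through rate: clicks divided by times shown (0 if never shown)
        ctr = clicks.get(url, 0) / shown if shown else 0.0
        blended = (1 - weight) * base_score + weight * ctr
        rescored.append((url, blended))
    # Highest blended score first
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Made-up example: three pages in their original rank order
results = [("a.example", 0.9), ("b.example", 0.8), ("c.example", 0.7)]
clicks = {"a.example": 5, "b.example": 60, "c.example": 30}
impressions = {"a.example": 100, "b.example": 100, "c.example": 100}

ranked = rerank_by_clicks(results, clicks, impressions)
print([url for url, _ in ranked])
```

With these invented numbers, the page that started in first place drops to last because hardly anyone clicked on it, which is the "boosting or busting" described in method 15. A real search engine would surely weigh many more signals than this.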

It will certainly change the way I look at search results and think about choosing search engines to realize that there are these various algorithms chugging along in the background. I was amazed enough when I began to understand the complexity of Natural Language search engines, and when I read about ALEXA (a query-free search engine that locates pages of interest by looking at links and click-throughs). This posting goes way beyond those astonishing ways of sorting information. There are popularity factors, like looking at the blogosphere and social networks like "My Space" for terms and links. Cool!

Thanks to Susan Sweetgall for pointing me to this website.
The illustration of a thinking cap is from Mark A. Hicks, illustrator at
