Second Generation Searching on the Web
This tutorial covers some of the more innovative search engine services on the Web. These services organize search results by peer ranking, by concept clustering, or by user behavior and votes, in contrast to the longer-standing method of term relevancy ranking. The newer type of results ranking usually works in addition to term ranking and looks at "off the page" information to determine the retrieval and order of your search results. Search engines that employ this approach may be thought of as second generation search services, and to remain competitive, most search engines today are second generation tools. For example:
- Google ranks a page by the number of links it receives from other pages, giving more weight to links from pages that Google itself ranks highly
- Clusty organizes its clustered results by keyword and/or concept
- URL.com offers changeable search results based on user voting
Here are a few of the trends to watch with second-generation services:
- The human element: concept processing. Second generation services such as Brainboost and SurfWax apply different kinds of concept processing to a search statement to determine the probable intent of a search. This is often accomplished by the use of human generated indexes. With these services, the burden of coming up with precise or extensive terminology is shifted from the user to the engine. These services are therefore taking on the role of thesauri.
- The human element: "horizontal" presentation of results. Most search tools return results in one long, vertical list. In contrast to this, there is a growing group of search tools that use concept processing to return results in a horizontal organization. With these tools, you can first review concept categories retrieved by your search before examining the results within particular categories. This can make it easier to zero in on the aspects of your topic that interest you. Examples of these tools include Clusty, Dogpile and Exalead.
- The human element: peer ranking. Search services such as Google derive their results from the behavior and judgment of millions of Website maintainers.
- The human element: voting or searcher behavior. Search services such as URL.com offer changeable results based on the votes of searchers. Eurekster Swickis are custom search engines whose results are influenced by searcher selection of sites from the results lists.
For a tutorial covering the more basic aspects of Web search engines, see Searching the Internet: Recommended Sites and Search Techniques.
Search engines covered in this tutorial
Ask.com
Exercise: Expanding your search options with Ask.com
Online help
Ask.com represents a merging of the former search tools Ask Jeeves and Teoma, and it has retained the user friendliness of both tools. Ask.com uses Teoma's successful link ranking scheme to list results. Called ExpertRank, this scheme ranks results based on links from pages on the same topic as your search. The idea is that people who maintain Web pages on particular topics are experts in these areas. This is a more refined link ranking scheme than the one offered by Google. In the case of Google, any link from any page is taken into account when ranking results.
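The difference between counting any link and counting only links from pages on the same topic can be illustrated with a toy example. This is a minimal sketch, not the actual ExpertRank or Google algorithm: the link data, page names, topic labels and function names are all hypothetical, and real engines weigh links in far more elaborate ways.

```python
from collections import Counter

# Hypothetical link data: (linking page, topic of linking page, page linked to).
LINKS = [
    ("bio-blog", "evolution", "darwin-site"),
    ("fossil-faq", "evolution", "darwin-site"),
    ("recipe-hub", "cooking", "viral-page"),
    ("car-forum", "autos", "viral-page"),
    ("news-portal", "news", "viral-page"),
    ("gene-notes", "evolution", "viral-page"),
]

def rank_by_all_links(links):
    """Order pages by inbound links, whatever the topic of the linking page."""
    counts = Counter(target for _, _, target in links)
    return [page for page, _ in counts.most_common()]

def rank_by_topic_links(links, topic):
    """Order pages by inbound links that come from pages on the query's topic."""
    counts = Counter(target for _, src_topic, target in links if src_topic == topic)
    return [page for page, _ in counts.most_common()]

print(rank_by_all_links(LINKS))
print(rank_by_topic_links(LINKS, "evolution"))
```

In this toy data the generally popular page comes first when every link counts, while the page favored by on-topic sites comes first when only links from "evolution" pages count.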
Ask.com also offers a simple (but often effective) conceptual layout accompanying your search results. In this case, it offers a few options for additional searches that relate to your initial query:
- Narrow Your Search
- Expand Your Search
- Related Names
As you can see, these additional options demonstrate a conceptual "understanding" of your search and can help you to conduct additional research beyond your initial results list. Note that not all of these options appear with every search. They will appear if Ask.com has something to offer in these areas.
Use Ask.com when...
- you are doing in-depth research
- you want the option to expand on your search with alternative topics
- you want to investigate a link ranking engine that does its ranking a little differently than Google
Special Features:
- Uses ExpertRank technology for a refined type of link ranking
- Offers options for additional searching on related topics
Drawbacks:
- Suggested ideas for related searches may be more useful for some topics than others
Query: I'm interested in the theory of evolution. I know this has become controversial lately and I'm not exactly sure what this is all about.
Search:
- Type: "theory of evolution"
- Note the right side of the screen with "Narrow Your Search," "Expand Your Search" and "Related Names."
- Select the options that interest you. For example, Creationism is listed under the heading "Expand Your Search." This thread will generate a new list of search results that will help introduce you to the current debate.
Clusty
Exercise: Grouping of results into concept folders with Clusty
Concept grouping engines offer results in a horizontal layout. This means that you can first review concept categories retrieved by your search before examining the results within a particular category. This is in contrast to the more common vertical layout of results, in which you are presented with one long list. With a vertical list, you need to examine each site one by one to determine if it relates to the aspects of the topic that interest you.
A growing number of search tools offer clustering of results. In this tutorial, we will discuss Clusty.
Online help
Clusty is a meta engine that searches multiple engines and directories and organizes results into concept clusters. Clusty uses a Clustering Engine, which automatically organizes search results into hierarchical folders "on the fly" based on words and phrases contained in your results. Rather than retrieving only one long vertical list of results, with Clusty you will also retrieve a "horizontal" layout of concept clusters. Each cluster contains a conceptually-related portion of your search results. This may be more convenient than working through one master list of results. In a variation on this theme, Clusty allows you to create a custom Clusty Cloud, a tag cloud with terms derived from a search that you can paste onto any Web page.
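The idea of grouping results "on the fly" by words they contain can be sketched in a few lines. This is only an illustration under simple assumptions, not Clusty's actual Clustering Engine: the result titles are made up, the stopword list is hypothetical, and grouping is done on single shared words rather than phrases or hierarchical folders.

```python
from collections import defaultdict

# Hypothetical list of common words to ignore when forming clusters.
STOPWORDS = {"the", "of", "in", "and", "a", "to"}

def cluster_by_keyword(titles, query_terms, min_size=2):
    """Group result titles under each word shared by at least min_size titles."""
    # Query terms appear in nearly every result, so they make poor cluster labels.
    ignored = STOPWORDS | {term.lower() for term in query_terms}
    clusters = defaultdict(list)
    for title in titles:
        words = {w.strip(".,:;!?").lower() for w in title.split()}
        for word in sorted(words - ignored):
            clusters[word].append(title)
    return {word: group for word, group in clusters.items() if len(group) >= min_size}

# Made-up result titles for illustration.
results = [
    "Battles of the Civil War",
    "Civil War Battles in Virginia",
    "Women in the Civil War",
    "Civil War Photographs",
    "Photographs of Civil War Women",
]

for label, group in cluster_by_keyword(results, ["civil", "war"]).items():
    print(label, "->", group)
```

Note that one result can land in more than one cluster, which matches the way a clustered layout lets you approach the same site from different subtopics.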
Use Clusty when...
- you are doing in-depth research
- you are just getting started with your topic and don't know much about it
- you don't want to retrieve results in just one long list
- you want to organize your thoughts about your topic by seeing relevant subtopics
- you want to see resources on your topic organized into subtopics
- your topic is somewhat obscure so a search across multiple sources might help
Special Features:
- Sorts search results into categories based on keywords and phrases contained in your results
- Within categories, a new group of categories is generated that represent narrower concepts
- Clusters results from the free Web as well as from numerous deep Web sources such as news sites, PubMed, FirstGov and other sources
- Offers tabbed options for searching news, blogs, the Wikipedia and more, plus a tab that you can customize with your own source
Drawbacks:
- Categories are not always well organized
- Categorization may be more useful for some topics than others
Query: The Civil War in the United States is such a complex subject. I'd like to see a list of results organized by individual topics.
Search:
- Type: +"civil war" +"united states"
- Note the clustered topics on the left side of the screen. Select "more" or "all clusters" at the bottom of the list to see more clusters.
- Select the categories that interest you. Note that categories preceded by a plus sign (+) open up to additional subcategories.
SurfWax
Exercise: Concept searching with SurfWax
Online help
SurfWax is a meta engine that offers options to see a quick view of the content of sites in your search results list, along with search terms to broaden or narrow a subsequent search. It has a somewhat busy interface, but it offers much to the user that is worth exploring.
Use SurfWax when...
- you are looking for a specific fact/person/event/narrow topic
- your topic is made up of multiple ideas
- you are doing in-depth research
- you want help choosing search terms
- you want to see a content summary of sites retrieved in your search before visiting them
- your topic is somewhat obscure so a search across multiple sources might help
Special Features:
- Offers "SiteSnaps" that display summaries of retrieved sites including Author Summary, Matched in Context, Key Points, Emphasis and FocusWords
- Focus feature may be applied to your search terms, allowing you to choose broader or narrower search terms to apply to subsequent searches
- FocusWords can be viewed in context within a textual extract from the site
- Various personalization options are available
- Aside from a general search, offers specialty searches including SurfWax LookAhead, which retrieves results from RSS feeds as you type various kinds of queries
Drawbacks:
- Site has a learning curve for first-time users
Query: I'm interested in learning about discrimination.
Search:
- Type: discrimination
- Choose a site from your results list by clicking on the magnifying glass icon
- Explore the information on the right side of the screen. Note the list of Focus Words.
- Choose a Focus Word that you would like added to your search. Click on the word. Notice that it has been added to your search box.
- Click on the Search button to run a new search
There is another way to get additional terms into your search box:
- "Focus" your original search term. Click on the small arrow icon next to "Focus: discrimination" located underneath the search window. A list of related terms will appear. Clicking on subsequent arrow icons will focus the chosen term.
- Explore a term that interests you. Click on a term to add it to your search statement.
- Click on the Search button to run a new search
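The Focus steps above amount to looking up broader or narrower terms and appending the chosen one to the query. The sketch below shows that idea with a tiny hand-made thesaurus. The thesaurus entries and function names are hypothetical and are not taken from SurfWax, whose term lists and query syntax may differ.

```python
# Hypothetical thesaurus entries, invented for illustration only.
THESAURUS = {
    "discrimination": {
        "broader": ["prejudice"],
        "narrower": ["age discrimination", "racial discrimination"],
    },
}

def focus(term, direction):
    """Return the broader or narrower terms recorded for a search term."""
    return THESAURUS.get(term.lower(), {}).get(direction, [])

def add_to_query(query, term):
    """Append a term to the query, quoting it if it is a multi-word phrase."""
    quoted = f'"{term}"' if " " in term else term
    return f"{query} {quoted}"

narrower = focus("discrimination", "narrower")
print(add_to_query("discrimination", narrower[0]))
```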
Go to SurfWax to try this search.
Ixquick
Exercise: Tapping into the ranking schemes of several engines with Ixquick
Online help
Ixquick is a meta engine that searches multiple engines and directories and returns only those documents that rank in the top 10 results of at least one of those sources.
Use Ixquick when...
- your topic is made up of multiple concepts
- you want a limited number of results drawn from the top ten of a variety of search tools on the Web
- you want the convenience of a meta search engine that searches multiple sources simultaneously
- your topic is somewhat obscure so a search across multiple sources might help
Special Features:
- Returns the most relevant results as ranked in the top 10 by a number of individual sources
- Uses a "star" system whereby the number of stars indicates the number of sites ranking each result in the top 10
- Shows the sources that have ranked the page and the placement within the top 10 list, e.g., Google (1)
- Offers a variety of search options including full Boolean, implied Boolean, natural language search, truncation, case sensitivity and field searching; Ixquick sends your query to the engines that support these options
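The star system described above can be pictured as a merge of several top-10 lists. The following is a minimal sketch, assuming each source returns an ordered list of URLs. The engine names, URLs and function name are hypothetical, and ordering the merged list by star count (with ties broken by best placement) is an assumption for illustration, not a documented Ixquick rule.

```python
def merge_top_ten(rankings):
    """Merge per-engine result lists, keeping only each engine's top 10.

    rankings maps an engine name to its ordered list of result URLs.
    Returns (url, placements) pairs, where placements lists the
    (engine, position) pairs that put the URL in a top 10.
    """
    placements = {}
    for engine, urls in rankings.items():
        for position, url in enumerate(urls[:10], start=1):
            placements.setdefault(url, []).append((engine, position))
    # One "star" per engine that ranked the page; more stars sort first,
    # and ties are broken by the best (lowest) placement.
    return sorted(
        placements.items(),
        key=lambda item: (-len(item[1]), min(pos for _, pos in item[1])),
    )

# Made-up engines and URLs for illustration.
rankings = {
    "EngineA": ["example.org/bio", "example.org/music"],
    "EngineB": ["example.org/music", "example.org/cafe"],
    "EngineC": ["example.org/music", "example.org/bio"],
}

for url, sources in merge_top_ten(rankings):
    print("*" * len(sources), url, sources)
```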
Drawbacks:
- Because it offers only the top 10 results from any source, obscure sites will not appear in its results
- Some search syntax options do not work well, e.g., natural language searching is an option but the results are not necessarily successful
Query: I'm looking for good Web sites about Mozart.
Search:
- Type: Mozart
- Examine results for relevancy
- Note how, without concept processing, different meanings of the term "Mozart" have been returned. Of course, many results relate to the composer.
URL.com
Exercise: Influencing results ranking based on your votes
URL.com retrieves the top ten results from the combination of Google, Yahoo! and MSN, and allows searchers to rank and comment on the results.
Use URL.com when...
- you are researching just about any topic that can retrieve results from search engines on the Web
- you want to contribute your opinions about the quality of your search results and the quality of individual sites encountered in your results
- you want relatively few results from three major search engines
Special Features:
- Conveniently searches Google, MSN and Yahoo! from a single interface
- Limits the total number of results to ten
- Results ranking is changeable based on user rankings
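Results ranking that changes with user votes can be sketched as a re-sort of the original list. This is an illustrative assumption about how such a scheme might work, not URL.com's actual method: the URLs, vote counts and function name are hypothetical.

```python
def rerank_by_votes(results, votes):
    """Reorder results so that items with more net votes come first.

    results is the list of URLs in their original order; votes maps a URL
    to its net vote count (missing URLs count as zero).
    """
    # sorted() is stable, so results with equal votes keep their original order.
    return sorted(results, key=lambda url: -votes.get(url, 0))

# Made-up results and votes for illustration.
original = ["example.org/a", "example.org/b", "example.org/c"]
votes = {"example.org/c": 5, "example.org/a": -1}

print(rerank_by_votes(original, votes))
```

Here the third result moves to the top after receiving five votes, and the first result drops to the bottom because of its negative net vote.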
Exercise: URL.com
Go to URL.com and sign up for an account. Then proceed with your search.
Query: Enter any query that interests you!
Search:
- Examine results for relevancy and quality.
- Vote for and comment on items in your search results to contribute to the conversation about the topic of your search.