Knowledge Handler

Information Sources & Information Sifting Techniques

Location: Cleveland, Ohio, United States

I am a retired librarian, most recently serving at Indiana Wesleyan University's Cleveland Education Center.

Friday, December 31, 2004

Know Your Research Tools

Research tools differ in what information they index for searching. By default, tools such as Infotrac run a keyword search only on the title, any assigned subject headings, and any annotation. A keyword search for a word or phrase can be expanded to include the entire text of the document by using a check box, but it is an extra step that is often omitted.
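
The difference between a default keyword search and a full-text search can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names, and the `include_full_text` flag are assumptions made up for the example, not how Infotrac or any other product actually stores or searches its records.

```python
# Hypothetical record; real databases define their own fields and indexing.
record = {
    "title": "Choosing a Family Dog",
    "subjects": ["Dogs", "Pets"],
    "annotation": "Advice on picking a breed for a household with children.",
    "full_text": "Many owners report that the boxer is patient with children.",
}

# Fields searched by default (an assumption mirroring the description above).
DEFAULT_FIELDS = ("title", "subjects", "annotation")


def field_text(record, field):
    """Return one field as lowercase text, joining list-valued fields."""
    value = record.get(field, "")
    if isinstance(value, (list, tuple)):
        value = " ".join(value)
    return value.lower()


def keyword_match(record, keyword, include_full_text=False):
    """Check the default fields, plus the full text when the 'check box' is on."""
    fields = DEFAULT_FIELDS
    if include_full_text:
        fields = fields + ("full_text",)
    keyword = keyword.lower()
    return any(keyword in field_text(record, f) for f in fields)


default_hit = keyword_match(record, "boxer")
expanded_hit = keyword_match(record, "boxer", include_full_text=True)
print(default_hit, expanded_hit)  # False True
```

The record mentions the boxer only in its body, so the default search misses it and the expanded search finds it.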

Google and similar search tools may report that millions of articles contain a search term, but these tools typically make only the first few hundred matches available for retrieval - the remainder are difficult or impossible to retrieve.

To learn more about the limitations of a tool, look at its "Help" page and any "Advanced Search" page. A good website for learning the nuances of search engines is www.searchenginewatch.com - this site is written for both the searcher and the marketer who wants to advance their placement in search engine results, so it provides information on indexing and data retrieval.
-DD

Thursday, December 30, 2004

Combining Boolean Expressions - Parentheses, OR

The commercial databases and Internet search engines that recognize Boolean operators [such as "AND", "OR", "NOT"] generally allow parentheses and the "OR" operator to be used together, so that several operations can be combined in a single search.

The "OR" statement is used to broaden a search. A search for boxer OR dog retrieves anything containing either term: material on dogs, boxer shorts, hot dogs, the Boxer Rebellion, the dog days of August, Senator Boxer, dog breeds, boxer George Foreman, and so forth.
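
One way to picture why "OR" broadens a search is to treat each term's results as a set and "OR" as the union of those sets. The sketch below assumes a tiny made-up collection of documents and a helper named `docs_with`; both are hypothetical and are not taken from any real search tool.

```python
# A made-up collection of documents, keyed by an arbitrary id.
documents = {
    1: "boxer shorts on sale this week",
    2: "hot dog stands downtown",
    3: "the boxer is a loyal dog",
    4: "history of the boxer rebellion",
    5: "growing tomatoes in ohio",
}


def docs_with(term):
    """Return the ids of documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in documents.items() if term in text.split()}


boxer_hits = docs_with("boxer")
dog_hits = docs_with("dog")

# "boxer OR dog" is the union: anything matched by either term.
or_hits = boxer_hits | dog_hits

print(sorted(boxer_hits))  # [1, 3, 4]
print(sorted(dog_hits))    # [2, 3]
print(sorted(or_hits))     # [1, 2, 3, 4]
```

Only one of the four documents retrieved is about the boxer dog, which is the over-retrieval problem described above.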

Unless you are seeking either of two obscure terms or phrases, using the "OR" operator by itself provides too many irrelevant results for practical research. If a database or search engine recognizes parentheses, the "OR" operator can be used to perform multiple actions. Consider the following search:

boxer NOT (shorts OR rebellion OR senator OR fight)

This search requests any document containing the word "boxer", but then sifts out any documents containing the terms "shorts", "rebellion", "senator", or "fight". As a result of the multiple siftings, most of the results will be documents about boxer dogs. One can also use parentheses and the "OR" statement to broaden a search - for example:

boxer AND (dog OR canine OR puppy)
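
Both searches above can be modeled with set operations: "OR" inside the parentheses is a union, "NOT" is a set difference, and "AND" is an intersection. The following is a minimal sketch under stated assumptions; the sample documents and the helper names `has` and `any_of` are invented for illustration, and real databases evaluate queries in their own ways.

```python
# Made-up sample documents, keyed by an arbitrary label.
documents = {
    "a": "boxer shorts in every size",
    "b": "the boxer rebellion began in china",
    "c": "senator boxer spoke on the floor",
    "d": "the boxer won the fight in round three",
    "e": "our boxer puppy loves the park",
    "f": "the boxer is a courageous dog",
    "g": "a boxer needs daily exercise",
}


def has(term):
    """Labels of documents containing the term as a whole word."""
    return {label for label, text in documents.items() if term in text.split()}


def any_of(*terms):
    """The parenthesized OR group: union of the matches for each term."""
    matches = set()
    for term in terms:
        matches |= has(term)
    return matches


# boxer NOT (shorts OR rebellion OR senator OR fight)
narrowed = has("boxer") - any_of("shorts", "rebellion", "senator", "fight")

# boxer AND (dog OR canine OR puppy)
required = has("boxer") & any_of("dog", "canine", "puppy")

print(sorted(narrowed))  # ['e', 'f', 'g']
print(sorted(required))  # ['e', 'f']
```

In this sample the "NOT" search keeps document "g", which never uses the words dog, canine, or puppy, while the "AND" search drops it. That is the usual trade-off between sifting out unwanted terms and insisting on wanted ones.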

-DD

Wednesday, December 29, 2004

Basic Information Sifting - Phrase, AND, NOT

Whether you are using a search engine or a structured database, the wealth of information on most topics requires that the results be refined, sifted, or limited to produce highly relevant responses. One of the best summaries of common techniques was developed by the metasearch engine Ixquick:

http://ixquick.com/eng/aboutixquick/about_hints_improve.html

When I teach students how to use information resources, I cover three basic techniques for refining a textual search to sift out irrelevant material:

  1. Sometimes the concept you are seeking is always expressed as a phrase. In that case, enclose the phrase in quotation marks when you type it in the search box [example: "North America"]. However, this technique can miss a lot of material if the concept can be worded differently. For example, suppose you apply a phrase search for "boxer dog" to a text that reads: "My favorite dog is the boxer. The boxer is the best breed of dog for a homeowner, because it is courageous, devoted, intelligent, and has the physical presence to defend the home. The boxer is a dog with many admirers...." The words "boxer" and "dog" never appear side by side in the sample text, so the phrase search will not match and this relevant document would not be retrieved.
  2. Often, the sought concept has one or more words associated with it - for example, if you are looking for the dog called the boxer. If you just type the word boxer in the search box, references to shorts, Senator Boxer, and the Boxer Rebellion may be retrieved. If the search is limited to documents where both the words "boxer" and "dog" occur in the text, most of the irrelevant results should disappear. To insist that both words be present in a document, either use the Boolean method of linking terms with the word "AND" [example: boxer AND dog] or precede each required term with a plus sign [example: +boxer +dog]. Note that some electronic catalogs and journal databases, such as EBSCOHost, require that the Boolean method be used for keyword searching, so preceding required terms with a plus sign should be limited to "free" tools like Google or AskJeeves.
  3. Another way to refine a search strategy is to indicate that you do not want results that include specified irrelevant terms. The Boolean method is to place the word "NOT" before the word that is to be barred from the results [example: boxer NOT senator]; alternatively, a minus sign may precede the unwanted term [example: boxer -senator].
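
The three techniques can be tried against the sample text from the first item. This is a rough sketch, not how any particular search engine works: the function names `phrase_match`, `and_match`, and `not_match` are hypothetical, and matching is done on whole lowercase words with punctuation ignored.

```python
import re

# The sample text quoted in technique 1 above.
sample = (
    "My favorite dog is the boxer. The boxer is the best breed of dog "
    "for a homeowner, because it is courageous, devoted, intelligent, "
    "and has the physical presence to defend the home. "
    "The boxer is a dog with many admirers."
)


def words(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_match(text, phrase):
    """True only if the words of the phrase appear consecutively, in order."""
    tokens, target = words(text), words(phrase)
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))


def and_match(text, *terms):
    """True if every term appears somewhere in the text (boxer AND dog)."""
    present = set(words(text))
    return all(term.lower() in present for term in terms)


def not_match(text, wanted, unwanted):
    """True if the wanted term appears and the unwanted one does not."""
    present = set(words(text))
    return wanted.lower() in present and unwanted.lower() not in present


phrase_hit = phrase_match(sample, '"boxer dog"')
and_hit = and_match(sample, "boxer", "dog")
not_hit = not_match(sample, "boxer", "senator")
print(phrase_hit, and_hit, not_hit)  # False True True
```

The phrase search misses this relevant text, while the "AND" search and the "NOT" search both retrieve it.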
-DD