The following article appeared in Leadership, Volume 30, No. 1, September/October, 2000.

Finding what you need: Using Internet search engines

by Marc Elliot Hall

Knowing how and where to look is essential to finding what you need on the Internet. Many people searching for information on the Web assume that simply typing in a key word or two and clicking OK will "automagically" produce useful results. But this method rarely gives you the best results: it tends to return either too many links (potentially many thousands) or too few relevant ones.

This article will outline search techniques, reveal how some major Web sites will react to those techniques, and finally deliver some tricks you can use to improve your search results. Although entire books have been written on the subject of Internet searches, the tips here will offer a good start to help you understand how Internet searches work - and help you find what you need.

How to look

As an example, let's say we want to find relevant Web pages referring to the latest hot topic, the shortage of school administrators. First, we need to decide which words are most likely to appear in the Web pages we want to see. "School," "administrator" and "shortage" are all likely candidates. But what about "K-12," "principal" and "superintendent"? Which words may appear but aren't essential, like "teacher" and "student"? And which words might appear in similar, but irrelevant, articles?

There is also a shortage of information technology workers, but we probably aren't interested in Web pages about that. So are there any words we definitely don't want to see? What about "information technology"? Wait - that's not a word, it's a phrase. Hmm... I'll explain how to do phrases in a moment. First, Boolean logic, and why it matters.

Boolean logic

Boolean logic is named for its inventor, George Boole, one of the greatest mathematicians of the 19th century. Among his contributions to the field of logic is the idea that groups can be structured to include or exclude objects based on the characteristics of those objects. Further, he proposed that simple rules could be devised to make the determination of which objects go in which groups. (Technically, Boole proposed that comparison results could be expressed as true or false, but that's probably too much detail.) This now commonsensical notion was revolutionary for its time, and became one of the foundations of computer science. But what does this have to do with Web searches, you ask?

Most Internet search facilities use Boolean logic to respond to information queries. Simply put, a search engine determines whether you want some, all or none of your key words to be part of the document it finds for you. These search parameters are generally specified, depending on the search facility being used, by including symbols (+, -, !) or words (and, or, not, Xor) in various combinations with your key words or phrases. These symbols and words are called Boolean operators because they determine what operation is performed on your key words in a Boolean search.
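
To make the some/all/none idea concrete, here is a minimal Python sketch. The three sample "pages" and the function name matches are invented for illustration; a real search engine works from a large prebuilt index rather than a handful of strings, so treat this as a picture of the logic only.

```python
# Hypothetical mini-collection standing in for Web pages.
DOCUMENTS = {
    "page1": "education administrator shortage worsens",
    "page2": "teacher and administrator salaries in education",
    "page3": "information technology worker shortage",
}


def matches(text, all_of=(), any_of=(), none_of=()):
    """Return True if text has every all_of word, at least one any_of
    word (when any are given), and no none_of word."""
    words = set(text.lower().split())
    if not all(w in words for w in all_of):
        return False
    if any_of and not any(w in words for w in any_of):
        return False
    return not any(w in words for w in none_of)


# "All" of education and administrator, "none" of teacher.
hits = [name for name, text in DOCUMENTS.items()
        if matches(text, all_of=["education", "administrator"],
                   none_of=["teacher"])]
print(hits)
```

Each page either belongs to the result group or it doesn't, which is exactly the true-or-false test Boole described.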

Plusses, minuses and bangs

The + (plus) operator is used to specify a word that must appear in every result. The - (minus) operator is used to specify a word that must not appear in any result. And the ! (exclamation point, or bang in computer parlance) operator, similar to the -, specifies a word or phrase to exclude. For example, for our hypothetical search on the administrator shortage, we could use + education + administrator - teacher to require that the words "education" and "administrator" both appear in our results while excluding pages that contain the word "teacher." Not all search engines support the !, but it's fun to see how search results vary by replacing - with !.
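
As a rough sketch of how a query written with these symbols might be taken apart, consider the Python below. The function names parse_symbol_query and page_matches are made up for this example, and the rule that a bare word counts as required is my assumption; actual search engines differ on that point.

```python
def parse_symbol_query(query):
    """Split a query such as '+ education - teacher' into
    (required, excluded) word lists. '!' is treated like '-'."""
    required, excluded = [], []
    pending = "+"  # assumption: a bare word counts as required
    for token in query.lower().split():
        if token in ("+", "-", "!"):
            pending = token              # operator typed apart from its word
            continue
        if token[0] in "+-!":
            pending, token = token[0], token[1:]   # operator attached to word
        (required if pending == "+" else excluded).append(token)
        pending = "+"                    # reset for the next word
    return required, excluded


def page_matches(text, query):
    """True if text contains every required word and no excluded word."""
    required, excluded = parse_symbol_query(query)
    words = set(text.lower().split())
    return (all(w in words for w in required)
            and not any(w in words for w in excluded))


print(parse_symbol_query("+ education + administrator - teacher"))
```

In this sketch, swapping - for ! gives the same result, because both are treated as "exclude."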

And, or, not and Xor

And, or and not each have a special meaning for searches. Using the words rather than the symbols as your operators may make it easier for you to structure your search, especially if you're a more verbal person. Most search engines will accept both forms -- punctuation as well as words (although not necessarily mixed in together). The corresponding example of our hypothetical search on the administrator shortage would be and education and administrator not teacher to require that the words "education" and "administrator" both appear in our results while excluding the word "teacher." The word or is especially useful to remember because it doesn't have a symbolic (+, -, !) equivalent.

The final operator here is Xor, which is an exclusive or -- meaning one or the other, but never both. Not all search engines support Xor, but it's worth trying if you want more precise results. While or may include both of two choices, because either is optional, Xor will only find documents with just one of the key words. So administrator Xor principal would find documents that include either "administrator" or "principal," but not both "administrator" and "principal."
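
The difference between or and Xor is easy to see in a few lines of Python. This is only an illustration with invented sample pages and helper names (or_match, xor_match); it says nothing about how any particular search engine implements Xor.

```python
def or_match(text, a, b):
    """True if at least one of the two words is present."""
    words = set(text.lower().split())
    return (a in words) or (b in words)


def xor_match(text, a, b):
    """True if exactly one of the two words is present."""
    words = set(text.lower().split())
    return (a in words) != (b in words)


pages = [
    "the administrator resigned",
    "the principal resigned",
    "the administrator met the principal",
    "the teacher resigned",
]
for page in pages:
    print(page,
          "| or:", or_match(page, "administrator", "principal"),
          "| xor:", xor_match(page, "administrator", "principal"))
```

Only the third page separates the two operators: or accepts it because both words are present, while Xor rejects it for the same reason.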

Quoted strings

If you'd like to include a phrase rather than single words in your search, enclose the phrase in quotation marks. Generally, either single quotes (' ') or double quotes (" ") will work, but you must be consistent within the search. The technical term for this kind of phrase is quoted string. Continuing with our example, the search education and "administrator shortage" would find documents that include both the word "education" and the phrase "administrator shortage," but would not likely find documents that include only "administrator" or only "shortage"; nor would it likely find documents that include both those words if they aren't immediately adjacent.

Another use for quoted strings is including words like "and," "or" and "not" in your search. For example, "Curriculum, Instruction and Evaluation" is a phrase that includes "and." It also happens to be the name of an ACSA committee. If I want to read pages referencing that ACSA committee, I'd definitely want the whole phrase, including the "and," to appear on the pages my search finds. Pay close attention to your quotation marks, because misplaced marks can cause your search results to be wildly divergent from what you'd expect.
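
Here is a small Python sketch of both ideas: keeping a quoted string together as one search term, and requiring its words to sit side by side. The helper names words_of and contains_phrase are invented for this example, and I'm borrowing Python's standard shlex module to do the quote handling; real search engines have their own rules.

```python
import re
import shlex


def words_of(text):
    """Lower-case text and return its words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(text, phrase):
    """True if the words of phrase appear in text, adjacent and in order."""
    haystack, needle = words_of(text), words_of(phrase)
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


# shlex.split keeps a quoted phrase together as a single token.
tokens = shlex.split('education and "administrator shortage"')
print(tokens)  # ['education', 'and', 'administrator shortage']

# Adjacent and in order: a match.
print(contains_phrase("The administrator shortage grows",
                      "administrator shortage"))
# Both words present but not adjacent: no match.
print(contains_phrase("A shortage of every administrator",
                      "administrator shortage"))
```

A misplaced quotation mark shows up quickly here, too: shlex.split raises a ValueError when a quote is opened but never closed, where a search engine might quietly return odd results instead.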

Grouping rules

What if I want to find documents that refer to either the administrator shortage or the principal shortage, specifically? Education and (administrator or principal) not teacher would work here. The parentheses tell the search engine to handle the contents inside them as a lump, with the preceding command (the and in this case) being the operator that controls how the lump is handled. The or inside the parentheses tells the search engine to select at least one of the words; both are permissible, but at least one is required by the and. So this search would find documents referencing "education," either "administrator" or "principal" (and perhaps both "administrator" and "principal") but not including "teacher."
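
To see why the parentheses matter, here is the same search written as Python conditions, once with the grouping and once without. The function names (has, grouped, ungrouped) and the sample pages are made up. The ungrouped version follows Python's own rule that and binds more tightly than or; how a given search engine reads an ungrouped query is something you'd have to test, so take that half as an assumption.

```python
def has(text, word):
    """True if word appears as a whole word in text."""
    return word in text.lower().split()


def grouped(text):
    # education and (administrator or principal) not teacher
    return (has(text, "education")
            and (has(text, "administrator") or has(text, "principal"))
            and not has(text, "teacher"))


def ungrouped(text):
    # The same words with no parentheses, read by Python's precedence as:
    # (education and administrator) or (principal and not teacher)
    return (has(text, "education") and has(text, "administrator")
            or has(text, "principal") and not has(text, "teacher"))


page = "principal wins golf tournament"
print(grouped(page), ungrouped(page))
```

The golf page has nothing to do with education, and the grouped search correctly skips it. Without the parentheses, it slips through on the strength of "principal" alone.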

Natural language

Natural language queries, in contrast to Boolean queries, take the form of a question rather than a series of words punctuated by symbols. For example, Where will I find information about the K-12 education administrator shortage? is a natural language query, while 'K-12' and education and 'administrator shortage' is a Boolean query. Note on the Boolean side the quotes surrounding "K-12." These are required because otherwise the hyphen could be read as the minus operator, and my search would exclude the numeral "12" (assuming the search engine I'm using supports searching for numbers -- not all do). That could be a problem.
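
The "K-12" trap can be sketched in Python. The function naive_terms below is hypothetical, and it applies a deliberately simplistic rule I've assumed for the sake of the example: any hyphen outside quotation marks is a minus operator. It also ignores operator words such as and, so the sample queries leave them out.

```python
import re

# A token is a single-quoted string, a double-quoted string,
# or a run of non-space characters.
TOKEN = re.compile(r"'[^']*'|\"[^\"]*\"|\S+")


def naive_terms(query):
    """Return (included, excluded) terms, treating an unquoted hyphen
    as the minus operator (an assumed, simplistic rule)."""
    included, excluded = [], []
    for token in TOKEN.findall(query):
        is_quoted = (len(token) >= 2 and token[0] in "'\""
                     and token[-1] == token[0])
        if is_quoted:
            included.append(token[1:-1])   # quoted: keep the text verbatim
            continue
        first, *rest = token.split("-")
        if first:
            included.append(first)
        excluded.extend(part for part in rest if part)
    return included, excluded


print(naive_terms("K-12 education"))    # (['K', 'education'], ['12'])
print(naive_terms("'K-12' education"))  # (['K-12', 'education'], [])
```

Without the quotes, the search asks for "K" and throws out every page containing "12." With them, "K-12" survives as a single term.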

Where to look

When deciding where to look, first realize that some sites often referred to as search engines are really indices. That is, they are organized lists based on categories rather than random assortments of keywords locked up in a database. Most indices include a search feature, just as many search engines now include an index. Also, please note that these Web sites change constantly, so what is accurate as of this writing may since have changed. Finally, all of these sites are available via ACSA Online. Several of the major Web sites are described below, each followed by a summary of how our hypothetical search for administrator shortage pages fared.

Yahoo!

One of the first and certainly among the most heavily trafficked search sites on the Web, Yahoo.com isn't really a search engine at all. It is, in fact, an index. Every Web page listed on Yahoo! is reviewed by a human being and placed in one of 14 categories based on that human being's impression of how it fits. As a consequence, Yahoo! can provide highly reliable results when you're sure of what you're looking for and can classify it. Not only does Yahoo! cross-reference categories, but it also provides an add-in search facility courtesy of Inktomi and Google. A caution: Yahoo! indices can be incomplete because of the human factor. Documents about issues that cropped up a week ago may not appear because no one has had a chance to index those pages yet. Yahoo!'s search found ...

Nine entries each for + "K-12" + education + "administrator shortage" as well as and "K-12" and education and "administrator shortage"

Google

A newcomer to the search scene, Google is now among the most popular search engines. Google found ...

Eleven entries each for + "K-12" + education + "administrator shortage" as well as and "K-12" and education and "administrator shortage" (the same nine as Yahoo! plus two more).

Go/Infoseek

Infoseek was a search pioneer, but it has since been acquired by the Go Network, which is in turn owned by Disney. Infoseek found ...

3,611 matches for + "K-12" + education + "administrator shortage" and 29,276,186 matches for and "K-12" and education and "administrator shortage"

Excite!

Primarily another index rather than a search engine, Excite! specializes in finding the "top 10" results for any search (although it will give you more than 10 results if it has more to offer; it just won't tell you how many more). Excite! found ...

17 matches for + "K-12" + education + "administrator shortage" and an unknown number (more than 30) of matches for and "K-12" and education and "administrator shortage"

Alta Vista

The granddaddy of all search engines, Alta Vista is renowned for its comprehensive search results. This is the engine I always resort to if nobody else can find what I'm looking for. Alta Vista found ...

16 entries for + "K-12" + education + "administrator shortage" and 1,175,172 pages found for and "K-12" and education and "administrator shortage"

Ask Jeeves

Ask Jeeves doesn't use Boolean logic in the sense that other engines do. Rather, it allows you to use natural language when querying it about your subject. The query: Where will I find information about the K-12 education administrator shortage? found ...

Five related questions; it then went out to Boolean engines and found an additional 37 possible links.

Northern Light

Another newcomer, Northern Light produces tightly focused results, although I have no idea what the color in their trademark "Blue Custom Search Folders" has to do with the validity of their results. Northern Light found ...

465,991 matches for + "K-12" + education + "administrator shortage" and 21 matches for and "K-12" and education and "administrator shortage"

Tips and tricks

Finally, here are a few tricks.

  • Always search on more than one service before you give up on finding something. Most search engines will provide you with links to other search engines. Yahoo!'s links, for example, are at the bottom of its results pages. Many search engines will even save your query and automagically do the search on a second engine when you click the appropriate link.

    You can run multiple, simultaneous searches across multiple search engines using sites like Dogpile.com and Metasearch.com.

  • Use more than one syntax. Because different search engines prefer different operators, try + "K-12" + education + "administrator shortage" as well as and "K-12" and education and "administrator shortage."
  • Try the "advanced search" link. Many sites include a link labeled "advanced search." Don't let this confuse you; it really means, "Let me cheat and have you fill in the blanks for me."
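
If you find yourself retyping the same search in both syntaxes, the translation is mechanical enough to sketch in Python. The function name to_word_syntax is invented for this example, and it only handles operators typed with a space on either side, the way they're written in this article. Hyphens inside double-quoted strings are left alone.

```python
import re

# A token is a double-quoted string or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')
SYMBOL_TO_WORD = {"+": "and", "-": "not", "!": "not"}


def to_word_syntax(query):
    """Rewrite standalone +, - and ! operators as and / not."""
    return " ".join(SYMBOL_TO_WORD.get(token, token)
                    for token in TOKEN.findall(query))


symbol_query = '+ "K-12" + education + "administrator shortage"'
print(to_word_syntax(symbol_query))
# and "K-12" and education and "administrator shortage"
```

Whether a particular search engine treats the two forms as equivalent is another matter, as the wildly different match counts above suggest.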

There's no need to be mystified by the secrets of Internet search engines. Try these techniques; be willing to attempt a little trial and error. Within a day or two you will be the person everyone seeks out to help them find the arcana of school administration on the Internet. Then you can teach your colleagues these tools to begin making the world a happier place.

Marc Elliot Hall is ACSA's former Webmaster. Search him out to tell him what you thought of this article by email.

Additional Resources

Boole, George. "The Calculus of Logic" Cambridge and Dublin Mathematical Journal Vol. III (1848), pp. 183-98.

Boole, George. The Mathematical Analysis of Logic (1847)

Boole, George. An Investigation of the Laws of Thought (1854)

MacHale, D. George Boole--His Life and Work (Boole Press, 1985)

Barry, P.D. George Boole--A Miscellany (Cork University Press, 1969)