Internet Research

Search Engines & Subject Directories

Search engines

Search engines are the means by which most people search the Web.

Common examples are Google, AltaVista, and Direct Hit.

Yet they don’t search the Internet

Yet a search engine does not actually search the Web during your search.

A search engine searches itself: its own stored index of pages.

It’s a three-step process.

1) Bots index words

Search engines continually send out hundreds of “robots” or “bots” (or “spiders” or “crawlers”).

Bots visit web sites, read them word by word, and then index those words.
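
As a rough illustration of "read word by word, then index," the sketch below splits one page's text into words and records where each word occurs. The function name `index_words` and the sample text are made up for this example; real bots are far more elaborate.

```python
import re

def index_words(page_text):
    """Map each lowercase word to the list of positions where it occurs."""
    positions = {}
    words = re.findall(r"[a-z0-9']+", page_text.lower())
    for pos, word in enumerate(words):
        positions.setdefault(word, []).append(pos)
    return positions

# Hypothetical page text, used only for illustration.
page = "Search engines search an index. The index is built by bots."
word_positions = index_words(page)
```

Looking up `word_positions["index"]` then shows every place that word appeared on the page.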

2) A database is created

A huge database of Web sites is thus gathered and indexed by word.

These databases can be very large, with millions of links.
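
A database "indexed by word" is often pictured as an inverted index: each word points to the pages that contain it. The sketch below is a toy version under that assumption; `build_index`, `PAGES`, and the example.com addresses are hypothetical.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Build a word -> set of page addresses mapping from {address: text}."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(url)
    return index

# Hypothetical pages, used only for illustration.
PAGES = {
    "http://example.com/a": "Bots crawl web sites",
    "http://example.com/b": "Subject directories list web sites by category",
}
index = build_index(PAGES)
```

Here the word "web" points to both pages, while "bots" points only to the first.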

3) The Interface gives you access

Using the keywords you give it, a search engine then searches its own current index.
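
The lookup step can be sketched as a search over a stored index rather than over the live Web. This minimal example assumes every keyword must match (an "AND" search); the `search` function, the `INDEX` contents, and the page names are invented for illustration.

```python
def search(index, query):
    """Return the pages that contain every keyword in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first set so the stored index is never modified.
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Hypothetical pre-built index: word -> pages containing that word.
INDEX = {
    "web": {"a.html", "b.html"},
    "bots": {"a.html"},
    "directories": {"b.html"},
}
hits = search(INDEX, "web bots")
```

No page is fetched during the search; a page that the bots have not yet indexed cannot appear in the results.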

Interfaces are based on rankings

Search engines return results based on a ranking system.

Ranking is the order in which files are listed when they are retrieved.

The ranking system is secret

These systems are proprietary and often “secret.” In general:

AltaVista ranks web pages higher if your search terms are found in the first few words of the page.

Google ranks by document “popularity,” based largely on how many other pages link to the document.

Direct Hit ranks by the length of time other users spent at the site.
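
To make the idea of a ranking rule concrete, the sketch below orders matching pages by how early a search term first appears, in the spirit of the "first few words" rule described above. It is a toy illustration, not any engine's actual (proprietary) algorithm; the function names and sample pages are hypothetical.

```python
def first_match_position(text, terms):
    """Return the position of the first word that matches any search term."""
    words = text.lower().split()
    positions = [i for i, word in enumerate(words) if word in terms]
    return min(positions) if positions else len(words)

def rank_by_early_match(pages, query):
    """List matching pages, earliest first match at the top."""
    terms = set(query.lower().split())
    matching = [name for name, text in pages.items()
                if terms & set(text.lower().split())]
    return sorted(matching,
                  key=lambda name: first_match_position(pages[name], terms))

# Hypothetical pages, used only for illustration.
SAMPLE_PAGES = {
    "late.html": "welcome to my home page about web search",
    "early.html": "search engines explained for beginners",
    "none.html": "recipes for dinner",
}
ranked = rank_by_early_match(SAMPLE_PAGES, "search")
```

Swapping the sort key (for example, to link counts or time spent on a page) would give a different ranking from the same set of matches.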

Not even half the Web

Even with all of this software and sophistication, the best search engines cover only 40-50% of the Web.

And they miss much else on the Internet.

Bots hit and miss

Bots miss:

XML pages and PDF files

Dynamically created HTML pages

Frames-based pages

New pages or recently updated text

Some say the Invisible Web is 500 times larger than the visible Web.
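
One way to picture why some of the items above go unindexed is a bot that simply skips certain addresses. The rule below is an assumption made for illustration (skip addresses with a query string, which usually signals a dynamically created page, and skip a couple of file types); `bot_would_index` and `SKIPPED_EXTENSIONS` are hypothetical names, and real bots differ.

```python
from urllib.parse import urlparse

# Assumed file types that this toy bot does not read.
SKIPPED_EXTENSIONS = (".pdf", ".xml")

def bot_would_index(url):
    """Return True if this toy bot would read and index the address."""
    parsed = urlparse(url)
    if parsed.query:
        # A query string usually means the page is created on request.
        return False
    return not parsed.path.lower().endswith(SKIPPED_EXTENSIONS)
```

Under these assumptions, a plain page such as `http://example.com/index.html` is indexed, while `http://example.com/report.pdf` and `http://example.com/search?q=bots` are missed.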

Subject Directories

A subject directory is also a database of web sites and references.

But a subject directory is organized not by keywords but by category or subject.

Yahoo!

Yahoo! is the most popular subject directory.

www.about.com takes the idea a step further with subject guides for selected topics.

Subjects are organized by people.

Information is selected, organized, and cataloged by a person, not software.

You can usually be more confident that the search results will make sense.

You get an index of sites.

Subject directories rarely provide you with ranked web sites.

Instead, you will get a broad index related to your topic, divided further by subheadings.
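
The difference from a keyword index can be sketched as a nested set of categories: choosing a subject returns its subheadings and the sites filed under each, with no ranking involved. The `DIRECTORY` contents, the helper names, and the example.com addresses below are all hypothetical.

```python
# Hypothetical hand-built directory: category -> subheading -> sites.
DIRECTORY = {
    "Science": {
        "Astronomy": ["http://example.com/planets", "http://example.com/stars"],
        "Biology": ["http://example.com/cells"],
    },
    "Arts": {
        "Music": ["http://example.com/jazz"],
    },
}

def browse(directory, category):
    """Return the subheadings (and their sites) filed under one category."""
    return directory.get(category, {})

def format_listing(category, subheadings):
    """Lay out a category as an indented index of subheadings and sites."""
    lines = [category]
    for subheading, sites in subheadings.items():
        lines.append("  " + subheading)
        for site in sites:
            lines.append("    " + site)
    return "\n".join(lines)

listing = format_listing("Science", browse(DIRECTORY, "Science"))
```

Printing `listing` gives the category name followed by each subheading and its sites, in the order a person filed them.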

Use for early searching.

Use a subject directory early in your search process to learn about your subject.

You will get fewer links, but of higher quality.

When you have more specific questions, you should use a search engine.