How to Find Information on the Internet - Part IV: Getting the Most out of Search Engines

Continuing our discussion on how to find information on the Internet, we look at ways to make search results more fruitful.

At the heart of any search endeavour, no matter what kind of search tool you are using, there are three areas that can significantly affect your search results:

  1. Content of search engine
  2. Search logic or algorithm
  3. Presentation of search result

Content of search engine

A search engine collects information for its database in three ways: by accepting listings submitted by websites that want exposure, by sending out its own spiders (please see earlier discussion), or by simply using the databases of other search engines (as meta search engines do). There are two issues in this process that you, as an information searcher, should be aware of:

  1. Focus of search engine
  2. Degree of information collection

There are thousands of search engines, and each has a focus area. A few big ones like Yahoo!, Alta Vista or Google are universal: they accept information on any subject and from any geographical area, so long as the website satisfies their respective editorial policies. Most others, however, are selective about content. For example, country-specific search engines accept webpages only from or about the country concerned, and subject-specific search engines do not accept webpages on unrelated subjects. Even universal search engines like Yahoo! and MSN have country-specific versions (e.g. Yahoo! India).

So, if you are looking for information on Australia, look for Australia-specific search engines.

There are many sources on the Net that compile information on search engines. Following are a few for your convenience:

Degree of information collection

Though the actual workings of spiders are a closely guarded secret in many cases, it is generally assumed that they start with a historical list of links, such as server lists and lists of the most popular or best sites, and follow the links on these pages to find more links to add to the database. A spider could send back just the title and URL of each page it visits, or parse just some HTML tags, or send back the entire text of each page. The coverage and depth of indexing can have a bearing on the quality of your search results.
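The link-following behaviour described above can be sketched in a few lines of Python. This is only an illustration of the generally assumed approach, not the method of any real search engine: the `crawl` function, its `fetch` parameter and the toy pages are all hypothetical, and an in-memory dictionary stands in for real web requests. The sketch records only the title and URL of each page, the lightest degree of indexing mentioned above.

```python
from collections import deque
from html.parser import HTMLParser


class TitleAndLinkParser(HTMLParser):
    """Collects the page title and the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl: visit the seeds, then follow links found on each page.

    `fetch` is a hypothetical callable that returns the HTML for a URL,
    or None if the page cannot be retrieved.
    Returns a dict mapping each visited URL to its title.
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    database = {}
    while queue and len(database) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = TitleAndLinkParser()
        parser.feed(html)
        database[url] = parser.title.strip()
        for link in parser.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return database


# A tiny in-memory "web" (illustrative data only, not real sites).
TOY_WEB = {
    "http://a.example/": '<title>Page A</title><a href="http://b.example/">B</a>',
    "http://b.example/": '<title>Page B</title><a href="http://a.example/">A</a>'
                         '<a href="http://c.example/">C</a>',
    "http://c.example/": "<title>Page C</title>",
}

index = crawl(["http://a.example/"], TOY_WEB.get)
```

Starting from a single seed page, the sketch reaches all three toy pages by following links. Page C is never listed as a seed, which shows why a page that no other page links to may be missing from a spider-built database.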

Many search engines use 'fields' to store information collected from various parts of a webpage. The title, the URL, image tags and hypertext links are common fields on a Web page. Field searching allows the searcher to designate where a specific search term must appear. Compared with searching for words anywhere on a Web page, field-specific searching can considerably reduce unwanted or junk information in the search results.
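To make the idea of fields concrete, here is a minimal Python sketch that splits one page into separate fields rather than storing it as a single block of text. The field names (`title`, `url`, `text`, `link`, `image`), the `extract_fields` helper and the sample page are assumptions chosen for illustration; real search engines may define their fields differently.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Sorts the parts of one page into separate fields."""

    def __init__(self):
        super().__init__()
        self.links = []    # href of each hypertext link
        self.images = []   # src of each image tag
        self._title_parts = []
        self._text_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text goes to the title field; everything else is page text.
        chunk = data.strip()
        if chunk:
            target = self._title_parts if self._in_title else self._text_parts
            target.append(chunk)


def extract_fields(url, html):
    """Return one database record with a separate entry per field."""
    parser = FieldExtractor()
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "title": " ".join(parser._title_parts),
        "text": " ".join(parser._text_parts),
        "link": parser.links,
        "image": parser.images,
    }


# Illustrative page and URL, not a real site.
page = ('<html><head><title>Export Guide</title></head>'
        '<body><p>Tea exporters list</p>'
        '<a href="http://example.com/tea">tea</a>'
        '<img src="logo.gif"></body></html>')
record = extract_fields("http://example.com/export/index.html", page)
```

Because each part of the page lands in its own field, a later search can be restricted to, say, the title alone instead of matching words anywhere on the page.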

For example, in Alta Vista, the search

text:infobanc

Finds pages that contain the specified text (i.e. infobanc) in any part of the page other than an image tag, link, or URL.

title:'The Great Indian Bazaar' 

Finds pages that contain the specified phrase 'The Great Indian Bazaar' in the page title (which appears in the title bar of most browsers).


url:export

Finds pages with a specific word or phrase in the URL. In this example, url:export will find all pages on all servers that have the word export anywhere in the host name, path, or filename.
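The field searches described above can be imitated with a short Python sketch. It is a simplified illustration, not Alta Vista's actual implementation: the `parse_query` and `search` functions, the three field names and the two sample records are hypothetical, matching is plain case-insensitive substring matching, and a query without a field prefix is assumed to search every field.

```python
FIELDS = ("title", "url", "text")


def parse_query(query):
    """Split 'field:term' into (field, term); field is None for a bare term."""
    prefix, sep, rest = query.partition(":")
    if sep and prefix in FIELDS:
        return prefix, rest.strip().strip("'\"")
    return None, query.strip().strip("'\"")


def search(records, query):
    """Return the URLs of records whose chosen field contains the term."""
    field, term = parse_query(query)
    term = term.lower()
    # A bare term is checked against every field (an assumption of this sketch).
    fields = FIELDS if field is None else (field,)
    return [r["url"] for r in records
            if any(term in r[f].lower() for f in fields)]


# Illustrative records only; the URLs are placeholders, not real pages.
RECORDS = [
    {"url": "http://www.example.com/export/tea.html",
     "title": "Tea Traders",
     "text": "Directory of tea suppliers"},
    {"url": "http://www.example.org/",
     "title": "The Great Indian Bazaar",
     "text": "Trade leads from infobanc"},
]

by_url = search(RECORDS, "url:export")
by_title = search(RECORDS, "title:'The Great Indian Bazaar'")
by_text = search(RECORDS, "text:infobanc")
```

In this sketch the url: query matches only the first record, while the title: and text: queries match only the second, which is how a field restriction narrows a result list.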

More search tips in coming issues


Source: FAIDA - Newsletter on Business Opportunities from India and Abroad, Vol. 3, Issue 11, July 4, 2002

Author : Dr. Amit K. Chatterjee
(Amit worked in blue-chip Indian companies and MNCs for 15 years in various capacities, such as Research and Information Analysis, Market Development, MIS and R&D Information Systems, before starting his e-commerce venture in 1997. The views expressed in this column are his own. He may be reached at )

© All Rights Reserved. Limited permission is granted to publish this article in a web-site or printed in a journal/ newspaper/ magazine provided the publisher takes prior permission from author, do not make any change in the article (i.e. keep it exactly same as displayed above) and cite the Source of this article as The Great Indian Bazaar with a link to this page.