The irony of Internet search is that there is never a paucity of information, only an overdose of it. With databases that can keep the entire Web at their fingertips, search engines can almost always retrieve relevant pages; the challenge lies in separating the wheat from the chaff and keeping out unwanted material. Most engines find more sites for a typical search query than you could ever wade through, so finding the relevant pages in the results looks more like the proverbial needle-in-a-haystack situation.
We discussed Boolean search in the last issue. It is a great tool in terms of simplicity and speed, but it cannot differentiate search expressions that use the same keywords in a different order (and hence have a different meaning). So the search expressions 'Dog bites man' and 'Man bites dog' retrieve virtually the same results (unless you search for the exact phrase).
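To see why word order is lost, here is a minimal Python sketch. It assumes a Boolean AND search simply checks that every keyword is present on a page, and the function names and sample pages are made up for illustration; real engines are far more elaborate.

```python
def boolean_and_match(query, documents):
    """Return documents that contain every query keyword, in any order."""
    keywords = set(query.lower().split())
    return [doc for doc in documents if keywords <= set(doc.lower().split())]


def exact_phrase_match(query, documents):
    """Return documents that contain the query words side by side, in order."""
    words = query.lower().split()
    n = len(words)
    hits = []
    for doc in documents:
        tokens = doc.lower().split()
        # Slide a window of n words over the page and compare it to the phrase.
        if any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1)):
            hits.append(doc)
    return hits


# Made-up sample pages, for illustration only.
documents = [
    "Dog bites man in city park",
    "Man bites dog to win a bet",
    "Cat sleeps all day",
]

# Both word orders return the same two pages under Boolean AND.
print(boolean_and_match("dog bites man", documents))
print(boolean_and_match("man bites dog", documents))

# Only the exact-phrase search tells the two expressions apart.
print(exact_phrase_match("man bites dog", documents))
```

Because the query is reduced to a set of keywords, the order in which you type them makes no difference to the Boolean match.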
Search engines are aware of this problem and have tried to solve it in different ways. Directory-type search engines display search results in alphabetical order. But they are extremely selective, so the results seldom go beyond one or two pages.
Spider-based search engines have no such luck, so they employ what is called a 'relevance score' to sort search results.
A relevance score is a measure used to bring the most relevant pages to the top of a search result. Many search engines display the relevance score of each retrieved page.
Relevance scores reflect the number of times a search term appears, where it appears (e.g. in the title, in the meta tags, or towards the beginning of the document), whether all the search terms are near each other, and many other relevance parameters. Each parameter carries a different weight. The pages are sorted by their final relevance score.
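A weighted score of this kind can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the weights, the "first 10 words" cutoff for an early appearance, the proximity window, and the page layout (a dict with title, meta and body) are invented, since each engine keeps its actual formula to itself.

```python
# Hypothetical weights; real engines use their own, undisclosed values.
WEIGHTS = {"frequency": 1.0, "title": 5.0, "meta": 3.0, "early": 2.0, "proximity": 4.0}


def relevance_score(query, page, weights=WEIGHTS):
    """Combine a few relevance parameters into one weighted score."""
    terms = query.lower().split()
    body = page["body"].lower().split()
    title = page["title"].lower().split()
    meta = page["meta"].lower().split()

    # Positions of each search term in the body text.
    positions = {t: [i for i, w in enumerate(body) if w == t] for t in terms}

    frequency = sum(len(p) for p in positions.values())  # how often terms appear
    in_title = sum(t in title for t in terms)             # terms found in the title
    in_meta = sum(t in meta for t in terms)               # terms found in the meta tags
    early = sum(1 for p in positions.values() if p and p[0] < 10)  # first seen early on

    # Proximity: 1 if some short window of the body contains every term.
    window = len(terms) + 2
    near = any(set(terms) <= set(body[i:i + window]) for i in range(len(body)))
    proximity = 1 if near else 0

    return (weights["frequency"] * frequency
            + weights["title"] * in_title
            + weights["meta"] * in_meta
            + weights["early"] * early
            + weights["proximity"] * proximity)


# Made-up pages, for illustration only.
page_a = {"title": "Dog training tips", "meta": "dog training",
          "body": "Training a dog takes patience and daily dog walks"}
page_b = {"title": "City news", "meta": "news",
          "body": "A man reported that a dog was seen near the training ground"}

# Sort the pages by final relevance score, highest first.
ranked = sorted([page_a, page_b],
                key=lambda p: relevance_score("dog training", p),
                reverse=True)
for page in ranked:
    print(page["title"], relevance_score("dog training", page))
```

Both pages mention the search terms, but the page that has them in its title and meta tags, and close together in the text, ends up well ahead.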
Since each search engine has its own system for calculating relevance scores, you get different results from different search engines even when the search expression is the same.
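The effect of different weighting systems can be shown with a small Python sketch. The two "engines", their weights and the per-page parameter values below are all invented for illustration and do not describe any real search engine.

```python
# Hypothetical parameter values for two pages matching the same search expression.
PAGES = {
    "page_a": {"frequency": 12, "title": 0, "proximity": 0},
    "page_b": {"frequency": 2, "title": 2, "proximity": 1},
}

# Two made-up engines that weight the same parameters differently.
ENGINE_X = {"frequency": 1.0, "title": 1.0, "proximity": 1.0}
ENGINE_Y = {"frequency": 0.5, "title": 5.0, "proximity": 4.0}


def rank(pages, weights):
    """Return page names sorted by weighted relevance score, highest first."""
    def score(name):
        return sum(weights[param] * value for param, value in pages[name].items())
    return sorted(pages, key=score, reverse=True)


print(rank(PAGES, ENGINE_X))  # this engine favours sheer keyword frequency
print(rank(PAGES, ENGINE_Y))  # this engine favours title hits and proximity
```

The pages and the search expression are identical in both runs; only the weights differ, and that alone is enough to reverse the order.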
- Newsletter on Business Opportunities from India and Abroad
Vol: 3, Issue 13
July 18, 2002