It takes nine months for us to grow inside the womb and apparently it takes nine months to really analyze how search engines crawl and get results.
Every week for the last nine months I have done a search for “Auren Hoffman” in eight major search engines (Google, Lycos, All the Web, Teoma, Gigablast, Wisenut, Yahoo, and A9). I recorded only the raw number of results, as my objective was to understand the reach of each search engine rather than its accuracy. Since, as far as I know, I’m the only “Auren Hoffman”, the results are fairly finite and easy to define. (A chart and the raw data are below.)
This experiment was born of my fascination with other people’s fascination about their “Google Number” (see What Is Your Google Number). When I started, my Google Number was only 794. It climbed to 2880 in March. Today, it is inexplicably only 2220, which means that either:
(a) about six hundred pages with my name on them have dropped off the face of the earth in the last five months
(b) Google changed the way it crawls and stopped caching certain sites
(c) Google misrepresents its raw search results (we’ll get to that later)
1. Fluctuation in weekly results
Some search engines have results that fluctuate widely every week and others are very consistent (see graph below). Google fluctuates widely each week even though it is obvious (from just looking at the search results) that Google spiders new pages very quickly. Search engines like Gigablast and Wisenut fluctuate very little, and they also seem to rarely add newly crawled pages to their indexes.
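If you want to put a number on "fluctuates widely" for your own weekly counts, one simple option is the average absolute week-over-week change, expressed as a fraction of the previous week's count. Here is a minimal Python sketch of that idea. The function name and both sample series are made up for illustration; they are not the data I recorded.

```python
from statistics import mean


def weekly_fluctuation(counts):
    """Mean absolute week-over-week change, as a fraction of the prior week's count."""
    changes = [
        abs(curr - prev) / prev
        for prev, curr in zip(counts, counts[1:])
        if prev > 0  # skip weeks with zero results to avoid dividing by zero
    ]
    # With fewer than two usable weeks there is nothing to compare.
    return mean(changes) if changes else 0.0


# Hypothetical series, invented for illustration only (not the recorded data).
jumpy_engine = [1000, 1500, 900, 1800]
steady_engine = [100, 100, 101, 101]

print(f"jumpy:  {weekly_fluctuation(jumpy_engine):.3f}")
print(f"steady: {weekly_fluctuation(steady_engine):.3f}")
```

A higher value means the engine's reported count swings more from one week to the next. It says nothing about whether the swings come from real index changes or from the engine estimating its totals.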
2. Numbers that end in zero
I noticed that for results above 1000, all Google and Yahoo results end in a zero. Maybe they are rounding to the nearest ten or maybe they are just making an approximation (which might be the reason that results vary so widely). If you follow every result, they both seem to overestimate the number of pages.
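The "ends in a zero" observation is easy to check mechanically. The sketch below tests whether every count above a threshold is a multiple of ten, using only the three Google numbers quoted earlier in this post. The function name and its parameters are my own invention for illustration.

```python
def looks_rounded(counts, threshold=1000, base=10):
    """Return True if every count above `threshold` is a multiple of `base`."""
    large = [c for c in counts if c > threshold]
    # Require at least one large count so an empty list does not pass trivially.
    return bool(large) and all(c % base == 0 for c in large)


# The three Google counts quoted in this post.
google_counts = [794, 2880, 2220]

print(looks_rounded(google_counts))            # multiples of 10?
print(looks_rounded(google_counts, base=100))  # multiples of 100?
```

With only two counts above 1000 this proves nothing on its own, since roughly one exact count in ten would end in zero anyway. Over dozens of weekly readings, though, an unbroken streak of zeros is very unlikely to be a coincidence.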
3. There is a huge discrepancy between the search engines
Even if the results from Google and Yahoo are a bit inflated, they are still vastly higher than the other engines (though All the Web indexes a good number of pages too). That means that, judged purely on who indexes the most pages, there is a big difference between the engines.
And indexing the most pages is very important. Now everyone always seems to talk about relevance. And relevance is very important. But so is mass. Especially when you are looking for something more arcane.
What is your Google number?
|Lycos||All the Web||Teoma||Gigablast||Wisenut||Yahoo||A9|