Michael Martinez over at SEO-Theory.com recently posted a thoughtful article that considers the rate of expansion of the web and questions whether search engine technology has kept pace.
What’s refreshing about this post is how it differs from the typical SEO blog entry. So often, folks in the SEO industry start a blog in an attempt to create and nurture a personal brand. The unfortunate reality, though, is that there’s only so much SEO content to go around. Much of what gets written in the pursuit of SEO Identity is either rehashed SEO basics or a variation on The Ultimate List theme. None of this is very interesting, and in the long run it is counterproductive to the desired goal.
I figure Mr. Martinez is after the same thing with his blog. His posts, however, at least make me think, and they offer a critical perspective on the search industry. In his latest, there are a few points I agree with and a few I question.
Content Inclusion
In examining the completeness of a search engine’s index, Mr. Martinez states that, in order to deal with limited capacity, Google has implemented a “Web Apartheid.” That is, not all sites get indexed, and Google determines which of the (relatively) fortunate few make it in.
In Google’s defense it should be noted that a huge number of spam sites have been dropped out of Google’s index, but when all is said and done Web spam is a legitimate part of the World Wide Web because neither Google nor any other search engine has the authority to determine what is and is not part of the Web.
This argument falls apart a bit when you consider that all search engines attempt to provide the searcher with “relevant” results. Whether or not you agree with how an engine determines relevancy, this implies a qualitative process. If we allow Google to determine which sites are relevant for a given query, we must also allow it to determine which sites are not relevant to any query and should thus be excluded.
Link Building
As a result of this Apartheid, Mr. Martinez states:
For the search optimization community, then, the primary challenge today is the same as it was ten years ago: we have to ensure that our protected content is included in as many search indexes as possible… The more inbound links a page has, the more likely the page will be crawled and indexed. Simply having a page crawled and indexed is the SEO’s first priority. Getting the page to rank is the SEO’s second priority.
The implication here is that significant link building is required in order to get your content indexed. With the exception of truly large sites (millions of pages), my experience has shown that getting content indexed is fairly straightforward. I’ve managed a number of content-driven sites, all of which had new pages indexed immediately without any external inbound links. This includes newly created sites, where all it took to get pages indexed was the submission of a sitemap. Again, for significantly large sites I would expect the engines to take their time and index slowly as you prove your worth. I would be interested to hear an account from, say, an average blog owner who must go out and procure new inbound links before new posts get indexed.
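For readers who have not submitted a sitemap before, here is a minimal sketch of what one contains, assuming the standard sitemaps.org XML format. The build_sitemap helper and the example.com URLs are hypothetical and purely illustrative; they are not taken from any of the sites mentioned above.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing the given page URLs.

    Hypothetical helper for illustration only.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs, not real pages.
print(build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/new-post",
]))
```

The resulting file is what you would hand to the engines. Whether and how quickly the listed pages are actually indexed is still up to each engine.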
Link Counting
I haven’t found too many instances of people who get paid to perform search engine optimization for their clients openly criticizing the link-based PageRank algorithm. I suppose it’s a case of not wanting to bite the hand that feeds, but Mr. Martinez bucks the trend and comes out against link counting, and on this point we agree completely. Mr. Martinez introduces the issue as a
… gross error of judgment that has yet to be acknowledged by Google.
He further provides excellent background on the history of link-based PageRank.
Larry Page and Sergey Brin tested their PageRank hypothesis on the Stanford University Web site, where pages were not embedded with links designed to assist with crawling or commercial promotional links. Just because they were able to show that Stanford’s probable most important pages were better linked than other pages did not mean their model was relevant to the real World Wide Web. Google’s founders and investors failed to reconcile the wrong assumptions with reality.
This is evidence that the SEO industry is operating from a faulty premise. Mr. Martinez reiterates his point that aggressive link building was originally an attempt simply to get pages indexed and thus:
The valuative [sic] citation model that Page and Brin believed existed in fact never existed at all, except possibly on academic and government Web sites.
And finally:
The World Wide Web cannot be accurately valued through its linking structure because the linking structure was never designed to be a valuation process.
Excellent points, all well made. Not only is the link-based model built upon an unsound foundation, but the structure of our relevancy ecosystem is also weakened each day as SEOs everywhere continue to exploit this flaw to influence natural search rankings for their clients.
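To make the flaw concrete, here is a rough sketch of link counting in the PageRank style. It is a simplified textbook version of the published idea, not Google's production algorithm. The pagerank function, the page names, and the toy link graphs are all assumptions made up for illustration, and 0.85 is simply the commonly cited damping value.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    `links` maps each page name to the list of pages it links to.
    Illustrative sketch only, not any search engine's real code.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical three-page site: "about" and "client" are linked identically.
natural = {"home": ["about", "client"], "about": ["home"], "client": ["home"]}

# The same site after three made-up link-farm pages point at "client".
farmed = dict(natural, farm1=["client"], farm2=["client"], farm3=["client"])

for name, graph in (("natural", natural), ("farmed", farmed)):
    ranks = pagerank(graph)
    print(name, round(ranks["about"], 4), round(ranks["client"], 4))
```

In the natural graph the two pages score the same. In the farmed graph the client page pulls ahead of its twin even though nothing about its content changed, which is exactly the kind of influence described above.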
There’s much more to Mr. Martinez’s post, and I recommend you read it in its entirety. Unlike much of the line-toeing that goes on in the SEO industry, this entry is notable for its thoughtfulness and its consideration of how the status quo simply isn’t sufficient. The search optimization industry clearly needs to evolve, and it won’t unless its practitioners ask the hard questions, both of themselves and of the search engines.