Search Engine History
SEO is simply the means to an end: placing a website on the first page of Google and Yahoo, on the left-hand, or “organic,” side of the screen. All other search engines are negligible, including MSN/Live and Ask.com (formerly Ask Jeeves). Google currently controls almost 70% of the U.S. market, so SEO must be targeted specifically at Google.
Google’s best-known search engineer is a man named Matt Cutts, who heads its webspam team and has a large say in how Google’s algorithms evaluate sites. No one outside the company really knows exactly how Google crawls and evaluates sites, which is part of why its stock is trading at upwards of $500 per share. However, because Google sees itself as a truly noble company working for the good of mankind against the evil Microsoft, Matt Cutts makes a lot of the criteria that he uses public. He even has a blog:
http://www.mattcutts.com/blog
In 1980, physicist Tim Berners-Lee, then a contractor at the European Organization for Nuclear Research (CERN), proposed and prototyped ENQUIRE, a system for CERN researchers to use and share documents. In 1989, Berners-Lee and CERN data systems engineer Robert Cailliau each submitted separate proposals for an Internet-based hypertext system providing similar functionality. The following year, they collaborated on a joint proposal, the WorldWideWeb (W3).
The terms World Wide Web and internet should not be confused. Before there was a World Wide Web, there was an internet. The internet grew out of networking work funded by the Defense Advanced Research Projects Agency in the late 60s and early 70s. The World Wide Web is simply a collection of public websites on the internet for the purpose of sharing information, which is very different from the internet’s early use by the US military and its researchers.
The first publicly available description of HTML was a document called HTML Tags, first mentioned on the Internet by Berners-Lee in late 1991. It describes 22 elements comprising the initial, relatively simple design of HTML. Thirteen of these elements still exist in HTML today. In 1994, with the help of MIT and the European Commission, Berners-Lee left CERN to found the World Wide Web Consortium and take on his World Wide Web project full time.
Around this time, all websites were written in HTML, a basic language created by Tim Berners-Lee. HTML, an initialism of HyperText Markup Language, is the predominant markup language for web pages. It provides a means to describe the structure of text-based information in a document, by denoting certain text as links, headings, paragraphs, lists, and so on, and to supplement that text with interactive forms, embedded images, and other objects. HTML is written in the form of tags, surrounded by angle brackets. HTML can also describe, to some degree, the appearance and semantics of a document, and can include embedded scripting code (such as JavaScript) and style sheets (CSS), which affect the behavior and presentation of pages in Web browsers and other HTML processors. By convention, HTML files use the file extension .html or .htm.
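To make the idea of tags concrete, here is a minimal sketch in Python that reads a small HTML document and lists the tags and link targets it finds. The sample page and the names TagCollector and describe_structure are invented for this illustration; only the standard library's html.parser module is used.

```python
from html.parser import HTMLParser

# Hypothetical sample page, invented for this illustration.
SAMPLE_PAGE = """<html>
<head><title>Example page</title></head>
<body>
<h1>Welcome</h1>
<p>Read the <a href="about.html">about page</a>.</p>
<ul><li>First</li><li>Second</li></ul>
</body>
</html>"""


class TagCollector(HTMLParser):
    """Records every opening tag and every link target in a document."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called once for each opening tag such as <h1> or <a href="...">.
        self.tags.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def describe_structure(html_text):
    """Return (list of opening tags, list of link targets) for html_text."""
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.tags, collector.links


tags, links = describe_structure(SAMPLE_PAGE)
print(tags)
print(links)
```

The tags carry the structure (headings, paragraphs, lists, links) while the text between them carries the content, which is exactly the separation a search engine relies on when it reads a page.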
SEO dates back to the early 90s, when the first search engines were being developed. It is little known that the first effective search engine was based on software that could read websites to the blind, the kind of software later addressed by Section 508 of the U.S. Rehabilitation Act and the World Wide Web Consortium’s content accessibility guidelines. The programs looked at specific lines of code (website language) that did not affect the look of sites. Before this, website searches could only look at the titles and file names of websites and the text within them. The problem was that this didn’t necessarily tell the searcher what the site was about. As more and more sightless people began using the internet, more companies began inserting this code into their HTML tags.
In 1994 the first real search “engine,” WebCrawler, went live. Before it, users could type in a keyword and search the titles of websites, much like the desktop search in Windows 3.1. A keyword is not necessarily one word; it can be an entire phrase. WebCrawler was called a search engine because it could do more than search titles and file names: it would actually open the sites and look at them. This is where the term “crawling” came from. If the World Wide Web is a series of websites on the internet, this search engine, or spider, could crawl them. WebCrawler also coined the term metasearch. They called the extra lines of code that described the website meta codes. Websites today may still include descriptive code of this kind, whether it is a meta tag (such as meta keywords or a meta description), an alt attribute, or an h1 heading. They all describe elements in the site. WebCrawler was purchased by America Online in 1995.
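The crawling idea can be sketched in a few lines of Python. This is only an illustration, not WebCrawler's actual code: the three pages below are invented and kept in memory so the example needs no network access, and the names PageReader and crawl are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical three-page "web", held in memory so no network is needed.
PAGES = {
    "index.html": '<html><head><title>Home</title>'
                  '<meta name="description" content="A site about fish"></head>'
                  '<body><a href="trout.html">Trout</a> '
                  '<a href="carp.html">Carp</a></body></html>',
    "trout.html": '<html><head><title>Trout</title>'
                  '<meta name="description" content="All about trout"></head>'
                  '<body><a href="index.html">Home</a></body></html>',
    "carp.html": '<html><head><title>Carp</title></head>'
                 '<body><a href="trout.html">See also trout</a></body></html>',
}


class PageReader(HTMLParser):
    """Pulls the meta description and the outgoing links out of one page."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content")
        elif tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])


def crawl(start, pages):
    """Breadth-first crawl: open each page, note its description, follow its links."""
    index = {}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in index or url not in pages:
            continue  # already visited, or the link points nowhere
        reader = PageReader()
        reader.feed(pages[url])
        reader.close()
        index[url] = reader.description
        queue.extend(reader.links)
    return index


index = crawl("index.html", PAGES)
for url, description in index.items():
    print(url, "->", description)
```

Starting from a single page, the spider reaches every page that is linked from somewhere it has already been, and it records the meta description where one exists. A page with no such description (carp.html here) is still found, but the index knows less about it.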
The second major commercial search engine was Lycos.
Some of the big names that came out after WebCrawler were AltaVista, Northern Light, Excite, Infoseek, Inktomi, and HotBot. They all built on WebCrawler’s approach of crawling and indexing pages. Yahoo!, launched in 1994, came out with its own website search. It was based only on its own directories and was by no means an engine, although it became one of the most popular. The technology was similar to the Windows 3.1 file explorer. Some people liked not having to do keyword searches.
Inktomi’s software was incorporated in the widely used HotBot search engine, which displaced AltaVista as the leading web-crawler-based search engine.
HotBot was one of the early Internet search engines and was launched in May 1996 as a service of Wired Magazine. It was one of the first search engines to offer the ability to search within search results. It was launched using a “new links” strategy of marketing, claiming to update its search database more often than its competitors. Though competitive when it was acquired by Lycos in 1998, HotBot has in recent years reduced its scope. Google’s “new” universal search today is nothing more than an addition of HotBot’s search within a search.
While this was going on, a small company called Rankdex decided that there must be a better way to search the web. WebCrawler and Yahoo! were two completely different types of technology, yet both were successful. The company designed a site that would build on WebCrawler’s design but add a unique value to how it displayed websites. It would look at the number of websites that had links pointing to each website in the search, and display the results of the keyword search in order of their incoming links and the value of those links. They called these incoming links backlinks.
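As a rough sketch of ordering results by backlinks, the Python below counts incoming links in a small, invented link graph and sorts a list of keyword matches by that count. It is not Rankdex's actual method: it only counts links and ignores the "value" of each link, and the names count_backlinks and rank_by_backlinks are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html", "a.html"],
}


def count_backlinks(links):
    """Count how many distinct pages link to each page (self-links ignored)."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


def rank_by_backlinks(matches, links):
    """Order keyword matches by backlink count, most-linked first."""
    counts = count_backlinks(links)
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(matches, key=lambda page: (-counts[page], page))


backlinks = count_backlinks(LINKS)
ranked = rank_by_backlinks(["a.html", "b.html", "c.html", "d.html"], LINKS)
print(ranked)
```

In this toy graph c.html has three backlinks, a.html two, b.html one, and d.html none, so that is the order in which the matches are displayed.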
Does this technology sound familiar? The two revolutionary Stanford students who developed a new way to search the web, codenamed BackRub, did nothing more than figure out a way to market Rankdex’s technology. They added one value-added service to Rankdex, PageRank, and kept their home page very clean looking, as they weren’t worried about generating revenue through services like email and news. We wonder where they got the name from…? PageRank is an iterative algorithm that ranks web pages based on the number and PageRank of other web sites and pages that link there, on the premise that good or desirable pages are linked to more than others. Clinton Cimring, a less recognized SEO specialist, was able to crack the algorithm from Rankdex that Google “obtained.” It is as follows:
The PageRank of page A is given as
PR(A) = (1-d) / N + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
where T1 through Tn are the pages that link to page A, d is a damping factor between 0 and 1 (commonly set to 0.85), N is the total number of all pages on the web, PR(X) is the PageRank of page X, and C(X) is the total number of outbound links on page X. In other words, each additional inbound link from a page X increases the PageRank of page A by d × PR(X) / C(X).
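The formula can be tried out on a tiny example. The Python sketch below applies it repeatedly to an invented four-page link graph until the values settle. The graph, the function name pagerank, the damping factor of 0.85, and the fixed number of iterations are all assumptions made for illustration, and the sketch assumes every page has at least one outbound link and only links to pages in the graph.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1-d)/N + d * sum(PR(T)/C(T)) over pages T linking to A.

    Assumes every page in `links` has at least one outbound link and that
    every link target is itself a key of `links`.
    """
    pages = list(links)
    n = len(pages)

    # Work out, for each page, which pages link to it.
    inbound = {page: [] for page in pages}
    for source, targets in links.items():
        for target in targets:
            inbound[target].append(source)

    # Start with every page equal, then apply the formula repeatedly.
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            incoming = sum(ranks[t] / len(links[t]) for t in inbound[page])
            new_ranks[page] = (1 - d) / n + d * incoming
        ranks = new_ranks
    return ranks


# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html", "a.html"],
}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

Two things are worth noticing. A page with no inbound links (d.html) still receives the baseline (1-d)/N, and a link from a page is worth less the more outbound links that page has, because its PageRank is divided by C(T) before being passed on.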
Apart from the name, BackRub, neither the design nor the technology changed until 2007. The name “Google” originated from a misspelling of googol, which is 10^100, the number represented by a 1 followed by one hundred zeros.
Microsoft first launched MSN Search (since re-branded Live Search) in the fall of 1998 using search results from Inktomi. In early 1999 the site began to display listings from Looksmart blended with results from Inktomi except for a short time in 1999 when results from AltaVista were used instead.
By 2000, Yahoo was providing search services based on Inktomi’s search engine.
A patent describing part of Google’s ranking mechanism (PageRank) was granted to Stanford on September 4, 2001, and Rankdex was out of business. Yahoo! acquired Inktomi in 2002 and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! used Google’s search engine to power its results until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.
In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot). The technology still does not work, as there is an error in their algorithm.
It turned out to be a very good thing that Google was developed at Stanford. Within months, every college and university in the US set their home pages to Google simply by request. This was viral marketing on steroids.