Google Tells Us “How Search Works”
As part of its continued effort to be more transparent, Google recently published “How Search Works” on its Inside Search site, telling the story “from algorithms to answers.” The main “How Search Works” page presents the story as a scrolling graphic in three parts: Crawling and Indexing, Algorithms, and Fighting Spam.
Google explains that it navigates the web by following links from page to page and that webmasters can choose whether or not to have their sites crawled. Of course, any site owner who wants content to rank in Google needs to allow the site to be crawled. As Google crawls the web, it sorts pages by their content and other factors and keeps track of all of that data in the index, which Google says is over 100 million gigabytes.
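To make the crawl-and-index idea concrete, here is a minimal Python sketch. It is not Google's implementation. The in-memory `PAGES` "web", the `ROBOTS` rules, the `toybot` user agent, and the function names are all hypothetical, made up for illustration. It follows links breadth-first, skips pages a site owner has disallowed in robots.txt, and builds a simple word-to-pages index.

```python
from collections import deque
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "http://a.example/": ("welcome to site a",
                          ["http://a.example/about", "http://b.example/"]),
    "http://a.example/about": ("about site a", ["http://a.example/"]),
    "http://b.example/": ("site b home", ["http://b.example/private/notes"]),
    "http://b.example/private/notes": ("secret notes", []),
}

# Hypothetical robots.txt contents per host (how an owner opts out of crawling).
ROBOTS = {
    "b.example": ["User-agent: *", "Disallow: /private/"],
}


def allowed(url, agent="toybot"):
    """Return True if the host's robots.txt rules permit fetching this URL."""
    host = urlparse(url).netloc
    if host not in ROBOTS:
        return True  # no rules published, so assume crawling is allowed
    parser = RobotFileParser()
    parser.parse(ROBOTS[host])
    return parser.can_fetch(agent, url)


def crawl(seed):
    """Follow links breadth-first from seed and build a word -> URLs index."""
    index = {}
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url not in PAGES or not allowed(url):
            continue  # unknown page, or the owner disallowed crawling it
        text, links = PAGES[url]
        for word in text.split():
            index.setdefault(word, set()).add(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


if __name__ == "__main__":
    for word, urls in sorted(crawl("http://a.example/").items()):
        print(word, sorted(urls))
```

A real crawler would fetch pages over the network and use a far richer index; the point here is only that disallowed pages never reach the index, so they cannot rank.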
The Google algorithm is made up of programs and formulas written to deliver the best results possible. It attempts to understand what a searcher is looking for using components such as search methods, autocomplete, spelling, synonyms, query understanding, and Google Instant. Relevant documents are then pulled from the index, and results are ranked based on site and page quality, freshness, SafeSearch (which reduces the number of adult web pages shown), user context, translation, and universal search.
Another large component of delivering quality search results is fighting web spam. Spam comes in many forms (hidden text, keyword stuffing, user-generated spam, parked domains, thin content, unnatural links, cloaking, etc.). Google explains that the majority of spam removal is automatic, but some questionable documents are reviewed by hand. Within the story page, Google includes some neat links that show images of sites that have been removed lately, along with the action it has taken on spam over time. When Google takes action on a site, it attempts to notify the website owner.
“How Search Works” is a great overview of the topic and has some great tidbits of information to look at. However, if you are looking for more detailed information, you might be better off checking out the overview page instead, which links to more in-depth content on crawling and indexing, algorithms, and fighting spam. There are also videos by Matt Cutts and further infographics to view.
Perhaps one of the best resources within the Inside Search site is the Search Quality Rating Guidelines document. This document was updated on November 2, 2012, but was only recently made public. It provides an inside look at how “Raters” are supposed to evaluate a website to determine quality. Raters are third-party evaluators who help Google measure the quality of its search results, ranking, and search experience. The Search Quality Rating Guidelines document was created by Google “especially for those individuals who want to understand better how Google thinks about relevance and quality of search results”.
Every so often Google will publish a blog post, video, infographic, etc. that gives us a better idea of how it operates. Since Google is the king of the search space, it’s best to pay attention and learn as much as possible in order to succeed.