Chapter 2: How do search engines work?
After the first chapter, you should have a basic understanding of SEO. This chapter explains what a search engine is and how it operates, so that you can better understand the role search engines play in SEO. By the end of the chapter, you should also have a better grasp of Google’s main ranking factors and algorithms. Want to know how to build a search engine-friendly website? Read on!
What is a search engine?
A search engine is an information retrieval tool for the Internet. It collects online information with specialized computer programs, processes and analyzes it, and then lists the most relevant results for the user’s query. The most popular search engine in the world is Google, with a market share of over 80%, followed by Bing, Yahoo, Baidu and Yandex. Since Google is so widely used, the rest of this chapter looks at how it operates.
Search engine operating mode
Search engines generally have three main functions: they crawl web pages, build an index of what they find, and finally rank and display search results.
Crawling
Crawling is the process of discovering new web pages and content. Each search engine has its own automated crawler (also known as a “web spider”); Google’s is called Googlebot. Googlebot visits websites and crawls new content, such as text, images, and videos, much like a spider moving across its web. It starts from a set of known web pages, reads their content, and collects new URLs from the links on those pages. By repeating these steps over and over, Googlebot discovers a very large amount of online information. Googlebot can also discover new URLs in other ways, such as through XML sitemaps.
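The link-following loop described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how Googlebot is actually implemented. The names crawl, LinkExtractor and FAKE_WEB are hypothetical, and a small in-memory dictionary stands in for real HTTP requests so the example runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl: read known pages, then follow their links."""
    queue = deque(seed_urls)
    seen = set(seed_urls)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)  # fetch returns None when the page is missing
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages


# Hypothetical in-memory "web" used instead of real network requests.
FAKE_WEB = {
    "https://example.com/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "https://example.com/about": '<a href="/">Home</a>',
    "https://example.com/blog": '<a href="/blog/post-1">Post</a>',
    "https://example.com/blog/post-1": "<p>No links here</p>",
}

crawled = crawl(["https://example.com/"], FAKE_WEB.get)
print(sorted(crawled))
```

Starting from a single known URL, the sketch reaches all four pages by following links. The seen set keeps it from visiting the same URL twice, and max_pages caps how far it goes.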
Indexing
When Googlebot finds online information, Google stores it in a huge database, the Google index (built by an indexing system called “Caffeine”), so that it can be provided to users when needed. This process is called “indexing”.
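A common textbook way to store crawled pages for fast lookup is an inverted index, which maps each word to the pages that contain it. The sketch below shows that general technique under simple assumptions. It is not a description of Google’s actual index, and the function names and sample pages are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def lookup(index, query):
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical page texts, as if they had already been crawled.
pages = {
    "https://example.com/seo": "SEO basics for beginners",
    "https://example.com/crawl": "How crawling works for search engines",
    "https://example.com/rank": "Ranking basics and search intent",
}

index = build_index(pages)
print(sorted(lookup(index, "search")))
print(sorted(lookup(index, "search basics")))
```

Because the index is built once ahead of time, answering a query only requires looking up each query word and intersecting the resulting sets, rather than rereading every page.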
Ranking
When users search for information, Google finds relevant pages in the index and orders them according to various factors, such as the user’s location, language and device, the website’s content, and how relevant that content is to the query. Ranking is driven by a set of constantly changing Google algorithms, so the exact weight of each ranking factor cannot be stated. However, in 2016 Google named its three most important ranking factors: content, backlinks and RankBrain. These three factors are closely related to the algorithms described below.
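To make the idea of combining several factors into one ordering concrete, here is a deliberately simplified toy scorer. It is not Google’s formula, and the real weights are not public. The score function, the link_weight value and the sample pages are all assumptions made up for illustration: relevance is counted as query-word occurrences, and each backlink adds a fixed bonus.

```python
def score(page, query_words, link_weight=0.5):
    """Toy score: query-word occurrences plus a bonus per backlink."""
    words = page["text"].lower().split()
    relevance = sum(words.count(w) for w in query_words)
    if relevance == 0:
        return 0.0  # irrelevant pages get no credit for their backlinks
    return relevance + link_weight * page["backlinks"]


def rank(pages, query):
    """Return URLs of relevant pages, highest toy score first."""
    query_words = query.lower().split()
    scored = [(score(page, query_words), page["url"]) for page in pages]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [url for value, url in scored if value > 0]


# Hypothetical pages with made-up text and backlink counts.
pages = [
    {"url": "https://example.com/guide", "text": "seo guide with seo tips", "backlinks": 2},
    {"url": "https://example.com/checklist", "text": "seo checklist", "backlinks": 6},
    {"url": "https://example.com/recipes", "text": "cooking recipes", "backlinks": 50},
]

print(rank(pages, "seo"))
```

In this toy data the checklist page outranks the guide because its backlinks outweigh the guide’s extra keyword match, while the recipes page is excluded despite having the most backlinks, since it is not relevant to the query.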
Keeping up with Google’s algorithms
Want to gain an edge in search and improve your website’s ranking? Understanding Google’s algorithms is essential. They aim to give users the most suitable search results and the best user experience, drawing on machine learning and large-scale data analysis. Below are three important Google algorithms.
Google Panda
Google has always focused on content quality, and it released the Panda algorithm in 2011. Its main purpose is to penalize low-quality websites with poor content, so that users see high-quality, relevant and original search results. In other words, Google penalizes sites with the following features:
Thin Content: The web page does not provide enough useful information for readers
Keyword Stuffing: The page repeats keywords excessively in an attempt to appear more relevant to user queries and so improve its ranking
Duplicate Content: The web content lacks originality and is mostly copied from other websites
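The three features above can be approximated with simple self-checks on your own pages. The sketch below is a rough heuristic only, not the Panda algorithm itself. The function names and the 300-word default threshold are assumptions chosen for illustration, and Google does not publish thresholds of this kind.

```python
import re


def words_of(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def is_thin(text, min_words=300):
    """Thin content check: flag pages below an assumed word count."""
    return len(words_of(text)) < min_words


def keyword_density(text, keyword):
    """Keyword stuffing check: share of words equal to the keyword."""
    words = words_of(text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


def shingles(text, size=3):
    """Set of overlapping word sequences of the given length."""
    words = words_of(text)
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Duplicate content check: Jaccard similarity of word shingles."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


stuffed = "cheap shoes cheap shoes buy cheap shoes"
original = "the quick brown fox jumps over the lazy dog"
copied = "the quick brown fox jumps over the lazy dog"

print(is_thin(stuffed))
print(round(keyword_density(stuffed, "cheap"), 2))
print(similarity(original, copied))
```

A density that is unusually high, or a similarity close to 1.0 against another site’s page, is a signal that the page deserves a manual review. The numbers on their own do not prove that a page would be penalized.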
Google Penguin
In the early days, once people discovered that backlinks affected page ranking, they tried to build large numbers of unnatural backlinks to improve their rankings. Unnatural links are artificial, fake or manipulated links that point to one’s own web pages. In response, Google released the Penguin algorithm in 2012 to assess the quality of each website’s backlinks. Google makes it clear that manipulating page rankings with unnatural backlinks is black hat SEO, violates Google’s Webmaster Guidelines, and may result in a penalty.
RankBrain
RankBrain is an artificial intelligence (AI) algorithm that Google introduced in 2015. It is regarded as part of the Google Hummingbird algorithm, which was launched in 2013 to replace the earlier keyword-matching approach with semantic search. Put simply, Google’s search results are no longer based solely on matching the keywords on a page against the user’s query terms; instead, Google tries to provide the most appropriate information based on the user’s search intent. RankBrain uses artificial intelligence to determine the user’s search intent more accurately, analyze the quality of website content, and optimize search results. Google later named RankBrain one of its most important ranking factors.
