Posted by BritneyMuller
It's been a few months since we last shared our work-in-progress rewrite of the Beginner's Guide to SEO, but after a brief hiatus, we're back to share our draft of Chapter Two with you! This would not have been possible without the help of Kameron Jenkins, who thoughtfully contributed a talent for wordsmithing throughout this piece.
This is your resource, the guide that likely kick-started your interest in and knowledge of SEO, and we want to do right by you. You left incredibly useful comments on our outline and on the draft of Chapter One, and we would be honored if you took the time to let us know what you think of Chapter Two in the comments below.
Chapter 2: How Search Engines Work – Crawling, Indexing, and Ranking
First, show up.
As we mentioned in Chapter 1, search engines are answer machines. They exist to discover, understand, and organize the internet's content in order to offer the most relevant results to the questions searchers are asking.
In order to show up in search results, your content first needs to be visible to search engines. It's arguably the most important piece of the SEO puzzle: if your site can't be found, there's no way you'll ever show up in the SERPs (search engine results pages).
How do search engines work?
Search engines have three primary functions:
- Crawl: Scour the internet for content, looking over the code and content of each URL they find.
- Index: Store and organize the content found during the crawling process. Once a page is in the index, it's in the running to be displayed as a result for relevant queries.
- Rank: Provide the pieces of content that will best answer a searcher's query, ordering the search results from most to least helpful for that query.
What is search engine crawling?
Crawling is the discovery process in which search engines send out a team of robots (known as crawlers or spiders) to find new and updated content. Content can vary (it could be a webpage, an image, a video, a PDF, and so on), but regardless of the format, content is discovered by links.
The bot starts out by fetching a few web pages, then follows the links on those pages to find new URLs. By hopping along this path of links, crawlers are able to find new content and add it to their index, a massive database of discovered URLs, to be retrieved later when a searcher is seeking information that the content on that URL is a good match for.
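To make the link-following idea concrete, here is a minimal sketch of discovery by links. It assumes a tiny, made-up link graph held in memory (the `LINKS` dictionary and the example.com URLs are hypothetical) instead of fetching real pages, and it is in no way how any real search engine's crawler is implemented.

```python
from collections import deque

# Hypothetical link graph: each URL maps to the URLs it links to.
LINKS = {
    "https://example.com/": ["https://example.com/about", "https://example.com/blog"],
    "https://example.com/about": ["https://example.com/"],
    "https://example.com/blog": ["https://example.com/blog/post-1"],
    "https://example.com/blog/post-1": [],
    "https://example.com/orphan": [],  # no other page links here
}


def crawl(seed_urls, links):
    """Breadth-first discovery: start from the seeds and follow links outward."""
    discovered = []
    seen = set(seed_urls)
    queue = deque(seed_urls)
    while queue:
        url = queue.popleft()
        discovered.append(url)
        for target in links.get(url, []):
            if target not in seen:  # only queue URLs we have not met yet
                seen.add(target)
                queue.append(target)
    return discovered


found = crawl(["https://example.com/"], LINKS)
print(found)
```

Notice that the orphan page never shows up in `found`: nothing links to it, so a crawler that only follows links has no path to reach it.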
What is a search engine index?
Search engines process and store the information they find in an index, a huge database of all the content they've discovered and deem good enough to serve up to searchers.
Search engine ranking
When someone performs a search, search engines scour their index for highly relevant content and then order that content in the hope of solving the searcher's query. This ordering of search results by relevance is known as ranking. In general, you can assume that the higher a website is ranked, the more relevant the search engine believes that site is to the query.
It's possible to block search engine crawlers from part or all of your site, or to instruct search engines to avoid storing certain pages in their index. While there can be reasons for doing this, if you want your content to be found by searchers, you first have to make sure it's accessible to crawlers and is indexable. Otherwise, it's as good as invisible.
By the end of this chapter, you'll have the context you need to work with the search engine, rather than against it!
Note: In SEO, not all search engines are equal
Many beginners wonder about the relative importance of particular search engines. Most people know that Google has the largest market share, but how important is it to optimize for Bing, Yahoo, and others? The truth is that despite the existence of more than 30 major web search engines, the SEO community really only pays attention to Google. Why? The short answer is that Google is where the vast majority of people search the web. If we include Google Images, Google Maps, and YouTube (a Google property), more than 90% of web searches happen on Google, nearly 20 times more than Bing and Yahoo.
Crawling: Can search engines find your site?
As you've just learned, making sure your site gets crawled and indexed is a prerequisite for showing up in the SERPs. First things first: you can check how many and which pages of your website have been indexed by Google using "site:yourdomain.com", an advanced search operator.
Head to Google and type "site:yourdomain.com" into the search bar. This will return the results Google has in its index for the site specified:
The number of results Google displays (see "About __ results" above) isn't exact, but it does give you a solid idea of which pages are indexed on your site and how they currently show up in search results.
For more accurate results, monitor and use the Index Coverage report in Google Search Console. You can sign up for a free Google Search Console account if you don't currently have one. With this tool, you can submit sitemaps for your site and monitor how many submitted pages have actually been added to Google's index, among other things.
If you're not showing up anywhere in the search results, there are a few possible reasons why:
- Your site is brand new and hasn't been crawled yet.
- Your site isn't linked to from any external websites.
- Your site's navigation makes it hard for a robot to crawl it effectively.
- Your site contains some basic code called crawler directives that is blocking search engines.
- Your site has been penalized by Google for spammy tactics.
If your site doesn't have any other sites linking to it, you still might be able to get it indexed by submitting your XML sitemap in Google Search Console or by manually submitting individual URLs to Google. There's no guarantee they'll include a submitted URL in their index, but it's worth a try!
Can search engines see your whole site?
Sometimes a search engine will be able to find parts of your site by crawling, but other pages or sections might be obscured for one reason or another. It's important to make sure that search engines are able to discover all the content you want indexed, and not just your homepage.
Ask yourself this: Can the bot crawl through your website, and not just to it?
Is your content hidden behind login forms?
If you require users to log in, fill out forms, or answer surveys before accessing certain content, search engines won't see those protected pages. A crawler is definitely not going to log in.
Are you relying on search forms?
Robots cannot use search forms. Some people believe that if they place a search box on their site, search engines will be able to find everything their visitors search for. They won't.
Is text hidden within non-text content?
Non-text media forms (images, video, GIFs, etc.) should not be used to display text that you wish to be indexed. While search engines are getting better at recognizing images, there's no guarantee they will be able to read and understand that text just yet. It's always best to add text within the markup of your web page.
Can search engines follow your site navigation?
Just as a crawler needs to discover your site via links from other sites, it needs a path of links on your own site to guide it from page to page. If you've got a page you want search engines to find but it isn't linked to from any other pages, it's as good as invisible. Many sites make the critical mistake of structuring their navigation in ways that are inaccessible to search engines, hindering their ability to get listed in search results.
Common navigation mistakes that can keep crawlers from seeing all of your site:
- Having a mobile navigation that shows different results than your desktop navigation
- Personalization, or showing unique navigation to a specific type of visitor, which could appear to be cloaking to a search engine crawler
- Forgetting to link to a primary page on your website through your navigation. Remember, links are the paths crawlers follow to new pages!
Information architecture is the practice of organizing and labeling content on a website to improve efficiency and findability for users. The best information architecture is intuitive, meaning that users shouldn't have to think very hard to flow through your website or to find something.
Your site should also have a useful 404 (page not found) page for when a visitor clicks on a dead link or mistypes a URL. The best 404 pages allow users to click back into your site so they don't bounce off just because they tried to access a nonexistent link.
Tell search engines how to crawl your site
In addition to making sure crawlers can reach your most important pages, it's also pertinent to note that you'll have pages on your site you don't want them to find. These might include things like old URLs that have thin content, duplicate URLs (such as sort-and-filter parameters for e-commerce), special promo code pages, staging or test pages, and so on.
Blocking pages from search engines can also help crawlers prioritize your most important pages and maximize your crawl budget (the average number of pages a search engine bot will crawl on your site).
Crawler directives allow you to control what you want Googlebot to crawl and index using a robots.txt file, a meta tag, a sitemap.xml file, or Google Search Console.
Robots.txt files are located in the root directory of websites (e.g., yourdomain.com/robots.txt) and suggest which parts of your site search engines should and shouldn't crawl via specific robots.txt directives. This is a great solution when trying to block search engines from non-private pages on your site.
You wouldn't want to block private or sensitive pages from being crawled here, because the file is easily accessible by users and bots.
- If Googlebot can't find a robots.txt file for a site (40X HTTP status code), it proceeds to crawl the site.
- If Googlebot finds a robots.txt file for a site (20X HTTP status code), it will usually abide by the suggestions and proceed to crawl the site.
- If Googlebot finds neither a 20X nor a 40X HTTP status code (e.g., a 501 server error), it can't determine whether you have a robots.txt file and won't crawl your site.
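The three cases above can be sketched as a small decision function. This is only an illustration of the logic as described in this chapter, not Googlebot's actual implementation: the `crawl_decision` helper, the sample robots.txt body, and the yourdomain.com URLs are all made up for the example, and redirects are not modeled. The parsing itself uses `urllib.robotparser` from Python's standard library.

```python
from urllib.robotparser import RobotFileParser


def crawl_decision(status_code, robots_body, url, user_agent="Googlebot"):
    """Return True if the URL may be crawled, following the three cases above."""
    if 400 <= status_code < 500:
        # No robots.txt file found: crawl the site.
        return True
    if 200 <= status_code < 300:
        # robots.txt found: follow its suggestions.
        parser = RobotFileParser()
        parser.parse(robots_body.splitlines())
        return parser.can_fetch(user_agent, url)
    # Anything else (e.g. a 501 server error): we cannot tell, so do not crawl.
    return False


# Hypothetical robots.txt that keeps all crawlers out of a staging area.
ROBOTS = """User-agent: *
Disallow: /staging/
"""

print(crawl_decision(200, ROBOTS, "https://yourdomain.com/products"))
print(crawl_decision(200, ROBOTS, "https://yourdomain.com/staging/new-design"))
```

Keep in mind that these directives are suggestions: a well-behaved crawler checks them, but nothing in robots.txt technically stops a request.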
The two types of meta directives are the meta robots tag (more commonly used) and the x-robots-tag. Each provides crawlers with more specific instructions on how to crawl and index a URL's content.
The x-robots-tag provides more flexibility and functionality if you want to block search engines at scale, because you can use regular expressions, block non-HTML files, and apply sitewide noindex tags.
These are the best options for blocking more sensitive*/private URLs from search engines.
* For highly sensitive URLs, it is best practice to remove them or to require a secure login to view the pages.
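As a rough illustration of how a crawler might read these two directives, the sketch below checks a page for a noindex instruction in either a meta robots tag or an X-Robots-Tag response header. The `is_indexable` helper and the sample page are hypothetical, and the header lookup is simplified (real HTTP header names are case-insensitive, and real directives can be scoped to specific user agents).

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def is_indexable(html, headers):
    """Return False if either the meta robots tag or the header says noindex."""
    parser = MetaRobotsParser()
    parser.feed(html)
    header_value = headers.get("X-Robots-Tag", "")
    header_directives = {
        d.strip().lower() for d in header_value.split(",") if d.strip()
    }
    return "noindex" not in (parser.directives | header_directives)


page = '<html><head><meta name="robots" content="noindex, nofollow"></head><body>Hi</body></html>'
print(is_indexable(page, {}))
```

The header route is what makes non-HTML files controllable: a PDF has no `<head>` to hold a meta tag, but the server can still send an X-Robots-Tag header with it.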
WordPress tip: In Dashboard > Settings > Reading, make sure the "Search Engine Visibility" box is not checked. Checking it blocks search engines from coming to your site via your robots.txt file!
Avoid these common pitfalls, and you'll have clean, crawlable content that will allow bots easy access to your pages.
A sitemap is just what it sounds like: a list of URLs on your site that crawlers can use to discover and index your content. One of the easiest ways to ensure Google is finding your highest-priority pages is to create a file that meets Google's standards and submit it through Google Search Console. While submitting a sitemap doesn't replace the need for good site navigation, it can certainly help crawlers follow a path to all of your important pages.
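For a sense of what such a file contains, here is a minimal sketch that builds a bare-bones XML sitemap with Python's standard library. The `build_sitemap` helper and the yourdomain.com URLs are placeholders; the namespace is the one defined by the sitemaps.org protocol. A real sitemap may carry extra fields (such as last-modified dates), so treat this as a starting point rather than a complete generator.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    "https://yourdomain.com/",
    "https://yourdomain.com/products",
])
print(sitemap_xml)
```

You would save the output as sitemap.xml on your site and then submit that file's URL in Google Search Console.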
Google Search Console
Some sites (most commonly e-commerce sites) make the same content available on multiple different URLs by appending certain parameters to URLs. If you've ever shopped online, you've likely narrowed down your search via filters. For example, you may search for "shoes" on Amazon, and then refine your search by size, color, and style. Each time you refine, the URL changes slightly. How does Google know which version of the URL to serve to searchers? Google does a pretty good job at figuring out the representative URL on its own, but you can use the URL Parameters feature in Google Search Console to tell Google exactly how you want your pages to be treated.
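To see why filtered URLs pile up, the sketch below collapses several filter variations of one listing page into a single representative URL by dropping the filter parameters. The parameter names in `FILTER_PARAMS` and the shop.example.com URLs are assumptions made for the example; this is a way to audit your own URLs, not a description of how Google picks a representative URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that only refine the same underlying listing.
FILTER_PARAMS = {"size", "color", "style", "sort"}


def representative_url(url):
    """Drop filter parameters so URL variations collapse to one URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in FILTER_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


variants = [
    "https://shop.example.com/search?q=shoes",
    "https://shop.example.com/search?q=shoes&size=9",
    "https://shop.example.com/search?q=shoes&size=9&color=black",
]
unique = {representative_url(u) for u in variants}
print(unique)
```

Three crawlable URLs turn out to be one page of content, which is exactly the kind of duplication that eats into crawl budget.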
Indexing: How do search engines understand and remember your site?
Once you've ensured your site has been crawled, the next order of business is to make sure it can be indexed. That's right: just because your site can be discovered and crawled by a search engine doesn't necessarily mean that it will be stored in its index. In the previous section on crawling, we discussed how search engines discover your web pages. The index is where your discovered pages are stored. After a crawler finds a page, the search engine renders it just like a browser would. In the process of doing so, the search engine analyzes that page's contents. All of that information is stored in its index.
Read on to learn how indexing works and how you can make sure your site makes it into this all-important database.
Can I see how a Googlebot crawler sees my pages?
Yes, the cached version of your page will reflect a snapshot of the last time Googlebot crawled it.
Google crawls and caches web pages at different frequencies. More established, well-known sites, such as https://www.nytimes.com, will be crawled more frequently than the much less famous website for Roger the Mozbot's side hustle, http://www.rogerlovescupcakes.com (if only it were real…)
You can view what your cached version of a page looks like by clicking the drop-down arrow next to the URL in the SERP and choosing "Cached":
You can also view the text-only version of your site to determine whether your important content is being crawled and cached effectively.
Are pages ever removed from the index?
Yes, pages can be removed from the index! Some of the main reasons why a URL might be removed include:
- The URL is returning a "not found" error (4XX) or a server error (5XX). This could be accidental (the page was moved and a 301 redirect was not set up) or intentional (the page was deleted and 404ed in order to get it removed from the index).
- The URL had a noindex meta tag added. This tag can be added by site owners to instruct the search engine to omit the page from its index.
- The URL has been manually penalized for violating the search engine's Webmaster Guidelines and, as a result, was removed from the index.
- The URL has been blocked from crawling by the addition of a password required before visitors can access the page.
If you believe that a page on your website that was previously in Google's index is no longer showing up, you can manually submit the URL to Google by navigating to the "Submit URL" tool in Search Console.
How do search engines rank URLs?
How do search engines ensure that when someone types a query into the search bar, they get relevant results in return? That process is known as ranking, or the ordering of search results from most relevant to least relevant for a particular query.
To determine relevance, search engines use algorithms: a process or formula by which stored information is retrieved and ordered in meaningful ways. These algorithms have gone through many changes over the years in order to improve the quality of search results. Google, for example, makes algorithm adjustments every day. Some of these updates are minor quality tweaks, whereas others are core algorithm updates deployed to tackle a specific issue, like Penguin. Check out our Google Algorithm Change History for a list of both confirmed and unconfirmed Google updates going back to the year 2000.
Why does the algorithm change so often? Is Google just trying to keep us on our toes? While Google doesn't always reveal why they do what they do, we do know that Google's aim when making algorithm adjustments is to improve overall search quality. That's why, in response to algorithm update questions, Google will answer with something along the lines of: "We're making quality updates all the time." Google's Quality Guidelines and Search Quality Rater Guidelines are both very telling in terms of what search engines want.
What do search engines want?
Search engines have always wanted the same thing: to provide useful answers to searchers' questions in the most helpful formats. If that's true, then why does it appear that SEO is different now than in years past?
Think about it in terms of someone learning a new language.
At first, their understanding of the language is very rudimentary: "See Spot Run." Over time, their understanding starts to deepen, and they learn semantics: the meaning behind language and the relationship between words and phrases. Eventually, with enough practice, the student knows the language well enough to understand nuance, and is able to provide answers to even vague or incomplete questions.
When search engines were just beginning to learn our language, it was much easier to game the system by using tricks and tactics that actually go against quality guidelines. Take keyword stuffing, for example. If you wanted to rank for a particular keyword like "funny jokes," you might add the words "funny jokes" a bunch of times onto your page, and make them bold, in hopes of boosting your ranking for that term:
Welcome to funny jokes! We tell the funniest jokes in the world. Funny jokes are fun and crazy. Your funny joke awaits. Sit back and read funny jokes because funny jokes can make you happy and funnier. Some funny favorite funny jokes.
This tactic made for terrible user experiences, and instead of laughing at funny jokes, people were bombarded by annoying, hard-to-read text. It may have worked in the past, but this is never what search engines wanted.
The role links play in SEO
When we talk about links, we could mean two things. Backlinks or "inbound links" are links from other websites that point to your website, while internal links are links on your own site that point to your other pages (on the same site).
Links have historically played a big role in SEO. Very early on, search engines needed help figuring out which URLs were more trustworthy than others to help them determine how to rank search results. Calculating the number of links pointing to any given site helped them do this.
Backlinks work very similarly to real-life word-of-mouth (WOM) referrals. Let's take a hypothetical coffee shop, Jenny's Coffee, as an example:
- Referrals from others = good sign of authority
Example: Many different people have all told you that Jenny's Coffee is the best in town
- Referrals from yourself = biased, so not a good sign of authority
Example: Jenny claims that Jenny's Coffee is the best in town
- Referrals from irrelevant or low-quality sources = not a good sign of authority and could even get you flagged for spam
Example: Jenny paid people who have never visited her coffee shop to tell others how good it is.
- No referrals = unclear authority
Example: Jenny's Coffee might be good, but you've been unable to find anyone who has an opinion, so you can't be sure.
This is why PageRank was created. PageRank (part of Google's core algorithm) is a link analysis algorithm named after one of Google's founders, Larry Page. PageRank estimates the importance of a web page by measuring the quality and quantity of links pointing to it. The assumption is that the more relevant, important, and trustworthy a web page is, the more links it will have earned.
The more natural backlinks you have from high-authority (trusted) websites, the better your odds are of ranking higher within search results.
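To give a feel for how "quality and quantity of links" can turn into a score, here is a sketch of the simplified, textbook form of PageRank computed by repeated iteration. The tiny link graph is invented around the Jenny's Coffee example, the damping factor of 0.85 is a commonly cited default and an assumption here, and Google's production ranking is not public, so this is only an illustration of the idea.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: every page in `links` must appear as a key."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: who links to whom.
LINK_GRAPH = {
    "jennys-coffee": [],
    "food-blog": ["jennys-coffee"],
    "city-guide": ["jennys-coffee", "food-blog"],
    "new-cafe": ["city-guide"],
}

ranks = pagerank(LINK_GRAPH)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph, Jenny's Coffee ends up on top because it earns links from two other pages, one of which is itself linked to, while the new cafe that nobody links to lands at the bottom.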
The role content plays in SEO
Links would be of no use if they didn't direct searchers to something. That something is content! Content is more than just words; it's anything meant to be consumed by searchers: there's video content, image content, and of course, text. If search engines are answer machines, content is the means by which the engines deliver those answers.
Any time someone performs a search, there are thousands of possible results, so how do search engines decide which pages the searcher is going to find valuable? A big part of determining where your page will rank for a given query is how well the content on your page matches the query's intent. In other words, does this page match the words that were searched and help fulfill the task the searcher was trying to accomplish?
Because of this focus on user satisfaction and task accomplishment, there are no strict benchmarks on how long your content should be, how many times it should contain a keyword, or what you put in your header tags. All of those can play a role in how well a page performs in search, but the focus should be on the users who will be reading the content.
Today, with hundreds or even thousands of ranking signals, the top three have stayed fairly consistent: links to your website (which serve as third-party credibility signals), on-page content (quality content that fulfills a searcher's intent), and RankBrain.
What is RankBrain?
RankBrain is the machine learning component of Google's core algorithm. Machine learning is a computer program that continues to improve its predictions over time through new observations and training data. In other words, it's always learning, and because it's always learning, search results should be constantly improving.
For example, if RankBrain notices a lower-ranking URL providing a better result to users than the higher-ranking URLs, you can bet that RankBrain will adjust those results, moving the more relevant result higher and demoting the less relevant pages as a byproduct.
Like most things with the search engine, we don't know exactly what comprises RankBrain, but apparently, neither do the folks at Google.
What does this mean for SEOs?
Because Google will continue leveraging RankBrain to promote the most relevant, helpful content, we need to focus on fulfilling searcher intent more than ever before. Provide the best possible information and experience for searchers who might land on your page, and you've taken a big first step to performing well in a RankBrain world.
Engagement metrics: correlation, causation, or both?
With Google rankings, engagement metrics are most likely part correlation and part causation.
When we say engagement metrics, we mean data that represents how searchers interact with your site from search results. This includes things like:
- Clicks (visits from search)
- Time on page (the amount of time the visitor spent on a page before leaving it)
- Bounce rate (the percentage of all website sessions where users viewed only one page)
- Pogo-sticking (clicking on an organic result and then quickly returning to the SERP to choose another result)
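As a quick worked example of one of these metrics, the sketch below computes bounce rate from a handful of made-up sessions, where each session is simply the list of pages a visitor viewed. The `bounce_rate` helper and the session data are hypothetical; analytics tools define and collect sessions in their own ways.

```python
def bounce_rate(sessions):
    """Percentage of sessions in which the visitor viewed only one page."""
    if not sessions:
        return 0.0
    bounces = sum(1 for pages in sessions if len(pages) == 1)
    return 100.0 * bounces / len(sessions)


# Hypothetical sessions: each inner list is the pages viewed in one visit.
sessions = [
    ["/"],
    ["/", "/blog"],
    ["/blog/post-1"],
    ["/", "/about", "/contact"],
]
rate = bounce_rate(sessions)
print(f"Bounce rate: {rate:.1f}%")
```

Two of the four sessions stop after a single page, so the bounce rate here is 50%.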
Many tests, including Moz's own ranking factor survey, have indicated that engagement metrics correlate with higher ranking, but causation has been hotly debated. Are good engagement metrics just indicative of highly ranked sites? Or are sites ranked highly because they possess good engagement metrics?
What Google has said
While they've never used the term "direct ranking signal," Google has been clear that they absolutely use click data to modify the SERP for particular queries.
"The ranking itself is affected by the click data. If we discover that, for a particular query, 80% of people click on #2 and only 10% click on #1, after a while we figure out that #2 is probably the one people want, so we'll switch it."
Another comment from former Google engineer Edmond Lau corroborates this:
"It's pretty clear that any reasonable search engine would use click data on their own results to feed back into ranking to improve the quality of search results. The actual mechanics of how click data is used is often proprietary, but Google makes it obvious that it uses click data with its patents on systems like rank-adjusted content items."
Because Google needs to maintain and improve search quality, it seems inevitable that engagement metrics are more than correlation, but it would appear that Google falls short of calling engagement metrics a "ranking signal" because those metrics are used to improve search quality, and the rank of individual URLs is just a byproduct of that.
What tests have confirmed
Various tests have confirmed that Google will adjust SERP order in response to searcher engagement:
- Rand Fishkin's 2014 test resulted in a #7 result moving up to the #1 spot after getting around 200 people to click on the URL from the SERP. Interestingly, the ranking improvement seemed to be isolated to the location of the people who visited the link. The rank position spiked in the US, where many participants were located, whereas it remained lower on the page in Google Canada, Google Australia, etc.
- Larry Kim's comparison of top pages and their average dwell time under RankBrain seemed to indicate that the machine learning component of Google's algorithm demotes the rank position of pages that people don't spend as much time on.
- Darren Shaw's testing has shown user behavior's impact on local search and map pack results as well.
Since user engagement metrics are clearly used to adjust the SERPs for quality, and rank position changes as a byproduct, it's safe to say that SEOs should optimize for engagement. Engagement doesn't change the objective quality of your web page, but rather your value to searchers relative to other results for that query. That's why, after no changes to your page or its backlinks, it could decline in rankings if searchers' behaviors indicate they like other pages better.
In terms of ranking web pages, engagement metrics act like a fact-checker. Objective factors such as links and content first rank the page, then engagement metrics help Google adjust if they didn't get it right.
The evolution of search results
Back when search engines lacked a lot of the sophistication they have today, the term "10 blue links" was coined to describe the flat structure of the SERP. Any time a search was performed, Google would return a page with 10 organic results, each in the same format.
In this search landscape, holding the #1 spot was the holy grail of SEO. But then something happened. Google began adding results in new formats on their search result pages, called SERP features. Some of these SERP features include:
- Paid advertisements
- Featured snippets
- People Also Ask boxes
- Local (map) pack
- Knowledge panel
And Google is adding new ones all the time. It even experimented with "zero-result SERPs," a phenomenon where only one result from the Knowledge Graph was displayed on the SERP with no results below it except for an option to "view more results."
The addition of these features caused some initial panic for two main reasons. For one, many of these features caused organic results to be pushed down further on the SERP. Another byproduct is that fewer searchers are clicking on the organic results, since more queries are being answered on the SERP itself.
So why would Google do this? It all goes back to the search experience. User behavior indicates that some queries are better satisfied by different content formats. Notice how the different types of SERP features match the different types of query intents.
Query intent | Possible SERP feature triggered
Informational with one answer | Knowledge Graph / instant answer
We’ll talk more about intent in Chapter 3, but for now, it’s important to know that answers can be delivered to searchers in a wide array of formats, and how you structure your content can impact the format in which it appears in search.
A search engine like Google has its own proprietary index of local business listings, from which it creates local search results.
If you are performing local SEO work for a business that has a physical location customers can visit (ex: dentist) or for a business that travels to visit their customers (ex: plumber), make sure that you claim, verify, and optimize a free Google My Business Listing.
When it comes to localized search results, Google uses three main factors to determine ranking: relevance, distance, and prominence.
Relevance is how well a local business matches what the searcher is looking for. To ensure that the business is doing everything it can to be relevant to searchers, make sure the business' information is thoroughly and accurately filled out.
Google uses your geolocation to better serve you local results. Local search results are extremely sensitive to proximity, which refers to the location of the searcher and/or the location specified in the query (if the searcher included one).
Organic search results are sensitive to a searcher's location, though seldom as pronounced as in local pack results.
With prominence as a factor, Google is looking to reward businesses that are well-known in the real world. In addition to a business’ offline prominence, Google also looks to some online factors to determine local ranking, such as:
The number of Google reviews a local business receives, and the sentiment of those reviews, have a notable impact on its ability to rank in local results.
A "business citation" or "business listing" is a web-based reference to a local business' "NAP" (name, address, phone number) on a localized platform (Yelp, Acxiom, YP, Infogroup, Localeze, etc.).
Local rankings are influenced by the number and consistency of local business citations. Google pulls data from a wide variety of sources in continuously making up its local business index. When Google finds multiple consistent references to a business's name, location, and phone number it strengthens Google's "trust" in the validity of that data. This then leads to Google being able to show the business with a higher degree of confidence. Google also uses information from other sources on the web, such as links and articles.
Check a local business' citation accuracy here.
SEO best practices also apply to local SEO, since Google also considers a website’s position in organic search results when determining local ranking.
In the next chapter, you’ll learn on-page best practices that will help Google and users better understand your content.
[Bonus!] Local engagement
Although not listed by Google as a local ranking determiner, the role of engagement is only going to increase as time goes on. Google continues to enrich local results by incorporating real-world data like popular times to visit and average length of visits…
…and even provides searchers with the ability to ask the business questions!
Undoubtedly now more than ever before, local results are being influenced by real-world data. This interactivity is how searchers interact with and respond to local businesses, rather than purely static (and game-able) information like links and citations.
Since Google wants to deliver the best, most relevant local businesses to searchers, it makes perfect sense for them to use real time engagement metrics to determine quality and relevance.
You don’t have to know the ins and outs of Google’s algorithm (that remains a mystery!), but by now you should have a great baseline knowledge of how the search engine finds, interprets, stores, and ranks content. Armed with that knowledge, let’s learn about choosing the keywords your content will target!