- 1 Identify and Fix Crawl Errors in Google Search Console
- 1.1 What is a Crawler in SEO?
- 1.2 What is Crawling in SEO?
- 1.4 What are Crawl Errors in SEO?
- 1.4 How to Identify Crawl Errors?
- 1.5 How to Fix Crawl Errors?
- 1.5.1 How to Fix “Server Errors (5xx)” in Search Console
- 1.5.2 How to Fix “Redirect Error” in Search Console
- 1.5.3 How to Fix “Submitted URL Blocked by Robots.txt” in Search Console
- 1.5.4 How to Fix “Submitted URL Marked Noindex” in Search Console
- 1.5.5 How to Fix “Submitted URL Seems to be a Soft 404” in Search Console
- 1.5.6 How to Fix “Submitted URL Returns Unauthorized Request” in Search Console
- 1.5.7 How to Fix “Submitted URL not Found (404)” in Search Console
- 1.5.8 How to Fix “Submitted URL has Crawl Issue” in Search Console
- 1.6 Conclusion:
Identify and Fix Crawl Errors in Google Search Console
Imagine this: you have created a blog or website, done all the required on-page SEO, and submitted it to all the popular search engines, especially Google, and after a month you discover that none of the search engines you submitted to are showing your site on their SERP pages.
What will you do?
Of course, you will worry!!!
But along with that, you also have to identify what has happened and how you can fix it.
But how will you find out why this happened?
Let me tell you: whenever search engines are not showing your site on their SERP pages, it means the site has a crawling issue.
A crawling issue occurs when search engine crawlers cannot reach your site because of some obstacle, and that is why the search engines are not showing your site on their SERP pages.
Crawl errors can occur on your site for multiple reasons, which I will discuss in this article along with possible solutions.
So if you want to know what crawl errors are and how to fix them, follow along and I will take you through all the likely causes with their solutions.
But before that, let me clear up some basic terminology about crawling and crawlers so that you don't get confused about what I am talking about here.
So, let’s begin one by one….
What is a Crawler in SEO?
“A crawler is a search engine program designed to crawl (scan) sites and their pages on the internet, collecting data and information for search engine indexing.”
A crawler is also known as a Spider or Bot.
Every search engine has its own crawler program that crawls sites on the internet and collects data for that search engine.
Crawlers are programmed to visit sites that the site administrator has submitted as new or updated.
This is why you should update your site on a regular basis: when the crawler learns about updated content, it crawls it again, which can give your pages a ranking boost on SERP pages.
What is Crawling in SEO?
“Crawling is the process by which search engines scan your site and all of its pages, collecting important data such as targeted keywords, titles, descriptions, internal links, external links, and site structure for indexing and ranking purposes.”
As discussed above, like every search engine, Google has its own crawler program, also known as a Spider or Bot.
When you create your site and submit it to Google, Google sends its crawler to crawl (scan) it and collect data from all of its pages for indexing in its database.
Once the crawler lands on your site, it crawls the complete site from left to right and top to bottom.
During crawling, it follows your internal links to reach all the pages and posts available on your site.
This is why proper internal linking is always recommended: it helps Google crawl all your pages and posts properly and also helps your search engine ranking.
Once the crawler completes crawling, Google indexes your site with all its crawled pages in its database, and when somebody searches for it on Google, it can appear in the results.
What are Crawl Errors in SEO?
“A crawl error is a search engine error that occurs when the crawler tries to access your site for crawling but is prevented from doing so because of some issue.”
There can be multiple causes of this error on your site, which we will discuss further in this article.
How to Identify Crawl Errors?
To identify which kind of crawl error has occurred on your site, log in to your Google Search Console dashboard.
Once you are in the Search Console dashboard, click on the Coverage option on the left side.
When you click on the Coverage option, you will see four different tabs – Error, Valid with warnings, Valid, and Excluded.
- The Error section shows the URLs that have not been indexed because of an error.
- The Valid with warnings section shows URLs that have been indexed but have some minor issue that you should fix.
- The Valid section shows all the URLs that are successfully indexed and have no errors.
- The Excluded section shows all the URLs that have intentionally been kept out of the Google search engine.
Now, to solve the crawl errors, you only have to look at the Error section.
So before proceeding further, make sure that only the Error section is highlighted and none of the others.
Once the Error section is highlighted, you will see one or more of the causes listed below, whichever is producing crawl errors on your site.
- Server errors (5xx)
- Redirect error
- Submitted URL blocked by robots.txt
- Submitted URL marked noindex
- Submitted URL seems to be a soft 404
- Submitted URL returns unauthorized request
- Submitted URL not found (404)
- Submitted URL has crawl issue
In this example, the report is showing the “Submitted URL marked ‘noindex’” error; in your case, you may see any of the errors mentioned above.
Now let’s move ahead and see how you can fix all these errors if you are getting any.
How to Fix “Server Errors (5xx)” in Search Console
Note: 5xx is a server-side error.
A 5xx server error means Googlebot tried to access your site, but the server was down or unavailable at that moment, so the bot could not crawl it.
To identify whether the “Server Error (5xx)” is a serious server issue or a temporary one, click on the “Server error (5xx)” row.
Once you click, check how many pages appear. If a lot of pages are listed, it is a serious server issue and you should look into it deeply. To solve it, you can refer to a “How to Fix 500 Internal Server Error” guide or contact your hosting provider.
But if only a few pages appear under the “Server error (5xx)” row, it is probably a temporary server issue that you can fix by following the few simple steps given below…
- Click on the affected pages.
- Then, from the right side, click on the Inspect URL option.
- If you see the error “URL is not on Google: Indexing errors”, it means either Google has removed the URL from its database or the URL was unavailable when Google tried to crawl it.
- To fix this issue, copy that particular URL and open it in your browser (or check its status code with the sketch after this list).
- If it loads fine, go back to Google Search Console, click on Test Live URL, and then Request Indexing.
- Then go back to the Coverage report and click Validate Fix.
- If it was a temporary issue, Google will index the URL and the issue will be fixed.
- But if the URL is not working in the browser or is giving some error, it is better to add a “noindex” tag to that URL and remove it from the sitemap as well, so that Google stops showing any error for that URL.
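By the way, if you want to double-check a page's status code outside of the browser, here is a minimal Python sketch I would use. It assumes you have the `requests` library installed, and the URL is just a placeholder for your affected page.

```python
# Minimal sketch: check whether a URL still returns a 5xx status code.
# Assumes the `requests` library is installed; the URL below is a placeholder.
import requests

url = "https://example.com/affected-page/"

try:
    response = requests.get(url, timeout=10)
    if 500 <= response.status_code < 600:
        print(f"{url} still returns a server error: {response.status_code}")
    else:
        print(f"{url} responded with {response.status_code}; the 5xx issue may have been temporary")
except requests.RequestException as exc:
    print(f"Could not reach {url}: {exc}")
```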
So, that was the simple guideline; I hope it helps you fix this issue. If you are still facing any problem, leave a comment in the comment section and we will try to fix it together.
How to Fix “Redirect Error” in Search Console
A Redirect Error occurs when Googlebot tries to crawl a URL but the URL redirects to some other URL that either does not exist or has some issue.
So, to fix this issue, follow the steps below…
- Click on the affected pages.
- Select the Inspect URL option from the right side to get more detail about the error.
- If it shows a redirect error, check that URL manually in your browser and fix it by redirecting it to the right page URL – one that exists and has no error (you can trace the redirect chain with the sketch after this list).
- Once the above steps are done, click on Test Live URL and then Request Indexing.
- Go back to the Coverage report and click Validate Fix.
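If you are not sure where a URL ends up after its redirects, here is a small sketch (assuming the `requests` library and a placeholder URL) that prints every hop in the redirect chain so you can spot a broken or looping redirect.

```python
# Minimal sketch: trace the redirect chain of a URL and show where it ends up.
# Assumes the `requests` library is installed; the URL below is a placeholder.
import requests

url = "https://example.com/old-page/"
response = requests.get(url, allow_redirects=True, timeout=10)

# response.history holds every intermediate redirect response.
for hop in response.history:
    print(f"{hop.url} -> {hop.status_code} -> {hop.headers.get('Location')}")

print(f"Final URL: {response.url} (status {response.status_code})")
```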
Follow the above steps to fix the redirect error and let me know whether they work for you. If not, tell me in the comment section and we will fix it together.
How to Fix “Submitted URL Blocked by Robots.txt” in Search Console
Robots.txt is a file stored in your root directory that lists the URLs you do not want the Google crawler to crawl and show in search engine pages.
So, when the Google crawler comes to your site, it first checks the robots.txt file to see whether any URLs are listed as not to be crawled. If it finds any, it skips those URLs while crawling.
That means if, by mistake, you have listed the URL that is showing the crawl error in the robots.txt file, Googlebot will never crawl that URL until you remove it from the robots.txt file.
Here is how to fix “Submitted URL blocked by robots.txt”…
- Click on the affected pages.
- Then click on Inspect URL from the right side.
- If it shows a “blocked by robots.txt” error, check your robots.txt file and remove the rule blocking that URL if you want the URL to show in the search engine (the sketch after this list shows a quick way to test this).
- You can follow this guide to update your robots.txt file.
- Once you update the robots.txt file, click on Test Live URL and then Request Indexing.
- Then click Validate Fix under the Coverage report.
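To test a URL against your robots.txt without waiting for Search Console, you can use Python's built-in robots.txt parser. This is only a rough sketch with placeholder URLs; the live test in Search Console remains the final word.

```python
# Minimal sketch: check whether robots.txt blocks Googlebot from a given URL.
# Uses Python's built-in urllib.robotparser; the URLs below are placeholders.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()

url = "https://example.com/blocked-page/"
if parser.can_fetch("Googlebot", url):
    print(f"{url} is allowed for Googlebot")
else:
    print(f"{url} is blocked by robots.txt for Googlebot")
```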
How to Fix “Submitted URL Marked Noindex” in Search Console
Basically, this error occurs when a URL is present in your sitemap but you have also given that URL a “noindex” tag, which tells the Google crawler not to crawl and index it in its database.
To solve this error, follow the steps below…
- List all the pages on your site that have a “noindex” tag and identify whether any URL has been given the tag by mistake (the sketch after this list shows one way to check a page).
- If you find any, remove the tag; you can do this by editing the page's HTML code.
- Once done, resubmit the URL to Google Search Console.
- Click on the affected page.
- Then click on Inspect URL.
- Click on Test Live URL and then Request Indexing.
- Go back to the Coverage report and click Validate Fix.
- If everything works well, Google will index the URL in its database.
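If you are not sure whether a page still carries a “noindex” directive, here is a rough sketch that checks both the X-Robots-Tag response header and the meta robots tag. It assumes the `requests` library and uses a placeholder URL; treat it as a quick check, not a full HTML parser.

```python
# Minimal sketch: detect a "noindex" directive on a page.
# Checks the X-Robots-Tag header and the <meta name="robots"> tag.
# Assumes the `requests` library is installed; the URL below is a placeholder.
import re
import requests

url = "https://example.com/some-page/"
response = requests.get(url, timeout=10)

header_noindex = "noindex" in response.headers.get("X-Robots-Tag", "").lower()
meta_noindex = bool(
    re.search(r'<meta[^>]+name=["\']robots["\'][^>]*noindex', response.text, re.IGNORECASE)
)

print(f"X-Robots-Tag noindex: {header_noindex}")
print(f"Meta robots noindex:  {meta_noindex}")
```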
Apply the above steps to fix this issue, and let me know in the comment section if you still see any error; we will try to solve it together.
How to Fix “Submitted URL Seems to be a Soft 404” in Search Console
To fix this error, you first have to understand what it is and why it occurs on your site.
So, let’s start one by one…
“Soft 404 is not an official HTTP error code; it is a label Google applies when a crawled URL returns a page with very little or no content.”
That was just a simple definition; in the next section I will talk about the different situations in which this error can occur on your site, with examples.
In most cases, a soft 404 occurs because of one of the two situations discussed below…
- When the Google crawler crawls your page URL and finds very little or no content on the page, it reports a soft 404 error.
- There can be pages on your site that should appear only after the user performs a specific action – for example, a checkout page that appears only after the user adds a product to the cart. This is the second case in which this error can show up.
So these are the two main reasons that might be generating soft 404 errors on your site.
Whenever a soft 404 occurs on your site, identify the URL on which the error occurred and follow the steps below to fix it accordingly.
Fix “Soft 404 – Pages With No Content” Crawl Error
If the soft 404 errors occur for this kind of page, follow the steps below to solve them…
- Add more content to the page (you can gauge how thin the page currently is with the sketch after this list).
- Re-submit the page to Google Search Console.
- Google might not accept it immediately, so after updating your page, try resubmitting it after a day.
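To get a rough idea of how “thin” a page currently is, here is a small sketch that counts the visible words on it. It assumes the `requests` and `beautifulsoup4` libraries are installed and uses a placeholder URL; Google's own judgment of thin content is more nuanced than a simple word count.

```python
# Minimal sketch: roughly gauge how much visible text a page has.
# A very low word count is one signal that Google may treat the page as a soft 404.
# Assumes `requests` and `beautifulsoup4` are installed; the URL is a placeholder.
import requests
from bs4 import BeautifulSoup

url = "https://example.com/thin-page/"
html = requests.get(url, timeout=10).text

soup = BeautifulSoup(html, "html.parser")
for tag in soup(["script", "style", "noscript"]):
    tag.decompose()  # drop non-visible elements

words = soup.get_text(separator=" ").split()
print(f"{url} has roughly {len(words)} visible words")
```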
Fix “Soft 404 – Action Required to Access the Page” Crawl Error
For this kind of URL, the better option is to remove it from the sitemap and tell Google not to index it, so that no error is shown for it.
I am saying this because there is no reason to index these pages in Google: you do not want visitors to access them directly without taking the specified actions.
Here is how to remove these pages from the sitemap and from Google's index as well.
- Log in to your site's WordPress dashboard.
- Install the Yoast SEO plugin.
- Open the affected page's URL; in my case it is the “checkout page”.
- Scroll down to the Yoast SEO box, click on Advanced, and set the page to ‘noindex’.
- Refresh your sitemap; the URL will no longer be there.
The above steps apply only to WordPress sites.
If you are using any other platform, you will still have to give the same “noindex” tag to the page, and your problem will be solved.
How to Fix “Submitted URL Returns Unauthorized Request” in Search Console
Whenever you get the “Submitted URL returns unauthorized request” error for any of your site's pages, it means the page is restricted by a password or some other security mechanism that prevents the crawler from crawling it.
To fix this issue, either remove the restriction from the page or tell the crawler not to crawl it.
To tell Google not to crawl that particular page, you can use either of the following methods…
- Give a “noindex” tag to the URL.
- Add a Disallow rule for it in the robots.txt file.
You can use either of the methods mentioned above, but I would suggest that giving the page a “noindex” tag is the better option.
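If you are unsure whether the page really is restricted, keep in mind that Googlebot fetches pages anonymously, without your login cookies. A quick anonymous request (a sketch assuming the `requests` library and a placeholder URL) shows roughly what the crawler sees.

```python
# Minimal sketch: fetch a page anonymously, the way a crawler would,
# to confirm it returns an "unauthorized" style response.
# Assumes the `requests` library is installed; the URL below is a placeholder.
import requests

url = "https://example.com/private-page/"
response = requests.get(url, timeout=10)

if response.status_code in (401, 403):
    print(f"{url} is restricted (HTTP {response.status_code}); "
          "remove the restriction or mark the page noindex")
else:
    print(f"{url} returned HTTP {response.status_code}")
```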
How to Fix “Submitted URL not Found (404)” in Search Console
The “Submitted URL not found (404)” error means that the submitted URL does not exist on your site.
It may occur for a few other reasons as well; I have discussed some of the common cases below.
- Google crawler didn’t find that particular URL when it crawled your site.
- You might have submitted the wrong URL, so check the URL carefully and match it against the URL that actually exists on your site.
- You might have removed the URL permanently from your site.
Follow the below steps to fix this issue…
- Open the affected page URL manually in your browser and check whether the page exists and works fine (or batch-check all your submitted URLs with the sketch after this list).
- If the page exists on your site and you want it to stay there, go to Search Console and re-submit the page.
- Copy and paste the URL into the URL inspection box and hit Enter.
- Click on Request Indexing.
- If everything goes well, Google will index the URL in its database.
- Go back to the Coverage report and click on Validate Fix.
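If you want to catch broken submissions before Search Console reports them, here is a rough sketch that reads a standard sitemap.xml and checks the status of every URL in it. It assumes the `requests` library, a plain (non-index) sitemap, and a placeholder sitemap URL.

```python
# Minimal sketch: check the HTTP status of every URL listed in a sitemap,
# so 404 submissions are easy to spot before (or after) Google reports them.
# Assumes the `requests` library and a standard, non-index sitemap.xml;
# the sitemap URL below is a placeholder.
import xml.etree.ElementTree as ET
import requests

sitemap_url = "https://example.com/sitemap.xml"
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(requests.get(sitemap_url, timeout=10).content)
urls = [loc.text for loc in root.findall(".//sm:loc", ns)]

for url in urls:
    # A lightweight HEAD request is enough to read the status code.
    status = requests.head(url, allow_redirects=True, timeout=10).status_code
    if status == 404:
        print(f"NOT FOUND: {url}")
```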
If you get any error while following the above guide, please let me know in the comment section and we will fix the issue together.
How to Fix “Submitted URL has Crawl Issue” in Search Console
Whenever this error occurs, it means the URL has been submitted to Google and exists on your site, but the crawler faced some issue while crawling that particular page.
In most cases, this happens because content on the page does not load properly.
Mostly it is a temporary issue that you can fix by following the few simple steps given below…
- Click on Inspect URL and then click on View crawled page.
- On the right side, click on the Screenshot and More info options to see how Google views your page. Cross-check it manually as well.
- Copy and paste the URL into your browser and check whether all the content on the page loads properly.
- If everything works well, go to Search Console, click on Test Live URL, and then Request Indexing.
- But if it still shows the same issue, it might be a more serious problem. To fix it, review your HTML code and improve the page's loading speed (a quick response-time check is sketched after this list).
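To get a first, rough feel for how slowly a page responds, here is a small timing sketch (assuming the `requests` library and a placeholder URL). It only times the HTML response; a full picture of scripts, images, and rendering needs a browser-based tool such as PageSpeed Insights.

```python
# Minimal sketch: a rough server response-time check for a slow-loading page.
# This only times the HTML response, not scripts or images.
# Assumes the `requests` library is installed; the URL below is a placeholder.
import requests

url = "https://example.com/slow-page/"
response = requests.get(url, timeout=30)

print(f"Status: {response.status_code}")
print(f"Server responded in {response.elapsed.total_seconds():.2f} s "
      f"({len(response.content)} bytes of HTML)")
```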
So, if this method works for you, let me know in the comment section; if it does not, we will try to fix it together.
Conclusion:
Let’s wrap up…
Crawling is one of the most important parts of your SEO process; if Google cannot access your site properly, how is it going to index it in its database?
So if you are facing any kind of crawling issue and jumped straight to this point, I suggest you go back and read the complete article, where I have covered all the causes of crawling issues along with possible solutions.
In case any of the methods discussed above is not working, please let me know in the comment section.
Bhanu is a Computer Engineer by education but a self-taught blogger and SEO guy at heart. He started Blogging Central in January 2018 with the tagline “Keep Learning, Keep Sharing”, and he still believes that “Learning Never Ends, And So Sharing”.