Crawl errors can keep your website from appearing on a search results page, even when your target audience has entered the ideal search query. The Crawl Errors report details the URLs on your website that Google was not able to successfully crawl, or that returned an HTTP error code. The report has two main sections: site errors and URL errors.
Server errors occur when Google cannot access the URL, the site is too busy, or the request times out, forcing Google to abandon the request. Google may be unable to reach the site either because it is being blocked or because the server is too slow to respond. You can remedy the situation by fixing the server's connectivity issues.
Server connectivity issues include a timeout, truncated headers, connection resets, truncated responses, refused connections, failed connections, connection timeouts, and no responses.
To resolve many of these problems, you will need to ensure that the server is connected to the internet, and that it is not overloaded or configured incorrectly.
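As a quick first check, a sketch like the following can confirm whether the server answers at all and how quickly it responds; the URL is a placeholder for your own homepage, and ideally you would run it from a machine outside your own network.

```python
import time
import urllib.error
import urllib.request

URL = "https://www.example.com/"  # placeholder: substitute your own homepage

try:
    start = time.monotonic()
    with urllib.request.urlopen(URL, timeout=10) as response:
        elapsed = time.monotonic() - start
        print(f"HTTP {response.status}, responded in {elapsed:.2f}s")
except urllib.error.HTTPError as err:
    # The server answered, but with an HTTP error code
    print(f"Server returned an HTTP error code: {err.code}")
except urllib.error.URLError as err:
    # Covers refused or failed connections, resets, and connect timeouts
    print(f"Could not connect: {err.reason}")
```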
You may also have success using Fetch as Google to check that Googlebot is able to crawl the site. If it returns the content of the homepage with no problems, it is safe to assume that Google has the right access to the website and can process it appropriately.
If you run into connection problems, wait a little while and then try to connect again.
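If you want to automate that wait-and-retry step, a rough sketch might look like the following; the URL and the wait times are illustrative only.

```python
import time
import urllib.error
import urllib.request

URL = "https://www.example.com/"  # placeholder URL

for attempt in range(1, 4):
    try:
        with urllib.request.urlopen(URL, timeout=10):
            print(f"Connected on attempt {attempt}")
            break
    except urllib.error.URLError as err:
        print(f"Attempt {attempt} failed: {err.reason}")
        time.sleep(30 * attempt)  # back off a little longer each time
else:
    print("Still unreachable; check server and network configuration")
```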
On a website that operates with no flaws, the "Site errors" portion of the Crawl Errors report will show no errors, and this holds true for the majority of the websites that Google crawls. If Google detects a considerable number of errors on your site, you will be notified with a message to your account, no matter the size of the website. When first looking at the Crawl Errors page, the Site Errors portion shows a status code next to each of the three error types (server connectivity, DNS, and robots.txt fetch). The normal indication for each of these is a green check mark. If that is not the case, you can click the box and view a detailed graph of crawl data from the last 90 days.
If your website is reporting a 100% error rate in any of the three categories, it is highly likely that your website is not working for some reason.
There are a number of possibilities for this:
If none of these reasons explain the crawl errors, the error rate could just be a passing spike, or it could be attributed to an external cause, such as someone linking to a nonexistent page. In that case, there is no real problem with your website. Whatever the reason, when Google sees an unusually large number of errors for a website, it notifies the webmaster so that they can investigate and fix the problem.
DNS errors occur when the DNS server is down or there is a problem routing requests to your domain, so Googlebot cannot communicate with the server. Even when these errors do not prevent Googlebot from accessing the site, they can be a symptom of high latency, which will ultimately affect your website's users.
To troubleshoot DNS errors, first have Google crawl the website: run Fetch as Google on an important page (such as the home page). If it comes back without reporting any problems, it is safe to assume that Google can properly access your site.
Next, if you are having recurring DNS errors, get in contact with your DNS provider. Many times, the DNS provider and web host are the same entity.
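Before contacting the provider, it can help to confirm on your own machine whether the hostname resolves at all. The sketch below uses Python's standard resolver; the hostname is a placeholder, and the result only reflects the DNS servers your machine is configured to use.

```python
import socket

HOSTNAME = "www.example.com"  # placeholder: substitute your own hostname

try:
    results = socket.getaddrinfo(HOSTNAME, 443)
    addresses = sorted({info[4][0] for info in results})
    print(f"{HOSTNAME} resolves to: {', '.join(addresses)}")
except socket.gaierror as err:
    # Comparable to a DNS Lookup error: the name could not be resolved
    print(f"DNS lookup failed for {HOSTNAME}: {err}")
```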
Furthermore, you may need to configure your server to respond to requests for host names it does not serve with an HTTP error code such as 404 or 500. This is most applicable when the website has content generated by users and gives each user their own domain. In some cases, this can cause content to be accidentally duplicated across hostnames, which in turn interferes with Googlebot's crawling.
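As a rough illustration of that idea (not a production setup), the sketch below uses Python's built-in http.server and a hypothetical list of known hostnames to return a 404 for any Host header it does not recognize; in practice you would express the same rule in your web server or framework configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical list of hostnames this server actually serves
KNOWN_HOSTS = {"www.example.com", "blog.example.com"}


class HostCheckingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        host = (self.headers.get("Host") or "").split(":")[0]
        if host not in KNOWN_HOSTS:
            # Unknown hostname: answer with an error code instead of content,
            # so crawlers do not pick up duplicates under unintended domains
            self.send_error(404, "Unknown host")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Hello from a recognized hostname\n")


if __name__ == "__main__":
    HTTPServer(("", 8080), HostCheckingHandler).serve_forever()
```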
If you encounter either a DNS Lookup or a DNS Timeout error, Google was not able to resolve the hostname. Use Fetch as Google to confirm that the site can be properly crawled; if it is returned with no issues, Google is accessing your site properly. Otherwise, check with your registrar to ensure that the site is properly set up and the server is in fact connected to the internet.
When a website has an error rate below 100% in any of the given categories, it may indicate a passing condition, but it may also mean that the site is overloaded or configured incorrectly. These issues should be investigated further, or you can perform your own search queries to find answers. Don't be surprised if Google alerts you even when the overall error rate is low; it is normal for a well-configured site to have zero errors in these categories.
A robots.txt fetch error happens when there is a problem retrieving a website's robots.txt file. Before crawling a site, Googlebot looks at this file to see which pages it should not crawl. If the file exists but cannot be reached, the crawl is postponed so that Google does not crawl any unintended URLs; Googlebot will return and crawl the site once the robots.txt file can be accessed.
Surprisingly, a robots.txt file is not always necessary. It is only needed when a website contains URLs that the site owner does not want Googlebot to crawl. If your goal is to have search engines crawl everything on your site, you do not need a robots.txt file at all, not even an empty one.
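If you do have a robots.txt file, one quick way to confirm it is reachable and behaves as intended is to parse it with Python's standard robotparser, as in the sketch below; the URLs are placeholders for your own.

```python
from urllib import robotparser

ROBOTS_URL = "https://www.example.com/robots.txt"  # placeholder URL

parser = robotparser.RobotFileParser(ROBOTS_URL)
parser.read()  # fetches and parses the file; a network failure raises here

# Placeholder URLs: check whether Googlebot may fetch them under these rules
for url in ("https://www.example.com/", "https://www.example.com/private/"):
    verdict = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
    print(f"{url} -> {verdict} for Googlebot")
```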
The URL errors section of the report is divided into categories, each showing up to the top 1,000 URL errors specific to that category. Not every error that shows up in this area will require your attention, but you should monitor the errors closely, as some of them may have a negative effect on your website's users and Google's crawlers. Google places the most important URLs at the top of the report, ranking importance by factors such as the number of errors and the pages that reference the URL.
More specifically, look at the following:
There are a few different ways to view URL errors:
The error details for mobile or desktop URLs show status information about the error, a list of web pages that reference the URL, and even a link to Fetch as Google so that you can troubleshoot problems with that URL.
Once you have figured out where the problem is and what is causing the crawl errors, you can hide a URL from the list, either one at a time or many at a time: simply select the box next to the URL and click Mark as Fixed, and the URL will vanish from the list.