Google uses software programs and algorithms to gather web content so that people searching on Google can readily find the information they are looking for. Because of this, webmasters need to put in very little effort to make sure that Google retrieves their content.
Crawling is the process by which Google collects publicly available web content so that it can be displayed in search results for relevant queries. During crawling, Google's specialized software, aptly called web crawlers, automatically discovers and retrieves websites. A web crawler works by following links from site to site, downloading the pages it finds and storing them for later use. Intricate algorithms then sort and analyze the pages, and the results are updated within Google's search engine results. Google's main web crawler is affectionately known as Googlebot.
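To make the crawl loop above concrete, here is a minimal sketch using only the Python standard library. The seed URL is a placeholder, and a real crawler like Googlebot also handles robots.txt, politeness, deduplication, and scheduling at an enormously larger scale; this only illustrates the follow-links, download, and store steps.

    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen


    class LinkExtractor(HTMLParser):
        """Collects the href value of every <a> tag on a page."""

        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)


    def crawl(seed_url, max_pages=10):
        """Follow links from page to page, storing each downloaded page for later analysis."""
        frontier = [seed_url]          # URLs waiting to be fetched
        visited = set()                # URLs already fetched
        store = {}                     # downloaded pages, keyed by URL

        while frontier and len(store) < max_pages:
            url = frontier.pop(0)
            if url in visited:
                continue
            visited.add(url)
            try:
                html = urlopen(url, timeout=10).read().decode("utf-8", errors="replace")
            except Exception:
                continue                # skip pages that cannot be fetched
            store[url] = html           # keep the page for later sorting and analysis
            parser = LinkExtractor()
            parser.feed(html)
            for link in parser.links:
                frontier.append(urljoin(url, link))   # resolve relative links
        return store


    pages = crawl("https://example.com/")   # hypothetical seed URL
    print(f"Downloaded {len(pages)} pages")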
Another process Google uses to understand web pages is rendering. Rendering helps Google interpret how web pages look and how they behave for visitors using different browsers and devices. Much like a web browser displaying a page, Google retrieves the URL and then executes the code for that page, generally HTML or JavaScript. Google then crawls every resource referenced by the main code file in order to assemble the visual aspects of the page and gain a better understanding of the website.
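The second half of that step, crawling every resource referenced by the main code file, can be illustrated with a small standard-library sketch: fetch a page's HTML and list the scripts, stylesheets, and images it references, each of which would also need to be fetched to assemble the visual page. The URL is a placeholder.

    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen


    class ResourceExtractor(HTMLParser):
        """Collects the URLs of sub-resources referenced by a page."""

        def __init__(self, base_url):
            super().__init__()
            self.base_url = base_url
            self.resources = []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag in ("script", "img") and attrs.get("src"):
                self.resources.append(urljoin(self.base_url, attrs["src"]))
            elif tag == "link" and attrs.get("href"):
                self.resources.append(urljoin(self.base_url, attrs["href"]))


    url = "https://example.com/"                      # placeholder URL
    html = urlopen(url, timeout=10).read().decode("utf-8", errors="replace")
    extractor = ResourceExtractor(url)
    extractor.feed(html)
    for resource in extractor.resources:
        print(resource)                               # each of these must also be fetched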
When Google is unable to crawl or render a web page, the site's visibility in Google's search results can suffer. First, if Google cannot crawl a website, it cannot gather any information about it. This means that the site, or parts of the site, cannot be discovered organically and therefore cannot be surfaced to Google users whose queries are relevant to those pages.
Next, if Google cannot render a site's pages, it has a much harder time understanding the content because important information about the visual layout is missing. When that happens, the visibility of the site's content in Google's search results can be greatly reduced. Google renders web pages in order to estimate how valuable a website is to different audiences and to determine where specific links should appear within its search results. Fortunately, there is a tool called Fetch as Google that can help diagnose crawling and rendering problems, improving the site's position in Google's search results and helping it reach its target audience.
It is vital to a website's success that it can be crawled and rendered correctly so that it gets the most out of Google search. Although crawling and rendering are very important, it is equally important to recognize when blocking content from being crawled and rendered will improve the overall success of the website.
Take the time to confirm that Googlebot and Google's other web crawlers can access your website at the network level. It is vital that the URLs you would like Google to display in search results are actually reachable by Google. Oftentimes, website owners block URLs on purpose. Before blocking any URLs, make sure doing so will not hide content you want Google to discover and display in its search results.
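One quick way to check this, before or after changing your robots.txt, is to ask whether Googlebot is allowed to fetch a given URL according to that file. This sketch uses Python's standard robotparser module; the domain and paths are placeholders.

    from urllib.robotparser import RobotFileParser

    robots = RobotFileParser("https://example.com/robots.txt")
    robots.read()                                       # download and parse robots.txt

    for url in ("https://example.com/products/",        # should stay crawlable
                "https://example.com/private/report"):  # blocked on purpose?
        allowed = robots.can_fetch("Googlebot", url)
        print(f"{url} -> {'allowed' if allowed else 'blocked'} for Googlebot")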
It is up to the website owner or designer to give Googlebot access to all of the resources referenced on the website. Google considers non-text content as well as the overall visual layout when deciding where a website appears in search results. The visual elements help Google fully understand the web pages, and the better Google understands a website, the better it can match that site to the people looking for the particular content it offers. After Google has retrieved the pages, Googlebot runs the code and deciphers the content to understand the overall structure of the site. The information Google collects during rendering is used to rank the value and quality of the content against other websites and against what people are searching for on Google.
If pages on your site use code to arrange or display their content, Google has to render that content properly before it can appear in Google search. Often, the bulk of a dynamic website's textual content can only be retrieved by rendering the pages, which lets Google see the site the way any other visitor would. If rendering fails, Google may not be able to retrieve any of the content. To bring this full circle: when Google cannot retrieve a page's content, it cannot know whether the page is relevant to any particular search query, and it will not show the site in search results.
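You can see this difference yourself by comparing a dynamic page's raw HTML with what a browser produces after running its JavaScript. The sketch below assumes the Playwright library and a page whose text is built client-side; neither is part of this article, and the URL is a placeholder.

    from urllib.request import urlopen
    from playwright.sync_api import sync_playwright

    url = "https://example.com/app"    # placeholder: a page that builds its text with JavaScript

    # Raw HTML, as a client that does not render would see it.
    raw_html = urlopen(url, timeout=10).read().decode("utf-8", errors="replace")

    # Rendered DOM, after a headless browser has executed the page's scripts.
    # (Requires the Playwright package and its browsers to be installed.)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        rendered_html = page.content()          # serialized DOM after scripts have run
        visible_text = page.inner_text("body")  # text a real visitor would actually see
        browser.close()

    print(len(raw_html), "bytes of raw HTML")
    print(len(rendered_html), "bytes after rendering")
    print(visible_text[:200])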
Googlebot must have access to the various resources on a web page, including image files, CSS, and JavaScript, so that it can render and index the page and view it the way a normal user would. If a robots.txt file disallows crawling of those resources, it hurts how well Google can render and index the page, which in turn affects how the page ranks in Google's search results.
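As an illustration, a robots.txt along these lines would cause exactly that problem, while the second version keeps page resources crawlable. The directory names here are made up for the example; your own paths will differ.

    # Problematic: crawlers cannot fetch the stylesheets, scripts, and images the page needs.
    User-agent: *
    Disallow: /css/
    Disallow: /js/
    Disallow: /images/

    # Better: page resources stay crawlable, so the page can be rendered and indexed.
    User-agent: *
    Disallow: /admin/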
The Blocked Resources report displays the resources used by the site that are blocked to Googlebot. Not every resource is shown, only the ones Google assumes are under the webmaster's control.
In order to unblock your resources, you will need to do the following:
Fetch as Google is a tool that helps you test how Google crawls or renders a URL on your website. It can be used to determine whether Googlebot can access a page, how it will render the page, and whether any page resources are blocked to Googlebot. Essentially, it simulates Google's normal crawl and render process and is quite useful for ironing out any crawling issues a website may be having.
Using Fetch as Google is pretty simple and takes only a few steps to complete.
The fetch history shows your last 100 fetch requests. You can view the details of any completed request, and each request is given a status of completed, partial, redirected, or a specific error type.
A completed status means that Google contacted your site, crawled the page, and was able to get the resources referenced by the page.
A partial status means that Google got a response from the site and fetched the URL, but could not get all of the resources referenced by the page; this can happen when they are blocked by robots.txt files. If the process was fetch only, try a fetch and render instead. Look at the rendered page to see whether any important resources were blocked. If so, unblock them in any robots.txt files that you own, and for any that you do not own, ask the owners to unblock them.
When you are shown a redirected status, you will have to follow the redirect yourself. If the redirect points to a URL within the same property, the tool displays a Follow option that populates the fetch box with the redirect URL so you can fetch it directly. If the redirect points to another property, copy the target URL and paste it into the fetch box for that property.