What is a Googlebot and what does it do?
When a user types a keyword or phrase into a search engine, it is a common misconception that the engine then trawls through every live website in the world, finds the best matches and displays the results. If this were true, every search would take weeks to perform! Search engines actually work by first ‘caching’ websites, saving copies within their own index so that they can perform very fast searches, and an algorithm then ranks the results most relevant to your query. This is why a search can complete in something like 0.2 seconds, which is much more acceptable!
Search engines gather the information on a website by sending out programs called ‘spiders’ or ‘robots’, or in Google’s case, ‘Googlebot’. These are essentially scripts that read the content of your pages and save it in the search engine’s index. A page on your website won’t be discoverable in search results until it has been cached, and this can happen daily, weekly or even monthly. If you have a large or popular website with lots of inbound links, you may be visited by search engine spiders many times per day.
Some websites, such as news sites and blogs, will automatically ping or notify Google every time a new post is published. This ensures that fresh content is cached very quickly, and the results may appear in Google in a very short amount of time.
From an SEO perspective, understanding how search engines work is essential to optimising your website fully. You can control how robots crawl your site using a robots.txt file and by adding meta tags to individual pages. For example, you may want to prevent a robot from indexing a certain page of your website because you know it contains duplicate content and you are trying to resolve a canonicalisation issue.
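As an illustration, here is what those two controls might look like. The file and page paths below are made-up examples; substitute your own. A robots.txt file sits at the root of your domain (e.g. www.example.com/robots.txt) and tells crawlers which areas to stay out of:

```
# robots.txt — example directives for a hypothetical site
User-agent: Googlebot
Disallow: /duplicate-page/

# Rules for all other crawlers
User-agent: *
Disallow: /admin/
```

Alternatively, a meta robots tag in the head of an individual page asks search engines not to index that specific page while still following its links:

```
<!-- Placed in the <head> of the page you want kept out of the index -->
<meta name="robots" content="noindex, follow">
```

Note that robots.txt blocks crawling, whereas the noindex meta tag blocks indexing; for a duplicate-content page, the meta tag (or a canonical tag) is usually the safer choice, because a page blocked in robots.txt can never be crawled and re-evaluated.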