Crawlers move through web pages, following links from page to page, and bring information about those pages back to Google's servers.
Here is a concise guide to crawling.
Link and site structure
Crawlers visit every page of a website. To make their job easier, link your pages to each other; strong internal linking increases the website's visibility. Externally, crawlers look for evidence that the pages on the website have enough quality, and earning external links to your web pages achieves the same effect. To support this, avoid pages with plagiarized content, prefer fresh and engaging content, keep the content updated, use smart tags, avoid keyword stuffing, and use a breadcrumb trail to leave links that expose the website structure.
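As a sketch of what a breadcrumb trail can look like under the hood, here is a hypothetical page marked up with schema.org BreadcrumbList structured data; the domain, paths, and category names are placeholders, not taken from this guide:

```html
<!-- Breadcrumb trail for a hypothetical page at /electronics/cameras -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home",
      "item": "https://www.example.com/" },
    { "@type": "ListItem", "position": 2, "name": "Electronics",
      "item": "https://www.example.com/electronics" },
    { "@type": "ListItem", "position": 3, "name": "Cameras",
      "item": "https://www.example.com/electronics/cameras" }
  ]
}
</script>
```

Each ListItem is one link in the trail, so a crawler can read the site hierarchy directly from the page.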
Server errors & redirection
At an initial stage, the crawler passes through your website's HTTP headers, where it finds status codes such as 200, 301, and 404. Googlebot uses these codes to judge how a page is performing, so aim to keep the status codes of your pages healthy.
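To see the same status code a crawler would see, you can send a HEAD request yourself. A minimal Python sketch using only the standard library; the URL is a placeholder:

```python
import urllib.error
import urllib.request

def check_status(url: str) -> int:
    """Return the HTTP status code for a URL, as a crawler would see it."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req) as resp:
            # Note: urlopen follows redirects, so a 301 is reported
            # as the final destination's status (usually 200).
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a missing page

# Hypothetical URL for illustration
print(check_status("https://www.example.com/"))
```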
Each website receives a crawl budget: the amount of crawling the search engine allocates to it on a daily basis. The best practice is to lead the crawler to your most targeted pages, so the budget is spent where it matters most.
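One widely used way to point crawlers at your targeted pages (not named in this guide, but standard practice) is an XML sitemap. A minimal sketch, with placeholder URLs and dates:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- List only the pages you most want crawled -->
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://www.example.com/blog/fresh-post</loc>
    <lastmod>2024-01-10</lastmod>
    <priority>0.8</priority>
  </url>
</urlset>
```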
Block web crawler access
Robots.txt is a file that tells crawlers which pages to stay away from. It can be used to hide certain parts of a website and steer the crawler toward your engaging content.
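As a sketch of how robots.txt rules behave, Python's standard urllib.robotparser can test a hypothetical rules file against specific URLs; the domain, paths, and the /private/ section are placeholders:

```python
import urllib.robotparser

# A hypothetical robots.txt: block the /private/ area for all crawlers,
# leave everything else open.
rules = """\
User-agent: *
Disallow: /private/
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# The hidden section is off-limits to Googlebot...
print(parser.can_fetch("Googlebot", "https://www.example.com/private/drafts"))  # False
# ...while the engaging content stays crawlable.
print(parser.can_fetch("Googlebot", "https://www.example.com/blog/guide"))      # True
```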