Sitecore Search: Addressing common website crawling exceptions for failed pages
This post explains how to investigate common crawling errors in Sitecore Search when pages fail to be crawled. When a site crawler encounters errors while crawling pages, the dashboard page shows that errors occurred, but it does not provide a detailed log of the issue. The steps below show where to find the full details of these errors.
To check the status of the scheduled scans for the site crawler:
- Log in to the CEC portal and click Sources
- A summary of the last crawl is displayed here. It shows:
  - Last Run Status: shows Finished if the crawl completed, or Failed if it stopped due to a failure
  - Last Run time
  - Items Indexed: the number of items indexed
- It also shows a summary of errors, if any occurred while crawling the site.
In the example below, after the crawl finishes, the summary shows 3 configuration errors, but it gives no further details for troubleshooting.
More information on the crawling results is available under the Analytics tab. The steps are:
- Navigate to Analytics -> Sources -> Overview and then select the Source at the bottom
- The reason for the failure is that crawling the page https://devsite.com/about-us/news failed because the page is unavailable or throws errors while loading, which can be investigated with some further troubleshooting.
Using the method above, we can identify crawling errors and troubleshoot them faster.
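Once the Analytics view has given you the failing URLs, a quick way to confirm whether a page is unavailable is to request it yourself and look at the HTTP status. The following is a minimal sketch in Python using only the standard library. It is not part of Sitecore Search: the function names (`classify_status`, `fetch_status`, `check_urls`) and the `crawl-check/0.1` user agent are hypothetical, chosen for illustration. It also assumes the page is reachable from the machine running the script, which may not match the network conditions the crawler sees.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code (or None for no response) to a short label."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    if 400 <= status < 500:
        return "client error"
    if 500 <= status < 600:
        return "server error"
    return "other"


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived.

    urlopen follows redirects on its own, so the code returned here is
    normally that of the final page.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": "crawl-check/0.1"}  # hypothetical agent name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; keep their code.
        return err.code
    except OSError:
        # Covers URLError, DNS failures, refused connections, and timeouts.
        return None


def check_urls(urls, fetch=fetch_status):
    """Return a dict mapping each URL to a label describing its status."""
    return {url: classify_status(fetch(url)) for url in urls}
```

Calling `check_urls(["https://devsite.com/about-us/news"])` returns a dictionary with one label per URL. A `client error` or `server error` label suggests the page itself needs fixing before the next scheduled crawl, while `unreachable` points toward DNS, firewall, or hosting problems.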