Search engines use XML sitemap files as a signal for which pages to crawl. Well-formed, clean and optimized sitemap files are critical for websites that are large or that contain videos, images and/or news.
The most common SEO issues with sitemaps:
HEADMasterSEO is a sitemap checker tool that can:
By default, HEADMasterSEO uses the User Agent "HEADMasterSEO" to check URLs. Open the "Configuration" -> "User Agents" screen and change the user agent to GoogleBot Regular, so that HEADMasterSEO presents itself as GoogleBot when checking the URLs in the sitemap.
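For readers who want to see what presenting a crawler user agent looks like at the HTTP level, here is a minimal Python sketch. It is not HEADMasterSEO's own code: the function name `build_head_request` is hypothetical, the use of a HEAD request is an assumption, and the user-agent string is one of the Googlebot strings Google documents. The request is only built, not sent.

```python
from urllib.request import Request

# One of the Googlebot user-agent strings published by Google (assumed here;
# the exact string behind the "GoogleBot Regular" option may differ).
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_head_request(url, user_agent=GOOGLEBOT_UA):
    """Build (but do not send) a HEAD request carrying the given User-Agent."""
    return Request(url, headers={"User-Agent": user_agent}, method="HEAD")


req = build_head_request("https://example.com/page")
print(req.get_method(), req.get_header("User-agent"))
```

Some servers respond differently depending on the user agent, which is why checking with the same identity a search engine uses can surface issues a default user agent would miss.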
HEADMasterSEO can open your sitemap files from your local drives. If you don't have local copies, download your sitemaps to your computer.
Select the "Check URLs" -> "Check URLs... (From Files)" menu and select one or more sitemap files to check.
First, HEADMasterSEO reads the sitemap file and validates its XML markup. If the sitemap does not validate, HEADMasterSEO shows a message with the detected error and aborts the operation.
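The validation step can be approximated outside the tool with a few lines of Python. This is only a sketch under stated assumptions: it checks that the file is well-formed XML and that its root element is a `urlset` or `sitemapindex` in the standard sitemaps.org namespace, which may be looser or stricter than what HEADMasterSEO checks. The function name `validate_sitemap` is hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def validate_sitemap(xml_text):
    """Return (True, None) if the sitemap parses and has an expected root
    element, otherwise (False, error_message)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Malformed XML: report the parser's message and stop.
        return False, f"XML parse error: {exc}"
    allowed = {f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex"}
    if root.tag not in allowed:
        return False, f"unexpected root element: {root.tag}"
    return True, None


ok, error = validate_sitemap("<urlset><url>")  # unclosed tags
print(ok, error)
```

Like the tool, the sketch stops at the first detected error rather than trying to recover from a broken file.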
Next, the XML sitemap checker requests every URL in the sitemap and saves the status code and the rest of the HTTP response headers.
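As a rough illustration of this step, the following Python sketch requests one URL and records the status code, response headers, final URL and response time. It is an assumption-laden stand-in, not the tool's implementation: `check_url`, its parameters and the returned dictionary keys are all hypothetical, and the `open_fn` parameter exists only so the function can be exercised without network access.

```python
import time
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def check_url(url, user_agent="sitemap-check-sketch", open_fn=urlopen, timeout=10):
    """Send a HEAD request and return status, headers, final URL and timing."""
    req = Request(url, headers={"User-Agent": user_agent}, method="HEAD")
    started = time.monotonic()
    try:
        # urlopen follows redirects itself, so this is the final response.
        with open_fn(req, timeout=timeout) as resp:
            status, headers, final_url = resp.status, dict(resp.headers), resp.geturl()
    except HTTPError as err:
        # 4XX and 5XX responses arrive as exceptions but still carry headers.
        status, headers, final_url = err.code, dict(err.headers), err.geturl()
    except URLError as err:
        # DNS failure, refused connection, timeout: no HTTP status at all.
        status, headers, final_url = None, {"error": str(err.reason)}, url
    elapsed_ms = round((time.monotonic() - started) * 1000)
    return {"url": url, "status": status, "headers": headers,
            "final_url": final_url, "response_ms": elapsed_ms}
```

Calling `check_url("https://example.com/")` performs a real request; running it over every URL extracted from a sitemap gives a table similar in spirit to the one the tool displays.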
The program automatically recognizes the different sitemap types (video sitemaps, image sitemaps, news sitemaps, standard sitemaps). HEADMasterSEO checks all URLs in the sitemap, including thumbnail images, video player URLs, rel="alternate" links, etc. If it finds redirects, HEADMasterSEO follows them to the final URL.
If you select multiple XML sitemap files, HEADMasterSEO validates and imports the URLs from every sitemap.
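To make concrete which URLs a sitemap can carry beyond the page `<loc>` entries, here is a small Python sketch that collects them from the standard, image, video and hreflang (`xhtml:link`) extensions. The namespace URIs are the published ones; the function name `extract_urls` and the labels are hypothetical, and the list of extension elements is not claimed to match what HEADMasterSEO covers.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
    "video": "http://www.google.com/schemas/sitemap-video/1.1",
    "xhtml": "http://www.w3.org/1999/xhtml",
}

# (label, path) pairs for elements whose text content is a URL.
TEXT_PATHS = [
    ("page", ".//sm:url/sm:loc"),
    ("image", ".//image:image/image:loc"),
    ("video thumbnail", ".//video:video/video:thumbnail_loc"),
    ("video content", ".//video:video/video:content_loc"),
    ("video player", ".//video:video/video:player_loc"),
]


def extract_urls(xml_text):
    """Return a list of (label, url) tuples, grouped by label."""
    root = ET.fromstring(xml_text)
    found = []
    for label, path in TEXT_PATHS:
        for node in root.findall(path, NS):
            if node.text and node.text.strip():
                found.append((label, node.text.strip()))
    # rel="alternate" links keep their URL in the href attribute.
    for link in root.findall(".//xhtml:link", NS):
        if link.get("rel") == "alternate" and link.get("href"):
            found.append(("alternate", link.get("href")))
    return found
```

The labels also hint at the sitemap type: a file that yields any "image" or "video" entries uses the corresponding extension.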
Filtering The Results By Status Code
You can use the HEADMasterSEO filters to check the sitemap results for non-200 status codes (broken links, redirections, internal errors) and to find pages with slow response times. You can also sort the results by any column.
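The filtering logic amounts to grouping status codes by their first digit and comparing response times against a threshold. A minimal Python sketch, assuming each result is a dictionary with hypothetical `url`, `status` and `response_ms` keys (the sample rows are made up for illustration):

```python
def status_class(status):
    """Map a status code to its class, e.g. 404 -> '4XX'."""
    return "no response" if status is None else f"{status // 100}XX"


def filter_results(results, classes=None, min_ms=None):
    """Keep rows whose status class is in `classes` and whose response time
    is at least `min_ms`; a filter left as None is not applied."""
    kept = []
    for row in results:
        if classes is not None and status_class(row["status"]) not in classes:
            continue
        if min_ms is not None and row["response_ms"] < min_ms:
            continue
        kept.append(row)
    return kept


rows = [
    {"url": "https://example.com/", "status": 200, "response_ms": 120},
    {"url": "https://example.com/old", "status": 301, "response_ms": 80},
    {"url": "https://example.com/gone", "status": 410, "response_ms": 95},
    {"url": "https://example.com/slow", "status": 200, "response_ms": 2400},
]
non_200 = filter_results(rows, classes={"3XX", "4XX", "5XX"})
slowest_first = sorted(rows, key=lambda r: r["response_ms"], reverse=True)
```

Here `non_200` holds the redirected and gone URLs, and `slowest_first` puts the 2400 ms page at the top, mirroring a sort on the response time column.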
Exporting the Results To CSV
Export all results or a filtered subset to a CSV file by running one of the reports in the "CSV Exports" menu.
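For comparison, writing a filtered subset to CSV by hand takes only the standard library. The sketch below is not the format of any HEADMasterSEO report: the column names in `FIELDS` and the function `export_csv` are hypothetical, and the sample rows are invented.

```python
import csv
import io

FIELDS = ["url", "status", "response_ms"]  # assumed column layout


def export_csv(results, out):
    """Write the rows to the open text stream `out`; return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for row in results:
        writer.writerow(row)
        count += 1
    return count


broken = [
    {"url": "https://example.com/missing", "status": 404, "response_ms": 60},
    {"url": "https://example.com/gone", "status": 410, "response_ms": 95},
]
buffer = io.StringIO()
written = export_csv(broken, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open(path, "w", newline="", encoding="utf-8")`; the `newline=""` argument keeps the `csv` module from doubling line endings on Windows.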
How To Export All Dead (Broken) Links
Select the "Page Not Found" or "4XX Client Errors" filter. The latter also includes 410 Gone and every other link with a status code from 400 to 499. Then choose "Standard Report" from the "CSV Exports" menu to export all dead links.
How To Export All Redirected Sitemap Links
Select the "3XX - Redirects" filter. Run the "Redirect Details" report from the "CSV Exports" menu.
How To Export Server Error Sitemap Links
Select the "5XX - Server Errors" filter. Run the "Standard Report" from the "CSV Exports" menu.