Is An XML Or HTML Sitemap Better For SEO?

In this edition of Ask An SEO, we break down a common point of confusion for site owners and technical SEOs:

Do I need both an XML sitemap and an HTML one, and which one is better to use for SEO?

It can be a bit confusing to know whether it’s better to use an XML sitemap or an HTML one for your site. In some instances, neither is needed, and in some, both are helpful. Let’s dive into what they are, what they do, and when to use them.

What Is An XML sitemap?

An XML sitemap is essentially a list of URLs for the pages and files on your website that you want search bots to find and crawl. You can also use it to supply details about those files, such as the running time of a video or the publication date of an article.

It is primarily used for bots. There is little reason why you would want a human visitor to use an XML sitemap. Well, unless they are debugging an SEO issue!
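To make the "list of URLs" idea concrete, here is a minimal sketch of generating a sitemap with Python's standard library. The function name `build_sitemap`, the example.com URLs, and the dates are illustrative assumptions; only the `urlset`, `url`, `loc`, and `lastmod` element names and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for a list of (url, lastmod) pairs.

    lastmod may be None when the last-modified date is unknown.
    (build_sitemap is a hypothetical helper, not part of any library.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages on a placeholder domain.
example_sitemap = build_sitemap([
    ("https://www.example.com/", "2024-01-15"),
    ("https://www.example.com/blog/what-is-a-sitemap", None),
])
print(example_sitemap)
```

Richer details, such as video metadata, use extension namespaces that this sketch leaves out.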

What Is The XML Sitemap Used For?

The purpose of the XML sitemap is to help search bots understand which pages on your website should be crawled, as well as giving them extra information about those pages.

The XML sitemap can help bots identify pages on the site that would otherwise be difficult to find. These can be orphaned pages, pages with few internal links, or even recently changed pages that you want to encourage the bots to recrawl.

Best Practices For XML Sitemaps

Most search bots will understand XML sitemaps that follow the sitemaps.org protocol. This protocol defines where the XML sitemap needs to be located on a site, the schema it needs to use to be understood by bots, and how to prove ownership of domains in the instance of cross-domain references.

There is typically a limit on how large an XML sitemap can be and still be parsed by the search bots. When building an XML sitemap, you should ensure it is under 50 MB uncompressed and contains no more than 50,000 URLs. If your website is larger, you may need multiple XML sitemaps to cover all of the URLs. In that instance, you can use a sitemap index file to organize your sitemaps in one location.
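As a rough sketch of how a large site might split its URLs, the snippet below chunks a URL list at the 50,000-URL limit and builds a sitemap index that points at each child file. The helper names, the `sitemap-N.xml` naming scheme, and the example.com URLs are assumptions for illustration. The sketch does not check the 50 MB size limit, which would need to be verified separately.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # URL limit per sitemap file


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (as a string) listing each child sitemap URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical site with 120,000 product pages.
all_urls = [f"https://www.example.com/product/{n}" for n in range(120_000)]
chunks = chunk_urls(all_urls)  # 3 chunks: 50,000 + 50,000 + 20,000 URLs

# Assumed file naming scheme for the child sitemaps.
child_sitemaps = [
    f"https://www.example.com/sitemap-{i}.xml"
    for i in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(child_sitemaps)
print(index_xml)
```

Each chunk would then be written out as its own sitemap file, with only the index submitted to the search engines.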

As the purpose of the XML sitemap is typically to help bots find your crawlable, indexable pages, the URLs it lists should usually all return a 200 server response code. In most instances, they should also be the canonical versions and not be subject to any crawl or index restrictions.
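These checks can be expressed as a simple filter over crawl data. The sketch below assumes you already have audit results for each page, for instance from a site crawler export. The `PageAudit` fields and the sample records are hypothetical and would need mapping to whatever your crawler actually reports.

```python
from dataclasses import dataclass


@dataclass
class PageAudit:
    """One row of hypothetical crawl data for a single URL."""
    url: str
    status_code: int
    canonical_url: str
    noindex: bool
    blocked_by_robots: bool


def is_sitemap_eligible(page):
    """True if the page returns 200, is self-canonical, and is unrestricted."""
    return (
        page.status_code == 200
        and page.canonical_url == page.url
        and not page.noindex
        and not page.blocked_by_robots
    )


# Made-up audit results on a placeholder domain.
pages = [
    PageAudit("https://www.example.com/", 200, "https://www.example.com/", False, False),
    PageAudit("https://www.example.com/old", 301, "https://www.example.com/old", False, False),
    PageAudit("https://www.example.com/shoes?sort=price", 200, "https://www.example.com/shoes", False, False),
    PageAudit("https://www.example.com/thanks", 200, "https://www.example.com/thanks", True, False),
]

sitemap_urls = [page.url for page in pages if is_sitemap_eligible(page)]
print(sitemap_urls)
```

Only the homepage passes here: the others are excluded for redirecting, pointing their canonical elsewhere, or carrying a noindex.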

Things To Be Aware Of With XML Sitemaps

There may be good reasons to go against “best practice” for XML sitemaps. For example, if you are implementing a lot of redirects, you may wish to include the old URLs in an XML sitemap even though they will return a 301 server response code. Adding a new XML sitemap containing those altered URLs can encourage the bots to recrawl them and pick up the redirects sooner than if they were left to find them by crawling the site. This is especially the case if you have gone to the trouble of removing links to the redirected URLs from the site itself.

What Is An HTML Sitemap?

The HTML sitemap is a set of links to pages within your website. It is usually linked to from somewhere on the site, like the footer, where users can easily reach it if they are specifically looking for it. It doesn’t form part of the main navigation of the site, however; it acts more as an accompaniment to it.

What Is An HTML Sitemap Used For?

The idea of the HTML sitemap is to serve as a catch-all for navigation. If a user is struggling to find a page on your site through your main navigation elements, or search, they can go to the HTML sitemap and find links to the most important pages on your site. If your website isn’t that large, you may be able to include links to all of the pages on your site.

The HTML sitemap pulls double duty. Not only does it work as a mega-navigation for humans, but it can also help bots find pages. As bots will follow links on a website (as long as they are followable), it can help them find pages that are otherwise not linked to, or are poorly linked to, on the site.

Best Practices For HTML Sitemaps

Unlike the XML sitemap, there is no specific format that an HTML sitemap needs to follow. As the name suggests, it tends to be a simple HTML page that contains hyperlinks to the pages you want users to find through it.
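Since there is no required format, the following is just one possible layout: a sketch that groups links under section headings and renders them as plain HTML lists. The `build_html_sitemap` helper, the section names, and the URLs are all made up for illustration.

```python
from html import escape


def build_html_sitemap(sections):
    """Render a dict of {heading: [(title, url), ...]} as simple HTML.

    (Hypothetical helper; any structure of ordinary hyperlinks would do.)
    """
    parts = ["<h1>Sitemap</h1>"]
    for heading, links in sections.items():
        parts.append(f"<h2>{escape(heading)}</h2>")
        parts.append("<ul>")
        for title, url in links:
            # Escape both the URL and the title so special characters are safe.
            parts.append(f'  <li><a href="{escape(url)}">{escape(title)}</a></li>')
        parts.append("</ul>")
    return "\n".join(parts)


# Placeholder sections and pages.
html_sitemap = build_html_sitemap({
    "Guides & Tips": [
        ("Getting started", "https://www.example.com/guides/getting-started"),
        ("Advanced setup", "https://www.example.com/guides/advanced-setup"),
    ],
    "Company": [
        ("About us", "https://www.example.com/about"),
    ],
})
print(html_sitemap)
```

Grouping links under headings is a design choice rather than a requirement; it can help users scan a long page.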

In order to make it usable for bots too, it is important that the links are followable, i.e., they do not have a nofollow attribute on them. It is also prudent to make sure the URLs they link to aren’t disallowed through the robots.txt. It won’t cause you any serious issues if the links aren’t followable for bots; it just stops the sitemap from being useful for bots.
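A quick way to check both conditions is to parse the sitemap page's links and test each one against your robots.txt rules. The sketch below uses Python's built-in `html.parser` and `urllib.robotparser`. The `LinkCollector` class, the `audit_links` function, and the sample HTML and robots.txt content are assumptions for illustration, and the standard-library parser only approximates how any particular search bot interprets robots.txt.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collect (href, has_nofollow) for every <a> tag with an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href:
            rel_tokens = (attr_map.get("rel") or "").lower().split()
            self.links.append((href, "nofollow" in rel_tokens))


def audit_links(html, robots_txt, user_agent="*"):
    """Return (href, reason) for links bots cannot usefully follow."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())  # parse rules without fetching
    collector = LinkCollector()
    collector.feed(html)
    problems = []
    for href, nofollow in collector.links:
        if nofollow:
            problems.append((href, "nofollow"))
        elif not robots.can_fetch(user_agent, href):
            problems.append((href, "disallowed by robots.txt"))
    return problems


# Made-up sitemap markup and robots.txt rules.
sample_html = """
<ul>
  <li><a href="https://www.example.com/guides/">Guides</a></li>
  <li><a href="https://www.example.com/private/report">Report</a></li>
  <li><a rel="nofollow" href="https://www.example.com/offers/">Offers</a></li>
</ul>
"""
sample_robots = "User-agent: *\nDisallow: /private/\n"

problems = audit_links(sample_html, sample_robots)
for href, reason in problems:
    print(href, "->", reason)
```

In this sample, the report link is flagged because it is disallowed and the offers link because it is nofollowed, while the guides link passes.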

Things To Be Aware Of With HTML Sitemaps

Most users are not going to go to the HTML sitemap as their first port of call on a site. It is important to realize that if a user is going to your HTML sitemap to find a page, it suggests that your primary navigation on the site has failed them. It really should be seen as a last resort to support navigation.

Which Is Better To Use For SEO?

So, which is more important for SEO? Well, neither. That is, it really is dependent on your website and its needs.

For example, a small website with fewer than 20 pages may not have a need for either an XML sitemap or an HTML sitemap. In this instance, if all the pages are linked to well from the main navigation system, the chances are high that users and search bots alike will easily be able to find each of the site’s pages without additional help from sitemaps.

However, if your website has millions of pages, and has a main navigation system that buries links several sub-menus deep, an XML sitemap and an HTML sitemap may be useful.

They both serve different purposes and audiences.

When To Use The XML Sitemap

In practice, having an XML sitemap, or several, can help combat crawl issues. It gives a clear list of all the pages that you want a search bot to crawl and index. An XML sitemap can also be very helpful for debugging crawling issues: when you submit it to Google Search Console, you will get an alert if there are issues with it or the URLs it contains. It also lets you home in on the indexing status of the URLs within the XML sitemap. This can be very useful for large websites that have millions of pages.

Essentially, there isn’t really a reason not to use an XML sitemap, apart from the time and cost of creating and maintaining them. Many content management systems will automatically generate them, which can take away some of the hassle.

Really, if you can have an XML sitemap, you might as well. If, however, it will be too costly or developer-resource intensive, it is not critical if your site is fairly small and the search engines already do a good job of crawling and indexing it.

When To Use The HTML Sitemap

The HTML sitemap is more useful when a website’s navigation isn’t very intuitive, or the search functionality isn’t comprehensive. It serves as a backstop to ensure users can find deeply buried pages. An HTML sitemap is particularly useful for larger sites that have a more complicated internal linking structure. It can also show the relationship between different pages well, depending on the structure of the sitemap. Overall, it is helpful to both users and bots, but is only really needed when the website is suffering from architectural problems or is just exceedingly large.

So, in summary, there is no right or wrong answer to which is more important. It is, however, very dependent on your website. Overall, there’s no harm in including both, but it might not be critical to do so.


Featured Image: Paulo Bobita/Search Engine Journal

