Sitemap Formats

There are several types of sitemaps, each with its own purpose. Sitemaps date back to 2005, when Google officially launched the format. In November 2006, Yahoo and Microsoft joined Google in supporting a common sitemap protocol, an unusual partnership given how rarely competitors run joint projects such as this one. Sitemaps are designed for use by search engines as well as by people browsing a site. When you create a sitemap, you can choose from various formats, although most sitemap generators support two main ones: HTML sitemaps and XML sitemaps.

XML Sitemaps

The basic XML Sitemap file can take any name you give it and does not have to be stored in the website root directory, although that is the ideal location. The file must be UTF-8 encoded text, and URLs that contain special characters such as the ampersand (&), quotation marks, and the less-than and greater-than signs must be entity-escaped so that search engines crawl the URLs in the Sitemap correctly. XML Sitemaps can be submitted uncompressed as .xml files or compressed with gzip as .gz files.
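The escaping and compression steps described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `escape_url` and `gzip_sitemap` and the example.com URL are placeholders chosen for this sketch, not part of the protocol.

```python
import gzip
from xml.sax.saxutils import escape


def escape_url(url: str) -> str:
    """Entity-escape the characters that XML treats specially."""
    # escape() handles &, < and > by default; the quote characters
    # are added through the optional entities mapping.
    return escape(url, {"'": "&apos;", '"': "&quot;"})


def gzip_sitemap(xml_text: str) -> bytes:
    """Encode the sitemap text as UTF-8 and compress it with gzip."""
    return gzip.compress(xml_text.encode("utf-8"))


# Hypothetical URL with a query string that needs escaping.
escaped = escape_url("http://www.example.com/view?widget=3&page=2")
compressed = gzip_sitemap("<loc>" + escaped + "</loc>")
```

The compressed bytes would be written to a file with a .gz extension before submission. Note that XML libraries that build the document for you usually apply this escaping automatically, so it should not be applied twice.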

The XML Sitemap protocol defines a set of structured XML tags, some of them mandatory and others optional. This allows webmasters to describe their pages in detail as they create a sitemap. These details include the URL (the only field required for each entry), the date the page was last modified, how frequently its content changes, and the page’s priority relative to other pages in the Sitemap.
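To make the tag structure concrete, here is a hedged sketch that builds a small XML Sitemap with Python's standard `xml.etree.ElementTree` module. The tag names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the sitemaps.org protocol; the function name `build_sitemap`, the dictionary layout, and the example URLs are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from a list of dicts; only 'loc' is required."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # ElementTree entity-escapes text such as '&' when serializing.
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional tags are written only when the page supplies them.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    {"loc": "http://www.example.com/?a=1&b=2", "lastmod": "2024-01-15",
     "changefreq": "weekly", "priority": 0.8},
    {"loc": "http://www.example.com/about"},
])
```

The second entry carries only a URL, which is all the protocol requires; the first shows the three optional tags filled in.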

XML Sitemap files are limited in size: under the current protocol a single file cannot exceed 50MB (52,428,800 bytes) uncompressed (the original limit was 10MB), and it can contain at most 50,000 URL entries. Because these limits are too restrictive for enterprise-size websites, the protocol also defines a Sitemap Index file. A Sitemap Index file can reference up to 50,000 individual Sitemap files, each holding up to 50,000 links, for a total of up to 2.5 billion links. Creating multiple sitemaps is therefore highly recommended for sites with many thousands of links.
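The arithmetic and the splitting step can be sketched as follows. The `sitemapindex`, `sitemap`, and `loc` tags come from the sitemaps.org protocol; the helper names `chunk_urls` and `build_sitemap_index` and the file naming scheme are hypothetical choices for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_ENTRIES = 50_000  # per-file cap on URLs, and per-index cap on sitemaps

# 50,000 sitemaps x 50,000 URLs each = 2.5 billion URLs at most.
MAX_TOTAL_URLS = MAX_ENTRIES * MAX_ENTRIES


def chunk_urls(urls, size=MAX_ENTRIES):
    """Split a list of URLs into lists of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build a Sitemap Index that points at each individual sitemap file."""
    if len(sitemap_urls) > MAX_ENTRIES:
        raise ValueError("a Sitemap Index can reference at most 50,000 files")
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="unicode")


# A hypothetical site with 120,000 pages needs three sitemap files.
all_urls = [f"http://www.example.com/page{i}" for i in range(120_000)]
chunks = chunk_urls(all_urls)
index_xml = build_sitemap_index(
    [f"http://www.example.com/sitemap{n}.xml.gz"
     for n in range(1, len(chunks) + 1)]
)
```

Each chunk would be written out as its own sitemap file, and only the index file is then submitted to the search engine.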

HTML Sitemaps

HTML sitemaps are real web pages that contain links to all the pages on your website. The number of HTML sitemaps you need depends on the size of your site. A small site can have a single HTML sitemap, but on an enterprise-size site the sitemap will most probably be designed as a set of content archives organized by section and then divided by publication date.

One of the most useful aspects of an HTML sitemap is that it links to all the published content on your site. Suppose, for instance, that you run a large news site whose section pages list only the last 24 to 48 hours of stories before those stories drop off the running pages. An XML Sitemap supplies these links to be crawled, but that alone is not enough. A content archive gives every published page at least one permanent link from within the site itself. This makes it easy for human readers to find their favorite stories in the future. It also helps when you create a content inventory for the site, because you can still access your old stories in one place and improve on them.

RSS Feeds

Sitemaps can also come in other formats, such as Atom feeds or RSS feeds. If your site publishes several new pages daily, you should find a way to present them properly and archive them for the future. RSS feeds signal freshness to search engines because search engine bots read them more often, typically several times a day. Once you publish your pages, the search engines can therefore discover and crawl them much sooner.

Tips on how to create RSS feeds for your site

  • Make sure it has fewer than 500 URLs, or no more than the last 7 days’ worth of published content links. Since the feed is a freshness update, you should keep it fresh and clean.
  • Make sure your RSS feed link remains constant and does not change from time to time.

Avoid using date stamps and other incremental notations in the names of these files, because they cause the feed’s URL to change every time it is generated. Search engines will find it difficult to crawl such links effectively.
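The tips above can be sketched as a small feed builder. This is an illustrative example, not a complete RSS implementation: the function names, the item dictionary layout, and the channel details are assumptions, and the sketch applies both limits at once (7 days and fewer than 500 items) even though the tip treats them as alternatives.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
import xml.etree.ElementTree as ET

MAX_ITEMS = 499    # "fewer than 500 URLs"
MAX_AGE_DAYS = 7   # only the last 7 days of published content


def select_recent(items, now):
    """Keep items from the last MAX_AGE_DAYS days, newest first."""
    cutoff = now - timedelta(days=MAX_AGE_DAYS)
    recent = [item for item in items if item["published"] >= cutoff]
    recent.sort(key=lambda item: item["published"], reverse=True)
    return recent[:MAX_ITEMS]


def build_rss(items, now):
    """Build a minimal RSS 2.0 document from the recent items."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Channel details are placeholders for a hypothetical site.
    ET.SubElement(channel, "title").text = "Example site updates"
    ET.SubElement(channel, "link").text = "http://www.example.com/"
    ET.SubElement(channel, "description").text = "Recently published pages"
    for entry in select_recent(items, now):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "pubDate").text = format_datetime(entry["published"])
    return ET.tostring(rss, encoding="unicode")


now = datetime(2024, 1, 15, tzinfo=timezone.utc)
feed_xml = build_rss([
    {"title": "New story", "link": "http://www.example.com/new",
     "published": now - timedelta(days=1)},
    {"title": "Old story", "link": "http://www.example.com/old",
     "published": now - timedelta(days=10)},
], now)
```

To keep the feed link constant, the output would be written to the same path on every run (for example a fixed name such as feed.xml) rather than to a file name that embeds the date.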

Video Sitemaps

If your website publishes new and original video content, you should let search engines know about it. Videos are exceptionally hard to index, so you have to include accurate metadata, such as a title, a description, and a thumbnail, with each item you publish. Doing this makes the media relevant to search queries. Before creating your video Sitemap, carefully review the protocol specifications. Once you have created the sitemap, follow the instructions and submit it through the search engine’s webmaster tools.
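As an illustration of the per-video metadata, the sketch below builds a video Sitemap using the tag names and namespace of Google's video sitemap extension (`video:video`, `video:thumbnail_loc`, `video:title`, `video:description`, `video:content_loc`). The function name, the input layout, and the URLs are hypothetical. The extension also accepts a player URL in place of the content URL; this simplified sketch requires the content URL only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"
# Register prefixes so the output uses xmlns and xmlns:video.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("video", VIDEO_NS)

# Metadata this sketch insists on for every video.
REQUIRED_FIELDS = ("thumbnail_loc", "title", "description", "content_loc")


def build_video_sitemap(entries):
    """entries is a list of (page_url, metadata_dict) pairs."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, metadata in entries:
        missing = [name for name in REQUIRED_FIELDS if not metadata.get(name)]
        if missing:
            raise ValueError(f"{page_url}: missing video metadata {missing}")
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        for name in REQUIRED_FIELDS:
            ET.SubElement(video, f"{{{VIDEO_NS}}}{name}").text = metadata[name]
    return ET.tostring(urlset, encoding="unicode")


video_xml = build_video_sitemap([
    ("http://www.example.com/videos/grilling.html", {
        "thumbnail_loc": "http://www.example.com/thumbs/grilling.jpg",
        "title": "Grilling steaks for summer",
        "description": "A short guide to grilling steaks.",
        "content_loc": "http://www.example.com/media/grilling.mp4",
    }),
])
```

Rejecting entries with missing fields up front is a design choice of this sketch: incomplete metadata is the most likely reason a video entry is ignored, so it is cheaper to fail while generating the file than after submitting it.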

Mobile Sitemaps

A mobile sitemap is ideal for site owners who have a large number of pages dedicated to mobile device browsers and who have not yet invested in a responsive design for their sites. This sitemap adds new tags that identify the listed links as mobile content. With more people accessing the web via mobile devices, it is highly recommended that you create mobile sitemaps for such a site.

Image Sitemaps

You should create an Image Sitemap if your website is rich in images (especially original content) and you want to have all or nearly all of the images on your site indexed. By creating an Image Sitemap, you make it easy for search engines to identify the most relevant and important images on the site to present in response to search queries.
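A hedged sketch of an Image Sitemap follows. The `image:image` and `image:loc` tags and the namespace belong to Google's image sitemap extension; the function name, the dictionary layout, and the URLs are assumptions for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
# Register prefixes so the output uses xmlns and xmlns:image.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """pages maps each page URL to the list of image URLs shown on it."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        # Each image on the page gets its own image:image entry.
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


image_xml = build_image_sitemap({
    "http://www.example.com/gallery.html": [
        "http://www.example.com/img/a.jpg",
        "http://www.example.com/img/b.jpg",
    ],
})
```

Images are grouped under the page that displays them, so a single page entry can list every image you want indexed for that page.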

Text File Format

For sites with few pages, a plain text file with the .txt extension is the simplest format to use for a Sitemap. You can easily create a text Sitemap with Notepad or any other simple text editor. Here are the general guidelines to follow when you create your text Sitemap:

  • List one link per line.
  • URLs/links should not have any line breaks or contain any other information.
  • Each URL must be written in full, including the protocol (http:// or https://).

Additionally, each search engine may put in place other guidelines that include:

  • The number of URLs in one file should not be more than 50,000. If your site has over 50,000 URLs, then you should use multiple text files.
  • Text file size must not exceed the limit set by the search engine. The current sitemap protocol allows 50MB (52,428,800 bytes); the original limit was 10MB (10,485,760 bytes).
  • The text file should use UTF-8 encoding, so save your file in the UTF-8 format.
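These guidelines are simple enough to check automatically. The sketch below is one possible validator, with a hypothetical function name and message wording; the byte limit is passed in as a parameter so that it can follow whichever size limit the target search engine documents.

```python
from urllib.parse import urlsplit


def check_text_sitemap(text, max_bytes, max_urls=50_000):
    """Return a list of problems found; an empty list means it passed."""
    problems = []
    lines = text.splitlines()
    if len(lines) > max_urls:
        problems.append(f"too many URLs: {len(lines)} > {max_urls}")
    # Size is measured on the UTF-8 encoded bytes, as the file is saved.
    size = len(text.encode("utf-8"))
    if size > max_bytes:
        problems.append(f"file too large: {size} > {max_bytes} bytes")
    for number, line in enumerate(lines, start=1):
        if line != line.strip() or " " in line:
            problems.append(f"line {number}: extra whitespace or content")
            continue
        parts = urlsplit(line)
        # A full URL needs both a scheme and a host name.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"line {number}: not a full http(s) URL")
    return problems


good = "http://www.example.com/\nhttp://www.example.com/about\n"
bad = "www.example.com/page\nhttp://www.example.com/a b\n"
good_result = check_text_sitemap(good, max_bytes=1_000)
bad_result = check_text_sitemap(bad, max_bytes=1_000)
```

Here `good_result` is empty, while `bad_result` reports one problem per line: the first URL lacks its protocol and the second contains a space.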

Whatever the size and nature of your website, you should maintain at least two of the above sitemap formats. Other formats are optional, but HTML and XML sitemaps are highly recommended. A sitemap generator such as DYNO Mapper can create sitemaps in the formats of your choice quickly.

