
Creating a sitemap for auditing your site

Suppose you're getting ready to migrate your website from Drupal 6 to 7, or maybe you just inherited a site. You'll want to do an audit to learn what content is on the site and how it is structured. When it comes time for a site audit, it’s good to start with a sitemap.

A sitemap is a list of the pages on a website. The sitemap may take many forms, from machine-readable XML, to an HTML listing of page names, to a text list of all the page URLs. For a content audit, both the HTML listing and the text list give you not only an excellent view into the content, but a good idea of the structure of the site as well.
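To illustrate how a plain list of URLs can reveal site structure, here is a minimal Python sketch that counts pages by their first path segment. The function name `count_by_section` and the example.com URLs are made up for illustration; substitute the URLs from your own sitemap.

```python
from collections import Counter
from urllib.parse import urlparse


def count_by_section(urls):
    """Count URLs by first path segment, e.g. /blog/first-post -> 'blog'."""
    counts = Counter()
    for url in urls:
        # Split the path into its non-empty segments.
        segments = [s for s in urlparse(url).path.split("/") if s]
        # URLs with no path segments are counted as the home page.
        section = segments[0] if segments else "(home)"
        counts[section] += 1
    return counts


# Hypothetical sample data, not taken from a real site.
sample_urls = [
    "http://example.com/",
    "http://example.com/about",
    "http://example.com/blog/first-post",
    "http://example.com/blog/second-post",
]

for section, total in sorted(count_by_section(sample_urls).items()):
    print(section, total)
```

A section with an unexpectedly large count is often a good place to start looking for outdated or duplicate content.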

The quickest and most thorough way to build a sitemap is to generate it automatically.  Several free tools on the web will do this for you. One of my favorites is http://www.xml-sitemaps.com. It will generate a sitemap for sites up to 500 pages.

Generate a sitemap

To generate a sitemap:

  1. Navigate to http://www.xml-sitemaps.com.
  2. Enter the URL for your website. Note: If your site uses https instead of http, you’ll need to enter the ‘s’.
  3. Click on the ‘Start’ button to have it crawl the pages on your site.

When you click Start, the tool visits all the pages on your site and makes a list of all the URLs.

Download your sitemap

After a few minutes, or longer depending on the size of your site, the tool will produce a list of sitemaps in different formats for you to download. For a sitemap consisting of HTML links, download sitemap.html. For a sitemap consisting of URLs, download urllist.txt.

Check your results

Here is a sample of the HTML sitemap (sitemap.html):

And a sample of the text format (urllist.txt):

Once you have this list of pages, you can copy and paste them into an Excel spreadsheet. From there, you can track and review each page of your site. 
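If you would rather not copy and paste, a short script can turn the text list into a CSV file that Excel opens directly. The sketch below assumes urllist.txt contains one URL per line. The function name `urls_to_csv` and the column names (Path, Depth, Reviewed, Notes) are my own suggestions for tracking an audit, not something the tool produces.

```python
import csv
import io
from urllib.parse import urlparse


def urls_to_csv(url_text):
    """Convert text with one URL per line into CSV text with audit columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical audit columns; rename or extend them to suit your review.
    writer.writerow(["URL", "Path", "Depth", "Reviewed", "Notes"])
    for line in url_text.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        path = urlparse(url).path or "/"
        # Depth is the number of non-empty path segments (home page = 0).
        depth = len([s for s in path.split("/") if s])
        writer.writerow([url, path, depth, "", ""])
    return out.getvalue()


# Hypothetical sample input standing in for the contents of urllist.txt.
sample = "http://example.com/\nhttp://example.com/blog/first-post\n"
print(urls_to_csv(sample))
```

To use it on a real download, read urllist.txt into a string, pass it to `urls_to_csv`, and write the result to a file ending in .csv. The empty Reviewed and Notes columns are there to fill in as you work through each page.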

More resources

XML-sitemaps.com is one of many tools on the web that can create sitemaps and assist with auditing your site. Other audit tools include:

Content Insight: This is a more comprehensive audit tool that provides information such as screenshots and page-level details.

Sleuth: This free tool, which runs on a Windows machine, will crawl a site and report broken links.

What is your favorite?

We'd like your help compiling a list of favorite content audit tools. Please share your tips and favorites in the comments below.


Comments

Hi, thanks for your interesting blog post. I found it very useful. It inspired some further thinking. Creating a sitemap using a crawling tool is a good start, but there is more to be gained from also including data from the content management system. And after you've collected all of the information, the most important stuff comes: analyzing, assessing, and improving. Having collected my thoughts, I devoted a blog post to this exact topic. Hope it helps. http://www.xillio.com/blog/the-gap-between-a-sitemap-and-a-content-audit

Hi Bas, Yes, a sitemap is just one tool for auditing a site. As your post points out, we can access much more information for use in understanding a website. Thanks for the input!

Is it possible to use Sitemaps to do a content audit (& generate a spreadsheet) of an intranet, behind a firewall?

Hi Karen,

I imagine that any application would need to be authenticated or local to your intranet to see a site behind your firewall. You will need to use a different tool than Sitemap.

Thanks ... Bummer, though! I was hoping to avoid manually creating a spreadsheet. :-( Any other suggestions would be welcome.