Generate Sitemap in Rails

December 09, 2008 / category: Ruby on Rails / 6 comments

If you don't know what a sitemap is, I strongly encourage you to fill that gap first and then come back to this article.

It seems clear that updating a sitemap manually can turn into a nightmare. Fortunately, we can make Rails do the job for us.



So, what components do we need to build a dynamic sitemap? We need a separate action (or even a separate controller), an XML view and model methods to provide us with URL data. Let's start by creating the Sitemap controller and its index action:

$ script/generate controller Sitemap index

The command above creates the app/views/sitemap/index.html.erb file, but we need to change its extension to rxml, because we want an XML view rather than the usual HTML one. Let's rename the file now:

$ mv app/views/sitemap/index.html.erb app/views/sitemap/index.rxml

To instruct Rails to send the appropriate Content-Type header, we need to set it in the headers hash inside the controller. We also have to disable layout rendering:

class SitemapController < ApplicationController
  def index
    @urls = ['http://lukaszwrobel.pl/', 'http://lukaszwrobel.pl/about-me']

    headers['Content-Type'] = 'application/xml'
    render :layout => false
  end
end

The next thing we need is a suitable XML view, located in the app/views/sitemap/index.rxml file. Assuming that the @urls instance variable contains a list of all URLs we want to show in the sitemap, our XML view looks like this:

xml.instruct! :xml, :version => '1.0'
xml.tag! 'urlset', 'xmlns' => "http://www.sitemaps.org/schemas/sitemap/0.9" do
  @urls.each do |url|
    xml.tag! 'url' do
      xml.tag! 'loc', url
    end
  end
end

When we open the OUR_SITE_URL/sitemap page, we should see a properly generated XML sitemap:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://lukaszwrobel.pl/</loc>
  </url>
  <url>
    <loc>http://lukaszwrobel.pl/about-me</loc>
  </url>
</urlset>
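
To make the mapping from URL list to XML more tangible, here is a plain Ruby sketch that mimics the output of the view without Rails or Builder. It is only an illustration, not how Rails renders the view internally. The render_sitemap and xml_escape helpers and the example.com URLs are made up for this sketch. It also shows one detail the view handles for us: Builder escapes text content, so an ampersand in a URL ends up as an entity, which is what the sitemap protocol expects.

```ruby
# Plain Ruby stand-in for the Builder view (illustrative only).
XML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
  '"' => '&quot;', "'" => '&apos;'
}.freeze

# Replace characters that must not appear verbatim in XML text.
def xml_escape(text)
  text.gsub(/[&<>"']/, XML_ESCAPES)
end

# Hypothetical helper: turn a list of URLs into a sitemap document.
def render_sitemap(urls)
  lines = ['<?xml version="1.0" encoding="UTF-8"?>',
           '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
  urls.each do |url|
    lines << '  <url>'
    lines << "    <loc>#{xml_escape(url)}</loc>"
    lines << '  </url>'
  end
  lines << '</urlset>'
  lines.join("\n") + "\n"
end

puts render_sitemap(['http://example.com/',
                     'http://example.com/search?q=rails&page=2'])
```

Running the script prints a document with the same shape as the one above, with the second URL's ampersand written as an entity.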

But the truth is that we haven't achieved anything interesting yet; we still have to enter the URL list manually.

The next part is application-specific. If we maintain a list of products and want to put the URL of every product page in our sitemap, we have to fetch all product ids, names etc. using the find() method and build the URLs from them. The url_for() method can be put to good use here. The automatically generated URLs can then be merged with the @urls array inside the Sitemap controller.
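
As a rough sketch of that merging step, the plain Ruby below builds one URL per product and appends the results to a static list. Everything here is an assumption made for illustration: the Product struct stands in for a real model, the product_page_url helper stands in for url_for() or a named route helper, and the /products/ID-name URL scheme and example.com host are invented. In a Rails application the product list would come from find() instead of being written out by hand.

```ruby
# Stand-in for an ActiveRecord model; in a Rails app this would be the real class.
Product = Struct.new(:id, :name)

BASE_URL = 'http://example.com' # assumed site root

# Hypothetical helper: build a product page URL of the form /products/ID-name.
def product_page_url(product)
  slug = product.name.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
  "#{BASE_URL}/products/#{product.id}-#{slug}"
end

# Merge hand-written URLs with the generated product URLs.
def sitemap_urls(static_urls, products)
  static_urls + products.map { |product| product_page_url(product) }
end

static_urls = ["#{BASE_URL}/", "#{BASE_URL}/about-me"]
products = [Product.new(1, 'Red Chair'), Product.new(2, 'Oak Table')]

puts sitemap_urls(static_urls, products)
```

The result of sitemap_urls is what would be assigned to @urls in the controller's index action.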

Last but not least, for some reason we may want to pretend that the sitemap is a file rather than a controller action. To achieve this, we need to tweak the config/routes.rb file a little by adding the following line:

map.sitemap 'sitemap.xml', :controller => 'sitemap'

Voila! The sitemap is now available at the OUR_SITE_URL/sitemap.xml address.

Comments


Jo
April 20, 2009 08:47 AM

Was helpful.. thanks... :)

Lukasz Wrobel
April 22, 2009 06:40 PM

No problem, I'm glad I could help.

ITContractorMortgages
September 23, 2010 05:44 PM

Nice article, thanks for that.

Is there an extension which does the same thing yet?

Good chance to write one!

Lukasz Wrobel
September 23, 2010 06:13 PM

Actually, there are a few Rails plugins available that can be used to generate a sitemap. I haven't tried them, but here is a short list in case you're interested:

http://github.com/queso/sitemap

http://aktagon.com/projects/rails/sitemap-generator.html

http://agilewebdevelopment.com/plugins/chris_martin

The first and second work in a way similar to the one described in my article, while the third is more of a crawler.

sumit bisht
July 08, 2013 05:27 AM

Thanks! But the <url> and <loc> tags are not enough for a proper sitemap; you need <lastmod>, <changefreq> and <priority> as well: http://www.sitemaps.org/protocol.html. As this post is probably getting old, I'd recommend using a newer gem like https://github.com/kjvarga/sitemap_generator for automatic sitemap generation.

Lukasz Wrobel
July 11, 2013 09:01 AM

@sumit bisht:

I wouldn't say that the <loc> tag alone is insufficient. The truth is, search engines are much better than most of us at detecting change frequency and modification date. Not to mention priority - what are the search engines supposed to do with it? Take it literally, as they once used to do with keywords?

Probably for these reasons, even the sitemap protocol defines the <lastmod>, <changefreq> and <priority> tags as optional.

I've been using dead-simple sitemaps like the one described above on many websites, including some with a PageRank of 7, and have never encountered any problems.
