Copy Object in Ruby

December 24, 2008 / category: Ruby / 10 comments

Sometimes, when we are working with objects in Ruby, we want to make a copy of them. But what for? Well, in most cases we want to have a working copy and still keep the original object intact. Switching a reference back to the original object is much simpler than repairing an object's state.

To our relief, Ruby makes object copying easy. It provides the clone() method that can be applied to any object. Let's follow the example:


class Thief
  attr_accessor :name, :hot_goods

  def initialize(name, hot_goods)
    @name = name
    @hot_goods = hot_goods
  end

  def introduce
    puts "My name is #{@name}, I have stolen: #{@hot_goods.join(', ')}."
  end
end

The Thief class above provides a constructor and the introduce() method. It represents a thief: his name and the hot goods he was clever enough to steal. Let's use this class:

poor = Thief.new('Poor', ['a pen', 'a dollar'])
poor.introduce

The output is:

My name is Poor, I have stolen: a pen, a dollar.

Our first thief, Poor, didn't hit the jackpot. But let's imagine another thief named Richie. Richie has stolen the same things that Poor has, but he has also stolen some valuables:

richie = poor.clone
richie.name = 'Richie'
richie.hot_goods << 'famous painting'
richie.hot_goods << 'a good car'

Let Richie introduce himself:

richie.introduce

The output is:

My name is Richie, I have stolen: a pen, a dollar, famous painting, a good car.

That's just what we wanted to achieve. By using the clone() method, we have reached our goal with a minimal amount of effort. Let's go back to Poor now:

poor.introduce

The output is:

My name is Poor, I have stolen: a pen, a dollar, famous painting, a good car.

Something has definitely gone wrong. Poor didn't steal all these things! What happened? You might have guessed already. It's all because of the clone() method.

The clone() method performs a shallow copy. This means that it produces a new object of the same type with a different object_id, but it doesn't perform a deep copy, i.e. it doesn't clone the object's attributes, the attributes' attributes, etc. The copy's instance variables refer to the very same objects as the original's. Let's make a few simple checks:

puts poor
puts richie

The output should be similar to this:

#<Thief:0xb7c66fa8>
#<Thief:0xb7c66f94>

That is correct: we have two Thief instances with different ids. Let's go one step further now:

printf("0x%x\n", poor.name.object_id)
printf("0x%x\n", richie.name.object_id)

The output should be similar to this:

0xfdbe23014
0xfdbe22f38

In both cases, the name attribute is a String instance and the two instances have different ids. That's absolutely correct. But what about the hot_goods attribute?

printf("0x%x\n", poor.hot_goods.object_id)
printf("0x%x\n", richie.hot_goods.object_id)

The output should be similar to this:

0xfdbe256ca
0xfdbe256ca

That's it! Both Thief instances refer to the same Array instance. But what's the difference between the name and the hot_goods attributes? I would say: slight and tricky. Let's look again at how we created and changed the Richie Thief instance:

richie = poor.clone
richie.name = 'Richie'
richie.hot_goods << 'famous painting'
richie.hot_goods << 'a good car'

The richie.name = 'Richie' line created a new String object on the right side of the assignment. The name attribute of richie was then changed to refer to the new 'Richie' string instance, while poor.name kept referring to the old one.

The next two lines use the << operator. It doesn't replace the object reference; it modifies the existing array in place by appending an element to it. That is why both Thief instances still refer to the same array and both see the new elements.
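The difference between replacing a reference and changing an object in place can be shown without the Thief class at all. The following is only a minimal sketch; the method names add_by_rebinding and add_in_place are made up for this illustration:

```ruby
# Rebinding: builds a new array and points the local variable at it.
# The array passed in by the caller is left untouched.
def add_by_rebinding(goods, item)
  goods = goods + [item]
  goods
end

# Mutation: << appends to the very array the caller passed in
# and returns that same array.
def add_in_place(goods, item)
  goods << item
end

loot = ['a pen', 'a dollar']

bigger = add_by_rebinding(loot, 'famous painting')
puts loot.inspect          # ["a pen", "a dollar"]
puts bigger.equal?(loot)   # false

same = add_in_place(loot, 'a good car')
puts loot.inspect          # ["a pen", "a dollar", "a good car"]
puts same.equal?(loot)     # true
```

The equal? method compares object identity, so it tells us whether two references point to the same object.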

What should we do in this kind of situation? Well, we can either write a clone() method for the specific class that copies the attributes as well, or use the marshalling mechanism. The first solution can be implemented like this:

class Thief
  ...

  def clone
    Thief.new(self.name, self.hot_goods.clone)
  end

  ...
end

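That's it! With this clone() method, each copy gets its own hot_goods array, so adding goods to one thief no longer affects the other. It works because we tell Ruby to clone the hot_goods attribute. Note that this copy goes only one level deeper: the name string and the strings inside the array are still shared between both thieves, which is harmless as long as they are replaced rather than modified in place.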
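As an alternative to overriding clone() directly, Ruby calls the initialize_copy hook on the new object for both clone and dup. The version below is a sketch of that approach, not the code from this article; it also duplicates the name and every string in the array, which is an extra step chosen here for illustration:

```ruby
class Thief
  attr_accessor :name, :hot_goods

  def initialize(name, hot_goods)
    @name = name
    @hot_goods = hot_goods
  end

  # Ruby calls this on the freshly made copy for both clone and dup;
  # source is the object being copied.
  def initialize_copy(source)
    super
    @name = source.name.dup
    # Copy the array and each string in it, so nothing is shared.
    @hot_goods = source.hot_goods.map(&:dup)
  end
end

poor = Thief.new('Poor', ['a pen', 'a dollar'])
richie = poor.clone
richie.name = 'Richie'
richie.hot_goods << 'famous painting'

puts poor.hot_goods.inspect    # ["a pen", "a dollar"]
puts richie.hot_goods.inspect  # ["a pen", "a dollar", "famous painting"]
```

This handles strings and arrays of strings only; attributes that hold more deeply nested objects would need their own copying logic.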

A solution that uses the marshalling mechanism would look like this:

richie = Marshal.load(Marshal.dump(poor))

This code first serializes the given object to a string and then recreates it from that string. Marshalling is a more general solution: it can be used for almost any object. It has some limitations, though. If the object contains context-sensitive information like bindings or IO class instances, or if it is a singleton object, it cannot be serialized. In such a situation, we should provide a clone() method implementation, just like we did before.
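The marshalling round trip and one of its limitations can be tried out as follows. This is a small sketch; the helper name deep_copy is hypothetical and not part of Ruby:

```ruby
# Hypothetical helper: serialize the object and rebuild it from the dump.
def deep_copy(object)
  Marshal.load(Marshal.dump(object))
end

goods = ['a pen', 'a dollar']
copy = deep_copy(goods)

# Even the strings inside the array are separate objects now.
copy.first << '!'
puts goods.inspect  # ["a pen", "a dollar"]
puts copy.inspect   # ["a pen!", "a dollar"]

# An object that holds an IO instance cannot be dumped.
begin
  deep_copy([$stdout])
rescue TypeError => e
  puts "Cannot marshal: #{e.message}"
end
```

Because the whole object graph is serialized, this approach is usually slower than a hand-written copy, and Marshal.load should only be given data that comes from a trusted source.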

I hope this article was a good teaching aid in understanding the way Ruby clones objects. Thanks for reading!

Comments

There are 10 comments

Marco Colli
June 01, 2009 10:29 AM

Great article! I couldn't understand how the clone method works until I read this explanation.

Deepti
September 28, 2009 09:56 PM

Thanks for putting this up. For a newbie, this is very valuable.

Richard Muller
March 16, 2011 12:51 AM

Great article. I wrote my_string = "xxx" and then saved_copy = my_string. Got shocked once more when changes to my_string modified my saved_copy. And then the light went on: In Ruby, they're assigned the same object_id's. Grrr!

Thanks for a deeper analysis than I've seen so far.

Patternexon
April 02, 2011 07:44 PM

Thanks! This was exactly what I was looking for.

Lucas
May 16, 2012 03:00 PM

Thanks for this.

Mat
April 09, 2013 08:12 AM

There is a native implementation to perform deep clones of ruby objects.

gem install ruby_deep_clone

require "deep_clone"
object = SomeComplexClass.new()
cloned_object = DeepClone.clone(object)

It's approximately 6 to 7 times faster than the Marshal approach and even works with frozen objects.

Tom Ryan
July 29, 2013 07:06 AM

thank you!! there was nowhere else on the web i was gonna find this level of information to solve my problem. i needed every bit of it

Fred
January 19, 2014 07:38 PM

You might find my gem useful for doing deep copies in a controlled fashion:

https://rubygems.org/gems/deep_dive

Will work with any object graph of arbitrary complexity, and you can specify which instance variables to reference instead of copy.

Can handle circular references, of course.

To install it:

gem install deep_dive

jd
July 20, 2014 02:30 AM

thx!

Matt Fernandez
October 20, 2015 02:00 AM

This was very helpful, thank you!
