Sometimes, when we are working with objects in Ruby, we want to make a copy of them. But what for? Well, in most cases we want a working copy while keeping the original object intact. Pointing a reference back at the original object is much simpler than repairing an object’s state.

To our relief, Ruby makes object copying easy. It provides the clone() method that can be applied to any object. Let’s follow the example:

class Thief
  attr_accessor :name, :hot_goods

  def initialize(name, hot_goods)
    @name = name
    @hot_goods = hot_goods
  end

  def introduce
    puts "My name is #{@name}, I have stolen: #{@hot_goods.join(', ')}."
  end
end

We defined the Thief class with a constructor and an introduce() method. It represents a thief: his name and the hot goods he was clever enough to steal. Let’s use this class:

poor = Thief.new('Poor', ['a pen', 'a dollar'])
poor.introduce

The output is:

My name is Poor, I have stolen: a pen, a dollar.

Our first thief, Poor, didn’t hit the jackpot. Now let’s imagine another thief named Richie. Richie has stolen the same things as Poor, plus some valuables:

richie = poor.clone
richie.name = 'Richie'
richie.hot_goods << 'famous painting'
richie.hot_goods << 'a good car'

Let Richie introduce himself:

richie.introduce

The output is:

My name is Richie, I have stolen: a pen, a dollar, famous painting, a good car.

That’s just what we wanted to achieve. By using the clone() method, we have reached our goal with a minimal amount of effort. Let’s go back to Poor now:

poor.introduce

The output is:

My name is Poor, I have stolen: a pen, a dollar, famous painting, a good car.

Something has definitely gone wrong. Poor didn’t steal all these things! What happened? You might have guessed already: it’s all because of the clone() method.

The clone() method performs a shallow copy. It produces a new object of the same type with a different object_id, but it doesn’t perform a deep copy, i.e. it doesn’t clone the objects that the attributes refer to, their attributes, and so on. Let’s make a few simple checks:

puts poor
puts richie

The output should be similar to this:

#<Thief:0xb7c66fa8>
#<Thief:0xb7c66f94>

That is correct: we have two Thief instances with different ids. Let’s go one step further now:

printf("0x%x\n", poor.name.object_id)
printf("0x%x\n", richie.name.object_id)

The output should be similar to this:

0xfdbe23014
0xfdbe22f38

In both cases, the name attribute is a String instance, and the two instances have different ids. That’s as expected. But what about the hot_goods attribute?

printf("0x%x\n", poor.hot_goods.object_id)
printf("0x%x\n", richie.hot_goods.object_id)

The output should be similar to this:

0xfdbe256ca
0xfdbe256ca

That’s it! Both Thief instances refer to the same Array instance. But what’s the difference between the name and the hot_goods attributes? I would say: slight and tricky. Let’s look again at how we created and changed the richie instance:

richie = poor.clone
richie.name = 'Richie'
richie.hot_goods << 'famous painting'
richie.hot_goods << 'a good car'

The line richie.name = 'Richie' created a new String object on the right side of the assignment. The name attribute was then changed to refer to this new 'Richie' string.

The next two lines use the << operator. It doesn’t replace the object reference; it only modifies the state of the existing array. That is why both Thief instances still refer to the same array, and why a change made through one is visible through the other.
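To see the difference between replacing a reference and changing an object in place, here is a small sketch. It repeats a minimal version of the Thief class so that it runs on its own; the third thief, sly, is a hypothetical name used only for this illustration.

```ruby
# Minimal stand-in for the Thief class from this article.
class Thief
  attr_accessor :name, :hot_goods

  def initialize(name, hot_goods)
    @name = name
    @hot_goods = hot_goods
  end
end

poor = Thief.new('Poor', ['a pen', 'a dollar'])

# Reassignment: += builds a brand new Array and stores it in richie only.
richie = poor.clone
richie.hot_goods += ['famous painting']
puts richie.hot_goods.equal?(poor.hot_goods) # false
puts poor.hot_goods.inspect                  # ["a pen", "a dollar"]

# Mutation: << changes the one Array that sly and poor still share.
sly = poor.clone # hypothetical third thief, for illustration only
sly.hot_goods << 'a good car'
puts sly.hot_goods.equal?(poor.hot_goods)    # true
puts poor.hot_goods.inspect                  # ["a pen", "a dollar", "a good car"]
```

So the trouble is not the << operator itself, but the fact that after a shallow copy both objects hold a reference to the same array.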

What should we do in this kind of situation? Well, we can either write a clone() method for the specific class that performs a deep copy, or use the marshalling mechanism. The first option can be implemented like this:

class Thief
  # ...

  # Build a new Thief that gets its own copy of the hot_goods array.
  def clone
    Thief.new(name, hot_goods.clone)
  end

  # ...
end

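That’s it! With this clone() method, each copy gets its own hot_goods array, so adding to one thief’s loot no longer affects the other. It works because we explicitly tell Ruby to clone the hot_goods attribute. Note that this goes only one level deep: the name string and the strings inside the array are still shared between both objects.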
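As an alternative sketch (not the approach used above), the copying logic can live in the initialize_copy hook, which Ruby calls on the new object created by both clone and dup. The version below also duplicates every string, under the assumption that hot_goods only ever holds strings.

```ruby
class Thief
  attr_accessor :name, :hot_goods

  def initialize(name, hot_goods)
    @name = name
    @hot_goods = hot_goods
  end

  # Called on the fresh copy by clone and dup; source is the original object.
  def initialize_copy(source)
    super
    @name = source.name.dup
    # Assumption: every element is a String, so dup is enough per element.
    @hot_goods = source.hot_goods.map(&:dup)
  end
end

poor = Thief.new('Poor', ['a pen', 'a dollar'])
richie = poor.clone
richie.hot_goods << 'famous painting' # changes only richie's array
richie.hot_goods.first << ' (gold)'   # changes only richie's string

puts poor.hot_goods.inspect   # ["a pen", "a dollar"]
puts richie.hot_goods.inspect # ["a pen (gold)", "a dollar", "famous painting"]
```

Because the hook is shared by clone and dup, both methods return an independent copy without either of them being overridden.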

A solution that uses the marshalling mechanism would look like this:

richie = Marshal.load(Marshal.dump(poor))

This code first serializes the given object to a string and then recreates it from that string. Marshalling is a more general solution: it can be used for almost any object, and it copies the whole object graph. It has some limitations, though. If the object contains context-sensitive information such as bindings, procs, or IO instances, or if it is a singleton object, it cannot be serialized and Marshal.dump raises a TypeError. In such a situation, we should provide our own clone() implementation, just like we did before.
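Here is a short sketch of both sides of marshalling: a successful round trip, and the failure we get when the object holds something that cannot be dumped. The deep_copy helper is a hypothetical name introduced for this example; it is not part of Ruby.

```ruby
class Thief
  attr_accessor :name, :hot_goods

  def initialize(name, hot_goods)
    @name = name
    @hot_goods = hot_goods
  end
end

# Hypothetical helper: deep copy via a Marshal round trip.
def deep_copy(object)
  Marshal.load(Marshal.dump(object))
end

poor = Thief.new('Poor', ['a pen', 'a dollar'])
richie = deep_copy(poor)
richie.hot_goods << 'famous painting'
puts poor.hot_goods.inspect # ["a pen", "a dollar"]

# A Proc cannot be serialized, so the whole dump fails.
schemer = Thief.new('Schemer', [proc { 'a getaway plan' }])
begin
  deep_copy(schemer)
rescue TypeError => e
  puts "Cannot marshal this thief: #{e.message}"
end
```

When the TypeError shows up, a hand-written copy method is the safer route, since only we know how each attribute should be duplicated.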

I hope this article was a good teaching aid in understanding the way Ruby clones objects. Thanks for reading!
