Sometimes, when we are working with objects in Ruby, we want to make a copy of them. But what for? Well, in most cases we want to have a working copy and still keep the original object intact. Pointing a reference back at the original object is much simpler than repairing an object’s state.
To our relief, Ruby makes object copying easy. It provides the clone() method, which can be called on any object. Let’s follow an example:
class Thief
  attr_accessor :name, :hot_goods

  def initialize(name, hot_goods)
    @name = name
    @hot_goods = hot_goods
  end

  def introduce
    puts "My name is #{@name}, I have stolen: #{@hot_goods.join(', ')}."
  end
end
The Thief class provides a constructor and the introduce() method. It represents a thief: his name and the hot goods he was clever enough to steal. Let’s use this class:
poor = Thief.new('Poor', ['a pen', 'a dollar'])
poor.introduce
The output is:
My name is Poor, I have stolen: a pen, a dollar.
Our first thief, Poor, didn’t hit the jackpot. But let’s imagine another thief named Richie. Richie has stolen the same things that Poor has, but he has also stolen some valuables:
richie = poor.clone
richie.name = 'Richie'
richie.hot_goods << 'famous painting'
richie.hot_goods << 'a good car'
Let Richie introduce himself:
richie.introduce
The output is:
My name is Richie, I have stolen: a pen, a dollar, famous painting, a good car.
That’s just what we wanted to achieve. By using the clone() method, we have reached our goal with a minimal amount of effort. Let’s go back to Poor now:
poor.introduce
The output is:
My name is Poor, I have stolen: a pen, a dollar, famous painting, a good car.
Something has definitely gone wrong. Poor didn’t steal all these things! What happened? You might have guessed already: it’s all because of the clone() method.
The clone() method performs a shallow copy. It produces a new object of the same class with a different object_id, but it doesn’t perform a deep copy, i.e. it doesn’t clone the objects that the attributes refer to, the objects those refer to, and so on. The copy’s instance variables simply point at the same objects as the original’s. Let’s make a few simple checks:
puts poor
puts richie
The output should be similar to this:
#<Thief:0xb7c66fa8>
#<Thief:0xb7c66f94>
That is correct: we have two Thief instances with different ids. Let’s go one step further now:
printf("0x%x\n", poor.name.object_id)
printf("0x%x\n", richie.name.object_id)
The output should be similar to this:
0xfdbe23014
0xfdbe22f38
In both cases, the name attribute is a String instance, and the two instances have different ids. That’s what we expect, because we assigned a new string to richie.name. But what about the hot_goods attribute?
printf("0x%x\n", poor.hot_goods.object_id)
printf("0x%x\n", richie.hot_goods.object_id)
The output should be similar to this:
0xfdbe256ca
0xfdbe256ca
That’s it! Both Thief instances refer to the same Array instance. But what’s the difference between the name and the hot_goods attributes? It is slight and tricky. Let’s look again at how we created and changed the Richie Thief instance:
richie = poor.clone
richie.name = 'Richie'
richie.hot_goods << 'famous painting'
richie.hot_goods << 'a good car'
The line richie.name = 'Richie' created a new String object on the right side of the assignment. The name attribute of richie was then changed to refer to this new 'Richie' string, while poor.name still refers to the original one.
The next two lines use the << operator. It doesn’t replace the object reference; it modifies the existing array in place. That is why both Thief instances still refer to the same array, and why the change shows up in both.
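The difference between rebinding an attribute and mutating the object it points at can be checked directly. The snippet below is only a sketch: Pickpocket is a hypothetical, cut-down stand-in for Thief so that the code runs on its own, and equal? is used to test whether two references point at the very same object.

```ruby
# Hypothetical stand-in for Thief, kept minimal so the snippet is self-contained.
class Pickpocket
  attr_accessor :name, :hot_goods

  def initialize(name, hot_goods)
    @name = name
    @hot_goods = hot_goods
  end
end

poor   = Pickpocket.new('Poor', ['a pen', 'a dollar'])
richie = poor.clone

# Right after cloning, both attributes are shared between the two objects.
shared_name_before  = poor.name.equal?(richie.name)
shared_goods_before = poor.hot_goods.equal?(richie.hot_goods)

# Assignment rebinds richie's @name only; poor keeps its own reference.
richie.name = 'Richie'

# << mutates the one array that both objects still reference.
richie.hot_goods << 'a good car'

puts shared_name_before       # true
puts shared_goods_before      # true
puts poor.name                # Poor
puts poor.hot_goods.inspect   # ["a pen", "a dollar", "a good car"]
```

So even the name string is shared until we assign a new one; the sharing only becomes visible when somebody mutates the shared object in place.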
What should we do in this kind of situation? Well, we can either write a clone() method for the specific class that performs a deeper copy, or use the marshalling mechanism. The first approach can be implemented like this:
class Thief
  # ...
  def clone
    # Build a new Thief that gets its own (shallowly copied) hot_goods array.
    Thief.new(name, hot_goods.clone)
  end
  # ...
end
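That’s it! With this clone() method, each copy gets its own array instead of sharing one. It works because we tell Ruby to clone the hot_goods attribute as well. Note that Array’s own clone() is shallow too: the two arrays still share the same String elements, so this is a copy one level deeper rather than a fully deep one. That is enough here, because we only append to the array.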
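As a sketch of an alternative to overriding clone() directly, Ruby calls initialize_copy on the new object during both clone() and dup(), so the copying logic can live there. The Burglar class below is a hypothetical stand-in for Thief, and duplicating each array element with dup is our assumption about how deep the copy should go:

```ruby
# Hypothetical stand-in for Thief; only the copying behaviour matters here.
class Burglar
  attr_accessor :name, :hot_goods

  def initialize(name, hot_goods)
    @name = name
    @hot_goods = hot_goods
  end

  # Ruby invokes this on the fresh copy for both clone and dup,
  # passing the original object as the argument.
  def initialize_copy(source)
    super
    # Assumption: one extra level is enough, so copy the array
    # and each string inside it.
    @hot_goods = source.hot_goods.map(&:dup)
  end
end

poor   = Burglar.new('Poor', ['a pen', 'a dollar'])
richie = poor.clone

richie.hot_goods << 'a good car'        # grows richie's array only
richie.hot_goods.first << ' (gold)'     # changes richie's string only

puts poor.hot_goods.inspect     # ["a pen", "a dollar"]
puts richie.hot_goods.inspect   # ["a pen (gold)", "a dollar", "a good car"]
```

Compared with a hand-written clone(), this keeps the rest of the built-in cloning behaviour in place and covers dup() as well.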
A solution that uses the marshalling mechanism would look like this:
richie = Marshal.load(Marshal.dump(poor))
This code first serializes the given object to a string and then recreates it from that string. Marshalling is a more general solution: it can be used for almost any object, and it copies the whole object graph. It has some limitations, though. If the object contains context-sensitive information such as bindings, procs or IO instances, or if it is a singleton object (one with singleton methods defined on it), Marshal.dump raises a TypeError. In such a situation, we should provide our own clone() implementation, just like we did before.
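Both sides of the marshalling approach can be tried out in a few lines. This is a sketch only: Fence is a hypothetical class similar to Thief, and its optional log attribute is our own addition, there just to hold an IO object and trigger the limitation.

```ruby
# Hypothetical class; the log attribute exists only to hold an IO object.
class Fence
  attr_accessor :name, :hot_goods, :log

  def initialize(name, hot_goods, log = nil)
    @name = name
    @hot_goods = hot_goods
    @log = log
  end
end

poor   = Fence.new('Poor', ['a pen', 'a dollar'])
richie = Marshal.load(Marshal.dump(poor))

# The round trip rebuilt the array and the strings inside it.
richie.hot_goods << 'a good car'
richie.hot_goods.first << '!'

puts poor.hot_goods.inspect     # ["a pen", "a dollar"]
puts richie.hot_goods.inspect   # ["a pen!", "a dollar", "a good car"]

# An object that references an IO instance cannot be dumped.
noisy = Fence.new('Noisy', ['a bell'], $stdout)
dump_error = nil
begin
  Marshal.dump(noisy)
rescue TypeError => e
  dump_error = e
end

puts dump_error.class   # TypeError
```

The first half shows that the copy is independent all the way down; the second half shows the kind of object for which we have to fall back to a hand-written copy.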
I hope this article was a good teaching aid in understanding the way Ruby clones objects. Thanks for reading!