Recently I started a new part-time thing: mentoring students who are attending code camps at various major universities. This has caused me to delve into subjects and areas of programming that I had not visited for some time, mostly the fundamentals, which are the areas that matter to programmers in training. This is what the students need, and honestly I am enjoying it too.

One of the programming challenges in the class lessons teaches the Array datatype by having the student build a structure that represents the pixels in an image. The student is tasked with storing the “pixel” data, which is either 1 or 0, in a 2-dimensional array that corresponds to the rows and columns of the image. Simply put, a 4 x 4 image is four rows of four 1s or 0s. The challenges continue to build upon this data structure by having the students figure out how to manipulate it in various ways and to various ends.

Enter the 2D Array

As an example, let us see some code. The initial Image class has a constructor that takes in the 2D array containing its 1s and 0s of pixel data, and looks something like this:

class Image
  def initialize(array)
    @values = array
  end

  # Prints each row of pixels on its own line, followed by a blank line.
  def output_image
    @values.each do |row|
      puts row.join
    end
    puts ""
  end
end

Usage of the class is provided as part of the “make this code work” challenge, and the provided calls look like this:

Image.new([
    [0,0,0,0],
    [0,1,0,0],
    [0,0,0,1],
    [0,0,0,1]
])

This call to the class constructor stores the 2D array in an instance variable called @values, which is read whenever the output_image method needs to display the data. Simple so far, but the challenge builds upon this with the next requirement.

Overall this is a great lesson in the fundamentals, and it led me to dive into other topics like the SOLID principles, recursion, and an example of the Fibonacci sequence, all of which were excellent building topics for the students…

The next requirement for the challenge only matters for the sake of this post in that it required me to point the student toward iterating through the 2D array while, at the same time, altering the data in the array based upon the current position. To satisfy your curiosity, though, I will say this much: we were to extend the Image class with a method called blur that looks for 1s and, for each one found, also flips the values above, below, left, and right of it into 1s.

It is important to note that my experience led me to the next decision, which was to copy the array into a new variable and iterate over that copy, leaving us free to alter the instance variable @values. I have always been told not to alter a collection while iterating through it, lest the changes mess up the iteration. That applies especially when removing values, or when altering values that are used in a conditional inside the loop.
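To make that risk concrete, here is a minimal sketch of a blur that reads and writes the same grid. The naive_blur! helper is hypothetical and not part of the class lessons; it only illustrates how a 1 written during the loop gets picked up later in the same loop and blurred again.

```ruby
# Hypothetical helper: blurs in place, reading and writing the same grid.
def naive_blur!(grid)
  grid.each_with_index do |row, r|
    row.each_with_index do |cell, c|
      next unless cell == 1

      grid[r - 1][c] = 1 if r > 0                 # above
      grid[r + 1][c] = 1 if r < grid.length - 1   # below
      row[c - 1] = 1 if c > 0                     # left
      row[c + 1] = 1 if c < row.length - 1        # right
    end
  end
end

grid = [
  [0, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 0, 0],
  [0, 0, 0, 0]
]

naive_blur!(grid)
grid.each { |row| puts row.join }
```

A correct blur of a single pixel should leave exactly five 1s in the grid. Because the freshly written 1s to the right and below are visited later in the same pass, this version spreads them much further.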

Anyways, suffice it to say we needed to copy the 2D array. And therein we find the subject of this post, finally!

Shallow Copy in Ruby

I did a quick search for how to copy an array in Ruby and quickly came up with the dup method. Making copies of simple objects is easy and is the intended purpose of dup. Before we get to it, though, look at what happens with plain assignment:

array_one = [1,2,3]
array_two = array_one
array_two << 4
array_one.inspect  # [1,2,3,4]

This example shows that assigning an Array to a second variable does not copy it. A variable in Ruby holds a reference to an object, so the new variable in our example, array_two, is pointing to a place in memory rather than holding its own array. We end up with both array_one and array_two pointing to the same address in memory, which contains our one and only array.
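One way to confirm that both names refer to the same object is to compare their identities with equal? or object_id. A minimal sketch (the variable names same_object and same_id are only for illustration):

```ruby
array_one = [1, 2, 3]
array_two = array_one   # plain assignment copies the reference only

# equal? checks object identity, not just equal contents.
same_object = array_one.equal?(array_two)
same_id     = array_one.object_id == array_two.object_id

puts same_object   # true
puts same_id       # true
```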

A shallow copy is demonstrated in the next snippet, and it is how you copy a simple object in Ruby.

array_one = [1,2]
array_two = array_one.dup
array_one << 3
array_one.inspect   # [1,2,3]
array_two.inspect   # [1,2]

You will see here that making a shallow copy of array_one into the array_two variable does, in this case, create a new object. We can alter array_one by adding another value to it, and the copied array_two is not altered because it is its own object.

Deep Copy in Ruby?

Well then, what is the problem, you might say. The problem comes when your object has more than one dimension, meaning it is no longer a simple object. This applies to hashes, arrays, and any collection capable of being nested into a multi-dimensional collection. In our Image class example we have a 2D array, where the column values of each row are nested as arrays inside the outer @values array. Let us see what happens when we try a simple copy…

array_one = [ [1,2], [3,4] ]
array_two = array_one.dup
array_one[0] << 3
array_two.inspect   #  [[1,2,3], [3,4]]

Now it is important to see that altering array_one also altered the duplicated array_two, which is NOT the behavior we want. Because we used the dup method as in the previous example, we expected our copy to remain just that, its own copy. But a shallow copy only duplicates the first layer, the outer array. Everything below that layer is copied by reference, and a reference is, as we said, just a pointer to the same instance in memory.
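The sharing can be checked directly by comparing object identities. The sketch below also shows one manual fix, duplicating each row with map(&:dup), which is enough when the structure is exactly two levels deep and the rows hold only simple values such as integers. The variable names outer_shared, inner_shared, and array_three are only for illustration.

```ruby
array_one = [[1, 2], [3, 4]]
array_two = array_one.dup

outer_shared = array_one.equal?(array_two)        # false: dup made a new outer array
inner_shared = array_one[0].equal?(array_two[0])  # true: the rows are the same objects

# Duplicate every row as well as the outer array.
array_three = array_one.map(&:dup)

array_one[0] << 3

p array_two    # [[1, 2, 3], [3, 4]]
p array_three  # [[1, 2], [3, 4]]
```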

Ruby does not give us a deep copy out of the box. The clone method, by its default behavior, makes the same shallow copy as dup; the difference is that clone also carries over the frozen state and any singleton methods of the original. To get our deep copy we could override dup or clone in our Image class, but the better approach is to write our own version of the initialize_copy method, which is called on the new copy by both dup and clone.
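Both claims can be checked with a small sketch. The Tracker class and its copied_from reader are hypothetical and exist only to record when initialize_copy runs.

```ruby
# Hypothetical class used only to show when initialize_copy runs.
class Tracker
  attr_reader :copied_from

  def initialize_copy(orig)
    super
    @copied_from = orig   # remember which object this copy came from
  end
end

original  = Tracker.new
via_dup   = original.dup
via_clone = original.clone

puts via_dup.copied_from.equal?(original)    # true
puts via_clone.copied_from.equal?(original)  # true

# One difference between the two: clone keeps the frozen state, dup does not.
frozen = [1, 2].freeze
puts frozen.dup.frozen?    # false
puts frozen.clone.frozen?  # true
```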

So to be clear, the goal is to get a separate copy of our object/array/hash so that we can alter it w/o altering the original. To do this, the best practice is to provide our own deep copy implementation for use by our class. This can be done in a bunch of ways depending upon our object structure, but one way that is somewhat universal is to use Marshal.dump and Marshal.load. Marshal.dump serializes the object, and everything it references, into a string of bytes, and Marshal.load rebuilds a brand-new set of objects from that string. It only works for objects that can be serialized, which rules out things like procs and IO objects.
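Here is a minimal sketch of that round trip on a plain 2D array, including what happens when the data cannot be serialized. The variable names (bytes, copy, proc_failed) are only for illustration.

```ruby
original = [[0, 1], [1, 0]]

bytes = Marshal.dump(original)   # a String holding the serialized data
copy  = Marshal.load(bytes)      # brand-new objects rebuilt from that String

original[0] << 1

p bytes.class   # String
p original      # [[0, 1, 1], [1, 0]]
p copy          # [[0, 1], [1, 0]]

# Objects that cannot be serialized, such as a Proc, raise a TypeError.
proc_failed = false
begin
  Marshal.dump([-> { 1 }])
rescue TypeError => e
  proc_failed = true
  puts "cannot deep copy: #{e.message}"
end
```

The round trip costs more than a row-by-row dup, so it is a trade of speed for generality.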

Deep Copy Implemented

One possible solution for our student’s example might be this:

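class Image
  def initialize(array)
    @values = array
  end

  # Called by both dup and clone on the new copy, after its instance
  # variables have been shallow-copied from orig. The return value is
  # ignored, so the copy (self) has to be updated in place.
  def initialize_copy(orig)
    super
    @values = Marshal.load(Marshal.dump(@values))
  end

  # Prints each row of pixels on its own line, followed by a blank line.
  def output_image
    @values.each do |row|
      puts row.join
    end
    puts ""
  end
end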

Now our Image class is equipped with a deep copy that will work wherever it is called upon. Furthermore, because both the dup and clone methods call initialize_copy internally, they will both now perform the same deep copy of the pixel data.
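To tie this back to the blur requirement, here is a sketch of how the pieces might fit together. It assumes an initialize_copy that replaces @values with a deep copy. The blur body and the values reader are my own illustration, not the solution from the class lessons.

```ruby
class Image
  attr_reader :values   # reader added only so the example can inspect the data

  def initialize(array)
    @values = array
  end

  # Assumed deep copy hook: give the copy its own pixel data.
  def initialize_copy(orig)
    super
    @values = Marshal.load(Marshal.dump(@values))
  end

  # Hypothetical blur: iterate over a deep copy, write to the original.
  def blur
    snapshot = dup
    snapshot.values.each_with_index do |row, r|
      row.each_with_index do |cell, c|
        next unless cell == 1

        @values[r - 1][c] = 1 if r > 0                    # above
        @values[r + 1][c] = 1 if r < @values.length - 1   # below
        @values[r][c - 1] = 1 if c > 0                    # left
        @values[r][c + 1] = 1 if c < row.length - 1       # right
      end
    end
    self
  end
end

image = Image.new([
  [0, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 0, 1],
  [0, 0, 0, 1]
])

copy = image.dup
image.blur

image.values.each { |row| puts row.join }
# 0100
# 1111
# 0111
# 0011

p copy.values[1]   # [0, 1, 0, 0]
```

Because the loop reads from the snapshot, a 1 written during the blur is never mistaken for an original pixel, and the copy taken before the blur keeps the untouched data.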
