How to mix two lists with different lengths

Here is one interesting problem I was facing today. We have to lists, each may have a different size, and we want to mix their elements together forming a single list. We also want to keep the relative order of the elements, but more than that, we want the position from each element in the larger list to roughly correspond to that element’s original relative position in their original, smaller list.

If the elements have the same length, it’s pretty easy. We just pick one element from one list, then one element from the other. But suppose one of the lists have approximately 3 time as many elements than the shorter one. In that case you need to pick roughly 3 elements from the larger list, then one from the smaller, then three, then one…

But the first problem we see here is that, suppose you have 10 times more elements in the larger list. It would be better if you picked 5 from the first, then one from the smaller, then 10, 1, 10, 1… Then finish with the last 5 form the larger list. Another problem is that if the proportion is not a whole number, you can’t always pick the same amount of elements from the larger list for each one from the smaller, you need to pick, for instance 10 then 9 then 10 then 9… And you can’t simply pick the 10s first and the 9s at the end, you must evenly distribute these different spacings in the final list. And finally, we would like to be able to do all that with a very simple algorithm. How can we do it?

When I started to learn programming for real I did some computer graphics “demo” tutorials, and one of the first algorithms I learned that I found very interesting was the Bresenham algorithm. The problem it solves is pretty much the same we have here. We must calculate how many pixels we move to the “right” while drawing a straight line, for each time we move “up”, for instance. It’s just the same thing, but instead of moving pixels we pick elements from lists.

Here is how the code looks like:

def mix_lists(L1, L2):
    '''
    Uses something similar to the Bresenham algorithm to mix two lists
    with different sizes, keeping the elements from the smaller list
    evenly distributed among the elements from the larger list.

    '''

    if len(L1) < len(L2):
        L1, L2 = L2, L1

    i1, i2 = 0, 0
    N1, N2 = len(L1), len(N2)

    error = 0
    out = []
    while (i1 < N1) or (i2 < N2):
        if error < N1:
            out.append(L1[i1])
            i1 += 1
            error += 2 * N2
        else:
            out.append(L2[i2])
            i2 += 1
            error -= 2 * N1
            
    return out

Here are some tests, see what happens:

In [12]: mix_blocks([0,1,2,3,4,5,6,7,8,9], ['a','b','c','d','e'])
Out[12]: [0, 'a', 1, 2, 'b', 3, 4, 'c', 5, 6, 'd', 7, 8, 'e', 9]

In [13]: mix_blocks([0,1,2,3,4,5,6,7,8,9,10], ['a','b','c','d','e'])
Out[13]: [0, 1, 'a', 2, 3, 'b', 4, 5, 'c', 6, 7, 'd', 8, 9, 'e', 10]

In [14]: mix_blocks([0,1,2,3,4,5,6,7,8,9,10,11], ['a','b','c','d','e'])
Out[14]: [0, 1, 'a', 2, 3, 'b', 4, 5, 'c', 6, 7, 8, 'd', 9, 10, 'e', 11]

In [15]: mix_blocks([0,1,2,3,4,5,6,7,8,9,10,11,12], ['a','b','c','d','e'])
Out[15]: [0, 1, 'a', 2, 3, 'b', 4, 5, 6, 'c', 7, 8, 9, 'd', 10, 11, 'e', 12]

In [16]: mix_blocks([0,1,2,3,4,5,6,7,8,9,10,11,12,13], ['a','b','c','d','e'])
Out[16]: [0, 1, 'a', 2, 3, 4, 'b', 5, 6, 'c', 7, 8, 9, 'd', 10, 11, 12, 'e', 13]

In [17]: mix_blocks([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14], ['a','b','c','d','e'])
Out[17]: [0, 1, 'a', 2, 3, 4, 'b', 5, 6, 7, 'c', 8, 9, 10, 'd', 11, 12, 13, 'e', 14]

In [18]: mix_blocks([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15], ['a','b','c','d','e'])
Out[18]: [0, 1, 'a', 2, 3, 4, 'b', 5, 6, 7, 'c', 8, 9, 10, 11, 'd', 12, 13, 14, 'e', 15]

In the first and the penultimate cases here we are mixing 10 and 5 elements, and 15 and 5 elements. So the elements from the second list are simply spaced from each other by 2 and by 3 elements from the first list. In the other cases we see patterns of 2 and 3 elements. In the last case we have only one spacing of 4 elements. And in all cases the elements from the smaller list are “centered” along the list, not very close from either the beginning or the end of the final list.

The algorithm is great because its implementation is very simple. You just have to figure out how to update the “error” variable using the list sizes, and then you have a loop where you increment two counters, until both of them reach their limits. You could try to make a more well-behaved “for” implementation, but I love how this wild “while” looks, depending on two different counters that will eventually reach their destinations. You can actually adapt the algorithm to draw curves too, and that would be necessary to create a list-merging algorithm where you concentrate values from one of the lists with a varying density in the final list. Who knows, you might like to merge lists with a “smooth” transition for example, using a logistic function to control the density of points from each of the lists… Something like that. But I am talking about something very deterministic and exact, and not “pick from each list according to a Bernoulli trial changing the p…”. Wouldn’t that be cool? 🙂

Final note: the code would look more pythonic if we created a generator, using “yield” instead of the “out” list that is returned…

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s