Wednesday, October 20, 2010

[UPDATED] Python: Don't touch the list you're iterating

If you are a developer that likes safe code, there's probably nothing in this post you don't know. However, if you have read my previous post, you already know that I'm a lazy developer who likes his programming languages to tell him he's a jerk. I stumbled across a Python feature that didn't tell me I am a jerk. Let's call it a quirk.

Enough for the poetry. The problem: I made a list and I wanted to remove objects from that list, when they met a certain condition. The thing I did, looked a little bit like the following snippet of code.

roll = [1, 2, 3, 4]
for segment in roll:
    if segment == 2:
        roll.remove(segment)
    else:
        print segment

When you try to run this snippet, it will print out '1, 4'. Indeed, '3' is missing. Without the code telling anybody you are a jerk. The list will eventually be [1, 3, 4], but things can be really messed up. What if we try to remove the segments that are equal to two as well as three? Python will just skip the third segment, due to an internal pointer, and the jerk in this story will end up really confused, with a list like [1, 3, 4], while he thought it would be [1, 4].

Of course you are now craving for the solution of this problem. And of course it is really simple. Just make a copy of the list you try to alter and iterate through that one. Then alter the list you want to alter. Do I make myself sound confusing again? This will probably enlighten you:

roll = [1, 2, 3, 4]
copy = list(roll) # list() makes a copy
for segment in copy:
    if segment == 2 or segment == 3:
        roll.remove(segment)

print roll

See, '[1, 4]'. Just the way you like it.

*** UPDATE ***

The example above works with removing a segment from a list. If you really want to mess things up, try appending segments. Every time you iterate through the list, you can add a segment. When you add a segment, the loop needs one more iteration. One more iteration, in which one more segment will be added and so on.

Try the following snippet AT YOUR OWN RISK!
roll = [1]
for segment in roll:
    roll.append(segment+1)
    print segment

If you want, leave a comment with the segment at which the program died.

8 comments:

  1. You can also iterate backwards and modify elements in the list. It will work because if you remove/add at the position your iteration is currently at, it still won't skip any iteration because the indexes lower than the current index won't change.

    ReplyDelete
  2. If backwards iterating was possible in a Python for-loop, that would be a great solution indeed. But I can't find a reversed for-loop anywhere on the web.

    So if you got a great way to reverse the iteration, please enlighten me.

    BTW: I noticed that the Python docs use a construction like
    "for segment in roll[:]:"

    ReplyDelete
  3. http://docs.python.org/library/functions.html#reversed

    ReplyDelete
  4. print(filter(lambda x: x != 2, roll))

    ReplyDelete
  5. Great comments. Can't make more of it.

    ReplyDelete
  6. Hi, J.H. In answer to your question, I've written this
    http://jedp.posterous.com/iterating-over-python-lists-while-modifying-t

    I hope it's useful. Cheers!
    j

    ReplyDelete
  7. Hi, J.H., and thanks for checking out my post. Glad it was useful. One more comment on the comments here:

    an Anonymous poster suggests using filter and lambda to find all the elements in roll that are not equal to 2. Whatever you're testing for, the use of lambda incurs the overhead of a building an anonymous function with each iteration. This can get quite heavy.

    I would suggest using list comprehensions instead. Like so:

    print [x for x in roll if x!=2]

    As much as I love functional programming, this is very legible, and runs much faster as well. List comprehensions in python are very much awesome. hth!
    j

    ReplyDelete
  8. At least it doesn't throw a Java ConcurrentModificationExcepetion (if i remember correctly)

    ReplyDelete