Python subtleties

A collection of subtle (or not so subtle) mistakes I made and puzzles I’ve come across.

import pandas as pd
import numpy as np

Changing a mutable element of an immutable sequence

The puzzle is from page 40 in Fluent Python.

t = (1, 2, [3, 4])
t[2] += [5, 6]
TypeError: 'tuple' object does not support item assignment

type(t).__name__
'tuple'

t
(1, 2, [3, 4, 5, 6])

What’s going on here? As part of the assignment, Python does the following:

  1. Performs in-place addition on the value of t[2], which works because that value is the list [3, 4], which is mutable. The list is extended in place and becomes [3, 4, 5, 6].

  2. Then it tries to assign the result from step 1 back to t[2], which fails with the TypeError above, because tuples are immutable.

  3. But because the element at index 2 of t is not the list itself but a reference to it, and because the list was already changed in step 1, the value of t[2] has changed, too.
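The three steps above can be reproduced in a short script. This is a minimal sketch: the names inner, error and u are only illustrative, and the second half shows that calling extend on the list directly sidesteps the failed assignment.

```python
t = (1, 2, [3, 4])
inner = t[2]  # keep a second reference to the list inside the tuple

try:
    t[2] += [5, 6]  # step 1 succeeds, step 2 raises
except TypeError as exc:
    error = str(exc)

print(error)          # 'tuple' object does not support item assignment
print(t)              # (1, 2, [3, 4, 5, 6])
print(t[2] is inner)  # True: same list object, mutated in place

# Mutating the list without assigning back to the tuple raises nothing.
u = (1, 2, [3, 4])
u[2].extend([5, 6])
print(u)              # (1, 2, [3, 4, 5, 6])
```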

A great way to visualise the process is to see what happens under the hood using the amazing Python Tutor.
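For a text-only look under the hood, the standard-library dis module can list the bytecode for the statement. Opcode names differ between Python versions, so treat this as a rough sketch. The only point is that a store into t[2] comes after the in-place addition.

```python
import dis

# Disassemble the statement from source; t does not need to exist for this.
opnames = [ins.opname for ins in dis.get_instructions("t[2] += [5, 6]")]
print(opnames)

# The store into the subscript is the instruction that fails for a tuple.
print("STORE_SUBSCR" in opnames)
```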

NaNs are truthy

I have a dataframe with some data:

df = pd.DataFrame({'data': list('abcde')})
df

  data
0    a
1    b
2    c
3    d
4    e

I can shift the data column:

df.data.shift()
0    NaN
1      a
2      b
3      c
4      d
Name: data, dtype: object

I want to add a check column that tells me where the shift is missing:

df['check'] = np.where(df.data.shift(), 'ok', 'missing')
df

  data check
0    a    ok
1    b    ok
2    c    ok
3    d    ok
4    e    ok

That’s not what I wanted. The reason is that NaN, unlike None, is truthy: it is a float that is not equal to zero, and only values such as None, False, zero, and empty containers evaluate to False (this follows from the docs). One way to see this:

[e for e in [np.nan, 'hello', True, None] if e]
[nan, 'hello', True]
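The same point can be checked without pandas or NumPy. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
import math

nan = float("nan")

nan_is_truthy = bool(nan)    # NaN is a float that is not zero
none_is_truthy = bool(None)  # None is always falsy

print(nan_is_truthy)    # True
print(none_is_truthy)   # False
print(math.isnan(nan))  # True: test for NaN explicitly instead of relying on truthiness
```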

Hence, to get the check I wanted I should do this:

df['correct_check'] = np.where(df.data.shift().notna(), 'ok', 'missing')
df

  data check correct_check
0    a    ok       missing
1    b    ok            ok
2    c    ok            ok
3    d    ok            ok
4    e    ok            ok

Truthy vs True

As follows clearly from the docs, True is only one of many values that evaluate to True in a Boolean context. This seems clear enough. Yet I just caught myself getting confused by the following:

I have a list of values that I want to filter for truthy elements, that is, elements that evaluate to True:

mylist = [np.nan, 'hello', True, None]
[e for e in mylist if e]
[nan, 'hello', True]

This works as intended. For a moment, however, I got confused by the following:

[e for e in mylist if e is True]
[True]

I expected it to yield the same result as the above. But it doesn’t, because `is True` is an identity check: it only returns values that actually are True, as in having the same object ID as the value True (this Stack Overflow answer makes the point nicely). We can see this below (the IDs themselves differ from machine to machine and from run to run):

[id(e) for e in mylist]
[4599359344, 4859333552, 4556488160, 4556589160]

id(True)
4556488160
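To separate the three notions involved (truthiness, equality with True, and identity with True), here is a small sketch. The list values and the result names are illustrative and extend the example above with 1, 0 and an empty string.

```python
values = [float("nan"), "hello", True, None, 1, 0, ""]

truthy = [v for v in values if v]             # evaluates to True in a Boolean context
equal = [v for v in values if v == True]      # compares equal to True
identical = [v for v in values if v is True]  # is the True object itself

print(len(truthy))  # 4: nan, 'hello', True and 1
print(equal)        # [True, 1]
print(identical)    # [True]
```

In a filter, `if e` is usually what is wanted. `is True` only fits when the value must be the Boolean True itself.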