I thought I understood Python names and immutable objects such as strings until I got this unexpected behaviour in a Jupyter notebook. Then I noticed that the same code gives a different result when you run it as a Python script file (.py) from the command line.
- Executing as a .py script (using Python 2.7.12)
Script file:
a, b = 'my text', 'my text'
print id(a), id(b), a is b
c = 'my text'
d = 'my text'
print id(c), id(d), c is d
Output:
4300053168 4300053168 True
4300053168 4300053168 True
As I expected - Python does not make copies of strings. All names point to the same object.
- Interpreting in interactive IPython (Python 2.7.12)
If I enter the exact same code into an IPython interactive shell or a Jupyter notebook cell, I get output like this:
4361310096 4361310096 True
4361509168 4361509648 False
In the second case, Python has created two new objects to represent 'my text'.
The reason for this post is that I am developing code in the notebook that uses identity tests such as a is 'my text' (rather than a == 'my text'). I thought this would be a very efficient yet readable way to do what I want. Obviously, for this to work consistently, I need to ensure that there are no duplicates of each string literal.
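To make the "no duplicates" requirement concrete, here is a minimal sketch of one way it might be enforced, assuming explicit string interning is an acceptable tool for this. The helper make_text is hypothetical and only exists to build equal strings at runtime, so they are not shared compile-time constants. The try/except is there because intern() is a builtin in Python 2 but lives in sys in Python 3.

```python
import sys

# Python 3 exposes intern() as sys.intern; Python 2 has it as a builtin.
try:
    intern = sys.intern
except AttributeError:
    pass  # Python 2: the builtin intern() is already available


def make_text():
    # Hypothetical helper: builds the string at runtime instead of
    # relying on a literal that the compiler might share.
    return ' '.join(['my', 'text'])


raw_a, raw_b = make_text(), make_text()

# intern() returns the single canonical object for a given string value,
# so equal strings passed through it end up as the same object.
a, b = intern(raw_a), intern(raw_b)

print(raw_a == raw_b)  # equality compares values
print(a is b)          # identity holds once both have been interned
```

Whether the un-interned raw_a and raw_b happen to be the same object is an implementation detail, so the sketch makes no claim about it. The identity test is only dependable if every string on both sides of the comparison has gone through intern(); otherwise == remains the safe choice.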
