Chapter 1. The Python Data Model
Guido’s sense of the aesthetics of language design is amazing. I’ve met many fine language
designers who could build theoretically beautiful languages that no one would ever use, but Guido
is one of those rare people who can build a language that is just slightly less theoretically beautiful
but thereby is a joy to write programs in.
Jim Hugunin, creator of Jython, cocreator of AspectJ, and architect of the .Net DLR1
One of the best qualities of Python is its consistency. After working with Python for a while, you
are able to start making informed, correct guesses about features that are new to you.
However, if you learned another object-oriented language before Python, you may find it strange
to use len(collection) instead of collection.len(). This apparent oddity is the tip of an iceberg
that, when properly understood, is the key to everything we call Pythonic. The iceberg is called
the Python Data Model, and it is the API that we use to make our own objects play well with the
most idiomatic language features.
You can think of the data model as a description of Python as a framework. It formalizes the
interfaces of the building blocks of the language itself, such as sequences, functions, iterators,
coroutines, classes, context managers, and so on.
When using a framework, we spend a lot of time coding methods that are called by the framework.
The same happens when we leverage the Python Data Model to build new classes. The Python
interpreter invokes special methods to perform basic object operations, often triggered by special
syntax. The special method names are always written with leading and trailing double underscores.
For example, the syntax obj[key] is supported by the __getitem__ special method. In order to
evaluate my_collection[key], the interpreter calls my_collection.__getitem__(key).
We implement special methods when we want our objects to support and interact with fundamental
language constructs such as:
Collections
Attribute access
Iteration (including asynchronous iteration using async for)
Operator overloading
Function and method invocation
String representation and formatting
Asynchronous programming using await
Object creation and destruction
Managed contexts using the with or async with statements
,MAGIC AND DUNDER
The term magic method is slang for special method, but how do we talk about a specific method
like __getitem__? I learned to say “dunder-getitem” from author and teacher Steve Holden. “Dunder” is
a shortcut for “double underscore before and after.” That’s why the special methods are also known
as dunder methods. The “Lexical Analysis” chapter of The Python Language Reference warns that
“Any use of __*__ names, in any context, that does not follow explicitly documented use, is subject to
breakage without warning.”
What’s New in This Chapter
This chapter had few changes from the first edition because it is an introduction to the Python Data
Model, which is quite stable. The most significant changes are:
Special methods supporting asynchronous programming and other new features, added to
the tables in “Overview of Special Methods”.
Figure 1-2 showing the use of special methods in “Collection API”, including
the collections.abc.Collection abstract base class introduced in Python 3.6.
Also, here and throughout this second edition I adopted the f-string syntax introduced in Python
3.6, which is more readable and often more convenient than the older string formatting notations:
the str.format() method and the % operator.
TIP
One reason to still use my_fmt.format() is when the definition of my_fmt must be in a different place
in the code than where the formatting operation needs to happen. For instance, when my_fmt has multiple
lines and is better defined in a constant, or when it must come from a configuration file, or from the database.
Those are real needs, but don’t happen very often.
A Pythonic Card Deck
Example 1-1 is simple, but it demonstrates the power of implementing just two special
methods, __getitem__ and __len__.
Example 1-1. A deck as a sequence of playing cards
import collections
Card = collections.namedtuple('Card', ['rank', 'suit'])
class FrenchDeck:
ranks = [str(n) for n in range(2, 11)] + list('JQKA')
suits = 'spades diamonds clubs hearts'.split()
def __init__(self):
self._cards = [Card(rank, suit) for suit in self.suits
, for rank in self.ranks]
def __len__(self):
return len(self._cards)
def __getitem__(self, position):
return self._cards[position]
The first thing to note is the use of collections.namedtuple to construct a simple class to represent
individual cards. We use namedtuple to build classes of objects that are just bundles of attributes
with no custom methods, like a database record. In the example, we use it to provide a nice
representation for the cards in the deck, as shown in the console session:
>>> beer_card = Card('7', 'diamonds')
>>> beer_card
Card(rank='7', suit='diamonds')
But the point of this example is the FrenchDeck class. It’s short, but it packs a punch. First, like
any standard Python collection, a deck responds to the len() function by returning the number of
cards in it:
>>> deck = FrenchDeck()
>>> len(deck)
52
Reading specific cards from the deck—say, the first or the last—is easy, thanks to
the __getitem__ method:
>>> deck[0]
Card(rank='2', suit='spades')
>>> deck[-1]
Card(rank='A', suit='hearts')
Should we create a method to pick a random card? No need. Python already has a function to get
a random item from a sequence: random.choice. We can use it on a deck instance:
>>> from random import choice
>>> choice(deck)
Card(rank='3', suit='hearts')
>>> choice(deck)
Card(rank='K', suit='spades')
>>> choice(deck)
Card(rank='2', suit='clubs')
We’ve just seen two advantages of using special methods to leverage the Python Data Model:
Users of your classes don’t have to memorize arbitrary method names for standard
operations. (“How to get the number of items? Is it .size(), .length(), or what?”)
It’s easier to benefit from the rich Python standard library and avoid reinventing the wheel,
like the random.choice function.