Fun with Python List Comprehensions and Generators - Fresh Blurbs by Irakli Nadareishvili

Python is an incredibly expressive language. It’s also extremely fun to code in, once you get the hang of its idiomatic ways of succinctly expressing complex logic.

One of my favorite tricks in Python is using list comprehensions for things that you would be writing long, boring loops for in more primitive languages.

Let’s consider some fun examples. For most of the code samples further in this post, we assume Python 3.7+. We will also be working with this sample list as our input:

products = [
  {"name": "Bike", "color" : "teal", "price": "311.00"},
  {"name": "Mechanical Keyboard", "color" : "brown", "price": "141.00"},
  {"name": "Frozen Chips", "price": "9.00"},
  {"name": "Shoes", "color" : "red", "price": "218.00"}
]

Let’s say we would like to create a new list of “labels” that contains formatted name - price strings for the products that cost more than $100. The brute-force (a.k.a. boring and verbose) way would be to write a good ol’ loop:

labels = []
for p in products:
  if float(p['price']) > 100:
    labels.append(f"{p['name']} - {p['price']}")

print(labels)

which generates output like:

['Bike - 311.00', 'Mechanical Keyboard - 141.00', 'Shoes - 218.00']

Or we can use a list comprehension for the same, which is more expressive and significantly more fun:

fun_labels = [
  f"{p['name']} - {p['price']}"
  for p in products
  if float(p['price']) > 100
]

print(fun_labels)
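The sample list has one product, Frozen Chips, with no "color" key. A comprehension copes with that if we read the key through dict.get(). This is only a minimal sketch: the color_labels name and the "n/a" fallback are assumptions made for illustration.

```python
products = [
    {"name": "Bike", "color": "teal", "price": "311.00"},
    {"name": "Mechanical Keyboard", "color": "brown", "price": "141.00"},
    {"name": "Frozen Chips", "price": "9.00"},
    {"name": "Shoes", "color": "red", "price": "218.00"},
]

# dict.get() returns the fallback instead of raising KeyError
# when "color" is absent; "n/a" is an illustrative choice.
color_labels = [
    f"{p['name']} ({p.get('color', 'n/a')})"
    for p in products
]

print(color_labels)
```

Using p['color'] here instead would stop with a KeyError at the third product.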

Let’s now assume we would like to apply a 20% discount to all products in the list. A fairly functional, but still boring, version would look something like the following:

import copy

def apply_discounts(product):
  # prices are stored as strings, so convert before multiplying
  product['price'] = float(product['price'])*0.8
  return product

discounted = list(map(apply_discounts, copy.deepcopy(products)))

Please note that we need to call copy.deepcopy() on the original list unless we are OK with modifying the original, since apply_discounts() changes each dictionary in place; map() itself only passes the elements along.
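To see why the copy matters, here is a small sketch of what happens when it is left out. It uses a shortened two-item list, and the price is converted with float() because the sample prices are strings.

```python
products = [
    {"name": "Bike", "price": "311.00"},
    {"name": "Shoes", "price": "218.00"},
]

def apply_discounts(product):
    # Mutates the dictionary it receives and returns the same object.
    product["price"] = float(product["price"]) * 0.8
    return product

# No deepcopy here, so the function is handed the original dictionaries.
discounted = list(map(apply_discounts, products))

print(discounted[0] is products[0])        # True: same object, not a copy
print(type(products[0]["price"]).__name__)  # float: the original was overwritten
```

The mutation comes from the assignment inside the function. A function that returned a new dictionary would leave the originals alone, with or without map().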

In contrast, this is what the list comprehension way of achieving the same looks like:

fun_discounted = [{**p, 'price': float(p['price'])*0.8} for p in products]

Fun! And please note that in this case we are not modifying the original list so there is no need for a potentially expensive copy.deepcopy() call!
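A quick way to convince yourself of this is to check the original list after building the discounted one. The sketch below uses a shortened sample list and converts the string prices with float().

```python
products = [
    {"name": "Bike", "color": "teal", "price": "311.00"},
    {"name": "Frozen Chips", "price": "9.00"},
]

# {**p, ...} builds a new dictionary for each product.
fun_discounted = [{**p, "price": float(p["price"]) * 0.8} for p in products]

print(products[0]["price"])              # 311.00, still the original string
print(fun_discounted[0] is products[0])  # False: a different dictionary
```

One caveat: {**p} makes a shallow copy, so any nested lists or dictionaries inside a product would still be shared with the original. Our sample products hold only strings, so that does not bite here.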

Moreover, the same technique can be used to add new attributes to the elements in the list (provided they are dictionaries, as in our case). Let’s say we would like to add a "status": "available" attribute to each element. It would be the now-familiar:

fun_mod_prods = [{**pr, 'status': "available"} for pr in products]

Generator Expressions

Generators are constructs that yield one element at a time when iterated over. We can iterate over them very much like over lists, but while a list allocates space in memory for every one of its elements, a generator calculates its next element on the fly, leading to a much smaller memory footprint.
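The difference in footprint can be made visible with sys.getsizeof(). This is a rough sketch on a list of plain integers rather than our products. Exact byte counts vary between Python versions, so only the comparison matters.

```python
import sys

numbers_list = [n * 2 for n in range(10_000)]
numbers_gen = (n * 2 for n in range(10_000))

# getsizeof() reports the container itself, not the objects it refers to.
list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)

print(list_size, gen_size)
print(gen_size < list_size)  # True: the generator holds no elements yet
```

The generator stays the same small size no matter how large the range is, because it only remembers where it is in the iteration.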

For instance, if you have 10,000 products in a list and you need to print out “labels” for each one of them with the discounted price, using the techniques we saw above, you can write succinct and fun code like the following:

fun_discounted = [{**p, 'price': float(p['price'])*0.8} for p in products]
fun_labels = [f"{p['name']} - {p['price']}" for p in fun_discounted]
for label in fun_labels:
  print(label)

This reads great, but you can quickly notice how we created two whole lists in memory: fun_discounted and fun_labels. If we are dealing with large or complex lists, such extra memory allocation may be undesirable. It’s also unnecessary, because we can easily substitute these lists with generators, simply by using “()” instead of “[]” in the comprehension expression:

gen_discounted = ({**p, 'price': float(p['price'])*0.8} for p in products)
gen_labels = (f"{p['name']} - {p['price']}" for p in gen_discounted)
for label in gen_labels:
  print(label)

As you can notice, the code looks virtually identical, except for three things:

With generators, the construct is called a “generator expression”, not a “generator comprehension”, and it uses parentheses “()” instead of square brackets “[]” to signify that we are creating a generator, not a list.
Generators do not allocate memory for all of their elements up front, which makes them memory-efficient.
With generators, pretty much the only thing we can do is iterate over them, since their elements are produced on demand rather than stored in memory. Meaning, you could have printed out fun_labels with print(fun_labels), but if you try the same with print(gen_labels), all you are going to get is a reference to a generator object, i.e. an output such as:

<generator object <genexpr> at 0x103e11228>
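Two more consequences of this are worth a quick sketch: a generator cannot be indexed, and it can only be walked through once. The names list and the "on sale" text below are made up for illustration.

```python
names = ["Bike", "Shoes"]
gen_labels = (f"{name} - on sale" for name in names)

# Generators have no positions to index into.
try:
    gen_labels[0]
except TypeError as err:
    print("TypeError:", err)

first_pass = list(gen_labels)   # consumes the generator
second_pass = list(gen_labels)  # nothing left to yield

print(first_pass)   # ['Bike - on sale', 'Shoes - on sale']
print(second_pass)  # []
```

If you need the values more than once, either wrap the generator in list() or build a fresh generator for each pass.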

In the above code, we explicitly introduced the intermediary gen_discounted generator to demonstrate generators in their simplest form, but we can also nest generators, avoiding unnecessary extra variables. And we can break the code onto multiple lines for better readability:

gen_labels = (
  f"{p['name']} - {p['price']}"
  for p in (
    # inner generator: a discounted copy of each product
    {**item, 'price': round(float(item['price'])*0.8, 2)}
    for item in products
  )
)
for label in gen_labels:
  print(label)
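Generator expressions also pair nicely with functions that consume an iterable, such as the built-in sum() and any(). A small sketch follows. The discounted_total and has_bargain names, and the $10 threshold, are assumptions for illustration.

```python
products = [
    {"name": "Bike", "color": "teal", "price": "311.00"},
    {"name": "Mechanical Keyboard", "color": "brown", "price": "141.00"},
    {"name": "Frozen Chips", "price": "9.00"},
    {"name": "Shoes", "color": "red", "price": "218.00"},
]

# As the sole argument of a call, a generator expression
# does not need its own pair of parentheses.
discounted_total = sum(float(p["price"]) * 0.8 for p in products)

# any() stops pulling items as soon as one of them matches.
has_bargain = any(float(p["price"]) * 0.8 < 10 for p in products)

print(round(discounted_total, 2))
print(has_bargain)
```

Neither call builds an intermediate list; each discounted price is computed, used and dropped.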

Comprehensions for Dictionaries

Lists are not the only type that you can build with comprehensions in Python. Let’s see an example where we do it with dictionaries.

Given a dictionary:

merch = {
	'sid:001':  { 'name': "pen",    'price': 5},
	'sid:002':  { 'name': "pencil", 'price': 4},
	'sid:003':  { 'name': "eraser"},
}

We can add a default price of 2 to every item that is missing a price with the following code:

defaults_applied = {
	# items() yields each key together with its value
	k: {**v, 'price': v.get('price', 2)}
	for k, v in merch.items()
}
print(defaults_applied)
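Dictionary comprehensions accept an if clause just like list comprehensions do, and they can reshape values as well. The sketch below is hedged in the usual way: priced and names_by_sid are names invented for this example.

```python
merch = {
    "sid:001": {"name": "pen", "price": 5},
    "sid:002": {"name": "pencil", "price": 4},
    "sid:003": {"name": "eraser"},
}

# Keep only the entries that already carry a price.
priced = {k: v for k, v in merch.items() if "price" in v}

# Map each id straight to the item's name.
names_by_sid = {k: v["name"] for k, v in merch.items()}

print(priced)
print(names_by_sid)
```

The first comprehension drops the eraser entry, while the second keeps all three keys but flattens each value to a plain string.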

In conclusion

List comprehensions and generator expressions are very expressive and a lot of fun in Python. I hope this blog post was able to spark your interest in them, and that you are going to enjoy them as much as many of us do.