Blocking gevent's Hub Part 1: Understanding Blocking

In the beginning we talked about gevent's hub and how greenlets switch in and out of it to implement IO. Following that we showed how locks in gevent are implemented much the same way, by "parking" a waiting greenlet, switching to the hub to let other greenlets run or do IO, and eventually switching back to the parked greenlet.

That's a lot of switching. What does it mean if that switching doesn't happen? What should a programmer know about switching and its opposite, blocking? (There's also part 2.)

Cooperation

gevent (and greenlets in general) is a cooperative multitasking system, meaning that any given greenlet runs until it chooses to give up control. Under gevent, giving up control is usually automatic, and usually happens when the greenlet wants to handle IO, such as reading from or writing to a socket to handle an HTTP request or response. A greenlet can also give up control automatically by waiting for a lock that another greenlet owns, or it can explicitly give up control by using the gevent.sleep() API.

In all those cases, control passes to the hub (recall that the hub is the greenlet running the event loop); this is referred to as yielding to the hub. When a greenlet yields to the hub, another greenlet gets a chance to run or IO gets serviced by the event loop. A greenlet that yields to the hub is called cooperative (or sometimes green). When a greenlet doesn't yield, it's non-cooperative.
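The lock case is easy to overlook, so here is a minimal sketch of it. The names holder, waiter and events are made up for illustration; the sketch assumes gevent.lock.Semaphore as the lock and simply records the order in which things happen.

```python
import gevent
from gevent.lock import Semaphore

lock = Semaphore(1)
events = []  # hypothetical log of what happened, in order

def holder():
    with lock:
        events.append('holder acquired')
        gevent.sleep(0.01)  # yields to the hub while still holding the lock
        events.append('holder releasing')

def waiter():
    events.append('waiter waiting')
    with lock:  # lock is taken: this greenlet parks and control goes to the hub
        events.append('waiter acquired')

gevent.joinall([gevent.spawn(holder), gevent.spawn(waiter)])
print(events)
```

The waiter only gets the lock after the holder has released it, but in the meantime the hub was free to run the holder's timer. Neither greenlet ever blocked the other from making progress.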

Examples

Of course, it's not actually the greenlet itself that's yielding; it's the code that the greenlet is running. We can thus say that any given piece of code is either cooperative (green) or not. Here are some examples:

gevent.sleep() is cooperative

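```
import time
import gevent

def f():
    gevent.sleep(3)  # cooperative: yields to the hub while waiting

now = time.time()
greenlet = gevent.spawn(f)
while not greenlet.dead:
    # One dot for each turn the main greenlet gets while f is running
    print('.', end='')
    gevent.sleep(1)
after = time.time()
print("\nElapsed time: %1.1f" % (after - now))
```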

On my system, that prints three dots, indicating that the greenlets are yielding to the hub and switching back and forth:

```
$ python test.py
...
Elapsed time: 3.0
```

I will omit the timing part of the main greenlet in the rest of the examples unless it changes.

time.sleep() is not cooperative

Here's that same example, with the inner greenlet using time.sleep() from the standard library.
```
import time
import gevent

def f():
    time.sleep(3)
```
```
$ python test.py
.
Elapsed time: 3.0
```
We get only one dot, indicating that as soon as the main greenlet yielded to the hub, which in turn handed control over to the greenlet running f, that greenlet failed to yield again until it was completely finished.

Code that is CPU-bound is not cooperative

If a function is in a tight computation loop, there is no opportunity for it to yield control.
```
import time
import gevent

def f():
    i = 0
    while i < 2 ** 28:
        i += 1
```
```
$ python test.py
.
Elapsed time: 28.2
```
If that had been cooperative, we would have expected to see about 28 dots, since that's how long the inner greenlet ran. We only got one.

Socket IO using gevent.socket is cooperative

```
import time
import gevent
import gevent.socket

def f():
    gevent.socket.gethostbyname('www.python.org')

now = time.time()
greenlet = gevent.spawn(f)
while not greenlet.dead:
    print('.', end='')
    gevent.sleep(0.001) # NOTE: We switch to a smaller sleep duration
after = time.time()
print("\nElapsed time: %1.1f" % (after - now))
```

Lots of dots here.

```
$ python test.py
.....................
Elapsed time: 0.0
```

Socket IO using the standard socket is not cooperative [1]

```
import time
import gevent
import socket

def f():
    socket.gethostbyname('www.python.org')
```

Again, only one dot, meaning no yielding.

```
$ python test.py
.
Elapsed time: 0.0
```

C extension libraries may or may not be cooperative

This is a tricky area. C extensions are so common in Python (much of the standard library is implemented in C) that you may not even realize you're using one. Commonly, C extensions are used for two major purposes.

The first is to accelerate CPU intensive code (for example, numpy to speed up linear algebra). These are probably no more or less cooperative than a counterpart written in Python. (I'll explain what I mean by that in a minute.)

The second is to integrate with other existing native (C) libraries, such as imaging libraries or database clients. Some of these are essentially CPU accelerators, so they basically fall into the first category. But libraries that want to do IO, especially those that talk to remote systems such as databases, are likely to want to handle IO their own way and, without special hooks, are unlikely to be cooperative. (psycopg2, the Python C extension for PostgreSQL, is an example of a database driver that provides the necessary hooks to be made cooperative using psycogreen.)

Back to that "no more or less cooperative than a counterpart written in Python" statement. If the C extension calls from C back into Python code, and that Python code itself is cooperative, then thanks to the magic of greenlet being able to switch C call stacks, the whole thing winds up cooperative. The call from C into Python could be either explicit or implicit (for example, sorting or hashing values can call arbitrary Python code in an object's __eq__ and __hash__ methods).
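To make the implicit case concrete, here is a small sketch. The built-in sorted() is implemented in C, but it calls back into Python to compare items. Key and ticker are hypothetical names, and the gevent.sleep(0) in the comparison stands in for whatever cooperative work (such as loading an object's state) a real comparison might trigger.

```python
import gevent

class Key:
    """Hypothetical key type whose comparison yields to the hub."""
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        gevent.sleep(0)  # stand-in for cooperative IO done during comparison
        return self.value < other.value

ticks = 0

def ticker():
    # Counts how many turns this greenlet gets while the sort is running
    global ticks
    while True:
        ticks += 1
        gevent.sleep(0)

def sort_keys():
    # sorted() runs in C and calls Key.__lt__ from inside the C code
    return [k.value for k in sorted(Key(v) for v in (3, 1, 2))]

t = gevent.spawn(ticker)
s = gevent.spawn(sort_keys)
s.join()
t.kill()
print(s.value, ticks)
```

If the ticker gets more than one turn, then the greenlet running the sort must have switched away while it was still inside the C sorting code, which is exactly the situation described above.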

A few good examples of CPU accelerator extensions that can operate cooperatively are the BTrees (sorted dictionaries) and zope.interface (fast adapter and component registration and lookup) libraries. These libraries need to sort or hash objects. They are commonly used with objects that are persistent and stored in a ZODB database, so sorting or hashing may need to transparently fetch the object from the database. ZODB is implemented in pure Python, so the original C operation winds up being cooperative. [2]

Cooperation Matters

The point of having one greenlet periodically print dots in those examples while the other greenlet either cooperated or not was to show why cooperative code matters. If one greenlet is not cooperating, no other greenlet can make any forward progress (print dots). This might stop other greenlets for a noticeably long time (like the CPU-bound example of counting to 268,435,456), or it might be so quick it's difficult for a human to catch (like resolving a hostname). In either case, it can be a problem, especially if you're trying to scale a program such as a server to handle more connections. Every lost yield potentially decreases overall throughput.

Avoiding Blocking

Above, we saw some examples of non-cooperative code. There were some clear transformations to go through to make code cooperative:

Switch from the standard library API to the gevent API

Change imports to import from gevent. E.g., import time ⟹ from gevent import time, or import socket ⟹ from gevent import socket.

Monkey patch

In code that's designed to be portable to a non-gevent system, you can use the standard library API and have gevent automatically make it cooperative by monkey patching. Many third-party libraries such as requests that use only the Python standard API can become cooperative this way.

Break up CPU intensive loops by explicitly yielding

If you know you'll have a CPU intensive loop, like the one in the CPU-bound example above, you can periodically explicitly yield by calling gevent.sleep() in the loop. Beginning in gevent 1.3, this attempts to automatically balance the amount of time CPU intensive tasks get while still allowing other greenlets to perform IO.
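Here is a rough sketch of the last two transformations. It assumes you only want to patch the time module (gevent.monkey.patch_time(); a real program would more likely call patch_all() as its very first step). The helper count_turns and the choice to yield every 10,000 iterations are made up for illustration and are not a recommendation.

```python
from gevent import monkey
monkey.patch_time()  # replaces time.sleep with gevent.sleep; patch as early as possible

import time
import gevent

def count_turns(worker, pause=0):
    """Spawn worker and count the turns the calling greenlet gets while it runs."""
    turns = 0
    g = gevent.spawn(worker)
    while not g.dead:
        turns += 1
        gevent.sleep(pause)
    return turns

def patched_sleep():
    time.sleep(0.05)  # standard library call, cooperative because of the patch

def cpu_loop():
    i = 0
    while i < 200_000:
        i += 1
        if i % 10_000 == 0:
            gevent.sleep(0)  # explicit yield to the hub

sleep_turns = count_turns(patched_sleep, pause=0.01)
loop_turns = count_turns(cpu_loop)
print(sleep_turns, loop_turns)
```

Both counts come out greater than one, which is the equivalent of seeing more than one dot in the earlier examples. How often to yield in a CPU-bound loop is a trade-off: yield too rarely and other greenlets starve, yield too often and the loop spends its time switching.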

gevent also provides a threadpool you can use to offload CPU intensive tasks.
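As a sketch of what offloading can look like, the following uses the hub's default threadpool through gevent.get_hub().threadpool. The function count_to is a made-up stand-in for real CPU intensive work.

```python
import gevent

def count_to(n):
    # Stand-in for CPU intensive work; runs in a worker thread
    i = 0
    while i < n:
        i += 1
    return i

def offloaded():
    # apply() hands count_to to the threadpool; this greenlet yields
    # to the hub until the worker thread returns the result
    return gevent.get_hub().threadpool.apply(count_to, (2_000_000,))

turns = 0
g = gevent.spawn(offloaded)
while not g.dead:
    turns += 1
    gevent.sleep(0.001)
print(turns, g.value)
```

The main greenlet keeps getting turns while the count runs in another thread. Keep in mind that pure-Python work in a worker thread still competes for the interpreter's global lock, so this keeps the hub responsive rather than making the computation itself faster.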

Diagnosing Blocking

You may have done all of that and are still not sure if you're blocking the hub or not: maybe a colleague added a new C extension to your project and you don't know its internals, or maybe there's an inner loop somewhere that might be CPU intensive and you're just not sure.

That's where my next post comes in. We'll discuss how to use gevent's monitoring facilities to find greenlets that aren't cooperating.

Footnotes

[1]	Unless of course you monkey patch, but that's another post. Or at least another section of this post.

[2]	At least when the system is monkey patched. But that's another post. Or at least another section of this post.