The definition of concurrency is simultaneous occurrence. In Python, the things occurring simultaneously include threads, tasks, and processes, but at a high level, each is a sequence of instructions running in order. Each sequence can be paused at certain points so the CPU processing it can switch to a different one. The state of each instruction set gets saved so the CPU can resume it right where it stopped.
Simultaneous does not necessarily mean at the same time. In Python, threading and asyncio both run on a single processor, executing a single instruction set at a time. Python speeds up the overall process by finding ways to interleave the instructions of different threads or tasks.
When Python uses threads, the operating system can pre-emptively interrupt a thread's execution to run a different thread. This is called pre-emptive multitasking, since the operating system may pre-empt a thread at any time. The interruption can even happen in the middle of a single statement.
With threads it’s important that our code executes in a thread-safe manner. Thread-safe execution means only a single thread at a time has access to a shared resource, such as a value in memory or a file. Writing thread-safe code adds a bit of overhead, but Python helps with thread-safe data structures like queue.Queue and per-thread storage like threading.local(). Without thread-safe code, threads may access stale copies of data and generate unpredictable results (e.g. race conditions).
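As a minimal sketch of thread-safe access, the counter and thread count below are illustrative; the point is that the lock ensures each read-modify-write happens atomically, so no increment is lost:

```python
import threading

# A shared counter guarded by a lock so that increments from many
# threads do not interleave and lose updates.
counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:  # only one thread mutates the counter at a time
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: every increment was applied exactly once
```

Without the lock, the read and write inside `counter += 1` can interleave across threads and some updates silently disappear.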
Spawning additional threads is not without cost. Each thread requires a certain amount of memory and processing resources to create the thread and eventually destroy it. Spawning too many threads can eliminate the time savings we were trying to achieve by using concurrency in the first place.
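One common way to cap that cost is a fixed-size thread pool, which reuses workers instead of spawning a thread per task. The `fetch` function and URLs below are placeholders standing in for real I/O:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Placeholder for real I/O, such as an HTTP request.
    return f"fetched {url}"

urls = [f"https://example.com/{i}" for i in range(10)]

# max_workers bounds how many threads exist at once; the pool
# creates them once and reuses them for all ten tasks.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch, urls))

print(results[0])  # fetched https://example.com/0
```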
Another concurrency model is cooperative multitasking. Tasks must cooperate by announcing when they are ready to stop processing. The benefit of cooperative multitasking is that we always know when Python will stop processing our code. This simplifies the architecture, because Python can run many tasks on a single thread inside a single process. Python’s asyncio module uses cooperative multitasking.
In the case of asyncio, a single Python object called the “event loop” controls how and when each task gets run. A running task is in complete control until it hands control back to the event loop. When the running task relinquishes control, the event loop goes to the event queue and selects the next task to run (note there may be multiple classes of queues). Web browsers use a similar process to respond to user events and their event bindings. The process repeats until the event loop exhausts the queue.
Because tasks have explicit control over their execution, we don’t have to worry about another thread accessing the data on which we’re operating. This makes sharing resources a bit easier in asyncio than in threading, mainly because we don’t have to write thread-safe code.
Because the event loop operates on a single thread, it typically uses fewer resources than a threaded model. This allows it to scale incredibly well. The caveat of operating on a single thread is the importance of keeping task execution short. A long-running task will prevent the event loop from processing other tasks.
An important consideration for using asyncio is whether the libraries you plan to use take advantage of Python’s async and await keywords. Unless a library is designed for asynchronous processing, it will not notify the event loop when its code is waiting for a response. At best this eliminates the advantages of the event loop model; at worst it destroys performance, because long-running tasks block execution on the single thread.
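When you are stuck with a blocking library, one escape hatch is to push the blocking call onto a worker thread so the loop stays responsive. The sketch below uses `asyncio.to_thread` (available since Python 3.9); `blocking_read` is a stand-in for a synchronous library call:

```python
import asyncio
import time

def blocking_read():
    # Stands in for a blocking library call; awaited directly,
    # this sleep would stall the whole event loop.
    time.sleep(0.05)
    return "data"

async def main():
    # to_thread runs the blocking call in a worker thread and
    # returns an awaitable, so the event loop stays free.
    return await asyncio.to_thread(blocking_read)

result = asyncio.run(main())
print(result)  # data
```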
Parallelism is a form of concurrency where Python creates multiple processes that execute code at the same time. Think of a process as a collection of memory and files each running its own Python interpreter. With different processes, each process can run on a different CPU core.
Initializing a separate Python interpreter for each process is a heavyweight operation and is not as fast as spawning a new thread. There is also communication overhead, because the separate processes do not share memory and must introduce special mechanisms for sharing state. For CPU-intensive workloads, however, the ability to execute code in parallel can dramatically improve performance.
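A small sketch of that trade-off, using the standard library’s multiprocessing.Pool (the function and inputs are illustrative): each worker process runs its own interpreter, so the CPU-bound work can occupy multiple cores at once.

```python
from multiprocessing import Pool

def sum_of_squares(n):
    # CPU-bound work: no I/O, just computation.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # Each task runs in a separate interpreter process; arguments
    # and results are serialized across the process boundary.
    with Pool(processes=2) as pool:
        totals = pool.map(sum_of_squares, [10, 100, 1000])
    print(totals)
```

The `if __name__ == "__main__":` guard matters here: on platforms that spawn fresh interpreters, child processes re-import the module, and the guard prevents them from recursively creating pools.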
When is Concurrency Useful?
Concurrency makes a significant difference for two types of workloads:
I/O bound: the program slows down because it frequently waits for input/output (I/O) from external resources.
Web requests typically spend several orders of magnitude more time waiting than they do executing CPU instructions.
Optimizing this kind of program involves overlapping time spent waiting for request responses so we can perform other computations while we wait.
CPU-bound: the program does significant computation without talking to the network or accessing files. The resource limiting the speed of the program is the CPU, not the network or the file system.
Optimizing this kind of program involves finding ways to do more computations at exactly the same time.
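The I/O-bound case above can be sketched with simulated requests: three waits of 0.2 seconds each take roughly 0.6 seconds run sequentially, but overlap to roughly 0.2 seconds when run concurrently (the sleep stands in for network latency):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_request(i):
    time.sleep(0.2)  # stands in for waiting on a network response
    return i

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(slow_request, range(3)))
elapsed = time.perf_counter() - start

# The three waits overlap, so wall time is close to one wait,
# not the sum of all three.
print(results, round(elapsed, 1))
```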
For I/O-intensive workloads, cooperative multitasking as implemented by the asyncio module offers scalability and an asynchronous programming paradigm familiar to developers who build web applications. The key to asyncio is writing short-running, asynchronous tasks that minimize execution time on the single-threaded event loop.
An alternative model for I/O-intensive workloads is threading. Threading requires thread-safe data structures and managing the number of threads being spawned. Threading can introduce race conditions that are difficult to track down due to their inconsistent and unpredictable nature.
For CPU-intensive workloads, creating multiple Python processes via parallelism enables our workloads to execute simultaneously. The downside is process initialization takes time, and sharing state between processes introduces additional complexity.
Concurrency is a useful tool for speeding up the execution of our Python programs, but it does add additional complexity to the code. Choosing the right concurrency model depends on whether a workload is I/O- or CPU-intensive; this is often the easy part. Striking a balance between simplicity and optimization is often the more difficult decision to make.