Concurrency and Asynchronicity in Python: Combining Approaches and Best Practices

In the previous articles, we have thoroughly examined the three main approaches to concurrent programming in Python: multithreading (threading), multiprocessing (multiprocessing), and asynchronous programming (asyncio). Each has its strengths and areas of application.

In this final article, we will:

Compare these approaches.
Introduce the high-level concurrent.futures module.
Discuss common debugging and profiling issues.
Review best practices for writing concurrent code.

Comparison: threading vs multiprocessing vs asyncio

Choosing the right tool for your task is key to an effective concurrent application.

| Feature | threading | multiprocessing | asyncio |
| --- | --- | --- | --- |
| Primary Model | Threads within a single process | Multiple processes | Cooperative multitasking on a single thread |
| CPU Parallelism | Limited by GIL (None for CPU-bound) | True (bypasses GIL, uses multiple cores) | None (single-threaded) |
| I/O-bound Tasks | Good (GIL is released) | Possible, but higher overhead | Excellent (minimal overhead) |
| CPU-bound Tasks | Poor (due to GIL) | Excellent | Poor (blocks the event loop) |
| Memory | Shared (easy, but risks race conditions) | Isolated (safer, but more complex data exchange) | Shared within one thread (unless run_in_executor is used) |
| Data Exchange | Shared variables + synchronization | IPC (Pipe, Queue, Value, Array, Manager) - slower | Complex exchange usually not needed (if everything is async) |
| Overhead | Low (for thread creation/switching) | High (for process creation/switching) | Very low (for coroutine switching) |
| Complexity | Medium (synchronization) | High (IPC, serialization) | Medium/High (async/await concept) |

Quick Recommendations:

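asyncio: Your first choice for new I/O-bound applications, especially network-related ones (web servers, API clients, bots). Because switching between coroutines is much cheaper than switching between threads or processes, it usually scales best when there are many concurrent connections.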
threading: Use for I/O-bound tasks if:
- You are working with existing blocking code that is difficult to rewrite for asyncio.
- The libraries you use don't have asynchronous alternatives.
- The task is relatively simple and doesn't require handling thousands of concurrent operations.
multiprocessing: Your choice for CPU-bound tasks that can be effectively parallelized across multiple cores (e.g., mathematical calculations, large data processing).
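
As a rough illustration of the asyncio recommendation above, the sketch below runs many simulated I/O operations concurrently on a single thread. The `fetch` coroutine, its delay, and the `url{i}` names are hypothetical stand-ins; `asyncio.sleep` takes the place of a real network call.

```python
import asyncio
import time

async def fetch(name: str, delay: float = 0.1) -> str:
    # Hypothetical stand-in for a network request: awaiting hands control
    # back to the event loop so other coroutines can make progress.
    await asyncio.sleep(delay)
    return f"Data from {name}"

async def fetch_all(names: list[str]) -> list[str]:
    # gather() runs all coroutines concurrently and returns the results
    # in the same order as its arguments.
    return await asyncio.gather(*(fetch(name) for name in names))

def run_demo(count: int = 100) -> tuple[list[str], float]:
    names = [f"url{i}" for i in range(count)]
    start = time.perf_counter()
    results = asyncio.run(fetch_all(names))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    results, elapsed = run_demo()
    print(f"{len(results)} results in {elapsed:.2f} sec.")
```

The total time should stay close to the delay of a single call rather than growing with the number of calls, since all the waiting overlaps.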

Sometimes, it's possible and reasonable to combine approaches. For example, use multiprocessing to distribute CPU-bound load, and within each process, use asyncio for efficient I/O handling.
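
One possible shape for such a combination is sketched below, assuming the work can be split into independent batches. Each worker process starts its own event loop, overlaps the I/O for its batch, and then does the CPU-bound part. The names `fetch`, `cpu_heavy`, and `process_batch`, as well as the batch contents, are made up for illustration.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

async def fetch(item: int) -> int:
    # Hypothetical I/O step; asyncio.sleep stands in for a network call.
    await asyncio.sleep(0.05)
    return item

def cpu_heavy(value: int) -> int:
    # Hypothetical CPU-bound step.
    return sum(i * i for i in range(value))

async def process_batch_async(batch: list[int]) -> list[int]:
    # Overlap the I/O for the whole batch, then do the CPU work.
    fetched = await asyncio.gather(*(fetch(item) for item in batch))
    return [cpu_heavy(value) for value in fetched]

def process_batch(batch: list[int]) -> list[int]:
    # Entry point executed inside a worker process: it runs its own event loop.
    return asyncio.run(process_batch_async(batch))

def main() -> None:
    batches = [[1_000, 2_000], [3_000, 4_000], [5_000, 6_000]]
    with ProcessPoolExecutor(max_workers=3) as pool:
        for result in pool.map(process_batch, batches):
            print(result)

if __name__ == "__main__":
    main()
```

Whether this pays off depends on the workload: the process pool adds startup and serialization cost, so it tends to help only when the CPU-bound part dominates.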

concurrent.futures: A High-Level Interface

The concurrent.futures module provides a simple, high-level interface for asynchronously executing callables (functions or methods) using threads or processes.

It offers two main classes:

ThreadPoolExecutor: Uses a pool of threads to execute tasks.
ProcessPoolExecutor: Uses a pool of processes to execute tasks.

Both Executors have a similar API, making it easy to switch between them.

Python 3.13
import concurrent.futures
import os
import threading
import time

def io_bound_task(url):
    # Report which process and thread run the task
    print(f"[PID: {os.getpid()}, Thread: {threading.get_ident()}] Downloading {url}")
    time.sleep(1)  # Simulate download
    return f"Data from {url}"

def cpu_bound_task(n):
    # Report which process and thread run the task
    print(f"[PID: {os.getpid()}, Thread: {threading.get_ident()}] Calculating sum for {n}")
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    urls = ["url1", "url2", "url3"]
    numbers_for_sum = [1000000, 2000000, 3000000]

    # --- ThreadPoolExecutor for I/O-bound tasks ---
    print("--- ThreadPoolExecutor (I/O-bound) ---")
    start_time = time.perf_counter()  # perf_counter() is intended for measuring intervals
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        # map preserves the order of results
        results_io = list(executor.map(io_bound_task, urls))
    print(f"I/O Results: {results_io}")
    print(f"ThreadPoolExecutor Time: {time.perf_counter() - start_time:.2f} sec.\n")

    # --- ProcessPoolExecutor for CPU-bound tasks ---
    print("--- ProcessPoolExecutor (CPU-bound) ---")
    start_time = time.perf_counter()
    # Using context manager ensures processes are cleaned up
    with concurrent.futures.ProcessPoolExecutor(max_workers=3) as executor:
        results_cpu = list(executor.map(cpu_bound_task, numbers_for_sum))
    # Results can be large, just print that they were obtained
    print(f"CPU Results obtained (count: {len(results_cpu)})")
    print(f"ProcessPoolExecutor Time: {time.perf_counter() - start_time:.2f} sec.\n")

    # --- Example using submit to get Future objects ---
    print("--- Using submit and Future ---")
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        future1 = executor.submit(io_bound_task, "url_future1")
        future2 = executor.submit(cpu_bound_task, 500000)

        print(f"Task 1 (I/O) running: {future1.running()}")
        print(f"Task 2 (CPU) running: {future2.running()}")

        # .result() blocks until the result is available
        print(f"Result future1: {future1.result()}")
        print(f"Result future2: {future2.result()}")

executor.submit(fn, *args, **kwargs): Submits a callable to be executed and immediately returns a Future object.
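executor.map(fn, *iterables, timeout=None, chunksize=1): Similar to the built-in map(), but executes calls asynchronously. Returns an iterator that yields the results in the order of the inputs, regardless of which call finishes first. The chunksize argument only has an effect with ProcessPoolExecutor.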
The Future object represents the future result of an operation. It has methods like result(), exception(), done(), running(), cancelled(), add_done_callback().
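
To make the difference between waiting on individual Future objects and map() more concrete, here is a minimal sketch that collects results as tasks finish. It relies on `concurrent.futures.as_completed`, a function from the same module that is not covered above; `simulated_download` and the job names are hypothetical.

```python
import concurrent.futures
import time

def simulated_download(name: str, delay: float) -> str:
    # Hypothetical task: sleep stands in for blocking I/O.
    time.sleep(delay)
    return f"Data from {name}"

def collect_in_completion_order(jobs: dict[str, float]) -> list[str]:
    finished = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(jobs)) as executor:
        # submit() returns immediately; keep each Future together with its job name.
        future_to_name = {
            executor.submit(simulated_download, name, delay): name
            for name, delay in jobs.items()
        }
        # as_completed() yields futures as they finish, not in submission order.
        for future in concurrent.futures.as_completed(future_to_name):
            finished.append(future.result())
    return finished

if __name__ == "__main__":
    print(collect_in_completion_order({"slow": 0.3, "fast": 0.1}))
```

With map() the caller would receive the "slow" result first because it was submitted first; with as_completed() the faster task is handled as soon as it is done.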

concurrent.futures is an excellent choice when you need a simple way to parallelize task execution without diving deep into the details of managing threads/processes or the complexities of asyncio.

Debugging and Profiling Concurrent Code

Debugging and profiling concurrent applications can be more challenging than for synchronous code due to non-determinism and potential race conditions.

Common Problems:

Race Conditions: When the outcome depends on the unpredictable sequence of execution of threads/processes accessing shared data.
Deadlocks: When two or more threads/processes wait indefinitely for each other to release resources.
Starvation: When one or more threads/processes cannot access a resource for an extended period.
Difficulty Reproducing Errors: Bugs may appear sporadically.
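
The race condition described above typically shows up in read-modify-write operations on shared data. The sketch below shows one common remedy, guarding the update with a Lock; the `Counter` class and `run_workers` helper are hypothetical names used only for this example.

```python
import threading

class Counter:
    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # "value += 1" is a read-modify-write sequence, not an atomic step.
        # The lock ensures only one thread performs it at a time.
        with self._lock:
            self.value += 1

def run_workers(counter: Counter, n_threads: int = 8, n_increments: int = 10_000) -> int:
    def worker() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value

if __name__ == "__main__":
    print(run_workers(Counter()))
```

With the lock in place the final value always equals n_threads * n_increments. Without it, updates can be lost, and how often that happens varies between runs and interpreter versions, which is exactly what makes such bugs hard to reproduce.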

Tools and Approaches:

Logging: Thorough logging (including thread/process IDs and timestamps) is invaluable.
Standard Python Debuggers (pdb, ipdb): Can be useful, but debugging multiple threads/processes can be difficult.
Specialized Debuggers: Some IDEs provide enhanced tools for debugging multithreaded/multiprocess applications.
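Profilers: cProfile and profile show where execution time is spent. For asyncio, the built-in debug mode (loop.set_debug(True), asyncio.run(..., debug=True), or the PYTHONASYNCIODEBUG=1 environment variable) reports slow callbacks and coroutines that were never awaited.
GIL Analysis: For multithreaded applications, understanding how the GIL affects performance is useful (e.g., by comparing against the experimental free-threaded build available since CPython 3.13, or by using OS performance analysis tools).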
Careful Design: Minimize shared state from the beginning and use the correct synchronization primitives.
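
As a small sketch of the logging advice above, the standard logging module can stamp every record with a timestamp, the process ID, and the thread name. The format string uses standard LogRecord attributes; the `worker` function and the thread names are hypothetical.

```python
import logging
import threading

def configure_logging() -> None:
    # %(asctime)s, %(process)d and %(threadName)s are standard LogRecord attributes.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s [pid=%(process)d %(threadName)s] %(message)s",
        force=True,  # replace any handlers configured earlier
    )

def worker(task_id: int) -> None:
    logging.info("starting task %d", task_id)
    logging.info("finished task %d", task_id)

def run_workers(count: int = 3) -> list[str]:
    # Naming threads explicitly makes the log output easier to follow.
    threads = [
        threading.Thread(target=worker, args=(i,), name=f"worker-{i}")
        for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return [thread.name for thread in threads]

if __name__ == "__main__":
    configure_logging()
    run_workers()
```

The interleaving of the "starting" and "finished" lines can differ from run to run, which is often the first visible hint of a timing-dependent problem.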

Error Handling and Exceptions

In concurrent systems, an exception in one thread, process, or task typically does not interrupt the others or the main thread, so it can go unnoticed unless you handle it explicitly.

threading: Uncaught exceptions in a thread typically terminate that thread and print error information to stderr. The main thread might continue running. Use try...except inside the thread function or threading.excepthook.
multiprocessing: An uncaught exception in a bare Process prints a traceback and ends that process with a non-zero exit code; the parent is not notified unless it checks Process.exitcode. With Pool or ProcessPoolExecutor, the worker process survives: the exception is pickled, passed back, and raised in the parent when you attempt to get the result (future.result(), AsyncResult.get(), or when iterating over pool.map results).
asyncio: Exceptions in a coroutine run as a Task do not immediately interrupt other tasks or the event loop. They are stored in the Task object. If the task's result is retrieved via await task or asyncio.gather(), the exception will be raised at the point of waiting. If nothing ever retrieves it, asyncio only logs a "Task exception was never retrieved" message when the task is garbage-collected, so handle exceptions explicitly or use task.add_done_callback() to check for them.
concurrent.futures: Future.result() will raise the exception if the task completed with an error.
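
The behaviours listed above can be tried out with a short sketch. It installs a temporary threading.excepthook, catches an exception re-raised by Future.result(), and uses asyncio.gather() with return_exceptions=True so that failures come back as values. All function names here are hypothetical.

```python
import asyncio
import concurrent.futures
import threading

def failing_task() -> None:
    raise ValueError("something went wrong")

def capture_thread_exception() -> list[str]:
    captured = []

    def hook(args) -> None:
        # args carries exc_type, exc_value, exc_traceback and thread.
        captured.append(f"{args.thread.name}: {args.exc_value}")

    original = threading.excepthook
    threading.excepthook = hook
    try:
        thread = threading.Thread(target=failing_task, name="worker")
        thread.start()
        thread.join()
    finally:
        threading.excepthook = original  # restore the default hook
    return captured

def future_exception() -> str:
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(failing_task)
        try:
            future.result()  # re-raises the exception from the worker
        except ValueError as exc:
            return f"caught: {exc}"
    return "no error"

async def gather_with_errors() -> list:
    async def ok() -> str:
        return "ok"

    async def boom() -> str:
        raise RuntimeError("boom")

    # return_exceptions=True returns exceptions as results
    # instead of raising the first one.
    return await asyncio.gather(ok(), boom(), return_exceptions=True)

if __name__ == "__main__":
    print(capture_thread_exception())
    print(future_exception())
    print(asyncio.run(gather_with_errors()))
```

In real code the hook would more likely log the error than collect it in a list; the list is used here only to keep the example easy to inspect.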

Best Practices

Minimize Shared Mutable State: The less shared data that can be changed, the less need for complex synchronization and the lower the risk of errors.
Use Correct Synchronization Primitives: Lock for simple cases, Queue for data exchange, Event for signaling, etc. Don't overcomplicate unnecessarily.
Avoid Deadlocks: Acquire locks in the same order across all threads/processes. Use timeouts when acquiring locks if possible.
Make Operations Idempotent: If an operation can be safely performed multiple times with the same result, it simplifies error handling and retries.
Handle Timeouts: For external calls (network, IPC), always include timeouts to prevent indefinite waiting.
Manage Resources Carefully: Ensure resources (files, sockets, locks) are released using try...finally or context managers (with, async with).
Test Thoroughly: Concurrent code requires extra attention to testing, including checking edge cases and potential race conditions (though they are hard to catch reliably).
Choose the Right Tool for the Job: Don't use multiprocessing for simple I/O-bound tasks where asyncio or threading would be more efficient and lightweight.
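
Two of the practices above, consistent lock ordering with timeouts and timeouts on external calls, can be sketched as follows. The locks, the `transfer` function, and `slow_call` are hypothetical; the sketch assumes that every code path takes `lock_a` before `lock_b`.

```python
import asyncio
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def transfer(timeout: float = 1.0) -> bool:
    # Always acquire lock_a before lock_b. A consistent order across all
    # threads prevents the circular wait that causes deadlocks.
    if not lock_a.acquire(timeout=timeout):
        return False
    try:
        if not lock_b.acquire(timeout=timeout):
            return False  # give up instead of waiting forever
        try:
            return True  # the critical section would go here
        finally:
            lock_b.release()
    finally:
        lock_a.release()

async def slow_call() -> str:
    # Hypothetical external call that takes too long.
    await asyncio.sleep(10)
    return "done"

async def call_with_timeout(timeout: float = 0.1) -> str:
    try:
        # wait_for() cancels the awaited coroutine when the timeout expires.
        return await asyncio.wait_for(slow_call(), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"

if __name__ == "__main__":
    print(transfer())
    print(asyncio.run(call_with_timeout()))
```

Returning False on a lock timeout leaves the retry policy to the caller, which fits well with idempotent operations that can be repeated safely.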

A Brief Look at the asyncio Ecosystem

The popularity of asyncio has led to numerous libraries built upon it, simplifying the development of asynchronous applications:

aiohttp: Asynchronous HTTP client/server.
httpx: Modern HTTP client supporting both synchronous and asynchronous requests.
FastAPI, Starlette: High-performance web frameworks using asyncio.
redis-py (redis.asyncio, which absorbed the former aioredis project), asyncpg: Asynchronous database drivers (Redis, PostgreSQL).

These libraries provide ready-to-use asynchronous interfaces for common tasks, allowing you to focus on your application's business logic.

Congratulations! You have completed the basics of concurrent and asynchronous programming in Python. We hope this knowledge helps you write more performant and responsive applications.
