Multiprocessing

Bypass the GIL and use all your CPU cores with the multiprocessing module — true parallel computation in Python.

"Multiprocessing gives each worker their own desk, their own tools, their own memory. No sharing, no waiting for a lock. True parallelism on all your CPU cores."

— Shurai

The Problem Threading Can't Solve

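Because of the Global Interpreter Lock (GIL), only one thread in a standard CPython process executes Python bytecode at a time, so threads can't run CPU-heavy Python code in parallel. Adding more threads to a calculation doesn't speed it up; they still take turns: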

🔁 Threading + GIL
4 threads, 1 CPU core
Still one at a time
No speedup for CPU work

⚡ Multiprocessing
4 processes, 4 CPU cores
Truly parallel
~4x speedup possible
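
One way to check this comparison on your own machine is to time the same CPU-bound function under a thread pool and a process pool. The sketch below is only an illustration: the names sum_squares and timed_map and the workload size are made up for this example, and the timings you see will depend on your core count and on process start-up overhead.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def sum_squares(n):
    """CPU-bound work: pure-Python arithmetic that holds the GIL."""
    return sum(i * i for i in range(n))

def timed_map(executor_cls, jobs, workers=4):
    """Run sum_squares over jobs with the given executor; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as ex:
        results = list(ex.map(sum_squares, jobs))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    jobs = [2_000_000] * 4  # hypothetical workload: four equal CPU-heavy jobs
    thread_results, thread_secs = timed_map(ThreadPoolExecutor, jobs)
    process_results, process_secs = timed_map(ProcessPoolExecutor, jobs)
    assert thread_results == process_results  # same answers either way
    print(f"threads:   {thread_secs:.2f}s")
    print(f"processes: {process_secs:.2f}s")
```

On a multi-core machine the process version usually finishes sooner, though rarely by the full 4x, because starting processes and sending results back has a cost.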

The Basic Pattern — Process

python
from multiprocessing import Process
import os

def heavy_task(name):
    print(f"[{name}] running in process {os.getpid()}")
    # Simulate CPU-heavy work
    total = sum(i**2 for i in range(10_000_000))
    print(f"[{name}] done. sum={total}")

if __name__ == "__main__":  # required on Windows/macOS (spawn start method)
    p1 = Process(target=heavy_task, args=("Worker-1",))
    p2 = Process(target=heavy_task, args=("Worker-2",))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print("Both processes finished")

Always use if __name__ == "__main__":

On Windows and macOS, Python starts each child process in a fresh interpreter that re-imports your script. Without this guard, that import would run your process-creating code again inside every child; current Python versions detect this and stop with a RuntimeError instead of spawning children endlessly. Always put the code that starts processes under this guard.
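
To see the re-import behaviour directly, the following sketch forces the "spawn" start method with multiprocessing.get_context and prints the module name as seen by the parent and by the child. The function names describe and child_main are hypothetical, chosen for this example. Save it as a script file before running it, since spawned children need a file to re-import.

```python
import multiprocessing as mp
import os

def describe():
    """Return the module name and PID as seen by whichever process calls this."""
    return f"__name__={__name__!r} pid={os.getpid()}"

def child_main():
    # Runs in the child process after it has re-imported this script.
    print("child: ", describe())

if __name__ == "__main__":
    # Explicitly request "spawn", the default start method on Windows and macOS.
    ctx = mp.get_context("spawn")
    child = ctx.Process(target=child_main)
    child.start()
    child.join()
    print("parent:", describe())
```

Under spawn, the child typically reports a module name other than "__main__" (commonly "__mp_main__"), which is why the guarded block is skipped there and no further processes are started.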

Pool — The Easy Way to Parallelize a List

Pool.map() is often the most convenient multiprocessing tool: it splits a list across worker processes, applies the function to every item, and returns the results in input order:

python
from multiprocessing import Pool
from math import isqrt

def is_prime(n):
    """CPU-bound: check if n is prime by trial division."""
    if n < 2:
        return False
    # Only divisors up to the integer square root need checking
    for i in range(2, isqrt(n) + 1):
        if n % i == 0:
            return False
    return True

if __name__ == "__main__":
    numbers = list(range(1, 100_001))

    # processes=4 starts 4 worker processes, which can run on up to 4 cores
    with Pool(processes=4) as pool:
        results = pool.map(is_prime, numbers)

    primes = [n for n, p in zip(numbers, results) if p]
    print(f"Found {len(primes)} primes up to 100,000")
    # Found 9592 primes up to 100,000
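
When each item is very cheap to process, sending 100,000 separate values and results between processes can cost more than the work itself. One possible variation, sketched here under that assumption, is to hand each worker a whole block of numbers and get back a single count. The helpers count_primes and make_blocks and the block size of 10,000 are illustrative choices, not part of the multiprocessing API.

```python
from math import isqrt
from multiprocessing import Pool

def is_prime(n):
    """Trial division up to the integer square root."""
    if n < 2:
        return False
    for i in range(2, isqrt(n) + 1):
        if n % i == 0:
            return False
    return True

def count_primes(bounds):
    """Count primes in the half-open range [start, stop)."""
    start, stop = bounds
    return sum(1 for n in range(start, stop) if is_prime(n))

def make_blocks(limit, block_size):
    """Split 1..limit into consecutive (start, stop) blocks."""
    return [(s, min(s + block_size, limit + 1))
            for s in range(1, limit + 1, block_size)]

if __name__ == "__main__":
    blocks = make_blocks(100_000, 10_000)  # 10 blocks of 10,000 numbers
    with Pool(processes=4) as pool:
        counts = pool.map(count_primes, blocks)  # one small result per block
    print(f"Found {sum(counts)} primes up to 100,000")
```

Pool.map() also accepts a chunksize argument that batches items in a similar way without changing the worker function.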

ProcessPoolExecutor — The Modern Way

python
from concurrent.futures import ProcessPoolExecutor
import math

def compute(n):
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    inputs = [1_000_000] * 8   # 8 heavy calculations

    with ProcessPoolExecutor() as ex:  # auto-detects CPU count
        results = list(ex.map(compute, inputs))
    print(f"Done. First result: {results[0]:.2f}")
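
ex.map() returns results in input order. If you would rather handle each result as soon as its worker finishes, concurrent.futures also provides submit() and as_completed(). The sketch below is a minimal illustration; the input sizes are arbitrary values picked for this example.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed
import math

def compute(n):
    """CPU-bound: sum of square roots of 0..n-1."""
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    sizes = [200_000, 400_000, 800_000]  # hypothetical workloads

    with ProcessPoolExecutor() as ex:
        # Map each future back to the input that produced it
        futures = {ex.submit(compute, n): n for n in sizes}
        for fut in as_completed(futures):  # yields futures as they finish
            n = futures[fut]
            print(f"n={n}: {fut.result():.2f}")
```

The order of the printed lines can differ between runs, because it follows completion order rather than submission order.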

When to Use What?

Tool | Best for | Shares memory?
asyncio | Many I/O waits (APIs, sockets) | Yes
threading | I/O tasks, simple concurrency | Yes
multiprocessing | CPU-heavy computation | No (separate memory)

"The rule is simple: waiting → asyncio or threading. Computing → multiprocessing. When in doubt, benchmark. Concurrency bugs are subtle — don't add complexity until you've measured you need it."

— Shurai

🧠 Quiz — Q1

Why can't Python threads speed up CPU-heavy code?

🧠 Quiz — Q2

What does Pool.map(func, list) do?

🧠 Quiz — Q3

Why is if __name__ == "__main__": required for multiprocessing on Windows/Mac?

🧠 Quiz — Q4

Which tool is the best choice for checking primality of 10 million numbers as fast as possible?