Threading
Run multiple threads to do I/O-bound tasks simultaneously — and understand the GIL, thread safety, and when not to use threads.
"Threads are like workers sharing the same desk. They can work at the same time but must take turns using the same tools — which in Python means taking turns with the GIL."
— Shurai
What is a Thread?
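A thread is a separate line of execution inside your program. All threads in a process share the same memory, so they can read and modify the same objects and global variables. Your Python program always has at least one thread (the main thread). Extra threads let tasks that spend most of their time waiting overlap:
Without threads: download A, then B, then C. Total time = sum of all.
With threads: download A, B, and C together. Total time ≈ longest one.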
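To make the comparison concrete, here is a minimal sketch that times both approaches. The `fake_download` function and the `jobs` list are hypothetical stand-ins, and `time.sleep` only simulates waiting on a network, so the exact numbers will vary from machine to machine.

```python
import threading
import time

def fake_download(name, seconds):
    # time.sleep stands in for waiting on the network (an assumption)
    time.sleep(seconds)

# Hypothetical jobs: (name, simulated duration in seconds)
jobs = [("A", 0.3), ("B", 0.2), ("C", 0.1)]

# One after another: total is roughly the sum of the durations
start = time.perf_counter()
for name, seconds in jobs:
    fake_download(name, seconds)
sequential = time.perf_counter() - start

# All at once: total is roughly the longest single duration
start = time.perf_counter()
threads = [threading.Thread(target=fake_download, args=job) for job in jobs]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"Main thread: {threading.main_thread().name}")
print(f"Sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The sequential figure should land near 0.6s and the threaded figure near 0.3s, since the three waits overlap.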
Creating and Starting Threads
import threading
import time

def download(file_name, duration):
    print(f"Starting download: {file_name}")
    time.sleep(duration)  # simulate network delay
    print(f"Finished: {file_name}")

# Create threads: target is the function, args is a tuple of its arguments
t1 = threading.Thread(target=download, args=("video.mp4", 3))
t2 = threading.Thread(target=download, args=("music.mp3", 2))
t3 = threading.Thread(target=download, args=("photo.jpg", 1))

t1.start()  # launch each thread
t2.start()
t3.start()

t1.join()  # wait for each to finish
t2.join()
t3.join()

print("All downloads complete!")
Starting download: video.mp4
Starting download: music.mp3
Starting download: photo.jpg
Finished: photo.jpg ← 1s
Finished: music.mp3 ← 2s
Finished: video.mp4 ← 3s
All downloads complete!
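The role of `join()` is easier to see when you inspect a thread before and after waiting on it. The sketch below is illustrative only: `slow_task` and the durations are made up, and it relies on `Thread.is_alive()` and the optional `timeout` argument of `join()`.

```python
import threading
import time

def slow_task():
    time.sleep(0.5)  # stand-in for a slow download

worker = threading.Thread(target=slow_task, name="worker-1")
worker.start()

# Right after start(), the worker is still sleeping
alive_before = worker.is_alive()

# join(timeout=...) waits at most that long, then returns either way
worker.join(timeout=0.05)
alive_after_timeout = worker.is_alive()

# join() with no timeout blocks until the thread has finished
worker.join()
alive_after_join = worker.is_alive()

print(alive_before, alive_after_timeout, alive_after_join)
```

On a typical run the worker is still alive after the short timed wait and is finished only after the plain `join()` returns. Without that final `join()`, the main thread would carry on while the worker was still running.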
The GIL — Python's Important Limitation
CPython, the standard Python interpreter, has a Global Interpreter Lock (GIL). It allows only one thread to execute Python bytecode at a time, so threads can’t truly run in parallel for CPU-heavy work:
Threads help with:
Sleeping and waiting (I/O-bound work)
Running C extensions that release the GIL (like NumPy)
Threads don't help with:
Sorting huge lists
CPU-bound number crunching
(Use multiprocessing instead)
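One rough way to see the GIL's effect is to time a pure-Python loop run twice in a row and then in two threads. This is a sketch under stated assumptions: `count_down` and the workload size `N` are arbitrary choices for illustration, and the timings depend on your machine and interpreter build.

```python
import threading
import time

def count_down(n):
    # Pure-Python loop: CPU-bound, so it holds the GIL while it runs
    while n > 0:
        n -= 1

N = 2_000_000  # arbitrary workload size, chosen for illustration

# Two runs back to back on the main thread
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# The same two runs, each in its own thread
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"Sequential: {sequential:.2f}s, two threads: {threaded:.2f}s")
```

On a standard CPython build the two timings usually come out similar, because the threads take turns instead of running side by side. Swapping `count_down` for a `time.sleep` call would show the threaded version pulling ahead.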
ThreadPoolExecutor — The Modern Way
Instead of managing threads manually, use concurrent.futures.ThreadPoolExecutor. It handles creating, starting, and joining threads for you:
from concurrent.futures import ThreadPoolExecutor
import time

def fetch_price(stock):
    time.sleep(1)  # pretend API call
    return {"INFY": 1800, "TCS": 3900, "WIPRO": 450}[stock]

stocks = ["INFY", "TCS", "WIPRO"]

# max_workers = number of threads in the pool
with ThreadPoolExecutor(max_workers=3) as executor:
    # map() returns results in the same order as the inputs
    prices = list(executor.map(fetch_price, stocks))

for stock, price in zip(stocks, prices):
    print(f"{stock}: ₹{price}")
# All 3 fetched in ~1s instead of 3s
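When some calls may fail, or you want each result as soon as it is ready, `submit()` with `as_completed()` is a common alternative to `map()`. The sketch below reuses the lesson's idea with a hypothetical `PRICES` table and a deliberately unknown symbol to show how a worker's exception surfaces through `future.result()`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

# Hypothetical price table, for illustration only
PRICES = {"INFY": 1800, "TCS": 3900, "WIPRO": 450}

def fetch_price(stock):
    time.sleep(0.1)  # pretend API call
    return PRICES[stock]  # raises KeyError for an unknown symbol

stocks = ["INFY", "TCS", "WIPRO", "UNKNOWN"]
results = {}
errors = {}

with ThreadPoolExecutor(max_workers=4) as executor:
    # Map each Future back to the symbol it was submitted for
    future_to_stock = {executor.submit(fetch_price, s): s for s in stocks}
    # as_completed yields futures in the order they finish
    for future in as_completed(future_to_stock):
        stock = future_to_stock[future]
        try:
            results[stock] = future.result()  # re-raises the worker's exception
        except KeyError as exc:
            errors[stock] = exc

print(results)
print(errors)
```

Catching the exception per future keeps one bad symbol from discarding the prices that were fetched successfully.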
Thread Safety — Protecting Shared Data
When threads share data, race conditions can corrupt it. Use a Lock to ensure only one thread changes shared data at a time:
import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(100_000):
        with lock:  # only one thread at a time
            counter += 1

threads = [threading.Thread(target=increment) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 500000 with the lock; without it the result is not guaranteed
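A race condition is easier to observe when the read and the write are pulled apart. The following sketch is a contrived illustration: `run_deposits` and `read_pause_write` are hypothetical helpers, and the `time.sleep` call exists only to widen the gap in which another thread can sneak in.

```python
import threading
import time

def run_deposits(n_threads, use_lock):
    balance = {"value": 0}
    lock = threading.Lock()

    def read_pause_write():
        current = balance["value"]      # read
        time.sleep(0.01)                # widen the gap between read and write
        balance["value"] = current + 1  # write back a possibly stale value

    def deposit():
        if use_lock:
            with lock:  # the whole read-modify-write happens as one unit
                read_pause_write()
        else:
            read_pause_write()

    threads = [threading.Thread(target=deposit) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balance["value"]

safe = run_deposits(5, use_lock=True)
unsafe = run_deposits(5, use_lock=False)
print(f"With lock: {safe}, without lock: {unsafe}")
```

With the lock, all five deposits are counted. Without it, several threads tend to read the same starting value and overwrite each other's update, so the total usually comes out lower than 5. The price of the lock is that the protected sections run one at a time.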
"Threading shines for I/O-bound work: downloading files, querying APIs, reading databases. For CPU-heavy work, reach for multiprocessing. And for I/O in modern async code, use asyncio."
— Shurai
🧠 Quiz — Q1
What does thread.join() do?
🧠 Quiz — Q2
What is the GIL?
🧠 Quiz — Q3
You have 10 API calls that each take 1 second. Roughly how long does ThreadPoolExecutor(max_workers=10) take?
🧠 Quiz — Q4
Why should you use a Lock when multiple threads modify the same variable?