Is Python single-threaded or multi-threaded? Explain with examples. How does thread locking work in Python?
#python #threading #gil #concurrency #locks #multiprocessing
Answer
Python Threading: Single-Threaded vs Multi-Threaded
Python supports multi-threading via the `threading` module.
The Global Interpreter Lock (GIL)
The GIL is a mutex in CPython that allows only one thread to execute Python bytecode at a time, even on multi-core CPUs.
| Aspect | Single-Threaded | Multi-Threaded (with GIL) | Multi-Processing |
|---|---|---|---|
| Concurrency | None | Yes (concurrent, not parallel) | Yes (true parallelism) |
| CPU-bound tasks | Baseline | No speedup (GIL bottleneck) | Near-linear speedup at best (bounded by core count and process overhead) |
| I/O-bound tasks | Blocking | Significant speedup | Speedup (but heavier) |
| Memory | Single space | Shared memory | Separate memory per process |
| Overhead | None | Low | High (process creation) |
Example 1: Multi-Threading for I/O-Bound Tasks
Threading shines when threads spend time waiting for I/O (API calls, file reads, network requests) — the GIL is released during I/O.
```python
import threading
import time

import requests


def fetch_url(url: str, results: list, index: int) -> None:
    """Fetch a URL - I/O bound task where threading helps."""
    response = requests.get(url)
    results[index] = len(response.content)
    print(f"Thread {index}: {url} -> {len(response.content)} bytes")


urls = [
    "https://httpbin.org/delay/1",
    "https://httpbin.org/delay/1",
    "https://httpbin.org/delay/1",
]

# --- Sequential (slow) ---
start = time.time()
for url in urls:
    requests.get(url)
print(f"Sequential: {time.time() - start:.2f}s")  # ~3 seconds

# --- Multi-threaded (fast) ---
start = time.time()
results = [None] * len(urls)
threads = []
for i, url in enumerate(urls):
    t = threading.Thread(target=fetch_url, args=(url, results, i))
    threads.append(t)
    t.start()

for t in threads:
    t.join()  # Wait for all threads to complete
print(f"Threaded: {time.time() - start:.2f}s")  # ~1 second
```
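The same fan-out pattern can be written more compactly with `concurrent.futures.ThreadPoolExecutor` from the standard library. The sketch below is an illustration, not part of the original example: `fake_fetch` is a hypothetical stand-in that sleeps instead of making a real request, so it runs without `requests` or network access.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url: str) -> int:
    """Hypothetical stand-in for an HTTP call: sleep to simulate network wait."""
    time.sleep(0.2)  # time.sleep releases the GIL, as blocking I/O does
    return len(url)


# Placeholder URLs; nothing is actually fetched
urls = [f"https://example.com/page/{i}" for i in range(3)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    # map() returns results in the same order as the inputs
    sizes = list(pool.map(fake_fetch, urls))
elapsed = time.perf_counter() - start

print(f"Results: {sizes}")
print(f"Elapsed: {elapsed:.2f}s")  # roughly one sleep, not three
```

The executor handles thread creation and joining, and the `with` block waits for all workers to finish before continuing.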
Example 2: GIL Limitation for CPU-Bound Tasks
```python
import threading
import time


def cpu_heavy_task(n: int) -> int:
    """CPU-bound task - GIL prevents parallel execution."""
    total = 0
    for i in range(n):
        total += i * i
    return total


N = 10_000_000

# --- Sequential ---
start = time.time()
cpu_heavy_task(N)
cpu_heavy_task(N)
print(f"Sequential: {time.time() - start:.2f}s")

# --- Multi-threaded (NOT faster due to GIL) ---
start = time.time()
t1 = threading.Thread(target=cpu_heavy_task, args=(N,))
t2 = threading.Thread(target=cpu_heavy_task, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
print(f"Threaded: {time.time() - start:.2f}s")  # Same or slower!
```
Key Insight: For CPU-bound work, use `multiprocessing` instead of `threading` to bypass the GIL.
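As a rough sketch of that advice, the CPU-bound function from Example 2 can be run in separate processes with `concurrent.futures.ProcessPoolExecutor`. The names `sum_of_squares` and `run_in_processes` and the workload size are assumptions chosen for illustration; actual speedup depends on core count and process start-up cost.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(n: int) -> int:
    """CPU-bound work; defined at module level so worker processes can import it."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_processes(inputs: list[int]) -> list[int]:
    """Run sum_of_squares for each input in its own worker process."""
    with ProcessPoolExecutor(max_workers=len(inputs)) as pool:
        return list(pool.map(sum_of_squares, inputs))


if __name__ == "__main__":  # guard needed on platforms that spawn workers
    N = 2_000_000  # illustrative workload size
    start = time.perf_counter()
    results = run_in_processes([N, N])
    print(f"Two processes: {time.perf_counter() - start:.2f}s")
    print(results[0] == results[1])
```

Each worker process has its own interpreter and its own GIL, which is why the two calls can run on separate cores. Arguments and return values are pickled between processes, so this pays off mainly when the computation outweighs that transfer cost.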
Thread Locking in Python
When multiple threads access shared data, you need locks to prevent race conditions — where threads read/write data simultaneously and corrupt it.
Example 3: Race Condition Without Lock
```python
import threading

counter = 0


def increment_without_lock(n: int) -> None:
    """Unsafe: race condition on shared counter."""
    global counter
    for _ in range(n):
        counter += 1  # NOT atomic: read -> increment -> write


threads = []
for _ in range(10):
    t = threading.Thread(target=increment_without_lock, args=(100_000,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f"Expected: 1,000,000")
# May be less than 1,000,000; how often depends on CPython version and timing
print(f"Actual: {counter:,}")
```
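To see why `counter += 1` is not atomic, one option is to inspect the bytecode with the standard `dis` module. This is a small sketch; the helper name `increment` is illustrative, and the exact opcode names differ between CPython versions.

```python
import dis

counter = 0


def increment() -> None:
    global counter
    counter += 1


# Collect the bytecode instruction names for the function body
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(ops)
# The list varies by CPython version, but it contains a separate load
# (LOAD_GLOBAL) and store (STORE_GLOBAL) with the addition in between.
```

Because the load and the store are separate instructions, another thread can in principle run between them and overwrite the value, which is how an update gets lost.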
Example 4: Thread-Safe with Lock
```python
import threading

counter = 0
lock = threading.Lock()


def increment_with_lock(n: int) -> None:
    """Safe: lock protects shared counter."""
    global counter
    for _ in range(n):
        with lock:  # Acquires lock, releases automatically
            counter += 1


threads = []
for _ in range(10):
    t = threading.Thread(target=increment_with_lock, args=(100_000,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f"Expected: 1,000,000")
print(f"Actual: {counter:,}")  # Always 1,000,000
```
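The `with lock:` statement is shorthand for an acquire/release pair wrapped in `try`/`finally`. The sketch below spells that out and also shows the `timeout` argument of `acquire()`; the variable names are illustrative.

```python
import threading

lock = threading.Lock()

# Roughly what `with lock:` does
lock.acquire()  # blocks until the lock is free
try:
    held_inside = lock.locked()  # True while we hold it
finally:
    lock.release()  # always runs, even if the body raises

held_after = lock.locked()

# acquire() can give up after a timeout instead of blocking forever
lock.acquire()
got_it = lock.acquire(timeout=0.1)  # a plain Lock is not reentrant, so this times out
lock.release()

print(held_inside, held_after, got_it)  # True False False
```

The second `acquire()` fails because the same thread already holds the lock, which is the situation `RLock` is designed for.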
Types of Locks in Python
| Lock Type | Class | Use Case |
|---|---|---|
| Basic Lock | `threading.Lock` | Simple mutual exclusion |
| Reentrant Lock | `threading.RLock` | Same thread can acquire lock multiple times |
| Semaphore | `threading.Semaphore` | Allow up to `N` threads to hold it concurrently |
| Event | `threading.Event` | Signal between threads (set/wait) |
| Condition | `threading.Condition` | Wait for a condition with notify/wait |
Example 5: RLock and Semaphore
```python
import threading
import time

# --- RLock: same thread can acquire multiple times ---
rlock = threading.RLock()


def recursive_task(depth: int) -> None:
    if depth <= 0:
        return
    with rlock:  # Same thread re-acquires - no deadlock
        print(f"Depth {depth}, Thread {threading.current_thread().name}")
        recursive_task(depth - 1)


recursive_task(3)

# --- Semaphore: limit concurrent access ---
semaphore = threading.Semaphore(3)  # Max 3 concurrent threads


def rate_limited_api_call(call_id: int) -> None:
    with semaphore:
        print(f"Call {call_id} started")
        time.sleep(1)  # Simulate API call
        print(f"Call {call_id} done")


threads = [threading.Thread(target=rate_limited_api_call, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Only 3 calls run at a time
```
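The Event and Condition rows from the table are not covered by the example above. The following is a minimal sketch of both; the names `worker`, `consumer`, `items`, and `log` are illustrative, and a real producer/consumer would more often use `queue.Queue`.

```python
import threading
from collections import deque

# --- Event: one thread signals, another waits ---
ready = threading.Event()
log: list[str] = []


def worker() -> None:
    ready.wait()  # blocks until ready.set() is called
    log.append("worker ran")


t = threading.Thread(target=worker)
t.start()
log.append("main signals")
ready.set()
t.join()

# --- Condition: wait until a predicate on shared state holds ---
items: deque[int] = deque()
cond = threading.Condition()
consumed: list[int] = []


def consumer() -> None:
    for _ in range(3):
        with cond:
            # wait_for releases the lock while waiting and re-checks on wake-up
            cond.wait_for(lambda: len(items) > 0)
            consumed.append(items.popleft())


c = threading.Thread(target=consumer)
c.start()
for item in (1, 2, 3):
    with cond:
        items.append(item)
        cond.notify()  # wake one waiting thread
c.join()

print(log)       # ['main signals', 'worker ran']
print(consumed)  # [1, 2, 3]
```

`wait_for` checks the predicate before sleeping, so a `notify()` that arrives while the consumer is not waiting is not lost.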
When to Use What in Gen AI Applications
| Scenario | Approach | Why |
|---|---|---|
| Calling multiple LLM APIs | `threading` or `asyncio` | I/O-bound; GIL released during network wait |
| Embedding large document batches | `multiprocessing` | CPU-bound preprocessing |
| Shared token counter across threads | `threading.Lock` | Prevent race condition on counter |
| Rate-limiting API calls | `threading.Semaphore` | Limit concurrent requests to `N` |
| Async RAG pipeline | `asyncio` | Best for high-concurrency I/O patterns |
Best Practice: For modern Gen AI applications, prefer `asyncio` over `threading` for I/O-bound concurrency, and `multiprocessing` (or libraries like `Ray`) for CPU-bound parallelism. Use locks only when threads must share mutable state.
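As a hedged illustration of the `asyncio` recommendation, the sketch below runs several simulated requests concurrently and caps how many are in flight with `asyncio.Semaphore`. `fake_llm_call` is a hypothetical stand-in for a real async client, and the limit of 2 is an arbitrary value chosen for the example.

```python
import asyncio


async def fake_llm_call(prompt: str, limiter: asyncio.Semaphore) -> str:
    """Hypothetical stand-in for an async LLM request."""
    async with limiter:  # caps the number of calls in flight
        await asyncio.sleep(0.1)  # simulates waiting on the network
        return prompt.upper()


async def main() -> list[str]:
    limiter = asyncio.Semaphore(2)  # assumed concurrency limit for illustration
    prompts = ["summarize", "translate", "classify"]
    # gather() runs the coroutines concurrently and preserves input order
    return await asyncio.gather(*(fake_llm_call(p, limiter) for p in prompts))


results = asyncio.run(main())
print(results)  # ['SUMMARIZE', 'TRANSLATE', 'CLASSIFY']
```

All coroutines run on a single thread and yield control only at `await` points, so shared state touched between awaits does not need a thread lock.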