4.5 Free-Threading - Python's Era of True Multithreading

2026-01-20

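Python 3.13 introduced experimental free-threading support, and Python 3.14 promoted it to an officially supported (though still optional) build. It is one of the most significant concurrency improvements in Python's history.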

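What Is Free-Threading?

History and Limitations of the GIL

CPython has long used the GIL (Global Interpreter Lock) to simplify memory management and the development of C extensions. The cost is that only one thread at a time can execute Python bytecode: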

Traditional Python (with the GIL):
┌──────────────────────────────────────┐
│  Thread 1  →  running                │
│  Thread 2  →  waiting for the GIL... │
│  Thread 3  →  waiting for the GIL... │
│  Thread 4  →  waiting for the GIL... │
└──────────────────────────────────────┘
   Only one thread can execute Python code at any given time

Free-threaded Python (no GIL):
┌──────────────────────────────────────┐
│  Thread 1  →  running  (Core 1)      │
│  Thread 2  →  running  (Core 2)      │
│  Thread 3  →  running  (Core 3)      │
│  Thread 4  →  running  (Core 4)      │
└──────────────────────────────────────┘
   Multiple threads can run truly in parallel

Development Timeline

Version	Status	PEP
Python 3.13	Experimental support	PEP 703
Python 3.14	Officially supported	PEP 779
Python 3.15/3.16	May become the default	To be decided

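Installation and Activation

Installation by Platform

Windows / macOS

Download the installer from python.org, choose "Customize installation", and tick the free-threaded option (labelled "Free threaded mode" or similar, depending on the installer version).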

Ubuntu / Debian

# Use the deadsnakes PPA
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.13-nogil
# or
sudo apt install python3.14-nogil

After installation, run the interpreter as python3.13t or python3.14t.

Building from Source

./configure --disable-gil
make -j$(nproc)
# Consider "sudo make altinstall" instead to avoid replacing an existing python3
sudo make install

Verifying the Installation

# Check the version information
python3.14t -VV
# The output should contain "free-threading build"

# Confirm the GIL state
python3.14t -c "import sys; print('GIL enabled:', sys._is_gil_enabled())"
# Expected output: GIL enabled: False

Controlling the GIL

# Force the GIL off (even if an incompatible module is loaded)
PYTHON_GIL=0 python3.14t script.py

# Or use the command-line option
python3.14t -Xgil=0 script.py

# Force the GIL on (in a free-threaded build)
python3.14t -Xgil=1 script.py
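Whether an interpreter was built without the GIL and whether the GIL is actually off at run time are separate questions, because PYTHON_GIL, -X gil, or an incompatible extension can switch it back on. The following is a minimal sketch that reports both. The helper name `describe_gil_state` is hypothetical, and the sketch assumes the `Py_GIL_DISABLED` build variable that free-threaded builds expose through `sysconfig`.

```python
import sys
import sysconfig


def describe_gil_state() -> dict:
    """Report build-time and run-time GIL information (illustrative helper)."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    built_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() exists from Python 3.13 on; older versions always have a GIL.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled_now = True if gil_check is None else bool(gil_check())

    return {
        "built_free_threaded": built_free_threaded,
        "gil_enabled_now": gil_enabled_now,
        # True when a free-threaded build is running with the GIL switched back on.
        "gil_reenabled": built_free_threaded and gil_enabled_now,
    }


if __name__ == "__main__":
    for name, value in describe_gil_state().items():
        print(f"{name}: {value}")
```

Running the script once with `-Xgil=0` and once with `-Xgil=1` on a free-threaded build should flip `gil_enabled_now` while `built_free_threaded` stays the same.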

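Benchmark Results

The figures below come from several reputable sources (Real Python, CodSpeed, Facebook Benchmarking):

Single-Threaded vs Multithreaded Performance

Scenario	Traditional Python	Free-threaded	Difference
Single thread (3.13)	1.44s	1.86s	~30% slower
Single thread (3.14)	baseline	~9% slower	improved over 3.13
Multithreaded, 4 cores	1.37s	0.39s	3.5x faster
Parallel Fibonacci	1377ms	279ms	~5x faster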

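Key Figures

Python 3.13 single-threaded overhead: about 40% (the commonly cited figure for the pyperformance suite; the single benchmark in the table above shows about 30%)
Python 3.14 single-threaded overhead: about 5-10% (a large improvement)
Multithreaded speedup: close to linear scaling (depending on the workload)

Key point: free-threading costs some single-threaded performance but delivers substantial speedups for multithreaded, CPU-bound work.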
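To make the table figures easier to compare, here is a small sketch that derives speedup, parallel efficiency, and single-threaded overhead from the timings quoted above. The function names are hypothetical, and the 4-core count is taken from the table row label.

```python
def speedup(baseline_s: float, parallel_s: float) -> float:
    """How many times faster the parallel run is than the baseline."""
    return baseline_s / parallel_s


def parallel_efficiency(baseline_s: float, parallel_s: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return speedup(baseline_s, parallel_s) / cores


def overhead(gil_s: float, free_threaded_s: float) -> float:
    """Relative single-threaded slowdown of the free-threaded build."""
    return free_threaded_s / gil_s - 1.0


if __name__ == "__main__":
    # Timings copied from the benchmark table above.
    print(f"4-core speedup:    {speedup(1.37, 0.39):.2f}x")
    print(f"4-core efficiency: {parallel_efficiency(1.37, 0.39, 4):.0%}")
    print(f"3.13 overhead:     {overhead(1.44, 1.86):.1%}")
    print(f"Fibonacci speedup: {speedup(1377, 279):.2f}x")
```

An efficiency close to 1.0 is what "close to linear scaling" means in practice; the 4-core row works out to roughly 88% of the ideal.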

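Choosing When to Use It

Good Fits for Free-Threading

CPU-bound parallel computation: numerical work, data processing
Independent, divisible tasks: batch processing, parallel search
Data science workflows: large-scale data transformation
Scientific computing: simulation, numerical analysis

Poor Fits for Free-Threading

Single-threaded applications: expect a 5-10% performance loss
I/O-bound tasks: traditional threading is already sufficient
Heavy use of C extensions that do not yet support it: importing one can re-enable the GIL
Production environments that require stability: the ecosystem is still maturing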

Practical Examples

Example 1: Checking for Free-Threaded Mode

import sys

def is_free_threaded() -> bool:
    """Return True when running with the GIL disabled."""
    try:
        return not sys._is_gil_enabled()
    except AttributeError:
        # Python 3.12 and earlier do not have this function
        return False

def get_python_build_info() -> dict:
    """Collect information about the running Python build."""
    return {
        "version": sys.version,
        "free_threaded": is_free_threaded(),
        "gil_enabled": getattr(sys, '_is_gil_enabled', lambda: True)(),
    }

if __name__ == "__main__":
    info = get_python_build_info()
    print(f"Python version: {info['version']}")
    print(f"Free-threaded: {info['free_threaded']}")
    print(f"GIL enabled: {info['gil_enabled']}")

Example 2: Parallel CPU Computation

import threading
import time
import sys

def cpu_intensive(n: int) -> int:
    """CPU-bound work: compute a sum of squares."""
    return sum(i * i for i in range(n))

def sequential_compute(numbers: list[int]) -> list[int]:
    """Run the computations one after another."""
    return [cpu_intensive(n) for n in numbers]

def parallel_compute(numbers: list[int]) -> list[int]:
    """Run the computations in parallel, one thread per input."""
    results = [None] * len(numbers)

    def worker(idx: int, n: int):
        # Each thread writes to its own slot, so no lock is needed here
        results[idx] = cpu_intensive(n)

    threads = [
        threading.Thread(target=worker, args=(i, n))
        for i, n in enumerate(numbers)
    ]

    for t in threads:
        t.start()
    for t in threads:
        t.join()

    return results

def benchmark():
    """Compare sequential and parallel timings."""
    numbers = [5_000_000] * 4

    # Sequential run
    start = time.perf_counter()
    sequential_compute(numbers)
    sequential_time = time.perf_counter() - start

    # Parallel run
    start = time.perf_counter()
    parallel_compute(numbers)
    parallel_time = time.perf_counter() - start

    print(f"Sequential: {sequential_time:.3f}s")
    print(f"Parallel: {parallel_time:.3f}s")
    print(f"Speedup: {sequential_time / parallel_time:.2f}x")

    # On traditional Python the speedup stays close to 1 (no improvement)
    # On free-threaded Python it approaches min(number of threads, CPU cores)

if __name__ == "__main__":
    try:
        print(f"GIL enabled: {sys._is_gil_enabled()}")
    except AttributeError:
        print("GIL state: cannot be detected (older Python)")

    benchmark()

Example 3: Using ThreadPoolExecutor

from concurrent.futures import ThreadPoolExecutor, as_completed
import time
import sys

def process_chunk(chunk_id: int, size: int) -> dict:
    """Process one chunk of data."""
    result = sum(i * i for i in range(size))
    return {"chunk_id": chunk_id, "result": result}

def parallel_process(num_chunks: int = 8, chunk_size: int = 2_000_000):
    """Process several chunks of data in parallel."""
    start = time.perf_counter()

    with ThreadPoolExecutor(max_workers=num_chunks) as executor:
        # Map each future back to the chunk it belongs to
        futures = {
            executor.submit(process_chunk, i, chunk_size): i
            for i in range(num_chunks)
        }

        results = []
        for future in as_completed(futures):
            chunk_id = futures[future]
            result = future.result()
            results.append(result)
            print(f"Chunk {chunk_id} done")

    elapsed = time.perf_counter() - start
    print(f"\nTotal time: {elapsed:.3f}s")
    print(f"Wall time per chunk: {elapsed / num_chunks:.3f}s")

    return results

if __name__ == "__main__":
    try:
        print(f"Free-threaded mode: {not sys._is_gil_enabled()}\n")
    except AttributeError:
        print("Traditional Python mode\n")

    parallel_process()
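Example 3 submits chunks that are already independent. When the work is one large range instead, it first has to be partitioned. The following is a minimal sketch of that step, with the hypothetical helpers `split_range` and `parallel_sum_squares`; it assumes the per-element work is pure and needs no shared state.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(total: int, parts: int) -> list[tuple[int, int]]:
    """Split range(total) into `parts` contiguous (start, stop) pairs."""
    step, extra = divmod(total, parts)
    bounds = []
    start = 0
    for i in range(parts):
        # The first `extra` parts take one additional element each.
        stop = start + step + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def sum_squares(bounds: tuple[int, int]) -> int:
    """Sum of squares over one half-open (start, stop) slice."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_squares(total: int, workers: int = 4) -> int:
    """Compute sum(i*i for i in range(total)) using one slice per worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker returns a partial sum; the main thread combines them.
        return sum(pool.map(sum_squares, split_range(total, workers)))


if __name__ == "__main__":
    print(parallel_sum_squares(10_000_000, workers=4))
```

Because each worker only returns a value and the main thread does the combining, no lock is required. The result is the same on any build; only the wall time differs.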

The concurrent.interpreters Module (New in Python 3.14)

Python 3.14 introduces the new concurrent.interpreters module, which offers another approach to parallelism.

What Are Multiple Interpreters?

Multiple interpreters means running several independent Python interpreters inside a single process:

┌────────────────────────────────────────────────────┐
│                   Single process                   │
│  ┌───────────────┐  ┌───────────────┐              │
│  │ Interpreter 1 │  │ Interpreter 2 │              │
│  │ (independent) │  │ (independent) │              │
│  │ sys.path      │  │ sys.path      │  ← isolated  │
│  │ modules       │  │ modules       │              │
│  └───────────────┘  └───────────────┘              │
└────────────────────────────────────────────────────┘

Basic Usage

# InterpreterPoolExecutor lives in concurrent.futures (Python 3.14+)
# and runs each worker in its own interpreter
from concurrent.futures import InterpreterPoolExecutor

def cpu_task(n: int) -> int:
    """Task that runs in a separate interpreter."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    numbers = [1_000_000, 2_000_000, 3_000_000, 4_000_000]

    # Use a pool of interpreters
    with InterpreterPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(cpu_task, numbers))

    print(f"Results: {results}")

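Multiple Interpreters vs Multiprocessing vs Multithreading

Feature	threading	multiprocessing	interpreters
Isolation	Shared memory	Fully isolated	Partially isolated
Resource usage	Lowest	Highest	Medium
Startup speed	Fastest	Slowest	Medium
Communication	Direct access	pickle/Queue	pickle
GIL impact	Limited (traditional) / none (free-threaded)	None	None (each interpreter has its own GIL)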

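When to Use Multiple Interpreters

You need isolation but do not want to pay the cost of multiple processes
You want a CSP- or actor-style concurrency model
You need to run differently configured Python environments within one process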

Known Issues and Pitfalls

Real Cases from GitHub Issues

1. Race condition in pathlib (#139001)

# May misbehave on 3.14t
from pathlib import Path
import threading

path = Path("/some/path")

def check_path():
    # is_dir() may hit a race condition when called from multiple threads
    return path.is_dir()

2. Problems with the click package (#136248)

Code that uses the click package may behave unexpectedly in free-threaded mode.

3. Data races in the buffer interface (#130977)

Take particular care when using memoryview or other buffer-interface objects.

Common Error Patterns

# Wrong: global state without lock protection
# (expensive_compute stands for any slow function of your own)
cache = {}

def get_cached(key):
    if key not in cache:
        cache[key] = expensive_compute(key)  # race: check-then-set is not atomic
    return cache[key]

# Right: protect it with a Lock
import threading

cache = {}
cache_lock = threading.Lock()

def get_cached_safe(key):
    with cache_lock:
        if key not in cache:
            cache[key] = expensive_compute(key)
        return cache[key]

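# Risky: relying on the "implicit" thread safety of built-in types
# (compute and items stand for your own function and input data)
results = []

def worker(n):
    result = compute(n)
    # list.append is internally locked in current free-threaded CPython,
    # but that is an implementation detail rather than a language guarantee
    results.append(result)

# Better: return the result and let the main thread collect it
from concurrent.futures import ThreadPoolExecutor

def worker_safe(n):
    return compute(n)

with ThreadPoolExecutor() as executor:
    results = list(executor.map(worker_safe, items))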

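Package Compatibility (as of Late 2025)

Fully Supported

Package	Version	Notes
NumPy	2.1.0+	Foundation for scientific computing
SciPy	1.15.0+	Scientific computing
pandas	2.2.3+	Data analysis
PyTorch	2.6.0+	Deep learning
scikit-learn	1.6.0+	Machine learning
Pillow	11.0.0+	Image processing
Matplotlib	3.9.0+	Plotting

Partially Supported or In Development

cryptography, h5py, polars
aiohttp, multidict, yarl
Several other aio-libs packages

Not Yet Supported

Specific packages such as lxml and cupy
Some C extension modules

Track the latest status: py-free-threading.github.io/tracking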

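Best Practices and Recommendations

1. Adopt Gradually

import sys

def main():
    # Detect the runtime environment
    free_threaded = not getattr(sys, '_is_gil_enabled', lambda: True)()

    # run_parallel_optimized and run_multiprocess_fallback stand for your own code
    if free_threaded:
        print("Using the free-threaded optimized path")
        run_parallel_optimized()
    else:
        print("Using the traditional multiprocessing path")
        run_multiprocess_fallback()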
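One way to make the two paths above concrete is to choose an executor class at run time and keep the rest of the code identical. This is only a sketch: `pick_executor_class`, `sum_of_squares`, and `run_all` are hypothetical names, and it assumes the work function is defined at module level so that a process pool can pickle it.

```python
import sys
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def pick_executor_class():
    """Return ThreadPoolExecutor when the GIL is off, else ProcessPoolExecutor."""
    gil_check = getattr(sys, "_is_gil_enabled", None)
    if gil_check is not None and not gil_check():
        return ThreadPoolExecutor
    return ProcessPoolExecutor


def sum_of_squares(n: int) -> int:
    """CPU-bound work; module-level so a process pool can pickle it."""
    return sum(i * i for i in range(n))


def run_all(sizes, executor_cls=None, max_workers=4):
    """Run sum_of_squares over all sizes with the chosen executor class."""
    if executor_cls is None:
        executor_cls = pick_executor_class()
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(sum_of_squares, sizes))


if __name__ == "__main__":
    print(f"Selected executor: {pick_executor_class().__name__}")
```

Calling `run_all([1_000_000] * 4)` from under an `if __name__ == "__main__":` guard then uses threads on a free-threaded interpreter and processes elsewhere. The guard matters for the process-based path on platforms that start workers by re-importing the main module.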

2. Use Synchronization Primitives Explicitly

import threading

# Always use a Lock explicitly; do not rely on "probable" thread safety
lock = threading.Lock()

def thread_safe_operation():
    with lock:
        # critical section
        pass
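A read-modify-write such as `value += 1` is the classic case where an explicit lock is needed. The sketch below wraps the lock inside a small class so callers cannot forget it; `LockedCounter` and `hammer` are hypothetical names used only for illustration.

```python
import threading


class LockedCounter:
    """Counter whose updates are protected by an internal lock."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self, amount: int = 1) -> None:
        # The read, add, and write happen as one critical section.
        with self._lock:
            self._value += amount

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def hammer(counter: LockedCounter, n_threads: int = 8, per_thread: int = 10_000) -> int:
    """Increment the counter from several threads and return the final value."""

    def work() -> None:
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(hammer(LockedCounter()))
```

With the lock in place the final value always equals `n_threads * per_thread`. Without it, updates can be lost when threads interleave.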

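3. Testing Strategy

# A shorter thread switch interval helps expose latent race conditions.
# It controls how often the GIL is handed over, so it mainly matters while
# the GIL is enabled; with the GIL disabled, threads already run in parallel.
import sys
sys.setswitchinterval(0.0001)  # use during testing

# Run a large number of concurrent calls
import concurrent.futures

def stress_test(func, iterations=1000):
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        futures = [executor.submit(func) for _ in range(iterations)]
        for f in concurrent.futures.as_completed(futures):
            f.result()  # re-raises any exception from the worker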
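A thread pool starts its workers gradually, which can hide races that need several threads to reach the same line at once. As a complement to the stress test above, the following sketch uses `threading.Barrier` to release all threads together. The helper name `run_simultaneously` is hypothetical.

```python
import threading


def run_simultaneously(func, n_threads: int = 8) -> list:
    """Release n_threads at the same moment and collect any exceptions raised."""
    barrier = threading.Barrier(n_threads)
    errors = []
    errors_lock = threading.Lock()

    def runner() -> None:
        # Every thread blocks here until all n_threads have arrived.
        barrier.wait()
        try:
            func()
        except Exception as exc:
            with errors_lock:
                errors.append(exc)

    threads = [threading.Thread(target=runner) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


if __name__ == "__main__":
    failures = run_simultaneously(lambda: None)
    print(f"{len(failures)} thread(s) raised an exception")
```

Calling the helper in a loop, with the function under test performing its own assertions, tends to increase contention compared with submitting tasks one at a time.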

4. Check Dependency Compatibility

def check_dependencies():
    """Report installed versions of key dependencies, for comparison with the compatibility table."""
    import importlib.metadata

    packages_to_check = ['numpy', 'pandas', 'scikit-learn']
    results = {}

    for pkg in packages_to_check:
        try:
            version = importlib.metadata.version(pkg)
            results[pkg] = version
        except importlib.metadata.PackageNotFoundError:
            results[pkg] = "not installed"

    return results
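Version numbers only tell part of the story. On a free-threaded build, importing an extension that does not declare free-threading support can switch the GIL back on. A rough way to detect this is to compare the GIL state before and after the import, as sketched below; `import_and_check_gil` is a hypothetical helper, and on builds that always have a GIL it simply reports no change.

```python
import importlib
import sys


def import_and_check_gil(module_name: str) -> dict:
    """Import a module and report whether the import re-enabled the GIL."""
    # Fall back to "GIL always on" for Python versions without the check.
    gil_check = getattr(sys, "_is_gil_enabled", lambda: True)

    gil_before = bool(gil_check())
    importlib.import_module(module_name)  # raises ModuleNotFoundError if missing
    gil_after = bool(gil_check())

    return {
        "module": module_name,
        "gil_before": gil_before,
        "gil_after": gil_after,
        # Only meaningful when the GIL was off before the import.
        "reenabled_gil": (not gil_before) and gil_after,
    }


if __name__ == "__main__":
    for name in ["json", "sqlite3"]:
        print(import_and_check_gil(name))
```

The check is most informative when run in a fresh interpreter, once per dependency, since a module that is already imported cannot change the GIL state again.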

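Outlook

Roadmap

Python 3.15: free-threading is expected to become an optional default (still to be decided)
Python 3.16: may become the true default build
Long term: the GIL may be removed entirely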

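A Call from the Community

"The free-threaded build is the future of the language. What we need at this stage is more feedback from real-world workflows." - Quansight Labs

If you are using free-threaded Python, you are encouraged to:

Report issues to the Python bug tracker
Join the discussion on the py-free-threading Discord
Test your packages and submit compatibility reports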

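Questions to Consider

Why does free-threading lose performance in single-threaded code? How was that loss reduced from 40% to 9%?
When should you use InterpreterPoolExecutor instead of ThreadPoolExecutor?
If your program depends on a package that does not yet support free-threading, what alternatives do you have?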

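Exercises

Write a program that compares multithreaded performance under traditional Python and free-threaded Python
Implement a simple task queue system with InterpreterPoolExecutor
Add free-threading support to an existing single-threaded program and resolve its thread-safety issues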

Prerequisites

Introductory series, Concurrency: threading and multiprocessing basics
