<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Multiprocessing on Tarragon</title><link>https://tarrragon.github.io/blog/tags/multiprocessing/</link><description>Recent content in Multiprocessing on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Wed, 21 Jan 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/multiprocessing/index.xml" rel="self" type="application/rss+xml"/><item><title>8.1 並行處理實戰</title><link>https://tarrragon.github.io/blog/python-advanced/08-practical-optimization/parallel-processing/</link><pubDate>Wed, 21 Jan 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/python-advanced/08-practical-optimization/parallel-processing/</guid><description>&lt;p>本章將入門系列學到的 &lt;code>concurrent.futures&lt;/code> 知識，應用於 &lt;code>.claude/lib&lt;/code> 中的真實程式碼，展示如何識別並行化機會、實作並行版本，以及測量效能差異。&lt;/p>
&lt;h2 id="學習目標">學習目標&lt;/h2>
&lt;p>完成本章後，你將能夠：&lt;/p>
&lt;ol>
&lt;li>&lt;strong>識別並行化機會&lt;/strong>：判斷一段程式碼是否適合並行化&lt;/li>
&lt;li>&lt;strong>應用 ThreadPoolExecutor&lt;/strong>：將循序處理改寫為並行版本&lt;/li>
&lt;li>&lt;strong>實作進度報告&lt;/strong>：使用 &lt;code>as_completed()&lt;/code> 追蹤任務完成狀態&lt;/li>
&lt;li>&lt;strong>處理並行錯誤&lt;/strong>：避免單一任務失敗影響整體執行&lt;/li>
&lt;li>&lt;strong>測量效能差異&lt;/strong>：用數據證明優化效果&lt;/li>
&lt;/ol>
&lt;h2 id="先備知識">先備知識&lt;/h2>
&lt;p>本章假設你已經讀過入門系列的並行處理章節：&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/python/03-stdlib/concurrency/" data-link-title="3.7 並行處理 - threading、multiprocessing、concurrent.futures" data-link-desc="Python 並行處理的三種方式與選擇指南">3.7 並行處理&lt;/a> - &lt;code>concurrent.futures&lt;/code> 基本用法&lt;/li>
&lt;/ul>
&lt;p>如果你對 &lt;code>ThreadPoolExecutor&lt;/code>、&lt;code>executor.map()&lt;/code>、&lt;code>as_completed()&lt;/code> 還不熟悉，建議先閱讀該章節。&lt;/p>
&lt;hr>
&lt;h2 id="識別並行化機會">識別並行化機會&lt;/h2>
&lt;h3 id="io-密集-vs-cpu-密集快速判斷法">I/O 密集 vs CPU 密集：快速判斷法&lt;/h3>
&lt;p>在入門系列中，我們學到 I/O 密集任務適合 &lt;code>ThreadPoolExecutor&lt;/code>。但實際程式碼中，如何快速判斷？&lt;/p>
&lt;h4 id="問自己這個問題程式在等什麼">問自己這個問題：程式在等什麼？&lt;/h4>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># 模式 1：等待外部資源（I/O 密集）&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="n">response&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">requests&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">url&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1"># 等網路&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="n">content&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">file&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">read&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="c1"># 等磁碟&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="n">result&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">cursor&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">execute&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">query&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1"># 等資料庫&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="c1"># 模式 2：純計算（CPU 密集）&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">&lt;span class="n">result&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nb">sum&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">i&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="n">i&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">i&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="nb">range&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="mi">10_000_000&lt;/span>&lt;span class="p">))&lt;/span> &lt;span class="c1"># 沒有等待，純運算&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="獨立任務的識別">獨立任務的識別&lt;/h3>
&lt;p>並行化的前提是&lt;strong>任務之間互相獨立&lt;/strong>。識別方法：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">獨立性檢查清單：
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">[ ] 任務之間沒有共享可變狀態？
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">[ ] 任務執行順序不影響結果？
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">[ ] 任務 B 不依賴任務 A 的輸出？
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">如果三個都勾選，就適合並行化。&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="真實案例markdown_link_checkerpy">真實案例：markdown_link_checker.py&lt;/h3>
&lt;p>讓我們看一個真實的例子。以下是 &lt;code>markdown_link_checker.py&lt;/code> 中的 &lt;code>check_directory()&lt;/code> 方法：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="c1"># 原始版本：循序處理&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">check_directory&lt;/span>&lt;span class="p">(&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl"> &lt;span class="bp">self&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"> &lt;span class="n">dir_path&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl"> &lt;span class="n">recursive&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">bool&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kc">True&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">&lt;span class="p">)&lt;/span> &lt;span class="o">-&amp;gt;&lt;/span> &lt;span class="n">List&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">LinkCheckResult&lt;/span>&lt;span class="p">]:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl"> &lt;span class="s2">&amp;#34;&amp;#34;&amp;#34;檢查目錄下所有 Markdown 檔案&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl"> &lt;span class="n">dir_path&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">_resolve_path&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dir_path&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl"> &lt;span class="c1"># 收集所有 .md 檔案&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">recursive&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl"> &lt;span class="n">md_files&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nb">sorted&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dir_path&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">rglob&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;*.md&amp;#34;&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl"> &lt;span class="k">else&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl"> &lt;span class="n">md_files&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nb">sorted&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dir_path&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">glob&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;*.md&amp;#34;&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl"> &lt;span class="c1"># 循序檢查每個檔案&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl"> &lt;span class="n">results&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">18&lt;/span>&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="n">md_file&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">md_files&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">19&lt;/span>&lt;span class="cl"> &lt;span class="n">results&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">append&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">check_file&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">str&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">md_file&lt;/span>&lt;span class="p">)))&lt;/span> &lt;span class="c1"># 一個接一個&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">20&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">21&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">results&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>這段程式碼適合並行化嗎？讓我們用檢查清單分析：&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>問題&lt;/th>
 &lt;th>分析&lt;/th>
 &lt;th>結論&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>任務之間有共享狀態？&lt;/td>
 &lt;td>&lt;code>results&lt;/code> 只在主執行緒操作&lt;/td>
 &lt;td>沒有&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>執行順序影響結果？&lt;/td>
 &lt;td>檢查檔案 A 不影響檔案 B&lt;/td>
 &lt;td>不影響&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>任務互相依賴？&lt;/td>
 &lt;td>每個檔案獨立檢查&lt;/td>
 &lt;td>不依賴&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>是 I/O 密集？&lt;/td>
 &lt;td>&lt;code>check_file()&lt;/code> 讀取檔案內容&lt;/td>
 &lt;td>是&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>結論：&lt;strong>非常適合並行化&lt;/strong>。&lt;/p>
&lt;hr>
&lt;h2 id="threadpoolexecutor-實戰">ThreadPoolExecutor 實戰&lt;/h2>
&lt;h3 id="步驟-1確認可並行化的函式">步驟 1：確認可並行化的函式&lt;/h3>
&lt;p>首先，確認被並行化的函式是「純函式」或至少沒有副作用：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">check_file&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">file_path&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">-&amp;gt;&lt;/span> &lt;span class="n">LinkCheckResult&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl"> &lt;span class="s2">&amp;#34;&amp;#34;&amp;#34;
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="s2"> 檢查單個 Markdown 檔案的連結
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">&lt;span class="s2">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">&lt;span class="s2"> 特性：
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">&lt;span class="s2"> - 輸入：檔案路徑
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">&lt;span class="s2"> - 輸出：檢查結果
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">&lt;span class="s2"> - 沒有修改外部狀態
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">&lt;span class="s2"> - 沒有依賴執行順序
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="s2"> &amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl"> &lt;span class="c1"># ... 實作省略&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>check_file()&lt;/code> 滿足條件：&lt;/p>
&lt;ul>
&lt;li>只讀取檔案，不修改&lt;/li>
&lt;li>返回獨立的結果物件&lt;/li>
&lt;li>不依賴其他檔案的結果&lt;/li>
&lt;/ul>
&lt;h3 id="步驟-2改寫為並行版本">步驟 2：改寫為並行版本&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">concurrent.futures&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">ThreadPoolExecutor&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">as_completed&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">check_directory_parallel&lt;/span>&lt;span class="p">(&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"> &lt;span class="bp">self&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl"> &lt;span class="n">dir_path&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl"> &lt;span class="n">recursive&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">bool&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kc">True&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl"> &lt;span class="n">max_workers&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">int&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">8&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">&lt;span class="p">)&lt;/span> &lt;span class="o">-&amp;gt;&lt;/span> &lt;span class="n">List&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">LinkCheckResult&lt;/span>&lt;span class="p">]:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl"> &lt;span class="s2">&amp;#34;&amp;#34;&amp;#34;並行檢查目錄下所有 Markdown 檔案&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl"> &lt;span class="n">dir_path&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">_resolve_path&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dir_path&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl"> &lt;span class="c1"># 收集所有 .md 檔案&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">recursive&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl"> &lt;span class="n">md_files&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nb">sorted&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dir_path&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">rglob&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;*.md&amp;#34;&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl"> &lt;span class="k">else&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl"> &lt;span class="n">md_files&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nb">sorted&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">dir_path&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">glob&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;*.md&amp;#34;&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">18&lt;/span>&lt;span class="cl"> &lt;span class="c1"># 並行檢查&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">19&lt;/span>&lt;span class="cl"> &lt;span class="n">results&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">20&lt;/span>&lt;span class="cl"> &lt;span class="k">with&lt;/span> &lt;span class="n">ThreadPoolExecutor&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">max_workers&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">max_workers&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">as&lt;/span> &lt;span class="n">executor&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">21&lt;/span>&lt;span class="cl"> &lt;span class="c1"># 提交所有任務&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">22&lt;/span>&lt;span class="cl"> &lt;span class="n">futures&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">23&lt;/span>&lt;span class="cl"> &lt;span class="n">executor&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">submit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">check_file&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">md_file&lt;/span>&lt;span class="p">)):&lt;/span> &lt;span class="n">md_file&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">24&lt;/span>&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="n">md_file&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">md_files&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">25&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">26&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">27&lt;/span>&lt;span class="cl"> &lt;span class="c1"># 收集結果&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">28&lt;/span>&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="n">future&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">as_completed&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">futures&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">29&lt;/span>&lt;span class="cl"> &lt;span class="n">results&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">append&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">future&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">result&lt;/span>&lt;span class="p">())&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">30&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">31&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">results&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="步驟-3選擇-max_workers">步驟 3：選擇 max_workers&lt;/h3>
&lt;p>&lt;code>max_workers&lt;/code> 的選擇影響效能：&lt;/p></description><content:encoded><![CDATA[<p>本章將入門系列學到的 <code>concurrent.futures</code> 知識，應用於 <code>.claude/lib</code> 中的真實程式碼，展示如何識別並行化機會、實作並行版本，以及測量效能差異。</p>
<h2 id="學習目標">學習目標</h2>
<p>完成本章後，你將能夠：</p>
<ol>
<li><strong>識別並行化機會</strong>：判斷一段程式碼是否適合並行化</li>
<li><strong>應用 ThreadPoolExecutor</strong>：將循序處理改寫為並行版本</li>
<li><strong>實作進度報告</strong>：使用 <code>as_completed()</code> 追蹤任務完成狀態</li>
<li><strong>處理並行錯誤</strong>：避免單一任務失敗影響整體執行</li>
<li><strong>測量效能差異</strong>：用數據證明優化效果</li>
</ol>
<h2 id="先備知識">先備知識</h2>
<p>本章假設你已經讀過入門系列的並行處理章節：</p>
<ul>
<li><a href="/blog/python/03-stdlib/concurrency/" data-link-title="3.7 並行處理 - threading、multiprocessing、concurrent.futures" data-link-desc="Python 並行處理的三種方式與選擇指南">3.7 並行處理</a> - <code>concurrent.futures</code> 基本用法</li>
</ul>
<p>如果你對 <code>ThreadPoolExecutor</code>、<code>executor.map()</code>、<code>as_completed()</code> 還不熟悉，建議先閱讀該章節。</p>
<hr>
<h2 id="識別並行化機會">識別並行化機會</h2>
<h3 id="io-密集-vs-cpu-密集快速判斷法">I/O 密集 vs CPU 密集：快速判斷法</h3>
<p>在入門系列中，我們學到 I/O 密集任務適合 <code>ThreadPoolExecutor</code>。但實際程式碼中，如何快速判斷？</p>
<h4 id="問自己這個問題程式在等什麼">問自己這個問題：程式在等什麼？</h4>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 模式 1：等待外部資源（I/O 密集）</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>      <span class="c1"># 等網路</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="n">content</span> <span class="o">=</span> <span class="n">file</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>             <span class="c1"># 等磁碟</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="n">result</span> <span class="o">=</span> <span class="n">cursor</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>    <span class="c1"># 等資料庫</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">
</span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"># 模式 2：純計算（CPU 密集）</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="n">result</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">i</span> <span class="o">*</span> <span class="n">i</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10_000_000</span><span class="p">))</span>  <span class="c1"># 沒有等待，純運算</span></span></span></code></pre></div><h3 id="獨立任務的識別">獨立任務的識別</h3>
<p>並行化的前提是<strong>任務之間互相獨立</strong>。識別方法：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">獨立性檢查清單：
</span></span><span class="line"><span class="ln">2</span><span class="cl">[ ] 任務之間沒有共享可變狀態？
</span></span><span class="line"><span class="ln">3</span><span class="cl">[ ] 任務執行順序不影響結果？
</span></span><span class="line"><span class="ln">4</span><span class="cl">[ ] 任務 B 不依賴任務 A 的輸出？
</span></span><span class="line"><span class="ln">5</span><span class="cl">
</span></span><span class="line"><span class="ln">6</span><span class="cl">如果三個都勾選，就適合並行化。</span></span></code></pre></div><h3 id="真實案例markdown_link_checkerpy">真實案例：markdown_link_checker.py</h3>
<p>讓我們看一個真實的例子。以下是 <code>markdown_link_checker.py</code> 中的 <code>check_directory()</code> 方法：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># 原始版本：循序處理</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="k">def</span> <span class="nf">check_directory</span><span class="p">(</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">    <span class="bp">self</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="n">dir_path</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="n">recursive</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">True</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">List</span><span class="p">[</span><span class="n">LinkCheckResult</span><span class="p">]:</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    <span class="s2">&#34;&#34;&#34;檢查目錄下所有 Markdown 檔案&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">    <span class="n">dir_path</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_resolve_path</span><span class="p">(</span><span class="n">dir_path</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">
</span></span><span class="line"><span class="ln">10</span><span class="cl">    <span class="c1"># 收集所有 .md 檔案</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">    <span class="k">if</span> <span class="n">recursive</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">        <span class="n">md_files</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">dir_path</span><span class="o">.</span><span class="n">rglob</span><span class="p">(</span><span class="s2">&#34;*.md&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">    <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">        <span class="n">md_files</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">dir_path</span><span class="o">.</span><span class="n">glob</span><span class="p">(</span><span class="s2">&#34;*.md&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">
</span></span><span class="line"><span class="ln">16</span><span class="cl">    <span class="c1"># 循序檢查每個檔案</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">    <span class="n">results</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="ln">18</span><span class="cl">    <span class="k">for</span> <span class="n">md_file</span> <span class="ow">in</span> <span class="n">md_files</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl">        <span class="n">results</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">check_file</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">md_file</span><span class="p">)))</span>  <span class="c1"># 一個接一個</span>
</span></span><span class="line"><span class="ln">20</span><span class="cl">
</span></span><span class="line"><span class="ln">21</span><span class="cl">    <span class="k">return</span> <span class="n">results</span></span></span></code></pre></div><p>這段程式碼適合並行化嗎？讓我們用檢查清單分析：</p>
<table>
  <thead>
      <tr>
          <th>問題</th>
          <th>分析</th>
          <th>結論</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>任務之間有共享狀態？</td>
          <td><code>results</code> 只在主執行緒操作</td>
          <td>沒有</td>
      </tr>
      <tr>
          <td>執行順序影響結果？</td>
          <td>檢查檔案 A 不影響檔案 B</td>
          <td>不影響</td>
      </tr>
      <tr>
          <td>任務互相依賴？</td>
          <td>每個檔案獨立檢查</td>
          <td>不依賴</td>
      </tr>
      <tr>
          <td>是 I/O 密集？</td>
          <td><code>check_file()</code> 讀取檔案內容</td>
          <td>是</td>
      </tr>
  </tbody>
</table>
<p>結論：<strong>非常適合並行化</strong>。</p>
<hr>
<h2 id="threadpoolexecutor-實戰">ThreadPoolExecutor 實戰</h2>
<h3 id="步驟-1確認可並行化的函式">步驟 1：確認可並行化的函式</h3>
<p>首先，確認被並行化的函式是「純函式」或至少沒有副作用：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">def</span> <span class="nf">check_file</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">file_path</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">LinkCheckResult</span><span class="p">:</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="s2">    檢查單個 Markdown 檔案的連結
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="s2">    特性：
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="s2">    - 輸入：檔案路徑
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="s2">    - 輸出：檢查結果
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="s2">    - 沒有修改外部狀態
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="s2">    - 沒有依賴執行順序
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">    <span class="c1"># ... 實作省略</span></span></span></code></pre></div><p><code>check_file()</code> 滿足條件：</p>
<ul>
<li>只讀取檔案，不修改</li>
<li>返回獨立的結果物件</li>
<li>不依賴其他檔案的結果</li>
</ul>
<h3 id="步驟-2改寫為並行版本">步驟 2：改寫為並行版本</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kn">from</span> <span class="nn">concurrent.futures</span> <span class="kn">import</span> <span class="n">ThreadPoolExecutor</span><span class="p">,</span> <span class="n">as_completed</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="k">def</span> <span class="nf">check_directory_parallel</span><span class="p">(</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="bp">self</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="n">dir_path</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="n">recursive</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">True</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    <span class="n">max_workers</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">8</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">List</span><span class="p">[</span><span class="n">LinkCheckResult</span><span class="p">]:</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">    <span class="s2">&#34;&#34;&#34;並行檢查目錄下所有 Markdown 檔案&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">    <span class="n">dir_path</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_resolve_path</span><span class="p">(</span><span class="n">dir_path</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="c1"># 收集所有 .md 檔案</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">    <span class="k">if</span> <span class="n">recursive</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">        <span class="n">md_files</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">dir_path</span><span class="o">.</span><span class="n">rglob</span><span class="p">(</span><span class="s2">&#34;*.md&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">    <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl">        <span class="n">md_files</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">dir_path</span><span class="o">.</span><span class="n">glob</span><span class="p">(</span><span class="s2">&#34;*.md&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">
</span></span><span class="line"><span class="ln">18</span><span class="cl">    <span class="c1"># 並行檢查</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl">    <span class="n">results</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="ln">20</span><span class="cl">    <span class="k">with</span> <span class="n">ThreadPoolExecutor</span><span class="p">(</span><span class="n">max_workers</span><span class="o">=</span><span class="n">max_workers</span><span class="p">)</span> <span class="k">as</span> <span class="n">executor</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">21</span><span class="cl">        <span class="c1"># 提交所有任務</span>
</span></span><span class="line"><span class="ln">22</span><span class="cl">        <span class="n">futures</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">23</span><span class="cl">            <span class="n">executor</span><span class="o">.</span><span class="n">submit</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">check_file</span><span class="p">,</span> <span class="nb">str</span><span class="p">(</span><span class="n">md_file</span><span class="p">)):</span> <span class="n">md_file</span>
</span></span><span class="line"><span class="ln">24</span><span class="cl">            <span class="k">for</span> <span class="n">md_file</span> <span class="ow">in</span> <span class="n">md_files</span>
</span></span><span class="line"><span class="ln">25</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">26</span><span class="cl">
</span></span><span class="line"><span class="ln">27</span><span class="cl">        <span class="c1"># 收集結果</span>
</span></span><span class="line"><span class="ln">28</span><span class="cl">        <span class="k">for</span> <span class="n">future</span> <span class="ow">in</span> <span class="n">as_completed</span><span class="p">(</span><span class="n">futures</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">29</span><span class="cl">            <span class="n">results</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">future</span><span class="o">.</span><span class="n">result</span><span class="p">())</span>
</span></span><span class="line"><span class="ln">30</span><span class="cl">
</span></span><span class="line"><span class="ln">31</span><span class="cl">    <span class="k">return</span> <span class="n">results</span></span></span></code></pre></div><h3 id="步驟-3選擇-max_workers">步驟 3：選擇 max_workers</h3>
<p><code>max_workers</code> 的選擇影響效能：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kn">import</span> <span class="nn">os</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="c1"># I/O 密集任務：可以設定較高</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="c1"># 經驗法則：CPU 核心數的 2-4 倍</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="n">max_workers</span> <span class="o">=</span> <span class="nb">min</span><span class="p">(</span><span class="mi">32</span><span class="p">,</span> <span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">cpu_count</span><span class="p">()</span> <span class="ow">or</span> <span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="mi">4</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="c1"># 或根據檔案數量動態調整</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="k">def</span> <span class="nf">get_optimal_workers</span><span class="p">(</span><span class="n">file_count</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">int</span><span class="p">:</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">    <span class="s2">&#34;&#34;&#34;根據檔案數量決定 worker 數量&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">    <span class="n">cpu_count</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">cpu_count</span><span class="p">()</span> <span class="ow">or</span> <span class="mi">1</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="c1"># 檔案少時不需要太多 worker</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">    <span class="k">if</span> <span class="n">file_count</span> <span class="o">&lt;</span> <span class="mi">10</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">        <span class="k">return</span> <span class="nb">min</span><span class="p">(</span><span class="n">file_count</span><span class="p">,</span> <span class="n">cpu_count</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">
</span></span><span class="line"><span class="ln">16</span><span class="cl">    <span class="c1"># 檔案多時，I/O 密集可以多開一些</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">    <span class="k">return</span> <span class="nb">min</span><span class="p">(</span><span class="mi">32</span><span class="p">,</span> <span class="n">cpu_count</span> <span class="o">*</span> <span class="mi">2</span><span class="p">)</span></span></span></code></pre></div><h4 id="為什麼-io-密集可以超過-cpu-核心數">為什麼 I/O 密集可以超過 CPU 核心數？</h4>
<p>因為執行緒在等待 I/O 時會釋放 GIL，其他執行緒可以繼續執行。如果有 8 個核心，但每個任務有 80% 時間在等待 I/O，那開 16-32 個 worker 可以更充分利用 CPU。</p>
<hr>
<h2 id="進度報告與錯誤處理">進度報告與錯誤處理</h2>
<h3 id="使用-as_completed-報告進度">使用 as_completed() 報告進度</h3>
<p><code>as_completed()</code> 返回一個迭代器，任務完成時立即 yield，適合顯示進度：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kn">from</span> <span class="nn">concurrent.futures</span> <span class="kn">import</span> <span class="n">ThreadPoolExecutor</span><span class="p">,</span> <span class="n">as_completed</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Callable</span><span class="p">,</span> <span class="n">Optional</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="k">def</span> <span class="nf">check_directory_with_progress</span><span class="p">(</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="bp">self</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="n">dir_path</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    <span class="n">recursive</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">True</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">    <span class="n">max_workers</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">8</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">    <span class="n">progress_callback</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Callable</span><span class="p">[[</span><span class="nb">int</span><span class="p">,</span> <span class="nb">int</span><span class="p">,</span> <span class="nb">str</span><span class="p">],</span> <span class="kc">None</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">List</span><span class="p">[</span><span class="n">LinkCheckResult</span><span class="p">]:</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="s2">    並行檢查目錄，支援進度回報
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="s2">    Args:
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="s2">        progress_callback: 回呼函式 (completed, total, current_file)
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">    <span class="n">dir_path</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_resolve_path</span><span class="p">(</span><span class="n">dir_path</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">18</span><span class="cl">
</span></span><span class="line"><span class="ln">19</span><span class="cl">    <span class="k">if</span> <span class="n">recursive</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">20</span><span class="cl">        <span class="n">md_files</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">dir_path</span><span class="o">.</span><span class="n">rglob</span><span class="p">(</span><span class="s2">&#34;*.md&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">21</span><span class="cl">    <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">22</span><span class="cl">        <span class="n">md_files</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">dir_path</span><span class="o">.</span><span class="n">glob</span><span class="p">(</span><span class="s2">&#34;*.md&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">23</span><span class="cl">
</span></span><span class="line"><span class="ln">24</span><span class="cl">    <span class="n">total</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">md_files</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">25</span><span class="cl">    <span class="n">results</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="ln">26</span><span class="cl">
</span></span><span class="line"><span class="ln">27</span><span class="cl">    <span class="k">with</span> <span class="n">ThreadPoolExecutor</span><span class="p">(</span><span class="n">max_workers</span><span class="o">=</span><span class="n">max_workers</span><span class="p">)</span> <span class="k">as</span> <span class="n">executor</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">28</span><span class="cl">        <span class="c1"># 建立 future -&gt; 檔案 的映射</span>
</span></span><span class="line"><span class="ln">29</span><span class="cl">        <span class="n">futures</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">30</span><span class="cl">            <span class="n">executor</span><span class="o">.</span><span class="n">submit</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">check_file</span><span class="p">,</span> <span class="nb">str</span><span class="p">(</span><span class="n">f</span><span class="p">)):</span> <span class="n">f</span>
</span></span><span class="line"><span class="ln">31</span><span class="cl">            <span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">md_files</span>
</span></span><span class="line"><span class="ln">32</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">33</span><span class="cl">
</span></span><span class="line"><span class="ln">34</span><span class="cl">        <span class="n">completed</span> <span class="o">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="ln">35</span><span class="cl">        <span class="k">for</span> <span class="n">future</span> <span class="ow">in</span> <span class="n">as_completed</span><span class="p">(</span><span class="n">futures</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">36</span><span class="cl">            <span class="n">completed</span> <span class="o">+=</span> <span class="mi">1</span>
</span></span><span class="line"><span class="ln">37</span><span class="cl">            <span class="n">current_file</span> <span class="o">=</span> <span class="n">futures</span><span class="p">[</span><span class="n">future</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">38</span><span class="cl">
</span></span><span class="line"><span class="ln">39</span><span class="cl">            <span class="c1"># 回報進度</span>
</span></span><span class="line"><span class="ln">40</span><span class="cl">            <span class="k">if</span> <span class="n">progress_callback</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">41</span><span class="cl">                <span class="n">progress_callback</span><span class="p">(</span><span class="n">completed</span><span class="p">,</span> <span class="n">total</span><span class="p">,</span> <span class="nb">str</span><span class="p">(</span><span class="n">current_file</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">42</span><span class="cl">
</span></span><span class="line"><span class="ln">43</span><span class="cl">            <span class="n">results</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">future</span><span class="o">.</span><span class="n">result</span><span class="p">())</span>
</span></span><span class="line"><span class="ln">44</span><span class="cl">
</span></span><span class="line"><span class="ln">45</span><span class="cl">    <span class="k">return</span> <span class="n">results</span>
</span></span><span class="line"><span class="ln">46</span><span class="cl">
</span></span><span class="line"><span class="ln">47</span><span class="cl"><span class="c1"># 使用範例</span>
</span></span><span class="line"><span class="ln">48</span><span class="cl"><span class="k">def</span> <span class="nf">print_progress</span><span class="p">(</span><span class="n">completed</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">total</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">current_file</span><span class="p">:</span> <span class="nb">str</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">49</span><span class="cl">    <span class="n">percent</span> <span class="o">=</span> <span class="p">(</span><span class="n">completed</span> <span class="o">/</span> <span class="n">total</span><span class="p">)</span> <span class="o">*</span> <span class="mi">100</span>
</span></span><span class="line"><span class="ln">50</span><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;</span><span class="se">\r</span><span class="s2">[</span><span class="si">{</span><span class="n">completed</span><span class="si">}</span><span class="s2">/</span><span class="si">{</span><span class="n">total</span><span class="si">}</span><span class="s2">] </span><span class="si">{</span><span class="n">percent</span><span class="si">:</span><span class="s2">.1f</span><span class="si">}</span><span class="s2">% - </span><span class="si">{</span><span class="n">current_file</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">,</span> <span class="n">end</span><span class="o">=</span><span class="s2">&#34;&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">51</span><span class="cl">
</span></span><span class="line"><span class="ln">52</span><span class="cl"><span class="n">checker</span> <span class="o">=</span> <span class="n">MarkdownLinkChecker</span><span class="p">()</span>
</span></span><span class="line"><span class="ln">53</span><span class="cl"><span class="n">results</span> <span class="o">=</span> <span class="n">checker</span><span class="o">.</span><span class="n">check_directory_with_progress</span><span class="p">(</span>
</span></span><span class="line"><span class="ln">54</span><span class="cl">    <span class="s2">&#34;docs/&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">55</span><span class="cl">    <span class="n">progress_callback</span><span class="o">=</span><span class="n">print_progress</span>
</span></span><span class="line"><span class="ln">56</span><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="ln">57</span><span class="cl"><span class="nb">print</span><span class="p">()</span>  <span class="c1"># 換行</span></span></span></code></pre></div><h3 id="錯誤處理不讓單一失敗拖垮全部">錯誤處理：不讓單一失敗拖垮全部</h3>
<p>並行處理時，單一任務的異常不應該中斷其他任務：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">def</span> <span class="nf">check_directory_robust</span><span class="p">(</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="bp">self</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">    <span class="n">dir_path</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="n">max_workers</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">8</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Tuple</span><span class="p">[</span><span class="n">List</span><span class="p">[</span><span class="n">LinkCheckResult</span><span class="p">],</span> <span class="n">List</span><span class="p">[</span><span class="n">Dict</span><span class="p">]]:</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="s2">&#34;&#34;&#34;
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="s2">    並行檢查，分開返回成功結果和錯誤
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="s2">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="s2">    Returns:
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="s2">        (成功的結果列表, 錯誤資訊列表)
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="s2">    &#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="n">dir_path</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_resolve_path</span><span class="p">(</span><span class="n">dir_path</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">    <span class="n">md_files</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">dir_path</span><span class="o">.</span><span class="n">rglob</span><span class="p">(</span><span class="s2">&#34;*.md&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">
</span></span><span class="line"><span class="ln">15</span><span class="cl">    <span class="n">results</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl">    <span class="n">errors</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">
</span></span><span class="line"><span class="ln">18</span><span class="cl">    <span class="k">with</span> <span class="n">ThreadPoolExecutor</span><span class="p">(</span><span class="n">max_workers</span><span class="o">=</span><span class="n">max_workers</span><span class="p">)</span> <span class="k">as</span> <span class="n">executor</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl">        <span class="n">futures</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">20</span><span class="cl">            <span class="n">executor</span><span class="o">.</span><span class="n">submit</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">check_file</span><span class="p">,</span> <span class="nb">str</span><span class="p">(</span><span class="n">f</span><span class="p">)):</span> <span class="n">f</span>
</span></span><span class="line"><span class="ln">21</span><span class="cl">            <span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">md_files</span>
</span></span><span class="line"><span class="ln">22</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">23</span><span class="cl">
</span></span><span class="line"><span class="ln">24</span><span class="cl">        <span class="k">for</span> <span class="n">future</span> <span class="ow">in</span> <span class="n">as_completed</span><span class="p">(</span><span class="n">futures</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">25</span><span class="cl">            <span class="n">file_path</span> <span class="o">=</span> <span class="n">futures</span><span class="p">[</span><span class="n">future</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">26</span><span class="cl">            <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">27</span><span class="cl">                <span class="n">result</span> <span class="o">=</span> <span class="n">future</span><span class="o">.</span><span class="n">result</span><span class="p">()</span>
</span></span><span class="line"><span class="ln">28</span><span class="cl">                <span class="n">results</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">29</span><span class="cl">            <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">30</span><span class="cl">                <span class="c1"># 記錄錯誤，繼續處理其他檔案</span>
</span></span><span class="line"><span class="ln">31</span><span class="cl">                <span class="n">errors</span><span class="o">.</span><span class="n">append</span><span class="p">({</span>
</span></span><span class="line"><span class="ln">32</span><span class="cl">                    <span class="s2">&#34;file&#34;</span><span class="p">:</span> <span class="nb">str</span><span class="p">(</span><span class="n">file_path</span><span class="p">),</span>
</span></span><span class="line"><span class="ln">33</span><span class="cl">                    <span class="s2">&#34;error&#34;</span><span class="p">:</span> <span class="nb">str</span><span class="p">(</span><span class="n">e</span><span class="p">),</span>
</span></span><span class="line"><span class="ln">34</span><span class="cl">                    <span class="s2">&#34;type&#34;</span><span class="p">:</span> <span class="nb">type</span><span class="p">(</span><span class="n">e</span><span class="p">)</span><span class="o">.</span><span class="vm">__name__</span>
</span></span><span class="line"><span class="ln">35</span><span class="cl">                <span class="p">})</span>
</span></span><span class="line"><span class="ln">36</span><span class="cl">
</span></span><span class="line"><span class="ln">37</span><span class="cl">    <span class="k">return</span> <span class="n">results</span><span class="p">,</span> <span class="n">errors</span>
</span></span><span class="line"><span class="ln">38</span><span class="cl">
</span></span><span class="line"><span class="ln">39</span><span class="cl"><span class="c1"># 使用範例</span>
</span></span><span class="line"><span class="ln">40</span><span class="cl"><span class="n">results</span><span class="p">,</span> <span class="n">errors</span> <span class="o">=</span> <span class="n">checker</span><span class="o">.</span><span class="n">check_directory_robust</span><span class="p">(</span><span class="s2">&#34;docs/&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">41</span><span class="cl">
</span></span><span class="line"><span class="ln">42</span><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;成功檢查 </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">results</span><span class="p">)</span><span class="si">}</span><span class="s2"> 個檔案&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">43</span><span class="cl"><span class="k">if</span> <span class="n">errors</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">44</span><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;有 </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">errors</span><span class="p">)</span><span class="si">}</span><span class="s2"> 個錯誤：&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">45</span><span class="cl">    <span class="k">for</span> <span class="n">err</span> <span class="ow">in</span> <span class="n">errors</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">46</span><span class="cl">        <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;  </span><span class="si">{</span><span class="n">err</span><span class="p">[</span><span class="s1">&#39;file&#39;</span><span class="p">]</span><span class="si">}</span><span class="s2">: </span><span class="si">{</span><span class="n">err</span><span class="p">[</span><span class="s1">&#39;error&#39;</span><span class="p">]</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span></span></span></code></pre></div><hr>
<h2 id="效能測量">效能測量</h2>
<h3 id="使用-timeit-比較效能">使用 timeit 比較效能</h3>
<p>理論上並行會更快，但「更快多少」需要測量：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kn">import</span> <span class="nn">timeit</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="kn">from</span> <span class="nn">pathlib</span> <span class="kn">import</span> <span class="n">Path</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="c1"># 準備測試資料</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="n">test_dir</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="s2">&#34;docs/&#34;</span><span class="p">)</span>  <span class="c1"># 假設有 50+ 個 .md 檔案</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="n">checker</span> <span class="o">=</span> <span class="n">MarkdownLinkChecker</span><span class="p">()</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="c1"># 測量循序版本</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="k">def</span> <span class="nf">test_sequential</span><span class="p">():</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">    <span class="k">return</span> <span class="n">checker</span><span class="o">.</span><span class="n">check_directory</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">test_dir</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="n">sequential_time</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="n">test_sequential</span><span class="p">,</span> <span class="n">number</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span> <span class="o">/</span> <span class="mi">3</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">
</span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="c1"># 測量並行版本</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="k">def</span> <span class="nf">test_parallel</span><span class="p">():</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl">    <span class="k">return</span> <span class="n">checker</span><span class="o">.</span><span class="n">check_directory_parallel</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">test_dir</span><span class="p">),</span> <span class="n">max_workers</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">
</span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="n">parallel_time</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="n">test_parallel</span><span class="p">,</span> <span class="n">number</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span> <span class="o">/</span> <span class="mi">3</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl">
</span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="c1"># 計算加速比</span>
</span></span><span class="line"><span class="ln">21</span><span class="cl"><span class="n">speedup</span> <span class="o">=</span> <span class="n">sequential_time</span> <span class="o">/</span> <span class="n">parallel_time</span>
</span></span><span class="line"><span class="ln">22</span><span class="cl">
</span></span><span class="line"><span class="ln">23</span><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;循序版本：</span><span class="si">{</span><span class="n">sequential_time</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2"> 秒&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">24</span><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;並行版本：</span><span class="si">{</span><span class="n">parallel_time</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2"> 秒&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">25</span><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;加速比：</span><span class="si">{</span><span class="n">speedup</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">x&#34;</span><span class="p">)</span></span></span></code></pre></div><h3 id="真實測試結果參考">真實測試結果參考</h3>
<p>以下是在不同檔案數量下的測試結果（供參考）：</p>
<table>
  <thead>
      <tr>
          <th>檔案數量</th>
          <th>循序時間</th>
          <th>並行時間 (8 workers)</th>
          <th>加速比</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>10</td>
          <td>0.15s</td>
          <td>0.08s</td>
          <td>1.9x</td>
      </tr>
      <tr>
          <td>50</td>
          <td>0.72s</td>
          <td>0.18s</td>
          <td>4.0x</td>
      </tr>
      <tr>
          <td>100</td>
          <td>1.45s</td>
          <td>0.32s</td>
          <td>4.5x</td>
      </tr>
      <tr>
          <td>200</td>
          <td>2.91s</td>
          <td>0.58s</td>
          <td>5.0x</td>
      </tr>
  </tbody>
</table>
<p><strong>觀察</strong>：</p>
<ol>
<li>檔案越多，並行效益越明顯</li>
<li>加速比不會無限增長（受限於 I/O 頻寬和 GIL）</li>
<li>檔案少於 10 個時，並行的額外開銷可能抵消效益</li>
</ol>
<h3 id="完整的效能測試腳本">完整的效能測試腳本</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="ch">#!/usr/bin/env python3</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="s2">&#34;&#34;&#34;效能比較測試腳本&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="kn">import</span> <span class="nn">os</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="kn">import</span> <span class="nn">sys</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="kn">import</span> <span class="nn">time</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="kn">from</span> <span class="nn">pathlib</span> <span class="kn">import</span> <span class="n">Path</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="kn">from</span> <span class="nn">concurrent.futures</span> <span class="kn">import</span> <span class="n">ThreadPoolExecutor</span><span class="p">,</span> <span class="n">as_completed</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="c1"># 假設這是 markdown_link_checker 的簡化版</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="k">class</span> <span class="nc">MarkdownChecker</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="k">def</span> <span class="nf">check_file</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">file_path</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">dict</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">        <span class="s2">&#34;&#34;&#34;模擬檔案檢查（包含 I/O 延遲）&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">        <span class="n">path</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="n">file_path</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">        <span class="n">content</span> <span class="o">=</span> <span class="n">path</span><span class="o">.</span><span class="n">read_text</span><span class="p">(</span><span class="n">encoding</span><span class="o">=</span><span class="s2">&#34;utf-8&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl">        <span class="c1"># 模擬一些處理時間</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">        <span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mf">0.01</span><span class="p">)</span>  <span class="c1"># 10ms I/O 延遲</span>
</span></span><span class="line"><span class="ln">18</span><span class="cl">        <span class="k">return</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl">            <span class="s2">&#34;file&#34;</span><span class="p">:</span> <span class="n">file_path</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">20</span><span class="cl">            <span class="s2">&#34;lines&#34;</span><span class="p">:</span> <span class="nb">len</span><span class="p">(</span><span class="n">content</span><span class="o">.</span><span class="n">splitlines</span><span class="p">()),</span>
</span></span><span class="line"><span class="ln">21</span><span class="cl">            <span class="s2">&#34;chars&#34;</span><span class="p">:</span> <span class="nb">len</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">22</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">23</span><span class="cl">
</span></span><span class="line"><span class="ln">24</span><span class="cl">    <span class="k">def</span> <span class="nf">check_sequential</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">files</span><span class="p">:</span> <span class="nb">list</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">list</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">25</span><span class="cl">        <span class="s2">&#34;&#34;&#34;循序檢查&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">26</span><span class="cl">        <span class="k">return</span> <span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">check_file</span><span class="p">(</span><span class="n">f</span><span class="p">)</span> <span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">files</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">27</span><span class="cl">
</span></span><span class="line"><span class="ln">28</span><span class="cl">    <span class="k">def</span> <span class="nf">check_parallel</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">files</span><span class="p">:</span> <span class="nb">list</span><span class="p">,</span> <span class="n">max_workers</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">8</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">list</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">29</span><span class="cl">        <span class="s2">&#34;&#34;&#34;並行檢查&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">30</span><span class="cl">        <span class="n">results</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="ln">31</span><span class="cl">        <span class="k">with</span> <span class="n">ThreadPoolExecutor</span><span class="p">(</span><span class="n">max_workers</span><span class="o">=</span><span class="n">max_workers</span><span class="p">)</span> <span class="k">as</span> <span class="n">executor</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">32</span><span class="cl">            <span class="n">futures</span> <span class="o">=</span> <span class="p">{</span><span class="n">executor</span><span class="o">.</span><span class="n">submit</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">check_file</span><span class="p">,</span> <span class="n">f</span><span class="p">):</span> <span class="n">f</span> <span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">files</span><span class="p">}</span>
</span></span><span class="line"><span class="ln">33</span><span class="cl">            <span class="k">for</span> <span class="n">future</span> <span class="ow">in</span> <span class="n">as_completed</span><span class="p">(</span><span class="n">futures</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">34</span><span class="cl">                <span class="n">results</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">future</span><span class="o">.</span><span class="n">result</span><span class="p">())</span>
</span></span><span class="line"><span class="ln">35</span><span class="cl">        <span class="k">return</span> <span class="n">results</span>
</span></span><span class="line"><span class="ln">36</span><span class="cl">
</span></span><span class="line"><span class="ln">37</span><span class="cl"><span class="k">def</span> <span class="nf">benchmark</span><span class="p">(</span><span class="n">func</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="n">iterations</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">3</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">float</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">38</span><span class="cl">    <span class="s2">&#34;&#34;&#34;測量函式執行時間&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">39</span><span class="cl">    <span class="n">times</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="ln">40</span><span class="cl">    <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">iterations</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">41</span><span class="cl">        <span class="n">start</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">perf_counter</span><span class="p">()</span>
</span></span><span class="line"><span class="ln">42</span><span class="cl">        <span class="n">func</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">43</span><span class="cl">        <span class="n">elapsed</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">perf_counter</span><span class="p">()</span> <span class="o">-</span> <span class="n">start</span>
</span></span><span class="line"><span class="ln">44</span><span class="cl">        <span class="n">times</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">elapsed</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">45</span><span class="cl">    <span class="k">return</span> <span class="nb">sum</span><span class="p">(</span><span class="n">times</span><span class="p">)</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">times</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">46</span><span class="cl">
</span></span><span class="line"><span class="ln">47</span><span class="cl"><span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
</span></span><span class="line"><span class="ln">48</span><span class="cl">    <span class="c1"># 收集測試檔案</span>
</span></span><span class="line"><span class="ln">49</span><span class="cl">    <span class="n">test_dir</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="s2">&#34;.&#34;</span><span class="p">)</span>  <span class="c1"># 替換為你的目錄</span>
</span></span><span class="line"><span class="ln">50</span><span class="cl">    <span class="n">files</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">test_dir</span><span class="o">.</span><span class="n">rglob</span><span class="p">(</span><span class="s2">&#34;*.md&#34;</span><span class="p">))[:</span><span class="mi">100</span><span class="p">]</span>  <span class="c1"># 取前 100 個</span>
</span></span><span class="line"><span class="ln">51</span><span class="cl">
</span></span><span class="line"><span class="ln">52</span><span class="cl">    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">files</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">10</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">53</span><span class="cl">        <span class="nb">print</span><span class="p">(</span><span class="s2">&#34;需要至少 10 個 .md 檔案來測試&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">54</span><span class="cl">        <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">55</span><span class="cl">
</span></span><span class="line"><span class="ln">56</span><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;測試檔案數量：</span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">files</span><span class="p">)</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">57</span><span class="cl">
</span></span><span class="line"><span class="ln">58</span><span class="cl">    <span class="n">checker</span> <span class="o">=</span> <span class="n">MarkdownChecker</span><span class="p">()</span>
</span></span><span class="line"><span class="ln">59</span><span class="cl">
</span></span><span class="line"><span class="ln">60</span><span class="cl">    <span class="c1"># 測試不同 worker 數量</span>
</span></span><span class="line"><span class="ln">61</span><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="s2">&#34;</span><span class="se">\n</span><span class="s2">效能比較：&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">62</span><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="s2">&#34;-&#34;</span> <span class="o">*</span> <span class="mi">50</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">63</span><span class="cl">
</span></span><span class="line"><span class="ln">64</span><span class="cl">    <span class="n">seq_time</span> <span class="o">=</span> <span class="n">benchmark</span><span class="p">(</span><span class="n">checker</span><span class="o">.</span><span class="n">check_sequential</span><span class="p">,</span> <span class="n">files</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">65</span><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;循序版本：</span><span class="si">{</span><span class="n">seq_time</span><span class="si">:</span><span class="s2">.3f</span><span class="si">}</span><span class="s2"> 秒&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">66</span><span class="cl">
</span></span><span class="line"><span class="ln">67</span><span class="cl">    <span class="k">for</span> <span class="n">workers</span> <span class="ow">in</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">16</span><span class="p">]:</span>
</span></span><span class="line"><span class="ln">68</span><span class="cl">        <span class="n">par_time</span> <span class="o">=</span> <span class="n">benchmark</span><span class="p">(</span><span class="n">checker</span><span class="o">.</span><span class="n">check_parallel</span><span class="p">,</span> <span class="n">files</span><span class="p">,</span> <span class="n">workers</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">69</span><span class="cl">        <span class="n">speedup</span> <span class="o">=</span> <span class="n">seq_time</span> <span class="o">/</span> <span class="n">par_time</span>
</span></span><span class="line"><span class="ln">70</span><span class="cl">        <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;並行 (</span><span class="si">{</span><span class="n">workers</span><span class="si">:</span><span class="s2">2d</span><span class="si">}</span><span class="s2"> workers)：</span><span class="si">{</span><span class="n">par_time</span><span class="si">:</span><span class="s2">.3f</span><span class="si">}</span><span class="s2"> 秒 (</span><span class="si">{</span><span class="n">speedup</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">x)&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">71</span><span class="cl">
</span></span><span class="line"><span class="ln">72</span><span class="cl"><span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s2">&#34;__main__&#34;</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">73</span><span class="cl">    <span class="n">main</span><span class="p">()</span></span></span></code></pre></div><hr>
<h2 id="什麼時候不該用並行">什麼時候不該用並行？</h2>
<h3 id="反模式-1檔案數量太少">反模式 1：檔案數量太少</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 不好：只有 3 個檔案還用並行</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="n">files</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;a.md&#34;</span><span class="p">,</span> <span class="s2">&#34;b.md&#34;</span><span class="p">,</span> <span class="s2">&#34;c.md&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="k">with</span> <span class="n">ThreadPoolExecutor</span><span class="p">(</span><span class="n">max_workers</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="k">as</span> <span class="n">executor</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">    <span class="n">results</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">executor</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">check_file</span><span class="p">,</span> <span class="n">files</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">
</span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"># 問題：建立執行緒池的開銷可能比省下的時間還多</span></span></span></code></pre></div><p><strong>建議</strong>：檔案少於 5-10 個時，直接循序處理。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">def</span> <span class="nf">check_files_smart</span><span class="p">(</span><span class="n">files</span><span class="p">:</span> <span class="nb">list</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">list</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">    <span class="s2">&#34;&#34;&#34;根據檔案數量選擇處理方式&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">files</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">10</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">        <span class="c1"># 少量檔案：循序處理</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">        <span class="k">return</span> <span class="p">[</span><span class="n">check_file</span><span class="p">(</span><span class="n">f</span><span class="p">)</span> <span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">files</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">    <span class="k">else</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl">        <span class="c1"># 大量檔案：並行處理</span>
</span></span><span class="line"><span class="ln">8</span><span class="cl">        <span class="k">with</span> <span class="n">ThreadPoolExecutor</span><span class="p">(</span><span class="n">max_workers</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="k">as</span> <span class="n">executor</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">9</span><span class="cl">            <span class="k">return</span> <span class="nb">list</span><span class="p">(</span><span class="n">executor</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">check_file</span><span class="p">,</span> <span class="n">files</span><span class="p">))</span></span></span></code></pre></div><h3 id="反模式-2任務之間有依賴">反模式 2：任務之間有依賴</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 不好：任務 B 依賴任務 A 的結果</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="k">def</span> <span class="nf">process_a</span><span class="p">():</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">    <span class="k">return</span> <span class="n">fetch_config</span><span class="p">()</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="k">def</span> <span class="nf">process_b</span><span class="p">(</span><span class="n">config</span><span class="p">):</span>  <span class="c1"># 需要 A 的結果</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">    <span class="k">return</span> <span class="n">validate</span><span class="p">(</span><span class="n">config</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl">
</span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="c1"># 強行並行化會出錯或需要額外同步</span></span></span></code></pre></div><h3 id="反模式-3共享可變狀態">反模式 3：共享可變狀態</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># 不好：多個執行緒修改同一個列表</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">shared_results</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="k">def</span> <span class="nf">bad_worker</span><span class="p">(</span><span class="n">file</span><span class="p">):</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="n">result</span> <span class="o">=</span> <span class="n">check_file</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="n">shared_results</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>  <span class="c1"># 競爭條件！</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="c1"># 正確做法：讓 worker 返回結果，由主執行緒收集</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="k">def</span> <span class="nf">good_worker</span><span class="p">(</span><span class="n">file</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">    <span class="k">return</span> <span class="n">check_file</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="k">with</span> <span class="n">ThreadPoolExecutor</span><span class="p">()</span> <span class="k">as</span> <span class="n">executor</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">    <span class="n">results</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">executor</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">good_worker</span><span class="p">,</span> <span class="n">files</span><span class="p">))</span></span></span></code></pre></div><h3 id="反模式-4cpu-密集任務用-threadpoolexecutor">反模式 4：CPU 密集任務用 ThreadPoolExecutor</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># 不好：CPU 密集任務受 GIL 限制</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="k">def</span> <span class="nf">compute_heavy</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">    <span class="k">return</span> <span class="nb">sum</span><span class="p">(</span><span class="n">i</span> <span class="o">*</span> <span class="n">i</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span><span class="p">))</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="c1"># ThreadPoolExecutor 對 CPU 密集任務沒有加速效果</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="k">with</span> <span class="n">ThreadPoolExecutor</span><span class="p">(</span><span class="n">max_workers</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="k">as</span> <span class="n">executor</span><span class="p">:</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    <span class="n">results</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">executor</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">compute_heavy</span><span class="p">,</span> <span class="p">[</span><span class="mi">10_000_000</span><span class="p">]</span> <span class="o">*</span> <span class="mi">8</span><span class="p">))</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="c1"># 正確：CPU 密集應該用 ProcessPoolExecutor</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="kn">from</span> <span class="nn">concurrent.futures</span> <span class="kn">import</span> <span class="n">ProcessPoolExecutor</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="k">with</span> <span class="n">ProcessPoolExecutor</span><span class="p">(</span><span class="n">max_workers</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="k">as</span> <span class="n">executor</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">    <span class="n">results</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">executor</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">compute_heavy</span><span class="p">,</span> <span class="p">[</span><span class="mi">10_000_000</span><span class="p">]</span> <span class="o">*</span> <span class="mi">8</span><span class="p">))</span></span></span></code></pre></div><h3 id="快速決策表">快速決策表</h3>
<table>
  <thead>
      <tr>
          <th>情境</th>
          <th>推薦做法</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>&lt; 10 個檔案</td>
          <td>循序處理</td>
      </tr>
      <tr>
          <td>10-100 個檔案，I/O 密集</td>
          <td>ThreadPoolExecutor</td>
      </tr>
      <tr>
          <td>&gt; 100 個檔案，I/O 密集</td>
          <td>ThreadPoolExecutor + 進度報告</td>
      </tr>
      <tr>
          <td>CPU 密集計算</td>
          <td>ProcessPoolExecutor</td>
      </tr>
      <tr>
          <td>任務有依賴關係</td>
          <td>重新設計，或用 asyncio</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="思考題">思考題</h2>
<ol>
<li>
<p><strong>為什麼 <code>check_directory_parallel()</code> 中使用 <code>as_completed()</code> 而不是 <code>executor.map()</code>？</strong>
提示：思考兩者在「錯誤處理」和「進度報告」上的差異。</p>
</li>
<li>
<p><strong>如果 <code>check_file()</code> 除了讀檔還會寫入日誌檔，還適合並行化嗎？需要什麼額外措施？</strong>
提示：考慮日誌寫入的執行緒安全性。</p>
</li>
<li>
<p><strong>在 8 核心的機器上，為什麼 I/O 密集任務開 16 個 worker 可能比 8 個更快？</strong>
提示：思考 GIL 和 I/O 等待時間的關係。</p>
</li>
</ol>
<h2 id="實作練習">實作練習</h2>
<ol>
<li>
<p><strong>修改 hook_validator.py</strong>：將 <code>validate_all_hooks()</code> 改寫為並行版本，測量效能差異。</p>
</li>
<li>
<p><strong>實作超時機制</strong>：如果單一檔案檢查超過 5 秒，自動跳過並記錄警告。
提示：查看 <code>future.result(timeout=5)</code>。</p>
</li>
<li>
<p><strong>實作批次處理</strong>：當檔案超過 1000 個時，分批處理（每批 100 個），避免記憶體壓力。</p>
</li>
</ol>
<h2 id="延伸閱讀">延伸閱讀</h2>
<ul>
<li><a href="/blog/python-advanced/08-practical-optimization/case-studies/parallel-file-check/" data-link-title="案例：並行檔案檢查" data-link-desc="使用 ThreadPoolExecutor 加速 Markdown 連結檢查">案例研究：並行檔案檢查</a> - 完整的實作與測試</li>
<li><a href="/blog/python-advanced/08-practical-optimization/case-studies/parallel-hook-validation/" data-link-title="案例：並行 Hook 驗證" data-link-desc="使用 ThreadPoolExecutor 並行驗證 Hook，並實現進度報告">案例研究：並行 Hook 驗證</a> - 結合 as_completed 與進度報告</li>
<li>入門系列 <a href="/blog/python/03-stdlib/concurrency/" data-link-title="3.7 並行處理 - threading、multiprocessing、concurrent.futures" data-link-desc="Python 並行處理的三種方式與選擇指南">3.7 並行處理</a> - 複習基礎概念</li>
<li>進階系列 <a href="/blog/python-advanced/04-cpython-internals/gil-threading/" data-link-title="3.4 GIL 與執行緒模型" data-link-desc="深入理解 GIL 的設計與實現">4.3 GIL 與執行緒模型</a> - 深入理解 GIL</li>
</ul>
<hr>
<p><em>上一章：<a href="/blog/python-advanced/08-practical-optimization/" data-link-title="模組八：實戰效能優化" data-link-desc="將入門系列的並行處理與效能優化知識應用於真實系統">模組索引</a></em>
<em>下一章：<a href="/blog/python-advanced/08-practical-optimization/performance-tuning/" data-link-title="8.2 效能調優實戰" data-link-desc="測量、分析、優化的完整流程">效能調優實戰</a></em></p>
]]></content:encoded></item></channel></rss>