<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Tts on Tarragon</title><link>https://tarrragon.github.io/blog/tags/tts/</link><description>Recent content in Tts on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Tue, 12 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/tts/index.xml" rel="self" type="application/rss+xml"/><item><title>Hands-on：安裝 Piper TTS 做文字轉語音</title><link>https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/piper-tts-setup/</link><pubDate>Tue, 12 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/piper-tts-setup/</guid><description>&lt;p>本篇紀錄裝 Piper TTS 並用它合成英文語音、再用 Whisper 轉回文字做 round-trip 驗證。選 Piper 而非雲端 TTS（OpenAI / ElevenLabs）的理由：&lt;/p>
&lt;ul>
&lt;li>完全本地、隱私邊界乾淨。&lt;/li>
&lt;li>ONNX runtime、Apple Silicon 跑得動、不依賴 GPU。&lt;/li>
&lt;li>模型小（low quality ~17-65 MB、medium ~50 MB、high ~125 MB）、適合 minimal 驗證。&lt;/li>
&lt;li>CLI-first、stdin 餵文字、stdout 或檔案輸出 WAV、容易串 pipeline。&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>&lt;strong>驗證日期&lt;/strong>：2026-05-12
&lt;strong>Piper 版本&lt;/strong>：透過 pip 安裝
&lt;strong>示範 voice&lt;/strong>：&lt;code>en_US-lessac-low.onnx&lt;/code>（63 MB、英文女聲、low quality）
&lt;strong>實測&lt;/strong>：4 秒文字合成 &amp;lt; 1 秒、品質夠日常用&lt;/p>&lt;/blockquote>
&lt;h2 id="前置設定">前置設定&lt;/h2>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>項目&lt;/th>
 &lt;th>檢查指令&lt;/th>
 &lt;th>預期&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Python&lt;/td>
 &lt;td>&lt;code>python3 --version&lt;/code>&lt;/td>
 &lt;td>3.11+&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>pip&lt;/td>
 &lt;td>&lt;code>pip3 --version&lt;/code>&lt;/td>
 &lt;td>25+&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>磁碟空間&lt;/td>
 &lt;td>&lt;code>df -h ~&lt;/code>&lt;/td>
 &lt;td>至少 200 MB（Piper + 一個 voice）&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>Piper 跟 Whisper 一樣分離 binary 跟 model：先裝 runtime、再下載 voice。&lt;/p>
&lt;h2 id="安裝-piper">安裝 Piper&lt;/h2>
&lt;p>&lt;code>piper-tts&lt;/code> 沒有 Homebrew formula、用 pip 裝：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">pip3 install piper-tts --break-system-packages&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>PEP 668&lt;/code> 是 macOS / Homebrew Python 的 external-management 機制、保護系統 Python 不被 pip 安裝污染；&lt;code>--break-system-packages&lt;/code> 是 bypass flag、跳過該檢查直接裝。比較乾淨的做法是用 venv：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">python3 -m venv ~/.piper-venv
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="nb">source&lt;/span> ~/.piper-venv/bin/activate
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">pip install piper-tts&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>但裝完 PATH 要指到 venv 的 piper、稍麻煩。本 demo 用 &lt;code>--break-system-packages&lt;/code> 簡化。實際生產建議用 venv 或 pipx。&lt;/p>
&lt;p>驗證 binary 在 PATH：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">which piper
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="c1"># /opt/homebrew/bin/piper（若 pip3 來自 Homebrew Python）&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="c1"># 或 ~/Library/Python/3.x/bin/piper（若 pip3 來自系統 Python）&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">piper --help &lt;span class="p">|&lt;/span> head -10&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>which piper&lt;/code> 找不到時、檢查兩個 bin 目錄哪邊有檔案、把該目錄加進 &lt;code>PATH&lt;/code>。&lt;/p>
&lt;h2 id="下載-voice-model">下載 Voice Model&lt;/h2>
&lt;p>Piper 用 ONNX 格式的 voice model、每個 voice 是一對 &lt;code>.onnx&lt;/code>（model 權重）+ &lt;code>.onnx.json&lt;/code>（metadata、含採樣率、phoneme map）。&lt;/p>
&lt;p>從 Hugging Face &lt;code>rhasspy/piper-voices&lt;/code> repo 拉：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">mkdir -p ~/.piper-voices
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="nb">cd&lt;/span> ~/.piper-voices
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># 英文女聲、low quality（小、快）&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">curl -L -o en_US-lessac-low.onnx &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="se">&lt;/span> &lt;span class="s2">&amp;#34;https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/low/en_US-lessac-low.onnx&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">curl -L -o en_US-lessac-low.onnx.json &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">8&lt;/span>&lt;span class="cl">&lt;span class="se">&lt;/span> &lt;span class="s2">&amp;#34;https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/low/en_US-lessac-low.onnx.json&amp;#34;&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>可用 voice quality 等級：&lt;/p></description><content:encoded><![CDATA[<p>本篇紀錄裝 Piper TTS 並用它合成英文語音、再用 Whisper 轉回文字做 round-trip 驗證。選 Piper 而非雲端 TTS（OpenAI / ElevenLabs）的理由：</p>
<ul>
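<li>Fully local, with a clean privacy boundary.</li>
<li>Runs on ONNX Runtime, works on Apple Silicon, and does not depend on a GPU.</li>
<li>Small models (low quality ~17-65 MB, medium ~50 MB, high ~125 MB), well suited to minimal validation.</li>
<li>CLI-first: text goes in on stdin and WAV comes out on stdout or to a file, so it is easy to chain into a pipeline.</li>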
</ul>
<blockquote>
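<p><strong>Verified on</strong>: 2026-05-12
<strong>Piper version</strong>: installed via pip
<strong>Demo voice</strong>: <code>en_US-lessac-low.onnx</code> (63 MB, English female voice, low quality)
<strong>Measured</strong>: about 4 seconds of speech synthesized in &lt; 1 second; quality is good enough for everyday use</p></blockquote>
<h2 id="前置設定">Prerequisites</h2>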
<table>
  <thead>
      <tr>
          <th>Item</th>
          <th>Check command</th>
          <th>Expected</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Python</td>
          <td><code>python3 --version</code></td>
          <td>3.11+</td>
      </tr>
      <tr>
          <td>pip</td>
          <td><code>pip3 --version</code></td>
          <td>25+</td>
      </tr>
      <tr>
          <td>Disk space</td>
          <td><code>df -h ~</code></td>
          <td>At least 200 MB (Piper + one voice)</td>
      </tr>
  </tbody>
</table>
<p>Like Whisper, Piper separates the binary from the model: install the runtime first, then download a voice.</p>
<h2 id="安裝-piper">Install Piper</h2>
<p><code>piper-tts</code> has no Homebrew formula, so install it with pip:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">pip3 install piper-tts --break-system-packages</span></span></code></pre></div><p><code>PEP 668</code> is the Python packaging standard for externally managed environments. Homebrew's Python on macOS adopts it so that pip installs do not pollute an interpreter owned by the package manager. <code>--break-system-packages</code> is the bypass flag that skips this check and installs anyway. The cleaner approach is a venv:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">python3 -m venv ~/.piper-venv
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="nb">source</span> ~/.piper-venv/bin/activate
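</span></span><span class="line"><span class="ln">3</span><span class="cl">pip install piper-tts</span></span></code></pre></div><p>With a venv, however, <code>PATH</code> has to point at the venv's <code>piper</code>, which is slightly more work. This demo uses <code>--break-system-packages</code> to keep things simple. For real production use, prefer a venv or pipx.</p>
<p>Verify that the binary is on <code>PATH</code>:</p>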





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">which piper
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"># /opt/homebrew/bin/piper (if pip3 comes from Homebrew Python)</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="c1"># or ~/Library/Python/3.x/bin/piper (if pip3 comes from the system Python)</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">
</span></span><span class="line"><span class="ln">5</span><span class="cl">piper --help <span class="p">|</span> head -10</span></span></code></pre></div><p>If <code>which piper</code> finds nothing, check which of the two bin directories contains the file and add that directory to <code>PATH</code>.</p>
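<p>The directory check can be sketched as a small shell function. This is a minimal sketch, not part of Piper: <code>find_tool_dir</code> is a hypothetical helper name, and the candidate directories are whichever locations apply on your machine.</p>

```shell
# Hypothetical helper: print the first directory that holds an executable
# with the given name. Returns non-zero when no directory matches.
find_tool_dir() {
  tool_name=$1
  shift
  for dir in "$@"; do
    if [ -x "$dir/$tool_name" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Usage sketch: prepend whichever candidate directory contains piper.
# if piper_dir=$(find_tool_dir piper /opt/homebrew/bin ~/Library/Python/*/bin); then
#   export PATH="$piper_dir:$PATH"
# fi
```

<p>Keeping the lookup separate from the <code>PATH</code> change makes it easy to inspect the result before touching the environment.</p>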
<h2 id="下載-voice-model">Download a Voice Model</h2>
<p>Piper uses voice models in ONNX format. Each voice is a pair of files: <code>.onnx</code> (model weights) and <code>.onnx.json</code> (metadata, including the sample rate and phoneme map).</p>
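<p>Because every voice needs both files, a quick pairing check can catch a half-finished download once voices are in place. A minimal sketch, assuming all voices sit in one directory; <code>missing_voice_configs</code> is a hypothetical helper name, not a Piper command.</p>

```shell
# List every .onnx model in a directory that lacks its .onnx.json sidecar.
# Prints the incomplete model paths and returns non-zero if any are found.
missing_voice_configs() {
  voice_dir=$1
  missing=0
  for model in "$voice_dir"/*.onnx; do
    [ -e "$model" ] || continue   # the glob matched nothing
    if [ ! -f "$model.json" ]; then
      printf '%s\n' "$model"
      missing=1
    fi
  done
  return "$missing"
}
```

<p>For example, <code>missing_voice_configs ~/.piper-voices</code> prints nothing and succeeds when every model has its metadata file next to it.</p>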
<p>Pull them from the <code>rhasspy/piper-voices</code> repo on Hugging Face:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">mkdir -p ~/.piper-voices
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="nb">cd</span> ~/.piper-voices
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># English female voice, low quality (small, fast)</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">curl -L -o en_US-lessac-low.onnx <span class="se">\
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="se"></span>  <span class="s2">&#34;https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/low/en_US-lessac-low.onnx&#34;</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl">curl -L -o en_US-lessac-low.onnx.json <span class="se">\
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="se"></span>  <span class="s2">&#34;https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/low/en_US-lessac-low.onnx.json&#34;</span></span></span></code></pre></div><p>Available voice quality levels:</p>
<table>
  <thead>
      <tr>
          <th>Quality</th>
          <th>Size</th>
          <th>Use case</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>low</code></td>
          <td>17-65 MB</td>
          <td>Fast, rough quality, suited to prototypes</td>
      </tr>
      <tr>
          <td><code>medium</code></td>
          <td>50-100 MB</td>
          <td>Balanced, everyday use</td>
      </tr>
      <tr>
          <td><code>high</code></td>
          <td>100-200 MB</td>
          <td>Good quality, slightly slower synthesis</td>
      </tr>
      <tr>
          <td><code>x_low</code></td>
          <td>&lt; 20 MB</td>
          <td>Tiny, noticeably worse quality, suited to constrained environments</td>
      </tr>
  </tbody>
</table>
<p>Language / locale coverage (partial):</p>
<table>
  <thead>
      <tr>
          <th>Locale</th>
          <th>Example voices</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>en_US</code></td>
          <td>lessac, ryan, amy, libritts</td>
      </tr>
      <tr>
          <td><code>en_GB</code></td>
          <td>alan, cori, jenny</td>
      </tr>
      <tr>
          <td><code>zh_CN</code></td>
          <td>huayan (Beijing-accented Mandarin)</td>
      </tr>
      <tr>
          <td><code>ja_JP</code> (community)</td>
          <td>Fewer available</td>
      </tr>
      <tr>
          <td><code>de_DE</code> / <code>fr_FR</code> / <code>es_ES</code> and others</td>
          <td>Several each</td>
      </tr>
  </tbody>
</table>
<p>The full list is in <a href="https://github.com/rhasspy/piper">VOICES.md</a> in the <code>rhasspy/piper</code> repo.</p>
<p>Verify the download:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">ls -lh ~/.piper-voices/
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"># en_US-lessac-low.onnx       63M</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="c1"># en_US-lessac-low.onnx.json  4.9K</span></span></span></code></pre></div><h2 id="跑第一次合成">Run the First Synthesis</h2>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Hello from Piper TTS, this is a synthesized voice test.&#34;</span> <span class="se">\
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="se"></span>  <span class="p">|</span> piper -m ~/.piper-voices/en_US-lessac-low.onnx -f /tmp/piper-out.wav</span></span></code></pre></div><p>Notes:</p>
<ul>
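<li>Text comes in on stdin, which is Piper's standard input method.</li>
<li><code>-m</code>: path to the voice model <code>.onnx</code>. Piper automatically looks for the <code>.onnx.json</code> in the same directory.</li>
<li><code>-f</code>: output WAV path. Without it, Piper writes the audio to stdout, which can be piped to a player that reads from stdin, such as <code>aplay</code>. <code>afplay</code> on macOS takes a file path, so write a file first when using it.</li>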
</ul>
<p>Expected output:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">ls -lh /tmp/piper-out.wav
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"># 128 KB</span></span></span></code></pre></div><p>Verify the WAV format:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">file /tmp/piper-out.wav
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"># RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 16000 Hz</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl">ffprobe -loglevel error -show_format /tmp/piper-out.wav <span class="p">|</span> grep duration
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"># duration=3.984000</span></span></span></code></pre></div><p>16-bit PCM, 16 kHz mono: this matches the input format that <a href="/blog/llm/01-local-llm-services/hands-on/whisper-setup/" data-link-title="Hands-on：安裝 whisper.cpp 做語音轉文字" data-link-desc="brew install whisper-cpp、下載 GGML model、Metal 加速、ffmpeg 餵 WAV、484ms 完成 7 秒音訊轉錄">Whisper</a> expects, so the file can go straight into the round trip.</p>
<p>Play it back to confirm:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">afplay /tmp/piper-out.wav</span></span></code></pre></div><h2 id="常用選項">Common Options</h2>
<table>
  <thead>
      <tr>
          <th>Option</th>
          <th>Effect</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>-m MODEL</code></td>
          <td>Path to the voice model <code>.onnx</code> (required)</td>
      </tr>
      <tr>
          <td><code>-c CONFIG</code></td>
          <td>Path to the metadata JSON (defaults to the <code>.onnx.json</code> with the same name)</td>
      </tr>
      <tr>
          <td><code>-i FILE</code></td>
          <td>Input text file (instead of stdin)</td>
      </tr>
      <tr>
          <td><code>-f OUTPUT</code></td>
          <td>Output WAV path</td>
      </tr>
      <tr>
          <td><code>-d DIR</code></td>
          <td>Output directory (one file per sentence when there are several)</td>
      </tr>
      <tr>
          <td><code>--length-scale FACTOR</code></td>
          <td>Speed adjustment (&lt; 1 speeds up, &gt; 1 slows down, default 1.0)</td>
      </tr>
      <tr>
          <td><code>--volume FACTOR</code></td>
          <td>Volume adjustment (0.0-1.0)</td>
      </tr>
      <tr>
          <td><code>-s SPEAKER</code></td>
          <td>Select a speaker in a multi-speaker model (e.g. libritts)</td>
      </tr>
      <tr>
          <td><code>--cuda</code></td>
          <td>Use CUDA (not applicable on Apple Silicon; leave at the default)</td>
      </tr>
  </tbody>
</table>
<p>Typical uses:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># Synthesize from a text file</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">piper -m ~/.piper-voices/en_US-lessac-low.onnx <span class="se">\
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="se"></span>  -i article.txt <span class="se">\
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="se"></span>  -f narration.wav
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="c1"># Split multiple sentences into separate files</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">piper -m ~/.piper-voices/en_US-lessac-medium.onnx <span class="se">\
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="se"></span>  -i script.txt <span class="se">\
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="se"></span>  -d ~/audio-output/ <span class="se">\
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="se"></span>  --output-dir-naming text
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="c1"># Slow reading (for study)</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">piper -m ~/.piper-voices/en_US-lessac-low.onnx <span class="se">\
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="se"></span>  --length-scale 1.4 <span class="se">\
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="se"></span>  -f slow.wav <span class="o">&lt;&lt;&lt;</span> <span class="s2">&#34;Slowly read this sentence.&#34;</span></span></span></code></pre></div><h2 id="round-trip-驗證">Round-Trip Verification</h2>
<p>Confirm that the whole TTS + STT chain hangs together:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 1. Piper TTS: text → WAV</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;The quick brown fox jumps over the lazy dog.&#34;</span> <span class="se">\
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="se"></span>  <span class="p">|</span> piper -m ~/.piper-voices/en_US-lessac-low.onnx -f /tmp/test.wav
</span></span><span class="line"><span class="ln">4</span><span class="cl">
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"># 2. Whisper STT: WAV → text</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">whisper-cli -m ~/.whisper-models/ggml-tiny.en.bin -f /tmp/test.wav -nt</span></span></code></pre></div><p>Whisper's output should be close to the original text (capitalization and punctuation may differ slightly). A successful round trip shows that:</p>
<ul>
<li>Piper's output format (16 kHz mono WAV) meets Whisper's input requirements.</li>
<li>The two models' English training distributions are compatible.</li>
</ul>
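<p>Because the transcript can differ from the input in capitalization and punctuation, an automated round-trip check has to compare words rather than raw strings. A minimal sketch of that comparison; <code>normalize_text</code> and <code>same_words</code> are hypothetical helper names, and capturing Whisper's transcript into a variable is left to the caller.</p>

```shell
# Lowercase, drop punctuation, and squeeze whitespace so that two
# transcripts can be compared on words alone.
normalize_text() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -d '[:punct:]' \
    | tr -s '[:space:]' ' ' \
    | sed -e 's/^ //' -e 's/ $//'
}

# Succeeds when both strings contain the same words after normalization.
same_words() {
  [ "$(normalize_text "$1")" = "$(normalize_text "$2")" ]
}
```

<p>An exact word match is a strict criterion. A single substituted word fails the check, which is usually what you want for a short smoke test and too harsh for long passages.</p>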
<h2 id="跟-llm-串接llm-說話的-minimal-pipeline">Chaining with an LLM: a Minimal "LLM Speaks" Pipeline</h2>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># 1. The LLM generates an answer</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="nv">ANSWER</span><span class="o">=</span><span class="k">$(</span>curl -s http://localhost:11434/v1/chat/completions <span class="se">\
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="se"></span>  -H <span class="s2">&#34;Content-Type: application/json&#34;</span> <span class="se">\
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="se"></span>  -d <span class="s1">&#39;{
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="s1">    &#34;model&#34;: &#34;gemma3:1b&#34;,
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="s1">    &#34;messages&#34;: [{&#34;role&#34;:&#34;user&#34;,&#34;content&#34;:&#34;Tell me a one-sentence joke.&#34;}],
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="s1">    &#34;stream&#34;: false
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="s1">  }&#39;</span> <span class="p">|</span> python3 -c <span class="s2">&#34;import json,sys; print(json.load(sys.stdin)[&#39;choices&#39;][0][&#39;message&#39;][&#39;content&#39;])&#34;</span><span class="k">)</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="c1"># 2. Piper reads the answer aloud</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;</span><span class="nv">$ANSWER</span><span class="s2">&#34;</span> <span class="p">|</span> piper -m ~/.piper-voices/en_US-lessac-low.onnx -f /tmp/llm-says.wav
</span></span><span class="line"><span class="ln">12</span><span class="cl">
</span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="c1"># 3. Play it</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">afplay /tmp/llm-says.wav</span></span></code></pre></div><p>Three shell steps complete the whole "local LLM tells a joke" pipeline, with no cloud and no GPU.</p>
<h2 id="常見坑">Common Pitfalls</h2>
<h3 id="中文--多語言">Chinese / Multilingual</h3>
<p><code>en_US-lessac-low</code> is an English voice, so Chinese input comes out mispronounced. For Chinese, download <code>zh_CN-huayan-*</code>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">curl -L -o ~/.piper-voices/zh_CN-huayan-medium.onnx <span class="se">\
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="se"></span>  <span class="s2">&#34;https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/zh/zh_CN/huayan/medium/zh_CN-huayan-medium.onnx&#34;</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">curl -L -o ~/.piper-voices/zh_CN-huayan-medium.onnx.json <span class="se">\
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="se"></span>  <span class="s2">&#34;https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/zh/zh_CN/huayan/medium/zh_CN-huayan-medium.onnx.json&#34;</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">
</span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;你好，這是 Piper TTS 的中文測試。&#34;</span> <span class="se">\
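</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="se"></span>  <span class="p">|</span> piper -m ~/.piper-voices/zh_CN-huayan-medium.onnx -f /tmp/zh-out.wav</span></span></code></pre></div><p>The zh_CN voice defaults to a Beijing Mandarin accent.</p>
<h3 id="--break-system-packages-警告"><code>--break-system-packages</code> Warning</h3>
<p>Python installations marked as externally managed under PEP 668 (for example Homebrew's Python on macOS) refuse direct pip installs by default. The safe route is a venv or pipx. If you would rather skip the venv, use the <code>--break-system-packages</code> flag (it prints a warning but installs). Long term, move to a venv to avoid polluting the system Python.</p>
<h3 id="voice-quality-不夠">Voice Quality Is Not Good Enough</h3>
<p><code>low</code> quality voices suit validation and prototypes; for real use, pick <code>medium</code> or <code>high</code>. On long passages, low-quality voices sound mechanical and less natural.</p>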
<h3 id="sample-rate-mismatch">Sample rate mismatch</h3>
<p>The voice metadata (<code>sample_rate</code> in the <code>.onnx.json</code>) determines the output sample rate, and it can differ between voices (most are 22050 or 16000). Whisper expects 16000, so if Piper outputs 22050 you may need to downsample with ffmpeg:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">ffmpeg -i piper-out.wav -ar <span class="m">16000</span> piper-out-16k.wav</span></span></code></pre></div><p><code>en_US-lessac-low</code> is already 16 kHz, so it does not have this problem.</p>
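<p>Whether a given voice needs the downsampling step can be read from its metadata before synthesizing anything. A minimal sketch, assuming the config stores the rate as a plain integer under a <code>"sample_rate"</code> key; <code>voice_sample_rate</code> and <code>needs_resample</code> are hypothetical helper names, and the pattern match is a shortcut rather than a real JSON parser.</p>

```shell
# Extract the first "sample_rate" number from a voice config file.
# Assumes the value is a plain integer, as in "sample_rate": 22050.
voice_sample_rate() {
  grep -o '"sample_rate"[[:space:]]*:[[:space:]]*[0-9][0-9]*' "$1" \
    | head -n 1 \
    | grep -o '[0-9][0-9]*$'
}

# Exit status 0: the voice's rate differs from the target (default 16000).
# Exit status 1: the rates already match. Exit status 2: no rate was found.
needs_resample() {
  rate=$(voice_sample_rate "$1") || return 2
  [ "$rate" != "${2:-16000}" ]
}
```

<p>For example, <code>needs_resample ~/.piper-voices/en_US-lessac-low.onnx.json</code> can gate the ffmpeg call, so that voices that already output 16 kHz skip the extra conversion.</p>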
<h2 id="何時這篇會過時">When This Post Will Go Stale</h2>
<ul>
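<li>The <code>pip install piper-tts</code> install method may evolve (a move to pure binary releases?), but the ONNX model + CLI invocation shape should stay stable.</li>
<li>The voice model format (ONNX) is an open, widely supported model interchange format. New quality levels and locales are likely to be added over time, and existing voices are unlikely to be deprecated.</li>
<li>The Hugging Face <code>rhasspy/piper-voices</code> repo is the maintainer's official one and is unlikely to disappear.</li>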
</ul>
<p>If pip install fails by the time you read this, check the <a href="https://github.com/rhasspy/piper">piper GitHub</a> repo for the current install path; for the voice list, see the piper-voices repo.</p>
<p>Relation to the other hands-on chapters: the full series is at the <a href="/blog/llm/01-local-llm-services/hands-on/" data-link-title="Hands-on：本地 AI 工具實作筆記" data-link-desc="Ollama / ComfyUI / Whisper / Piper TTS：實際安裝、驗證、跑通的紀錄。隨工具版本演化、跟 1.x 原理章節互補。">Hands-on chapter index</a>, the speech round-trip counterpart is <a href="/blog/llm/01-local-llm-services/hands-on/whisper-setup/" data-link-title="Hands-on：安裝 whisper.cpp 做語音轉文字" data-link-desc="brew install whisper-cpp、下載 GGML model、Metal 加速、ffmpeg 餵 WAV、484ms 完成 7 秒音訊轉錄">Whisper STT</a>, and cross-service lifecycle and memory management are covered in <a href="/blog/llm/01-local-llm-services/hands-on/resource-management/" data-link-title="Hands-on：LLM 運行中 &#43; 結束的資源管理" data-link-desc="RAM / 磁碟 / port 三個 dimension 的觀察跟釋放、Ollama keep_alive 跟 ComfyUI 兩種 lifecycle 對比、實測釋放數字">Resource management</a>.</p>
]]></content:encoded></item><item><title>Hands-on：本地 AI 工具實作筆記</title><link>https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/</link><pubDate>Mon, 11 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/</guid><description>&lt;p>本子資料夾收錄本地 AI 工具的實際安裝跟驗證紀錄。跟 1.x 原理章節的關係：&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>1.x 原理章節&lt;/th>
 &lt;th>Hands-on 紀錄&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>為什麼選 Ollama&lt;/td>
 &lt;td>實際 &lt;code>brew install&lt;/code> + &lt;code>ollama pull&lt;/code> 流程&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Speculative decoding 原理&lt;/td>
 &lt;td>MTP 模型實際載入 + 速度量測&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>ComfyUI 在生態的位置&lt;/td>
 &lt;td>實際 git clone + Python 環境 + 模型路徑配置&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>本資料夾的內容&lt;strong>會隨工具版本演化&lt;/strong>：指令、目錄結構、相依套件版本都會變。寫的時間戳記在每篇開頭、版本資訊在 frontmatter。跟 1.x 原理章節的差別是「原理跨工具世代不變、實作筆記是當下這版的快照」。&lt;/p>
&lt;h2 id="章節列表">章節列表&lt;/h2>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>章節&lt;/th>
 &lt;th>主題&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/quickstart/" data-link-title="Hands-on Quickstart：clone repo 後跑通所有 demo" data-link-desc="4 步驟跑通 RAG / MCP / permission demo 的 setup 跟驗證指令、整合 hands-on 系列所有章節的 prerequisite">Quickstart：clone repo 後跑通所有 demo&lt;/a>&lt;/td>
 &lt;td>4 步驟整合 setup、跑 RAG / MCP / permission demo、跨 hands-on 系列導讀&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/ollama-setup/" data-link-title="Hands-on：安裝 Ollama &amp;#43; 拉第一個 Gemma 模型" data-link-desc="brew install ollama、launchd service、ollama pull、curl 驗證 OpenAI 相容 API">Ollama 安裝 + Gemma 模型&lt;/a>&lt;/td>
 &lt;td>brew install、ollama pull、curl 驗證&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/comfyui-setup/" data-link-title="Hands-on：安裝 ComfyUI &amp;#43; SDXL base" data-link-desc="git clone、venv、pip install requirements、SDXL safetensors 放哪、--listen 啟動 server、瀏覽器 workflow 驗證">ComfyUI + Stable Diffusion XL&lt;/a>&lt;/td>
 &lt;td>git clone、Python 環境、SDXL 模型放哪&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/whisper-setup/" data-link-title="Hands-on：安裝 whisper.cpp 做語音轉文字" data-link-desc="brew install whisper-cpp、下載 GGML model、Metal 加速、ffmpeg 餵 WAV、484ms 完成 7 秒音訊轉錄">Whisper 語音轉文字&lt;/a>&lt;/td>
 &lt;td>&lt;code>brew install whisper-cpp&lt;/code> + Metal 加速、GGML 模型選擇、&lt;code>whisper-cli&lt;/code> + ffmpeg 驗證轉錄&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/piper-tts-setup/" data-link-title="Hands-on：安裝 Piper TTS 做文字轉語音" data-link-desc="pip install piper-tts、ONNX voice model、stdin 餵文字、WAV 輸出、跟 Whisper 互為 round-trip 驗證">Piper TTS 文字轉語音&lt;/a>&lt;/td>
 &lt;td>pip install, voice selection, WAV output&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/rag-demo/" data-link-title="Hands-on：用 blog content 當 corpus 跑 RAG" data-link-desc="200 行 Python：embedding &amp;#43; cosine retrieval &amp;#43; Ollama chat、validating 4.0 RAG 原理">RAG demo：用 blog content 當 corpus&lt;/a>&lt;/td>
 &lt;td>embedding + retrieval、串 Ollama&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/mcp-demo/" data-link-title="Hands-on：用 blog content 寫一個最小 MCP server" data-link-desc="stdio JSON-RPC、stdlib-only Python、暴露 blog content 給 LLM 用、validating 4.3 應用層協議">MCP server demo：暴露 blog content&lt;/a>&lt;/td>
 &lt;td>最小 MCP server、給 LLM 用&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/permission-boundary/" data-link-title="Hands-on：Ollama 改檔案 / 寫程式碼的權限邊界在哪" data-link-desc="四組對照實驗：Ollama 自己沒 FS / shell 權限、wrapper 才有；--dry-run / --confirm / --auto 三檔審查粒度的取捨">權限邊界實驗：LLM 改檔案 / 寫 shell 誰執行&lt;/a>&lt;/td>
 &lt;td>LLM 是 pure function、wrapper 才是權限 gate、&lt;code>--dry-run&lt;/code> / &lt;code>--confirm&lt;/code> / &lt;code>--auto&lt;/code> 取捨&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/instruction-following-test/" data-link-title="Hands-on：跨資料夾風格 follow 任務的模型對比" data-link-desc="1B / 4B / 8B / 跨代 4B 在「讀風格參考、follow 既有格式、寫新章節」任務上的 structural metrics 對比、揭示 model size 不是唯一因素">跨資料夾風格 follow 任務的 model size 對比&lt;/a>&lt;/td>
 &lt;td>1B vs 4B 在「讀資料夾、follow 既有格式、寫新章節」任務上的 structural metrics phase transition&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/resource-management/" data-link-title="Hands-on：LLM 運行中 &amp;#43; 結束的資源管理" data-link-desc="RAM / 磁碟 / port 三個 dimension 的觀察跟釋放、Ollama keep_alive 跟 ComfyUI 兩種 lifecycle 對比、實測釋放數字">LLM 運行中 + 結束的資源管理&lt;/a>&lt;/td>
 &lt;td>RAM / 磁碟 / port 三 dimension 觀察、Ollama auto-unload vs ComfyUI persistent lifecycle、實測釋放數字、自動化 cleanup shell function&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/llm/01-local-llm-services/hands-on/rag-mcp-resources/" data-link-title="Hands-on：RAG / MCP 的資源 footprint" data-link-desc="RAG ingest / query / MCP server 三階段的 RAM / 磁碟 / process 實測、多模型並存的 RAM 衝突、本地 LLM 跑 RAG 跟單純 chat 的差異">RAG / MCP 的資源 footprint&lt;/a>&lt;/td>
 &lt;td>RAG ingest / query / MCP server 三階段 RAM / 磁碟 / process 實測、多模型並存 RAM 衝突、長期累積管理&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;h2 id="通用前置">通用前置&lt;/h2>
&lt;p>所有工具都假設你的 Mac 滿足：&lt;/p></description><content:encoded><![CDATA[<p>本子資料夾收錄本地 AI 工具的實際安裝跟驗證紀錄。跟 1.x 原理章節的關係：</p>
<table>
  <thead>
      <tr>
          <th>1.x 原理章節</th>
          <th>Hands-on 紀錄</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>為什麼選 Ollama</td>
          <td>實際 <code>brew install</code> + <code>ollama pull</code> 流程</td>
      </tr>
      <tr>
          <td>Speculative decoding 原理</td>
          <td>MTP 模型實際載入 + 速度量測</td>
      </tr>
      <tr>
          <td>Where ComfyUI sits in the ecosystem</td>
          <td>The actual git clone + Python environment + model path configuration</td>
      </tr>
  </tbody>
</table>
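<p>The contents of this folder <strong>evolve with tool versions</strong>: commands, directory layouts, and dependency versions all change. Each post states its writing date at the top and its version information in the frontmatter. The difference from the 1.x principle chapters: principles hold across tool generations, while hands-on notes are a snapshot of the version current at the time of writing.</p>
<h2 id="章節列表">Chapter list</h2>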
<table>
  <thead>
      <tr>
          <th>Chapter</th>
          <th>Topic</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/quickstart/" data-link-title="Hands-on Quickstart：clone repo 後跑通所有 demo" data-link-desc="4 步驟跑通 RAG / MCP / permission demo 的 setup 跟驗證指令、整合 hands-on 系列所有章節的 prerequisite">Quickstart: run every demo after cloning the repo</a></td>
          <td>A consolidated 4-step setup, running the RAG / MCP / permission demos, and a reading guide across the hands-on series</td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/ollama-setup/" data-link-title="Hands-on：安裝 Ollama &#43; 拉第一個 Gemma 模型" data-link-desc="brew install ollama、launchd service、ollama pull、curl 驗證 OpenAI 相容 API">Ollama installation + Gemma model</a></td>
          <td>brew install, ollama pull, curl verification</td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/comfyui-setup/" data-link-title="Hands-on：安裝 ComfyUI &#43; SDXL base" data-link-desc="git clone、venv、pip install requirements、SDXL safetensors 放哪、--listen 啟動 server、瀏覽器 workflow 驗證">ComfyUI + Stable Diffusion XL</a></td>
          <td>git clone, Python environment, where to put the SDXL model</td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/whisper-setup/" data-link-title="Hands-on：安裝 whisper.cpp 做語音轉文字" data-link-desc="brew install whisper-cpp、下載 GGML model、Metal 加速、ffmpeg 餵 WAV、484ms 完成 7 秒音訊轉錄">Whisper speech-to-text</a></td>
          <td><code>brew install whisper-cpp</code> + Metal acceleration, choosing a GGML model, verifying transcription with <code>whisper-cli</code> + ffmpeg</td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/piper-tts-setup/" data-link-title="Hands-on：安裝 Piper TTS 做文字轉語音" data-link-desc="pip install piper-tts、ONNX voice model、stdin 餵文字、WAV 輸出、跟 Whisper 互為 round-trip 驗證">Piper TTS text-to-speech</a></td>
          <td><code>pip install piper-tts</code>, choosing a voice, WAV output</td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/rag-demo/" data-link-title="Hands-on：用 blog content 當 corpus 跑 RAG" data-link-desc="200 行 Python：embedding &#43; cosine retrieval &#43; Ollama chat、validating 4.0 RAG 原理">RAG demo: using blog content as the corpus</a></td>
          <td>embedding + retrieval, wired to Ollama</td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/mcp-demo/" data-link-title="Hands-on：用 blog content 寫一個最小 MCP server" data-link-desc="stdio JSON-RPC、stdlib-only Python、暴露 blog content 給 LLM 用、validating 4.3 應用層協議">MCP server demo: exposing blog content</a></td>
          <td>A minimal MCP server for an LLM to use</td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/permission-boundary/" data-link-title="Hands-on：Ollama 改檔案 / 寫程式碼的權限邊界在哪" data-link-desc="四組對照實驗：Ollama 自己沒 FS / shell 權限、wrapper 才有；--dry-run / --confirm / --auto 三檔審查粒度的取捨">Permission boundary experiment: who executes when the LLM edits files or writes shell</a></td>
          <td>The LLM is a pure function and the wrapper is the permission gate; trade-offs among <code>--dry-run</code> / <code>--confirm</code> / <code>--auto</code></td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/instruction-following-test/" data-link-title="Hands-on：跨資料夾風格 follow 任務的模型對比" data-link-desc="1B / 4B / 8B / 跨代 4B 在「讀風格參考、follow 既有格式、寫新章節」任務上的 structural metrics 對比、揭示 model size 不是唯一因素">Model-size comparison on a cross-folder style-following task</a></td>
          <td>The phase transition in structural metrics between 1B and 4B on the task "read a folder, follow the existing format, write a new chapter"</td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/resource-management/" data-link-title="Hands-on：LLM 運行中 &#43; 結束的資源管理" data-link-desc="RAM / 磁碟 / port 三個 dimension 的觀察跟釋放、Ollama keep_alive 跟 ComfyUI 兩種 lifecycle 對比、實測釋放數字">Resource management while an LLM runs and after it exits</a></td>
          <td>Observing the three dimensions RAM / disk / port, Ollama auto-unload vs ComfyUI persistent lifecycle, measured release numbers, an automated cleanup shell function</td>
      </tr>
      <tr>
          <td><a href="/blog/llm/01-local-llm-services/hands-on/rag-mcp-resources/" data-link-title="Hands-on：RAG / MCP 的資源 footprint" data-link-desc="RAG ingest / query / MCP server 三階段的 RAM / 磁碟 / process 實測、多模型並存的 RAM 衝突、本地 LLM 跑 RAG 跟單純 chat 的差異">Resource footprint of RAG / MCP</a></td>
          <td>Measured RAM / disk / process usage across the three stages RAG ingest / query / MCP server, RAM contention when multiple models coexist, managing long-term accumulation</td>
      </tr>
  </tbody>
</table>
<h2 id="通用前置">Common prerequisites</h2>
<p>All tools assume your Mac meets:</p>
<ul>
<li>Apple Silicon Mac (M1 / M2 / M3 / M4)</li>
<li>macOS 14 (Sonoma) or later</li>
<li>Homebrew installed (<code>brew --version</code> prints the version)</li>
<li>At least 16 GB of unified memory (24 GB+ runs more smoothly)</li>
<li>At least 20 GB of free disk space (the whole series takes about 15 GB)</li>
</ul>
<p>Tools that need a Python environment (such as ComfyUI and Piper TTS) are isolated in a venv so they do not pollute the system Python.</p>
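<p>The checklist above can be sketched as a small standard-library script. This is a minimal sketch, not part of the series' repo: the names <code>MachineFacts</code>, <code>check_prerequisites</code>, and <code>gather_facts</code> are hypothetical, and the memory check is left out because the standard library has no portable call for it (on macOS, <code>sysctl -n hw.memsize</code> is one way to read it by hand).</p>

```python
import platform
import shutil
from dataclasses import dataclass
from pathlib import Path


@dataclass
class MachineFacts:
    """Hypothetical container for the facts the checklist cares about."""
    arch: str            # "arm64" on Apple Silicon
    macos_major: int     # 14 for Sonoma; 0 when not running macOS
    has_brew: bool
    free_disk_gb: float


def check_prerequisites(facts: MachineFacts) -> list[str]:
    """Return unmet prerequisites; an empty list means every check passed."""
    problems = []
    if facts.arch != "arm64":
        problems.append(f"not Apple Silicon (arch is {facts.arch})")
    if facts.macos_major < 14:
        problems.append("macOS 14 (Sonoma) or later required")
    if not facts.has_brew:
        problems.append("Homebrew not found on PATH")
    if facts.free_disk_gb < 20:
        problems.append("less than 20 GB of free disk space")
    return problems


def gather_facts() -> MachineFacts:
    """Read the facts from the current machine using only the stdlib."""
    version = platform.mac_ver()[0]  # empty string on non-macOS systems
    major = int(version.split(".")[0]) if version else 0
    free_bytes = shutil.disk_usage(Path.home()).free
    return MachineFacts(
        arch=platform.machine(),
        macos_major=major,
        has_brew=shutil.which("brew") is not None,
        free_disk_gb=free_bytes / 1e9,
    )


if __name__ == "__main__":
    issues = check_prerequisites(gather_facts())
    print("OK" if not issues else "\n".join(issues))
```

<p>Keeping the checks in a pure function separate from the fact gathering means the thresholds can be exercised with made-up values on any machine, not only on a Mac.</p>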
<h2 id="驗證紀錄環境">Verification environment</h2>
<p>The commands in this series were verified in the following environment:</p>
<table>
  <thead>
      <tr>
          <th>Item</th>
          <th>Version</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>macOS</td>
          <td>Darwin 24.3.0 (macOS 15 Sequoia)</td>
      </tr>
      <tr>
          <td>Homebrew</td>
          <td>Provided by <code>/opt/homebrew/bin/brew</code></td>
      </tr>
      <tr>
          <td>Python</td>
          <td>3.x (system Python or pyenv both work)</td>
      </tr>
      <tr>
          <td>Verification date</td>
          <td>2026-05-11</td>
      </tr>
  </tbody>
</table>
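<p>To record a comparable snapshot for your own machine, something like the following could work. It is a rough sketch using only the standard library; the function name <code>environment_snapshot</code> and the dictionary keys are assumptions for illustration, not something the series defines.</p>

```python
import platform
import shutil
from datetime import date


def environment_snapshot() -> dict[str, str]:
    """Collect kernel, Homebrew path, Python version, and today's date."""
    return {
        # On macOS this is the Darwin kernel release, e.g. "Darwin 24.3.0".
        "kernel": f"{platform.system()} {platform.release()}",
        # Path of the brew executable on PATH, if any.
        "homebrew": shutil.which("brew") or "not found",
        "python": platform.python_version(),
        "date": date.today().isoformat(),
    }


if __name__ == "__main__":
    for key, value in environment_snapshot().items():
        print(f"{key}: {value}")
```

<p>Pasting this output at the top of your own notes makes it easier to tell later whether a failing command comes from a version difference.</p>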
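<p>If you use a different Mac spec or macOS version, or reread this series six months from now, the commands may need minor adjustments, but <strong>the kinds of prerequisites and the structure of the verification steps</strong> usually stay the same. When a command fails, go back to the three-layer architecture in 1.7 <a href="/blog/llm/01-local-llm-services/troubleshooting/" data-link-title="1.7 排錯方法論：用三層架構做故障定位" data-link-desc="故障定位的分層思考、症狀到層級的對應反射、log 在三層的角色差異、最小可重現的縮減策略">Troubleshooting methodology</a> to locate the fault, and do not treat the error message as absolute.</p>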
]]></content:encoded></item></channel></rss>