<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Pre-Commit on Tarragon</title><link>https://tarrragon.github.io/blog/tags/pre-commit/</link><description>Recent content in Pre-Commit on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Fri, 24 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/pre-commit/index.xml" rel="self" type="application/rss+xml"/><item><title>mdtools：Go + goldmark 的 markdown 工具鏈設計</title><link>https://tarrragon.github.io/blog/posts/mdtoolsgo--goldmark-%E7%9A%84-markdown-%E5%B7%A5%E5%85%B7%E9%8F%88%E8%A8%AD%E8%A8%88/</link><pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/posts/mdtoolsgo--goldmark-%E7%9A%84-markdown-%E5%B7%A5%E5%85%B7%E9%8F%88%E8%A8%AD%E8%A8%88/</guid><description>&lt;h2 id="背景為什麼要自訂工具">背景：為什麼要自訂工具&lt;/h2>
&lt;p>Blog 專案的 markdown 規範有三類不同性質的檢查需求：&lt;/p>
&lt;ol>
&lt;li>&lt;strong>基礎格式&lt;/strong>（MD022 / MD024 / MD034 / MD060 等）— 市面 linter 都有，但規則細節不一致，我們對 &lt;code>MD024&lt;/code> 要特殊處理（&lt;code>siblings_only&lt;/code> 模式允許平行結構下的同名標題）。&lt;/li>
&lt;li>&lt;strong>反釣魚校驗&lt;/strong>（R-URL-1/2）— 顯示文字含 TLD 字樣時必須與 href 的 domain 一致，避免釣魚型連結。這條規則不在 markdownlint 標準集內。&lt;/li>
&lt;li>&lt;strong>卡片雙向完整性&lt;/strong>（L1/L2/L4）— 跨文件的圖論檢查：每張卡片至少被一篇正文引用、相對連結目標存在、卡片首段含鄰卡連結。&lt;/li>
&lt;/ol>
&lt;p>三類檢查共享兩個技術需求：&lt;strong>AST 層的語法理解&lt;/strong>、&lt;strong>goldmark 與 Hugo render 的一致性&lt;/strong>。詳細原因寫在&lt;a href="https://tarrragon.github.io/blog/posts/%E4%BB%80%E9%BA%BC%E6%98%AF-ast-%E5%BE%9E%E5%AD%97%E4%B8%B2%E5%88%B0%E8%AA%9E%E6%B3%95%E6%A8%B9%E7%9A%84%E8%A6%96%E8%A7%92%E8%BD%89%E6%8F%9B/" data-link-title="什麼是 AST — 從字串到語法樹的視角轉換" data-link-desc="AST 與 regex 的差異判準：規則需要知道文字處在什麼結構中時 regex 就不夠。附 regex 誤判的具體 case。">什麼是 AST&lt;/a>。&lt;/p>
&lt;p>Markdownlint-cli2 涵蓋第一類、無法表達第二、三類。現成方案湊不出來，就自己寫。&lt;/p>
&lt;h2 id="語言選擇go-vs-python-的-tripwire-式決策">語言選擇：Go vs Python 的 tripwire 式決策&lt;/h2>
&lt;p>這是實際討論過的決策，值得留下紀錄。&lt;/p>
&lt;h3 id="表面的直覺blog-用-hugogo-寫的所以用-go-最自然">表面的直覺：Blog 用 Hugo（Go 寫的），所以用 Go 最自然&lt;/h3>
&lt;p>這個推論有個破口：Hugo 雖然用 Go 寫，但我們用的是 &lt;strong>pre-built binary&lt;/strong>。&lt;code>hugo server&lt;/code> 本地跑的是下載好的執行檔，CI 用 &lt;code>peaceiris/actions-hugo&lt;/code> 這類 action，整個 blog 的 build 流程完全不碰 Go toolchain。&lt;/p>
&lt;p>「專案已有 Go 依賴」這個前提不成立。真正要問的是：&lt;strong>我是否願意為這組工具引入 Go toolchain 這個新依賴？&lt;/strong>&lt;/p>
&lt;h3 id="務實的對比">務實的對比&lt;/h3>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>面向&lt;/th>
 &lt;th>Python&lt;/th>
 &lt;th>Go&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Pre-commit 啟動速度&lt;/td>
 &lt;td>~50ms（interpreter 啟動）&lt;/td>
 &lt;td>&lt;code>go run&lt;/code> ~500ms/次；pre-build binary 則要 commit 進 repo&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>CI 新增依賴&lt;/td>
 &lt;td>&lt;code>setup-python&lt;/code>（runner 通常自帶）&lt;/td>
 &lt;td>&lt;code>setup-go&lt;/code> + build step&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>開發速度（regex / 字串處理）&lt;/td>
 &lt;td>快&lt;/td>
 &lt;td>慢 2-3x，boilerplate 較多&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>AST 解析選擇&lt;/td>
 &lt;td>mistune / markdown-it-py&lt;/td>
 &lt;td>&lt;strong>goldmark（與 Hugo 同源）&lt;/strong>&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>Go 唯一的決定性優勢是 goldmark — 跟 Hugo 用同一個 parser 可以保證「lint 通過 ↔ Hugo render 成功」等價。&lt;/p>
&lt;h3 id="關鍵一問現在需要-ast-嗎">關鍵一問：現在需要 AST 嗎？&lt;/h3>
&lt;p>我們最初傾向的是 tripwire 策略：&lt;strong>現在用 Python + regex 先頂著，等 rule 複雜度超過臨界就升級 Go + goldmark&lt;/strong>。Tripwire 條件大致是：&lt;/p>
&lt;ol>
&lt;li>Rule 數量超過 5 條。&lt;/li>
&lt;li>任一規則需要「這段文字在 code block 內嗎」這類上下文判斷。&lt;/li>
&lt;li>Hugo render 結果跟 lint 判讀開始不一致。&lt;/li>
&lt;/ol>
&lt;p>但事實是：&lt;/p>
&lt;ul>
&lt;li>MD024 的 siblings_only 已經需要父子關係追蹤 — 條件 2 馬上命中。&lt;/li>
&lt;li>卡片雙向完整性是當前任務（不是未來可能）— 跨文件檢查 regex 做不到。&lt;/li>
&lt;/ul>
&lt;p>兩個條件當下已經滿足，delay migration 反而要兩次寫工具。所以直接選 Go + goldmark。&lt;/p>
&lt;p>這個決定的邏輯層面是：&lt;strong>當需求已在手上而非 speculative，延遲決策的代價 &amp;gt; 直接上的代價&lt;/strong>。&lt;/p>
&lt;h2 id="為什麼選-goldmark">為什麼選 goldmark&lt;/h2>
&lt;p>三個具體理由：&lt;/p>
&lt;h3 id="1-解析結果與-hugo-一致">1. 解析結果與 Hugo 一致&lt;/h3>
&lt;p>Hugo 的 content render pipeline 走 goldmark。用同一個 parser 寫 lint，可以杜絕「lint 通過但 Hugo render 失敗」或「Hugo 看得懂但 lint 誤判」這類長尾 bug。&lt;/p></description><content:encoded><![CDATA[<h2 id="背景為什麼要自訂工具">背景：為什麼要自訂工具</h2>
<p>Blog 專案的 markdown 規範有三類不同性質的檢查需求：</p>
<ol>
<li><strong>基礎格式</strong>（MD022 / MD024 / MD034 / MD060 等）— 市面 linter 都有，但規則細節不一致，我們對 <code>MD024</code> 要特殊處理（<code>siblings_only</code> 模式允許平行結構下的同名標題）。</li>
<li><strong>反釣魚校驗</strong>（R-URL-1/2）— 顯示文字含 TLD 字樣時必須與 href 的 domain 一致，避免釣魚型連結。這條規則不在 markdownlint 標準集內。</li>
<li><strong>卡片雙向完整性</strong>（L1/L2/L4）— 跨文件的圖論檢查：每張卡片至少被一篇正文引用、相對連結目標存在、卡片首段含鄰卡連結。</li>
</ol>
<p>三類檢查共享兩個技術需求：<strong>AST 層的語法理解</strong>、<strong>goldmark 與 Hugo render 的一致性</strong>。詳細原因寫在<a href="/blog/posts/%E4%BB%80%E9%BA%BC%E6%98%AF-ast-%E5%BE%9E%E5%AD%97%E4%B8%B2%E5%88%B0%E8%AA%9E%E6%B3%95%E6%A8%B9%E7%9A%84%E8%A6%96%E8%A7%92%E8%BD%89%E6%8F%9B/" data-link-title="什麼是 AST — 從字串到語法樹的視角轉換" data-link-desc="AST 與 regex 的差異判準：規則需要知道文字處在什麼結構中時 regex 就不夠。附 regex 誤判的具體 case。">什麼是 AST</a>。</p>
<p>Markdownlint-cli2 涵蓋第一類、無法表達第二、三類。現成方案湊不出來，就自己寫。</p>
<h2 id="語言選擇go-vs-python-的-tripwire-式決策">語言選擇：Go vs Python 的 tripwire 式決策</h2>
<p>這是實際討論過的決策，值得留下紀錄。</p>
<h3 id="表面的直覺blog-用-hugogo-寫的所以用-go-最自然">表面的直覺：Blog 用 Hugo（Go 寫的），所以用 Go 最自然</h3>
<p>這個推論有個破口：Hugo 雖然用 Go 寫，但我們用的是 <strong>pre-built binary</strong>。<code>hugo server</code> 本地跑的是下載好的執行檔，CI 用 <code>peaceiris/actions-hugo</code> 這類 action，整個 blog 的 build 流程完全不碰 Go toolchain。</p>
<p>「專案已有 Go 依賴」這個前提不成立。真正要問的是：<strong>我是否願意為這組工具引入 Go toolchain 這個新依賴？</strong></p>
<h3 id="務實的對比">務實的對比</h3>
<table>
  <thead>
      <tr>
          <th>面向</th>
          <th>Python</th>
          <th>Go</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Pre-commit 啟動速度</td>
          <td>~50ms（interpreter 啟動）</td>
          <td><code>go run</code> ~500ms/次；pre-build binary 則要 commit 進 repo</td>
      </tr>
      <tr>
          <td>CI 新增依賴</td>
          <td><code>setup-python</code>（runner 通常自帶）</td>
          <td><code>setup-go</code> + build step</td>
      </tr>
      <tr>
          <td>開發速度（regex / 字串處理）</td>
          <td>快</td>
          <td>慢 2-3x，boilerplate 較多</td>
      </tr>
      <tr>
          <td>AST 解析選擇</td>
          <td>mistune / markdown-it-py</td>
          <td><strong>goldmark（與 Hugo 同源）</strong></td>
      </tr>
  </tbody>
</table>
<p>Go 唯一的決定性優勢是 goldmark — 跟 Hugo 用同一個 parser 可以保證「lint 通過 ↔ Hugo render 成功」等價。</p>
<h3 id="關鍵一問現在需要-ast-嗎">關鍵一問：現在需要 AST 嗎？</h3>
<p>我們最初傾向的是 tripwire 策略：<strong>現在用 Python + regex 先頂著，等 rule 複雜度超過臨界就升級 Go + goldmark</strong>。Tripwire 條件大致是：</p>
<ol>
<li>Rule 數量超過 5 條。</li>
<li>任一規則需要「這段文字在 code block 內嗎」這類上下文判斷。</li>
<li>Hugo render 結果跟 lint 判讀開始不一致。</li>
</ol>
<p>但事實是：</p>
<ul>
<li>MD024 的 siblings_only 已經需要父子關係追蹤 — 條件 2 馬上命中。</li>
<li>卡片雙向完整性是當前任務（不是未來可能）— 跨文件檢查 regex 做不到。</li>
</ul>
<p>兩個條件當下已經滿足，delay migration 反而要兩次寫工具。所以直接選 Go + goldmark。</p>
<p>這個決定的邏輯層面是：<strong>當需求已在手上而非 speculative，延遲決策的代價 &gt; 直接上的代價</strong>。</p>
<h2 id="為什麼選-goldmark">為什麼選 goldmark</h2>
<p>三個具體理由：</p>
<h3 id="1-解析結果與-hugo-一致">1. 解析結果與 Hugo 一致</h3>
<p>Hugo 的 content render pipeline 走 goldmark。用同一個 parser 寫 lint，可以杜絕「lint 通過但 Hugo render 失敗」或「Hugo 看得懂但 lint 誤判」這類長尾 bug。</p>
<h3 id="2-ast-api-直觀">2. AST API 直觀</h3>
<p>Goldmark 的 AST 節點型別設計貼近 CommonMark spec：<code>Document</code> / <code>Heading</code> / <code>Paragraph</code> / <code>Link</code> / <code>Table</code> / <code>FencedCodeBlock</code>。要寫 rule 時幾乎不需要翻對照表，直接比對心中的 markdown 結構。</p>
<h3 id="3-活躍且嵌入在主流-go-生態">3. 活躍且嵌入在主流 Go 生態</h3>
<p>Goldmark 是 Hugo 使用的 parser，社群活躍、bug fix 持續進來。不會變成 abandoned dependency。</p>
<h2 id="架構設計單一-binary--子命令">架構設計：單一 binary + 子命令</h2>
<p>三個檢查功能分開寫比較好懂，但如果寫成三個 binary，每次 pre-commit 都要 parse markdown 三次，對大型 repo（我們這個已經超過 300 個 markdown）會明顯拖慢。</p>
<p>折衷方案是<strong>單一 binary + 子命令</strong>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl">scripts/mdtools/
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">├── go.mod
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">├── main.go                    # subcommand dispatcher
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">├── cmd/
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">│   ├── fmt.go                 # mdtools fmt [--fix|--check]
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">│   ├── lint.go                # mdtools lint
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">│   └── cards.go               # mdtools cards
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">├── internal/
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">│   ├── astutil/               # goldmark 封裝（parse, walk, parent chain）
</span></span><span class="line"><span class="ln">10</span><span class="cl">│   ├── rules/                 # 規則定義（可被三個子命令共用）
</span></span><span class="line"><span class="ln">11</span><span class="cl">│   │   ├── config.go          # 全域開關與參數
</span></span><span class="line"><span class="ln">12</span><span class="cl">│   │   ├── headings.go        # 標題規則
</span></span><span class="line"><span class="ln">13</span><span class="cl">│   │   ├── urls.go            # URL + 反釣魚
</span></span><span class="line"><span class="ln">14</span><span class="cl">│   │   ├── tables.go          # 表格正規化
</span></span><span class="line"><span class="ln">15</span><span class="cl">│   │   ├── frontmatter.go     # front matter schema
</span></span><span class="line"><span class="ln">16</span><span class="cl">│   │   └── identifiers.go     # 識別碼白名單（CVE、KB、...）
</span></span><span class="line"><span class="ln">17</span><span class="cl">│   └── report/                # 統一錯誤輸出格式
</span></span><span class="line"><span class="ln">18</span><span class="cl">└── README.md</span></span></code></pre></div><p>三個子命令共享 <code>internal/astutil</code> 和 <code>internal/rules</code>，同一個 parse 結果可以在不同規則間重用。</p>
<h2 id="實際走訪md024-siblings_only-在-goldmark-上怎麼寫">實際走訪：MD024 siblings_only 在 goldmark 上怎麼寫</h2>
<p>這段是示範 AST-based rule 的可讀性，不是最終實作版本。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kn">package</span> <span class="nx">rules</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="kn">import</span> <span class="p">(</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="s">&#34;bytes&#34;</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="s">&#34;github.com/yuin/goldmark/ast&#34;</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    <span class="s">&#34;github.com/yuin/goldmark/text&#34;</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="c1">// CheckSiblingsOnlyHeadings walks the document and flags headings</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="c1">// that share the same text with a sibling under the same parent heading.</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="kd">func</span> <span class="nf">CheckSiblingsOnlyHeadings</span><span class="p">(</span><span class="nx">doc</span> <span class="nx">ast</span><span class="p">.</span><span class="nx">Node</span><span class="p">,</span> <span class="nx">src</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">)</span> <span class="p">[]</span><span class="nx">Violation</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">    <span class="kd">var</span> <span class="nx">violations</span> <span class="p">[]</span><span class="nx">Violation</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">
</span></span><span class="line"><span class="ln">15</span><span class="cl">    <span class="c1">// parentMap[level] 保留目前走到的各層 heading，作為後續 H(n+1) 的 parent context</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl">    <span class="nx">parentMap</span> <span class="o">:=</span> <span class="kd">map</span><span class="p">[</span><span class="kt">int</span><span class="p">]</span><span class="o">*</span><span class="nx">ast</span><span class="p">.</span><span class="nx">Heading</span><span class="p">{}</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">    <span class="c1">// 每個 parent context 下，收集已見過的子 heading 文字</span>
</span></span><span class="line"><span class="ln">18</span><span class="cl">    <span class="nx">seenUnderParent</span> <span class="o">:=</span> <span class="kd">map</span><span class="p">[</span><span class="o">*</span><span class="nx">ast</span><span class="p">.</span><span class="nx">Heading</span><span class="p">]</span><span class="kd">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="nx">ast</span><span class="p">.</span><span class="nx">Node</span><span class="p">{}</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl">
</span></span><span class="line"><span class="ln">20</span><span class="cl">    <span class="nx">ast</span><span class="p">.</span><span class="nf">Walk</span><span class="p">(</span><span class="nx">doc</span><span class="p">,</span> <span class="kd">func</span><span class="p">(</span><span class="nx">n</span> <span class="nx">ast</span><span class="p">.</span><span class="nx">Node</span><span class="p">,</span> <span class="nx">entering</span> <span class="kt">bool</span><span class="p">)</span> <span class="p">(</span><span class="nx">ast</span><span class="p">.</span><span class="nx">WalkStatus</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">21</span><span class="cl">        <span class="k">if</span> <span class="p">!</span><span class="nx">entering</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">22</span><span class="cl">            <span class="k">return</span> <span class="nx">ast</span><span class="p">.</span><span class="nx">WalkContinue</span><span class="p">,</span> <span class="kc">nil</span>
</span></span><span class="line"><span class="ln">23</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">24</span><span class="cl">
</span></span><span class="line"><span class="ln">25</span><span class="cl">        <span class="nx">h</span><span class="p">,</span> <span class="nx">ok</span> <span class="o">:=</span> <span class="nx">n</span><span class="p">.(</span><span class="o">*</span><span class="nx">ast</span><span class="p">.</span><span class="nx">Heading</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">26</span><span class="cl">        <span class="k">if</span> <span class="p">!</span><span class="nx">ok</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">27</span><span class="cl">            <span class="k">return</span> <span class="nx">ast</span><span class="p">.</span><span class="nx">WalkContinue</span><span class="p">,</span> <span class="kc">nil</span>
</span></span><span class="line"><span class="ln">28</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">29</span><span class="cl">
</span></span><span class="line"><span class="ln">30</span><span class="cl">        <span class="nx">text</span> <span class="o">:=</span> <span class="nb">string</span><span class="p">(</span><span class="nx">h</span><span class="p">.</span><span class="nf">Text</span><span class="p">(</span><span class="nx">src</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">31</span><span class="cl">        <span class="nx">parent</span> <span class="o">:=</span> <span class="nx">parentMap</span><span class="p">[</span><span class="nx">h</span><span class="p">.</span><span class="nx">Level</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="c1">// 直接上層 heading</span>
</span></span><span class="line"><span class="ln">32</span><span class="cl">        <span class="nx">seen</span><span class="p">,</span> <span class="nx">exists</span> <span class="o">:=</span> <span class="nx">seenUnderParent</span><span class="p">[</span><span class="nx">parent</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">33</span><span class="cl">        <span class="k">if</span> <span class="p">!</span><span class="nx">exists</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">34</span><span class="cl">            <span class="nx">seen</span> <span class="p">=</span> <span class="kd">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="nx">ast</span><span class="p">.</span><span class="nx">Node</span><span class="p">{}</span>
</span></span><span class="line"><span class="ln">35</span><span class="cl">            <span class="nx">seenUnderParent</span><span class="p">[</span><span class="nx">parent</span><span class="p">]</span> <span class="p">=</span> <span class="nx">seen</span>
</span></span><span class="line"><span class="ln">36</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">37</span><span class="cl">
</span></span><span class="line"><span class="ln">38</span><span class="cl">        <span class="k">if</span> <span class="nx">prev</span><span class="p">,</span> <span class="nx">dup</span> <span class="o">:=</span> <span class="nx">seen</span><span class="p">[</span><span class="nx">text</span><span class="p">];</span> <span class="nx">dup</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">39</span><span class="cl">            <span class="nx">violations</span> <span class="p">=</span> <span class="nb">append</span><span class="p">(</span><span class="nx">violations</span><span class="p">,</span> <span class="nx">Violation</span><span class="p">{</span>
</span></span><span class="line"><span class="ln">40</span><span class="cl">                <span class="nx">Rule</span><span class="p">:</span>    <span class="s">&#34;MD024-siblings_only&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">41</span><span class="cl">                <span class="nx">Node</span><span class="p">:</span>    <span class="nx">h</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">42</span><span class="cl">                <span class="nx">Message</span><span class="p">:</span> <span class="s">&#34;duplicate heading under the same parent: &#34;</span> <span class="o">+</span> <span class="nx">text</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">43</span><span class="cl">                <span class="nx">Prev</span><span class="p">:</span>    <span class="nx">prev</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">44</span><span class="cl">            <span class="p">})</span>
</span></span><span class="line"><span class="ln">45</span><span class="cl">        <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">46</span><span class="cl">            <span class="nx">seen</span><span class="p">[</span><span class="nx">text</span><span class="p">]</span> <span class="p">=</span> <span class="nx">h</span>
</span></span><span class="line"><span class="ln">47</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">48</span><span class="cl">
</span></span><span class="line"><span class="ln">49</span><span class="cl">        <span class="nx">parentMap</span><span class="p">[</span><span class="nx">h</span><span class="p">.</span><span class="nx">Level</span><span class="p">]</span> <span class="p">=</span> <span class="nx">h</span>
</span></span><span class="line"><span class="ln">50</span><span class="cl">        <span class="c1">// 進到更深層時，清空下層的舊狀態</span>
</span></span><span class="line"><span class="ln">51</span><span class="cl">        <span class="k">for</span> <span class="nx">lv</span> <span class="o">:=</span> <span class="nx">h</span><span class="p">.</span><span class="nx">Level</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span> <span class="nx">lv</span> <span class="o">&lt;=</span> <span class="mi">6</span><span class="p">;</span> <span class="nx">lv</span><span class="o">++</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">52</span><span class="cl">            <span class="nb">delete</span><span class="p">(</span><span class="nx">parentMap</span><span class="p">,</span> <span class="nx">lv</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">53</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">54</span><span class="cl">
</span></span><span class="line"><span class="ln">55</span><span class="cl">        <span class="k">return</span> <span class="nx">ast</span><span class="p">.</span><span class="nx">WalkContinue</span><span class="p">,</span> <span class="kc">nil</span>
</span></span><span class="line"><span class="ln">56</span><span class="cl">    <span class="p">})</span>
</span></span><span class="line"><span class="ln">57</span><span class="cl">
</span></span><span class="line"><span class="ln">58</span><span class="cl">    <span class="k">return</span> <span class="nx">violations</span>
</span></span><span class="line"><span class="ln">59</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>對比 regex 版本要自己寫「目前 H2 是誰」狀態機 + 「切回上層時清狀態」— goldmark 的 walker pattern 把階層邏輯外部化到樹結構，rule 本身只處理「同一 parent 下有沒有重複」的核心語義。</p>
<p>幾百行 regex 才能穩定做到的事，AST 版本大概 30 行。規則越多，這個倍率越明顯。</p>
<h2 id="pre-commit-與-ci-整合">Pre-commit 與 CI 整合</h2>
<h3 id="本地開發githookspre-commit-與-githookspre-push">本地開發：<code>.githooks/pre-commit</code> 與 <code>.githooks/pre-push</code></h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="cp">#!/usr/bin/env bash
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="cp"></span><span class="nb">set</span> -euo pipefail
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="c1"># 確保 binary 最新</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="k">if</span> <span class="o">[[</span> ! -x bin/mdtools <span class="o">]]</span> <span class="o">||</span> <span class="o">[[</span> scripts/mdtools/main.go -nt bin/mdtools <span class="o">]]</span><span class="p">;</span> <span class="k">then</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="nb">echo</span> <span class="s2">&#34;Rebuilding mdtools...&#34;</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    <span class="o">(</span><span class="nb">cd</span> scripts/mdtools <span class="o">&amp;&amp;</span> go build -o ../../bin/mdtools .<span class="o">)</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="k">fi</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="c1"># 三段式檢查</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">bin/mdtools fmt --fix   <span class="c1"># 自動修格式；改動會 re-stage</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">git add <span class="k">$(</span>git diff --name-only --cached --diff-filter<span class="o">=</span>AM <span class="p">|</span> grep <span class="s1">&#39;\.md$&#39;</span> <span class="o">||</span> <span class="nb">true</span><span class="k">)</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">
</span></span><span class="line"><span class="ln">14</span><span class="cl">bin/mdtools lint        <span class="c1"># 結構檢查，失敗即阻擋</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">bin/mdtools cards       <span class="c1"># 跨文件檢查，失敗即阻擋</span></span></span></code></pre></div><p><code>pre-push</code> 補上全量 gate：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">make check</span></span></code></pre></div><p>關鍵設計：</p>
<ul>
<li><code>mdtools fmt --fix</code> 會改檔，改完後要 <code>git add</code> 回 staged，否則 commit 進去的還是舊內容。</li>
<li><code>lint</code> 和 <code>cards</code> 不改檔，只讀與報告。</li>
<li><code>pre-commit</code> 保持 staged-file scoped，讓 commit 回饋夠快；<code>pre-push</code> 跑全量 <code>make check</code>，讓本機結果和 CI 同步。</li>
<li>Binary mtime 檢查避免每次 commit 都 rebuild。</li>
<li><code>bin/mdtools</code> 本身 gitignore，不 commit 進 repo。</li>
</ul>
<h3 id="cigithubworkflowsmd-checkyml">CI：<code>.github/workflows/md-check.yml</code></h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">md-check</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">on</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">push, pull_request]</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">  </span><span class="nt">check</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/setup-go@v5</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w"> </span>{<span class="w"> </span><span class="nt">go-version</span><span class="p">:</span><span class="w"> </span><span class="s1">&#39;stable&#39;</span><span class="w"> </span>}<span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Build mdtools</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">(cd scripts/mdtools &amp;&amp; go build -o ../../bin/mdtools .)</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Format check</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">bin/mdtools fmt --check</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Structural lint</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">bin/mdtools lint</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Cross-file completeness</span><span class="w">
</span></span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">bin/mdtools cards</span></span></span></code></pre></div><p>CI 用 <code>--check</code> 而非 <code>--fix</code> — 任何格式偏差都 fail，不自動修（避免 CI 把修復 commit 推回去造成誤會）。</p>
<h3 id="安裝-hook">安裝 hook</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">git config core.hooksPath .githooks
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"># 或用 Makefile target：</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">make install-hooks</span></span></code></pre></div><h2 id="維運成本的長期考量">維運成本的長期考量</h2>
<h3 id="誤判率是規則生命週期的關鍵">誤判率是規則生命週期的關鍵</h3>
<p>每條規則都可能誤判。我們的處理策略寫在規範的規則擴充流程段：</p>
<ol>
<li>新規則先在 <code>internal/rules/</code> 實作為<strong>可開關</strong>（預設關）。</li>
<li>在代表性檔案上測試誤判率。</li>
<li>誤判率 &lt; 1% 且有明確教材品質收益時，預設開啟。</li>
<li>預設開啟後，同步修正既有違規。</li>
</ol>
<p>關鍵在「預設關閉」這一步 — 給規則一個試水期，不會直接擋 commit。</p>
<h3 id="規則與-spec-文件的同步">規則與 spec 文件的同步</h3>
<p>Rule config 在 <code>internal/rules/config.go</code>，spec 文件在 <code>content/posts/markdown-writing-spec.md</code>。兩者修改時必須同步，否則會出現「spec 寫的規則跟工具實際跑的規則不同步」的沉默 bug。</p>
<p>這是目前靠紀律維持的部分。未來如果發現同步偏差重複發生，可以考慮從 config.go 產生 spec 的片段（或反過來）。目前手動同步的成本還可接受。</p>
<h3 id="規則數量的預期曲線">規則數量的預期曲線</h3>
<p>當前覆蓋 22 條 rule-config 條目。接下來加規則的收益會遞減 — 大部分重要的基礎格式 + 結構 + 跨文件檢查都已在內。未來新增應該集中在：</p>
<ul>
<li>新內容類型帶來的 schema 擴充（例如做 podcast 或者 video posts）。</li>
<li>術語字典完成後的 <strong>L3 術語覆蓋</strong>（正文首次出現術語自動連卡片）。</li>
<li>特定領域的品質檢查（例如紅隊教材「每個案例必須有 3 來源」）。</li>
</ul>
<p>基礎 markdownlint 規則能加的都加完了，再追規則就是在吸邊際收益極低的條目，不值得。</p>
<h2 id="延伸閱讀">延伸閱讀</h2>
<ul>
<li><a href="/blog/posts/%E4%BB%80%E9%BA%BC%E6%98%AF-ast-%E5%BE%9E%E5%AD%97%E4%B8%B2%E5%88%B0%E8%AA%9E%E6%B3%95%E6%A8%B9%E7%9A%84%E8%A6%96%E8%A7%92%E8%BD%89%E6%8F%9B/" data-link-title="什麼是 AST — 從字串到語法樹的視角轉換" data-link-desc="AST 與 regex 的差異判準：規則需要知道文字處在什麼結構中時 regex 就不夠。附 regex 誤判的具體 case。">什麼是 AST — 從字串到語法樹的視角轉換</a> — 為什麼要升級到 AST 工具鏈</li>
<li><a href="/blog/posts/blog-markdown-%E5%AF%AB%E4%BD%9C%E8%A6%8F%E7%AF%84%E8%88%87-mdtools-%E6%AA%A2%E6%9F%A5/" data-link-title="Blog Markdown 寫作規範與 mdtools 檢查" data-link-desc="本 blog 的 Markdown 排版規範權威契約。涵蓋 H1 禁用、MD024 siblings_only、反釣魚 TLD 校驗、卡片雙向完整性、front matter schema；改規則時要與 scripts/mdtools 實作同步。">Blog Markdown 寫作規範與 mdtools 檢查</a> — mdtools 檢查的完整規則清單</li>
<li><a href="https://github.com/yuin/goldmark">goldmark 官方 repo</a> — Hugo 所用的 markdown parser</li>
<li><a href="https://pkg.go.dev/github.com/yuin/goldmark/ast">goldmark AST package reference</a> — <code>ast.Walk</code>、節點型別、parent traversal API</li>
</ul>
]]></content:encoded></item></channel></rss>