<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Autovacuum on Tarragon</title><link>https://tarrragon.github.io/blog/tags/autovacuum/</link><description>Recent content in Autovacuum on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Mon, 18 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/autovacuum/index.xml" rel="self" type="application/rss+xml"/><item><title>PostgreSQL autovacuum tuning: why your autovacuum never catches up with bloat</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/</link><pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/</guid><description>&lt;blockquote>
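&lt;p>This article is an implementation-layer deep dive that follows the &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL&lt;/a> overview. The overview already explains why PostgreSQL's MVCC needs vacuum; this article focuses on the root causes of &lt;em>why autovacuum cannot keep up under a production write-heavy workload&lt;/em>, and on tuning along each dimension.&lt;/p>&lt;/blockquote>
&lt;h2 id="你的-autovacuum-永遠追不上-bloat--為什麼">Why your autovacuum never catches up with bloat&lt;/h2>
&lt;p>A common story for a write-heavy table: 10GB at launch, 30GB after 3 months, 80GB after 6 months. The DBA checks &lt;code>pg_stat_user_tables&lt;/code> and finds &lt;code>n_dead_tup&lt;/code> higher than &lt;code>n_live_tup&lt;/code>; &lt;code>pg_stat_progress_vacuum&lt;/code> shows autovacuum running constantly, yet the dead tuples are never fully cleaned up. The table holds only 5M rows but occupies 80GB on disk.&lt;/p>
&lt;p>This is not a PostgreSQL bug. It is the design intent behind autovacuum's &lt;em>conservative default cost-based throttling&lt;/em>: autovacuum should not hurt OLTP query performance, so it sleeps after each slice of work. With the defaults (an effective cost limit of 200, because &lt;code>autovacuum_vacuum_cost_limit=-1&lt;/code> falls back to &lt;code>vacuum_cost_limit=200&lt;/code>, plus &lt;code>autovacuum_vacuum_cost_delay=2ms&lt;/code> on PostgreSQL 12 and later), cleanup on a write-heavy table (thousands of UPDATEs per second) stays &lt;em>permanently slower&lt;/em> than dead tuple generation. The defaults suit read-heavy / write-light workloads; write-heavy OLTP has to tune them.&lt;/p>
&lt;h2 id="mvcc-跟-dead-tuplevacuum-在解什麼">MVCC and dead tuples: what vacuum solves&lt;/h2>
&lt;p>Under PostgreSQL MVCC, every UPDATE is &lt;em>insert a new row version + mark the old one as deleted&lt;/em>; a DELETE &lt;em>marks the row as deleted without releasing its space immediately&lt;/em>. Dead tuples take up space on disk even though no query can see them. Autovacuum's responsibilities:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Reclaim dead tuple space&lt;/strong> so new rows can reuse it (the space is recorded in the free space map; the table file does not shrink)&lt;/li>
&lt;li>&lt;strong>Update the visibility map&lt;/strong> so index-only scans can skip the heap fetch&lt;/li>
&lt;li>&lt;strong>Freeze the xids of old rows&lt;/strong> (freeze) to avoid an xid wraparound disaster&lt;/li>
&lt;li>&lt;strong>Clean up B-tree indexes&lt;/strong>: remove index entries that point to dead tuples (fully emptied pages are recycled for reuse, but the index file does not shrink)&lt;/li>
&lt;/ol>
&lt;p>Vacuum does not shrink the table, apart from truncating completely empty pages at the end of the file. To really shrink it, run &lt;code>VACUUM FULL&lt;/code> (takes an ACCESS EXCLUSIVE lock on the whole table, so it is not an option on a busy production table) or &lt;code>pg_repack&lt;/code> (an online repack tool). Expect vacuum to &lt;em>stop the table from growing&lt;/em>, not to &lt;em>make it smaller&lt;/em>.&lt;/p>
&lt;h2 id="tuningcost-based-throttle-跟-trigger-threshold">Tuning: cost-based throttle and trigger threshold&lt;/h2>
&lt;h3 id="cost-based-throttle全-instance">Cost-based throttle (instance-wide)&lt;/h3>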





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-ini" data-lang="ini">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># postgresql.conf&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="na">autovacuum_vacuum_cost_limit&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s">2000 # 預設 200、production 拉 5-10 倍&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="na">autovacuum_vacuum_cost_delay&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s">2ms # 預設 2ms、不太需要動&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="na">autovacuum_max_workers&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s">6 # 預設 3、CPU 多時拉到 6-10&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">&lt;span class="na">maintenance_work_mem&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s">1GB # 預設 64MB、單一 vacuum 用的記憶體&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>直覺：&lt;/p></description><content:encoded><![CDATA[<blockquote>
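<p>This article is an implementation-layer deep dive that follows the <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a> overview. The overview already explains why PostgreSQL's MVCC needs vacuum; this article focuses on the root causes of <em>why autovacuum cannot keep up under a production write-heavy workload</em>, and on tuning along each dimension.</p></blockquote>
<h2 id="你的-autovacuum-永遠追不上-bloat--為什麼">Why your autovacuum never catches up with bloat</h2>
<p>A common story for a write-heavy table: 10GB at launch, 30GB after 3 months, 80GB after 6 months. The DBA checks <code>pg_stat_user_tables</code> and finds <code>n_dead_tup</code> higher than <code>n_live_tup</code>; <code>pg_stat_progress_vacuum</code> shows autovacuum running constantly, yet the dead tuples are never fully cleaned up. The table holds only 5M rows but occupies 80GB on disk.</p>
<p>This is not a PostgreSQL bug. It is the design intent behind autovacuum's <em>conservative default cost-based throttling</em>: autovacuum should not hurt OLTP query performance, so it sleeps after each slice of work. With the defaults (an effective cost limit of 200, because <code>autovacuum_vacuum_cost_limit=-1</code> falls back to <code>vacuum_cost_limit=200</code>, plus <code>autovacuum_vacuum_cost_delay=2ms</code> on PostgreSQL 12 and later), cleanup on a write-heavy table (thousands of UPDATEs per second) stays <em>permanently slower</em> than dead tuple generation. The defaults suit read-heavy / write-light workloads; write-heavy OLTP has to tune them.</p>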
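<p>To see which throttle values an instance actually runs with, the settings can be read from <code>pg_settings</code>. This is a minimal sketch that uses only standard catalog columns; a value of <code>-1</code> for <code>autovacuum_vacuum_cost_limit</code> means the workers fall back to <code>vacuum_cost_limit</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Effective autovacuum throttle settings and where each value comes from
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('autovacuum_vacuum_cost_limit',
               'vacuum_cost_limit',
               'autovacuum_vacuum_cost_delay',
               'autovacuum_max_workers',
               'maintenance_work_mem',
               'autovacuum_work_mem')
ORDER BY name;
</code></pre></div>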
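<h2 id="mvcc-跟-dead-tuplevacuum-在解什麼">MVCC and dead tuples: what vacuum solves</h2>
<p>Under PostgreSQL MVCC, every UPDATE is <em>insert a new row version + mark the old one as deleted</em>; a DELETE <em>marks the row as deleted without releasing its space immediately</em>. Dead tuples take up space on disk even though no query can see them. Autovacuum's responsibilities:</p>
<ol>
<li><strong>Reclaim dead tuple space</strong> so new rows can reuse it (the space is recorded in the free space map; the table file does not shrink)</li>
<li><strong>Update the visibility map</strong> so index-only scans can skip the heap fetch</li>
<li><strong>Freeze the xids of old rows</strong> (freeze) to avoid an xid wraparound disaster</li>
<li><strong>Clean up B-tree indexes</strong>: remove index entries that point to dead tuples (fully emptied pages are recycled for reuse, but the index file does not shrink)</li>
</ol>
<p>Vacuum does not shrink the table, apart from truncating completely empty pages at the end of the file. To really shrink it, run <code>VACUUM FULL</code> (takes an ACCESS EXCLUSIVE lock on the whole table, so it is not an option on a busy production table) or <code>pg_repack</code> (an online repack tool). Expect vacuum to <em>stop the table from growing</em>, not to <em>make it smaller</em>.</p>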
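<p>Because vacuum frees space for reuse rather than returning it to the OS, the on-disk size and the tuple counters have to be read together. A minimal sketch, assuming the statistics views are reasonably fresh:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Heap, index and total size next to the tuple counters for the largest tables
SELECT s.relname,
       s.n_live_tup,
       s.n_dead_tup,
       pg_size_pretty(pg_relation_size(s.relid))       AS heap_size,
       pg_size_pretty(pg_indexes_size(s.relid))        AS index_size,
       pg_size_pretty(pg_total_relation_size(s.relid)) AS total_size
FROM pg_stat_user_tables s
ORDER BY pg_total_relation_size(s.relid) DESC
LIMIT 20;
</code></pre></div>
<p>A table whose size is far out of proportion to <code>n_live_tup</code> while <code>n_dead_tup</code> is low has already been vacuumed but is still carrying reusable free space; only a rewrite gives that space back.</p>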
<h2 id="tuningcost-based-throttle-跟-trigger-threshold">Tuning: cost-based throttle and trigger threshold</h2>
<h3 id="cost-based-throttle全-instance">Cost-based throttle (instance-wide)</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># postgresql.conf</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="na">autovacuum_vacuum_cost_limit</span> <span class="o">=</span> <span class="s">2000          # default -1 (uses vacuum_cost_limit = 200); raise 5-10x in production</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="na">autovacuum_vacuum_cost_delay</span> <span class="o">=</span> <span class="s">2ms            # default 2ms; rarely needs changing</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="na">autovacuum_max_workers</span> <span class="o">=</span> <span class="s">6                    # default 3; raise to 6-10 on machines with many cores</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="na">maintenance_work_mem</span> <span class="o">=</span> <span class="s">1GB                    # default 64MB; memory available to a single vacuum</span></span></span></code></pre></div><p>Intuition:</p>
<ul>
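<li><code>cost_limit</code> is how much "cost" each cycle may spend before sleeping; cost is the sum of page hit / miss / dirty charges. Raising it means more pages processed per cycle</li>
<li>Raising <code>cost_limit</code> is more direct than lowering <code>cost_delay</code>: once the delay is too small (&lt; 1ms), OS scheduler jitter makes it unreliable</li>
<li><code>max_workers</code> caps how many autovacuum workers run at once; with many partitions it fills up easily and has to be raised. The cost limit is shared among the running workers, so adding workers without raising <code>cost_limit</code> makes each one slower</li>
<li><code>maintenance_work_mem</code> affects index vacuum speed; 1-2GB is the sweet spot on SSDs. Before PostgreSQL 17 a single vacuum uses at most 1GB for dead tuple tracking, so values above 1GB only help on 17 and later</li>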
</ul>
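<p>The throttle arithmetic can be worked through directly. The sketch below assumes 8KB pages and the worst case where every page is charged <code>vacuum_cost_page_dirty</code> (default 20); the parameter rows are illustrative. It ignores the time vacuum spends working between sleeps and the fact that the budget is shared among running workers, so the result is an upper bound, not a measurement.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Upper bound on vacuum throughput for a given cost budget
WITH params(label, cost_limit, delay_ms, page_cost) AS (
  VALUES ('default',              200, 2, 20),
         ('cost_limit = 2000',   2000, 2, 20),
         ('cost_limit = 10000', 10000, 2, 20)
)
SELECT label,
       cost_limit / page_cost                 AS pages_per_cycle,
       round(cost_limit::numeric / page_cost  -- pages per cycle
             * (1000.0 / delay_ms)            -- cycles per second
             * 8192 / (1024 * 1024), 1)       AS max_mb_per_s
FROM params;
</code></pre></div>
<p>With the defaults this works out to about 39 MB/s of dirtied pages, and the ceiling scales linearly with <code>cost_limit</code>.</p>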
<h3 id="per-table-override精準到-hot-table">Per-table override (targeting hot tables)</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- Tighten autovacuum on a hot write-heavy table
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events</span><span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="n">autovacuum_vacuum_scale_factor</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">.</span><span class="mi">05</span><span class="p">,</span><span class="w">      </span><span class="c1">-- default 0.2; trigger at 5% dead
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="c1"></span><span class="w">  </span><span class="n">autovacuum_vacuum_threshold</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">1000</span><span class="p">,</span><span class="w">          </span><span class="c1">-- default 50; absolute floor
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="c1"></span><span class="w">  </span><span class="n">autovacuum_vacuum_cost_limit</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">5000</span><span class="p">,</span><span class="w">         </span><span class="c1">-- this table gets its own cost_limit
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="c1"></span><span class="w">  </span><span class="n">autovacuum_analyze_scale_factor</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">.</span><span class="mi">05</span><span class="p">,</span><span class="w">      </span><span class="c1">-- analyze follows suit
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="c1"></span><span class="w">  </span><span class="n">autovacuum_freeze_max_age</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">100000000</span><span class="w">        </span><span class="c1">-- trigger anti-wraparound vacuum earlier
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="c1"></span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w"></span><span class="c1">-- Vacuum an append-only table (log table) less often
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">audit_log</span><span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">  </span><span class="n">autovacuum_vacuum_scale_factor</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">.</span><span class="mi">5</span><span class="p">,</span><span class="w">        </span><span class="c1">-- trigger only at 50% dead (UPDATE / DELETE are rare)
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="c1"></span><span class="w">  </span><span class="n">autovacuum_freeze_max_age</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">1000000000</span><span class="w">       </span><span class="c1">-- postpone freeze; ignored if above the instance-wide value
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="c1"></span><span class="p">);</span></span></span></code></pre></div><p>Key point: <em>hot tables tighter than the default, cold tables looser than the default</em>; do not apply one configuration to every table. A production cluster typically has 5-20 hot tables that need per-table tuning.</p>
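<p>Autovacuum picks up a table once <code>n_dead_tup</code> exceeds <code>threshold + scale_factor × reltuples</code>. The sketch below computes that trigger point from the instance-wide settings only; per-table overrides stored in <code>pg_class.reloptions</code> are listed but not applied to the calculation. It is meant as a quick way to spot large tables where the default 20% comes far too late.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Dead tuples versus the instance-wide autovacuum trigger point
SELECT s.relname,
       s.n_dead_tup,
       round(current_setting('autovacuum_vacuum_threshold')::int
             + current_setting('autovacuum_vacuum_scale_factor')::numeric
               * greatest(c.reltuples, 0)::numeric) AS default_trigger_at,
       c.reloptions                                  AS per_table_overrides
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC
LIMIT 20;
</code></pre></div>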
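<h2 id="production-故障演練">Production failure drills</h2>
<h3 id="case-1write-heavy-hot-tableautovacuum-永遠跑不完">Case 1: write-heavy hot table, autovacuum never finishes</h3>
<p><strong>Symptoms</strong>: <code>pg_stat_user_tables.n_dead_tup</code> stays above <code>n_live_tup</code>; <code>pg_stat_progress_vacuum</code> shows a vacuum on one table still in <code>scanning heap</code> after 6+ hours; the table keeps growing.</p>
<p><strong>Root cause</strong>: at the default <code>cost_limit=200</code>, vacuum throughput on this table is lower than the dead tuple generation rate at its write rate (~5000 UPDATE/s). A single autovacuum pass over the whole table takes 12 hours, and by the time it finishes the table has crossed the 5% bloat trigger again, so the next round starts right away.</p>
<p><strong>Fix</strong>:</p>
<ol>
<li>Run <code>ALTER TABLE ... SET (autovacuum_vacuum_cost_limit = 10000)</code> on that table: its vacuum is then throttled by its own limit instead of the instance-wide one</li>
<li>Raise <code>maintenance_work_mem</code> to 2GB (applies per vacuum)</li>
<li>Short term: run a manual <code>VACUUM (VERBOSE, ANALYZE) events;</code> in a maintenance window to catch up</li>
<li>Long term: consider partitioning. After partitioning, vacuum only touches the recent partitions instead of scanning the whole table</li>
</ol>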
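<p>The first step can be applied and checked roughly as follows. This is a sketch that reuses the <code>events</code> table name from the earlier example; the progress query returns a row only while a vacuum is running on that table.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Per-table throttle override
ALTER TABLE events SET (autovacuum_vacuum_cost_limit = 10000);

-- Confirm the override is stored on the table
SELECT relname, reloptions FROM pg_class WHERE relname = 'events';

-- Heap scan progress of a vacuum currently running on the table
SELECT p.pid,
       p.phase,
       p.heap_blks_scanned,
       p.heap_blks_total,
       round(100.0 * p.heap_blks_scanned
             / nullif(p.heap_blks_total, 0), 1) AS scanned_pct
FROM pg_stat_progress_vacuum p
WHERE p.relid = 'events'::regclass;
</code></pre></div>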
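<h3 id="case-2長-transaction-卡住-vacuum-的-xmin-horizon">Case 2: a long transaction pins vacuum's xmin horizon</h3>
<p><strong>Symptoms</strong>: autovacuum appears to run, but <code>n_dead_tup</code> does not drop; <code>pg_stat_activity</code> shows a session whose transaction has been open for 8 hours (a long report SELECT, or a session sitting idle in transaction).</p>
<p><strong>Root cause</strong>: vacuum can only reclaim dead tuples that no active transaction can still see. The xmin of a long transaction pins the range vacuum is allowed to reclaim; autovacuum can run nonstop and still reclaim 0 rows.</p>
<p><strong>Fix</strong>:</p>
<ol>
<li><strong>Prevent</strong>: set <code>statement_timeout</code> + <code>idle_in_transaction_session_timeout</code> (30 minutes) for application sessions to force long transactions to end</li>
<li><strong>Detect</strong>: periodically run <code>SELECT pid, now() - xact_start FROM pg_stat_activity WHERE state = 'idle in transaction'</code>; long-running <code>active</code> sessions are not matched by this filter and need a separate check</li>
<li><strong>Stopgap</strong>: kill the long transaction (<code>pg_cancel_backend(pid)</code> cancels a running query; an idle-in-transaction session needs <code>pg_terminate_backend(pid)</code>); the next autovacuum run can then reclaim</li>
<li><strong>Architecture</strong>: run report queries on a standby instead of opening long transactions on the primary (with <code>hot_standby_feedback = on</code>, queries on the standby still hold back the primary's xmin)</li>
</ol>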
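<p>A broader detection query can rank every session by how far back it holds the horizon, whatever its state. The sketch below also checks replication slots and prepared transactions, which can pin xmin in the same way; <code>app_user</code> is a hypothetical role name used only for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Sessions holding back the xmin horizon, oldest first
SELECT pid,
       state,
       now() - xact_start                            AS xact_age,
       greatest(age(backend_xmin), age(backend_xid)) AS horizon_age,
       left(query, 60)                               AS query_head
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL OR backend_xid IS NOT NULL
ORDER BY horizon_age DESC
LIMIT 10;

-- Replication slots and prepared transactions can pin xmin as well
SELECT slot_name, active, age(xmin) AS xmin_age FROM pg_replication_slots;
SELECT gid, prepared FROM pg_prepared_xacts;

-- Timeout for an application role (app_user is a hypothetical name)
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '30min';
</code></pre></div>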
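<h3 id="case-3anti-wraparound-vacuum-在-peak-觸發">Case 3: anti-wraparound vacuum kicks in at peak</h3>
<p><strong>Symptoms</strong>: during peak traffic PostgreSQL CPU hits 100%, <code>pg_stat_progress_vacuum</code> shows a vacuum in progress that <code>pg_stat_activity</code> labels <code>(to prevent wraparound)</code>, and application latency spikes. If the cluster has drifted much closer to wraparound, the log also shows <code>database &quot;myapp&quot; must be vacuumed within X transactions</code>.</p>
<p><strong>Root cause</strong>: once a table's age reaches autovacuum_freeze_max_age (default 200M), PostgreSQL <em>forces</em> an anti-wraparound vacuum, even at peak and even if autovacuum is disabled for that table. This vacuum <em>is not cancelled automatically when it conflicts with other lock requests</em>; it runs to completion, takes hours on a large table, and competes with OLTP queries for IO. It is still subject to cost-based throttling; only the failsafe (PostgreSQL 14+, <code>vacuum_failsafe_age</code>) drops the throttle.</p>
<p><strong>Fix</strong>:</p>
<ol>
<li><strong>Prevent</strong>: raise <code>autovacuum_freeze_max_age</code> to 1B (1 billion; needs a server restart) so freezing has more time to happen naturally off-peak</li>
<li><strong>Per-table freeze</strong>: set <code>autovacuum_freeze_max_age = 100000000</code> on hot tables (freeze early, off-peak) and 800M on cold tables (avoid unnecessary freezes); per-table values above the instance-wide setting are ignored</li>
<li><strong>Emergency</strong>: run <code>VACUUM (FREEZE, VERBOSE) table_name;</code> manually in a maintenance window to freeze ahead of time</li>
<li><strong>Monitor</strong>: <code>SELECT relname, age(relfrozenxid) FROM pg_class WHERE relkind = 'r' ORDER BY age(relfrozenxid) DESC LIMIT 20;</code> shows which tables are approaching wraparound</li>
</ol>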
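<p>To tell a forced anti-wraparound vacuum from an ordinary one, and to see how much freeze budget each table has left, something like the following can help. It is a sketch; the percentage is measured against the instance-wide <code>autovacuum_freeze_max_age</code> and does not account for per-table overrides.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Autovacuum workers currently running an anti-wraparound vacuum
SELECT pid, now() - xact_start AS running_for, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker'
  AND query LIKE '%(to prevent wraparound)%';

-- How much of the freeze budget each table has used
SELECT c.oid::regclass     AS table_name,
       age(c.relfrozenxid) AS xid_age,
       round(100.0 * age(c.relfrozenxid)
             / current_setting('autovacuum_freeze_max_age')::numeric, 1)
                           AS pct_of_freeze_max_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY age(c.relfrozenxid) DESC
LIMIT 20;
</code></pre></div>
<p>Tables near 100% are the ones to freeze manually in the next maintenance window.</p>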
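<h3 id="case-4partition-table-把-autovacuum_max_workers-跑滿">Case 4: partitioned table saturates autovacuum_max_workers</h3>
<p><strong>Symptoms</strong>: after partitioning (time-based, 12 monthly partitions), autovacuum gets slow; <code>pg_stat_activity</code> shows all 3 autovacuum workers busy on partitions while other hot tables wait in the queue for a long time.</p>
<p><strong>Root cause</strong>: <code>autovacuum_max_workers=3</code> is the default, and each partition counts as an independent table. If, say, 50 out of 100 partitions need vacuum, the workers are saturated and every other table queues.</p>
<p><strong>Fix</strong>:</p>
<ol>
<li>Raise <code>autovacuum_max_workers</code> to 6-10 (depending on CPU core count)</li>
<li>Set <code>autovacuum_enabled = false</code> on cold partitions (old partitions that are no longer written) to reduce worker contention; anti-wraparound vacuum still runs on them, so freeze them first</li>
<li>Keep the partition count itself in check: 100+ partitions is a signal to re-evaluate the partition strategy</li>
</ol>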
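<p>Worker saturation can be confirmed before changing anything, and a cold partition can be frozen once before it is taken out of routine autovacuum. A sketch; <code>events_2025_01</code> is a hypothetical partition name.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Busy autovacuum workers against the configured cap
SELECT count(*)                                       AS busy_workers,
       current_setting('autovacuum_max_workers')::int AS max_workers
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';

-- Freeze a cold partition once, then exclude it from routine autovacuum
-- (events_2025_01 is a hypothetical partition name)
VACUUM (FREEZE, ANALYZE) events_2025_01;
ALTER TABLE events_2025_01 SET (autovacuum_enabled = false);
</code></pre></div>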
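<h3 id="case-5index-bloat-沒被-vacuum-處理">Case 5: index bloat that vacuum does not fix</h3>
<p><strong>Symptoms</strong>: the table vacuum has finished and <code>n_dead_tup</code> is 0, but the index keeps growing; queries that use the index get slower and slower, approaching the cost of a sequential scan.</p>
<p><strong>Root cause</strong>: autovacuum removes dead entries from <em>heap</em> (table data) pages and <em>index leaf pages</em>, but it does not merge or compact the B-tree. A leaf page is only recycled once it is completely empty, so sparsely filled pages stay in the index and queries still traverse them, which costs extra IO.</p>
<p><strong>Fix</strong>:</p>
<ol>
<li><code>REINDEX CONCURRENTLY</code> rebuilds the index online (PG 12+) without blocking writes</li>
<li>Monitor index bloat with the <code>pgstattuple</code> extension (<code>pgstatindex</code> for B-tree indexes, <code>pgstattuple_approx</code> for tables)</li>
<li>Prevent: avoid B-tree indexes on columns that are updated heavily (typical case: a status column); consider a <em>partial index</em> or a <em>hash index</em> (WAL-logged since PG 10)</li>
<li>For heavily bloated indexes, rebuild with <code>pg_repack</code> (online, with only brief exclusive locks; by default it requires superuser)</li>
</ol>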
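<p>Measuring before rebuilding avoids unnecessary work. The sketch below uses <code>pgstatindex</code> from the <code>pgstattuple</code> extension; <code>events_status_idx</code> is a hypothetical index name. The function reads the whole index, so it is better run off-peak.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- events_status_idx is a hypothetical index name
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Low avg_leaf_density points to sparsely filled leaf pages
SELECT index_size, leaf_pages, empty_pages, deleted_pages,
       avg_leaf_density, leaf_fragmentation
FROM pgstatindex('events_status_idx');

-- Rebuild online (PostgreSQL 12+); cannot run inside a transaction block
REINDEX INDEX CONCURRENTLY events_status_idx;
</code></pre></div>
<p>A freshly built B-tree sits near its fillfactor (90 by default), so a density far below that is the signal to act on.</p>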
<h2 id="容量規劃">Capacity planning</h2>
<p>Measure vacuum capacity by whether it <em>keeps up with the dead tuple generation rate</em>:</p>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th>How to estimate</th>
          <th>Watch for</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Dead tuple generation rate</td>
          <td><code>UPDATE/s + DELETE/s</code>, plus rolled-back INSERTs (HOT updates can be pruned in-page without waiting for vacuum)</td>
          <td>Compare against the vacuum rate</td>
      </tr>
      <tr>
          <td>Vacuum processing rate</td>
          <td><code>cost_limit / cost_per_page / cost_delay × page_size</code>, on the order of MB/s</td>
          <td>Compare against the dead tuple rate</td>
      </tr>
      <tr>
          <td>autovacuum_max_workers</td>
          <td>partition count + hot table count / 3-5</td>
          <td>100+ partitions: workers must be raised</td>
      </tr>
      <tr>
          <td>maintenance_work_mem</td>
          <td>1-2GB per vacuum worker</td>
          <td>Size the memory ceiling for when all workers run at once</td>
      </tr>
      <tr>
          <td>Anti-wraparound trigger frequency</td>
          <td>Default 200M xids; about once every 1-2 weeks on a write-heavy system</td>
          <td>About once every 2-3 months after raising to 1B</td>
      </tr>
      <tr>
          <td>Bloat ratio</td>
          <td><code>pg_stat_user_tables.n_dead_tup / n_live_tup</code></td>
          <td>&gt; 50% means vacuum is not keeping up</td>
      </tr>
  </tbody>
</table>
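<p>The memory ceiling in the table can be estimated from the live configuration. This sketch assumes every worker fills its dead tuple buffer at the same time, which is the worst case; <code>autovacuum_work_mem = -1</code> means workers fall back to <code>maintenance_work_mem</code>, and both settings are reported in kB by <code>pg_settings</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Worst-case autovacuum memory: workers x per-worker buffer
SELECT w.setting::int                                        AS max_workers,
       pg_size_pretty(per_worker.kb * 1024)                  AS per_worker,
       pg_size_pretty(w.setting::int * per_worker.kb * 1024) AS worst_case_total
FROM pg_settings w
CROSS JOIN (
  SELECT CASE WHEN a.setting::bigint = -1 THEN m.setting::bigint
              ELSE a.setting::bigint END AS kb
  FROM pg_settings a, pg_settings m
  WHERE a.name = 'autovacuum_work_mem'
    AND m.name = 'maintenance_work_mem'
) per_worker
WHERE w.name = 'autovacuum_max_workers';
</code></pre></div>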
<p>Practical defaults:</p>
<ul>
<li>OLTP write-heavy (events / orders): cost_limit 2000-5000, scale_factor 0.05, freeze_max_age 100M</li>
<li>OLTP read-heavy (user / config): the defaults are fine</li>
<li>Append-only log: scale_factor 0.5, freeze_max_age 800M, <code>autovacuum_enabled = false</code> for cold partitions</li>
</ul>
<h2 id="整合--下一步">Integration / next steps</h2>
<h3 id="跟-partitioning-整合">Integrating with <a href="/blog/backend/01-database/vendors/postgresql/declarative-partitioning/" data-link-title="PostgreSQL declarative partitioning：partition 不是切表、是讓 planner pruning" data-link-desc="Declarative partitioning 的真實價值是 query planner pruning &#43; maintenance scope 縮小、不是「把大表切小」；RANGE / LIST / HASH 取捨、partition key 選法、5 個 production 踩雷（key 選錯不 prune / unique 不 enforce 跨 partition / ATTACH 鎖太久 / partition 數爆 / DETACH 不 reclaim 空間）、跟 autovacuum &#43; index 設計整合">partitioning</a></h3>
<p>Partitioning is the long-term answer to vacuum problems:</p>
<ul>
<li>On a large table (&gt; 100GB) vacuum time grows linearly with size; after partitioning, vacuum only touches the recent partitions</li>
<li>Cold partitions can be switched off entirely with <code>autovacuum_enabled = false</code>; new data only lands in hot partitions</li>
<li>Downside: when the partition count explodes, autovacuum_max_workers has to be raised too</li>
</ul>
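<p>Whether vacuum really concentrates on the recent partitions can be checked per partition. A sketch, assuming <code>events</code> is the partitioned parent; it lists direct children only, so sub-partitions would need a recursive walk.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Dead tuples and last autovacuum per partition of a partitioned table
SELECT i.inhrelid::regclass AS partition_name,
       s.n_live_tup,
       s.n_dead_tup,
       s.last_autovacuum
FROM pg_inherits i
JOIN pg_stat_user_tables s ON s.relid = i.inhrelid
WHERE i.inhparent = 'events'::regclass
ORDER BY s.n_dead_tup DESC;
</code></pre></div>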
<h3 id="跟-monitoring-整合">Integrating with monitoring</h3>
<p>Key metrics:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- dead tuple ratio
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">relname</span><span class="p">,</span><span class="w"> </span><span class="n">n_dead_tup</span><span class="p">,</span><span class="w"> </span><span class="n">n_live_tup</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">       </span><span class="n">round</span><span class="p">(</span><span class="n">n_dead_tup</span><span class="p">::</span><span class="nb">numeric</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="k">nullif</span><span class="p">(</span><span class="n">n_live_tup</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">100</span><span class="p">,</span><span class="w"> </span><span class="mi">1</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">dead_pct</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">pg_stat_user_tables</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="n">n_live_tup</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">1000</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w"></span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">n_dead_tup</span><span class="w"> </span><span class="k">DESC</span><span class="w"> </span><span class="k">LIMIT</span><span class="w"> </span><span class="mi">20</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="c1">-- vacuum progress
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">pg_stat_progress_vacuum</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w"></span><span class="c1">-- distance to xid wraparound
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">datname</span><span class="p">,</span><span class="w"> </span><span class="n">age</span><span class="p">(</span><span class="n">datfrozenxid</span><span class="p">)</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">pg_database</span><span class="w"> </span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">age</span><span class="w"> </span><span class="k">DESC</span><span class="p">;</span></span></span></code></pre></div><p>Three Prometheus alerts: <code>dead_pct &gt; 30</code>, <code>vacuum_running_seconds &gt; 3600</code>, <code>xid_age &gt; 500000000</code>.</p>
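<p>The <code>vacuum_running_seconds</code> alert needs a source query. One possible sketch joins the progress view to <code>pg_stat_activity</code>; how an exporter maps the result to a metric is an assumption left open here, and the duration is approximated from the start of the worker's current transaction.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- One row per vacuum running in the current database, with its duration
SELECT p.pid,
       p.relid::regclass AS table_name,
       p.phase,
       extract(epoch FROM now() - a.xact_start)::bigint AS vacuum_running_seconds
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a ON a.pid = p.pid
WHERE p.datname = current_database();
</code></pre></div>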
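<h3 id="跟-backup-window">Integrating with the backup window</h3>
<p>Run VACUUM FREEZE in the maintenance window before the backup, so the backup captures pages that are already frozen and a restored cluster does not have to redo that freeze work:</p>
<ol>
<li>Run <code>VACUUM (FREEZE, ANALYZE) hot_table</code> in the weekly maintenance window: it freezes ahead of time and refreshes stats</li>
<li>Avoid long transactions before the backup so that vacuum can make progress</li>
</ol>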
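<h3 id="下一步議題">Next topics</h3>
<ul>
<li><strong>HOT updates and fillfactor</strong>: an UPDATE can reuse free space on the same page; fillfactor 80 leaves a 20% buffer on hot tables</li>
<li><strong><code>pg_repack</code> vs <code>VACUUM FULL</code></strong>: online vs offline; choosing a long-term maintenance tool</li>
<li><strong>Parallel vacuum (PostgreSQL 13+)</strong>: index vacuuming runs in parallel for manual <code>VACUUM</code> (autovacuum does not use it); large tables with several indexes benefit most</li>
</ul>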
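<p>As a starting point for the HOT / fillfactor topic, the share of updates that take the HOT path is already exposed in the statistics views. A sketch, reusing the <code>events</code> table name from the earlier examples; a changed fillfactor only applies to pages written afterwards.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Share of UPDATEs that took the HOT path
SELECT relname, n_tup_upd, n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 1) AS hot_pct
FROM pg_stat_user_tables
WHERE n_tup_upd &gt; 0
ORDER BY n_tup_upd DESC
LIMIT 20;

-- Leave 20% of each heap page free for future row versions
ALTER TABLE events SET (fillfactor = 80);
</code></pre></div>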
<h2 id="相關連結">Related links</h2>
<ul>
<li>Upstream vendor page: <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a></li>
<li>Upstream chapter: <a href="/blog/backend/01-database/high-concurrency-access/" data-link-title="1.1 高併發下的 SQL 讀寫邊界" data-link-desc="說明高併發服務如何共用資料庫 client、控制 transaction、管理 connection pool、避免資料庫成為瓶頸">High Concurrency Access</a> (vacuum is one part of concurrency governance)</li>
<li>Sibling deep articles: <a href="/blog/backend/01-database/vendors/postgresql/patroni-ha/" data-link-title="PostgreSQL Patroni HA：從 leader 失聯到 client 重連的 5 段 failover lifecycle" data-link-desc="Patroni 把 PostgreSQL HA 拆成 detection / election / promotion / reconfiguration / recovery 五段 lifecycle、每段都有獨立配置跟 failure mode；DCS quorum &#43; watchdog 防 split-brain、async/sync replication 取捨、5 個 production 踩雷、跟 PgBouncer / HAProxy / cert-manager 整合">Patroni HA</a> / <a href="/blog/backend/01-database/vendors/postgresql/declarative-partitioning/" data-link-title="PostgreSQL declarative partitioning：partition 不是切表、是讓 planner pruning" data-link-desc="Declarative partitioning 的真實價值是 query planner pruning &#43; maintenance scope 縮小、不是「把大表切小」；RANGE / LIST / HASH 取捨、partition key 選法、5 個 production 踩雷（key 選錯不 prune / unique 不 enforce 跨 partition / ATTACH 鎖太久 / partition 數爆 / DETACH 不 reclaim 空間）、跟 autovacuum &#43; index 設計整合">Declarative Partitioning</a> / <a href="/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/" data-link-title="PostgreSQL MVCC &#43; Lock Model：為什麼 PG 比 MySQL 少 deadlock、但 vacuum 是別的代價" data-link-desc="PG 用 *MVCC-heavy &#43; 少 explicit lock* 的並行控制、跟 MySQL InnoDB 的 *lock-based*（record / gap / next-key）相反。本文走 MVCC 機制（tuple version &#43; xmin/xmax &#43; visibility）、PG 4 種 lock（row-level / table-level / advisory / predicate）、預測 SERIALIZABLE 行為、5 production 踩雷（idle transaction 卡 vacuum / SELECT FOR UPDATE 跨 transaction / advisory lock 沒釋放 / bloat 不是 vacuum 問題 / predicate lock 在 SSI 下 rollback）、跟 MySQL lock-contention sibling 對比">MVCC + Lock Model</a> (why dead tuples exist and how they interact with locks)</li>
<li>Methodology: <a href="/blog/posts/vendor-%E6%B7%B1%E5%BA%A6%E6%8A%80%E8%A1%93%E6%96%87%E7%AB%A0%E6%96%B9%E6%B3%95%E8%AB%96%E7%9A%84%E6%BC%94%E5%8C%96%E7%B4%80%E9%8C%84%E5%90%8C-vendor-%E7%B3%BB%E5%88%97%E7%9A%84%E9%96%8B%E5%A0%B4%E8%BC%AA%E6%9B%BF%E9%A9%97%E8%AD%89/" data-link-title="Vendor 深度技術文章方法論的演化紀錄：同 vendor 系列的開場輪替驗證" data-link-desc="vendor overview 飽和後要寫單一功能深度文章、需要選題與結構依據時回來。這套方法論的驗證來源與 cadence variant 在高風險場景（同 vendor sub-tool 系列）的實證。">Methodology for writing vendor deep-dive technical articles</a></li>
</ul>
]]></content:encoded></item></channel></rss>