<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Mvcc on Tarragon</title><link>https://tarrragon.github.io/blog/tags/mvcc/</link><description>Recent content in Mvcc on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Tue, 19 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/mvcc/index.xml" rel="self" type="application/rss+xml"/><item><title>PostgreSQL MVCC + Lock Model: why PG deadlocks less than MySQL, and why VACUUM is the price it pays instead</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/</guid><description>&lt;blockquote>
&lt;p>This article is the implementation-layer deep dive for the &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL&lt;/a> overview. The overview covers where PG sits in the OLTP landscape; this article focuses on the &lt;em>MVCC + lock model&lt;/em>: PG's concurrency control works differently from MySQL's lock-based approach.&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;h2 id="pg-mvcc每次更新都-新增-tuple不改舊版">PG MVCC: every update &lt;em>adds a new tuple&lt;/em> and leaves the old version in place&lt;/h2>
&lt;p>The core of PG's concurrency control is &lt;em>Multi-Version Concurrency Control&lt;/em>: an UPDATE does not modify the original row in place. It &lt;em>adds&lt;/em> a new tuple version, and the old version stays in the table until VACUUM cleans it up:&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">Old row:   (id=1, status=&amp;#39;pending&amp;#39;, xmin=100, xmax=NULL)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl"> ↓ UPDATE status=&amp;#39;shipped&amp;#39;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">New tuple: (id=1, status=&amp;#39;shipped&amp;#39;, xmin=200, xmax=NULL)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">Old tuple gets xmax=200 (kept, not deleted, so other transactions can still see the old version)&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>xmin&lt;/code> / &lt;code>xmax&lt;/code> are the &lt;em>creator transaction id&lt;/em> / &lt;em>destroyer transaction id&lt;/em>. Every SELECT uses a &lt;em>snapshot&lt;/em> (which includes the list of transactions active at that moment) to decide which tuples are visible to it:&lt;/p>
&lt;ul>
&lt;li>Simplified: a tuple is visible when the transaction in tuple.xmin committed before the snapshot was taken, and tuple.xmax is either unset or belongs to a transaction that had not committed as of the snapshot (changes made by the reader's own transaction are handled separately)&lt;/li>
&lt;li>Otherwise: not visible (a version that is already superseded, or one that is not yet committed from this snapshot's point of view)&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Consequences&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>&lt;em>Readers do not block writers&lt;/em>: a SELECT reads from its snapshot and does not block an UPDATE&lt;/li>
&lt;li>&lt;em>Writers do not block readers&lt;/em>: an UPDATE writes a new tuple and does not disturb the snapshot of a SELECT that is already running&lt;/li>
&lt;li>&lt;em>Writers only block writers of the same row&lt;/em>: two UPDATEs conflict only when they touch the same row&lt;/li>
&lt;/ul>
&lt;p>Compared with MySQL InnoDB's &lt;em>lock-based&lt;/em> model (&lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/lock-contention/" data-link-title="MySQL Lock Contention：在 staging 重現的 deadlock、production 跑 6 個月才出現" data-link-desc="MySQL InnoDB 的 lock 是 row-level、但 *為什麼某些 row 莫名其妙也被 lock* 是 gap lock / next-key lock 設計造成的隱性行為。本文從一個 production case 開場（staging 重現 deadlock / production 6 個月後突然爆）、走 5 種 InnoDB lock 類型（record / gap / next-key / insert intention / auto-inc）、isolation level 對 lock 行為的決定性影響、deadlock detection / SHOW ENGINE INNODB STATUS 解讀、5 production 踩雷（gap lock 阻塞 INSERT / auto-inc lock contention / FK lock cascading / large transaction lock holding / READ COMMITTED 跟 binlog ROW 互動）">Lock Contention&lt;/a>):&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>This article is the implementation-layer deep dive for the <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a> overview. The overview covers where PG sits in the OLTP landscape; this article focuses on the <em>MVCC + lock model</em>: PG's concurrency control works differently from MySQL's lock-based approach.</p></blockquote>
<hr>
<h2 id="pg-mvcc每次更新都-新增-tuple不改舊版">PG MVCC: every update <em>adds a new tuple</em> and leaves the old version in place</h2>
<p>The core of PG's concurrency control is <em>Multi-Version Concurrency Control</em>: an UPDATE does not modify the original row in place. It <em>adds</em> a new tuple version, and the old version stays in the table until VACUUM cleans it up:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">Old row:   (id=1, status=&#39;pending&#39;, xmin=100, xmax=NULL)
</span></span><span class="line"><span class="ln">2</span><span class="cl">                 ↓ UPDATE status=&#39;shipped&#39;
</span></span><span class="line"><span class="ln">3</span><span class="cl">New tuple: (id=1, status=&#39;shipped&#39;, xmin=200, xmax=NULL)
</span></span><span class="line"><span class="ln">4</span><span class="cl">Old tuple gets xmax=200 (kept, not deleted, so other transactions can still see the old version)</span></span></code></pre></div><p><code>xmin</code> / <code>xmax</code> are the <em>creator transaction id</em> / <em>destroyer transaction id</em>. Every SELECT uses a <em>snapshot</em> (which includes the list of transactions active at that moment) to decide which tuples are visible to it:</p>
<ul>
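<li>Simplified: a tuple is visible when the transaction in tuple.xmin committed before the snapshot was taken, and tuple.xmax is either unset or belongs to a transaction that had not committed as of the snapshot (changes made by the reader's own transaction are handled separately)</li>
<li>Otherwise: not visible (a version that is already superseded, or one that is not yet committed from this snapshot's point of view)</li>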
</ul>
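<p>A minimal sketch of how to observe this from SQL, assuming a scratch database and a hypothetical <code>orders</code> table (not part of the original example). <code>xmin</code>, <code>xmax</code> and <code>ctid</code> are system columns present on every table; a live tuple reports <code>xmax = 0</code> rather than NULL:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical table, for illustration only
CREATE TABLE orders (id int PRIMARY KEY, status text);
INSERT INTO orders VALUES (1, &#39;pending&#39;);

-- ctid is the physical location of this tuple version;
-- xmin is the XID of the transaction that created it
SELECT ctid, xmin, xmax, status FROM orders WHERE id = 1;

UPDATE orders SET status = &#39;shipped&#39; WHERE id = 1;

-- Same id, but the visible version now sits at a different ctid with a newer xmin;
-- the old version stays in the heap until VACUUM removes it
SELECT ctid, xmin, xmax, status FROM orders WHERE id = 1;
</code></pre></div>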
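<p><strong>Consequences</strong>:</p>
<ul>
<li><em>Readers do not block writers</em>: a SELECT reads from its snapshot and does not block an UPDATE</li>
<li><em>Writers do not block readers</em>: an UPDATE writes a new tuple and does not disturb the snapshot of a SELECT that is already running</li>
<li><em>Writers only block writers of the same row</em>: two UPDATEs conflict only when they touch the same row</li>
</ul>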
<p>Compared with MySQL InnoDB's <em>lock-based</em> model (<a href="/blog/backend/01-database/vendors/mysql/lock-contention/" data-link-title="MySQL Lock Contention：在 staging 重現的 deadlock、production 跑 6 個月才出現" data-link-desc="MySQL InnoDB 的 lock 是 row-level、但 *為什麼某些 row 莫名其妙也被 lock* 是 gap lock / next-key lock 設計造成的隱性行為。本文從一個 production case 開場（staging 重現 deadlock / production 6 個月後突然爆）、走 5 種 InnoDB lock 類型（record / gap / next-key / insert intention / auto-inc）、isolation level 對 lock 行為的決定性影響、deadlock detection / SHOW ENGINE INNODB STATUS 解讀、5 production 踩雷（gap lock 阻塞 INSERT / auto-inc lock contention / FK lock cascading / large transaction lock holding / READ COMMITTED 跟 binlog ROW 互動）">Lock Contention</a>):</p>
<ul>
<li>MySQL: under its default REPEATABLE READ, SELECT FOR UPDATE takes gap locks to prevent phantoms, which makes deadlocks more likely</li>
<li>PG: there are no gap locks. Under REPEATABLE READ and above, the transaction-wide snapshot keeps phantoms out of reads (under READ COMMITTED each statement takes a fresh snapshot, so a later statement can see newly committed rows), and deadlocks are less common</li>
</ul>
<p>The price PG pays is <em>VACUUM management</em>: dead tuples that are not cleaned up take disk space and slow queries down. See <a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">Autovacuum Tuning</a>.</p>
<h2 id="pg-4-種-lock">The 4 kinds of lock in PG</h2>
<p>PG still has locks, but they show up in different situations than in MySQL:</p>
<h3 id="1-row-level-lock--主要由-update--delete--select-for-update-取">1. Row-level lock: taken mainly by UPDATE / DELETE / SELECT FOR UPDATE</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">BEGIN</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">100</span><span class="w"> </span><span class="k">FOR</span><span class="w"> </span><span class="k">UPDATE</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- Takes a row-level FOR UPDATE lock on the id=100 row (plus a ROW SHARE lock on the table)
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1">-- Other transactions that try to UPDATE / DELETE id=100 have to wait</span></span></span></code></pre></div><p>A row-level lock <em>does not block readers</em> (a plain SELECT reads from its snapshot and does not check row locks).</p>
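<p>A two-session sketch of that behavior, assuming a hypothetical <code>orders</code> table that contains a row with <code>id = 100</code>. <code>NOWAIT</code> and <code>SKIP LOCKED</code> are standard PG options for callers that would rather fail or skip than wait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Session A: take the row lock and keep the transaction open
BEGIN;
SELECT * FROM orders WHERE id = 100 FOR UPDATE;

-- Session B: a plain read is not blocked by the row lock
SELECT status FROM orders WHERE id = 100;

-- Session B: raises an error right away instead of waiting for Session A
SELECT * FROM orders WHERE id = 100 FOR UPDATE NOWAIT;

-- Session B: returns only rows that are not currently locked, so nothing here
SELECT * FROM orders WHERE id = 100 FOR UPDATE SKIP LOCKED;

-- Session A: ending the transaction releases the row lock
COMMIT;
</code></pre></div>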
<h3 id="2-table-level-lock--ddl-跟少數-select-for-場景">2. Table-level lock: DDL and a few SELECT FOR cases</h3>
<p>PG has 8 table lock modes, listed here from weakest to strongest:</p>
<table>
  <thead>
      <tr>
          <th>Mode</th>
          <th>Acquired by</th>
          <th>Conflicts with</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>ACCESS SHARE</td>
          <td>SELECT</td>
          <td>ACCESS EXCLUSIVE only</td>
      </tr>
      <tr>
          <td>ROW SHARE</td>
          <td>SELECT FOR UPDATE / FOR SHARE</td>
          <td>EXCLUSIVE, ACCESS EXCLUSIVE</td>
      </tr>
      <tr>
          <td>ROW EXCLUSIVE</td>
          <td>UPDATE / DELETE / INSERT</td>
          <td>SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE</td>
      </tr>
      <tr>
          <td>SHARE UPDATE EXCLUSIVE</td>
          <td>VACUUM (without FULL) / ANALYZE / CREATE INDEX CONCURRENTLY</td>
          <td>Itself and every stronger mode</td>
      </tr>
      <tr>
          <td>SHARE</td>
          <td>CREATE INDEX (non-concurrent)</td>
          <td>ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, and every stronger mode (not itself)</td>
      </tr>
      <tr>
          <td>SHARE ROW EXCLUSIVE</td>
          <td>CREATE TRIGGER / some ALTER TABLE forms</td>
          <td>ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, itself, and every stronger mode</td>
      </tr>
      <tr>
          <td>EXCLUSIVE</td>
          <td>REFRESH MATERIALIZED VIEW CONCURRENTLY</td>
          <td>Every mode except ACCESS SHARE (including itself)</td>
      </tr>
      <tr>
          <td>ACCESS EXCLUSIVE</td>
          <td>DROP TABLE / TRUNCATE / VACUUM FULL / many ALTER TABLE forms</td>
          <td>Every mode, including ACCESS SHARE</td>
      </tr>
  </tbody>
</table>
<p>DDL such as DROP TABLE and many forms of ALTER TABLE takes ACCESS EXCLUSIVE, which conflicts with every other mode, plain SELECT included. In production an ALTER has to either finish quickly or go through <a href="/blog/backend/01-database/vendors/postgresql/online-schema-change/" data-link-title="PostgreSQL Online Schema Change：先用 ALTER 內建特性、不能解才 pg_repack / pg-osc" data-link-desc="PostgreSQL ALTER TABLE 對多數變更已是 *fast catalog-only*（add column nullable / drop column / 改 default），不必走 ghost table tool。本文走 PG 內建 fast DDL 行為、何時必須走 pg_repack / pg-osc、兩工具機制對比（trigger-based vs WAL-shipping）、配置 step-by-step、5 production 踩雷（lock 升級 / VACUUM FULL 誤用 / pg_repack version mismatch / concurrent index 失敗清理 / generated stored column 不能 online）、跟 MySQL gh-ost / pt-osc sibling 對比">Online Schema Change</a>.</p>
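<p>One common way to keep an ALTER from stalling traffic is to bound how long it may wait for its lock. A sketch, assuming a hypothetical <code>orders</code> table, a hypothetical <code>note</code> column and an arbitrary 3 second budget; <code>lock_timeout</code> is a standard PG setting:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">BEGIN;
-- Give up if ACCESS EXCLUSIVE cannot be acquired within 3 seconds
SET LOCAL lock_timeout = &#39;3s&#39;;
-- Hypothetical schema change, for illustration only
ALTER TABLE orders ADD COLUMN note text;
COMMIT;
</code></pre></div>
<p>If the timeout fires, the statement fails and can be retried later. Without it, the waiting ALTER sits in the lock queue and later queries on the same table, plain SELECTs included, queue up behind it.</p>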
<h3 id="3-advisory-lock--application-自己控">3. Advisory lock: controlled by the application</h3>
<p>PG provides <em>advisory locks</em> for applications to use. They are not tied to any row or table structure:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Session 1
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_advisory_lock</span><span class="p">(</span><span class="mi">12345</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- run the critical section
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_advisory_unlock</span><span class="p">(</span><span class="mi">12345</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="c1">-- Session 2
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_try_advisory_lock</span><span class="p">(</span><span class="mi">12345</span><span class="p">);</span><span class="w">  </span><span class="c1">-- tries to acquire without blocking; returns false while Session 1 holds the lock</span></span></span></code></pre></div><p>Use cases:</p>
<ul>
<li>Application-level mutual exclusion (for example, only one instance of a cron job runs at a time)</li>
<li>Synchronization across connections (a PG-managed mutex)</li>
<li>A lightweight distributed transaction coordinator</li>
</ul>
<p>Unlike a row lock, an advisory lock is not attached to any row; the application defines what each lock ID means.</p>
<h3 id="4-predicate-lock--serializable-isolation-才用">4. Predicate lock: used only under SERIALIZABLE isolation</h3>
<p>PG's SERIALIZABLE level uses <em>Serializable Snapshot Isolation (SSI)</em>, which tracks <em>predicates</em> (query conditions) rather than individual <em>rows</em>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">BEGIN</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">SET</span><span class="w"> </span><span class="k">TRANSACTION</span><span class="w"> </span><span class="k">ISOLATION</span><span class="w"> </span><span class="k">LEVEL</span><span class="w"> </span><span class="k">SERIALIZABLE</span><span class="p">;</span><span class="w">  </span><span class="c1">-- must run after BEGIN and before the first query
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- The predicate lock records which predicates this query has read
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;pending&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- Meanwhile, another transaction INSERTs a pending order
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1">-- If PG detects an anomaly, it rolls back one of the transactions (SQLSTATE 40001), at COMMIT at the latest
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1"></span><span class="k">COMMIT</span><span class="p">;</span></span></span></code></pre></div><p>How this differs from MySQL gap locks:</p>
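<ul>
<li>MySQL gap lock: <em>pre-lock</em>. The range is locked up front, so phantoms are prevented while the query runs</li>
<li>PG predicate lock: <em>post-detect</em>. Predicate locks never block anyone; PG checks for anomalies as the transactions proceed and, no later than commit, rolls one of them back</li>
</ul>
<p>PG SSI has a <em>low impact on write throughput</em> (no pre-locking), but the <em>chance of a transaction being rolled back is higher</em> (the application has to retry).</p>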
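<p>A two-session sketch of the kind of anomaly SSI catches, assuming a hypothetical <code>orders(id, status)</code> table and made-up ids. Each transaction reads the set of pending orders and then inserts a row that the other one's read would have matched. Neither blocks the other:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Session A
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM orders WHERE status = &#39;pending&#39;;
INSERT INTO orders (id, status) VALUES (201, &#39;pending&#39;);

-- Session B, concurrently
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM orders WHERE status = &#39;pending&#39;;
INSERT INTO orders (id, status) VALUES (202, &#39;pending&#39;);

-- Session A
COMMIT;

-- Session B
COMMIT;  -- one of the two transactions is expected to fail with SQLSTATE 40001
</code></pre></div>
<p>Under READ COMMITTED or REPEATABLE READ both commits would succeed, even though no serial order of the two transactions produces the counts they each saw.</p>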
<h2 id="pg-預設-isolationread-committed">PG's default isolation: READ COMMITTED</h2>
<p>PG defaults to READ COMMITTED, whereas MySQL InnoDB defaults to REPEATABLE READ:</p>
<table>
  <thead>
      <tr>
          <th>Isolation</th>
          <th>PG behavior</th>
          <th>MySQL InnoDB counterpart</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>READ UNCOMMITTED</td>
          <td>Treated as READ COMMITTED (PG never allows dirty reads)</td>
          <td>Actually supported by MySQL</td>
      </tr>
      <tr>
          <td>READ COMMITTED</td>
          <td>Each statement sees a snapshot of what is committed at that moment (PG default)</td>
          <td>Same behavior</td>
      </tr>
      <tr>
          <td>REPEATABLE READ</td>
          <td>One fixed snapshot for the whole transaction (pure MVCC)</td>
          <td>MVCC snapshot + gap locks against phantoms (both engines use MVCC; the difference is the phantom-protection mechanism: PG relies on snapshot version visibility, InnoDB adds gap locks that pre-lock the range)</td>
      </tr>
      <tr>
          <td>SERIALIZABLE</td>
          <td>SSI; anomalies are detected by commit time</td>
          <td>Strict locking + gap locks</td>
      </tr>
  </tbody>
</table>
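<p><strong>What this means for application code</strong>:</p>
<ul>
<li>PG REPEATABLE READ has a low impact on <em>write throughput</em> (no pre-locking, only retries)</li>
<li>No gap locks, so INSERTs are not blocked by range locks</li>
<li>Deadlocks tend to be less frequent than in MySQL because gap locks, one major source of them, are absent (they can still occur when transactions update the same rows in different orders)</li>
</ul>
<p>In practice, for PG in production: the default READ COMMITTED is usually enough. Reserve SERIALIZABLE for <em>strict consistency requirements</em> (finance / orders), and accept the retries that come with it.</p>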
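<p>A two-session sketch of the difference between the two most common levels, assuming a hypothetical <code>orders(id, status)</code> table and a made-up id:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Check what new transactions use by default
SHOW default_transaction_isolation;

-- Session A
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM orders WHERE status = &#39;pending&#39;;  -- the snapshot is taken here

-- Session B (autocommit)
INSERT INTO orders (id, status) VALUES (301, &#39;pending&#39;);

-- Session A: still reads from the snapshot taken by its first query,
-- so the count does not include Session B&#39;s row
SELECT count(*) FROM orders WHERE status = &#39;pending&#39;;
COMMIT;

-- Under READ COMMITTED the second SELECT would take a fresh snapshot
-- and include Session B&#39;s committed row
</code></pre></div>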
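<h2 id="5-個-production-踩雷">5 production pitfalls</h2>
<h3 id="1-idle-transaction-卡-vacuum--bloat-暴增">1. Idle transactions hold back vacuum: bloat balloons</h3>
<p>PG MVCC relies on <em>VACUUM to clean up dead tuples</em>. VACUUM can only remove <em>dead tuples that no active transaction can still see</em>. If an <em>idle in transaction</em> session stays open (for example, a connection-pool connection whose application forgot to end the transaction), VACUUM cannot remove <em>tuples that became dead after that transaction's snapshot was taken</em>, and bloat accumulates.</p>
<p>Fixes:</p>
<ul>
<li>Monitor <code>pg_stat_activity</code> for how long sessions stay in <code>state = 'idle in transaction'</code></li>
<li>Set <code>idle_in_transaction_session_timeout = '5min'</code>: when the timeout expires, PG terminates the session automatically</li>
<li>Configure the application connection pool so that it <em>never leaves a transaction open</em>. Note that pgBouncer transaction pooling only hands the server connection back once the transaction ends; it does not end the transaction for you, although its <code>idle_transaction_timeout</code> setting can disconnect clients that sit idle in a transaction</li>
</ul>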
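<p>The first two fixes can be sketched as follows. The 60-character truncation and the 5 minute value are arbitrary choices; the views, columns and setting are standard PG:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Sessions currently idle inside an open transaction, longest first
SELECT pid,
       usename,
       now() - state_change AS idle_for,
       now() - xact_start   AS xact_age,
       left(query, 60)      AS last_query
FROM pg_stat_activity
WHERE state = &#39;idle in transaction&#39;
ORDER BY idle_for DESC;

-- Terminate sessions that stay idle in a transaction for more than 5 minutes
ALTER SYSTEM SET idle_in_transaction_session_timeout = &#39;5min&#39;;
SELECT pg_reload_conf();
</code></pre></div>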
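<h3 id="2-select-for-update-跨-transaction--application-retry-麻煩">2. SELECT FOR UPDATE held across transactions: retries become painful for the application</h3>
<p>PG's SELECT FOR UPDATE does not <em>block other plain SELECTs</em> (reads continue), but it does <em>block other UPDATE / FOR UPDATE</em> statements on the same rows. While an application holds a SELECT FOR UPDATE inside a transaction, other transactions that want those rows wait.</p>
<p>If the application is designed to <em>hold a lock across transactions</em> (for example: take the lock, return to the UI, wait for the user to act, then commit), it easily runs into idle in transaction sessions and other transactions waiting on it.</p>
<p>Fixes:</p>
<ul>
<li><em>Keep transactions short</em>: take FOR UPDATE, process immediately, commit. Never span a user interaction</li>
<li>To span a user interaction, use an <em>advisory lock</em> or an application-level state machine instead of relying on row locks</li>
</ul>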
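<h3 id="3-advisory-lock-沒釋放--session-結束才自動釋放">3. Advisory lock never released: it is only freed automatically when the session ends</h3>
<p>If <code>pg_advisory_lock()</code> is called without a matching <code>pg_advisory_unlock()</code>, the lock is held until the <em>session ends</em>. A connection pool reuses the same connection, so a later request can inherit a lock left behind by an earlier one.</p>
<p>Fixes:</p>
<ul>
<li>Whenever you call <code>pg_advisory_lock</code>, pair it with <code>pg_advisory_unlock</code> in a <code>try/finally</code></li>
<li>Or, instead of a session-level lock, use the transaction-scoped variant: <code>pg_advisory_xact_lock()</code>, which is released automatically at commit / rollback</li>
<li>Monitor the number of advisory locks in <code>pg_locks</code>; a count that keeps growing over time is a warning sign</li>
</ul>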
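<p>A sketch of the transaction-scoped variant and of the monitoring query, reusing the lock ID <code>12345</code> from the earlier example. The job itself is left as a placeholder comment:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">BEGIN;
-- Non-blocking, transaction-scoped: returns true if the lock was acquired
SELECT pg_try_advisory_xact_lock(12345) AS got_lock;
-- Application logic: run the job only when got_lock is true
COMMIT;  -- the lock is released here, with no explicit unlock call

-- Advisory locks currently held or awaited, per backend
SELECT pid, classid, objid, mode, granted
FROM pg_locks
WHERE locktype = &#39;advisory&#39;;
</code></pre></div>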
<h3 id="4-bloat-不只是-vacuum-沒跑是-active-transaction-阻擋-vacuum">4. Bloat is not just vacuum failing to run: <em>active transactions can block vacuum</em></h3>
<p>This extends pitfall #1: vacuum has run, yet bloat keeps growing. The cause is not too little vacuuming; it is that <em>active transactions prevent vacuum from removing dead tuples</em>.</p>
<p>Fixes:</p>
<ul>
<li>Do not look only at <code>last_vacuum</code>; check whether <em>VACUUM ran but reclaimed very little</em></li>
<li><code>SELECT * FROM pg_stat_progress_vacuum</code> shows VACUUM progress</li>
<li><code>SELECT * FROM pg_stat_activity WHERE backend_xmin IS NOT NULL ORDER BY backend_xmin</code> shows who is holding vacuum back</li>
<li>See <a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">Autovacuum Tuning</a></li>
</ul>
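<p>A slightly fuller sketch of the same check. Ordering by <code>age(backend_xmin)</code> puts the oldest snapshot first. Replication slots and prepared transactions can pin the xmin horizon in the same way, so they are worth checking too. The <code>orders</code> table name is hypothetical:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Backends holding the oldest snapshots
SELECT pid, state, xact_start,
       age(backend_xmin) AS xmin_age,
       left(query, 60)   AS query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;

-- Replication slots that pin an xmin
SELECT slot_name, active, xmin, catalog_xmin
FROM pg_replication_slots;

-- Prepared (two-phase) transactions that were never finished
SELECT gid, prepared, owner
FROM pg_prepared_xacts;

-- VERBOSE output reports dead row versions that could not be removed yet
VACUUM (VERBOSE) orders;
</code></pre></div>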
<h3 id="5-serializable-下-transaction-rollback--application-必須-retry">5. Transaction rollbacks under SERIALIZABLE: the application must retry</h3>
<p>Once a transaction runs with <code>SET TRANSACTION ISOLATION LEVEL SERIALIZABLE</code>, PG SSI will <em>roll the transaction back</em> when it detects an anomaly. The application sees a <code>serialization failure</code> and must retry.</p>
<p>For an application that <em>does not know it has to retry</em>, SERIALIZABLE turns into a production bug.</p>
<p>Fixes:</p>
<ul>
<li>Add <em>retry middleware</em> to the application code: catch <code>SQLSTATE 40001 (serialization_failure)</code> and retry with exponential backoff</li>
<li>Not every transaction needs SERIALIZABLE; set it only where <em>strict consistency</em> is required</li>
<li>Highly concurrent SERIALIZABLE workloads are prone to rollback storms; consider splitting transactions to shorten them</li>
</ul>
<h2 id="觀測-metric">Metrics to watch</h2>
<p>Production monitoring:</p>
<ul>
<li><code>pg_stat_activity</code>: active sessions / idle in transaction / wait_event</li>
<li><code>pg_locks</code>: the current lock list; join it against itself to see who blocks whom</li>
<li><code>pg_stat_database.deadlocks</code>: deadlock counter (lower on PG, but still worth monitoring)</li>
<li><code>pg_stat_user_tables.n_dead_tup</code> / <code>n_live_tup</code>: dead tuple ratio, a bloat indicator</li>
<li><code>pg_stat_progress_vacuum</code>: VACUUM progress</li>
</ul>
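<p>A sketch of queries over these views. <code>pg_blocking_pids()</code> is a built-in shortcut that avoids the self-join on <code>pg_locks</code>. The <code>LIMIT 20</code> and the 60-character truncation are arbitrary choices:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Who is blocked, and by which backend PIDs
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) &gt; 0;

-- Tables with the most dead tuples, with the dead-tuple ratio
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0), 3) AS dead_ratio,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;

-- Deadlocks detected in the current database since the last stats reset
SELECT datname, deadlocks
FROM pg_stat_database
WHERE datname = current_database();
</code></pre></div>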
<h2 id="跟-mysql-lock-model-對比">Comparison with the MySQL lock model</h2>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th>PG MVCC</th>
          <th>MySQL InnoDB Lock</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Main mechanism</td>
          <td>MVCC + snapshot</td>
          <td>Lock-based + MVCC mixed</td>
      </tr>
      <tr>
          <td>Readers vs Writers</td>
          <td>Do not block each other</td>
          <td>Affected by gap locks under the default RR</td>
      </tr>
      <tr>
          <td>Deadlock likelihood</td>
          <td>Low (no gap locks)</td>
          <td>Medium to high (gap locks are the main source)</td>
      </tr>
      <tr>
          <td>Phantom protection</td>
          <td>Snapshot (REPEATABLE READ and above) + SSI predicate locks</td>
          <td>Gap locks taken up front</td>
      </tr>
      <tr>
          <td>Default isolation</td>
          <td>READ COMMITTED</td>
          <td>REPEATABLE READ</td>
      </tr>
      <tr>
          <td>Cost</td>
          <td>Dead tuples + VACUUM management</td>
          <td>Lock contention management</td>
      </tr>
      <tr>
          <td>Application code</td>
          <td>SERIALIZABLE requires retries</td>
          <td>Usually fine if the code is reasonably well written</td>
      </tr>
  </tbody>
</table>
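<p>Both solve the same problem (concurrency control) with different strategies. PG <em>trades space for time</em> (it keeps multiple tuple versions in the table so that reads and writes do not lock each other, but needs VACUUM to clean up). MySQL <em>trades time for space</em> (writers wait on locks, while old row versions live in the undo log and are purged in the background instead of accumulating in the table itself).</p>
<p><strong>How to choose</strong>:</p>
<ul>
<li>High-concurrency OLTP that is heavy on both reads and writes: PG MVCC is usually the better fit (reads do not block writes)</li>
<li>Simple OLTP where you do not want to manage VACUUM: MySQL InnoDB is simpler for ops</li>
<li>SERIALIZABLE-level strong consistency is required: PG SSI has a low impact on write throughput</li>
<li>An existing MySQL ecosystem / toolchain: your MySQL locking knowledge keeps paying off</li>
</ul>
<p>See <a href="/blog/backend/01-database/vendors/mysql/lock-contention/" data-link-title="MySQL Lock Contention：在 staging 重現的 deadlock、production 跑 6 個月才出現" data-link-desc="MySQL InnoDB 的 lock 是 row-level、但 *為什麼某些 row 莫名其妙也被 lock* 是 gap lock / next-key lock 設計造成的隱性行為。本文從一個 production case 開場（staging 重現 deadlock / production 6 個月後突然爆）、走 5 種 InnoDB lock 類型（record / gap / next-key / insert intention / auto-inc）、isolation level 對 lock 行為的決定性影響、deadlock detection / SHOW ENGINE INNODB STATUS 解讀、5 production 踩雷（gap lock 阻塞 INSERT / auto-inc lock contention / FK lock cascading / large transaction lock holding / READ COMMITTED 跟 binlog ROW 互動）">MySQL Lock Contention</a> for the full MySQL lock mechanism.</p>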
<h2 id="跟其他模組整合">Integration with other modules</h2>
<h3 id="跟-autovacuum-tuning">With Autovacuum Tuning</h3>
<p>MVCC relies on VACUUM, and autovacuum is the <em>maintenance cost</em> of PG's concurrency control. When VACUUM runs slowly or not at all, bloat grows and queries slow down. See <a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">Autovacuum Tuning</a>.</p>
<h3 id="跟-replication-topology">With Replication Topology</h3>
<p><code>hot_standby_feedback = on</code> keeps long-running queries on a standby from being cancelled because of vacuum, but the <em>standby then pushes its oldest xmin back to the primary</em>, so autovacuum on the primary becomes more conservative and bloat increases. See <a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">Replication Topology</a>.</p>
<h3 id="跟-connection-pool">With Connection Pool</h3>
<p>Under pgBouncer transaction pooling, session-level advisory locks and any SELECT FOR UPDATE pattern that spans transactions are <em>broken</em> (different transactions may land on different backend connections). See <a href="/blog/backend/01-database/vendors/postgresql/pgbouncer-config/" data-link-title="PostgreSQL pgBouncer 配置 &#43; 連線池治理" data-link-desc="pgBouncer transaction pooling 配置、跟 application connection pool 的分層、production 故障演練（pool exhaustion / stale connection / DNS failover）跟容量規劃">pgBouncer Config</a>.</p>
<h3 id="跟-query-optimization">With Query Optimization</h3>
<p>While a long transaction runs a slow query, the dead tuples it pins cannot be vacuumed, so other transactions scan through the resulting bloat and the planner can misestimate the dead tuple ratio. See <a href="/blog/backend/01-database/vendors/postgresql/query-optimization/" data-link-title="PostgreSQL Query Optimization：EXPLAIN ANALYZE / pg_hint_plan / auto_explain 三層工具跟 4 個 case" data-link-desc="PG query 慢的根因常是 *planner 選錯 plan 或 statistics 過時*。本文從 4 個 production case 開場（seq scan vs index / hash vs nested loop / 多 column 統計缺 / parallel query 沒觸發）、走 EXPLAIN / EXPLAIN ANALYZE / auto_explain 三層工具、pg_hint_plan extension 跟 planner GUC 取捨、5 production 踩雷（ANALYZE 過時 / multi-column statistics / cost-base setting 不對齊硬體 / random_page_cost SSD 沒調 / parallel query 配置）、跟 MySQL query-optimization sibling 對比">Query Optimization</a>.</p>
<h2 id="相關連結">Related links</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL vendor overview</a></li>
<li><a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">PG Autovacuum Tuning</a> (VACUUM is the unavoidable cost of MVCC)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">PG Replication Topology</a> (the effect of hot_standby_feedback)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/pgbouncer-config/" data-link-title="PostgreSQL pgBouncer 配置 &#43; 連線池治理" data-link-desc="pgBouncer transaction pooling 配置、跟 application connection pool 的分層、production 故障演練（pool exhaustion / stale connection / DNS failover）跟容量規劃">PG pgBouncer</a> (how transaction pooling interacts with locks)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/online-schema-change/" data-link-title="PostgreSQL Online Schema Change：先用 ALTER 內建特性、不能解才 pg_repack / pg-osc" data-link-desc="PostgreSQL ALTER TABLE 對多數變更已是 *fast catalog-only*（add column nullable / drop column / 改 default），不必走 ghost table tool。本文走 PG 內建 fast DDL 行為、何時必須走 pg_repack / pg-osc、兩工具機制對比（trigger-based vs WAL-shipping）、配置 step-by-step、5 production 踩雷（lock 升級 / VACUUM FULL 誤用 / pg_repack version mismatch / concurrent index 失敗清理 / generated stored column 不能 online）、跟 MySQL gh-ost / pt-osc sibling 對比">PG Online Schema Change</a> (DDL lock issues)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/query-optimization/" data-link-title="PostgreSQL Query Optimization：EXPLAIN ANALYZE / pg_hint_plan / auto_explain 三層工具跟 4 個 case" data-link-desc="PG query 慢的根因常是 *planner 選錯 plan 或 statistics 過時*。本文從 4 個 production case 開場（seq scan vs index / hash vs nested loop / 多 column 統計缺 / parallel query 沒觸發）、走 EXPLAIN / EXPLAIN ANALYZE / auto_explain 三層工具、pg_hint_plan extension 跟 planner GUC 取捨、5 production 踩雷（ANALYZE 過時 / multi-column statistics / cost-base setting 不對齊硬體 / random_page_cost SSD 沒調 / parallel query 配置）、跟 MySQL query-optimization sibling 對比">PG Query Optimization</a> (how snapshot bloat affects the planner)</li>
<li><a href="/blog/backend/01-database/vendors/mysql/lock-contention/" data-link-title="MySQL Lock Contention：在 staging 重現的 deadlock、production 跑 6 個月才出現" data-link-desc="MySQL InnoDB 的 lock 是 row-level、但 *為什麼某些 row 莫名其妙也被 lock* 是 gap lock / next-key lock 設計造成的隱性行為。本文從一個 production case 開場（staging 重現 deadlock / production 6 個月後突然爆）、走 5 種 InnoDB lock 類型（record / gap / next-key / insert intention / auto-inc）、isolation level 對 lock 行為的決定性影響、deadlock detection / SHOW ENGINE INNODB STATUS 解讀、5 production 踩雷（gap lock 阻塞 INSERT / auto-inc lock contention / FK lock cascading / large transaction lock holding / READ COMMITTED 跟 binlog ROW 互動）">MySQL Lock Contention</a> (sibling article, different model)</li>
<li><a href="/blog/backend/knowledge-cards/isolation-level/" data-link-title="Isolation Level" data-link-desc="說明資料庫交易隔離級別如何影響並發讀寫結果">Isolation Level card</a></li>
<li>Official docs: <a href="https://www.postgresql.org/docs/current/mvcc.html">PG MVCC</a> / <a href="https://www.postgresql.org/docs/current/transaction-iso.html">PG Concurrency Control</a> / <a href="https://www.postgresql.org/docs/current/explicit-locking.html">Explicit Locking</a></li>
</ul>
]]></content:encoded></item></channel></rss>