<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Lock on Tarragon</title><link>https://tarrragon.github.io/blog/tags/lock/</link><description>Recent content in Lock on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Tue, 19 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/lock/index.xml" rel="self" type="application/rss+xml"/><item><title>MySQL Lock Contention：在 staging 重現的 deadlock、production 跑 6 個月才出現</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/lock-contention/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/lock-contention/</guid><description>&lt;blockquote>
&lt;p>本文是 &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL&lt;/a> overview 的 implementation-layer deep article。Overview 已說明 MySQL 在 OLTP 譜系的定位、本文聚焦 &lt;em>lock contention&lt;/em> — 5 種 lock type + isolation level 互動 + production debug。&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;h2 id="開場案例">開場案例&lt;/h2>
&lt;p>Application 跑了 6 個月、staging 100% 重現過的 deadlock 從來沒在 production 出現。某天 traffic 上升 30%、production 開始爆 &lt;code>ER_LOCK_DEADLOCK&lt;/code>、application retry 不夠快、order 大量失敗。&lt;/p>
&lt;p>&lt;code>SHOW ENGINE INNODB STATUS\G&lt;/code> 拉出 deadlock：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">*** (1) TRANSACTION:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">TRANSACTION 12345, ACTIVE 1 sec starting index read
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">mysql tables in use 1, locked 1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">LOCK WAIT 4 lock struct(s), heap size 1136, 3 row lock(s)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">MySQL thread id 100, query id 5000 update orders
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">UPDATE orders SET status = &amp;#39;shipped&amp;#39; WHERE id = 500
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">RECORD LOCKS space id 50 page no 5 n bits 80 index PRIMARY of table `production`.`orders`
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">trx id 12345 lock_mode X locks rec but not gap waiting
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl">*** (2) TRANSACTION:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">TRANSACTION 12346, ACTIVE 1 sec starting index read
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl">mysql tables in use 1, locked 1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl">4 lock struct(s), heap size 1136, 4 row lock(s)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl">MySQL thread id 101, query id 5001 update payments
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl">UPDATE payments SET captured = 1 WHERE order_id = 500
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">18&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">19&lt;/span>&lt;span class="cl">*** (2) HOLDS THE LOCK(S):
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">20&lt;/span>&lt;span class="cl">RECORD LOCKS space id 50 page no 5 n bits 80 index PRIMARY of table `production`.`orders`
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">21&lt;/span>&lt;span class="cl">trx id 12346 lock_mode X locks rec but not gap
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">22&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">23&lt;/span>&lt;span class="cl">*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">24&lt;/span>&lt;span class="cl">RECORD LOCKS space id 51 page no 10 n bits 80 index idx_order_id of table `production`.`payments`
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">25&lt;/span>&lt;span class="cl">trx id 12346 lock_mode X waiting
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">26&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">27&lt;/span>&lt;span class="cl">*** WE ROLL BACK TRANSACTION (1)&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>兩個 transaction 各自拿了一邊 lock、互相等對方的、deadlock。為什麼 staging 重現過、production 6 個月才爆？因為 &lt;strong>lock contention 是 &lt;em>可能性&lt;/em> 不是 &lt;em>確定性&lt;/em>&lt;/strong> — staging 重現等於確認「程式邏輯有 deadlock risk」、production 6 個月平安等於「concurrency 還沒撞到」。Traffic 上升把 &lt;em>機率乘以 N&lt;/em>、原本每天 0 次變每分鐘 5 次。&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>本文是 <a href="/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL</a> overview 的 implementation-layer deep article。Overview 已說明 MySQL 在 OLTP 譜系的定位、本文聚焦 <em>lock contention</em> — 5 種 lock type + isolation level 互動 + production debug。</p></blockquote>
<hr>
<h2 id="開場案例">開場案例</h2>
<p>Application 跑了 6 個月、staging 100% 重現過的 deadlock 從來沒在 production 出現。某天 traffic 上升 30%、production 開始爆 <code>ER_LOCK_DEADLOCK</code>、application retry 不夠快、order 大量失敗。</p>
<p><code>SHOW ENGINE INNODB STATUS\G</code> 拉出 deadlock：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl">*** (1) TRANSACTION:
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">TRANSACTION 12345, ACTIVE 1 sec starting index read
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">mysql tables in use 1, locked 1
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">LOCK WAIT 4 lock struct(s), heap size 1136, 3 row lock(s)
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">MySQL thread id 100, query id 5000 update orders
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">UPDATE orders SET status = &#39;shipped&#39; WHERE id = 500
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">RECORD LOCKS space id 50 page no 5 n bits 80 index PRIMARY of table `production`.`orders`
</span></span><span class="line"><span class="ln">10</span><span class="cl">trx id 12345 lock_mode X locks rec but not gap waiting
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl">*** (2) TRANSACTION:
</span></span><span class="line"><span class="ln">13</span><span class="cl">TRANSACTION 12346, ACTIVE 1 sec starting index read
</span></span><span class="line"><span class="ln">14</span><span class="cl">mysql tables in use 1, locked 1
</span></span><span class="line"><span class="ln">15</span><span class="cl">4 lock struct(s), heap size 1136, 4 row lock(s)
</span></span><span class="line"><span class="ln">16</span><span class="cl">MySQL thread id 101, query id 5001 update payments
</span></span><span class="line"><span class="ln">17</span><span class="cl">UPDATE payments SET captured = 1 WHERE order_id = 500
</span></span><span class="line"><span class="ln">18</span><span class="cl">
</span></span><span class="line"><span class="ln">19</span><span class="cl">*** (2) HOLDS THE LOCK(S):
</span></span><span class="line"><span class="ln">20</span><span class="cl">RECORD LOCKS space id 50 page no 5 n bits 80 index PRIMARY of table `production`.`orders`
</span></span><span class="line"><span class="ln">21</span><span class="cl">trx id 12346 lock_mode X locks rec but not gap
</span></span><span class="line"><span class="ln">22</span><span class="cl">
</span></span><span class="line"><span class="ln">23</span><span class="cl">*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
</span></span><span class="line"><span class="ln">24</span><span class="cl">RECORD LOCKS space id 51 page no 10 n bits 80 index idx_order_id of table `production`.`payments`
</span></span><span class="line"><span class="ln">25</span><span class="cl">trx id 12346 lock_mode X waiting
</span></span><span class="line"><span class="ln">26</span><span class="cl">
</span></span><span class="line"><span class="ln">27</span><span class="cl">*** WE ROLL BACK TRANSACTION (1)</span></span></code></pre></div><p>兩個 transaction 各自拿了一邊 lock、互相等對方的、deadlock。為什麼 staging 重現過、production 6 個月才爆？因為 <strong>lock contention 是 <em>可能性</em> 不是 <em>確定性</em></strong> — staging 重現等於確認「程式邏輯有 deadlock risk」、production 6 個月平安等於「concurrency 還沒撞到」。Traffic 上升把 <em>機率乘以 N</em>、原本每天 0 次變每分鐘 5 次。</p>
<p>這個 case 揭露 MySQL lock 教學的核心：理解 lock 不只是 <em>debug 跑 deadlock 報錯</em> 的能力、是 <em>讀 query 預測 lock pattern</em> 的能力。</p>
<h2 id="innodb-5-種-lock-類型">InnoDB 5 種 Lock 類型</h2>
<p>InnoDB 不是 <em>簡單 row lock</em>、有 5 個獨立 lock concept：</p>
<h3 id="1-record-lock--鎖-row">1. Record Lock — 鎖 row</h3>
<p><code>SELECT ... FOR UPDATE</code> / UPDATE / DELETE 對 <em>被 match 的 row</em> 加 record lock。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Transaction 1
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">BEGIN</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">100</span><span class="w"> </span><span class="k">FOR</span><span class="w"> </span><span class="k">UPDATE</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- 對 id=100 的 row 加 record lock</span></span></span></code></pre></div><p>Transaction 2 試 <code>UPDATE orders WHERE id = 100</code> 必須等。</p>
<h3 id="2-gap-lock--鎖-row-之間的空隙">2. Gap Lock — 鎖 row 之間的「空隙」</h3>
<p>InnoDB 在 <em>REPEATABLE READ</em> (預設) 下、<code>SELECT ... FOR UPDATE WHERE col &gt; 100</code> 不只 lock 符合的 row、<em>也 lock 該 range 內的「空隙」</em>、防其他 transaction INSERT 進這個 range。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- 已存在 orders: id=100, 200, 300
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">BEGIN</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">100</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="mi">300</span><span class="w"> </span><span class="k">FOR</span><span class="w"> </span><span class="k">UPDATE</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- Lock id=200 + gap lock (100, 200) + gap lock (200, 300)</span></span></span></code></pre></div><p>Transaction 2 試 <code>INSERT INTO orders (id) VALUES (150)</code> 必須等 — 即使 id=150 不存在、gap lock 阻擋 INSERT。</p>
<p><strong>Gap lock 是 deadlock 最常見來源</strong> — application logic 看 row、但 lock 卻 cover row 之外的空隙、難預測。</p>
<h3 id="3-next-key-lock--record--gap-組合">3. Next-Key Lock — Record + Gap 組合</h3>
<p>預設 lock 行為。<code>SELECT ... FOR UPDATE WHERE col = 100</code> 對 id=100 的 record lock + id=100 之前的 gap lock。</p>
<p>Lock 的範圍實際是 <em>半開區間</em> (previous_id, current_id]：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">Records: 100, 200, 300
</span></span><span class="line"><span class="ln">2</span><span class="cl">
</span></span><span class="line"><span class="ln">3</span><span class="cl">WHERE id = 100 FOR UPDATE → next-key lock (-inf, 100]
</span></span><span class="line"><span class="ln">4</span><span class="cl">WHERE id = 200 FOR UPDATE → next-key lock (100, 200]
</span></span><span class="line"><span class="ln">5</span><span class="cl">WHERE id = 300 FOR UPDATE → next-key lock (200, 300]
</span></span><span class="line"><span class="ln">6</span><span class="cl">WHERE id BETWEEN 150 AND 250 FOR UPDATE → next-key lock (100, 200] + (200, 300]</span></span></code></pre></div><h3 id="4-insert-intention-lock--insert-之前的-gap-lock">4. Insert Intention Lock — INSERT 之前的 gap lock</h3>
<p><code>INSERT</code> 不直接 lock 整個 gap、而是 <em>insert intention lock</em> — 比 gap lock 弱、允許多個 INSERT 同 gap 並行（不同 id）。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Transaction 1
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">)</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="p">(</span><span class="mi">150</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- Transaction 2
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">)</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="p">(</span><span class="mi">175</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- 同 gap (100, 200)、兩個 INSERT 並行、不阻塞</span></span></span></code></pre></div><p>但如果 Transaction 1 已 hold gap lock（through SELECT FOR UPDATE）、Transaction 2 INSERT 必須等。</p>
<h3 id="5-auto-inc-lock--auto-increment-column-專用">5. Auto-Inc Lock — Auto-Increment column 專用</h3>
<p><code>INSERT INTO orders (id) VALUES (DEFAULT)</code> 取得 auto-increment value 時 lock。Mode：</p>
<ul>
<li><code>innodb_autoinc_lock_mode=0</code>（traditional）：lock 整個 INSERT statement 期間、其他 INSERT 必須等</li>
<li><code>innodb_autoinc_lock_mode=1</code>（consecutive）：lock 短時間（取值期間）、INSERT 1 row 不會阻塞其他</li>
<li><code>innodb_autoinc_lock_mode=2</code>（interleaved、8.0+ 預設（5.7 預設仍是 1））：完全並行、auto-inc value 不保證連續但可並行</li>
</ul>
<p>8.0+ 預設 mode=2、性能高、但 <em>binlog format 必須 ROW</em>（STATEMENT 行為錯）。</p>
<h2 id="isolation-level-對-lock-的決定性影響">Isolation Level 對 Lock 的決定性影響</h2>
<p>InnoDB 4 個 isolation level、lock 行為完全不同：</p>
<table>
  <thead>
      <tr>
          <th>Isolation</th>
          <th>Read 行為</th>
          <th>Lock 範圍</th>
          <th>Default?</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>READ UNCOMMITTED</td>
          <td>可讀 dirty data</td>
          <td>純 record lock、無 gap</td>
          <td>否</td>
      </tr>
      <tr>
          <td>READ COMMITTED</td>
          <td>每個 statement 看當下 committed</td>
          <td>純 record lock、無 gap</td>
          <td>否</td>
      </tr>
      <tr>
          <td>REPEATABLE READ</td>
          <td>Transaction 內 snapshot consistent</td>
          <td>Record + gap + next-key</td>
          <td><strong>是</strong></td>
      </tr>
      <tr>
          <td>SERIALIZABLE</td>
          <td>強制 SELECT 變 SELECT &hellip; FOR SHARE</td>
          <td>Record + gap + next-key 加重</td>
          <td>否</td>
      </tr>
  </tbody>
</table>
<p><strong>REPEATABLE READ + Gap lock 是 deadlock 主要來源</strong>：</p>
<ul>
<li>預設 isolation level</li>
<li>為了 <em>保證 repeatable read</em>（同 transaction 內讀同樣資料）、強制 gap lock 防 phantom row</li>
<li>但 gap lock 經常 lock 比預期廣的範圍、deadlock 機率上升</li>
</ul>
<p><strong>改成 READ COMMITTED 的取捨</strong>：</p>
<ul>
<li>優點：無 gap lock、deadlock 大降、寫吞吐上升</li>
<li>缺點：transaction 內讀同 query 結果可能不同（non-repeatable read）</li>
<li>重要：<em>binlog format 必須 ROW</em>（STATEMENT 在 READ COMMITTED 下 replication 行為不一致）</li>
<li>多數 MySQL production 用 READ COMMITTED 跑 OLTP、REPEATABLE READ 留給特殊 case</li>
</ul>
<p><strong>對比 PostgreSQL</strong>：</p>
<ul>
<li>PG 預設 isolation 是 <em>READ COMMITTED</em>（不是 RR）</li>
<li>PG 的 RR 用 <em>snapshot isolation</em>（不靠 gap lock）、deadlock 少</li>
<li>這是 MySQL 跟 PG 在 <em>並行控制 model</em> 的根本差異 — MySQL 用 lock-based、PG 用 MVCC-heavy</li>
</ul>
<h2 id="用-show-engine-innodb-status-讀-lock-狀態">用 SHOW ENGINE INNODB STATUS 讀 lock 狀態</h2>
<p><code>SHOW ENGINE INNODB STATUS\G</code> 是 production debug lock contention 的主要工具：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl">------------
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">TRANSACTIONS
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">------------
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">Trx id counter 12350
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">Purge done for trx&#39;s n:o &lt; 12340 undo n:o &lt; 0 state: running but idle
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">History list length 5
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">---TRANSACTION 12345, ACTIVE 30 sec  -- 長 transaction、警訊
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">3 lock struct(s), heap size 1136, 5 row lock(s)
</span></span><span class="line"><span class="ln">10</span><span class="cl">MySQL thread id 100, OS thread handle ..., query id ...
</span></span><span class="line"><span class="ln">11</span><span class="cl">SELECT * FROM orders WHERE id &gt; 100 FOR UPDATE
</span></span><span class="line"><span class="ln">12</span><span class="cl">------- TRX HAS BEEN WAITING 5 SEC FOR THIS LOCK:
</span></span><span class="line"><span class="ln">13</span><span class="cl">RECORD LOCKS space id 50 page no 5 n bits 80 index PRIMARY of table `production`.`orders`
</span></span><span class="line"><span class="ln">14</span><span class="cl">trx id 12345 lock_mode X locks gap before rec  -- gap lock</span></span></code></pre></div><p>關鍵欄位：</p>
<ul>
<li><code>ACTIVE N sec</code>：transaction 跑多久（長 transaction 嫌疑）</li>
<li><code>lock_mode X / S</code>：exclusive / shared lock</li>
<li><code>locks rec but not gap</code> / <code>locks gap before rec</code> / <code>locks rec</code>：是 record / gap / next-key</li>
<li><code>TRX HAS BEEN WAITING N SEC FOR THIS LOCK</code>：等多久、超過幾秒就是 lock contention</li>
</ul>
<p><code>SELECT * FROM information_schema.INNODB_TRX</code> / <code>INNODB_LOCKS</code> (5.7) / <code>performance_schema.data_locks</code> (8.0) 給 <em>structured</em> lock 視圖。</p>
<h2 id="5-個-production-踩雷">5 個 Production 踩雷</h2>
<h3 id="1-gap-lock-阻塞-insert--lock-不存在的-row">1. Gap lock 阻塞 INSERT — 「Lock 不存在的 row」</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Transaction 1
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">BEGIN</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">user_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">100</span><span class="w"> </span><span class="k">FOR</span><span class="w"> </span><span class="k">UPDATE</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- 假設 user_id=100 沒任何 order、預期沒 lock 任何 row
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="c1">-- Transaction 2
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1"></span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(</span><span class="n">user_id</span><span class="p">,</span><span class="w"> </span><span class="n">amount</span><span class="p">)</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="p">(</span><span class="mi">100</span><span class="p">,</span><span class="w"> </span><span class="mi">50</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="w"></span><span class="c1">-- 等！為什麼？</span></span></span></code></pre></div><p>問題：<code>WHERE user_id = 100</code> <em>沒有 record</em> 時、InnoDB 仍 lock <em>user_id=100 應該在的 gap</em>（防 phantom）、Transaction 2 INSERT 進這個 gap 被阻擋。</p>
<p>修法：</p>
<ul>
<li>改 READ COMMITTED isolation</li>
<li>或不用 <code>SELECT ... FOR UPDATE</code> on empty result、改 <em>application 層 check + INSERT</em> pattern</li>
<li>用 <code>INSERT ... ON DUPLICATE KEY UPDATE</code> 或 <code>INSERT IGNORE</code> 避免 SELECT FOR UPDATE</li>
</ul>
<h3 id="2-auto-inc-lock-contention--大量並行-insert">2. Auto-Inc Lock Contention — 大量並行 INSERT</h3>
<p><code>innodb_autoinc_lock_mode=0</code> 或 <code>=1</code> 模式下、大量並行 INSERT 撞 auto-inc lock、寫吞吐 cap。</p>
<p>修法：</p>
<ul>
<li>設 <code>innodb_autoinc_lock_mode=2</code>（interleaved、8.0+ 預設（5.7 預設仍是 1））</li>
<li>確認 <code>binlog_format=ROW</code>（mode=2 必須）</li>
<li>接受 auto-inc value 不連續（id 可能跳號）</li>
</ul>
<h3 id="3-fk-lock-cascading--父子-transaction-互鎖">3. FK Lock Cascading — 父子 transaction 互鎖</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- orders 表有 customer_id FK → customers.id
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1">-- Transaction 1
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="c1"></span><span class="k">UPDATE</span><span class="w"> </span><span class="n">customers</span><span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;...&#39;</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">100</span><span class="p">;</span><span class="w">  </span><span class="c1">-- lock customers row
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- Transaction 2
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"></span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(</span><span class="n">customer_id</span><span class="p">,</span><span class="w"> </span><span class="n">amount</span><span class="p">)</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="p">(</span><span class="mi">100</span><span class="p">,</span><span class="w"> </span><span class="mi">50</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w"></span><span class="c1">-- FK check 需要 lock customers row id=100、等 Transaction 1</span></span></span></code></pre></div><p>FK 強制 <em>每個 INSERT child 都要 shared lock parent</em>、parent 的任何 UPDATE 都會 lock 所有 child INSERT。</p>
<p>修法：</p>
<ul>
<li>評估 FK 是否真的需要（high-write 場景考慮 application-level enforcement）</li>
<li>短 transaction 縮短 lock 時間</li>
<li>FK 設計時讓 <em>parent UPDATE 少</em> / <em>child INSERT 多</em>（parent 是穩定資料）</li>
</ul>
<h3 id="4-large-transaction-lock-holding--1-個-transaction-拖全-cluster">4. Large Transaction Lock Holding — 1 個 transaction 拖全 cluster</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">BEGIN</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="c1">-- 100K row 的 batch UPDATE
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="c1"></span><span class="k">UPDATE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;archived&#39;</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="s1">&#39;2024-01-01&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- 跑 5 分鐘、持 100K row 的 lock
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1">-- 其他 transaction 撞到任何被 lock 的 row 都等 5 分鐘
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"></span><span class="k">COMMIT</span><span class="p">;</span></span></span></code></pre></div><p>長 transaction 是 <em>lock contention 災難</em>。</p>
<p>修法：</p>
<ul>
<li>
<p>把 batch operation <em>拆 chunk</em>（每 chunk 1000 row、commit、繼續）：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">DO</span><span class="w"> </span><span class="err">{</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">  </span><span class="k">START</span><span class="w"> </span><span class="k">TRANSACTION</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">  </span><span class="k">UPDATE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;archived&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">  </span><span class="k">WHERE</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="s1">&#39;2024-01-01&#39;</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="s1">&#39;archived&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">  </span><span class="k">LIMIT</span><span class="w"> </span><span class="mi">1000</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">  </span><span class="k">COMMIT</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w"></span><span class="err">}</span><span class="w"> </span><span class="n">WHILE</span><span class="w"> </span><span class="n">rows_affected</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span></span></span></code></pre></div></li>
<li>
<p>用 <em>pt-archiver</em> tool（Percona）對 batch UPDATE / DELETE 自動 chunked</p>
</li>
<li>
<p>監控 <code>information_schema.innodb_trx</code> 找出 long-running transaction</p>
</li>
</ul>
<h3 id="5-read-committed--binlog-row-interaction">5. READ COMMITTED + Binlog ROW Interaction</h3>
<p>READ COMMITTED isolation 改善 deadlock、但對 <em>binlog format</em> 有要求：</p>
<ul>
<li><code>binlog_format=STATEMENT</code>：READ COMMITTED 下 transaction 看到不同 snapshot、replicate 後 replica 結果可能 <em>不同於 primary</em>（broken replication semantically）</li>
<li><code>binlog_format=ROW</code>：每個 row event 都 explicit、READ COMMITTED 跟 ROW 兼容、replica 結果一致</li>
<li><code>binlog_format=MIXED</code>：部分 case 仍可能 fall back STATEMENT、不推薦</li>
</ul>
<p>修法：</p>
<ul>
<li>用 READ COMMITTED 時、強制 <code>binlog_format=ROW</code></li>
<li>全 cluster server（primary + replica + Group Replication members）統一 binlog_format</li>
<li>Migration 5.7 STATEMENT → 8.0 ROW 時、isolation 跟 binlog format 一起 review</li>
</ul>
<h2 id="跟其他模組整合">跟其他模組整合</h2>
<h3 id="跟-replication">跟 Replication</h3>
<p><code>binlog_format=ROW</code> 跟 isolation level 互動已述。Replica apply ROW binlog 時、replica 上 <em>也 acquire 同樣 lock</em>、replica 上的 long query 跟 replication lag 互動。詳見 <a href="/blog/backend/01-database/vendors/mysql/replication-topology/" data-link-title="MySQL Replication Topology：async / semi-sync / GTID 不是三選一、是三個 trade-off 軸的疊加" data-link-desc="MySQL replication 不是「選 async 還是 semi-sync」、是 *durability / latency / consistency* 三個 trade-off 軸的疊加；GTID 是跨 mode 的 infrastructure layer、不是第三種 mode。本文走 3 軸取捨模型 → async / semi-sync 行為對比 → GTID 替代 binlog-position 的好處 → 配置 step-by-step → 5 production 踩雷（lag 暴衝 / semi-sync 退回 async / GTID gap / Loss-Less semi-sync 真的 loss-less / chained replication 雪崩）→ 跟 Aurora MySQL / Vitess / ProxySQL / Orchestrator 整合">Replication Topology</a>。</p>
<h3 id="跟-group-replication">跟 Group Replication</h3>
<p>GR certification phase 跟 row lock 衝突 — write conflict 檢測在 certification、不是 lock。但 <em>local row lock</em> 仍存在、影響 single-instance write throughput。詳見 <a href="/blog/backend/01-database/vendors/mysql/group-replication/" data-link-title="MySQL Group Replication / InnoDB Cluster：single-primary vs multi-primary mode 對 transaction certification 的影響" data-link-desc="MySQL Group Replication 提供 synchronous multi-primary replication、用 Paxos-like Group Communication Engine（GCE）達成 quorum-based commit。但「multi-primary」不是「single-primary 多開幾個 write 入口」、是 *transaction conflict detection &#43; certification* 整個機制不同。本文走 GR 機制（GCE &#43; certification &#43; applier）、single-primary vs multi-primary mode、InnoDB Cluster 跟 MySQL Shell / Router 整合、5 production 踩雷（cert lag / write conflict / large transaction / network partition / member 加入 catch-up）、何時用 GR 何時用傳統 replication">Group Replication</a>。</p>
<h3 id="跟-online-schema-change">跟 Online Schema Change</h3>
<p>gh-ost / pt-osc 在 cut-over 階段需要 metadata lock、跟 long-running transaction 衝突。Lock contention deep dive 跟 OSC cut-over 議題密切。詳見 <a href="/blog/backend/01-database/vendors/mysql/online-schema-change-tools/" data-link-title="MySQL Online Schema Change：gh-ost 跟 pt-online-schema-change 兩條完全不同的 ghost table 路徑" data-link-desc="MySQL ALTER TABLE 可能鎖整張表，production 需要 online schema change 流程。gh-ost（GitHub）跟 pt-online-schema-change（Percona）都用 ghost table 解決、但底層機制完全不同：pt-osc 用 trigger 同步、gh-ost 用 binlog stream 同步。本文走兩工具機制對照表 → trigger vs binlog 各自取捨 → 配置 step-by-step → 5 production 踩雷（trigger overhead / binlog 延遲 / FK constraint / hot trigger lock / 切換瞬間 deadlock）→ 何時用哪一個">Online Schema Change Tools</a>。</p>
<h3 id="跟-query-optimization">跟 Query Optimization</h3>
<p>Slow query 持 lock 久、放大 contention。<code>EXPLAIN ANALYZE</code> 看實際執行時間、跟 lock holding time 直接相關。詳見 <a href="/blog/backend/01-database/vendors/mysql/query-optimization/" data-link-title="MySQL Query Optimization：從 EXPLAIN 看到實際執行、5 條 query 從 5 秒變 50ms 的 anatomy" data-link-desc="MySQL query 慢的根因不在「SQL 寫法」、在「optimizer 選錯 plan」。本文從 5 個常見 production case 開場（5 秒 → 50ms / 30 秒 → 200ms / 8 秒 → 30ms 等）、走 EXPLAIN / EXPLAIN ANALYZE / optimizer trace 三層分析工具、index hint vs optimizer hint 取捨、cardinality estimation 失效時的修法、5 production 踩雷（statistics 過時 / forced index 用錯 / hash join 沒觸發 / range scan 退化 ALL / derived table materialization）">Query Optimization</a>。</p>
<h3 id="跟-innodb-tuning">跟 InnoDB Tuning</h3>
<p><code>innodb_lock_wait_timeout=50</code>（預設 50 秒）— lock wait 超時 transaction 自動 rollback、避免無限等。production 建議調短（10-20 秒）、快 fail 給 application retry。詳見 <a href="/blog/backend/01-database/vendors/mysql/innodb-tuning/" data-link-title="MySQL InnoDB Tuning：為什麼一個 100 GB DB 在 64 GB RAM server 上 query 慢 5 倍" data-link-desc="InnoDB 是 MySQL 預設 storage engine、預設值給 256 MB buffer pool（早期 default）。本文從一個常見痛點開場（DB &gt; RAM 但 server 仍 swap）、走 4 個 critical knob（buffer pool / redo log / flush method / IO capacity）、各自如何影響讀寫吞吐、配置 step-by-step、5 production 踩雷（buffer pool warm-up / log file 大小 / 設 sync_binlog=0 換速度 / IO scheduler / undo log 膨脹）、跟 SSD / NVMe / EBS 的 IO 假設">InnoDB Tuning</a>。</p>
<h2 id="跟-postgresql-lock-model-對比">跟 PostgreSQL Lock model 對比</h2>
<table>
  <thead>
      <tr>
          <th>維度</th>
          <th>MySQL InnoDB</th>
          <th>PostgreSQL</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Concurrency model</td>
          <td>Lock-based（rec / gap / next-key）</td>
          <td>MVCC-heavy（few explicit lock）</td>
      </tr>
      <tr>
          <td>預設 isolation</td>
          <td>REPEATABLE READ</td>
          <td>READ COMMITTED</td>
      </tr>
      <tr>
          <td>Gap lock</td>
          <td>有</td>
          <td>無對應（PG 用 predicate lock for SERIALIZABLE）</td>
      </tr>
      <tr>
          <td>Deadlock 機率</td>
          <td>中-高</td>
          <td>低</td>
      </tr>
      <tr>
          <td>Auto-inc</td>
          <td>內建 + auto-inc lock</td>
          <td>SEQUENCE（無對應 lock 議題）</td>
      </tr>
      <tr>
          <td>Snapshot isolation</td>
          <td>部分（RR 內）</td>
          <td>完整（MVCC 跑全 stack）</td>
      </tr>
  </tbody>
</table>
<p>PG 用 MVCC 跑大部分並行 control、少數 case 才用 explicit lock、整體 deadlock 機率低。MySQL 用 lock-based + MVCC mixed、production 必須懂 lock pattern。</p>
<h2 id="觀測-metric">觀測 metric</h2>
<p>Production 持續 monitor：</p>
<ul>
<li><code>Innodb_row_lock_waits</code> / <code>_time</code> → lock wait 累計</li>
<li><code>Innodb_deadlocks</code> → deadlock 次數（5.7+ 有、之前要 parse SHOW ENGINE）</li>
<li><code>performance_schema.data_lock_waits</code> → 即時 lock wait 視圖（8.0+）</li>
<li><code>information_schema.innodb_trx</code> → long-running transaction</li>
<li><code>slow_query_log</code> → 看 query 是否花太多 time 在 lock wait</li>
</ul>
<p>對 deadlock：把 <code>innodb_print_all_deadlocks=ON</code>、所有 deadlock 寫 error log、不用 <code>SHOW ENGINE</code> 才看到。</p>
<h2 id="何時改-isolation-level">何時改 isolation level</h2>
<table>
  <thead>
      <tr>
          <th>場景</th>
          <th>建議 isolation</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>典型 web OLTP、低-中寫吞吐</td>
          <td>REPEATABLE READ（預設）</td>
      </tr>
      <tr>
          <td>高寫吞吐、deadlock 頻繁</td>
          <td>READ COMMITTED</td>
      </tr>
      <tr>
          <td>金融 transaction、需要 strict isolation</td>
          <td>REPEATABLE READ + 仔細 review</td>
      </tr>
      <tr>
          <td>嚴格 serializable（小 case）</td>
          <td>SERIALIZABLE（performance penalty）</td>
      </tr>
      <tr>
          <td>跨 region replication + 強一致</td>
          <td>用 Group Replication / Spanner 而不是 isolation level</td>
      </tr>
  </tbody>
</table>
<h2 id="相關連結">相關連結</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL vendor overview</a></li>
<li><a href="/blog/backend/01-database/vendors/mysql/replication-topology/" data-link-title="MySQL Replication Topology：async / semi-sync / GTID 不是三選一、是三個 trade-off 軸的疊加" data-link-desc="MySQL replication 不是「選 async 還是 semi-sync」、是 *durability / latency / consistency* 三個 trade-off 軸的疊加；GTID 是跨 mode 的 infrastructure layer、不是第三種 mode。本文走 3 軸取捨模型 → async / semi-sync 行為對比 → GTID 替代 binlog-position 的好處 → 配置 step-by-step → 5 production 踩雷（lag 暴衝 / semi-sync 退回 async / GTID gap / Loss-Less semi-sync 真的 loss-less / chained replication 雪崩）→ 跟 Aurora MySQL / Vitess / ProxySQL / Orchestrator 整合">MySQL Replication Topology</a>（binlog format + isolation 互動）</li>
<li><a href="/blog/backend/01-database/vendors/mysql/query-optimization/" data-link-title="MySQL Query Optimization：從 EXPLAIN 看到實際執行、5 條 query 從 5 秒變 50ms 的 anatomy" data-link-desc="MySQL query 慢的根因不在「SQL 寫法」、在「optimizer 選錯 plan」。本文從 5 個常見 production case 開場（5 秒 → 50ms / 30 秒 → 200ms / 8 秒 → 30ms 等）、走 EXPLAIN / EXPLAIN ANALYZE / optimizer trace 三層分析工具、index hint vs optimizer hint 取捨、cardinality estimation 失效時的修法、5 production 踩雷（statistics 過時 / forced index 用錯 / hash join 沒觸發 / range scan 退化 ALL / derived table materialization）">MySQL Query Optimization</a>（slow query → lock contention）</li>
<li><a href="/blog/backend/01-database/vendors/mysql/innodb-tuning/" data-link-title="MySQL InnoDB Tuning：為什麼一個 100 GB DB 在 64 GB RAM server 上 query 慢 5 倍" data-link-desc="InnoDB 是 MySQL 預設 storage engine、預設值給 256 MB buffer pool（早期 default）。本文從一個常見痛點開場（DB &gt; RAM 但 server 仍 swap）、走 4 個 critical knob（buffer pool / redo log / flush method / IO capacity）、各自如何影響讀寫吞吐、配置 step-by-step、5 production 踩雷（buffer pool warm-up / log file 大小 / 設 sync_binlog=0 換速度 / IO scheduler / undo log 膨脹）、跟 SSD / NVMe / EBS 的 IO 假設">MySQL InnoDB Tuning</a>（lock_wait_timeout）</li>
<li><a href="/blog/backend/01-database/vendors/mysql/group-replication/" data-link-title="MySQL Group Replication / InnoDB Cluster：single-primary vs multi-primary mode 對 transaction certification 的影響" data-link-desc="MySQL Group Replication 提供 synchronous multi-primary replication、用 Paxos-like Group Communication Engine（GCE）達成 quorum-based commit。但「multi-primary」不是「single-primary 多開幾個 write 入口」、是 *transaction conflict detection &#43; certification* 整個機制不同。本文走 GR 機制（GCE &#43; certification &#43; applier）、single-primary vs multi-primary mode、InnoDB Cluster 跟 MySQL Shell / Router 整合、5 production 踩雷（cert lag / write conflict / large transaction / network partition / member 加入 catch-up）、何時用 GR 何時用傳統 replication">MySQL Group Replication</a>（cert vs lock）</li>
<li><a href="/blog/backend/01-database/vendors/mysql/online-schema-change-tools/" data-link-title="MySQL Online Schema Change：gh-ost 跟 pt-online-schema-change 兩條完全不同的 ghost table 路徑" data-link-desc="MySQL ALTER TABLE 可能鎖整張表，production 需要 online schema change 流程。gh-ost（GitHub）跟 pt-online-schema-change（Percona）都用 ghost table 解決、但底層機制完全不同：pt-osc 用 trigger 同步、gh-ost 用 binlog stream 同步。本文走兩工具機制對照表 → trigger vs binlog 各自取捨 → 配置 step-by-step → 5 production 踩雷（trigger overhead / binlog 延遲 / FK constraint / hot trigger lock / 切換瞬間 deadlock）→ 何時用哪一個">MySQL Online Schema Change</a>（metadata lock）</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/" data-link-title="PostgreSQL MVCC &#43; Lock Model：為什麼 PG 比 MySQL 少 deadlock、但 vacuum 是別的代價" data-link-desc="PG 用 *MVCC-heavy &#43; 少 explicit lock* 的並行控制、跟 MySQL InnoDB 的 *lock-based*（record / gap / next-key）相反。本文走 MVCC 機制（tuple version &#43; xmin/xmax &#43; visibility）、PG 4 種 lock（row-level / table-level / advisory / predicate）、預測 SERIALIZABLE 行為、5 production 踩雷（idle transaction 卡 vacuum / SELECT FOR UPDATE 跨 transaction / advisory lock 沒釋放 / bloat 不是 vacuum 問題 / predicate lock 在 SSI 下 rollback）、跟 MySQL lock-contention sibling 對比">PostgreSQL MVCC + Lock Model</a>（PG sibling、為什麼 PG 比 MySQL 少 deadlock — pure MVCC vs MVCC + gap lock）</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL vendor page</a>（MVCC vs lock model）</li>
<li><a href="/blog/backend/knowledge-cards/isolation-level/" data-link-title="Isolation Level" data-link-desc="說明資料庫交易隔離級別如何影響並發讀寫結果">Isolation Level 卡片</a></li>
<li>官方：<a href="https://dev.mysql.com/doc/refman/8.0/en/innodb-locking.html">InnoDB Locking</a></li>
</ul>
]]></content:encoded></item><item><title>PostgreSQL MVCC + Lock Model：為什麼 PG 比 MySQL 少 deadlock、但 vacuum 是別的代價</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/</guid><description>&lt;blockquote>
&lt;p>本文是 &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL&lt;/a> overview 的 implementation-layer deep article。Overview 已說明 PG 在 OLTP 譜系的定位、本文聚焦 &lt;em>MVCC + lock model&lt;/em> — PG 並行控制機制跟跟 MySQL lock-based 不同。&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;h2 id="pg-mvcc每次更新都-新增-tuple不改舊版">PG MVCC：每次更新都 &lt;em>新增 tuple&lt;/em>、不改舊版&lt;/h2>
&lt;p>PG 的並行控制核心是 &lt;em>Multi-Version Concurrency Control&lt;/em> — UPDATE 不修改原 row、是 &lt;em>新增&lt;/em> 一個 tuple version、舊 version 留在 table 直到 VACUUM 清理：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">原 row: (id=1, status=&amp;#39;pending&amp;#39;, xmin=100, xmax=NULL)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl"> ↓ UPDATE status=&amp;#39;shipped&amp;#39;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">新 tuple: (id=1, status=&amp;#39;shipped&amp;#39;, xmin=200, xmax=NULL)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">舊 tuple 標 xmax=200（不刪、給其他 transaction 看舊 version）&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>xmin&lt;/code> / &lt;code>xmax&lt;/code> 是 &lt;em>creator transaction id&lt;/em> / &lt;em>destroyer transaction id&lt;/em>。每個 SELECT 用 &lt;em>snapshot&lt;/em>（含當下 active transaction list）判斷哪些 tuple 對自己可見：&lt;/p>
&lt;ul>
&lt;li>自己 transaction id &amp;gt; tuple.xmin 且 (tuple.xmax = NULL 或自己 transaction id &amp;lt; tuple.xmax) → 可見&lt;/li>
&lt;li>否則 → 看不到（過去 / 未來版本）&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>結果&lt;/strong>：&lt;/p>
&lt;ul>
&lt;li>&lt;em>Readers 不 lock writers&lt;/em>：SELECT 看 snapshot、不 block UPDATE&lt;/li>
&lt;li>&lt;em>Writers 不 lock readers&lt;/em>：UPDATE 寫新 tuple、不影響正在跑的 SELECT snapshot&lt;/li>
&lt;li>&lt;em>Writers 只 lock 同一 row 的 writers&lt;/em>：兩個 UPDATE 同 row 才 conflict&lt;/li>
&lt;/ul>
&lt;p>跟 MySQL InnoDB &lt;em>lock-based&lt;/em>（&lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/lock-contention/" data-link-title="MySQL Lock Contention：在 staging 重現的 deadlock、production 跑 6 個月才出現" data-link-desc="MySQL InnoDB 的 lock 是 row-level、但 *為什麼某些 row 莫名其妙也被 lock* 是 gap lock / next-key lock 設計造成的隱性行為。本文從一個 production case 開場（staging 重現 deadlock / production 6 個月後突然爆）、走 5 種 InnoDB lock 類型（record / gap / next-key / insert intention / auto-inc）、isolation level 對 lock 行為的決定性影響、deadlock detection / SHOW ENGINE INNODB STATUS 解讀、5 production 踩雷（gap lock 阻塞 INSERT / auto-inc lock contention / FK lock cascading / large transaction lock holding / READ COMMITTED 跟 binlog ROW 互動）">Lock Contention&lt;/a>）對比：&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>本文是 <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a> overview 的 implementation-layer deep article。Overview 已說明 PG 在 OLTP 譜系的定位、本文聚焦 <em>MVCC + lock model</em> — PG 並行控制機制跟跟 MySQL lock-based 不同。</p></blockquote>
<hr>
<h2 id="pg-mvcc每次更新都-新增-tuple不改舊版">PG MVCC：每次更新都 <em>新增 tuple</em>、不改舊版</h2>
<p>PG 的並行控制核心是 <em>Multi-Version Concurrency Control</em> — UPDATE 不修改原 row、是 <em>新增</em> 一個 tuple version、舊 version 留在 table 直到 VACUUM 清理：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">原 row:    (id=1, status=&#39;pending&#39;, xmin=100, xmax=NULL)
</span></span><span class="line"><span class="ln">2</span><span class="cl">                 ↓ UPDATE status=&#39;shipped&#39;
</span></span><span class="line"><span class="ln">3</span><span class="cl">新 tuple:  (id=1, status=&#39;shipped&#39;, xmin=200, xmax=NULL)
</span></span><span class="line"><span class="ln">4</span><span class="cl">舊 tuple 標 xmax=200（不刪、給其他 transaction 看舊 version）</span></span></code></pre></div><p><code>xmin</code> / <code>xmax</code> 是 <em>creator transaction id</em> / <em>destroyer transaction id</em>。每個 SELECT 用 <em>snapshot</em>（含當下 active transaction list）判斷哪些 tuple 對自己可見：</p>
<ul>
<li>自己 transaction id &gt; tuple.xmin 且 (tuple.xmax = NULL 或自己 transaction id &lt; tuple.xmax) → 可見</li>
<li>否則 → 看不到（過去 / 未來版本）</li>
</ul>
<p><strong>結果</strong>：</p>
<ul>
<li><em>Readers 不 lock writers</em>：SELECT 看 snapshot、不 block UPDATE</li>
<li><em>Writers 不 lock readers</em>：UPDATE 寫新 tuple、不影響正在跑的 SELECT snapshot</li>
<li><em>Writers 只 lock 同一 row 的 writers</em>：兩個 UPDATE 同 row 才 conflict</li>
</ul>
<p>跟 MySQL InnoDB <em>lock-based</em>（<a href="/blog/backend/01-database/vendors/mysql/lock-contention/" data-link-title="MySQL Lock Contention：在 staging 重現的 deadlock、production 跑 6 個月才出現" data-link-desc="MySQL InnoDB 的 lock 是 row-level、但 *為什麼某些 row 莫名其妙也被 lock* 是 gap lock / next-key lock 設計造成的隱性行為。本文從一個 production case 開場（staging 重現 deadlock / production 6 個月後突然爆）、走 5 種 InnoDB lock 類型（record / gap / next-key / insert intention / auto-inc）、isolation level 對 lock 行為的決定性影響、deadlock detection / SHOW ENGINE INNODB STATUS 解讀、5 production 踩雷（gap lock 阻塞 INSERT / auto-inc lock contention / FK lock cascading / large transaction lock holding / READ COMMITTED 跟 binlog ROW 互動）">Lock Contention</a>）對比：</p>
<ul>
<li>MySQL：SELECT FOR UPDATE 用 gap lock 防 phantom、deadlock 機率高</li>
<li>PG：MVCC + snapshot 自然防 phantom（read 看 snapshot）、deadlock 少</li>
</ul>
<p>但 PG 代價是 <em>VACUUM 治理</em> — dead tuple 不清理會佔 disk + 影響 query 效率。詳見 <a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">Autovacuum Tuning</a>。</p>
<h2 id="pg-4-種-lock">PG 4 種 lock</h2>
<p>PG 仍有 lock、但場景跟 MySQL 不同：</p>
<h3 id="1-row-level-lock--主要由-update--delete--select-for-update-取">1. Row-level lock — 主要由 UPDATE / DELETE / SELECT FOR UPDATE 取</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">BEGIN</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">100</span><span class="w"> </span><span class="k">FOR</span><span class="w"> </span><span class="k">UPDATE</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- 對 id=100 row 加 ROW EXCLUSIVE lock
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1">-- 其他 transaction 試 UPDATE / DELETE id=100 必須等</span></span></span></code></pre></div><p>Row-level lock <em>不 block reader</em>（SELECT 看 snapshot、不檢查 lock）。</p>
<h3 id="2-table-level-lock--ddl-跟少數-select-for-場景">2. Table-level lock — DDL 跟少數 SELECT FOR 場景</h3>
<p>PG 有 8 種 table lock mode、嚴重程度遞增：</p>
<table>
  <thead>
      <tr>
          <th>Mode</th>
          <th>行為</th>
          <th>衝突</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>ACCESS SHARE</td>
          <td>SELECT 跑</td>
          <td>跟 ACCESS EXCLUSIVE 衝突</td>
      </tr>
      <tr>
          <td>ROW SHARE</td>
          <td>SELECT FOR UPDATE / FOR SHARE</td>
          <td>跟 EXCLUSIVE 衝突</td>
      </tr>
      <tr>
          <td>ROW EXCLUSIVE</td>
          <td>UPDATE / DELETE / INSERT</td>
          <td>跟 SHARE 衝突</td>
      </tr>
      <tr>
          <td>SHARE UPDATE EXCLUSIVE</td>
          <td>VACUUM / ANALYZE / CREATE INDEX CONCURRENTLY</td>
          <td>跟同 mode + 高 mode 衝突</td>
      </tr>
      <tr>
          <td>SHARE</td>
          <td>CREATE INDEX（non-concurrent）</td>
          <td>跟 ROW EXCLUSIVE 衝突</td>
      </tr>
      <tr>
          <td>SHARE ROW EXCLUSIVE</td>
          <td>CREATE TRIGGER / 某些 ALTER</td>
          <td>跟 ROW EXCLUSIVE 衝突</td>
      </tr>
      <tr>
          <td>EXCLUSIVE</td>
          <td>REFRESH MATERIALIZED VIEW</td>
          <td>跟所有 + 自身衝突</td>
      </tr>
      <tr>
          <td>ACCESS EXCLUSIVE</td>
          <td>DROP / ALTER TABLE / VACUUM FULL</td>
          <td>跟所有衝突</td>
      </tr>
  </tbody>
</table>
<p>DDL（ALTER / DROP）拿 ACCESS EXCLUSIVE、跟所有衝突。Production 跑 ALTER 必須短時間或走 <a href="/blog/backend/01-database/vendors/postgresql/online-schema-change/" data-link-title="PostgreSQL Online Schema Change：先用 ALTER 內建特性、不能解才 pg_repack / pg-osc" data-link-desc="PostgreSQL ALTER TABLE 對多數變更已是 *fast catalog-only*（add column nullable / drop column / 改 default），不必走 ghost table tool。本文走 PG 內建 fast DDL 行為、何時必須走 pg_repack / pg-osc、兩工具機制對比（trigger-based vs WAL-shipping）、配置 step-by-step、5 production 踩雷（lock 升級 / VACUUM FULL 誤用 / pg_repack version mismatch / concurrent index 失敗清理 / generated stored column 不能 online）、跟 MySQL gh-ost / pt-osc sibling 對比">Online Schema Change</a>。</p>
<h3 id="3-advisory-lock--application-自己控">3. Advisory lock — Application 自己控</h3>
<p>PG 提供 <em>advisory lock</em> 給 application 用、不關 row / table 結構：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Session 1
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_advisory_lock</span><span class="p">(</span><span class="mi">12345</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- 跑 critical section
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_advisory_unlock</span><span class="p">(</span><span class="mi">12345</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="c1">-- Session 2
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_try_advisory_lock</span><span class="p">(</span><span class="mi">12345</span><span class="p">);</span><span class="w">  </span><span class="c1">-- 試取、不阻塞、返回 false</span></span></span></code></pre></div><p>用途：</p>
<ul>
<li>Application-level 互斥（如：cron job 同時只跑一個）</li>
<li>跨 connection 同步（PG-managed mutex）</li>
<li>Distributed transaction coordinator（lightweight）</li>
</ul>
<p>跟 row lock 不同：advisory lock 不關 row、application 自定義 lock ID 語義。</p>
<h3 id="4-predicate-lock--serializable-isolation-才用">4. Predicate lock — SERIALIZABLE isolation 才用</h3>
<p>PG SERIALIZABLE 用 <em>Serializable Snapshot Isolation (SSI)</em>、追蹤 <em>predicate</em>（query 條件）而不是 <em>row</em>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SET</span><span class="w"> </span><span class="k">TRANSACTION</span><span class="w"> </span><span class="k">ISOLATION</span><span class="w"> </span><span class="k">LEVEL</span><span class="w"> </span><span class="k">SERIALIZABLE</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">BEGIN</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- Predicate lock 紀錄這個 query 看了哪些 predicate
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;pending&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- 其他 transaction INSERT pending order
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1">-- 提交時：PG 偵測 anomaly、rollback 之一
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1"></span><span class="k">COMMIT</span><span class="p">;</span></span></span></code></pre></div><p>跟 MySQL gap lock 不同：</p>
<ul>
<li>MySQL gap lock：<em>pre-lock</em>、防 phantom 在 query 期間</li>
<li>PG predicate lock：<em>post-detect</em>、commit 時偵測 anomaly、退回 transaction</li>
</ul>
<p>PG SSI 對 <em>寫入吞吐影響低</em>（不 pre-lock）、但 <em>transaction rollback 機率高</em>（要 application retry）。</p>
<h2 id="pg-預設-isolationread-committed">PG 預設 isolation：READ COMMITTED</h2>
<p>PG 預設 READ COMMITTED、跟 MySQL InnoDB 預設 REPEATABLE READ 不同：</p>
<table>
  <thead>
      <tr>
          <th>Isolation</th>
          <th>PG 行為</th>
          <th>MySQL InnoDB 對應</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>READ UNCOMMITTED</td>
          <td>PG 視為 READ COMMITTED（不真的支援 dirty read）</td>
          <td>MySQL 真支援</td>
      </tr>
      <tr>
          <td>READ COMMITTED</td>
          <td>每 statement 看當下 committed snapshot（PG 預設）</td>
          <td>一致</td>
      </tr>
      <tr>
          <td>REPEATABLE READ</td>
          <td>Transaction 內 fixed snapshot（純 MVCC）</td>
          <td>MVCC snapshot + gap lock 防 phantom（兩者都 MVCC、差在 phantom 防護機制：PG 靠 snapshot version visibility、InnoDB 加 gap lock pre-lock 範圍）</td>
      </tr>
      <tr>
          <td>SERIALIZABLE</td>
          <td>SSI、commit 時偵測 anomaly</td>
          <td>強 lock + gap</td>
      </tr>
  </tbody>
</table>
<p><strong>對 application code 含意</strong>：</p>
<ul>
<li>PG REPEATABLE READ 對 <em>寫入吞吐</em> 影響低（不 pre-lock、只 retry）</li>
<li>沒 gap lock → INSERT 不被 lock-induced 阻塞</li>
<li>Deadlock 機率比 MySQL 低數量級</li>
</ul>
<p>實務 PG production：用預設 READ COMMITTED 即可、SERIALIZABLE 留給 <em>strict consistency 需求</em>（金融 / 訂單）但接受 retry。</p>
<h2 id="5-個-production-踩雷">5 個 Production 踩雷</h2>
<h3 id="1-idle-transaction-卡-vacuum--bloat-暴增">1. Idle transaction 卡 vacuum — Bloat 暴增</h3>
<p>PG MVCC 仰賴 <em>VACUUM 清理 dead tuple</em>。VACUUM 只清理 <em>沒 active transaction 看得到的 dead tuple</em>。如果有 <em>idle in transaction</em> session 持續開著（application connection pool 連線忘關 transaction）、VACUUM 看不到 <em>該 transaction snapshot 之後的 dead tuple</em>、累積 bloat。</p>
<p>修法：</p>
<ul>
<li>監控 <code>pg_stat_activity</code> 看 <code>state = 'idle in transaction'</code> 持續時間</li>
<li>設 <code>idle_in_transaction_session_timeout = '5min'</code> — 超時 PG 自動 kill 該 session</li>
<li>Application connection pool 配置 <em>不留 transaction 開著</em>（如：pgBouncer transaction pool 自動 commit / rollback）</li>
</ul>
<h3 id="2-select-for-update-跨-transaction--application-retry-麻煩">2. SELECT FOR UPDATE 跨 transaction — Application retry 麻煩</h3>
<p>跟 MySQL 不同：PG SELECT FOR UPDATE 不會 <em>block 其他 SELECT</em>（讀仍可繼續）、但 <em>block 其他 UPDATE / FOR UPDATE</em>。若 application 在 transaction 內 SELECT FOR UPDATE、其他 transaction 等。</p>
<p>如果 application 設計 <em>跨 transaction 持 lock</em>（如：取 lock + return UI + 等用戶操作 + commit）、容易撞 idle in transaction 跟其他 transaction wait。</p>
<p>修法：</p>
<ul>
<li><em>Transaction 短</em>：取 FOR UPDATE → 立刻處理 → commit、不跨 user interaction</li>
<li>跨 user interaction 用 <em>advisory lock</em> 或 application-level state machine、不依賴 row lock</li>
</ul>
<h3 id="3-advisory-lock-沒釋放--session-結束才自動釋放">3. Advisory lock 沒釋放 — Session 結束才自動釋放</h3>
<p><code>pg_advisory_lock()</code> 拿了、沒 <code>pg_advisory_unlock()</code>、lock 直到 <em>session 結束</em> 才自動釋放。Connection pool 重複使用同 connection、可能繼承前面留的 lock。</p>
<p>修法：</p>
<ul>
<li>用 <code>pg_advisory_lock</code> 必 <code>try/finally pg_advisory_unlock</code></li>
<li>或用 <em>session-level</em> 用 transaction-scoped：<code>pg_advisory_xact_lock()</code> — commit / rollback 自動釋放</li>
<li>監控 <code>pg_locks</code> 看 advisory lock count、長期累積是警訊</li>
</ul>
<h3 id="4-bloat-不只是-vacuum-沒跑是-active-transaction-阻擋-vacuum">4. Bloat 不只是 vacuum 沒跑、是 <em>active transaction 阻擋 vacuum</em></h3>
<p>第 #1 點延伸：vacuum 已跑、但 bloat 仍持續成長、原因不是 vacuum 不夠、是 <em>active transaction 阻擋 vacuum 看 dead tuple</em>。</p>
<p>修法：</p>
<ul>
<li>不只看 <code>last_vacuum</code>、看 <em>VACUUM 跑了但沒收回多少</em></li>
<li><code>SELECT * FROM pg_stat_progress_vacuum</code> 看 VACUUM 進度</li>
<li><code>SELECT * FROM pg_stat_activity WHERE backend_xmin IS NOT NULL ORDER BY backend_xmin</code> — 看誰阻擋 vacuum</li>
<li>詳見 <a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">Autovacuum Tuning</a></li>
</ul>
<h3 id="5-serializable-下-transaction-rollback--application-必須-retry">5. SERIALIZABLE 下 transaction rollback — Application 必須 retry</h3>
<p><code>SET TRANSACTION ISOLATION LEVEL SERIALIZABLE</code> 後、PG SSI 偵測到 anomaly 會 <em>rollback transaction</em>、application 看到 <code>serialization failure</code>、必須 retry。</p>
<p>對 <em>不知道要 retry</em> 的 application、SERIALIZABLE 變 production bug。</p>
<p>修法：</p>
<ul>
<li>Application code 加 <em>retry middleware</em>：catch <code>SQLSTATE 40001 (serialization_failure)</code> → exponential backoff retry</li>
<li>不必所有 transaction 走 SERIALIZABLE — 只對 <em>strict consistency 需求</em> 場景 set</li>
<li>高並發 SERIALIZABLE workload 容易 rollback storm、考慮拆 transaction 縮短時間</li>
</ul>
<h2 id="觀測-metric">觀測 metric</h2>
<p>Production 監控：</p>
<ul>
<li><code>pg_stat_activity</code>：active session / idle in transaction / wait_event</li>
<li><code>pg_locks</code>：當前 lock 列表、用 join 看誰 block 誰</li>
<li><code>pg_stat_database.deadlocks</code>：deadlock 計數（PG 較低、但仍要監控）</li>
<li><code>pg_stat_user_tables.n_dead_tup</code> / <code>n_live_tup</code>：dead tuple 比例 — bloat 指標</li>
<li><code>pg_stat_progress_vacuum</code>：VACUUM 進度</li>
</ul>
<h2 id="跟-mysql-lock-model-對比">跟 MySQL Lock Model 對比</h2>
<table>
  <thead>
      <tr>
          <th>維度</th>
          <th>PG MVCC</th>
          <th>MySQL InnoDB Lock</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>主要機制</td>
          <td>MVCC + snapshot</td>
          <td>Lock-based + MVCC mixed</td>
      </tr>
      <tr>
          <td>Readers vs Writers</td>
          <td>不互 block</td>
          <td>預設 RR 下 gap lock 影響</td>
      </tr>
      <tr>
          <td>Deadlock 機率</td>
          <td>低（無 gap lock）</td>
          <td>中-高（gap lock 主要來源）</td>
      </tr>
      <tr>
          <td>Phantom 防護</td>
          <td>Snapshot 自然防 + SSI predicate lock</td>
          <td>Gap lock 預先 lock</td>
      </tr>
      <tr>
          <td>預設 isolation</td>
          <td>READ COMMITTED</td>
          <td>REPEATABLE READ</td>
      </tr>
      <tr>
          <td>成本</td>
          <td>Dead tuple + VACUUM 治理</td>
          <td>Lock contention 治理</td>
      </tr>
      <tr>
          <td>Application code</td>
          <td>SERIALIZABLE 需 retry</td>
          <td>寫得不錯多數時 OK</td>
      </tr>
  </tbody>
</table>
<p>兩者解決同一問題（並行控制）、用不同策略。PG 用 <em>空間換時間</em>（保留多版本 tuple、讀寫不互鎖、但需 VACUUM 清理）、MySQL 用 <em>時間換空間</em>（lock 等待、但不必清舊版本）。</p>
<p><strong>選擇判讀</strong>：</p>
<ul>
<li>High 並發 OLTP、寫 / 讀都重：PG MVCC 通常更好（讀不 block 寫）</li>
<li>簡單 OLTP + 不想管 VACUUM：MySQL InnoDB 對 ops 簡單</li>
<li>需要 SERIALIZABLE 強一致：PG SSI 對寫吞吐影響低</li>
<li>已有 MySQL 生態 / 工具鏈：MySQL Lock 知識可繼續用</li>
</ul>
<p>詳見 <a href="/blog/backend/01-database/vendors/mysql/lock-contention/" data-link-title="MySQL Lock Contention：在 staging 重現的 deadlock、production 跑 6 個月才出現" data-link-desc="MySQL InnoDB 的 lock 是 row-level、但 *為什麼某些 row 莫名其妙也被 lock* 是 gap lock / next-key lock 設計造成的隱性行為。本文從一個 production case 開場（staging 重現 deadlock / production 6 個月後突然爆）、走 5 種 InnoDB lock 類型（record / gap / next-key / insert intention / auto-inc）、isolation level 對 lock 行為的決定性影響、deadlock detection / SHOW ENGINE INNODB STATUS 解讀、5 production 踩雷（gap lock 阻塞 INSERT / auto-inc lock contention / FK lock cascading / large transaction lock holding / READ COMMITTED 跟 binlog ROW 互動）">MySQL Lock Contention</a> — 完整 MySQL lock 機制。</p>
<h2 id="跟其他模組整合">跟其他模組整合</h2>
<h3 id="跟-autovacuum-tuning">跟 Autovacuum Tuning</h3>
<p>MVCC 仰賴 VACUUM、autovacuum 是 PG 並行控制的 <em>維護成本</em>。VACUUM 跑慢 / 沒跑 → bloat → query 慢。詳見 <a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">Autovacuum Tuning</a>。</p>
<h3 id="跟-replication-topology">跟 Replication Topology</h3>
<p><code>hot_standby_feedback = on</code> 讓 standby 上 long-running query 不被 vacuum 取消、但 <em>standby 把 oldest xmin 推回 primary</em>、primary autovacuum 變保守、增加 bloat。詳見 <a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">Replication Topology</a>。</p>
<h3 id="跟-connection-pool">跟 Connection Pool</h3>
<p>pgBouncer transaction pooling 模式下、advisory lock / SELECT FOR UPDATE 跨 transaction 行為 <em>broken</em>（不同 transaction 可能進不同 backend connection）。詳見 <a href="/blog/backend/01-database/vendors/postgresql/pgbouncer-config/" data-link-title="PostgreSQL pgBouncer 配置 &#43; 連線池治理" data-link-desc="pgBouncer transaction pooling 配置、跟 application connection pool 的分層、production 故障演練（pool exhaustion / stale connection / DNS failover）跟容量規劃">pgBouncer Config</a>。</p>
<h3 id="跟-query-optimization">跟 Query Optimization</h3>
<p>長 transaction 跑慢 query 期間、其他 transaction 看到 snapshot bloat、planner 估錯 dead tuple ratio。詳見 <a href="/blog/backend/01-database/vendors/postgresql/query-optimization/" data-link-title="PostgreSQL Query Optimization：EXPLAIN ANALYZE / pg_hint_plan / auto_explain 三層工具跟 4 個 case" data-link-desc="PG query 慢的根因常是 *planner 選錯 plan 或 statistics 過時*。本文從 4 個 production case 開場（seq scan vs index / hash vs nested loop / 多 column 統計缺 / parallel query 沒觸發）、走 EXPLAIN / EXPLAIN ANALYZE / auto_explain 三層工具、pg_hint_plan extension 跟 planner GUC 取捨、5 production 踩雷（ANALYZE 過時 / multi-column statistics / cost-base setting 不對齊硬體 / random_page_cost SSD 沒調 / parallel query 配置）、跟 MySQL query-optimization sibling 對比">Query Optimization</a>。</p>
<h2 id="相關連結">相關連結</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL vendor overview</a></li>
<li><a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">PG Autovacuum Tuning</a>（VACUUM 是 MVCC 必要成本）</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">PG Replication Topology</a>（hot_standby_feedback 影響）</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/pgbouncer-config/" data-link-title="PostgreSQL pgBouncer 配置 &#43; 連線池治理" data-link-desc="pgBouncer transaction pooling 配置、跟 application connection pool 的分層、production 故障演練（pool exhaustion / stale connection / DNS failover）跟容量規劃">PG pgBouncer</a>（transaction pooling 跟 lock 互動）</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/online-schema-change/" data-link-title="PostgreSQL Online Schema Change：先用 ALTER 內建特性、不能解才 pg_repack / pg-osc" data-link-desc="PostgreSQL ALTER TABLE 對多數變更已是 *fast catalog-only*（add column nullable / drop column / 改 default），不必走 ghost table tool。本文走 PG 內建 fast DDL 行為、何時必須走 pg_repack / pg-osc、兩工具機制對比（trigger-based vs WAL-shipping）、配置 step-by-step、5 production 踩雷（lock 升級 / VACUUM FULL 誤用 / pg_repack version mismatch / concurrent index 失敗清理 / generated stored column 不能 online）、跟 MySQL gh-ost / pt-osc sibling 對比">PG Online Schema Change</a>（DDL lock 議題）</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/query-optimization/" data-link-title="PostgreSQL Query Optimization：EXPLAIN ANALYZE / pg_hint_plan / auto_explain 三層工具跟 4 個 case" data-link-desc="PG query 慢的根因常是 *planner 選錯 plan 或 statistics 過時*。本文從 4 個 production case 開場（seq scan vs index / hash vs nested loop / 多 column 統計缺 / parallel query 沒觸發）、走 EXPLAIN / EXPLAIN ANALYZE / auto_explain 三層工具、pg_hint_plan extension 跟 planner GUC 取捨、5 production 踩雷（ANALYZE 過時 / multi-column statistics / cost-base setting 不對齊硬體 / random_page_cost SSD 沒調 / parallel query 配置）、跟 MySQL query-optimization sibling 對比">PG Query Optimization</a>（snapshot bloat 影響 planner）</li>
<li><a href="/blog/backend/01-database/vendors/mysql/lock-contention/" data-link-title="MySQL Lock Contention：在 staging 重現的 deadlock、production 跑 6 個月才出現" data-link-desc="MySQL InnoDB 的 lock 是 row-level、但 *為什麼某些 row 莫名其妙也被 lock* 是 gap lock / next-key lock 設計造成的隱性行為。本文從一個 production case 開場（staging 重現 deadlock / production 6 個月後突然爆）、走 5 種 InnoDB lock 類型（record / gap / next-key / insert intention / auto-inc）、isolation level 對 lock 行為的決定性影響、deadlock detection / SHOW ENGINE INNODB STATUS 解讀、5 production 踩雷（gap lock 阻塞 INSERT / auto-inc lock contention / FK lock cascading / large transaction lock holding / READ COMMITTED 跟 binlog ROW 互動）">MySQL Lock Contention</a>（sibling、不同模型）</li>
<li><a href="/blog/backend/knowledge-cards/isolation-level/" data-link-title="Isolation Level" data-link-desc="說明資料庫交易隔離級別如何影響並發讀寫結果">Isolation Level 卡片</a></li>
<li>官方：<a href="https://www.postgresql.org/docs/current/mvcc.html">PG MVCC</a> / <a href="https://www.postgresql.org/docs/current/transaction-iso.html">PG Concurrency Control</a> / <a href="https://www.postgresql.org/docs/current/explicit-locking.html">Explicit Locking</a></li>
</ul>
]]></content:encoded></item></channel></rss>