<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Logical-Replication on Tarragon</title><link>https://tarrragon.github.io/blog/tags/logical-replication/</link><description>Recent content in Logical-Replication on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Tue, 19 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/logical-replication/index.xml" rel="self" type="application/rss+xml"/><item><title>PostgreSQL Replication Slot Management：Physical / Logical / Failover Slot 治理</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/replication-slot-management/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/replication-slot-management/</guid><description>&lt;blockquote>
&lt;p>本文是 &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL&lt;/a> overview 的 implementation-layer deep article。Overview 已說明 PG 在 OLTP 譜系的定位、本文聚焦 &lt;em>replication slot management&lt;/em> — physical / logical / failover slot 三類治理。&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;h2 id="replication-slot-兩大類">Replication Slot 兩大類&lt;/h2>
&lt;p>PG 兩種 replication slot：&lt;/p>
&lt;h3 id="physical-replication-slot">Physical Replication Slot&lt;/h3>
&lt;p>對應 &lt;em>streaming replication&lt;/em>（physical WAL byte-level）：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">pg_create_physical_replication_slot&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s1">&amp;#39;standby1_slot&amp;#39;&lt;/span>&lt;span class="p">);&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>用於：&lt;/p>
&lt;ul>
&lt;li>Streaming replication standby（&lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &amp;#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &amp;#43; LSN-based 進度追蹤 &amp;#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &amp;#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &amp;#43; logical replication 整合">Replication Topology&lt;/a>）&lt;/li>
&lt;li>pg_basebackup 用 slot 防 WAL 清理&lt;/li>
&lt;li>高 lag standby 防 WAL premature deletion&lt;/li>
&lt;/ul>
&lt;h3 id="logical-replication-slot">Logical Replication Slot&lt;/h3>
&lt;p>對應 &lt;em>logical replication / logical decoding&lt;/em>：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">pg_create_logical_replication_slot&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s1">&amp;#39;my_slot&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s1">&amp;#39;pgoutput&amp;#39;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="c1">-- 或用 wal2json plugin
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">pg_create_logical_replication_slot&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s1">&amp;#39;debezium_slot&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s1">&amp;#39;wal2json&amp;#39;&lt;/span>&lt;span class="p">);&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>用於：&lt;/p>
&lt;ul>
&lt;li>PG-to-PG logical replication（publication / subscription）&lt;/li>
&lt;li>CDC（Debezium / Maxwell / pg_logical_emitter）&lt;/li>
&lt;li>Multi-master replication（BDR / pgEdge / Spock）&lt;/li>
&lt;/ul>
&lt;p>logical slot 跟 physical slot 共存、各自獨立 retention。&lt;/p>
&lt;h2 id="slot-lifecycle">Slot Lifecycle&lt;/h2>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">建立 → active（有 consumer）→ inactive（consumer 失聯）→ drop
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl"> ↓
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl"> WAL 持續累積（直到推進 LSN 或 drop）&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>狀態查詢&lt;/strong>：&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>本文是 <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a> overview 的 implementation-layer deep article。Overview 已說明 PG 在 OLTP 譜系的定位、本文聚焦 <em>replication slot management</em> — physical / logical / failover slot 三類治理。</p></blockquote>
<hr>
<h2 id="replication-slot-兩大類">Replication Slot 兩大類</h2>
<p>PG 兩種 replication slot：</p>
<h3 id="physical-replication-slot">Physical Replication Slot</h3>
<p>對應 <em>streaming replication</em>（physical WAL byte-level）：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_create_physical_replication_slot</span><span class="p">(</span><span class="s1">&#39;standby1_slot&#39;</span><span class="p">);</span></span></span></code></pre></div><p>用於：</p>
<ul>
<li>Streaming replication standby（<a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">Replication Topology</a>）</li>
<li>pg_basebackup 用 slot 防 WAL 清理</li>
<li>高 lag standby 防 WAL premature deletion</li>
</ul>
<h3 id="logical-replication-slot">Logical Replication Slot</h3>
<p>對應 <em>logical replication / logical decoding</em>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_create_logical_replication_slot</span><span class="p">(</span><span class="s1">&#39;my_slot&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;pgoutput&#39;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="c1">-- 或用 wal2json plugin
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_create_logical_replication_slot</span><span class="p">(</span><span class="s1">&#39;debezium_slot&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;wal2json&#39;</span><span class="p">);</span></span></span></code></pre></div><p>用於：</p>
<ul>
<li>PG-to-PG logical replication（publication / subscription）</li>
<li>CDC（Debezium / Maxwell / pg_logical_emitter）</li>
<li>Multi-master replication（BDR / pgEdge / Spock）</li>
</ul>
<p>logical slot 跟 physical slot 共存、各自獨立 retention。</p>
<h2 id="slot-lifecycle">Slot Lifecycle</h2>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">建立 → active（有 consumer）→ inactive（consumer 失聯）→ drop
</span></span><span class="line"><span class="ln">2</span><span class="cl">                                    ↓
</span></span><span class="line"><span class="ln">3</span><span class="cl">                              WAL 持續累積（直到推進 LSN 或 drop）</span></span></code></pre></div><p><strong>狀態查詢</strong>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="n">slot_name</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">       </span><span class="n">slot_type</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">       </span><span class="n">active</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">       </span><span class="n">restart_lsn</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">       </span><span class="n">confirmed_flush_lsn</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">       </span><span class="n">pg_size_pretty</span><span class="p">(</span><span class="n">pg_wal_lsn_diff</span><span class="p">(</span><span class="n">pg_current_wal_lsn</span><span class="p">(),</span><span class="w"> </span><span class="n">restart_lsn</span><span class="p">))</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">retained_wal</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">pg_replication_slots</span><span class="p">;</span></span></span></code></pre></div><p>關鍵欄位：</p>
<ul>
<li><code>slot_type</code>：<code>physical</code> / <code>logical</code></li>
<li><code>active</code>：true / false（consumer 是否連著）</li>
<li><code>restart_lsn</code>：slot 起點 LSN、primary 必須保留這以後的 WAL</li>
<li><code>confirmed_flush_lsn</code>：logical slot 已 confirm flush 的 LSN</li>
<li><code>retained_wal</code>：當前因 slot 累積的 WAL</li>
</ul>
<h2 id="failover-slot-synchronization-pg-17">Failover Slot Synchronization (PG 17+)</h2>
<p>PG 17 之前的 <em>痛點</em>：logical replication slot 是 <em>primary 上的 state</em>、failover 後 <em>新 primary 沒這個 slot</em>、CDC consumer 失聯、需要重建（大工程）。</p>
<p>PG 17 加 <em>failover slot synchronization</em>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- PG 17+：標 slot 為 failover-tracked
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1">-- signature: pg_create_logical_replication_slot(slot_name, plugin, temporary, two_phase, failover)
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_create_logical_replication_slot</span><span class="p">(</span><span class="s1">&#39;my_slot&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;pgoutput&#39;</span><span class="p">,</span><span class="w"> </span><span class="k">false</span><span class="p">,</span><span class="w"> </span><span class="k">false</span><span class="p">,</span><span class="w"> </span><span class="k">true</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w"></span><span class="c1">--                                                                          ↑
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="c1">--                                                                     failover=true（第 5 個參數）
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="c1">-- 注意：第 4 個參數是 two_phase（這裡 false）、第 5 個才是 failover
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="c1">-- Standby 上 enable sync_replication_slots
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">SYSTEM</span><span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="n">sync_replication_slots</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">on</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_reload_conf</span><span class="p">();</span></span></span></code></pre></div><p><code>sync_replication_slots = on</code> 後、physical replication 同步 slot state 到 standby。Failover promote standby 後、logical slot 仍可用、CDC consumer 重連即可。</p>
<p>PG 17 之前用 <a href="https://www.pgedge.com/">pgEdge</a> / <em>pglogical</em> 等 extension 提供類似功能、現在 PG core 內建。</p>
<h2 id="orphan-slot-治理">Orphan Slot 治理</h2>
<p><code>active = false</code> 的 slot 持續累積 WAL、disk 爆是 PG production 經典事故。</p>
<h3 id="監控-orphan-slot">監控 orphan slot</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- 找 inactive 太久的 slot
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">slot_name</span><span class="p">,</span><span class="w"> </span><span class="n">active</span><span class="p">,</span><span class="w"> </span><span class="n">restart_lsn</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">       </span><span class="n">pg_size_pretty</span><span class="p">(</span><span class="n">pg_wal_lsn_diff</span><span class="p">(</span><span class="n">pg_current_wal_lsn</span><span class="p">(),</span><span class="w"> </span><span class="n">restart_lsn</span><span class="p">))</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">retained_wal</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">pg_replication_slots</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="n">active</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">  </span><span class="k">AND</span><span class="w"> </span><span class="n">pg_wal_lsn_diff</span><span class="p">(</span><span class="n">pg_current_wal_lsn</span><span class="p">(),</span><span class="w"> </span><span class="n">restart_lsn</span><span class="p">)</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">1024</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">1024</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">1024</span><span class="p">;</span><span class="w">  </span><span class="c1">-- &gt; 1 GB</span></span></span></code></pre></div><h3 id="自動-invalidate-slotpg-13">自動 invalidate slot（PG 13+）</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- postgresql.conf
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">SYSTEM</span><span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="n">max_slot_wal_keep_size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;50GB&#39;</span><span class="p">;</span><span class="w">  </span><span class="c1">-- slot 累積 &gt; 50GB 自動 invalidate</span></span></span></code></pre></div><p>當 slot 累積 WAL 超過 <code>max_slot_wal_keep_size</code>、PG 自動 invalidate slot（<code>active=false</code> 且不再保留 WAL）。Consumer 重連會 fail、必須重建（base backup + new slot）。</p>
<p>這是 <em>trade-off</em>：</p>
<ul>
<li>設 limit → 保護 disk、但 consumer 失聯 → 大重建工作</li>
<li>不設 limit → consumer 失聯 OK、但 disk 爆</li>
</ul>
<p>實務多數設 <code>max_slot_wal_keep_size</code> 給 <em>disk capacity 50%</em>、避免徹底 disk full。</p>
<h3 id="手動-drop-orphan-slot">手動 drop orphan slot</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- 確認 slot 真的不需要
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">pg_replication_slots</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">slot_name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;old_standby_slot&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- Drop
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_drop_replication_slot</span><span class="p">(</span><span class="s1">&#39;old_standby_slot&#39;</span><span class="p">);</span></span></span></code></pre></div><p>DR runbook 必須包含 <em>standby 退役流程</em>：先 standby fence、再 primary drop slot。</p>
<h2 id="5-個-production-踩雷">5 個 Production 踩雷</h2>
<h3 id="1-orphan-slot-disk-爆">1. Orphan slot disk 爆</h3>
<p>最經典 PG 事故：standby decomission 沒 drop slot、primary 持續保留 WAL、<code>pg_wal/</code> 累積到 disk full、primary 也掛。</p>
<p>修法：</p>
<ul>
<li>監控 <code>pg_replication_slots</code> + <code>pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))</code> retained_wal</li>
<li>設 <code>max_slot_wal_keep_size</code>（PG 13+）— hard limit</li>
<li>Standby 退役 runbook 強制 <em>先 fence、再 drop slot</em></li>
<li>Cron job 自動 alert orphan slot</li>
</ul>
<h3 id="2-logical-slot-lag--cdc-consumer-跟不上">2. Logical slot lag — CDC consumer 跟不上</h3>
<p>Logical decoding 比 physical replication 慢（per-transaction logical event 重組）。CDC consumer（Debezium）跟不上 → slot lag 累積。</p>
<p>修法：</p>
<ul>
<li>監控 <code>pg_replication_slots.confirmed_flush_lsn</code> 跟 primary <code>pg_current_wal_lsn()</code> 對比</li>
<li>CDC consumer 性能調整（throughput / batch size）</li>
<li>Throttle source writes（如果不能升 consumer）</li>
<li>對 hot table 拆 publication / subscription、避免單 slot 處理所有變更</li>
</ul>
<p>詳見 <a href="/blog/backend/01-database/vendors/postgresql/logical-replication-debezium/" data-link-title="PostgreSQL Logical Replication &#43; Debezium CDC：replication slot × failure × recovery 對照" data-link-desc="PostgreSQL logical replication slot 跟 Debezium CDC 的失效模式對照表：slot lag 撐爆 primary disk / schema change 斷流 / 初始 COPY 鎖表 / zombie slot 不釋放 / replay storm 後 offset reset；publication / subscription / pgoutput 配置、跟 Kafka outbox pattern 整合">Logical Replication + Debezium</a>。</p>
<h3 id="3-failover-後-logical-slot-丟pg-16-之前">3. Failover 後 logical slot 丟（PG 16 之前）</h3>
<p>PG 16 之前、failover promote standby、新 primary 沒有原 logical slot。CDC consumer 試連、ERROR: <code>replication slot &quot;xxx&quot; does not exist</code>。</p>
<p>修法（PG 17+）：</p>
<ul>
<li>用 <em>failover slot synchronization</em>（如上）</li>
<li><code>pg_create_logical_replication_slot(...,  failover := true)</code></li>
<li>Standby <code>sync_replication_slots = on</code></li>
</ul>
<p>修法（PG 16-）：</p>
<ul>
<li>用 <a href="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</a> 或 <a href="https://www.pgedge.com/">pgEdge</a> extension</li>
<li>Failover runbook 包含 <em>新 primary 重建 logical slot</em>（CDC consumer 重 snapshot）</li>
<li>Pre-create slot on standby + manual sync（早期 workaround）</li>
</ul>
<h3 id="4-wal_keep_size-跟-slot-衝突">4. <code>wal_keep_size</code> 跟 slot 衝突</h3>
<p><code>wal_keep_size</code>（PG 13+）/ <code>wal_keep_segments</code>（&lt; 13）跟 slot 都會保留 WAL：</p>
<ul>
<li><code>wal_keep_size</code>：固定 minimum WAL 保留量</li>
<li>Slot：動態保留直到 consumer 推進</li>
</ul>
<p>兩者一起 set 時：實際保留 WAL = <code>max(wal_keep_size, slot 需要的量)</code>。</p>
<p>修法：</p>
<ul>
<li><code>wal_keep_size</code> 設小（如 1-2 GB）作 <em>minimum backup</em></li>
<li>主要靠 slot 動態保留 — 給 active consumer</li>
<li>監控 <code>pg_wal/</code> 大小 + 拆解 retention source（<code>wal_keep_size</code> vs slot 各佔多少）</li>
</ul>
<h3 id="5-slot-數量上限">5. Slot 數量上限</h3>
<p><code>max_replication_slots</code> 預設 10、不夠時新 slot 建不出來、報錯。</p>
<p>修法：</p>
<ul>
<li>Production 大 cluster 設 <code>max_replication_slots = 50</code> 或更多</li>
<li>對 <em>standby + logical replication + CDC consumer</em> 同時跑、計算需要的 slot 數</li>
<li>監控 <code>SELECT count(*) FROM pg_replication_slots</code> 接近 limit 時告警</li>
</ul>
<h2 id="slot-naming-convention">Slot Naming Convention</h2>
<p>Production 大 cluster 多 slot、命名 convention 重要：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">&lt;consumer-type&gt;_&lt;consumer-name&gt;_&lt;purpose&gt;
</span></span><span class="line"><span class="ln">2</span><span class="cl">例：
</span></span><span class="line"><span class="ln">3</span><span class="cl">- physical_standby1_replication
</span></span><span class="line"><span class="ln">4</span><span class="cl">- physical_standby2_replication
</span></span><span class="line"><span class="ln">5</span><span class="cl">- logical_debezium_orders_cdc
</span></span><span class="line"><span class="ln">6</span><span class="cl">- logical_pgedge_node2_subscription
</span></span><span class="line"><span class="ln">7</span><span class="cl">- physical_pgbasebackup_temp（base backup 用、completed 後 drop）</span></span></code></pre></div><p>清楚命名讓 <em>看 slot 名</em> 就知道用途、誰負責、能不能 drop。</p>
<h2 id="跟其他模組整合">跟其他模組整合</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">Replication Topology</a>：physical slot 給 streaming replication 用</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/logical-replication-debezium/" data-link-title="PostgreSQL Logical Replication &#43; Debezium CDC：replication slot × failure × recovery 對照" data-link-desc="PostgreSQL logical replication slot 跟 Debezium CDC 的失效模式對照表：slot lag 撐爆 primary disk / schema change 斷流 / 初始 COPY 鎖表 / zombie slot 不釋放 / replay storm 後 offset reset；publication / subscription / pgoutput 配置、跟 Kafka outbox pattern 整合">Logical Replication + Debezium</a>：logical slot 給 CDC</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/bdr-multi-master/" data-link-title="PostgreSQL BDR / Multi-Master：active-active 寫入的 3 種路徑跟 conflict 治理" data-link-desc="PG 預設是 single-primary、active-active 多寫入入口需要 *BDR (EDB)* / *pgEdge* / *Bucardo* 等 extension。本文走 3 種 multi-master 方案對比、conflict detection &#43; resolution model、async vs sync 取捨、配置 step-by-step（pgEdge 為主）、5 production 踩雷（last-write-wins data loss / sequence collision / DDL replication / conflict log 治理 / failover 後 timeline 分歧）、跟 MySQL Group Replication sibling 對比">BDR / Multi-Master</a>：multi-master 大量用 logical slot</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/pitr-wal-archiving/" data-link-title="PostgreSQL PITR &#43; WAL archiving：從 base backup 到 point-in-time recovery 的完整鏈" data-link-desc="Base backup &#43; WAL archive 構成 PITR 的雙軌資料、archive_command &#43; restore_command 配置、用 pgBackRest / WAL-G 替代手寫腳本、5 個 production 踩雷（archive 靜默失敗 / archive lag / 錯誤 target time / base backup 過期未清 / timeline 分歧 recovery 模糊）、跟 Patroni &#43; monitoring 整合">PITR + WAL Archiving</a>：WAL archive 跟 slot 是兩種 WAL retention 機制、可並行</li>
</ul>
<h2 id="監控-metric">監控 metric</h2>
<p>Production 持續監控：</p>
<ul>
<li><code>pg_replication_slots.active</code> — 失聯 slot</li>
<li><code>pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)</code> — slot 累積 WAL</li>
<li><code>pg_replication_slots.confirmed_flush_lsn</code> vs <code>pg_current_wal_lsn()</code> — logical slot lag</li>
<li><code>pg_ls_waldir()</code> 看 <code>pg_wal/</code> 目錄大小</li>
<li><code>count(*) FROM pg_replication_slots</code> 對 <code>max_replication_slots</code> 比例</li>
</ul>
<p>把這些丟進 Datadog / Prometheus + alert。</p>
<h2 id="相關連結">相關連結</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL vendor overview</a></li>
<li><a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">PG Replication Topology</a>（physical slot 用途）</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/logical-replication-debezium/" data-link-title="PostgreSQL Logical Replication &#43; Debezium CDC：replication slot × failure × recovery 對照" data-link-desc="PostgreSQL logical replication slot 跟 Debezium CDC 的失效模式對照表：slot lag 撐爆 primary disk / schema change 斷流 / 初始 COPY 鎖表 / zombie slot 不釋放 / replay storm 後 offset reset；publication / subscription / pgoutput 配置、跟 Kafka outbox pattern 整合">PG Logical Replication + Debezium</a>（logical slot 用途）</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/bdr-multi-master/" data-link-title="PostgreSQL BDR / Multi-Master：active-active 寫入的 3 種路徑跟 conflict 治理" data-link-desc="PG 預設是 single-primary、active-active 多寫入入口需要 *BDR (EDB)* / *pgEdge* / *Bucardo* 等 extension。本文走 3 種 multi-master 方案對比、conflict detection &#43; resolution model、async vs sync 取捨、配置 step-by-step（pgEdge 為主）、5 production 踩雷（last-write-wins data loss / sequence collision / DDL replication / conflict log 治理 / failover 後 timeline 分歧）、跟 MySQL Group Replication sibling 對比">PG BDR / Multi-Master</a>（multi-master 大量 slot）</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/pitr-wal-archiving/" data-link-title="PostgreSQL PITR &#43; WAL archiving：從 base backup 到 point-in-time recovery 的完整鏈" data-link-desc="Base backup &#43; WAL archive 構成 PITR 的雙軌資料、archive_command &#43; restore_command 配置、用 pgBackRest / WAL-G 替代手寫腳本、5 個 production 踩雷（archive 靜默失敗 / archive lag / 錯誤 target time / base backup 過期未清 / timeline 分歧 recovery 模糊）、跟 Patroni &#43; monitoring 整合">PG PITR + WAL Archiving</a>（WAL retention 兩種機制）</li>
<li>官方：<a href="https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS">PG Replication Slots</a> / <a href="https://www.postgresql.org/docs/current/logicaldecoding.html">Logical Replication Slot</a></li>
</ul>
]]></content:encoded></item><item><title>PostgreSQL Logical Replication + Debezium CDC：replication slot × failure × recovery 對照</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/logical-replication-debezium/</link><pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/logical-replication-debezium/</guid><description>&lt;blockquote>
&lt;p>本文是 &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL&lt;/a> overview 的 implementation-layer deep article。Overview 提到 logical decoding / Debezium CDC、本文聚焦 &lt;em>replication slot 生命週期 + 5 個 production failure mode 跟 recovery&lt;/em> 的對照。&lt;/p>&lt;/blockquote>
&lt;h2 id="replication-slot--failure--recovery-對照">Replication slot × Failure × Recovery 對照&lt;/h2>
&lt;p>Logical replication 跟 Debezium CDC 的 production 議題集中在 &lt;em>replication slot&lt;/em> — 它是 PostgreSQL 內保證 WAL 不被回收的 anchor point；slot 設不對、整個 CDC pipeline 失效。各 failure mode 對 slot 的影響跟 recovery 路徑：&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Failure mode&lt;/th>
 &lt;th>對 slot 影響&lt;/th>
 &lt;th>Primary 端徵兆&lt;/th>
 &lt;th>Recovery 路徑&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Consumer 卡住 / lag&lt;/td>
 &lt;td>slot LSN 不前進、WAL 留著&lt;/td>
 &lt;td>&lt;code>pg_wal&lt;/code> 目錄持續長大、disk 撐爆&lt;/td>
 &lt;td>修 consumer / 加 throttle / 必要時 drop slot&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Consumer crash 無 restart&lt;/td>
 &lt;td>slot 留在 active state&lt;/td>
 &lt;td>跟 lag 同、不會自動清&lt;/td>
 &lt;td>手動 &lt;code>SELECT pg_drop_replication_slot('name')&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Schema change（ADD COLUMN）&lt;/td>
 &lt;td>多數 plugin 自動處理、無感&lt;/td>
 &lt;td>通常無感&lt;/td>
 &lt;td>-&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Schema change（DROP / RENAME COLUMN）&lt;/td>
 &lt;td>多數 plugin 直接斷&lt;/td>
 &lt;td>Consumer log 報錯、slot active 卻不前進&lt;/td>
 &lt;td>重建 publication / 重 init load&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Initial COPY&lt;/td>
 &lt;td>slot 建立時跑 snapshot、long-running tx&lt;/td>
 &lt;td>大表 COPY 期間鎖跟 WAL 都受影響&lt;/td>
 &lt;td>用 &lt;code>CREATE_REPLICATION_SLOT ... NOEXPORT_SNAPSHOT&lt;/code> 分階段&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Promotion (failover)&lt;/td>
 &lt;td>physical slot 跟 logical slot 處理不同&lt;/td>
 &lt;td>logical slot 在 PG 16- 不跨 failover&lt;/td>
 &lt;td>PG 16+ logical slot 持久化、或 consumer 重 init load&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Replay storm（offset 重置）&lt;/td>
 &lt;td>slot 不變、consumer 重讀&lt;/td>
 &lt;td>Kafka 端流量爆、application 看 duplicate&lt;/td>
 &lt;td>Idempotent consumer 設計、或 transactional outbox&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>每個 failure mode 對應的詳細配置 + recovery 步驟、下面分段展開。&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>本文是 <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a> overview 的 implementation-layer deep article。Overview 提到 logical decoding / Debezium CDC、本文聚焦 <em>replication slot 生命週期 + 5 個 production failure mode 跟 recovery</em> 的對照。</p></blockquote>
<h2 id="replication-slot--failure--recovery-對照">Replication slot × Failure × Recovery 對照</h2>
<p>Logical replication 跟 Debezium CDC 的 production 議題集中在 <em>replication slot</em> — 它是 PostgreSQL 內保證 WAL 不被回收的 anchor point；slot 設不對、整個 CDC pipeline 失效。各 failure mode 對 slot 的影響跟 recovery 路徑：</p>
<table>
  <thead>
      <tr>
          <th>Failure mode</th>
          <th>對 slot 影響</th>
          <th>Primary 端徵兆</th>
          <th>Recovery 路徑</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Consumer 卡住 / lag</td>
          <td>slot LSN 不前進、WAL 留著</td>
          <td><code>pg_wal</code> 目錄持續長大、disk 撐爆</td>
          <td>修 consumer / 加 throttle / 必要時 drop slot</td>
      </tr>
      <tr>
          <td>Consumer crash 無 restart</td>
          <td>slot 留在 active state</td>
          <td>跟 lag 同、不會自動清</td>
          <td>手動 <code>SELECT pg_drop_replication_slot('name')</code></td>
      </tr>
      <tr>
          <td>Schema change（ADD COLUMN）</td>
          <td>多數 plugin 自動處理、無感</td>
          <td>通常無感</td>
          <td>-</td>
      </tr>
      <tr>
          <td>Schema change（DROP / RENAME COLUMN）</td>
          <td>多數 plugin 直接斷</td>
          <td>Consumer log 報錯、slot active 卻不前進</td>
          <td>重建 publication / 重 init load</td>
      </tr>
      <tr>
          <td>Initial COPY</td>
          <td>slot 建立時跑 snapshot、long-running tx</td>
          <td>大表 COPY 期間鎖跟 WAL 都受影響</td>
          <td>用 <code>CREATE_REPLICATION_SLOT ... NOEXPORT_SNAPSHOT</code> 分階段</td>
      </tr>
      <tr>
          <td>Promotion (failover)</td>
          <td>physical slot 跟 logical slot 處理不同</td>
          <td>logical slot 在 PG 16- 不跨 failover</td>
          <td>PG 16+ logical slot 持久化、或 consumer 重 init load</td>
      </tr>
      <tr>
          <td>Replay storm（offset 重置）</td>
          <td>slot 不變、consumer 重讀</td>
          <td>Kafka 端流量爆、application 看 duplicate</td>
          <td>Idempotent consumer 設計、或 transactional outbox</td>
      </tr>
  </tbody>
</table>
<p>每個 failure mode 對應的詳細配置 + recovery 步驟、下面分段展開。</p>
<h2 id="logical-replication-基礎publication--subscription--slot">Logical replication 基礎：publication + subscription + slot</h2>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Primary：建 publication
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="n">PUBLICATION</span><span class="w"> </span><span class="n">app_changes</span><span class="w"> </span><span class="k">FOR</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="p">,</span><span class="w"> </span><span class="n">events</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- Subscriber：建 subscription（自動建 replication slot）
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="n">SUBSCRIPTION</span><span class="w"> </span><span class="n">app_sub</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">  </span><span class="k">CONNECTION</span><span class="w"> </span><span class="s1">&#39;host=primary user=replicator dbname=app&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w">  </span><span class="n">PUBLICATION</span><span class="w"> </span><span class="n">app_changes</span><span class="w">
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="w">  </span><span class="k">WITH</span><span class="w"> </span><span class="p">(</span><span class="n">slot_name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;app_sub_slot&#39;</span><span class="p">,</span><span class="w"> </span><span class="n">copy_data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">true</span><span class="p">);</span></span></span></code></pre></div><p>關鍵物件：</p>
<ul>
<li><strong>publication</strong>（primary 端）：宣告 <em>哪些表 + 哪些操作（INSERT/UPDATE/DELETE/TRUNCATE）</em> 對外暴露</li>
<li><strong>subscription</strong>（subscriber 端、若是 PG-to-PG）：訂閱 + 自動建 slot + 自動 initial COPY</li>
<li><strong>replication slot</strong>：primary 端、保證 <em>consumer 還沒消費的 WAL</em> 不被回收</li>
</ul>
<p><code>copy_data = true</code> 觸發 initial COPY（snapshot）+ 後續 streaming；<code>copy_data = false</code> 只 streaming、適合 already-in-sync 場景。</p>
<h2 id="debezium-cdc用-logical-replication-slot-但繞過-subscription">Debezium CDC：用 logical replication slot 但繞過 subscription</h2>
<p>Debezium 不是 PostgreSQL subscriber、是 <em>直接讀 replication slot</em> 的外部 consumer：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-properties" data-lang="properties"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># Debezium PostgreSQL connector</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="na">connector.class</span><span class="o">=</span><span class="s">io.debezium.connector.postgresql.PostgresConnector</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="na">database.hostname</span><span class="o">=</span><span class="s">primary</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="na">database.dbname</span><span class="o">=</span><span class="s">app</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="na">plugin.name</span><span class="o">=</span><span class="s">pgoutput                            # 內建、PG 10+ 推薦</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="na">slot.name</span><span class="o">=</span><span class="s">debezium_app</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="na">publication.name</span><span class="o">=</span><span class="s">app_changes</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="na">publication.autocreate.mode</span><span class="o">=</span><span class="s">filtered            # debezium 自動建 publication</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="na">table.include.list</span><span class="o">=</span><span class="s">public.orders,public.events</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="na">snapshot.mode</span><span class="o">=</span><span class="s">initial                            # 起始 snapshot 後 streaming</span></span></span></code></pre></div><p>差異：</p>
<ul>
<li>Debezium 用 <code>pgoutput</code>（PG 10+ 內建）或 <code>wal2json</code>（外掛 plugin）解 WAL、轉成結構化事件送 Kafka</li>
<li>不像 PG-to-PG subscription、Debezium 沒 subscription object、是 <em>外部 consumer 自管</em> replication slot</li>
<li>Failure mode 上 <em>consumer 端是 Debezium 自己</em>、所以 lag 來源是 Debezium 處理速度 / Kafka 寫入速度</li>
</ul>
<h2 id="production-故障演練">Production 故障演練</h2>
<h3 id="case-1consumer-lagslot-lsn-不前進primary-disk-爆">Case 1：consumer lag、slot LSN 不前進、primary disk 爆</h3>
<p><strong>徵兆</strong>：primary <code>pg_wal</code> 目錄持續長大、<code>df -h</code> 看磁碟 90%+；<code>pg_replication_slots</code> 看 <code>confirmed_flush_lsn</code> 卡在某 LSN、<code>pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)</code> 數十 GB。</p>
<p><strong>根因</strong>：consumer（Debezium / subscriber）處理慢於 primary 寫入；replication slot <em>保證 WAL 不回收</em>、但 consumer 沒消費 → WAL 堆積。</p>
<p><strong>修法</strong>：</p>
<ol>
<li><strong>監測</strong>：Prometheus alert <code>pg_replication_slot_lag_bytes &gt; 5GB</code> 觸發前 catch</li>
<li><strong>修 consumer</strong>：throttle primary 寫入 OR scale Debezium / subscriber 處理能力</li>
<li><strong>緊急</strong>：<code>SELECT pg_drop_replication_slot('debezium_app')</code> 釋放 WAL — 但 consumer 必須重 init load（資料缺一塊）</li>
<li><strong>架構</strong>：用 <em>max_slot_wal_keep_size</em>（PG 13+）設 slot 能保留 WAL 上限、超出自動 invalidate slot、保護 primary disk</li>
</ol>
<h3 id="case-2consumer-crash-後-slot-變-zombie">Case 2：consumer crash 後 slot 變 zombie</h3>
<p><strong>徵兆</strong>：Debezium pod OOM crash、新 pod 起來時報 <code>slot is active for PID X</code>、無法 attach；primary 端 <code>pg_replication_slots.active = true</code>、<code>active_pid</code> 指向已經死掉的 process。</p>
<p><strong>根因</strong>：PostgreSQL 把 slot 標 active 是基於 <em>當下有 connection</em>；consumer crash 但 connection 沒被 server 端發現（network 沒 RST）、slot 留在 active state。</p>
<p><strong>修法</strong>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- 手動清 zombie slot
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_terminate_backend</span><span class="p">(</span><span class="n">active_pid</span><span class="p">)</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">pg_replication_slots</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">  </span><span class="k">WHERE</span><span class="w"> </span><span class="n">slot_name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;debezium_app&#39;</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">active</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- 或直接 drop（會丟資料、consumer 要重 init）
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">pg_drop_replication_slot</span><span class="p">(</span><span class="s1">&#39;debezium_app&#39;</span><span class="p">);</span></span></span></code></pre></div><p>預防：</p>
<ol>
<li>PostgreSQL <code>tcp_keepalives_idle / interval / count</code> 設較短（300 / 60 / 6）、network drop 較快被發現</li>
<li>Consumer 端用 <em>graceful shutdown</em> + <code>pg_terminate_backend(active_pid)</code> 在 startup 前主動清 stale connection</li>
</ol>
<h3 id="case-3schema-changedrop--rename-column斷流">Case 3：schema change（DROP / RENAME COLUMN）斷流</h3>
<p><strong>徵兆</strong>：Debezium consumer 突然停 produce 訊息、log 報 <code>column XYZ does not exist</code>；primary 端 slot 還 active、但 <code>confirmed_flush_lsn</code> 不前進。</p>
<p><strong>根因</strong>：pgoutput plugin 把 WAL 解成 row event 時、用的 schema 是 <em>當下 catalog</em>；如果中間 DROP COLUMN、之前 WAL 內的 row event 含已不存在欄位、解析失敗。</p>
<p><strong>修法</strong>：</p>
<ol>
<li><strong>預防</strong>：schema change 走 <em>expand-contract pattern</em>
<ul>
<li>Phase 1: ADD COLUMN new_col（不影響 logical replication）</li>
<li>Phase 2: application 雙寫 old + new</li>
<li>Phase 3: 等 consumer catch up old column 訊息</li>
<li>Phase 4: DROP COLUMN old_col（此時無 in-flight WAL 帶 old_col）</li>
</ul>
</li>
<li><strong>緊急</strong>：DROP existing slot、重建 publication 跟 slot、consumer 重 init load</li>
<li><strong>長期</strong>：用 Debezium <em>snapshot.mode=schema_only_recovery</em> 在 schema 變動時不重灌資料、只 reset schema</li>
</ol>
<h3 id="case-4initial-copy-大表鎖太久">Case 4：initial COPY 大表鎖太久</h3>
<p><strong>徵兆</strong>：對 1TB 表跑 <code>CREATE SUBSCRIPTION ... WITH (copy_data=true)</code> 後、application 對該表 query / write 阻塞 30+ 分鐘；application timeout 大量。</p>
<p><strong>根因</strong>：initial COPY 默認跑在 <em>single transaction</em>、整個 snapshot LSN 鎖住、長 transaction 跟 vacuum 衝突；同時對 subscriber 端鎖表寫入。</p>
<p><strong>修法</strong>：</p>
<ol>
<li><strong>分階段 init</strong>：</li>
</ol>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- Primary：建 publication 不 copy
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="n">PUBLICATION</span><span class="w"> </span><span class="n">app_changes</span><span class="w"> </span><span class="k">FOR</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">big_table</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w"></span><span class="c1">-- Subscriber：建 subscription 不 copy
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="n">SUBSCRIPTION</span><span class="w"> </span><span class="n">app_sub</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">  </span><span class="k">CONNECTION</span><span class="w"> </span><span class="s1">&#39;...&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">  </span><span class="n">PUBLICATION</span><span class="w"> </span><span class="n">app_changes</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">  </span><span class="k">WITH</span><span class="w"> </span><span class="p">(</span><span class="n">copy_data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">false</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w"></span><span class="c1">-- 手動跑 partition-by-partition COPY（若是 partition table）
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="c1">-- 或用 pg_dump / pg_basebackup 拿 snapshot</span></span></span></code></pre></div><ol start="2">
<li><strong>PG 16+ parallel init</strong>：<code>max_sync_workers_per_subscription = 4</code> 平行 COPY 多個表</li>
<li><strong>Debezium replacement</strong>：用 incremental snapshot（Debezium 1.6+）、background trickle copy、不鎖長 transaction</li>
</ol>
<h3 id="case-5replay-storm-後-consumer-offset-reset">Case 5：replay storm 後 consumer offset reset</h3>
<p><strong>徵兆</strong>：Debezium 修 bug / 重 deploy 後、<code>snapshot.mode=initial</code> 觸發整個資料重灌；Kafka topic 流量爆 10x、下游 application 看到大量 duplicate event。</p>
<p><strong>根因</strong>：Debezium offset store（Kafka topic 或 file）被誤刪 / corruption；重啟時不知道從哪 LSN 開始、預設 fall back 到 initial snapshot。</p>
<p><strong>修法</strong>：</p>
<ol>
<li><strong>預防</strong>：Debezium offset store 跟 Kafka cluster <em>backup 一起做</em>、不要單獨依賴 Kafka topic</li>
<li><strong>架構</strong>：consumer side 設計 <em>idempotent</em> — 用 event 自帶的 (source LSN + transaction ID) 當 dedupe key</li>
<li><strong>transactional outbox pattern</strong>：CDC 只 capture outbox 表、application 主動寫 outbox + business data 在同 transaction；duplicate 由 application 自己 dedupe</li>
</ol>
<h2 id="容量規劃">容量規劃</h2>
<table>
  <thead>
      <tr>
          <th>維度</th>
          <th>估算</th>
          <th>警戒</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Replication slot lag</td>
          <td><code>pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)</code></td>
          <td>&gt; 1GB lag 訊號 consumer 跟不上</td>
      </tr>
      <tr>
          <td>Primary <code>pg_wal</code> size</td>
          <td>retention × peak WAL rate</td>
          <td>預留 disk 容量 = max_slot_wal_keep_size + 30% buffer</td>
      </tr>
      <tr>
          <td>Debezium throughput</td>
          <td>~5-10K row/s 單 connector、多表平行可拉</td>
          <td>跟 primary write rate 對比</td>
      </tr>
      <tr>
          <td>Initial COPY time</td>
          <td>100GB ~ 10-30 分鐘（看 network + subscriber IO）</td>
          <td>TB 級必須分階段</td>
      </tr>
      <tr>
          <td>Slot 數量</td>
          <td>每 slot 佔 primary 一份 WAL 保留 buffer</td>
          <td>5+ slot 同時跑 disk 壓力倍增</td>
      </tr>
      <tr>
          <td>max_replication_slots</td>
          <td>預設 10、production 跑 CDC + standby 各佔 slot 要拉到 20-50</td>
          <td>達上限會拒新 slot 建立</td>
      </tr>
  </tbody>
</table>
<p>實務 default：</p>
<ul>
<li>Debezium production：1 connector per source schema、不要 1 connector 跨 50 個表</li>
<li>Slot retention：<code>max_slot_wal_keep_size = 100GB</code>、超出 invalidate slot 保護 primary</li>
<li>Monitor cadence：1 分鐘 sample lag + 5 分鐘 alert threshold</li>
</ul>
<h2 id="整合--下一步">整合 / 下一步</h2>
<h3 id="跟-patroni-ha-整合">跟 <a href="/blog/backend/01-database/vendors/postgresql/patroni-ha/" data-link-title="PostgreSQL Patroni HA：從 leader 失聯到 client 重連的 5 段 failover lifecycle" data-link-desc="Patroni 把 PostgreSQL HA 拆成 detection / election / promotion / reconfiguration / recovery 五段 lifecycle、每段都有獨立配置跟 failure mode；DCS quorum &#43; watchdog 防 split-brain、async/sync replication 取捨、5 個 production 踩雷、跟 PgBouncer / HAProxy / cert-manager 整合">Patroni HA</a> 整合</h3>
<p>logical slot 在 PG 16- 不跨 failover、是長期痛點：</p>
<ol>
<li><strong>PG 16-</strong>：failover 後 logical consumer 必須重 init（slot 在新 leader 上不存在）</li>
<li><strong>PG 16+</strong>：<code>failover</code> parameter 讓 logical slot 在 standby 同步、failover 後 consumer 直接接</li>
<li>Patroni 16+ 支援 logical slot persistence 配置、配合用</li>
</ol>
<h3 id="跟-kafka-outbox-pattern">跟 Kafka outbox pattern</h3>
<p>production-grade CDC 不直接 read business table、是 read <em>outbox table</em>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Application transaction
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">BEGIN</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">  </span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(...)</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="p">(...);</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">  </span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">outbox</span><span class="w"> </span><span class="p">(</span><span class="n">event_type</span><span class="p">,</span><span class="w"> </span><span class="n">payload</span><span class="p">,</span><span class="w"> </span><span class="n">created_at</span><span class="p">)</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="p">(</span><span class="s1">&#39;order_created&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;...&#39;</span><span class="p">,</span><span class="w"> </span><span class="n">now</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="k">COMMIT</span><span class="p">;</span></span></span></code></pre></div><p>Debezium 只 capture outbox table、event payload 已是 application-shaped JSON、不用解 row event。好處：</p>
<ol>
<li>Schema change 不影響 CDC（outbox table schema 穩定）</li>
<li>跨表 transaction 對應到單 event（outbox 是業務語意層）</li>
<li>Replay 可靠 — outbox 是 append-only、可重讀</li>
</ol>
<h3 id="跟-partitioning-整合">跟 <a href="/blog/backend/01-database/vendors/postgresql/declarative-partitioning/" data-link-title="PostgreSQL declarative partitioning：partition 不是切表、是讓 planner pruning" data-link-desc="Declarative partitioning 的真實價值是 query planner pruning &#43; maintenance scope 縮小、不是「把大表切小」；RANGE / LIST / HASH 取捨、partition key 選法、5 個 production 踩雷（key 選錯不 prune / unique 不 enforce 跨 partition / ATTACH 鎖太久 / partition 數爆 / DETACH 不 reclaim 空間）、跟 autovacuum &#43; index 設計整合">partitioning</a> 整合</h3>
<p>partitioned table 的 logical replication：</p>
<ol>
<li>PG 13+ <code>publish_via_partition_root = true</code> — publication 從 parent 角度看、不是 per-partition</li>
<li>Subscriber 端可 partition 不同 strategy（甚至不 partition）</li>
<li>Schema change 對 partition table 更複雜、走 expand-contract 嚴格</li>
</ol>
<h3 id="下一步議題">下一步議題</h3>
<ul>
<li><strong>Logical replication conflict</strong>：subscriber 端寫衝突的處理（PG 17+ 加 conflict resolution）</li>
<li><strong>bi-directional replication（pg_active）</strong>：多 region active-active、衝突解決設計</li>
<li><strong>Decoder plugin 對比</strong>：pgoutput / wal2json / decoderbufs 效能跟易用性</li>
</ul>
<h2 id="相關連結">相關連結</h2>
<ul>
<li>上游 vendor 頁：<a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a></li>
<li>上游 chapter：<a href="/blog/backend/01-database/schema-migration-rollout-evidence/" data-link-title="1.7 Schema Migration Rollout 證據（Schema Migration Rollout Evidence）實作示範" data-link-desc="以訂單付款狀態欄位演進示範 schema migration 如何產出 evidence、release gate 與 incident decision log。">Schema Migration Rollout Evidence</a> — schema change × CDC 對應</li>
<li>平行 deep article：<a href="/blog/backend/01-database/vendors/postgresql/patroni-ha/" data-link-title="PostgreSQL Patroni HA：從 leader 失聯到 client 重連的 5 段 failover lifecycle" data-link-desc="Patroni 把 PostgreSQL HA 拆成 detection / election / promotion / reconfiguration / recovery 五段 lifecycle、每段都有獨立配置跟 failure mode；DCS quorum &#43; watchdog 防 split-brain、async/sync replication 取捨、5 個 production 踩雷、跟 PgBouncer / HAProxy / cert-manager 整合">Patroni HA</a> / <a href="/blog/backend/01-database/vendors/postgresql/pitr-wal-archiving/" data-link-title="PostgreSQL PITR &#43; WAL archiving：從 base backup 到 point-in-time recovery 的完整鏈" data-link-desc="Base backup &#43; WAL archive 構成 PITR 的雙軌資料、archive_command &#43; restore_command 配置、用 pgBackRest / WAL-G 替代手寫腳本、5 個 production 踩雷（archive 靜默失敗 / archive lag / 錯誤 target time / base backup 過期未清 / timeline 分歧 recovery 模糊）、跟 Patroni &#43; monitoring 整合">PITR + WAL Archiving</a> / <a href="/blog/backend/01-database/vendors/postgresql/replication-slot-management/" data-link-title="PostgreSQL Replication Slot Management：Physical / Logical / Failover Slot 治理" data-link-desc="PG replication slot 是 *primary 端的 standby 進度紀錄*、防 WAL premature deletion。但 orphan slot 會吃 disk、failover 後 logical slot 不會自動跟新 primary、是 PG 操作的 hidden complexity。本文走 physical / logical slot 差異、slot lifecycle、failover slot synchronization（PG 17&#43; 新特性）、orphan slot 治理、5 production 踩雷（orphan slot disk 爆 / logical slot lag / failover 後 slot 丟 / wal_keep_size 跟 slot 衝突 / connection 同時打 slot 數量限制）">Replication Slot Management</a>（slot lifecycle / orphan / failover sync）/ <a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">Replication Topology</a>（streaming + LSN 基礎）</li>
<li>Methodology：<a href="/blog/posts/vendor-%E6%B7%B1%E5%BA%A6%E6%8A%80%E8%A1%93%E6%96%87%E7%AB%A0%E6%96%B9%E6%B3%95%E8%AB%96%E7%9A%84%E6%BC%94%E5%8C%96%E7%B4%80%E9%8C%84%E5%90%8C-vendor-%E7%B3%BB%E5%88%97%E7%9A%84%E9%96%8B%E5%A0%B4%E8%BC%AA%E6%9B%BF%E9%A9%97%E8%AD%89/" data-link-title="Vendor 深度技術文章方法論的演化紀錄：同 vendor 系列的開場輪替驗證" data-link-desc="vendor overview 飽和後要寫單一功能深度文章、需要選題與結構依據時回來。這套方法論的驗證來源與 cadence variant 在高風險場景（同 vendor sub-tool 系列）的實證。">Vendor 深度技術文章的寫作方法論</a></li>
</ul>
]]></content:encoded></item></channel></rss>