<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Multi-Master on Tarragon</title><link>https://tarrragon.github.io/blog/tags/multi-master/</link><description>Recent content in Multi-Master on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Tue, 19 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/multi-master/index.xml" rel="self" type="application/rss+xml"/><item><title>PostgreSQL BDR / Multi-Master: three paths to active-active writes and how to govern conflicts</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/bdr-multi-master/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/bdr-multi-master/</guid><description>&lt;blockquote>
&lt;p>This article is an implementation-layer deep dive under the &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL&lt;/a> overview. The overview covers where PG sits in the OLTP landscape; this article focuses on &lt;em>multi-master / active-active replication&lt;/em>, which PG does not ship by default and which requires an extension.&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;h2 id="pg-預設沒-multi-master得用-extension">PG has no built-in multi-master; you need an extension&lt;/h2>
&lt;p>PG core provides &lt;em>single-primary streaming replication&lt;/em>:&lt;/p>
&lt;ul>
&lt;li>Writes can only go to the primary&lt;/li>
&lt;li>Standbys accept reads (hot_standby) but reject writes&lt;/li>
&lt;li>After failover a new primary takes over; there is never more than one write entry point&lt;/li>
&lt;/ul>
&lt;p>For scenarios that need &lt;em>active-active&lt;/em> (each region accepts local writes), the PG ecosystem offers three add-on paths:&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Option&lt;/th>
 &lt;th>Source&lt;/th>
 &lt;th>Mechanism&lt;/th>
 &lt;th>License&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;strong>BDR&lt;/strong>&lt;/td>
 &lt;td>EDB (Enterprise)&lt;/td>
 &lt;td>Logical replication-based, bidirectional&lt;/td>
 &lt;td>Commercial (EDB subscription)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;strong>pgEdge&lt;/strong>&lt;/td>
 &lt;td>pgEdge Inc.&lt;/td>
 &lt;td>Open source, built on the Spock extension (derived from BDR)&lt;/td>
 &lt;td>Open source (Spock)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;strong>Bucardo&lt;/strong>&lt;/td>
 &lt;td>community&lt;/td>
 &lt;td>Trigger-based, async, written in Perl&lt;/td>
 &lt;td>Open source (BSD)&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>Each path has different trade-offs. The vast majority of PG production deployments &lt;em>do not need multi-master&lt;/em>: single-primary streaming replication plus read-replica scaling is enough. Multi-master is for &lt;em>special requirements&lt;/em> only (cross-region active-active writes, or maintenance that cannot interrupt service).&lt;/p>
&lt;p>Compared with &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/group-replication/" data-link-title="MySQL Group Replication / InnoDB Cluster：single-primary vs multi-primary mode 對 transaction certification 的影響" data-link-desc="MySQL Group Replication 提供 synchronous multi-primary replication、用 Paxos-like Group Communication Engine（GCE）達成 quorum-based commit。但「multi-primary」不是「single-primary 多開幾個 write 入口」、是 *transaction conflict detection &amp;#43; certification* 整個機制不同。本文走 GR 機制（GCE &amp;#43; certification &amp;#43; applier）、single-primary vs multi-primary mode、InnoDB Cluster 跟 MySQL Shell / Router 整合、5 production 踩雷（cert lag / write conflict / large transaction / network partition / member 加入 catch-up）、何時用 GR 何時用傳統 replication">MySQL Group Replication&lt;/a>: MySQL GR is &lt;em>officially built in&lt;/em> (5.7+), while PG has no built-in equivalent. MySQL users can adopt GR / InnoDB Cluster directly; PG users have to choose an extension and weigh the license trade-offs.&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>This article is an implementation-layer deep dive under the <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a> overview. The overview covers where PG sits in the OLTP landscape; this article focuses on <em>multi-master / active-active replication</em>, which PG does not ship by default and which requires an extension.</p></blockquote>
<hr>
<h2 id="pg-預設沒-multi-master得用-extension">PG has no built-in multi-master; you need an extension</h2>
<p>PG core provides <em>single-primary streaming replication</em>:</p>
<ul>
<li>Writes can only go to the primary</li>
<li>Standbys accept reads (hot_standby) but reject writes</li>
<li>After failover a new primary takes over; there is never more than one write entry point</li>
</ul>
<p>For scenarios that need <em>active-active</em> (each region accepts local writes), the PG ecosystem offers three add-on paths:</p>
<table>
  <thead>
      <tr>
          <th>Option</th>
          <th>Source</th>
          <th>Mechanism</th>
          <th>License</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>BDR</strong></td>
          <td>EDB (Enterprise)</td>
          <td>Logical replication-based, bidirectional</td>
          <td>Commercial (EDB subscription)</td>
      </tr>
      <tr>
          <td><strong>pgEdge</strong></td>
          <td>pgEdge Inc.</td>
          <td>Open source, built on the Spock extension (derived from BDR)</td>
          <td>Open source (Spock)</td>
      </tr>
      <tr>
          <td><strong>Bucardo</strong></td>
          <td>community</td>
          <td>Trigger-based, async, written in Perl</td>
          <td>Open source (BSD)</td>
      </tr>
  </tbody>
</table>
<p>Each path has different trade-offs. The vast majority of PG production deployments <em>do not need multi-master</em>: single-primary streaming replication plus read-replica scaling is enough. Multi-master is for <em>special requirements</em> only (cross-region active-active writes, or maintenance that cannot interrupt service).</p>
<p>Compared with <a href="/blog/backend/01-database/vendors/mysql/group-replication/" data-link-title="MySQL Group Replication / InnoDB Cluster：single-primary vs multi-primary mode 對 transaction certification 的影響" data-link-desc="MySQL Group Replication 提供 synchronous multi-primary replication、用 Paxos-like Group Communication Engine（GCE）達成 quorum-based commit。但「multi-primary」不是「single-primary 多開幾個 write 入口」、是 *transaction conflict detection &#43; certification* 整個機制不同。本文走 GR 機制（GCE &#43; certification &#43; applier）、single-primary vs multi-primary mode、InnoDB Cluster 跟 MySQL Shell / Router 整合、5 production 踩雷（cert lag / write conflict / large transaction / network partition / member 加入 catch-up）、何時用 GR 何時用傳統 replication">MySQL Group Replication</a>: MySQL GR is <em>officially built in</em> (5.7+), while PG has no built-in equivalent. MySQL users can adopt GR / InnoDB Cluster directly; PG users have to choose an extension and weigh the license trade-offs.</p>
<h2 id="multi-master-三方案對比">Comparing the three multi-master options</h2>
<h3 id="方案-1bdr-edb-postgres-distributed">Option 1: BDR (EDB Postgres Distributed)</h3>
<p>EDB's commercial distributed offering; it runs on EDB Postgres Advanced Server or on community PG.</p>
<p><strong>Features</strong>:</p>
<ul>
<li>Bidirectional logical replication, N-way active-active</li>
<li>Built-in conflict detection + resolution (LWW / column-level / user-defined)</li>
<li>Two modes: eager (sync) and async</li>
<li>Tightly integrated with EDB tooling</li>
</ul>
<p><strong>Trade-offs</strong>:</p>
<ul>
<li>Commercial license; requires an EDB subscription</li>
<li>Mature for cross-region multi-master (widely used by North American enterprises)</li>
<li>Support for <em>new PG versions</em> usually lags by a few months</li>
</ul>
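<h3 id="方案-2pgedge基於-spock-extension">Option 2: pgEdge (built on the Spock extension)</h3>
<p>pgEdge is an open-source multi-master distribution built on the <em>Spock</em> extension (derived from BDR):</p>
<p><strong>Features</strong>:</p>
<ul>
<li>Open source, can be self-managed</li>
<li>Architecture close to BDR, with no license fee</li>
<li>Conflict resolution uses LWW plus column-level resolution</li>
<li>Designed for <em>edge / geographically distributed</em> deployments</li>
</ul>
<p><strong>Trade-offs</strong>:</p>
<ul>
<li>Newer (2023+), with less community validation than BDR</li>
<li>Conflict resolution policies are simpler than BDR's</li>
<li>Some EDB commercial features have no counterpart</li>
</ul>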
<h3 id="方案-3bucardo">Option 3: Bucardo</h3>
<p>A community async multi-master tool for PG, written in Perl and trigger-based:</p>
<p><strong>Features</strong>:</p>
<ul>
<li>Fully open source</li>
<li>Trigger-based (does not rely on logical replication)</li>
<li>Supports multi-source replication (fan-in / fan-out)</li>
</ul>
<p><strong>Trade-offs</strong>:</p>
<ul>
<li>Async only, so <em>conflicts surface later</em>, after replication lag</li>
<li>Trigger overhead (reduces write throughput on the primary)</li>
<li>Perl and its toolchain are not widely maintained skills</li>
<li>Not suitable when <em>synchronous consistency</em> is required</li>
</ul>
<h2 id="multi-master-conflict-model">Multi-Master Conflict Model</h2>
<p>Every multi-master solution has to resolve the conflict that arises when <em>the same row is modified in two places at once</em>:</p>
<h3 id="conflict-來源">Where conflicts come from</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">Region A (primary 1)          Region B (primary 2)
</span></span><span class="line"><span class="ln">2</span><span class="cl">UPDATE orders                 UPDATE orders
</span></span><span class="line"><span class="ln">3</span><span class="cl">SET status=&#39;shipped&#39;          SET status=&#39;cancelled&#39;
</span></span><span class="line"><span class="ln">4</span><span class="cl">WHERE id=100                  WHERE id=100
</span></span><span class="line"><span class="ln">5</span><span class="cl">     ↓                              ↓
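</span></span><span class="line"><span class="ln">6</span><span class="cl">   Merge? Which one wins?</span></span></code></pre></div><p>Each region commits locally; the conflict is only discovered once replication catches up, so it has to be <em>resolved automatically</em> by the replication layer rather than pushed back to the application at write time.</p>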
<h3 id="conflict-resolution-strategies">Conflict Resolution Strategies</h3>
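<p><strong>1. Last-Write-Wins (LWW)</strong>, the most common:</p>
<ul>
<li>Compare transaction commit timestamps; the later one wins</li>
<li>Simple, but causes <em>data loss</em> (the earlier commit's change is overwritten)</li>
<li>Requires <em>clock synchronization</em> (NTP); clock skew makes the outcome unpredictable</li>
</ul>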
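<p>The LWW rule and its sensitivity to clock skew can be sketched in a few lines. This is a minimal simulation, not the resolver of BDR or Spock: the <code>Write</code> type, the <code>lww</code> function, and the tie-break on node name are assumptions made for illustration.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    node: str
    value: str
    commit_ts: float  # commit timestamp taken from the node's local clock


def lww(a: Write, b: Write) -> Write:
    """Return the write with the later commit timestamp (ties broken by node name)."""
    return max(a, b, key=lambda w: (w.commit_ts, w.node))


# Region A really commits first (t=100.0), region B second (t=100.2).
shipped = Write("region_a", "shipped", commit_ts=100.0)
cancelled = Write("region_b", "cancelled", commit_ts=100.2)
winner = lww(shipped, cancelled)

# Same real-world order, but region A's clock runs 0.5 s fast.
skewed_shipped = Write("region_a", "shipped", commit_ts=100.0 + 0.5)
skewed_winner = lww(skewed_shipped, cancelled)
```

<p>With synchronized clocks the later write (<code>cancelled</code>) wins; with a fast clock on region A the earlier write wins instead, and in both cases the losing value is discarded silently.</p>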
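<p><strong>2. Column-level conflict resolution</strong>:</p>
<ul>
<li>Each column is resolved by LWW independently (the status column and the amount column are resolved separately)</li>
<li>Finer-grained than row-level LWW, but the application's semantics must tolerate columns being merged from different writers</li>
</ul>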
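<p>A rough sketch of per-column resolution follows. It assumes every column carries its own timestamp; the <code>Row</code> shape and <code>merge_columns</code> are hypothetical names, not an extension API.</p>

```python
from typing import Dict, Tuple

# Each column carries its own (value, commit_ts) pair.
Row = Dict[str, Tuple[object, float]]


def merge_columns(local: Row, remote: Row) -> Row:
    """Resolve each column independently: the later timestamp wins per column."""
    merged: Row = {}
    for col in local.keys() | remote.keys():
        candidates = [r[col] for r in (local, remote) if col in r]
        merged[col] = max(candidates, key=lambda cell: cell[1])
    return merged


base_ts = 100.0
# Region A changed only status; region B changed only amount.
region_a = {"status": ("shipped", 101.0), "amount": (50, base_ts)}
region_b = {"status": ("pending", base_ts), "amount": (45, 102.0)}
merged = merge_columns(region_a, region_b)
```

<p>Here both changes survive: <code>status</code> comes from region A and <code>amount</code> from region B. Row-level LWW would have taken region B's row wholesale and dropped the status change.</p>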
<p><strong>3. User-defined trigger</strong>:</p>
<ul>
<li>Write a PG function to resolve the conflict</li>
<li>Useful for <em>special business logic</em> (for example, summing amounts instead of overwriting)</li>
<li>High maintenance cost</li>
</ul>
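<p>The "sum instead of overwrite" case can be illustrated as a three-way merge. This sketch assumes the resolver can see the common base value that both regions started from; whether an extension exposes that value to a custom resolver is something to verify in its documentation. <code>merge_amount</code> is a hypothetical name.</p>

```python
def merge_amount(base: int, local: int, remote: int) -> int:
    """Three-way merge: apply both sides' deltas to the common base value."""
    return base + (local - base) + (remote - base)


# Balance starts at 100. Region A adds 30, region B subtracts 20, concurrently.
merged_balance = merge_amount(base=100, local=130, remote=80)
```

<p>The merged balance reflects both changes (100 + 30 - 20), where LWW would have kept either 130 or 80.</p>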
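<p><strong>4. Manual reconciliation</strong>:</p>
<ul>
<li>Conflicts are written to a log table and handled manually by the application or a DBA</li>
<li>For cases that <em>cannot be resolved automatically</em> (such as finance)</li>
<li>High ops cost</li>
</ul>
<p>For the vast majority of cases: use LWW, accept a small amount of data loss, and design application operations to be <em>idempotent / commutative</em> so that conflicts are avoided in the first place.</p>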
<h3 id="conflict-機率取決於-application-pattern">Conflict probability depends on the application pattern</h3>
<ul>
<li><em>Tenant-isolated</em> applications (each user_id writes its own rows): essentially no conflicts</li>
<li><em>Shared counter / inventory</em> applications: high conflict rate, a poor fit for multi-master</li>
<li><em>Append-only event logs</em>: low conflict rate, a good fit for multi-master</li>
</ul>
<h2 id="配置-step-by-steppgedge-為主">Step-by-step configuration (pgEdge)</h2>
<p>pgEdge is open source and a common choice for self-hosting, so the steps below use it.</p>
<h3 id="step-1在每個-region-node-裝-pgedge">Step 1: Install pgEdge on each region's node</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Install pgEdge CLI</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">curl -fsSL https://pgedge-upstream.s3.amazonaws.com/REPO/install.py <span class="p">|</span> python3
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># Setup PG + Spock + pgEdge</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">./pgedge install pg16
</span></span><span class="line"><span class="ln">6</span><span class="cl">./pgedge install spock</span></span></code></pre></div><h3 id="step-2配置每個-node">Step 2: Configure each node</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Run on node1 (us-east)
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">spock</span><span class="p">.</span><span class="n">node_create</span><span class="p">(</span><span class="n">node_name</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;node1&#39;</span><span class="p">,</span><span class="w"> </span><span class="n">dsn</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;host=node1.example.com port=5432 dbname=production&#39;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- Run on node2 (eu-west)
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">spock</span><span class="p">.</span><span class="n">node_create</span><span class="p">(</span><span class="n">node_name</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;node2&#39;</span><span class="p">,</span><span class="w"> </span><span class="n">dsn</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;host=node2.example.com port=5432 dbname=production&#39;</span><span class="p">);</span></span></span></code></pre></div><h3 id="step-3建-replication-set--subscribe">Step 3: Create the replication set and subscribe</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- Run on each node: add every table in schema public to the default replication set
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">spock</span><span class="p">.</span><span class="n">repset_add_all_tables</span><span class="p">(</span><span class="s1">&#39;default&#39;</span><span class="p">,</span><span class="w"> </span><span class="k">ARRAY</span><span class="p">[</span><span class="s1">&#39;public&#39;</span><span class="p">]);</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w"></span><span class="c1">-- On node1: subscribe to node2
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">spock</span><span class="p">.</span><span class="n">sub_create</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">    </span><span class="n">subscription_name</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;sub_n1_n2&#39;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">    </span><span class="n">provider_dsn</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;host=node2.example.com port=5432 dbname=production&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w"></span><span class="c1">-- On node2: subscribe to node1 (bidirectional)
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">spock</span><span class="p">.</span><span class="n">sub_create</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">    </span><span class="n">subscription_name</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;sub_n2_n1&#39;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">    </span><span class="n">provider_dsn</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;host=node1.example.com port=5432 dbname=production&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w"></span><span class="p">);</span></span></span></code></pre></div><h3 id="step-4設-conflict-resolution">Step 4: Set conflict resolution</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Set LWW (the default); confirm the exact function or setting name in your Spock version&#39;s docs
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">spock</span><span class="p">.</span><span class="n">conflict_resolution_setting_set</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">conflict_type</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;update_origin_change&#39;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">    </span><span class="n">resolution_setting</span><span class="w"> </span><span class="p">:</span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;apply_remote&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="p">);</span></span></span></code></pre></div><h3 id="step-5驗證">Step 5: Verify</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Check subscription status
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">spock</span><span class="p">.</span><span class="n">subscription</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- Check replication lag
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">pg_stat_replication</span><span class="p">;</span></span></span></code></pre></div><h2 id="5-個-production-踩雷">5 Production Pitfalls</h2>
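<h3 id="1-lww-data-loss--application-沒設計-commutative">1. LWW data loss: the application was not designed to be commutative</h3>
<p>With LWW as the default, when two regions UPDATE the same row concurrently, the later commit wins and the earlier one is lost. The application gets no signal that its write disappeared, which makes debugging hard.</p>
<p>Fixes:</p>
<ul>
<li>Design the application schema to be <em>tenant-isolated</em> (each user_id writes only its own rows)</li>
<li>For <em>shared counters / inventory</em>, use <em>commutative operations</em> (INCREMENT, not SET)</li>
<li>Add an <em>audit log</em> for important writes: conflicting writes still land in the audit log, so the application can see that a conflict occurred</li>
<li>If you truly need strict consistency, skip multi-master and use single-primary + readers or distributed SQL</li>
</ul>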
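<p>The difference between SET-style and INCREMENT-style writes can be shown by replaying the same two concurrent writes in both arrival orders. This is a simplified model with hypothetical helper names; it treats each replicated change as an operation rather than a row image.</p>

```python
def apply_sets(initial: int, ops: list) -> int:
    """Absolute writes: each op overwrites the value, so arrival order matters."""
    value = initial
    for new_value in ops:
        value = new_value
    return value


def apply_increments(initial: int, deltas: list) -> int:
    """Relative writes: each op adds a delta, so arrival order does not matter."""
    return initial + sum(deltas)


stock = 10
# Region A sells 3 units, region B sells 2 units, concurrently.
# As SET: A writes 7, B writes 8 (each computed from the stale value 10).
set_a_then_b = apply_sets(stock, [7, 8])
set_b_then_a = apply_sets(stock, [8, 7])
# As INCREMENT: A sends -3, B sends -2.
inc_a_then_b = apply_increments(stock, [-3, -2])
inc_b_then_a = apply_increments(stock, [-2, -3])
```

<p>The SET variants end at 8 or 7 depending on order, and both are wrong; the INCREMENT variants end at 5 either way. One caveat: row-based logical replication ships the resulting row value, not the operation, so this benefit only carries over if the replication layer applies deltas for that column. Check the extension's documentation before relying on it.</p>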
<h3 id="2-sequence-collision--two-region-各自-next-同號">2. Sequence collision: two regions each draw the same next value</h3>
<p><code>SERIAL</code> / <code>IDENTITY</code> columns are backed by a sequence. Sequences are local to each node, so two regions calling nextval independently can get the same number, and the INSERT then conflicts (duplicate PK).</p>
<p>Fixes:</p>
<ul>
<li>Use <em>staggered sequence ranges</em>: node1 uses 1 to 1M, node2 uses 1M+1 to 2M (set with <code>setval</code>)</li>
<li>Or use a <em>UUID</em> (v4 / v7) as the PK, which does not collide across nodes</li>
<li>Or use <em>per-node interleaved sequences</em>: <code>CREATE SEQUENCE orders_id_node1 START 1 INCREMENT 2</code> on node1, and the same with <code>START 2</code> on node2 (odd vs even)</li>
</ul>
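<p>The odd/even scheme generalizes to "start at the node index, step by the node count". The sketch below only models the arithmetic; <code>node_ids</code> is a hypothetical helper, not a PG function.</p>

```python
from itertools import count, islice


def node_ids(node_index: int, node_count: int):
    """Yield IDs for one node: start at node_index, step by node_count."""
    return count(start=node_index, step=node_count)


# Two nodes: node 1 gets odd IDs, node 2 gets even IDs.
node1_ids = list(islice(node_ids(1, 2), 5))
node2_ids = list(islice(node_ids(2, 2), 5))
```

<p>The two ID streams never overlap. The trade-off is that the step has to be fixed up front: adding a third node later means re-planning the start and increment on every node.</p>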
<h3 id="3-ddl-replication-不自動">3. DDL is not replicated automatically</h3>
<p>PG logical replication (the foundation of pgEdge / BDR) <em>does not replicate DDL automatically</em>. <code>CREATE TABLE</code> / <code>ALTER TABLE</code> must be <em>run separately</em> on every node.</p>
<p>Fixes:</p>
<ul>
<li>Use <em>deployment automation</em> (Ansible / Terraform) to run DDL on all nodes together</li>
<li>pgEdge provides <code>spock.replicate_ddl(...)</code>, which turns DDL into a replicable event</li>
<li>BDR Enterprise has <em>DDL replication</em> (a commercial feature)</li>
<li>Before a DDL change, confirm that <em>all nodes are healthy</em> to reduce the chance of partial state</li>
</ul>
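<p>The "check health first, then apply everywhere" idea can be sketched as a small driver. Everything here is an assumption for illustration: <code>run_ddl_everywhere</code>, the <code>is_healthy</code> and <code>execute</code> callables, and the in-memory demo stand in for whatever connection layer and health check you actually use.</p>

```python
from typing import Callable, Dict, List


def run_ddl_everywhere(
    ddl: str,
    nodes: List[str],
    is_healthy: Callable[[str], bool],
    execute: Callable[[str, str], None],
) -> Dict[str, str]:
    """Apply one DDL statement to every node, but only if all nodes are healthy first."""
    unhealthy = [n for n in nodes if not is_healthy(n)]
    if unhealthy:
        raise RuntimeError(f"refusing to run DDL, unhealthy nodes: {unhealthy}")
    results: Dict[str, str] = {}
    for node in nodes:
        try:
            execute(node, ddl)
            results[node] = "ok"
        except Exception as exc:  # record partial state instead of hiding it
            results[node] = f"failed: {exc}"
            break  # stop so an operator can repair before continuing
    return results


# Demo with in-memory stand-ins for real connections.
applied = []
outcome = run_ddl_everywhere(
    "ALTER TABLE orders ADD COLUMN note text",
    ["node1", "node2"],
    is_healthy=lambda node: True,
    execute=lambda node, ddl: applied.append((node, ddl)),
)
```

<p>Stopping at the first failure leaves a visible record of which nodes have the change, which is easier to repair than a silent partial rollout.</p>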
<h3 id="4-conflict-log-治理--log-table-爆滿">4. Conflict log governance: the log table fills up</h3>
<p>Every conflict is written to a table such as <code>spock.conflict_log</code> / <code>bdr.conflict_history</code>; as the log accumulates, it can fill the disk.</p>
<p>Fixes:</p>
<ul>
<li>Set <em>log retention</em>: a cron job that periodically archives and deletes old conflict log entries</li>
<li>Monitor the conflict rate: a high conflict rate points to an application design problem, not an ops problem</li>
<li>For <em>business-critical</em> conflicts, also write to an application-level audit table rather than relying on the system log alone</li>
</ul>
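<p>Retention and rate monitoring both reduce to simple timestamp arithmetic. The sketch below works on a plain list of conflict timestamps; the function names, the 30-day retention, and the one-hour window are assumptions, and reading the timestamps out of the extension's conflict table is left out.</p>

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def split_by_retention(
    conflict_times: List[datetime], now: datetime, keep: timedelta
) -> Tuple[List[datetime], List[datetime]]:
    """Return (to_keep, to_archive): entries older than `keep` get archived."""
    cutoff = now - keep
    to_keep = [t for t in conflict_times if t >= cutoff]
    to_archive = [t for t in conflict_times if t < cutoff]
    return to_keep, to_archive


def conflicts_per_hour(conflict_times: List[datetime], now: datetime) -> int:
    """Count conflicts logged during the last hour, as a simple alerting signal."""
    return sum(1 for t in conflict_times if now - t <= timedelta(hours=1))


now = datetime(2026, 5, 19, 12, 0, 0)
log = [now - timedelta(days=40), now - timedelta(days=2), now - timedelta(minutes=10)]
kept, archived = split_by_retention(log, now, keep=timedelta(days=30))
recent = conflicts_per_hour(log, now)
```

<p>A scheduled job would archive and delete the <code>archived</code> entries and alert when <code>recent</code> crosses a threshold chosen for the workload.</p>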
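<h3 id="5-failover-後-timeline-分歧">5. Timeline divergence after failover</h3>
<p>By design, <em>every region is a primary</em> in multi-master. When Region A goes down, Region B takes over its traffic, but when Region A comes back it <em>still considers itself a primary</em>. If Region A accepted writes before the outage that were never replicated out, those stale writes re-enter replication on recovery and LWW may resolve them in ways nobody intended.</p>
<p>Fixes:</p>
<ul>
<li><em>Fence Region A on recovery</em>: a network-level fence (firewall) plus a manual unfence procedure</li>
<li>Integrate <em>etcd / Consul</em> with BDR / Spock for leader election (to avoid split-brain)</li>
<li>Cross-region multi-master needs a <em>runbook</em> for bringing a region back; do not rely on automation</li>
</ul>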
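<p>A runbook for a recovering region can be thought of as a small state machine with a manual gate. The states, the <code>next_state</code> function, and the "lag must reach zero" rule below are illustrative assumptions, not behavior of BDR or Spock.</p>

```python
from enum import Enum


class NodeState(Enum):
    FENCED = "fenced"            # no client traffic, no replication
    CATCHING_UP = "catching_up"  # replicating, still closed to client writes
    ACTIVE = "active"            # accepting client writes again


def next_state(state: NodeState, operator_approved: bool, lag_bytes: int) -> NodeState:
    """Advance one step; leaving FENCED always requires explicit operator approval."""
    if state is NodeState.FENCED:
        return NodeState.CATCHING_UP if operator_approved else NodeState.FENCED
    if state is NodeState.CATCHING_UP:
        return NodeState.ACTIVE if lag_bytes == 0 else NodeState.CATCHING_UP
    return state


# Walk a recovering region through the runbook.
history = []
state = NodeState.FENCED
for approved, lag in [(False, 4096), (True, 4096), (True, 512), (True, 0)]:
    state = next_state(state, approved, lag)
    history.append(state.value)
```

<p>The region stays fenced until an operator approves, then replicates without taking client writes, and only becomes active once it has fully caught up.</p>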
<h2 id="何時用-multi-master-vs-不用">When to use multi-master and when not to</h2>
<table>
  <thead>
      <tr>
          <th>Scenario</th>
          <th>Recommendation</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Genuine need for cross-region active-active writes</td>
          <td>BDR / pgEdge</td>
      </tr>
      <tr>
          <td>Maintenance that cannot interrupt service (zero-downtime upgrade)</td>
          <td>BDR / pgEdge</td>
      </tr>
      <tr>
          <td>High conflict rate (shared counter / inventory)</td>
          <td>Avoid multi-master; use distributed SQL</td>
      </tr>
      <tr>
          <td>Mainly read scaling, stale reads acceptable</td>
          <td>Streaming replication + read replicas (simpler)</td>
      </tr>
      <tr>
          <td>Strict consistency required</td>
          <td>Single-primary + sync replication, or Aurora DSQL / Spanner</td>
      </tr>
      <tr>
          <td>Budget-sensitive and unwilling to staff BDR / pgEdge operations</td>
          <td>Avoid multi-master; use managed distributed SQL</td>
      </tr>
  </tbody>
</table>
<h2 id="跟-mysql-group-replication-對比">Comparison with MySQL Group Replication</h2>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th>PG Multi-Master</th>
          <th>MySQL Group Replication</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Built in?</td>
          <td>No, requires an extension</td>
          <td>Yes, built in since 5.7</td>
      </tr>
      <tr>
          <td>Commercial vs open source</td>
          <td>BDR is commercial / pgEdge is open source</td>
          <td>Works with both Oracle commercial and community editions</td>
      </tr>
      <tr>
          <td>Sync mode</td>
          <td>Available (BDR eager)</td>
          <td>Yes (certification-based)</td>
      </tr>
      <tr>
          <td>Conflict resolution</td>
          <td>LWW / column / user-defined</td>
          <td>Certification-based (distributed transaction)</td>
      </tr>
      <tr>
          <td>Production maturity</td>
          <td>BDR high, pgEdge medium</td>
          <td>High (backed by Oracle)</td>
      </tr>
      <tr>
          <td>Share of deployments</td>
          <td>Low (PG mostly runs single-primary)</td>
          <td>Higher (MySQL promotes InnoDB Cluster)</td>
      </tr>
  </tbody>
</table>
<p>MySQL GR is built in and backed by Oracle; PG has no built-in equivalent. For organizations with heavy multi-master requirements, the MySQL GR path is more direct.</p>
<h2 id="跟其他模組整合">Integration with other modules</h2>
<h3 id="跟-replication-topology">With Replication Topology</h3>
<p>Multi-master is <em>bidirectional logical replication layered on top of streaming replication</em>; it does not replace streaming. Streaming still serves standbys and failover, while multi-master serves active-active writes. See <a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">Replication Topology</a>.</p>
<h3 id="跟-logical-replication">With Logical Replication</h3>
<p>pgEdge and BDR are both built on logical replication slots and share PG's logical decoding infrastructure with <a href="/blog/backend/01-database/vendors/postgresql/logical-replication-debezium/" data-link-title="PostgreSQL Logical Replication &#43; Debezium CDC：replication slot × failure × recovery 對照" data-link-desc="PostgreSQL logical replication slot 跟 Debezium CDC 的失效模式對照表：slot lag 撐爆 primary disk / schema change 斷流 / 初始 COPY 鎖表 / zombie slot 不釋放 / replay storm 後 offset reset；publication / subscription / pgoutput 配置、跟 Kafka outbox pattern 整合">Logical Replication + Debezium</a>, but their <em>configuration and tooling</em> differ.</p>
<h3 id="跟-mvcc">With MVCC</h3>
<p>Multi-master conflicts are detected <em>after commit</em> (async), not inside the transaction. This is a different layer from single-node MVCC (transaction snapshots within one cluster). See <a href="/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/" data-link-title="PostgreSQL MVCC &#43; Lock Model：為什麼 PG 比 MySQL 少 deadlock、但 vacuum 是別的代價" data-link-desc="PG 用 *MVCC-heavy &#43; 少 explicit lock* 的並行控制、跟 MySQL InnoDB 的 *lock-based*（record / gap / next-key）相反。本文走 MVCC 機制（tuple version &#43; xmin/xmax &#43; visibility）、PG 4 種 lock（row-level / table-level / advisory / predicate）、預測 SERIALIZABLE 行為、5 production 踩雷（idle transaction 卡 vacuum / SELECT FOR UPDATE 跨 transaction / advisory lock 沒釋放 / bloat 不是 vacuum 問題 / predicate lock 在 SSI 下 rollback）、跟 MySQL lock-contention sibling 對比">MVCC + Lock Model</a>.</p>
<h2 id="相關連結">Related links</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL vendor overview</a></li>
<li><a href="/blog/backend/01-database/vendors/postgresql/replication-topology/" data-link-title="PostgreSQL Replication Topology：async / sync / quorum 三模式跟 LSN &#43; replication slot 的三軸組合" data-link-desc="PostgreSQL streaming replication 不是「sync 或 async」、是 *durability / latency / consistency* 三軸組合 &#43; LSN-based 進度追蹤 &#43; replication slot 治理。本文走 3 軸取捨模型、async / sync / quorum-based sync 行為對比、LSN &#43; replication slot 機制、配置 step-by-step、5 production 踩雷（standby lag 暴衝 / sync standby 退回 async / orphan replication slot / cascading replication 雪崩 / failover 後 timeline 分歧）、跟 Patroni HA &#43; logical replication 整合">PG Replication Topology</a> (streaming and multi-master coexisting)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/logical-replication-debezium/" data-link-title="PostgreSQL Logical Replication &#43; Debezium CDC：replication slot × failure × recovery 對照" data-link-desc="PostgreSQL logical replication slot 跟 Debezium CDC 的失效模式對照表：slot lag 撐爆 primary disk / schema change 斷流 / 初始 COPY 鎖表 / zombie slot 不釋放 / replay storm 後 offset reset；publication / subscription / pgoutput 配置、跟 Kafka outbox pattern 整合">PG Logical Replication + Debezium</a> (logical decoding fundamentals)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/" data-link-title="PostgreSQL MVCC &#43; Lock Model：為什麼 PG 比 MySQL 少 deadlock、但 vacuum 是別的代價" data-link-desc="PG 用 *MVCC-heavy &#43; 少 explicit lock* 的並行控制、跟 MySQL InnoDB 的 *lock-based*（record / gap / next-key）相反。本文走 MVCC 機制（tuple version &#43; xmin/xmax &#43; visibility）、PG 4 種 lock（row-level / table-level / advisory / predicate）、預測 SERIALIZABLE 行為、5 production 踩雷（idle transaction 卡 vacuum / SELECT FOR UPDATE 跨 transaction / advisory lock 沒釋放 / bloat 不是 vacuum 問題 / predicate lock 在 SSI 下 rollback）、跟 MySQL lock-contention sibling 對比">PG MVCC + Lock Model</a> (multi-master conflicts vs single-node MVCC)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/patroni-ha/" data-link-title="PostgreSQL Patroni HA：從 leader 失聯到 client 重連的 5 段 failover lifecycle" data-link-desc="Patroni 把 PostgreSQL HA 拆成 detection / election / promotion / reconfiguration / recovery 五段 lifecycle、每段都有獨立配置跟 failure mode；DCS quorum &#43; watchdog 防 split-brain、async/sync replication 取捨、5 個 production 踩雷、跟 PgBouncer / HAProxy / cert-manager 整合">PG Patroni HA</a> (the single-primary HA alternative)</li>
<li><a href="/blog/backend/01-database/global-distributed-oltp/" data-link-title="1.11 全球分散式 OLTP" data-link-desc="Spanner / Aurora DSQL / Cosmos DB multi-region write / CockroachDB / TiDB 的全球一致性取捨">1.11 Globally Distributed OLTP</a> (multi-master vs distributed SQL)</li>
<li><a href="/blog/backend/01-database/vendors/mysql/group-replication/" data-link-title="MySQL Group Replication / InnoDB Cluster：single-primary vs multi-primary mode 對 transaction certification 的影響" data-link-desc="MySQL Group Replication 提供 synchronous multi-primary replication、用 Paxos-like Group Communication Engine（GCE）達成 quorum-based commit。但「multi-primary」不是「single-primary 多開幾個 write 入口」、是 *transaction conflict detection &#43; certification* 整個機制不同。本文走 GR 機制（GCE &#43; certification &#43; applier）、single-primary vs multi-primary mode、InnoDB Cluster 跟 MySQL Shell / Router 整合、5 production 踩雷（cert lag / write conflict / large transaction / network partition / member 加入 catch-up）、何時用 GR 何時用傳統 replication">MySQL Group Replication</a> (sibling article, different implementation)</li>
<li>Official: <a href="https://www.enterprisedb.com/products/edb-postgres-distributed-bdr">EDB BDR</a> / <a href="https://www.pgedge.com/">pgEdge</a> / <a href="https://github.com/pgEdge/spock">Spock GitHub</a> / <a href="https://bucardo.org/">Bucardo</a></li>
</ul>
]]></content:encoded></item></channel></rss>