<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Interleaved-Tables on Tarragon</title><link>https://tarrragon.github.io/blog/tags/interleaved-tables/</link><description>Recent content in Interleaved-Tables on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Wed, 27 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/interleaved-tables/index.xml" rel="self" type="application/rss+xml"/><item><title>Spanner Schema Migration Without Downtime + Interleaved Tables</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/spanner/schema-migration-interleaved-tables/</link><pubDate>Wed, 27 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/spanner/schema-migration-interleaved-tables/</guid><description>&lt;blockquote>
&lt;p>本文是 &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/spanner/" data-link-title="Google Cloud Spanner" data-link-desc="全球分散式 strong-consistency OLTP、TrueTime API、線性擴展到 10 億 req/sec">Cloud Spanner&lt;/a> overview 的 implementation-layer deep article。Overview 已說明 Spanner 在全球 OLTP 譜系的定位、本文聚焦 &lt;em>schema migration without downtime + interleaved tables&lt;/em> — Spanner 兩個跟傳統 SQL 差異最大的 schema 機制。&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;h2 id="問題情境ddl-不停機跟-parent-child-物理-layout-的兩個疑問">問題情境：DDL 不停機跟 parent-child 物理 layout 的兩個疑問&lt;/h2>
&lt;p>傳統 PostgreSQL / MySQL DDL 拿 ACCESS EXCLUSIVE / metadata lock、線上跑 ALTER TABLE 動輒鎖表幾分鐘、大型 schema change 要 pt-osc / gh-ost / pg_repack 等外掛工具。Spanner 宣稱「schema change 不停機」、但團隊不知道實際機制跟邊界。讀者徵兆通常從這幾個地方浮現：「Spanner ALTER 真的不卡寫入嗎」「INDEX backfill 跑了 12 小時是正常嗎」「parent-child 的 INTERLEAVE IN PARENT 是什麼黑魔法」「ON DELETE CASCADE 在 interleaved table 為什麼是 storage-level 而不是 application-level」。&lt;/p>
&lt;p>真實壓力：multi-tenant SaaS 要對 100 億 row 的 orders 表加 column + 加 index、不能停機、不能讓 p99 write latency 超過 SLA。團隊以為「Spanner schema change 不停機」等同於「DDL 瞬間完成」、實際 ALTER 是 long-running operation、index backfill 在大表上跑數小時到數天、capacity 規劃要把 backfill 期間的 CPU 升幅算進去。&lt;/p>
&lt;p>Case anchor：&lt;strong>缺案例&lt;/strong>。9.C10 是 Google internal dogfood case、未展開 schema migration 細節、且 9.C10 不是 customer-facing capacity reference。本文用通用 pattern + 官方文件 + 反向回 &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/online-schema-change/" data-link-title="PostgreSQL Online Schema Change：先用 ALTER 內建特性、不能解才 pg_repack / pg-osc" data-link-desc="PostgreSQL ALTER TABLE 對多數變更已是 *fast catalog-only*（add column nullable / drop column / 改 default），不必走 ghost table tool。本文走 PG 內建 fast DDL 行為、何時必須走 pg_repack / pg-osc、兩工具機制對比（trigger-based vs WAL-shipping）、配置 step-by-step、5 production 踩雷（lock 升級 / VACUUM FULL 誤用 / pg_repack version mismatch / concurrent index 失敗清理 / generated stored column 不能 online）、跟 MySQL gh-ost / pt-osc sibling 對比">PostgreSQL Online Schema Change&lt;/a> 對照、待後續 customer case audit 補強。&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>本文是 <a href="/blog/backend/01-database/vendors/spanner/" data-link-title="Google Cloud Spanner" data-link-desc="全球分散式 strong-consistency OLTP、TrueTime API、線性擴展到 10 億 req/sec">Cloud Spanner</a> overview 的 implementation-layer deep article。Overview 已說明 Spanner 在全球 OLTP 譜系的定位、本文聚焦 <em>schema migration without downtime + interleaved tables</em> — Spanner 兩個跟傳統 SQL 差異最大的 schema 機制。</p></blockquote>
<hr>
<h2 id="問題情境ddl-不停機跟-parent-child-物理-layout-的兩個疑問">問題情境：DDL 不停機跟 parent-child 物理 layout 的兩個疑問</h2>
<p>傳統 PostgreSQL / MySQL DDL 拿 ACCESS EXCLUSIVE / metadata lock、線上跑 ALTER TABLE 動輒鎖表幾分鐘、大型 schema change 要 pt-osc / gh-ost / pg_repack 等外掛工具。Spanner 宣稱「schema change 不停機」、但團隊不知道實際機制跟邊界。讀者徵兆通常從這幾個地方浮現：「Spanner ALTER 真的不卡寫入嗎」「INDEX backfill 跑了 12 小時是正常嗎」「parent-child 的 INTERLEAVE IN PARENT 是什麼黑魔法」「ON DELETE CASCADE 在 interleaved table 為什麼是 storage-level 而不是 application-level」。</p>
<p>真實壓力：multi-tenant SaaS 要對 100 億 row 的 orders 表加 column + 加 index、不能停機、不能讓 p99 write latency 超過 SLA。團隊以為「Spanner schema change 不停機」等同於「DDL 瞬間完成」、實際 ALTER 是 long-running operation、index backfill 在大表上跑數小時到數天、capacity 規劃要把 backfill 期間的 CPU 升幅算進去。</p>
<p>Case anchor：<strong>缺案例</strong>。9.C10 是 Google internal dogfood case、未展開 schema migration 細節、且 9.C10 不是 customer-facing capacity reference。本文用通用 pattern + 官方文件 + 反向回 <a href="/blog/backend/01-database/vendors/postgresql/online-schema-change/" data-link-title="PostgreSQL Online Schema Change：先用 ALTER 內建特性、不能解才 pg_repack / pg-osc" data-link-desc="PostgreSQL ALTER TABLE 對多數變更已是 *fast catalog-only*（add column nullable / drop column / 改 default），不必走 ghost table tool。本文走 PG 內建 fast DDL 行為、何時必須走 pg_repack / pg-osc、兩工具機制對比（trigger-based vs WAL-shipping）、配置 step-by-step、5 production 踩雷（lock 升級 / VACUUM FULL 誤用 / pg_repack version mismatch / concurrent index 失敗清理 / generated stored column 不能 online）、跟 MySQL gh-ost / pt-osc sibling 對比">PostgreSQL Online Schema Change</a> 對照、待後續 customer case audit 補強。</p>
<h2 id="核心機制ddl-是-long-runningtruetime-對齊-schema-version">核心機制：DDL 是 long-running、TrueTime 對齊 schema version</h2>
<h3 id="schema-change-的-lifecycle">Schema change 的 lifecycle</h3>
<p>Spanner DDL 不是同步 ALTER、是 <em>long-running operation</em>。TrueTime 給每次 schema change 分配一個 version timestamp、所有 read / write 用各自 transaction timestamp 對應「當下看到哪個 schema version」。讀者要理解的核心是：DDL 不是「鎖表→改→解鎖」、是「廣播新 schema version、讓現有 transaction 用舊 schema、新 transaction 用新 schema、背景 backfill 物理資料」。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl">時間軸：
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">T0 (DDL 開始)
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">     |
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">     | ──── 舊 schema 仍可用、新 schema metadata 廣播 ────
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">     |
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">T1 (metadata 完成)
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">     |
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">     | ──── 新 transaction 用新 schema、舊 transaction 完成自己 ────
</span></span><span class="line"><span class="ln">10</span><span class="cl">     | ──── backfill 開始（背景）────
</span></span><span class="line"><span class="ln">11</span><span class="cl">     |
</span></span><span class="line"><span class="ln">12</span><span class="cl">T2 (backfill 完成)
</span></span><span class="line"><span class="ln">13</span><span class="cl">     |
</span></span><span class="line"><span class="ln">14</span><span class="cl">     | ──── 新 schema fully serve ────</span></span></code></pre></div><p>DDL 本身瞬間完成的部分是 <em>metadata 廣播</em>（毫秒到秒級）、慢的部分是 <em>backfill</em>（依資料量、可能數小時到數天）。讀者常見誤解是把 metadata 完成當「DDL 完成」、實際 query 還沒走新 index 因為 backfill 沒跑完。</p>
<h3 id="不停機的關鍵不同-ddl-的兩階段行為">不停機的關鍵：不同 DDL 的兩階段行為</h3>
<table>
  <thead>
      <tr>
          <th>DDL 類型</th>
          <th>metadata 行為</th>
          <th>backfill 行為</th>
          <th>阻塞？</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>ADD COLUMN</code>（無 NOT NULL）</td>
          <td>metadata-only、瞬間生效</td>
          <td>不需 backfill（新 column 預設 NULL）</td>
          <td>不阻塞 write</td>
      </tr>
      <tr>
          <td><code>ADD COLUMN</code>（NOT NULL）</td>
          <td>必須兩階段：先 ADD COLUMN with default、後 ADD CONSTRAINT</td>
          <td>兩階段間需 backfill default</td>
          <td>不阻塞 write、但兩階段不能合</td>
      </tr>
      <tr>
          <td><code>CREATE INDEX</code></td>
          <td>metadata 立即</td>
          <td>背景 backfill、不阻塞 write；backfill 完才 serve query</td>
          <td>不阻塞 write、阻塞「該 index 的 query」</td>
      </tr>
      <tr>
          <td><code>DROP COLUMN</code></td>
          <td>metadata 立即</td>
          <td>背景 GC dead column</td>
          <td>不阻塞</td>
      </tr>
      <tr>
          <td><code>ALTER COLUMN TYPE</code></td>
          <td>限制多、查最新文件</td>
          <td>-</td>
          <td>-</td>
      </tr>
  </tbody>
</table>
<p>讀者要記的是：<strong>index backfill 完成前、query 該 index 會 fallback 到 table scan</strong>、用 <code>EXPLAIN</code> 確認 query plan 走新 index 才算真正完成。沒做這層驗證、團隊會以為 CREATE INDEX 已經成功、實際 p99 query latency 還在表掃描的數量級。</p>
<h3 id="interleaved-table-的設計">Interleaved table 的設計</h3>
<p><a href="/blog/backend/knowledge-cards/interleaved-table/" data-link-title="Interleaved Table" data-link-desc="Spanner 把 parent / child table row 物理交錯儲存、parent &#43; child JOIN 不跨 split">Interleaved Table</a> 把 parent table（如 <code>Customer</code>）跟 child table（如 <code>Order</code>）的 row 在 storage 層 <em>物理上交錯儲存</em> — child row 跟對應 parent row 在同一個 split。不是純 foreign key、是 storage layout：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl">傳統 PostgreSQL FK 設計（兩張獨立表）：
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">Customer table:  [c1, c2, c3, ...]  → 一張表、一段 storage range
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">Order table:     [o1, o2, o3, ...]  → 另一張表、另一段 storage range
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">FK 由 planner 在 JOIN 時拼接、可能跨 page / 跨 segment
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">Spanner Interleaved 設計（物理交錯）：
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">Storage layout: [c1, c1.o1, c1.o2, c2, c2.o1, c2.o2, c2.o3, c3, ...]
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">                 |____________________|  |________________|
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">                  c1 + 其 child           c2 + 其 child
</span></span><span class="line"><span class="ln">10</span><span class="cl">                  在同一個 split          在同一個 split</span></span></code></pre></div><p>Interleaved 的效果：parent + child JOIN 在同一個 <a href="/blog/backend/knowledge-cards/range-sharding/" data-link-title="Range Sharding" data-link-desc="分散式 SQL 把 key space 切成可自動 split / merge 的 range、每個 range 自己的 consensus group、application 透明">Range Sharding</a> split 完成、不跨 split = 不跨 Paxos group = 低延遲 transaction。這條設計把「FK 是 logical constraint」翻成「parent-child access pattern 是 physical co-location」、對 access pattern 固定的 workload（customer → orders、user → posts、tenant → records）是巨大 latency benefit。</p>
<h3 id="interleaved-的硬限">Interleaved 的硬限</h3>
<table>
  <thead>
      <tr>
          <th>限制</th>
          <th>影響</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>必須以 parent primary key 為 prefix</td>
          <td>child PK 第一段必須是 parent PK、不能完全自由</td>
      </tr>
      <tr>
          <td>最深 7 層</td>
          <td>深巢狀關係要選層級</td>
      </tr>
      <tr>
          <td><code>ON DELETE</code> 只能 CASCADE 或 NO ACTION</td>
          <td>不像 PG FK 有 SET NULL / SET DEFAULT</td>
      </tr>
      <tr>
          <td>一旦建立、無法直接 ALTER 改 interleave</td>
          <td>要改 → export + recreate + import、不是 ALTER</td>
      </tr>
  </tbody>
</table>
<p>最後一條是讀者最容易踩的雷 — 一開始沒設 interleaved、後悔時要 export-import 100 億 row、是大工程、不是 ALTER。Schema 設計階段要先 audit access pattern、決定哪些 parent-child 該 interleave。</p>
<h3 id="跟通用-fk-概念的差異">跟通用 FK 概念的差異</h3>
<p>PostgreSQL FK 是 logical constraint、JOIN 由 planner 處理；Spanner interleaved 是 physical layout、JOIN cost 跟 single-table access 接近。對應 <a href="/blog/backend/knowledge-cards/transaction-boundary/" data-link-title="Transaction Boundary" data-link-desc="說明哪些資料變更應在同一個交易中一起成功或一起回復">transaction-boundary</a> 卡 — interleaved 讓 transaction boundary 跟 storage boundary 對齊、跨 split transaction 變少、commit wait + Paxos round-trip 也省。</p>
<h2 id="操作流程ddl-跟-interleaved-table-的具體步驟">操作流程：DDL 跟 interleaved table 的具體步驟</h2>
<h3 id="加-column">加 column</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">Orders</span><span class="w"> </span><span class="k">ADD</span><span class="w"> </span><span class="k">COLUMN</span><span class="w"> </span><span class="n">tax_amount</span><span class="w"> </span><span class="n">FLOAT64</span><span class="p">;</span></span></span></code></pre></div><p>執行後拿 long-running operation id、用 <code>gcloud spanner operations list</code> 觀察狀態：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">gcloud spanner operations list --instance<span class="o">=</span>prod --database<span class="o">=</span>app
</span></span><span class="line"><span class="ln">2</span><span class="cl">gcloud spanner operations describe projects/.../operations/&lt;op-id&gt;</span></span></code></pre></div><p>驗證點：operation 顯示 <code>done: true</code> 後、跑 <code>SELECT tax_amount FROM Orders LIMIT 1</code> 確認 column 可查。</p>
<h3 id="加-index">加 index</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="k">INDEX</span><span class="w"> </span><span class="n">OrdersByCustomer</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">Orders</span><span class="p">(</span><span class="n">customer_id</span><span class="p">);</span></span></span></code></pre></div><p>拿 operation id → 用 Monitoring metric <code>spanner.googleapis.com/instance/indexes/backfill_progress</code>（或對應的最新 metric、查官方文件）追蹤進度。Backfill 完成前 query 不會走新 index、要用 <code>EXPLAIN</code> 確認：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">EXPLAIN</span><span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">Orders</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">customer_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;c123&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="c1">-- 應看到 plan 用 OrdersByCustomer index、不是 table scan</span></span></span></code></pre></div><h3 id="創建-interleaved-table">創建 interleaved table</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="o">`</span><span class="k">Order</span><span class="o">`</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">    </span><span class="n">customer_id</span><span class="w"> </span><span class="n">INT64</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">order_id</span><span class="w"> </span><span class="n">INT64</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">    </span><span class="n">amount</span><span class="w"> </span><span class="n">FLOAT64</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">    </span><span class="n">created_at</span><span class="w"> </span><span class="k">TIMESTAMP</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w"> </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">customer_id</span><span class="p">,</span><span class="w"> </span><span class="n">order_id</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w">  </span><span class="n">INTERLEAVE</span><span class="w"> </span><span class="k">IN</span><span class="w"> </span><span class="n">PARENT</span><span class="w"> </span><span class="n">Customer</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="k">DELETE</span><span class="w"> </span><span class="k">CASCADE</span><span class="p">;</span></span></span></code></pre></div><p>關鍵約束：</p>
<ul>
<li>child PK <code>(customer_id, order_id)</code> 第一段是 parent PK</li>
<li><code>ON DELETE CASCADE</code> 是 storage-level — 刪 parent row 自動刪 child row、Spanner 內部處理、不是 trigger</li>
</ul>
<h3 id="從-non-interleaved-改成-interleaved">從 non-interleaved 改成 interleaved</h3>
<p><em>無法直接 ALTER</em>、要走 export-recreate-import：</p>
<ol>
<li>用 Dataflow / <code>gcloud spanner databases export</code> 把舊表 export 到 GCS</li>
<li>建新表（interleaved schema）</li>
<li>用 Dataflow / <code>gcloud spanner databases import</code> 把資料倒回</li>
<li>應用層 cutover（feature flag / dual write）</li>
</ol>
<p>這個流程是 mini-migration、要走完整 <a href="../migrate-from-cloud-sql-pg/">migration playbook</a> 的 phase plan。Schema 設計階段就決定好 interleave、避免後悔成本。</p>
<h3 id="rollback-boundary">Rollback boundary</h3>
<p>DDL 完成前可 <code>gcloud spanner operations cancel</code> 取消；完成後加 index 要 DROP、加 column 要 DROP COLUMN（同樣是 long-running）。讀者要先確認自己在 DDL 哪個階段、cancel 跟 reverse DDL 是兩條不同路徑。</p>
<h2 id="失敗模式5-個-production-踩雷">失敗模式：5 個 production 踩雷</h2>
<h3 id="backfill-時間沒估event-window-撞牆">Backfill 時間沒估、event window 撞牆</h3>
<p>100 億 row 加 index、預期 1 小時、實際 12 小時 — 沒先用 <code>cost</code> 估 + 沒監控進度 metric。事故場景：團隊在 black friday 前一週開 CREATE INDEX、以為週末跑完、實際週末仍在 backfill、event 期間 CPU 升、query latency 退化。</p>
<p>修法：</p>
<ul>
<li>DDL 前用小表 benchmark backfill 速度（rows/sec）、推估大表時間</li>
<li>DDL 期間監控 <code>instance/cpu/smoothed_utilization</code>、若 &gt; 80% 暫停或降流量</li>
<li>大 DDL 排在 capacity headroom 充足的時段、避開 event window</li>
</ul>
<h3 id="interleaved-table-一開始沒設後悔時要-recreate">Interleaved table 一開始沒設、後悔時要 recreate</h3>
<p>100 億 row export-import + cutover 是大工程、不是 ALTER。事故場景：團隊一開始把 Customer / Order 設成獨立表、上線一年後發現 customer → orders access pattern 是 99% 的 query、JOIN 跨 split 付 commit wait + Paxos cost、想改 interleaved、發現要 mini-migration。</p>
<p>修法：</p>
<ul>
<li>Schema 設計階段就 audit access pattern、決定哪些 parent-child 該 interleave</li>
<li>寫 ADR 把 interleave 決策跟業務 access pattern 綁定、避免後悔成本</li>
</ul>
<h3 id="把-interleaved-跟-fk-混為一談">把 interleaved 跟 FK 混為一談</h3>
<p>interleaved 的 <code>ON DELETE CASCADE</code> 是 storage-level、刪 parent 自動刪 child；非 interleaved FK 要 application 或 trigger 處理。事故場景：團隊以為「我加了 FK 就會 CASCADE」、實際非 interleaved table 只是 constraint check、刪 parent 時 child orphan、對帳爆炸。</p>
<p>修法：</p>
<ul>
<li>Schema 設計時明確分類：interleaved（storage-level CASCADE）vs FK constraint（只檢查、不 CASCADE）</li>
<li>非 interleaved 的 parent-child 刪除邏輯放應用層、寫入對帳測試</li>
</ul>
<h3 id="加-not-null-一步到位">加 NOT NULL 一步到位</h3>
<p>直接 <code>ALTER ADD COLUMN x INT64 NOT NULL</code> 會失敗、必須兩階段。事故場景：開發環境 schema 是新建空表、<code>ADD COLUMN NOT NULL</code> OK；production 表有資料、ADD 失敗、團隊以為 Spanner 不支援、回退。</p>
<p>修法：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Phase 1: ADD with default
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">Orders</span><span class="w"> </span><span class="k">ADD</span><span class="w"> </span><span class="k">COLUMN</span><span class="w"> </span><span class="n">tax_amount</span><span class="w"> </span><span class="n">FLOAT64</span><span class="w"> </span><span class="k">DEFAULT</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- 等 backfill 完成
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- Phase 2: ADD CONSTRAINT
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">Orders</span><span class="w"> </span><span class="k">ALTER</span><span class="w"> </span><span class="k">COLUMN</span><span class="w"> </span><span class="n">tax_amount</span><span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">;</span></span></span></code></pre></div><h3 id="schema-change-期間舊-client-還在用舊-schema">Schema change 期間舊 client 還在用舊 schema</h3>
<p>TrueTime 保證 read 看到自己 timestamp 對應的 schema version、但 client SDK cache schema 過期會 retry — 沒處理會看到 transient error。事故場景：DDL 完成後、舊 client session 看到 transient <code>FAILED_PRECONDITION</code>、團隊以為 DDL 失敗、回退。</p>
<p>修法：</p>
<ul>
<li>應用層處理 transient retry（指數退避）</li>
<li>DDL 完成後重新 deploy app instance、避免長期 stale schema cache</li>
</ul>
<h2 id="容量與觀測backfill-是-cpu--io-的額外負載">容量與觀測：Backfill 是 CPU + I/O 的額外負載</h2>
<p>必看 metric：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">spanner.googleapis.com/instance/cpu/smoothed_utilization
</span></span><span class="line"><span class="ln">2</span><span class="cl">   → backfill 期間 CPU 升幅、判讀是否撞 headroom
</span></span><span class="line"><span class="ln">3</span><span class="cl">api/api_request_count for ExecuteSql
</span></span><span class="line"><span class="ln">4</span><span class="cl">   → application traffic 是否受 backfill 影響
</span></span><span class="line"><span class="ln">5</span><span class="cl">long-running operation API progress
</span></span><span class="line"><span class="ln">6</span><span class="cl">   → DDL 自身進度（不是 query 進度）</span></span></code></pre></div><p>Backfill 期間的 capacity impact：DDL 跑在 background priority、但仍佔 CPU、需要在 instance 有足夠 headroom（建議 &lt; 65% CPU baseline 才開大 backfill）。capacity 規劃要把 schema migration 列入 buffer、回 <a href="/blog/backend/09-performance-capacity/capacity-planning/" data-link-title="9.6 容量規劃模型" data-link-desc="peak forecast、headroom budget、growth curve、autoscaling sizing">9.6 容量規劃模型</a>。</p>
<p>Observability evidence：backfill 開始 timestamp、operation id、predicted duration、實際 duration、CPU peak — 全進 incident decision log、回 <a href="/blog/backend/04-observability/observability-evidence-package/" data-link-title="4.20 Observability Evidence Package" data-link-desc="把 log、metric、trace、audit 與資料品質限制包成可交接證據">4.20 Observability Evidence Package</a>。</p>
<p>監控盲點：DDL operation 失敗 silent fail 在 <code>gcloud operations describe</code> 才能看到、Cloud Monitoring 沒有直接 alert。團隊要寫自己的 polling script、operation 失敗時主動 alert、不靠 Cloud Monitoring default。</p>
<h2 id="邊界與整合何時不用-interleaved怎麼跟-pg-對照">邊界與整合：何時不用 interleaved、怎麼跟 PG 對照</h2>
<h3 id="何時不用-interleaved">何時不用 interleaved</h3>
<ul>
<li>小 table（&lt; 1M row、單機可放）：不需要 interleave、用 standard FK 就好</li>
<li>過度 interleave 7 層：把 split 變窄、反而 hot、得不償失</li>
<li>access pattern 不是 parent-child JOIN：interleave 沒 benefit、純粹給 schema 加複雜度</li>
</ul>
<h3 id="跟-postgresql-的對照">跟 PostgreSQL 的對照</h3>
<p><a href="/blog/backend/01-database/vendors/postgresql/online-schema-change/" data-link-title="PostgreSQL Online Schema Change：先用 ALTER 內建特性、不能解才 pg_repack / pg-osc" data-link-desc="PostgreSQL ALTER TABLE 對多數變更已是 *fast catalog-only*（add column nullable / drop column / 改 default），不必走 ghost table tool。本文走 PG 內建 fast DDL 行為、何時必須走 pg_repack / pg-osc、兩工具機制對比（trigger-based vs WAL-shipping）、配置 step-by-step、5 production 踩雷（lock 升級 / VACUUM FULL 誤用 / pg_repack version mismatch / concurrent index 失敗清理 / generated stored column 不能 online）、跟 MySQL gh-ost / pt-osc sibling 對比">PostgreSQL Online Schema Change</a> 用 pg_repack / pt-osc workflow 模擬「不停機」 — 實際是用 trigger + 影子表 + cutover 把 lock 時間壓到秒級、不是真正瞬間。Spanner 是原生支援 DDL long-running operation、不需要外掛工具、但 backfill 時間在大表上仍長、跟 pg_repack 在大表上的執行時間量級接近。</p>
<p>差異點：</p>
<table>
  <thead>
      <tr>
          <th>維度</th>
          <th>PostgreSQL（pg_repack / pt-osc）</th>
          <th>Spanner</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Lock 時間</td>
          <td>秒級（cutover 時短鎖）</td>
          <td>毫秒（metadata 廣播）</td>
      </tr>
      <tr>
          <td>Backfill 時間</td>
          <td>數小時</td>
          <td>數小時</td>
      </tr>
      <tr>
          <td>工具</td>
          <td>外掛</td>
          <td>原生</td>
      </tr>
      <tr>
          <td>Schema version</td>
          <td>單版</td>
          <td>TrueTime timestamp 對齊多版並存</td>
      </tr>
      <tr>
          <td>大表加 NOT NULL</td>
          <td>一步到位（搭配 default）</td>
          <td>必須兩階段</td>
      </tr>
  </tbody>
</table>
<p>讀者選 Spanner 不是為了「DDL 更快」、是為了「不依賴外掛 + 多版本並存」。實際在大表上的耗時兩邊差不多。</p>
<h3 id="sibling-deep-articles">Sibling deep articles</h3>
<ul>
<li><a href="../truetime-api-depth/">truetime-api-depth</a>：schema version 也是 TrueTime timestamp、跟 transaction timestamp 同層機制</li>
<li><a href="../migrate-from-cloud-sql-pg/">migrate-from-cloud-sql-pg</a>：target schema 設計含 interleaved、Phase 1 必讀本文</li>
<li><a href="../consistency-models-comparison/">consistency-models-comparison</a>：schema change 期間多版本並存的一致性保證</li>
</ul>
<h3 id="跟-1x-章節">跟 1.x 章節</h3>
<p><a href="/blog/backend/01-database/schema-design/" data-link-title="1.2 Schema Design 與資料建模" data-link-desc="整理 table、index、key、partition、denormalization 與命名規則">Schema Design</a> — interleaved 是 schema 設計的物理層決策、不是純 logical design。對照 <a href="/blog/backend/01-database/schema-migration-rollout-evidence/" data-link-title="1.7 Schema Migration Rollout 證據（Schema Migration Rollout Evidence）實作示範" data-link-desc="以訂單付款狀態欄位演進示範 schema migration 如何產出 evidence、release gate 與 incident decision log。">schema-migration-rollout-evidence</a> 看 schema rollout 的 evidence 收集模式。</p>
<h3 id="anti-recommendation">Anti-recommendation</h3>
<p>讀者讀完本文應該能判斷：interleaved 不是「強制使用」的 feature、是「access pattern 固定時的 latency benefit」。小規模 OLTP、access pattern 不確定的 workload、用 standard PostgreSQL FK 就好、為 interleaved 付 schema 後悔成本的判準很高。</p>
]]></content:encoded></item></channel></rss>