<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Partitioning on Tarragon</title><link>https://tarrragon.github.io/blog/tags/partitioning/</link><description>Recent content in Partitioning on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Fri, 22 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/partitioning/index.xml" rel="self" type="application/rss+xml"/><item><title>MySQL Partitioning: the five-stage partition lifecycle, and how "horizontal splitting within one instance" differs from Vitess sharding</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/partitioning/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/partitioning/</guid><description>&lt;blockquote>
&lt;p>This article is an implementation-layer deep dive that accompanies the &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL&lt;/a> overview. The overview covers where MySQL sits in the OLTP landscape; this article focuses on &lt;em>native partitioning&lt;/em>: the five-stage lifecycle, the four partition types, and a comparison with Vitess sharding and PostgreSQL partitioning.&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;h2 id="partition-lifecycle-五段">Partition lifecycle 五段&lt;/h2>
&lt;p>MySQL native partitioning &lt;em>splits one logical table into multiple physical sub-tables within the same instance&lt;/em>, so the optimizer can scan only the relevant partitions. The partition lifecycle has five stages:&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">Design choose the partition key / type / count
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl"> ↓
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">Create CREATE TABLE ... PARTITION BY ...
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl"> ↓
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">Query WHERE clause + partition pruning
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl"> ↓
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">Maintenance ADD / DROP / REORGANIZE / EXCHANGE
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">8&lt;/span>&lt;span class="cl"> ↓
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">9&lt;/span>&lt;span class="cl">Drop remove a whole partition at once (can be ~1000x faster than DELETE FROM)&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Each stage carries its own engineering decisions. Choosing the wrong partition key at design time breaks both maintenance and queries later on.&lt;/p>
&lt;p>Compared with &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/vitess-sharding/" data-link-title="MySQL Vitess Sharding：VTGate / VTTablet / VReplication / VSchema 四件套協作" data-link-desc="Vitess 不只是 MySQL sharding proxy、是 4 個 component 協作的完整 sharding 系統 — VTGate（query routing layer）、VTTablet（per-MySQL agent）、VReplication（跨 shard 資料移動）、VSchema（sharding metadata）。本文走 4 件套各自責任、keyspace / shard / tablet 架構、shard key 設計（Vindex）、配置 step-by-step、5 production 踩雷（cross-shard transaction / VStream lag / Vindex 不均勻 / resharding 切流 / VReplication 卡住）、跟自管 sharding 跟 PlanetScale 的對比">Vitess sharding&lt;/a>:&lt;/p>
&lt;ul>
&lt;li>&lt;em>MySQL partitioning&lt;/em>: same instance, automatic pruning by the optimizer, no cross-instance network cost&lt;/li>
&lt;li>&lt;em>Vitess sharding&lt;/em>: spans instances, the application routes through VTGate, can scale linearly&lt;/li>
&lt;/ul>
&lt;p>The two do not conflict and can be combined: inside each Vitess shard you can &lt;em>additionally&lt;/em> use MySQL partitions (for example, 16 shards, with each shard's tables further partitioned by month).&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>This article is an implementation-layer deep dive that accompanies the <a href="/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL</a> overview. The overview covers where MySQL sits in the OLTP landscape; this article focuses on <em>native partitioning</em>: the five-stage lifecycle, the four partition types, and a comparison with Vitess sharding and PostgreSQL partitioning.</p></blockquote>
<hr>
<h2 id="partition-lifecycle-五段">Partition lifecycle 五段</h2>
<p>MySQL native partitioning <em>splits one logical table into multiple physical sub-tables within the same instance</em>, so the optimizer can scan only the relevant partitions. The partition lifecycle has five stages:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">Design       choose the partition key / type / count
</span></span><span class="line"><span class="ln">2</span><span class="cl">   ↓
</span></span><span class="line"><span class="ln">3</span><span class="cl">Create       CREATE TABLE ... PARTITION BY ...
</span></span><span class="line"><span class="ln">4</span><span class="cl">   ↓
</span></span><span class="line"><span class="ln">5</span><span class="cl">Query        WHERE clause + partition pruning
</span></span><span class="line"><span class="ln">6</span><span class="cl">   ↓
</span></span><span class="line"><span class="ln">7</span><span class="cl">Maintenance  ADD / DROP / REORGANIZE / EXCHANGE
</span></span><span class="line"><span class="ln">8</span><span class="cl">   ↓
</span></span><span class="line"><span class="ln">9</span><span class="cl">Drop         remove a whole partition at once (can be ~1000x faster than DELETE FROM)</span></span></code></pre></div><p>Each stage carries its own engineering decisions. Choosing the wrong partition key at design time breaks both maintenance and queries later on.</p>
<p>Compared with <a href="/blog/backend/01-database/vendors/mysql/vitess-sharding/" data-link-title="MySQL Vitess Sharding：VTGate / VTTablet / VReplication / VSchema 四件套協作" data-link-desc="Vitess 不只是 MySQL sharding proxy、是 4 個 component 協作的完整 sharding 系統 — VTGate（query routing layer）、VTTablet（per-MySQL agent）、VReplication（跨 shard 資料移動）、VSchema（sharding metadata）。本文走 4 件套各自責任、keyspace / shard / tablet 架構、shard key 設計（Vindex）、配置 step-by-step、5 production 踩雷（cross-shard transaction / VStream lag / Vindex 不均勻 / resharding 切流 / VReplication 卡住）、跟自管 sharding 跟 PlanetScale 的對比">Vitess sharding</a>:</p>
<ul>
<li><em>MySQL partitioning</em>: same instance, automatic pruning by the optimizer, no cross-instance network cost</li>
<li><em>Vitess sharding</em>: spans instances, the application routes through VTGate, can scale linearly</li>
</ul>
<p>The two do not conflict and can be combined: inside each Vitess shard you can <em>additionally</em> use MySQL partitions (for example, 16 shards, with each shard's tables further partitioned by month).</p>
<h2 id="4-種-partition-type">Four partition types</h2>
<h3 id="range-partitioning--連續區間切割">RANGE partitioning: splitting by contiguous ranges</h3>
<p>The most common type; it suits time-series data and continuous numeric values:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w">    </span><span class="n">id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="w"> </span><span class="n">AUTO_INCREMENT</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">    </span><span class="n">user_id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="n">amount</span><span class="w"> </span><span class="nb">DECIMAL</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span><span class="mi">2</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="n">created_at</span><span class="w"> </span><span class="n">DATETIME</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">    </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">created_at</span><span class="p">)</span><span class="w">              </span><span class="c1">-- the PK must include the partition key column
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="c1"></span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">RANGE</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="n">created_at</span><span class="p">))</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p202601</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">LESS</span><span class="w"> </span><span class="k">THAN</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="s1">&#39;2026-02-01&#39;</span><span class="p">)),</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p202602</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">LESS</span><span class="w"> </span><span class="k">THAN</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="s1">&#39;2026-03-01&#39;</span><span class="p">)),</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p202603</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">LESS</span><span class="w"> </span><span class="k">THAN</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="s1">&#39;2026-04-01&#39;</span><span class="p">)),</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p_future</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">LESS</span><span class="w"> </span><span class="k">THAN</span><span class="w"> </span><span class="k">MAXVALUE</span><span class="w">  </span><span class="c1">-- fallback for future data
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="c1"></span><span class="p">);</span></span></span></code></pre></div><p>Advantages:</p>
<ul>
<li>Efficient partition pruning for time-range queries</li>
<li>Retiring a whole month of data is a single <code>ALTER TABLE orders DROP PARTITION p202601</code>, which typically completes in milliseconds</li>
</ul>
<p>Disadvantages:</p>
<ul>
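<li>Future partitions must be <em>created in advance</em> (or caught by the <code>p_future</code> fallback, but once the fallback partition grows large, pruning loses its value)</li>
<li><em>Hot partition</em>: the newest partition receives every INSERT while the others hold only historical data</li>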
</ul>
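<p>One way to keep the <code>p_future</code> fallback from quietly filling up is to check per-partition row counts on a schedule. The query below is a minimal sketch that assumes the <code>orders</code> table above lives in the current schema; <code>TABLE_ROWS</code> is only an estimate for InnoDB, so treat it as a signal rather than an exact count:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- List the partitions of orders with their upper bounds and estimated row counts.
SELECT PARTITION_NAME,
       PARTITION_DESCRIPTION,   -- upper bound (a TO_DAYS value) or MAXVALUE
       TABLE_ROWS               -- an estimate for InnoDB tables
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'orders'
ORDER BY PARTITION_ORDINAL_POSITION;
</code></pre></div>
<p>If <code>p_future</code> reports more than a handful of rows, the next monthly partition is probably overdue.</p>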
<h3 id="list-partitioning--離散值切割">LIST partitioning: splitting by discrete values</h3>
<p>Suits enum-like values:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">users</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w">    </span><span class="n">id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">    </span><span class="n">name</span><span class="w"> </span><span class="nb">VARCHAR</span><span class="p">(</span><span class="mi">100</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="n">region</span><span class="w"> </span><span class="nb">VARCHAR</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">region</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w"></span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">LIST</span><span class="w"> </span><span class="n">COLUMNS</span><span class="w"> </span><span class="p">(</span><span class="n">region</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p_asia</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">IN</span><span class="w"> </span><span class="p">(</span><span class="s1">&#39;TW&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;JP&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;KR&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;CN&#39;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p_americas</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">IN</span><span class="w"> </span><span class="p">(</span><span class="s1">&#39;US&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;CA&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;BR&#39;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p_emea</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">IN</span><span class="w"> </span><span class="p">(</span><span class="s1">&#39;GB&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;DE&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;FR&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;IT&#39;</span><span class="p">)</span><span class="w">
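</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w"></span><span class="p">);</span></span></span></code></pre></div><p>Advantages: enum-like values map directly to a partition, so pruning is straightforward.</p>
<p>Disadvantages: a partition's value list cannot be edited in place (there is no <code>ALTER PARTITION ADD VALUE</code>), so adding a new country code to an existing partition requires a REORGANIZE. Until then, an INSERT whose value matches no partition fails with an error.</p>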
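<p>As a sketch of what that looks like against the <code>users</code> table above: the country codes <code>'SG'</code>, <code>'AU'</code> and <code>'NZ'</code> and the partition name <code>p_oceania</code> are assumptions made up for this illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Extend an existing partition's value list: p_asia is rebuilt with 'SG' added.
-- REORGANIZE copies the rows of the affected partition, so cost grows with its size.
ALTER TABLE users REORGANIZE PARTITION p_asia INTO (
    PARTITION p_asia VALUES IN ('TW', 'JP', 'KR', 'CN', 'SG')
);

-- Alternative: put values that no partition covers yet into a brand-new partition.
-- This does not touch existing rows.
ALTER TABLE users ADD PARTITION (
    PARTITION p_oceania VALUES IN ('AU', 'NZ')
);
</code></pre></div>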
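<h3 id="hash-partitioning--均勻分布">HASH partitioning: even distribution</h3>
<p>Distributes rows by the value of an integer column (or an expression that returns an integer) modulo the partition count; string columns need KEY partitioning instead:</p>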





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">    </span><span class="n">id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">user_id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">    </span><span class="n">event_type</span><span class="w"> </span><span class="nb">VARCHAR</span><span class="p">(</span><span class="mi">50</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">    </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">user_id</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w"></span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">HASH</span><span class="w"> </span><span class="p">(</span><span class="n">user_id</span><span class="p">)</span><span class="w"> </span><span class="n">PARTITIONS</span><span class="w"> </span><span class="mi">8</span><span class="p">;</span></span></span></code></pre></div><p>Advantages: even distribution with no hot partition.</p>
<p>Disadvantages:</p>
<ul>
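<li><em>Range queries get no benefit</em>: <code>WHERE user_id BETWEEN 100 AND 200</code> covers consecutive values that land in every partition, so nothing is pruned and all partitions are scanned</li>
<li>Changing the partition count (<code>ADD PARTITION PARTITIONS n</code> or <code>COALESCE PARTITION n</code>) redistributes rows across the whole table</li>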
</ul>
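<p>Both drawbacks can be observed on the <code>events</code> table above. This is a minimal sketch; the partition counts in the comments follow from the 8-partition definition, and the actual <code>EXPLAIN</code> output should be checked on your own server:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Equality on the hash column: the partitions column should list a single partition.
EXPLAIN SELECT * FROM events WHERE user_id = 100;

-- Range on the hash column: expect all 8 partitions to be listed.
EXPLAIN SELECT * FROM events WHERE user_id BETWEEN 100 AND 200;

-- Grow from 8 to 12 partitions; existing rows are redistributed.
ALTER TABLE events ADD PARTITION PARTITIONS 4;

-- Shrink back from 12 to 8 partitions; rows are redistributed again.
ALTER TABLE events COALESCE PARTITION 4;
</code></pre></div>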
<h3 id="key-partitioning--mysql-內部-hash">KEY partitioning: MySQL's internal hash</h3>
<p>Similar to HASH, but uses MySQL's internal hash function, so the column does not have to be an integer:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">sessions</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">    </span><span class="n">session_id</span><span class="w"> </span><span class="nb">VARCHAR</span><span class="p">(</span><span class="mi">64</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">user_id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">    </span><span class="k">data</span><span class="w"> </span><span class="nb">TEXT</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">    </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">session_id</span><span class="p">,</span><span class="w"> </span><span class="n">user_id</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w"></span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">user_id</span><span class="p">)</span><span class="w"> </span><span class="n">PARTITIONS</span><span class="w"> </span><span class="mi">16</span><span class="p">;</span></span></span></code></pre></div><p>Use it to distribute rows evenly on <em>string columns or composite columns</em>. In typical workloads the effect is close to HASH.</p>
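<p>The example above still partitions on an integer column, so here is a hedged variant that partitions on the string key itself, which HASH cannot do. The table name <code>sessions_by_id</code> is hypothetical and exists only for this illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical variant of the sessions table, partitioned on the VARCHAR key.
CREATE TABLE sessions_by_id (
    session_id VARCHAR(64) NOT NULL,
    user_id BIGINT NOT NULL,
    data TEXT,
    PRIMARY KEY (session_id)          -- the PK contains the partitioning column
)
PARTITION BY KEY (session_id) PARTITIONS 16;

-- Check the partitions column to see whether an equality lookup is pruned.
EXPLAIN SELECT * FROM sessions_by_id WHERE session_id = 'abc123';
</code></pre></div>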
<h3 id="sub-partitioning--兩層切割">Sub-partitioning: two-level splitting</h3>
<p>Combines RANGE with HASH for a finer-grained split:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">big_events</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w">    </span><span class="n">id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">    </span><span class="n">user_id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="n">created_at</span><span class="w"> </span><span class="n">DATETIME</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">created_at</span><span class="p">,</span><span class="w"> </span><span class="n">user_id</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w"></span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">RANGE</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="n">created_at</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="n">SUBPARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">HASH</span><span class="w"> </span><span class="p">(</span><span class="n">user_id</span><span class="p">)</span><span class="w"> </span><span class="n">SUBPARTITIONS</span><span class="w"> </span><span class="mi">4</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p202601</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">LESS</span><span class="w"> </span><span class="k">THAN</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="s1">&#39;2026-02-01&#39;</span><span class="p">)),</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p202602</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">LESS</span><span class="w"> </span><span class="k">THAN</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="s1">&#39;2026-03-01&#39;</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w"></span><span class="p">);</span></span></span></code></pre></div><p>Each RANGE partition is further split into 4 HASH sub-partitions, giving 2 × 4 = 8 physical storage locations. This suits access patterns with two dimensions: <em>a time range plus a user_id hash</em>.</p>
<p>It is rarely used in practice: the complexity is high and query plans are hard to tune. Single-level partitioning is enough for most cases.</p>
<h2 id="partition-pruning--optimizer-怎麼選-partition">Partition Pruning: how the optimizer picks partitions</h2>
<p><code>EXPLAIN SELECT ...</code> shows which partitions a query touches in its <code>partitions</code> column (the older <code>EXPLAIN PARTITIONS</code> form was deprecated in MySQL 5.7 and removed in 8.0):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">EXPLAIN</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="k">BETWEEN</span><span class="w"> </span><span class="s1">&#39;2026-02-15&#39;</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="s1">&#39;2026-02-20&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="o">+</span><span class="c1">----+-------------+--------+------------+-------+
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"></span><span class="o">|</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">select_type</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="k">table</span><span class="w">  </span><span class="o">|</span><span class="w"> </span><span class="n">partitions</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="k">type</span><span class="w">  </span><span class="o">|</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="o">+</span><span class="c1">----+-------------+--------+------------+-------+
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1"></span><span class="o">|</span><span class="w">  </span><span class="mi">1</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="k">SIMPLE</span><span class="w">      </span><span class="o">|</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">p202602</span><span class="w">    </span><span class="o">|</span><span class="w"> </span><span class="n">range</span><span class="w"> </span><span class="o">|</span><span class="w">
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="w"></span><span class="o">+</span><span class="c1">----+-------------+--------+------------+-------+</span></span></span></code></pre></div><p>Only <code>p202602</code> is touched; the other partitions are not scanned.</p>
<p><strong>Cases where pruning fails</strong>:</p>
<ol>
<li>
<p><strong>Function on partition key</strong>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">WHERE</span><span class="w"> </span><span class="k">YEAR</span><span class="p">(</span><span class="n">created_at</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">2026</span><span class="w">  </span><span class="c1">-- no pruning, scans every partition</span></span></span></code></pre></div><p>Write it as a range on the bare column instead:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">WHERE</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">&gt;=</span><span class="w"> </span><span class="s1">&#39;2026-01-01&#39;</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="s1">&#39;2027-01-01&#39;</span></span></span></code></pre></div></li>
<li>
<p><strong>Implicit conversion</strong>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">WHERE</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;2026-02-15&#39;</span><span class="w">  </span><span class="c1">-- string vs DATETIME, pruning may fail</span></span></span></code></pre></div><p>Prefer a typed literal:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">WHERE</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">TIMESTAMP</span><span class="w"> </span><span class="s1">&#39;2026-02-15 00:00:00&#39;</span></span></span></code></pre></div></li>
<li>
<p><strong>OR across partitions</strong>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">WHERE</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;2026-02-15&#39;</span><span class="w"> </span><span class="k">OR</span><span class="w"> </span><span class="n">user_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">100</span><span class="w">  </span><span class="c1">-- OR between the partition column and a non-partition column scans every partition</span></span></span></code></pre></div></li>
<li>
<p><strong>JOIN that does not filter on the partition key</strong>: when the join condition does not include the partition key, the optimizer usually cannot prune.</p>
</li>
</ol>
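<p>One way to confirm whether a predicate is actually pruned is to read the <code>partitions</code> column of <code>EXPLAIN</code> (shown by default in MySQL 5.7 and later). The sketch below assumes the <code>orders</code> table used in this article, partitioned monthly on <code>created_at</code>; the expectations in the comments are what the rules above predict, not captured output.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Range predicate on the bare partition column:
-- the partitions column should list only the month(s) overlapping the range.
EXPLAIN
SELECT id, user_id
FROM orders
WHERE created_at &gt;= '2026-02-01' AND created_at &lt; '2026-03-01';

-- OR with a non-partition column:
-- expect the partitions column to list every partition.
EXPLAIN
SELECT id, user_id
FROM orders
WHERE created_at = '2026-02-15' OR user_id = 100;
</code></pre></div>
<p>Running both statements before and after a query rewrite makes a pruning regression visible without waiting for it to show up in latency.</p>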
<h2 id="partition-maintenance--add--drop--reorganize--exchange">Partition Maintenance — ADD / DROP / REORGANIZE / EXCHANGE</h2>
<h3 id="add-partition">ADD partition</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">ADD</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p202604</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">LESS</span><span class="w"> </span><span class="k">THAN</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="s1">&#39;2026-05-01&#39;</span><span class="p">))</span><span class="w">
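</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="p">);</span></span></span></code></pre></div><p>For RANGE this is simple, but <code>ADD PARTITION</code> can only append at the high end, so the new partition must sort <em>before any MAXVALUE partition</em>. If the table already has a catch-all such as <code>p_future</code>, split that partition with REORGANIZE instead.</p>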
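<p>Before adding a partition it helps to see which bounds already exist. A minimal sketch against <code>information_schema.PARTITIONS</code>, assuming the table is named <code>orders</code>, lives in the current schema, and is partitioned by <code>RANGE (TO_DAYS(created_at))</code> as in this article:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- List partitions in order, translating the TO_DAYS() bound back to a date.
-- A MAXVALUE partition has no numeric bound, so it is reported as NULL.
SELECT
    PARTITION_ORDINAL_POSITION AS pos,
    PARTITION_NAME,
    CASE
        WHEN PARTITION_DESCRIPTION = 'MAXVALUE' THEN NULL
        ELSE FROM_DAYS(PARTITION_DESCRIPTION)
    END AS upper_bound_exclusive
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'orders'
ORDER BY PARTITION_ORDINAL_POSITION;
</code></pre></div>
<p>If the last row is a MAXVALUE partition, <code>ADD PARTITION</code> will be rejected and the REORGANIZE route is the one to take.</p>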
<h3 id="drop-partition">DROP partition</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">DROP</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p202601</span><span class="p">;</span></span></span></code></pre></div><p>This removes the partition's data file instead of deleting rows one by one, so it typically completes in milliseconds once it has acquired its metadata lock. It is <em>the biggest advantage for time-series archiving</em>: compare it with <code>DELETE FROM orders WHERE created_at &lt; '...'</code>, which can run for hours on a large table.</p>
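<p>Because <code>DROP PARTITION</code> is irreversible, a quick look at what the partition holds is cheap insurance. A minimal sketch using explicit partition selection (available since MySQL 5.6), with the partition name taken from the example above:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Reads only p202601: how many rows, and what time span, would be discarded?
SELECT COUNT(*)        AS rows_in_partition,
       MIN(created_at) AS oldest_row,
       MAX(created_at) AS newest_row
FROM orders PARTITION (p202601);

-- Only once the numbers match expectations:
ALTER TABLE orders DROP PARTITION p202601;
</code></pre></div>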
<h3 id="reorganize-partition">REORGANIZE partition</h3>
<p>Split or merge partitions:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Split: carve p202604 out of p_future and recreate p_future after it
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="n">REORGANIZE</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p_future</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p202604</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">LESS</span><span class="w"> </span><span class="k">THAN</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="s1">&#39;2026-05-01&#39;</span><span class="p">)),</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">    </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p_future</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">LESS</span><span class="w"> </span><span class="k">THAN</span><span class="w"> </span><span class="k">MAXVALUE</span><span class="w">
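</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="p">);</span></span></span></code></pre></div><p>REORGANIZE <em>rewrites the data</em> of the partitions involved, so its cost grows with their size: splitting a still-empty <code>p_future</code> is quick, while reorganizing a large partition is as slow as an online schema change. For large partitions, emulate the change with gh-ost / pt-osc (using a ghost table).</p>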
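<p>The merge direction uses the same statement. A minimal sketch, assuming two adjacent monthly partitions <code>p202601</code> and <code>p202602</code> exist (the second name and the merged name <code>p2026_01_02</code> are hypothetical, following this article's naming pattern):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Merge: fold two adjacent monthly partitions into one.
-- The replacement must cover the same overall range as the partitions
-- it replaces, so its upper bound equals the old upper bound of p202602.
ALTER TABLE orders REORGANIZE PARTITION p202601, p202602 INTO (
    PARTITION p2026_01_02 VALUES LESS THAN (TO_DAYS('2026-03-01'))
);
</code></pre></div>
<p>Like a split, a merge rewrites every row in the listed partitions, so the same size caveat applies.</p>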
<h3 id="exchange-partition">EXCHANGE partition</h3>
<p>Swap a partition with a <em>standalone table</em> (no data is copied):</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Create a staging table with the same schema as the partitioned table
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders_staging</span><span class="w"> </span><span class="k">LIKE</span><span class="w"> </span><span class="n">orders</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders_staging</span><span class="w"> </span><span class="n">REMOVE</span><span class="w"> </span><span class="n">PARTITIONING</span><span class="p">;</span><span class="w">  </span><span class="c1">-- the staging table must be non-partitioned
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- Atomically swap the archive partition's data into staging
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="n">EXCHANGE</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">p202601</span><span class="w"> </span><span class="k">WITH</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders_staging</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w">
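</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="w"></span><span class="c1">-- orders_staging now holds the rows of p202601, and p202601 in orders is empty
</span></span></span><span class="line"><span class="ln">9</span><span class="cl"><span class="c1">-- staging can now be dumped to S3 or INSERTed into an archive DB</span></span></span></code></pre></div><p><code>EXCHANGE PARTITION</code> is a <em>metadata operation</em>: no data is copied, so swapping a partition out into an empty staging table typically completes in milliseconds. By default MySQL validates the standalone table's rows against the partition bounds, so swapping a large populated table <em>into</em> a partition takes longer. It is the core tool of a time-series archive workflow.</p>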
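<p>After the swap, the workflow usually ends by parking the staged rows under a stable name and removing the empty partition. A minimal sketch of that tail end; the archive table name <code>orders_archive_202601</code> is hypothetical:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Sanity check: the partition should be empty, staging should hold the rows.
SELECT COUNT(*) AS left_in_partition FROM orders PARTITION (p202601);
SELECT COUNT(*) AS moved_to_staging  FROM orders_staging;

-- Keep the archived month under a descriptive name so that the next
-- archive run can create a fresh orders_staging.
RENAME TABLE orders_staging TO orders_archive_202601;

-- Remove the now-empty partition from the live table.
ALTER TABLE orders DROP PARTITION p202601;
</code></pre></div>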
<h2 id="5-個-production-踩雷">5 Production Pitfalls</h2>
<h3 id="1-pk-必須含-partition-key--schema-設計受限">1. The PK must include the partition key: schema design is constrained</h3>
<p>MySQL's partitioning rule: <strong>the PK must include every partition key column</strong>.</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Wrong: the PK does not include the partition key
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="w"> </span><span class="n">AUTO_INCREMENT</span><span class="w"> </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="p">,</span><span class="w">  </span><span class="c1">-- id only
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="w">    </span><span class="n">created_at</span><span class="w"> </span><span class="n">DATETIME</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">RANGE</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="n">created_at</span><span class="p">))</span><span class="w"> </span><span class="p">(...);</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="c1">-- ERROR 1503: A PRIMARY KEY must include all columns in the table&#39;s partitioning function</span></span></span></code></pre></div>




<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Right: the PK includes the partition key
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="w"> </span><span class="n">AUTO_INCREMENT</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">    </span><span class="n">created_at</span><span class="w"> </span><span class="n">DATETIME</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">    </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">created_at</span><span class="p">)</span><span class="w">  </span><span class="c1">-- both columns are part of the PK
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"></span><span class="p">)</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">RANGE</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="n">created_at</span><span class="p">))</span><span class="w"> </span><span class="p">(...);</span></span></span></code></pre></div><p>Fixes:</p>
<ul>
<li>Accept a composite PK (id + the partition key column)</li>
<li>AUTO_INCREMENT still works, but every INSERT must supply <code>created_at</code></li>
<li><em>Unique constraints are affected too</em>: every UNIQUE index must include the partition key</li>
</ul>
<p>For the application: <code>WHERE id = X</code> still works but is slow, because without a partition-key predicate nothing is pruned and every partition is probed. Use <code>WHERE id = X AND created_at &gt;= ...</code> to keep the lookup efficient.</p>
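<p>In application terms, the composite PK changes both the write and the read path. A minimal sketch against the corrected <code>orders</code> table above; the id <code>12345</code> and the one-month window are illustrative values, and the window is assumed to be something the caller already knows (for example from the order date it is displaying):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Write path: created_at must be supplied; id is still generated.
INSERT INTO orders (created_at) VALUES ('2026-02-15 10:30:00');

-- Read path, id only: correct result, but every partition is probed.
SELECT * FROM orders WHERE id = 12345;

-- Read path, id plus a created_at window: the range lets MySQL prune
-- to the partition(s) covering February 2026 before looking up the id.
SELECT *
FROM orders
WHERE id = 12345
  AND created_at &gt;= '2026-02-01'
  AND created_at &lt;  '2026-03-01';
</code></pre></div>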
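<h3 id="2-global-index-沒原生支援">2. No native global index</h3>
<p>MySQL partitioning has <em>no global secondary index</em> (PostgreSQL declarative partitioning does not provide one either). Each partition has its own local index, so a unique constraint that spans partitions must <em>include the partition key</em>.</p>
<p>Example: you want <code>user_id</code> to be unique across the whole table, but the table is partitioned by <code>created_at</code>:</p>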





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- MySQL rejects UNIQUE KEY (user_id) on this table: the key must include created_at
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="w"> </span><span class="n">AUTO_INCREMENT</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">    </span><span class="n">user_id</span><span class="w"> </span><span class="nb">BIGINT</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">    </span><span class="n">created_at</span><span class="w"> </span><span class="n">DATETIME</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">    </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">created_at</span><span class="p">),</span><span class="w">
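</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w">    </span><span class="k">UNIQUE</span><span class="w"> </span><span class="k">KEY</span><span class="w"> </span><span class="p">(</span><span class="n">user_id</span><span class="p">,</span><span class="w"> </span><span class="n">created_at</span><span class="p">)</span><span class="w">  </span><span class="c1">-- must include created_at, not user_id alone
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="c1"></span><span class="p">)</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">RANGE</span><span class="w"> </span><span class="p">(</span><span class="n">TO_DAYS</span><span class="p">(</span><span class="n">created_at</span><span class="p">))</span><span class="w"> </span><span class="p">(...);</span></span></span></code></pre></div><p>Note that <code>UNIQUE KEY (user_id, created_at)</code> only rejects duplicate (user, timestamp) pairs; it does not make <code>user_id</code> unique. For the application, uniqueness across partitions has to be <em>enforced in the application layer</em> (a SELECT check before INSERT, which is racy unless the check is serialized) or delegated to a unique lookup Vindex in Vitess (for example <code>lookup_hash_unique</code>, since plain <code>lookup_hash</code> is non-unique).</p>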
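<p>One possible shape for the application-layer option is a small unpartitioned guard table whose primary key does the uniqueness check, which avoids the race in SELECT-then-INSERT. This is a sketch under assumptions, not something the original workflow prescribes: the table name <code>order_user_guard</code> is hypothetical, and both tables are assumed to be InnoDB so the two inserts share one transaction.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical guard table: one row per user_id, not partitioned,
-- so its PRIMARY KEY is enforced across the whole data set.
CREATE TABLE order_user_guard (
    user_id BIGINT NOT NULL,
    PRIMARY KEY (user_id)
) ENGINE=InnoDB;

START TRANSACTION;

-- Fails with a duplicate-key error (1062) if this user_id is already
-- claimed; the application should then ROLLBACK and report the conflict.
INSERT INTO order_user_guard (user_id) VALUES (100);

-- Reached only when the guard insert succeeded.
INSERT INTO orders (user_id, created_at) VALUES (100, NOW());

COMMIT;
</code></pre></div>
<p>The trade-off is an extra write per insert and a second table that archive jobs must keep in sync when old partitions are dropped.</p>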
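<h3 id="3-exchange-partition--schema-必須完全一致">3. EXCHANGE PARTITION: schemas must match exactly</h3>
<p>A common EXCHANGE failure: the staging table differs from the partitioned table by <em>a single index or column position</em>, and the statement fails with <code>ERROR 1736: Tables have different definitions</code>.</p>
<p>Fixes:</p>
<ul>
<li>Create the staging table with <code>CREATE TABLE staging LIKE orders</code> instead of writing the DDL by hand</li>
<li>Verify the schema immediately after <code>REMOVE PARTITIONING</code></li>
<li>When an OSC changes the schema, change the partitioned table and the staging table together; do not miss either one</li>
</ul>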
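<p>The "verify schema" step can be scripted. A minimal sketch that compares column definitions through <code>information_schema.COLUMNS</code>, assuming both tables live in the current schema under the names used above. It covers columns only; indexes would need a similar comparison against <code>information_schema.STATISTICS</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Every column should appear in both tables with the same position,
-- type and nullability. Any row returned is a mismatch; an empty
-- result means the column definitions line up.
SELECT COLUMN_NAME,
       ORDINAL_POSITION,
       COLUMN_TYPE,
       IS_NULLABLE,
       GROUP_CONCAT(TABLE_NAME) AS present_in
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('orders', 'orders_staging')
GROUP BY COLUMN_NAME, ORDINAL_POSITION, COLUMN_TYPE, IS_NULLABLE
HAVING COUNT(*) &lt;&gt; 2;
</code></pre></div>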
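<h3 id="4-orphan-partition--future-partition-預先建忘記延展">4. Orphan partition: pre-created future partitions that never get extended</h3>
<p>A cron job creates next month's partition every month. If the cron fails or is paused, next month's INSERTs have no matching partition and land in <code>p_future</code>. After a year of accumulation <code>p_future</code> is huge, pruning no longer helps, and queries for recent data effectively scan the whole table.</p>
<p>Fixes:</p>
<ul>
<li>Monitor the size of the <code>p_future</code> partition and alert when it exceeds a threshold</li>
<li>Alert on cron failure (no silent failures)</li>
<li>Do not depend on cron: have the <em>application layer ensure the partition exists before INSERT</em> (lazy create)</li>
</ul>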
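<p>The monitoring item can be a single query that an alerting job runs on a schedule. A minimal sketch, assuming the catch-all partition is named <code>p_future</code> as in this article; for InnoDB, <code>TABLE_ROWS</code> and the length columns are estimates from cached statistics, so treat the numbers as a trend signal rather than an exact count.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- In a healthy setup p_future stays (nearly) empty, because every
-- incoming row has a dedicated monthly partition to land in.
SELECT PARTITION_NAME,
       TABLE_ROWS AS approx_rows,
       ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024) AS approx_size_mb
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'orders'
  AND PARTITION_NAME = 'p_future';
</code></pre></div>
<p>Alerting on any non-trivial row count, rather than on a large size threshold, catches a missed cron run within days instead of months.</p>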
<h3 id="5-cross-partition-query-慢">5. Cross-partition queries are slow</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="n">user_id</span><span class="p">,</span><span class="w"> </span><span class="k">SUM</span><span class="p">(</span><span class="n">amount</span><span class="p">)</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">user_id</span><span class="p">;</span></span></span></code></pre></div><p>With no filter on the partition key, the optimizer cannot prune and scans every partition. This can be slower than the same query on <em>a single big unpartitioned table</em>, because of the overhead of aggregating across partitions.</p>
<p>Fixes:</p>
<ul>
<li>Accept that partitioning is not a <em>read-performance</em> tool; it is a <em>write + archive performance</em> tool</li>
<li>Move cross-partition aggregation to a <em>materialized aggregation table</em> (maintained by triggers or a scheduled job)</li>
<li>Send cross-partition reporting to an OLAP DB (BigQuery / Snowflake / ClickHouse)</li>
</ul>
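<p>The materialized aggregation table could look like the sketch below. Everything named here is an assumption for illustration: the summary table <code>user_daily_totals</code>, the <code>DECIMAL</code> type of <code>amount</code>, and the choice of a per-day grain refreshed by a scheduled job.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical summary table: one row per user per day.
CREATE TABLE user_daily_totals (
    user_id      BIGINT         NOT NULL,
    order_date   DATE           NOT NULL,
    total_amount DECIMAL(18, 2) NOT NULL,
    PRIMARY KEY (user_id, order_date)
) ENGINE=InnoDB;

-- Scheduled job: recompute a single day. The created_at range lets
-- MySQL prune to the partition that holds that day.
-- (VALUES() in this clause is deprecated in recent 8.0 releases
-- but is still accepted.)
INSERT INTO user_daily_totals (user_id, order_date, total_amount)
SELECT user_id,
       DATE(created_at),
       COALESCE(SUM(amount), 0)
FROM orders
WHERE created_at &gt;= '2026-02-15'
  AND created_at &lt;  '2026-02-16'
  AND user_id IS NOT NULL
GROUP BY user_id, DATE(created_at)
ON DUPLICATE KEY UPDATE total_amount = VALUES(total_amount);

-- Reporting reads the small table instead of every partition of orders.
SELECT user_id, SUM(total_amount) AS total
FROM user_daily_totals
GROUP BY user_id;
</code></pre></div>
<p>Recomputing whole days keeps the job idempotent: rerunning it after a failure overwrites the same rows instead of double counting.</p>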
<h2 id="跟-vitess-sharding-對比">Comparison with Vitess sharding</h2>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th>MySQL partitioning</th>
          <th>Vitess sharding</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Split scope</td>
          <td>Within a single instance</td>
          <td>Across instances (no fixed upper bound)</td>
      </tr>
      <tr>
          <td>Cross-shard query</td>
          <td>Not applicable</td>
          <td>VTGate splits and aggregates automatically</td>
      </tr>
      <tr>
          <td>Resharding</td>
          <td>REORGANIZE (rewrites data)</td>
          <td>Automated by VReplication</td>
      </tr>
      <tr>
          <td>Operational cost</td>
          <td>Low (stays within one instance)</td>
          <td>High (4-component Vitess stack)</td>
      </tr>
      <tr>
          <td>Linear write scaling</td>
          <td>No (bounded by single-instance write throughput)</td>
          <td>Yes (add shards)</td>
      </tr>
      <tr>
          <td>Archive efficiency</td>
          <td>DROP PARTITION in milliseconds</td>
          <td>Not an archive tool</td>
      </tr>
  </tbody>
</table>
<p>The two do not conflict; they solve different problems. Partitioning addresses <em>archiving and concentrated writes within a single instance</em>; sharding addresses <em>scaling across instances</em>.</p>
<h2 id="跟-postgresql-declarative-partitioning-對比">Comparison with PostgreSQL declarative partitioning</h2>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th>MySQL partitioning</th>
          <th>PostgreSQL declarative partitioning</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Partition type</td>
          <td>RANGE / LIST / HASH / KEY</td>
          <td>RANGE / LIST / HASH</td>
      </tr>
      <tr>
          <td>Sub-partitioning</td>
          <td>RANGE / LIST partitions, subpartitioned by HASH / KEY</td>
          <td>Multi-level nesting, more broadly supported</td>
      </tr>
      <tr>
          <td>Global index</td>
          <td>None</td>
          <td>None (PG 11+ cascades an index on the parent to per-partition indexes)</td>
      </tr>
      <tr>
          <td>Partition-wise join</td>
          <td>Limited</td>
          <td>Strong in PG 11+ (<code>enable_partitionwise_join</code>, off by default)</td>
      </tr>
      <tr>
          <td>Cross-partition unique</td>
          <td>Must include the partition key</td>
          <td>Same restriction in PG 11+: the constraint must include the partition key</td>
      </tr>
      <tr>
          <td>Partition attach</td>
          <td>EXCHANGE PARTITION</td>
          <td>ATTACH PARTITION</td>
      </tr>
      <tr>
          <td>Tooling</td>
          <td>gh-ost / pt-osc on partitioned tables</td>
          <td>pg_partman (mature)</td>
      </tr>
      <tr>
          <td>Production maturity</td>
          <td>Medium (available since 5.1, improved in 8.0)</td>
          <td>High (mature since declarative partitioning in 11+)</td>
      </tr>
  </tbody>
</table>
<p>PG partitioning handles <em>partition-wise join</em> and multi-level partitioning better, which is an advantage for reporting workloads; <em>cross-partition unique</em> constraints are restricted on both sides. MySQL partitioning is more mature for the <em>archive workflow</em> (DROP / EXCHANGE). See <a href="/blog/backend/01-database/vendors/postgresql/declarative-partitioning/" data-link-title="PostgreSQL declarative partitioning：partition 不是切表、是讓 planner pruning" data-link-desc="Declarative partitioning 的真實價值是 query planner pruning &#43; maintenance scope 縮小、不是「把大表切小」；RANGE / LIST / HASH 取捨、partition key 選法、5 個 production 踩雷（key 選錯不 prune / unique 不 enforce 跨 partition / ATTACH 鎖太久 / partition 數爆 / DETACH 不 reclaim 空間）、跟 autovacuum &#43; index 設計整合">PostgreSQL Declarative Partitioning</a>.</p>
<h2 id="何時用-native-partitioning">When to use native partitioning</h2>
<table>
  <thead>
      <tr>
          <th>Scenario</th>
          <th>Recommendation</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Time-series workload + archive needs (log / event / order history)</td>
          <td>Use RANGE</td>
      </tr>
      <tr>
          <td>Large table &gt; 1 TB where most queries carry a time filter</td>
          <td>Use RANGE so queries can prune</td>
      </tr>
      <tr>
          <td>Splitting by region / business line</td>
          <td>Use LIST</td>
      </tr>
      <tr>
          <td>Need <em>linear write-throughput scaling</em></td>
          <td>Skip partitioning; use Vitess sharding</td>
      </tr>
      <tr>
          <td>Need a <em>table-wide unique constraint</em></td>
          <td>Skip partitioning; every unique key would have to include the partition key</td>
      </tr>
      <tr>
          <td>Mostly ad-hoc analytical queries</td>
          <td>Skip partitioning; use an OLAP DB (ClickHouse / BigQuery)</td>
      </tr>
      <tr>
          <td>Small table &lt; 100 GB</td>
          <td>No need to partition; indexes are enough</td>
      </tr>
  </tbody>
</table>
<h2 id="跟其他模組整合">Integration with other modules</h2>
<h3 id="跟-online-schema-change">With Online Schema Change</h3>
<p>A schema change on a partitioned table (for example altering a column) has to be applied to <em>every partition</em>. gh-ost / pt-osc still work on partitioned tables, but the complexity goes up. See <a href="/blog/backend/01-database/vendors/mysql/online-schema-change-tools/" data-link-title="MySQL Online Schema Change：gh-ost 跟 pt-online-schema-change 兩條完全不同的 ghost table 路徑" data-link-desc="MySQL ALTER TABLE 可能鎖整張表，production 需要 online schema change 流程。gh-ost（GitHub）跟 pt-online-schema-change（Percona）都用 ghost table 解決、但底層機制完全不同：pt-osc 用 trigger 同步、gh-ost 用 binlog stream 同步。本文走兩工具機制對照表 → trigger vs binlog 各自取捨 → 配置 step-by-step → 5 production 踩雷（trigger overhead / binlog 延遲 / FK constraint / hot trigger lock / 切換瞬間 deadlock）→ 何時用哪一個">Online Schema Change Tools</a>.</p>
<h3 id="跟-vitess">With Vitess</h3>
<p>A Vitess shard can itself be partitioned: each shard maps to one MySQL instance, and partitioning is an optimization inside that instance. Partition DDL is an ordinary schema change, so on Vitess it goes through the regular schema-change path (for example <code>vtctldclient ApplySchema</code>), which applies it to every shard in the keyspace. See <a href="/blog/backend/01-database/vendors/mysql/vitess-sharding/" data-link-title="MySQL Vitess Sharding：VTGate / VTTablet / VReplication / VSchema 四件套協作" data-link-desc="Vitess 不只是 MySQL sharding proxy、是 4 個 component 協作的完整 sharding 系統 — VTGate（query routing layer）、VTTablet（per-MySQL agent）、VReplication（跨 shard 資料移動）、VSchema（sharding metadata）。本文走 4 件套各自責任、keyspace / shard / tablet 架構、shard key 設計（Vindex）、配置 step-by-step、5 production 踩雷（cross-shard transaction / VStream lag / Vindex 不均勻 / resharding 切流 / VReplication 卡住）、跟自管 sharding 跟 PlanetScale 的對比">Vitess sharding</a>.</p>
<h3 id="跟-innodb-tuning">With InnoDB Tuning</h3>
<p>Each partition is its own InnoDB tablespace (with the default <code>innodb_file_per_table=ON</code>), so caching behaviour in the buffer pool differs from a single big table. With many partitions, buffer pool warm-up takes longer. See <a href="/blog/backend/01-database/vendors/mysql/innodb-tuning/" data-link-title="MySQL InnoDB Tuning：為什麼一個 100 GB DB 在 64 GB RAM server 上 query 慢 5 倍" data-link-desc="InnoDB 是 MySQL 預設 storage engine、預設值給 256 MB buffer pool（早期 default）。本文從一個常見痛點開場（DB &gt; RAM 但 server 仍 swap）、走 4 個 critical knob（buffer pool / redo log / flush method / IO capacity）、各自如何影響讀寫吞吐、配置 step-by-step、5 production 踩雷（buffer pool warm-up / log file 大小 / 設 sync_binlog=0 換速度 / IO scheduler / undo log 膨脹）、跟 SSD / NVMe / EBS 的 IO 假設">InnoDB Tuning</a>.</p>
<h3 id="跟-replication">With Replication</h3>
<p>Partition operations (ADD / DROP / EXCHANGE) are DDL and are replicated through the binlog. Applying them on a replica can run into <em>locking issues</em>, especially when EXCHANGE conflicts with a query that is running on the replica. See <a href="/blog/backend/01-database/vendors/mysql/replication-topology/" data-link-title="MySQL Replication Topology：async / semi-sync / GTID 不是三選一、是三個 trade-off 軸的疊加" data-link-desc="MySQL replication 不是「選 async 還是 semi-sync」、是 *durability / latency / consistency* 三個 trade-off 軸的疊加；GTID 是跨 mode 的 infrastructure layer、不是第三種 mode。本文走 3 軸取捨模型 → async / semi-sync 行為對比 → GTID 替代 binlog-position 的好處 → 配置 step-by-step → 5 production 踩雷（lag 暴衝 / semi-sync 退回 async / GTID gap / Loss-Less semi-sync 真的 loss-less / chained replication 雪崩）→ 跟 Aurora MySQL / Vitess / ProxySQL / Orchestrator 整合">Replication Topology</a>.</p>
<h3 id="跟-query-optimization">With Query Optimization</h3>
<p>The <code>partitions</code> column of <code>EXPLAIN</code> is the key tool for partition-aware query optimization: it shows which partitions a query actually touches. The separate <code>EXPLAIN PARTITIONS</code> syntax was deprecated in 5.7 and removed in 8.0, where plain <code>EXPLAIN</code> always reports partitions. See <a href="/blog/backend/01-database/vendors/mysql/query-optimization/" data-link-title="MySQL Query Optimization：從 EXPLAIN 看到實際執行、5 條 query 從 5 秒變 50ms 的 anatomy" data-link-desc="MySQL query 慢的根因不在「SQL 寫法」、在「optimizer 選錯 plan」。本文從 5 個常見 production case 開場（5 秒 → 50ms / 30 秒 → 200ms / 8 秒 → 30ms 等）、走 EXPLAIN / EXPLAIN ANALYZE / optimizer trace 三層分析工具、index hint vs optimizer hint 取捨、cardinality estimation 失效時的修法、5 production 踩雷（statistics 過時 / forced index 用錯 / hash join 沒觸發 / range scan 退化 ALL / derived table materialization）">Query Optimization</a>.</p>
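<h2 id="容量規劃要點">Capacity planning notes</h2>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th>Recommendation</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Maximum partition count</td>
          <td>Hard limit of 8192 per table in 8.0 (subpartitions included); in practice keep it &lt; 1000 (management cost rises)</td>
      </tr>
      <tr>
          <td>Size of a single partition</td>
          <td>10 GB - 100 GB (too small and partitioning adds little value; too large and pruning stops helping)</td>
      </tr>
      <tr>
          <td>RANGE time partitions</td>
          <td>Monthly / weekly / daily (depending on data volume)</td>
      </tr>
      <tr>
          <td>HASH partition count</td>
          <td>Usually a power of 2 (8 / 16 / 32 / 64)</td>
      </tr>
      <tr>
          <td>Future partition pre-create</td>
          <td>Keep at least a 6-month buffer; cron adds one partition per month</td>
      </tr>
  </tbody>
</table>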
<h2 id="相關連結">Related links</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL vendor overview</a></li>
<li><a href="/blog/backend/01-database/vendors/mysql/vitess-sharding/" data-link-title="MySQL Vitess Sharding：VTGate / VTTablet / VReplication / VSchema 四件套協作" data-link-desc="Vitess 不只是 MySQL sharding proxy、是 4 個 component 協作的完整 sharding 系統 — VTGate（query routing layer）、VTTablet（per-MySQL agent）、VReplication（跨 shard 資料移動）、VSchema（sharding metadata）。本文走 4 件套各自責任、keyspace / shard / tablet 架構、shard key 設計（Vindex）、配置 step-by-step、5 production 踩雷（cross-shard transaction / VStream lag / Vindex 不均勻 / resharding 切流 / VReplication 卡住）、跟自管 sharding 跟 PlanetScale 的對比">MySQL Vitess sharding</a> (comparison: splitting across instances)</li>
<li><a href="/blog/backend/01-database/vendors/mysql/online-schema-change-tools/" data-link-title="MySQL Online Schema Change：gh-ost 跟 pt-online-schema-change 兩條完全不同的 ghost table 路徑" data-link-desc="MySQL ALTER TABLE 可能鎖整張表，production 需要 online schema change 流程。gh-ost（GitHub）跟 pt-online-schema-change（Percona）都用 ghost table 解決、但底層機制完全不同：pt-osc 用 trigger 同步、gh-ost 用 binlog stream 同步。本文走兩工具機制對照表 → trigger vs binlog 各自取捨 → 配置 step-by-step → 5 production 踩雷（trigger overhead / binlog 延遲 / FK constraint / hot trigger lock / 切換瞬間 deadlock）→ 何時用哪一個">MySQL Online Schema Change</a> (schema changes on partitioned tables)</li>
<li><a href="/blog/backend/01-database/vendors/mysql/query-optimization/" data-link-title="MySQL Query Optimization：從 EXPLAIN 看到實際執行、5 條 query 從 5 秒變 50ms 的 anatomy" data-link-desc="MySQL query 慢的根因不在「SQL 寫法」、在「optimizer 選錯 plan」。本文從 5 個常見 production case 開場（5 秒 → 50ms / 30 秒 → 200ms / 8 秒 → 30ms 等）、走 EXPLAIN / EXPLAIN ANALYZE / optimizer trace 三層分析工具、index hint vs optimizer hint 取捨、cardinality estimation 失效時的修法、5 production 踩雷（statistics 過時 / forced index 用錯 / hash join 沒觸發 / range scan 退化 ALL / derived table materialization）">MySQL Query Optimization</a> (the <code>partitions</code> column of EXPLAIN)</li>
<li><a href="/blog/backend/01-database/vendors/mysql/innodb-tuning/" data-link-title="MySQL InnoDB Tuning：為什麼一個 100 GB DB 在 64 GB RAM server 上 query 慢 5 倍" data-link-desc="InnoDB 是 MySQL 預設 storage engine、預設值給 256 MB buffer pool（早期 default）。本文從一個常見痛點開場（DB &gt; RAM 但 server 仍 swap）、走 4 個 critical knob（buffer pool / redo log / flush method / IO capacity）、各自如何影響讀寫吞吐、配置 step-by-step、5 production 踩雷（buffer pool warm-up / log file 大小 / 設 sync_binlog=0 換速度 / IO scheduler / undo log 膨脹）、跟 SSD / NVMe / EBS 的 IO 假設">MySQL InnoDB Tuning</a> (partition + buffer pool interaction)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/declarative-partitioning/" data-link-title="PostgreSQL declarative partitioning：partition 不是切表、是讓 planner pruning" data-link-desc="Declarative partitioning 的真實價值是 query planner pruning &#43; maintenance scope 縮小、不是「把大表切小」；RANGE / LIST / HASH 取捨、partition key 選法、5 個 production 踩雷（key 選錯不 prune / unique 不 enforce 跨 partition / ATTACH 鎖太久 / partition 數爆 / DETACH 不 reclaim 空間）、跟 autovacuum &#43; index 設計整合">PostgreSQL Declarative Partitioning</a> (PostgreSQL sibling comparison)</li>
<li><a href="/blog/backend/knowledge-cards/partition/" data-link-title="Partition" data-link-desc="說明事件流如何切分成多個可並行處理的有序片段">Partition card</a></li>
<li>Official: <a href="https://dev.mysql.com/doc/refman/8.0/en/partitioning.html">MySQL Partitioning</a></li>
</ul>
]]></content:encoded></item><item><title>PostgreSQL declarative partitioning：partition 不是切表、是讓 planner pruning</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/declarative-partitioning/</link><pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/declarative-partitioning/</guid><description>&lt;blockquote>
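&lt;p>This article is an implementation-layer deep dive under the &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL&lt;/a> overview. The overview already explains that large tables (&amp;gt; 1TB) need partitioning; this article focuses on &lt;em>where the real value of partitioning lies, and why most people get their first partitioning wrong&lt;/em>.&lt;/p>&lt;/blockquote>
&lt;h2 id="partition-不是把大表切小是讓-planner-pruning--縮小-maintenance-scope">Partitioning is not "cutting a big table into small ones"; it is "enabling planner pruning + shrinking the maintenance scope"&lt;/h2>
&lt;p>People new to partitioning usually start from the intuition "the table is too big, so cut it smaller". After partitioning they find that &lt;em>queries got slower&lt;/em> (the planner still looks at every partition), &lt;em>INSERTs got slower&lt;/em> (trigger / partition routing overhead), and &lt;em>backups did not get shorter&lt;/em> (the total data volume is unchanged). The intuition is wrong: the engineering value of partitioning comes from two mechanisms that have little to do with "smaller":&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Query planner pruning&lt;/strong>: at planning time the planner &lt;em>skips&lt;/em> partitions that cannot match the partition-key predicate, so the query scans only the relevant partitions. The precondition is that &lt;em>the WHERE clause includes the partition key&lt;/em>; otherwise the planner visits every partition and performance can be worse than on a single table&lt;/li>
&lt;li>&lt;strong>Smaller maintenance scope&lt;/strong>: vacuum / index rebuild / DROP / archive touch a single partition instead of sweeping the whole table. A 12-hour vacuum becoming 30 minutes, or old data being dropped in 0.01 seconds, is where partitioning really pays for itself&lt;/li>
&lt;/ol>
&lt;p>Partitioning is designed &lt;em>for maintenance and planner pruning&lt;/em>, not for "making the table smaller". Miss this framing and the partition layout will be wrong.&lt;/p>
&lt;h2 id="range--list--hashpartition-策略對應業務形狀">RANGE / LIST / HASH: match the partition strategy to the shape of the workload&lt;/h2>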





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="c1">-- RANGE: time series, logs, events (most common)
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">events&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">bigint&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">event_time&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">timestamptz&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NULL&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">payload&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">jsonb&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">PARTITION&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">BY&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">RANGE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">event_time&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">events_2026_05&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">PARTITION&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">OF&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">events&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">FOR&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">VALUES&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">FROM&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s1">&amp;#39;2026-05-01&amp;#39;&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TO&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s1">&amp;#39;2026-06-01&amp;#39;&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="c1">-- LIST: tenant ID / region / status enum
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">orders&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">bigint&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">tenant_id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">int&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NULL&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">PARTITION&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">BY&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">LIST&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">tenant_id&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">18&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">orders_tenant_premium&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">PARTITION&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">OF&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">orders&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">19&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">FOR&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">VALUES&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">IN&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="mi">1001&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">1002&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">1003&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">20&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">21&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="c1">-- HASH: even distribution (no natural partition key)
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">22&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">users&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">23&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">user_id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">bigint&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NULL&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">24&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="p">...&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">25&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">PARTITION&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">BY&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">HASH&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">user_id&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">26&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">27&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">users_0&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">PARTITION&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">OF&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">users&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">28&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">FOR&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">VALUES&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">WITH&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">MODULUS&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">4&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">REMAINDER&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">);&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Key points when choosing a strategy:&lt;/p></description><content:encoded><![CDATA[<blockquote>
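<p>This article is the implementation-layer deep dive for the <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a> overview. The overview already explains that large tables (&gt; 1TB) need partitioning; this article focuses on <em>where the real value of partitioning lies, and why most people get their first partitioning attempt wrong</em>.</p></blockquote>
<h2 id="partition-不是把大表切小是讓-planner-pruning--縮小-maintenance-scope">Partitioning is not "cutting a big table into smaller ones"; it is "enabling planner pruning and shrinking maintenance scope"</h2>
<p>People new to partitioning usually start from the intuition that "the table is too big, so cut it into smaller pieces". After partitioning they find that <em>queries got slower</em> (the planner still considers every partition), <em>INSERTs got slower</em> (tuple routing overhead, or triggers in legacy inheritance-based setups), and <em>backups did not get shorter</em> (the total data volume is unchanged). The intuition is wrong: the engineering value of partitioning comes from two mechanisms that have little to do with making things smaller:</p>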
<ol>
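<li><strong>Query planner pruning</strong>: during planning, the planner <em>skips</em> partitions whose bounds cannot match the partition key condition, so the query scans only the relevant partitions. This requires that <em>the WHERE clause filters on the partition key</em>; otherwise the planner has to consider every partition and performance can end up worse than with a single table</li>
<li><strong>Smaller maintenance scope</strong>: vacuum, index rebuild, DROP, and archive touch a single partition instead of the whole table. A 12-hour vacuum becoming 30 minutes, or old data being dropped in about 0.01 seconds, is where partitioning really pays for itself</li>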
</ol>
<p>Partitioning is designed <em>for maintenance and planner pruning</em>, not for making the table smaller. Miss this framing and the partition layout will be wrong.</p>
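<p>As a minimal sketch of what "smaller maintenance scope" can look like in practice, the statements below run maintenance against one partition only. They assume the <code>events</code> table and its <code>events_2026_05</code> partition defined in the next section; <code>REINDEX ... CONCURRENTLY</code> assumes PG 12 or later.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Vacuum and analyze a single partition; sibling partitions are not touched.
VACUUM (VERBOSE, ANALYZE) events_2026_05;

-- Rebuild the indexes of that one partition without blocking writes (PG 12+).
REINDEX TABLE CONCURRENTLY events_2026_05;

-- Size of the one partition the maintenance actually had to process.
SELECT pg_size_pretty(pg_total_relation_size(&#39;events_2026_05&#39;)) AS partition_size;
</code></pre></div>
<p>The same statements against an unpartitioned table would have to process every row, which is the cost the partition layout is meant to avoid.</p>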
<h2 id="range--list--hashpartition-策略對應業務形狀">RANGE / LIST / HASH: matching the partition strategy to the shape of the workload</h2>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- RANGE: time series, logs, events (the most common case)
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="n">id</span><span class="w"> </span><span class="nb">bigint</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">  </span><span class="n">event_time</span><span class="w"> </span><span class="n">timestamptz</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">  </span><span class="n">payload</span><span class="w"> </span><span class="n">jsonb</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">RANGE</span><span class="w"> </span><span class="p">(</span><span class="n">event_time</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events_2026_05</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">OF</span><span class="w"> </span><span class="n">events</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">  </span><span class="k">FOR</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="p">(</span><span class="s1">&#39;2026-05-01&#39;</span><span class="p">)</span><span class="w"> </span><span class="k">TO</span><span class="w"> </span><span class="p">(</span><span class="s1">&#39;2026-06-01&#39;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w"></span><span class="c1">-- LIST: tenant ID / region / status enum
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">  </span><span class="n">id</span><span class="w"> </span><span class="nb">bigint</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">  </span><span class="n">tenant_id</span><span class="w"> </span><span class="nb">int</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">  </span><span class="p">...</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">LIST</span><span class="w"> </span><span class="p">(</span><span class="n">tenant_id</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="w"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">orders_tenant_premium</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">OF</span><span class="w"> </span><span class="n">orders</span><span class="w">
</span></span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="w">  </span><span class="k">FOR</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">IN</span><span class="w"> </span><span class="p">(</span><span class="mi">1001</span><span class="p">,</span><span class="w"> </span><span class="mi">1002</span><span class="p">,</span><span class="w"> </span><span class="mi">1003</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">21</span><span class="cl"><span class="w"></span><span class="c1">-- HASH: even distribution (no natural partition key)
</span></span></span><span class="line"><span class="ln">22</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">users</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">23</span><span class="cl"><span class="w">  </span><span class="n">user_id</span><span class="w"> </span><span class="nb">bigint</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">24</span><span class="cl"><span class="w">  </span><span class="p">...</span><span class="w">
</span></span></span><span class="line"><span class="ln">25</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">HASH</span><span class="w"> </span><span class="p">(</span><span class="n">user_id</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">26</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">27</span><span class="cl"><span class="w"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">users_0</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">OF</span><span class="w"> </span><span class="n">users</span><span class="w">
</span></span></span><span class="line"><span class="ln">28</span><span class="cl"><span class="w">  </span><span class="k">FOR</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">WITH</span><span class="w"> </span><span class="p">(</span><span class="n">MODULUS</span><span class="w"> </span><span class="mi">4</span><span class="p">,</span><span class="w"> </span><span class="n">REMAINDER</span><span class="w"> </span><span class="mi">0</span><span class="p">);</span></span></span></code></pre></div><p>Key points when choosing a strategy:</p>
<ul>
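<li><strong>RANGE</strong> fits <em>time or other ordered values</em>: queries usually carry <code>WHERE event_time &gt;= X</code>, so pruning is most effective, and archiving or dropping old data is a near-instant <code>DETACH PARTITION</code> followed by <code>DROP TABLE</code> on the old partition (PostgreSQL has no <code>DROP PARTITION</code> statement)</li>
<li><strong>LIST</strong> fits <em>discrete enums or tenants</em>: queries with <code>WHERE tenant_id = X</code> are pruned; the downside is that as tenants grow, each new value needs a partition created by hand with <code>CREATE TABLE ... PARTITION OF</code> or <code>ATTACH PARTITION</code></li>
<li><strong>HASH</strong> fits <em>even distribution with no natural key</em>: queries are mostly primary-key lookups and HASH keeps partition sizes uniform; pruning only triggers on equality conditions such as <code>WHERE hash_key = X</code></li>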
</ul>
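<p>The example above creates only one partition per table. The sketch below fills in the rest under stated assumptions: the partition names and the tenant ids <code>2001</code> and <code>2002</code> are hypothetical, and the <code>DEFAULT</code> partition assumes PG 11 or later.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- HASH: MODULUS 4 needs remainders 0..3; a row with no matching partition is rejected.
CREATE TABLE users_1 PARTITION OF users FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE users_2 PARTITION OF users FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE users_3 PARTITION OF users FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- LIST: a new tenant group gets its own partition (tenant ids are hypothetical).
CREATE TABLE orders_tenant_standard PARTITION OF orders
  FOR VALUES IN (2001, 2002);

-- LIST: optional catch-all so rows for unlisted tenants are not rejected (PG 11+).
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
</code></pre></div>
<p>A <code>DEFAULT</code> partition trades one cost for another: adding a partition later requires PostgreSQL to check the default partition for rows that belong in the new one, so it is best kept small.</p>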
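<h3 id="選錯-partition-key-是最常見的錯誤">Choosing the wrong partition key is the most common mistake</h3>
<p>Example: the events table is HASH-partitioned on <code>user_id</code>, but most queries use <code>WHERE event_time BETWEEN ...</code> and <code>user_id</code> is not in the WHERE clause. The planner cannot prune, scans every partition, and performance is worse than with a single table because each partition adds its own planning and scan overhead.</p>
<p>The partition key <em>must</em> match the WHERE filter that queries use most often. Get it wrong and the table degrades into the awkward state of <em>helping maintenance while hurting queries</em>.</p>
<h2 id="partition-pruningplanner-怎麼決定跳過">Partition pruning: how the planner decides what to skip</h2>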





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">EXPLAIN</span><span class="w"> </span><span class="p">(</span><span class="k">ANALYZE</span><span class="p">,</span><span class="w"> </span><span class="n">BUFFERS</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">events</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="n">event_time</span><span class="w"> </span><span class="o">&gt;=</span><span class="w"> </span><span class="s1">&#39;2026-05-01&#39;</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">event_time</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="s1">&#39;2026-05-15&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- Expected output includes:
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1">--  Append (cost=...)
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1">--    -&gt; Seq Scan on events_2026_05  (cost=...)
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="c1">-- (only one partition is scanned, the rest are pruned; PG 12+ may omit the Append node when a single partition remains)</span></span></span></code></pre></div><p>Conditions that trigger pruning:</p>
<ol>
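<li>The WHERE clause compares the partition key with a <em>constant expression</em>: <code>WHERE x = 5</code> prunes at planning time. <code>WHERE x = some_function()</code> cannot be pruned at planning time unless the function is immutable; for a stable function such as <code>now()</code>, PG 11+ execution-time pruning can still remove partitions, while a volatile function prevents pruning altogether</li>
<li>PG 11+ supports <em>execution-time pruning</em>: the plan filters on the partition key, but the value is only known at run time (prepared statement parameters, the inner side of a nested loop join)</li>
<li>When the partition key is not in the WHERE clause, <em>every partition is scanned</em>. Treat this as a warning sign that the partition strategy does not match the query</li>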
</ol>
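<p>A minimal sketch of how execution-time pruning might be observed, assuming the <code>events</code> table above with several monthly partitions attached. The prepared statement name <code>recent_events</code> is hypothetical, and <code>plan_cache_mode</code> assumes PG 12 or later.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- now() is STABLE rather than IMMUTABLE, so its value is not known at planning time.
-- PG 11+ can prune at executor startup; EXPLAIN reports this as "Subplans Removed: N".
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*) FROM events
WHERE event_time &gt;= now() - interval &#39;7 days&#39;;

-- Prepared statement: the parameter value arrives only at EXECUTE time.
-- plan_cache_mode (PG 12+) is set here only to make the generic plan observable;
-- with a custom plan the value is known and pruning happens at planning time instead.
SET plan_cache_mode = force_generic_plan;
PREPARE recent_events(timestamptz) AS
  SELECT count(*) FROM events WHERE event_time &gt;= $1;
EXPLAIN (ANALYZE, COSTS OFF) EXECUTE recent_events(&#39;2026-05-10&#39;);
</code></pre></div>
<p>If neither plan shows removed or never-executed partitions, the filter is probably not expressed directly on the partition key.</p>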
<h3 id="partition-wise-join--aggregate-pg-11">Partition-wise join / aggregate (PG 11+)</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SET</span><span class="w"> </span><span class="n">enable_partitionwise_join</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">on</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">SET</span><span class="w"> </span><span class="n">enable_partitionwise_aggregate</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">on</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- When two tables with the same partition strategy are joined, the planner can join them partition by partition
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">events</span><span class="w"> </span><span class="n">e</span><span class="w"> </span><span class="k">JOIN</span><span class="w"> </span><span class="n">events_metadata</span><span class="w"> </span><span class="n">m</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">  </span><span class="k">ON</span><span class="w"> </span><span class="n">e</span><span class="p">.</span><span class="n">event_time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">m</span><span class="p">.</span><span class="n">event_time</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w">  </span><span class="k">WHERE</span><span class="w"> </span><span class="n">e</span><span class="p">.</span><span class="n">event_time</span><span class="w"> </span><span class="o">&gt;=</span><span class="w"> </span><span class="s1">&#39;2026-05-01&#39;</span><span class="p">;</span></span></span></code></pre></div><p>Both tables need a <em>matching partition strategy</em> (same partition key and matching partition bounds; PG 13 relaxed the exact-match requirement for bounds), and the join condition must include the partition key. Both settings are off by default. Align the layout at design time, because it is hard to adjust later.</p>
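<p>The query above joins against an <code>events_metadata</code> table that the article does not define. A minimal sketch of a compatible layout follows; the <code>source</code> column and the partition name are assumptions made for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical events_metadata table, partitioned the same way as events.
CREATE TABLE events_metadata (
  event_time timestamptz NOT NULL,
  source text
) PARTITION BY RANGE (event_time);

-- Same bounds as events_2026_05, so the two partitions pair up one to one.
CREATE TABLE events_metadata_2026_05 PARTITION OF events_metadata
  FOR VALUES FROM (&#39;2026-05-01&#39;) TO (&#39;2026-06-01&#39;);

-- The setting defaults to off; enable it for this session only.
SET enable_partitionwise_join = on;

-- With more than one partition pair, a partition-wise plan places
-- one join node per pair underneath an Append node.
EXPLAIN (COSTS OFF)
SELECT * FROM events e JOIN events_metadata m
  ON e.event_time = m.event_time;
</code></pre></div>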
<h2 id="production-故障演練">Production failure drills</h2>
<h3 id="case-1partition-key-選錯query-變慢">Case 1: wrong partition key, slower queries</h3>
<p><strong>Symptom</strong>: after partitioning, a specific query goes from 200ms to 2000ms; EXPLAIN shows every partition under <code>Append</code> being scanned and none pruned.</p>
<p><strong>Root cause</strong>: the table is partitioned by HASH on <code>user_id</code>, but queries mostly use <code>WHERE created_at BETWEEN X AND Y</code>; the planner cannot tell which partitions hold matching rows and has to scan them all.</p>
<p><strong>Fix</strong>:</p>
<ol>
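<li><strong>Verification step</strong>: before partitioning, look at the WHERE patterns of the top 10 queries in <code>pg_stat_statements</code>; the partition key should match the filter that accounts for about 80% of that traffic</li>
<li><strong>Correction</strong>: the partition strategy of an existing table cannot be changed in place, so build a new table partitioned by RANGE on <code>created_at</code> and reload the data, for example partition by partition with <code>pg_dump --section=data</code></li>
<li><strong>Prevention</strong>: changing a partition scheme means rewriting the table, so treat it as effectively irreversible; do not start until the query pattern is clear at design time</li>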
</ol>
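<p>The verification and correction steps could be sketched as follows. This assumes the <code>pg_stat_statements</code> extension is installed and uses the PG 13+ column name <code>total_exec_time</code> (older versions call it <code>total_time</code>). The table name <code>events_by_time</code> and its partition are hypothetical, and <code>events</code> here stands for the mis-partitioned table from this case, with a <code>created_at</code> column.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- 1. Top 10 statements by total execution time; read their WHERE clauses by hand.
SELECT calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       left(query, 120) AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- 2. Build the new layout beside the old table (hypothetical names).
CREATE TABLE events_by_time (LIKE events INCLUDING DEFAULTS)
  PARTITION BY RANGE (created_at);

CREATE TABLE events_by_time_2026_05 PARTITION OF events_by_time
  FOR VALUES FROM (&#39;2026-05-01&#39;) TO (&#39;2026-06-01&#39;);

-- 3. Copy one time window at a time to keep each transaction short.
INSERT INTO events_by_time
SELECT * FROM events
WHERE created_at &gt;= &#39;2026-05-01&#39; AND created_at &lt; &#39;2026-06-01&#39;;
</code></pre></div>
<p>Copying in bounded windows is one possible approach; a dump-and-reload as described in the list works as well and may be simpler when a maintenance window is available.</p>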
<h3 id="case-2cross-partition-unique-constraint-不-enforce">Case 2: cross-partition unique constraint is not enforced</h3>
<p><strong>Symptom</strong>: after partitioning, the application writes duplicate user_email values and the unique constraint does not stop them; the database holds several rows with the same email.</p>
<p><strong>Root cause</strong>: on a PostgreSQL partitioned table, a <code>UNIQUE</code> constraint <em>must include the partition key</em>. <code>UNIQUE (email)</code> on a table partitioned by <code>tenant_id</code> <em>cannot be enforced</em> (PostgreSQL refuses to create it). The workaround <code>UNIQUE (email, tenant_id)</code> is accepted, but it only guarantees uniqueness within a tenant; the business rule is "email is globally unique", which PG cannot guarantee on this table.</p>
<p><strong>Fix</strong>:</p>
<ol>
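<li><strong>Architecture</strong>: cross-partition uniqueness has to be enforced at the <em>application layer</em> (lock + check pattern)</li>
<li><strong>Alternative</strong>: keep the uniqueness target in a <em>non-partitioned</em> table (user_email_registry) and look it up before writing</li>
<li><strong>Design-time check</strong>: when partitioning by X, every unique constraint must include X; if the business requires uniqueness that does not include X, the partition strategy is wrong</li>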
</ol>
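<p>A minimal sketch of the registry approach, under stated assumptions: <code>user_email_registry</code> is the table name mentioned above, while its columns and the partitioned table <code>users_by_tenant</code> are hypothetical. Here the registry carries its own primary key, so the database rather than a pre-write lookup rejects the duplicate.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Non-partitioned registry: a plain PRIMARY KEY here is globally unique.
CREATE TABLE user_email_registry (
  email text PRIMARY KEY,
  tenant_id int NOT NULL
);

-- Claim the email in the same transaction as the partitioned-table write.
-- A duplicate email makes the first INSERT fail, so the transaction cannot commit.
BEGIN;
INSERT INTO user_email_registry (email, tenant_id)
  VALUES (&#39;a@example.com&#39;, 1001);
INSERT INTO users_by_tenant (tenant_id, email)
  VALUES (1001, &#39;a@example.com&#39;);
COMMIT;
</code></pre></div>
<p>The cost is one extra write per new email and a single unpartitioned table that has to stay small enough to maintain comfortably.</p>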
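<h3 id="case-3attach-partition-鎖表太久">Case 3: ATTACH PARTITION holds locks for too long</h3>
<p><strong>Symptom</strong>: <code>ATTACH PARTITION</code> for the new monthly partition runs for 30 seconds; during that time reads on the whole events table are blocked and the application sees a flood of timeouts.</p>
<p><strong>Root cause</strong>: before PG 12, <code>ATTACH PARTITION</code> takes an <code>ACCESS EXCLUSIVE</code> lock on the parent table. PG 12+ takes only <code>SHARE UPDATE EXCLUSIVE</code> on the parent, but still <code>ACCESS EXCLUSIVE</code> on the table being attached and on the DEFAULT partition if one exists. In both cases it scans the whole new partition to verify the partition constraint; a large partition with no pre-validated CHECK constraint makes the lock time blow up.</p>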
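<p>To confirm that the timeouts really come from lock waits behind the <code>ATTACH</code>, a query along these lines can be run from another session while it is in progress. This is a diagnostic sketch using the standard <code>pg_stat_activity</code> view and <code>pg_blocking_pids()</code> function.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Sessions that are currently waiting on a lock, and which backend pids block them.
SELECT a.pid,
       a.state,
       a.wait_event_type,
       pg_blocking_pids(a.pid) AS blocked_by,
       left(a.query, 80) AS query
FROM pg_stat_activity AS a
WHERE cardinality(pg_blocking_pids(a.pid)) &gt; 0;
</code></pre></div>
<p>If the pid in <code>blocked_by</code> belongs to the session running <code>ATTACH PARTITION</code>, the lock explanation above applies.</p>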
<p><strong>Fix</strong>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- 1. Add a CHECK constraint to the partition to be attached, using NOT VALID to skip the scan
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events_2026_06</span><span class="w"> </span><span class="k">ADD</span><span class="w"> </span><span class="k">CONSTRAINT</span><span class="w"> </span><span class="n">events_2026_06_range</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="k">CHECK</span><span class="w"> </span><span class="p">(</span><span class="n">event_time</span><span class="w"> </span><span class="o">&gt;=</span><span class="w"> </span><span class="s1">&#39;2026-06-01&#39;</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">event_time</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="s1">&#39;2026-07-01&#39;</span><span class="p">)</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">VALID</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w"></span><span class="c1">-- 2. VALIDATE takes a SHARE UPDATE EXCLUSIVE lock, which allows reads and writes
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events_2026_06</span><span class="w"> </span><span class="n">VALIDATE</span><span class="w"> </span><span class="k">CONSTRAINT</span><span class="w"> </span><span class="n">events_2026_06_range</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="c1">-- 3. ATTACH no longer needs a scan (the CHECK is already validated)
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events</span><span class="w"> </span><span class="n">ATTACH</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">events_2026_06</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">  </span><span class="k">FOR</span><span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="p">(</span><span class="s1">&#39;2026-06-01&#39;</span><span class="p">)</span><span class="w"> </span><span class="k">TO</span><span class="w"> </span><span class="p">(</span><span class="s1">&#39;2026-07-01&#39;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w"></span><span class="c1">-- ATTACH becomes near-instant</span></span></span></code></pre></div><h3 id="case-4partition-數爆炸planner-planning-time-爆">Case 4: too many partitions, planning time balloons</h3>
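<p><strong>Symptom</strong>: partitions accumulate past 500 (daily partitions kept for 1-2 years); EXPLAIN ANALYZE on a simple query shows Planning Time rising from 1ms to 200ms, and application responses slow down.</p>
<p><strong>Root cause</strong>: the more partitions there are, the more the planner has to evaluate. Even with pruning, the planning phase still has to look at partition metadata, and before PG 12 it walked every partition. In this scenario 500+ partitions is the point where planning overhead becomes noticeable.</p>
<p><strong>Fix</strong>:</p>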
<ol>
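<li><strong>Architecture</strong>: match partition granularity to retention; do not keep daily partitions for 2 years (switch to weekly or monthly)</li>
<li><strong>Archive old partitions</strong>: DETACH old partitions and turn them into cold-storage tables that the planner no longer looks at</li>
<li><strong><code>enable_partition_pruning</code></strong> is on by default; make sure it stays enabled</li>
<li><strong>PG 12+</strong>: the planner handles the partition list more efficiently and tolerates a much higher partition count, but the count still has to be kept under control</li>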
</ol>
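<p>Two quick checks that may help track this case over time, sketched against the <code>events</code> table from this article. <code>pg_partition_tree()</code> assumes PG 12 or later, and the <code>SUMMARY</code> option of <code>EXPLAIN</code> assumes PG 10 or later.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Number of leaf partitions currently attached to events (PG 12+).
SELECT count(*) AS leaf_partitions
FROM pg_partition_tree(&#39;events&#39;)
WHERE isleaf;

-- SUMMARY ON prints Planning Time without executing the query (PG 10+).
EXPLAIN (SUMMARY ON)
SELECT * FROM events
WHERE event_time &gt;= &#39;2026-05-01&#39; AND event_time &lt; &#39;2026-05-02&#39;;
</code></pre></div>
<p>Recording both numbers after each partition rollover gives an early signal before planning time becomes visible in application latency.</p>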
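<h3 id="case-5detach-後磁碟空間沒回收">Case 5: disk space not reclaimed after DETACH</h3>
<p><strong>Symptom</strong>: after DETACH PARTITION, <code>pg_database_size</code> does not drop even though 50GB was expected to be freed; the disk is still full.</p>
<p><strong>Root cause</strong>: DETACH only <em>separates</em> the partition from the parent table; the partition lives on as a standalone table. Actually releasing the space requires <code>DROP TABLE detached_partition</code>. The SRE assumed DETACH meant delete.</p>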
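<p>A short sketch of how the state after DETACH can be verified, assuming the detached table is named <code>events_2024_01</code> as in the fix that follows.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- The detached table is now an ordinary table and still occupies this much disk.
SELECT pg_size_pretty(pg_total_relation_size(&#39;events_2024_01&#39;)) AS detached_size;

-- Confirm it is no longer attached to any parent: no rows means it is detached.
SELECT inhparent::regclass AS parent
FROM pg_inherits
WHERE inhrelid = &#39;events_2024_01&#39;::regclass;
</code></pre></div>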
<p><strong>Fix</strong>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Full procedure
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events</span><span class="w"> </span><span class="n">DETACH</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">events_2024_01</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- events_2024_01 still exists and still occupies disk
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- Once it is confirmed that no query is using it
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"></span><span class="k">DROP</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events_2024_01</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w"></span><span class="c1">-- Only now is the disk space released</span></span></span></code></pre></div><h3 id="routinearchive-workflow">Routine: archive workflow</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Run at month end:
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1">-- 1. Detach the partition from 13 months ago
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="c1"></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events</span><span class="w"> </span><span class="n">DETACH</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="n">events_2025_04</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- 2. Dump to cold storage (\COPY is a psql meta-command, so run this step in psql)
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"></span><span class="err">\</span><span class="k">COPY</span><span class="w"> </span><span class="n">events_2025_04</span><span class="w"> </span><span class="k">TO</span><span class="w"> </span><span class="s1">&#39;/cold/events_2025_04.csv&#39;</span><span class="w"> </span><span class="p">(</span><span class="n">FORMAT</span><span class="w"> </span><span class="n">CSV</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="w"></span><span class="c1">-- 3. Drop the table to free the disk
</span></span></span><span class="line"><span class="ln">9</span><span class="cl"><span class="c1"></span><span class="k">DROP</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events_2025_04</span><span class="p">;</span></span></span></code></pre></div><h2 id="容量規劃">Capacity planning</h2>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th>Estimate</th>
          <th>Warning threshold</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Size of a single partition</td>
          <td>Align with the practical per-table vacuum ceiling (10-100GB sweet spot)</td>
          <td>&gt; 200GB: consider sub-partitioning or a finer granularity</td>
      </tr>
      <tr>
          <td>Partition count</td>
          <td>retention × granularity</td>
          <td>&gt; 200 partitions: planning time starts to show</td>
      </tr>
      <tr>
          <td>Partition key cardinality</td>
          <td>LIST: &lt; 100 / HASH: choose your own modulus / RANGE: time plus a dimension</td>
          <td>Too many distinct partition values: use HASH</td>
      </tr>
      <tr>
          <td>Cross-partition query ratio</td>
          <td>Check the number of partitions scanned in EXPLAIN</td>
          <td>&gt; 30% of queries scanning &gt; 50% of partitions means the key is wrong</td>
      </tr>
      <tr>
          <td>Maintenance window</td>
          <td>DROP / DETACH / ATTACH are managed per partition</td>
          <td>Maintenance on a hot partition still belongs in the maintenance window</td>
      </tr>
  </tbody>
</table>
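<p>The first two rows of the table can be measured directly. The sketch below assumes a partitioned table named <code>events</code> and PostgreSQL 12 or later, where <code>pg_partition_tree</code> is available.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Size of every leaf partition, largest first (compare against the 200GB threshold).
SELECT relid AS partition_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_partition_tree('events')
WHERE isleaf
ORDER BY pg_total_relation_size(relid) DESC;

-- Leaf partition count (compare against the 200-partition threshold).
SELECT count(*) AS leaf_partitions
FROM pg_partition_tree('events')
WHERE isleaf;
</code></pre></div>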
<p>Practical defaults:</p>
<ul>
<li>Time series (events / log): monthly RANGE partitions, 12-24 months of retention</li>
<li>Multi-tenant (orders / records): LIST partitions on tenant_id, with each large tenant in its own partition</li>
<li>Evenly distributed (user / metric): 8-16 HASH partitions, 50-100GB per partition</li>
</ul>
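<p>As an illustration of the HASH default, the sketch below creates 8 hash partitions in a loop. The table <code>user_metrics</code> and its columns are hypothetical names chosen for this example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical table, used only to illustrate the 8-partition HASH layout.
CREATE TABLE user_metrics (
  user_id     bigint           NOT NULL,
  recorded_at timestamptz      NOT NULL,
  value       double precision
) PARTITION BY HASH (user_id);

-- Every remainder 0..7 needs a partition; a missing one makes some inserts fail.
DO $$
BEGIN
  FOR r IN 0..7 LOOP
    EXECUTE format(
      'CREATE TABLE user_metrics_p%s PARTITION OF user_metrics
         FOR VALUES WITH (MODULUS 8, REMAINDER %s)', r, r);
  END LOOP;
END $$;
</code></pre></div>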
<h2 id="整合--下一步">Integration / next steps</h2>
<h3 id="跟-autovacuum-tuning-整合">Integration with <a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">autovacuum tuning</a></h3>
<p>Partitioning is the long-term fix for autovacuum problems:</p>
<ol>
<li>Hot partitions get aggressive autovacuum (scale_factor 0.05, cost_limit 5000)</li>
<li>Cold partitions get <code>autovacuum_enabled = false</code></li>
<li>A partition explosion can still saturate <code>autovacuum_max_workers</code>, which then has to be raised</li>
</ol>
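<p>The hot and cold settings above are per-table storage parameters. A minimal sketch, assuming monthly partitions named <code>events_2026_05</code> (hot) and <code>events_2025_06</code> (cold); both names are illustrative:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hot partition: vacuum earlier and with a larger cost budget.
ALTER TABLE events_2026_05 SET (
  autovacuum_vacuum_scale_factor = 0.05,
  autovacuum_vacuum_cost_limit   = 5000
);

-- Cold partition: no routine autovacuum.
-- Anti-wraparound vacuum still runs when it becomes necessary.
ALTER TABLE events_2025_06 SET (autovacuum_enabled = false);
</code></pre></div>
<p>Set these on the leaf partitions: the partitioned parent stores no rows, so autovacuum does its work on the leaves.</p>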
<h3 id="跟-index-設計整合">Integration with index design</h3>
<p>Index handling on a partitioned table:</p>
<ol>
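<li>PG 11+ partitioned index: <code>CREATE INDEX ON partitioned_table (...)</code> automatically creates a matching local index on every partition. This is not a true global index; PostgreSQL does not have one.</li>
<li><strong>No cross-partition unique constraint without the partition key</strong>: uniqueness is enforced partition by partition, so a PRIMARY KEY or UNIQUE constraint on the parent must include every partition key column</li>
<li><strong>Partition-wise index scan</strong>: together with PG 11+ partition-wise join (<code>enable_partitionwise_join</code>, off by default), matching partitions are joined pair by pair, so each pair can use its own local index and the lookups can run in parallel</li>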
</ol>
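<p>The first two points can be sketched as follows, assuming an <code>events</code> table partitioned by RANGE on <code>created_at</code> with <code>id</code> and <code>tenant_id</code> columns. The index and constraint names are illustrative.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Creates one local index per existing partition; future partitions get it too.
CREATE INDEX events_tenant_created_idx ON events (tenant_id, created_at);

-- Accepted: the key includes the partition column created_at.
ALTER TABLE events ADD CONSTRAINT events_pkey PRIMARY KEY (id, created_at);

-- Rejected: a unique constraint on id alone omits the partition key.
-- ALTER TABLE events ADD CONSTRAINT events_id_key UNIQUE (id);
</code></pre></div>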
<h3 id="跟-backup--pitr">With backup / PITR</h3>
<p>Partitioning is not a substitute for backups, but it can speed up <em>partial restore</em>:</p>
<ol>
<li>Restore only the partitions for a given time range instead of the whole table</li>
<li>See the partial recovery scenario in <a href="/blog/backend/01-database/vendors/postgresql/pitr-wal-archiving/" data-link-title="PostgreSQL PITR &#43; WAL archiving：從 base backup 到 point-in-time recovery 的完整鏈" data-link-desc="Base backup &#43; WAL archive 構成 PITR 的雙軌資料、archive_command &#43; restore_command 配置、用 pgBackRest / WAL-G 替代手寫腳本、5 個 production 踩雷（archive 靜默失敗 / archive lag / 錯誤 target time / base backup 過期未清 / timeline 分歧 recovery 模糊）、跟 Patroni &#43; monitoring 整合">PITR + WAL archiving</a></li>
</ol>
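<h3 id="下一步議題">Next topics</h3>
<ul>
<li><strong>Sub-partitioning</strong>: partitioning inside a partition (time + tenant); suits multi-tenant time-series workloads</li>
<li><strong>pg_partman extension</strong>: creates monthly partitions automatically, replacing hand-written DDL cron scripts; its maintenance routine still has to be scheduled (background worker, pg_cron, or cron)</li>
<li><strong>Foreign key to partitioned table</strong> (PG 12+): foreign keys that reference a partitioned table are enforced across partitions, but cascades carry many restrictions</li>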
</ul>
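<p>A minimal sketch of sub-partitioning, time first and tenant second. The table <code>tenant_events</code>, its columns, and the tenant value 42 are hypothetical.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Level 1: RANGE on time.
CREATE TABLE tenant_events (
  tenant_id  int         NOT NULL,
  created_at timestamptz NOT NULL,
  payload    jsonb
) PARTITION BY RANGE (created_at);

-- The monthly partition is itself partitioned by LIST on tenant_id.
CREATE TABLE tenant_events_2026_05 PARTITION OF tenant_events
  FOR VALUES FROM ('2026-05-01') TO ('2026-06-01')
  PARTITION BY LIST (tenant_id);

-- Level 2: one large tenant on its own, everyone else in a default partition.
CREATE TABLE tenant_events_2026_05_t42 PARTITION OF tenant_events_2026_05
  FOR VALUES IN (42);
CREATE TABLE tenant_events_2026_05_rest PARTITION OF tenant_events_2026_05
  DEFAULT;
</code></pre></div>
<p>Every level multiplies the partition count, so the planning-time threshold from the capacity table applies to the total number of leaves.</p>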
<h2 id="相關連結">Related links</h2>
<ul>
<li>Upstream vendor page: <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a></li>
<li>Upstream chapter: <a href="/blog/backend/01-database/schema-design/" data-link-title="1.2 Schema Design 與資料建模" data-link-desc="整理 table、index、key、partition、denormalization 與命名規則">Schema Design</a>, because partitioning is a schema decision</li>
<li>Sibling deep articles: <a href="/blog/backend/01-database/vendors/postgresql/patroni-ha/" data-link-title="PostgreSQL Patroni HA：從 leader 失聯到 client 重連的 5 段 failover lifecycle" data-link-desc="Patroni 把 PostgreSQL HA 拆成 detection / election / promotion / reconfiguration / recovery 五段 lifecycle、每段都有獨立配置跟 failure mode；DCS quorum &#43; watchdog 防 split-brain、async/sync replication 取捨、5 個 production 踩雷、跟 PgBouncer / HAProxy / cert-manager 整合">Patroni HA</a> / <a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">autovacuum tuning</a> / <a href="/blog/backend/01-database/vendors/postgresql/timescaledb-deep-dive/" data-link-title="TimescaleDB Deep Dive：Hypertable / Continuous Aggregate / Compression 把 PG 變 Time-Series DB" data-link-desc="TimescaleDB 是 PG extension（不是 fork）、用 *hypertable* 自動 partition by time、加 *continuous aggregate* 做 incremental materialized view、加 *compression* 對舊 chunk 壓 90%&#43;、把 PG 變成 InfluxDB / Prometheus 級 time-series DB。本文走 hypertable 機制、continuous aggregate 跟普通 MV 差異、compression policy、retention policy、5 production 踩雷（chunk size 不對 / CAGG refresh 落後 / compression 後 update 限制 / hypertable 不能加 FK / TimescaleDB 跟 PG 主版本對齊）、跟 PG 原生 partitioning 對比">TimescaleDB Deep Dive</a> (a hypertable is automated partitioning)</li>
<li>Follow-up: <a href="/blog/backend/01-database/vendors/postgresql/partition-redesign/" data-link-title="PostgreSQL Partition Redesign：當 monthly partition 越跑越慢" data-link-desc="PostgreSQL partition redesign 是 Type F「topology re-layout」第 2 個 dogfood — 從 monthly partition 改 daily / 從 range 改 list / 從單軸改 sub-partition；6 維 audit 皆 Low &#43; topology 軸 High；涵蓋 partition 不平衡偵測、ATTACH/DETACH 線上重劃、5 個 production 踩雷、跟 partition_pruning &#43; autovacuum 整合">Partition Redesign</a> (the migration playbook for re-laying out a partition strategy)</li>
<li>Methodology: <a href="/blog/posts/vendor-%E6%B7%B1%E5%BA%A6%E6%8A%80%E8%A1%93%E6%96%87%E7%AB%A0%E6%96%B9%E6%B3%95%E8%AB%96%E7%9A%84%E6%BC%94%E5%8C%96%E7%B4%80%E9%8C%84%E5%90%8C-vendor-%E7%B3%BB%E5%88%97%E7%9A%84%E9%96%8B%E5%A0%B4%E8%BC%AA%E6%9B%BF%E9%A9%97%E8%AD%89/" data-link-title="Vendor 深度技術文章方法論的演化紀錄：同 vendor 系列的開場輪替驗證" data-link-desc="vendor overview 飽和後要寫單一功能深度文章、需要選題與結構依據時回來。這套方法論的驗證來源與 cadence variant 在高風險場景（同 vendor sub-tool 系列）的實證。">Writing methodology for vendor deep-dive technical articles</a></li>
</ul>
]]></content:encoded></item><item><title>PostgreSQL pg_partman Advanced</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/pg-partman-advanced/</link><pubDate>Fri, 22 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/pg-partman-advanced/</guid><description>&lt;p>The core job of PostgreSQL pg_partman advanced is to automate the day-to-day maintenance of &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/table-partitioning/" data-link-title="Table Partitioning" data-link-desc="說明單一資料庫內如何把大表拆成多個分區，並由查詢規劃器只掃相關片段">declarative partitioning&lt;/a>. pg_partman can create future partitions, manage retention, and run maintenance jobs, so that time-based or serial-based partitions no longer depend on manual DDL.&lt;/p>
&lt;p>The anchor for reading this article: pg_partman solves partition lifecycle operations, not partition strategy itself. Partition key, query pattern, retention, index, foreign key, and migration decisions still have to be made correctly first, in &lt;a href="../declarative-partitioning/">Declarative Partitioning&lt;/a> and &lt;a href="../partition-redesign/">Partition Redesign&lt;/a>.&lt;/p>
&lt;h2 id="responsibility-boundary">Responsibility Boundary&lt;/h2>
&lt;p>This section separates what native PostgreSQL partitioning owns from what pg_partman owns.&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Layer&lt;/th>
 &lt;th>Responsibility&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>PostgreSQL declarative partitioning&lt;/td>
 &lt;td>partitioned tables, constraints, planner pruning&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>pg_partman&lt;/td>
 &lt;td>future partition premake, retention, maintenance&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Scheduler / job runner&lt;/td>
 &lt;td>runs maintenance on a schedule&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>DBA / platform&lt;/td>
 &lt;td>monitoring, backup, DDL review&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Application&lt;/td>
 &lt;td>query patterns, use of the partition key&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>The value of pg_partman lies in removing repetitive DDL. It will not choose the right partition key for the application, and it will not automatically repair a cross-partition query design.&lt;/p>
&lt;h2 id="core-concepts">Core Concepts&lt;/h2>
&lt;p>This section defines the operational vocabulary of pg_partman.&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Concept&lt;/th>
 &lt;th>Meaning&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Parent table&lt;/td>
 &lt;td>entry point of the partitioned table&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Child table&lt;/td>
 &lt;td>the partition that actually stores data&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Premake&lt;/td>
 &lt;td>creates future partitions ahead of time&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Retention&lt;/td>
 &lt;td>automatically detaches / drops old partitions&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Maintenance&lt;/td>
 &lt;td>the job that creates new partitions and applies retention&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Template&lt;/td>
 &lt;td>the template from which child partitions inherit indexes / constraints&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>Premake protects inserts from hitting a partition that does not exist. If partition creation falls behind the clock, application inserts fail or land in the default partition; production should alert on the future partition count.&lt;/p>
&lt;p>Retention is a data lifecycle operation. Dropping an old partition is fast, but first confirm legal retention, backups, analytics dependencies, and downstream CDC.&lt;/p>
&lt;h2 id="setup-pattern">Setup Pattern&lt;/h2>
&lt;p>This section puts the pg_partman rollout behind a migration gate.&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">EXTENSION&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">IF&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">EXISTS&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">pg_partman&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">events&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">bigserial&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">tenant_id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">uuid&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NULL&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">created_at&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">timestamptz&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NULL&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">payload&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">jsonb&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NOT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">NULL&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">8&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">PARTITION&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">BY&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">RANGE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">created_at&lt;/span>&lt;span class="p">);&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Create the actual partman config according to your pg_partman version and your provider's support documentation. Managed PostgreSQL may restrict the extension version, background workers, or the scheduler, so confirm the provider boundary before setup.&lt;/p></description><content:encoded><![CDATA[<p>The core job of PostgreSQL pg_partman advanced is to automate the day-to-day maintenance of <a href="/blog/backend/knowledge-cards/table-partitioning/" data-link-title="Table Partitioning" data-link-desc="說明單一資料庫內如何把大表拆成多個分區，並由查詢規劃器只掃相關片段">declarative partitioning</a>. pg_partman can create future partitions, manage retention, and run maintenance jobs, so that time-based or serial-based partitions no longer depend on manual DDL.</p>
<p>The anchor for reading this article: pg_partman solves partition lifecycle operations, not partition strategy itself. Partition key, query pattern, retention, index, foreign key, and migration decisions still have to be made correctly first, in <a href="../declarative-partitioning/">Declarative Partitioning</a> and <a href="../partition-redesign/">Partition Redesign</a>.</p>
<h2 id="responsibility-boundary">Responsibility Boundary</h2>
<p>This section separates what native PostgreSQL partitioning owns from what pg_partman owns.</p>
<table>
  <thead>
      <tr>
          <th>Layer</th>
          <th>Responsibility</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>PostgreSQL declarative partitioning</td>
          <td>partitioned tables, constraints, planner pruning</td>
      </tr>
      <tr>
          <td>pg_partman</td>
          <td>future partition premake, retention, maintenance</td>
      </tr>
      <tr>
          <td>Scheduler / job runner</td>
          <td>runs maintenance on a schedule</td>
      </tr>
      <tr>
          <td>DBA / platform</td>
          <td>monitoring, backup, DDL review</td>
      </tr>
      <tr>
          <td>Application</td>
          <td>query patterns, use of the partition key</td>
      </tr>
  </tbody>
</table>
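<p>The value of pg_partman lies in removing repetitive DDL. It will not choose the right partition key for the application, and it will not automatically repair a cross-partition query design.</p>
<h2 id="core-concepts">Core Concepts</h2>
<p>This section defines the operational vocabulary of pg_partman.</p>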
<table>
  <thead>
      <tr>
          <th>Concept</th>
          <th>Meaning</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Parent table</td>
          <td>entry point of the partitioned table</td>
      </tr>
      <tr>
          <td>Child table</td>
          <td>the partition that actually stores data</td>
      </tr>
      <tr>
          <td>Premake</td>
          <td>creates future partitions ahead of time</td>
      </tr>
      <tr>
          <td>Retention</td>
          <td>automatically detaches / drops old partitions</td>
      </tr>
      <tr>
          <td>Maintenance</td>
          <td>the job that creates new partitions and applies retention</td>
      </tr>
      <tr>
          <td>Template</td>
          <td>the template from which child partitions inherit indexes / constraints</td>
      </tr>
  </tbody>
</table>
<p>Premake protects inserts from hitting a partition that does not exist. If partition creation falls behind the clock, application inserts fail or land in the default partition; production should alert on the future partition count.</p>
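<p>One way to feed such an alert is to list the child partitions with their bounds straight from the catalog. This is a minimal sketch that assumes the parent table is named <code>events</code>; the check for which bounds lie in the future is left to the monitoring side.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Each child of events with its partition bound, e.g.
-- FOR VALUES FROM ('2026-06-01 ...') TO ('2026-07-01 ...')
SELECT c.oid::regclass                     AS partition_name,
       pg_get_expr(c.relpartbound, c.oid) AS bounds
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'events'::regclass
ORDER BY c.relname;
</code></pre></div>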
<p>Retention is a data lifecycle operation. Dropping an old partition is fast, but first confirm legal retention, backups, analytics dependencies, and downstream CDC.</p>
<h2 id="setup-pattern">Setup Pattern</h2>
<p>This section puts the pg_partman rollout behind a migration gate.</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="n">EXTENSION</span><span class="w"> </span><span class="k">IF</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">EXISTS</span><span class="w"> </span><span class="n">pg_partman</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">events</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">  </span><span class="n">id</span><span class="w"> </span><span class="n">bigserial</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">  </span><span class="n">tenant_id</span><span class="w"> </span><span class="n">uuid</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">  </span><span class="n">created_at</span><span class="w"> </span><span class="n">timestamptz</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w">  </span><span class="n">payload</span><span class="w"> </span><span class="n">jsonb</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="w">
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w"> </span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">RANGE</span><span class="w"> </span><span class="p">(</span><span class="n">created_at</span><span class="p">);</span></span></span></code></pre></div><p>Create the actual partman config according to your pg_partman version and your provider's support documentation. Managed PostgreSQL may restrict the extension version, background workers, or the scheduler, so confirm the provider boundary before setup.</p>
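<p>As one hedged illustration of the config step, the sketch below assumes pg_partman 5.x with its objects installed in a schema named <code>partman</code>, and an <code>events</code> table living in <code>public</code>. Argument names and accepted values differ between major versions (4.x, for example, uses a different <code>p_type</code> and interval spelling), so treat every value here as an assumption to check against the documentation for your version.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Assumption: pg_partman 5.x, functions reachable under the partman schema.
SELECT partman.create_parent(
  p_parent_table := 'public.events',
  p_control      := 'created_at',   -- partition key column
  p_interval     := '1 month',      -- one partition per month
  p_premake      := 4               -- keep 4 future partitions ready
);

-- Retention is stored in the config table; here old partitions are
-- detached and kept as standalone tables rather than dropped.
UPDATE partman.part_config
SET retention            = '12 months',
    retention_keep_table = true
WHERE parent_table = 'public.events';
</code></pre></div>
<p>Keeping detached tables is the conservative choice: disk is only freed once something drops them.</p>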
<p>Minimum setup evidence:</p>
<ol>
<li>Extension version.</li>
<li>Parent table DDL.</li>
<li>Partition key and interval.</li>
<li>Premake count.</li>
<li>Retention policy.</li>
<li>Maintenance job schedule.</li>
<li>Test inserts into the current and future partitions.</li>
</ol>
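<p>Items 1 and 7 can be captured with plain SQL. A minimal sketch against the <code>events</code> table defined above; the all-zero tenant UUID is a placeholder, and the transaction is rolled back so no test rows remain.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Item 1: installed extension version.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_partman';

-- Item 7: insert one row for now and one a month ahead, and report
-- which partition each row was routed to.
BEGIN;
INSERT INTO events (tenant_id, created_at, payload)
VALUES ('00000000-0000-0000-0000-000000000000', now(), '{}'::jsonb),
       ('00000000-0000-0000-0000-000000000000', now() + interval '1 month', '{}'::jsonb)
RETURNING created_at, tableoid::regclass AS landed_in;
ROLLBACK;
</code></pre></div>
<p>A row that reports the default partition, or an insert that errors out, means premake is not covering that point in time.</p>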
<h2 id="maintenance-runbook">Maintenance Runbook</h2>
<p>This section makes the partition lifecycle observable.</p>
<table>
  <thead>
      <tr>
          <th>Signal</th>
          <th>Meaning</th>
          <th>Response</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>future partition count</td>
          <td>whether premake is sufficient</td>
          <td>run maintenance manually, fix the scheduler</td>
      </tr>
      <tr>
          <td>default partition rows</td>
          <td>routing failure or a missing partition</td>
          <td>create the partition, move the rows, fix the application timestamp</td>
      </tr>
      <tr>
          <td>old partition count</td>
          <td>whether retention is running</td>
          <td>check the policy, legal holds, job errors</td>
      </tr>
      <tr>
          <td>maintenance duration</td>
          <td>DDL / lock / catalog pressure</td>
          <td>adjust the schedule, split the table</td>
      </tr>
      <tr>
          <td>index build time</td>
          <td>cost of building child indexes</td>
          <td>review the template / concurrent strategy</td>
      </tr>
  </tbody>
</table>
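<p>The default-partition signal can be read from the catalog without relying on a naming convention. The sketch assumes the parent is <code>events</code>; the name <code>events_default</code> in the second query is an assumption and should be replaced with whatever the first query returns.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Which table is the default partition of events (PostgreSQL 11+)?
-- Shows "-" when the table has no default partition.
SELECT partdefid::regclass AS default_partition
FROM pg_partitioned_table
WHERE partrelid = 'events'::regclass;

-- How many rows were misrouted into it, and over which time range?
SELECT count(*)        AS stray_rows,
       min(created_at) AS oldest,
       max(created_at) AS newest
FROM events_default;  -- assumed name; use the result of the query above
</code></pre></div>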
<p>The maintenance job needs an owner. Cron, pg_cron, a background worker, a Kubernetes job, or a managed scheduler all work; what matters is that a job failure raises an alert and someone acts on it.</p>
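<p>If pg_cron is the chosen scheduler, the job and its failure signal might look like the sketch below. It assumes a pg_cron version that supports named jobs and <code>cron.job_run_details</code>, and pg_partman installed in a schema named <code>partman</code>; the job name and hourly schedule are illustrative.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Run pg_partman maintenance at the top of every hour.
SELECT cron.schedule(
  'partman-maintenance',
  '0 * * * *',
  $$CALL partman.run_maintenance_proc()$$
);

-- Recent runs; alert on any status other than 'succeeded'.
SELECT jobid, status, return_message, start_time, end_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 5;
</code></pre></div>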
<h2 id="migration-and-backfill">Migration and Backfill</h2>
<p>This section covers converting an existing large table into partman-managed partitions. That is usually riskier than introducing pg_partman on a new table.</p>
<table>
  <thead>
      <tr>
          <th>Phase</th>
          <th>Evidence</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Audit</td>
          <td>table size, query pattern, write rate</td>
      </tr>
      <tr>
          <td>New schema</td>
          <td>parent table, child partition, index</td>
      </tr>
      <tr>
          <td>Backfill</td>
          <td>batch size, lag, lock, checksum</td>
      </tr>
      <tr>
          <td>Dual write</td>
          <td>app compatibility</td>
      </tr>
      <tr>
          <td>Cutover</td>
          <td>rename / view / routing switch</td>
      </tr>
      <tr>
          <td>Cleanup</td>
          <td>old table retention, rollback</td>
      </tr>
  </tbody>
</table>
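<p>Backfill has to keep WAL volume, replica lag, autovacuum, index bloat, and locks under control. For a large table, start with a shadow table or the partition redesign playbook instead of rebuilding in place, and stay out of peak traffic.</p>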
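<p>A minimal sketch of one backfill batch, bounded by a time window so each transaction stays small. The names are hypothetical: <code>events_old</code> is the existing unpartitioned table and <code>events</code> is the new partitioned parent with the same columns.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Copy one month per batch; repeat with the next window after checking lag.
INSERT INTO events (id, tenant_id, created_at, payload)
SELECT id, tenant_id, created_at, payload
FROM events_old
WHERE created_at &gt;= '2026-04-01'
  AND created_at &lt;  '2026-05-01';

-- Cheap consistency check for the same window: the two counts should match.
SELECT
  (SELECT count(*) FROM events_old
    WHERE created_at &gt;= '2026-04-01' AND created_at &lt; '2026-05-01') AS old_rows,
  (SELECT count(*) FROM events
    WHERE created_at &gt;= '2026-04-01' AND created_at &lt; '2026-05-01') AS new_rows;
</code></pre></div>
<p>A row count is only the weakest form of the checksum evidence listed in the table; comparing hashes of the row contents is stricter.</p>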
<h2 id="failure-modes">Failure Modes</h2>
<p>This section lists the common pg_partman incidents.</p>
<table>
  <thead>
      <tr>
          <th>Failure mode</th>
          <th>Signal</th>
          <th>Remediation</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Future partitions not created</td>
          <td>inserts fail or the default partition grows</td>
          <td>create the missing partitions, fix the maintenance schedule</td>
      </tr>
      <tr>
          <td>Retention drops too early</td>
          <td>queries are missing historical data</td>
          <td>restore from backup, adjust the policy, legal review</td>
      </tr>
      <tr>
          <td>Managed provider lacks support</td>
          <td>extension / worker restrictions</td>
          <td>switch to a manual partition job or another provider</td>
      </tr>
      <tr>
          <td>Index / constraint drift</td>
          <td>child partition schemas are inconsistent</td>
          <td>template review, schema diff</td>
      </tr>
      <tr>
          <td>Planner pruning not effective</td>
          <td>query does not filter on the partition key</td>
          <td>query rewrite, index review</td>
      </tr>
  </tbody>
</table>
<p>pg_partman incidents are usually lifecycle incidents. The runbook should check the maintenance job first, then partition metadata and application queries.</p>
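<p>Two of the failure modes above have quick catalog-level checks. The sketch assumes the parent table is <code>events</code>, partitioned on <code>created_at</code>; the date window is illustrative.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Pruning: with a filter on the partition key, the plan should list only
-- the partitions covering May 2026, not every child.
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM events
WHERE created_at &gt;= '2026-05-01'
  AND created_at &lt;  '2026-06-01';

-- Drift: index count per child. A child whose count differs from its
-- siblings is a candidate for a schema diff.
SELECT i.inhrelid::regclass AS partition_name,
       count(x.indexrelid)  AS index_count
FROM pg_inherits i
LEFT JOIN pg_index x ON x.indrelid = i.inhrelid
WHERE i.inhparent = 'events'::regclass
GROUP BY i.inhrelid
ORDER BY index_count;
</code></pre></div>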
<h2 id="下一步路由">Where to go next</h2>
<p>After pg_partman advanced: for partition design, read <a href="../declarative-partitioning/">Declarative Partitioning</a>; for re-layout strategy, read <a href="../partition-redesign/">Partition Redesign</a>; for the migration gate, read <a href="../online-schema-change/">Online Schema Change</a>.</p>
]]></content:encoded></item></channel></rss>