<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Json on Tarragon</title><link>https://tarrragon.github.io/blog/tags/json/</link><description>Recent content in Json on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Fri, 19 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/json/index.xml" rel="self" type="application/rss+xml"/><item><title>Module 2: Log Schema Design</title><link>https://tarrragon.github.io/blog/monitoring/02-log-schema/</link><pubDate>Fri, 19 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/monitoring/02-log-schema/</guid><description>&lt;p>Answers the question "what does an event look like?". The schema is the contract and single source of truth (SOT) for every SDK and collector.&lt;/p>
&lt;h2 id="待寫章節">Chapters to write&lt;/h2>
&lt;ul>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> Full field-by-field walkthrough of event.schema.json&lt;/li>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> Field design principles (source identifies the origin / data holds free-form fields / v tracks version evolution)&lt;/li>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> Schema evolution strategy (backward-compatible incremental changes)&lt;/li>
&lt;li>&lt;input checked="" disabled="" type="checkbox"> Side-by-side comparison with the OpenTelemetry schema&lt;/li>
&lt;/ul>
&lt;h2 id="跨分類引用">Cross-category references&lt;/h2>
&lt;ul>
&lt;li>SOT repo: &lt;code>schema/event.schema.json&lt;/code> in &lt;a href="https://github.com/tarrragon/monitor">tarrragon/monitor&lt;/a>&lt;/li>
&lt;li>← &lt;a href="https://tarrragon.github.io/blog/testing/02-client-observability/" data-link-title="模組二：客戶端可觀測性" data-link-desc="連線生命週期 log、protocol 訊息 log、使用者行為 log — log 設計是功能規格的一部分">testing Module 2&lt;/a>: events produced by the log-point design must conform to this schema&lt;/li>
&lt;li>→ &lt;a href="https://tarrragon.github.io/blog/monitoring/07-security-privacy/" data-link-title="模組七：資安與隱私" data-link-desc="SDK redaction / transport 加密 / collector access control / 去識別化 — 蒐集的資料本身就是風險資產">monitoring Module 7 (Security)&lt;/a>: which schema fields need redaction&lt;/li>
&lt;/ul></description><content:encoded><![CDATA[<p>Answers the question "what does an event look like?". The schema is the contract and single source of truth (SOT) for every SDK and collector.</p>
<h2 id="待寫章節">Chapters to write</h2>
<ul>
<li><input checked="" disabled="" type="checkbox"> Full field-by-field walkthrough of event.schema.json</li>
<li><input checked="" disabled="" type="checkbox"> Field design principles (source identifies the origin / data holds free-form fields / v tracks version evolution)</li>
<li><input checked="" disabled="" type="checkbox"> Schema evolution strategy (backward-compatible incremental changes)</li>
<li><input checked="" disabled="" type="checkbox"> Side-by-side comparison with the OpenTelemetry schema</li>
</ul>
<h2 id="跨分類引用">Cross-category references</h2>
<ul>
<li>SOT repo: <code>schema/event.schema.json</code> in <a href="https://github.com/tarrragon/monitor">tarrragon/monitor</a></li>
<li>← <a href="/blog/testing/02-client-observability/" data-link-title="模組二：客戶端可觀測性" data-link-desc="連線生命週期 log、protocol 訊息 log、使用者行為 log — log 設計是功能規格的一部分">testing Module 2</a>: events produced by the log-point design must conform to this schema</li>
<li>→ <a href="/blog/monitoring/07-security-privacy/" data-link-title="模組七：資安與隱私" data-link-desc="SDK redaction / transport 加密 / collector access control / 去識別化 — 蒐集的資料本身就是風險資產">monitoring Module 7 (Security)</a>: which schema fields need redaction</li>
</ul>
]]></content:encoded></item><item><title>MySQL 8.0 Modern SQL: CTE / window function / JSON_TABLE are not "finally catching up with PG" but the entry ticket to serious SQL engineering</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/modern-sql-features/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/modern-sql-features/</guid><description>&lt;blockquote>
&lt;p>This article is the implementation-layer deep dive that accompanies the &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL&lt;/a> overview. The overview already covers where MySQL sits in the OLTP landscape; this article focuses on the &lt;em>modern SQL features of 8.0&lt;/em>: five key capabilities, each compared with its PostgreSQL counterpart.&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;p>The idea that "MySQL is the simple version of SQL" is out of date.&lt;/p>
&lt;p>The idea has a reasonable origin: in the MySQL 5.x era there were no CTEs, window functions had to be emulated with hacks, recursive queries could not be written, JSON handling often meant substring manipulation on strings, and complex analytical queries had to be shipped off to PostgreSQL or Snowflake. For roughly ten years MySQL lacked the advanced SQL features that PostgreSQL already had.&lt;/p>
&lt;p>MySQL 8.0 (GA in 2018) changed that. CTEs, window functions, lateral derived tables, JSON_TABLE, hash join, atomic DDL, and roles all arrived over the 8.0 release series. &lt;strong>This is not "finally catching up with PG"; it is the first time MySQL qualifies for a serious SQL engineering discussion&lt;/strong>. There are caveats, though: each feature's &lt;em>implementation behavior&lt;/em> differs &lt;em>subtly&lt;/em> from its PostgreSQL counterpart, so PG experience cannot be assumed to transfer directly.&lt;/p>
&lt;p>For readers coming from PostgreSQL to evaluate MySQL, this article is a &lt;em>feature-parity check&lt;/em>: which 8.0 features are genuinely production-ready and which look good on paper but have implementation gaps. For existing MySQL 5.7 users, it is the &lt;em>concrete ROI of upgrading from 5.7 to 8.0&lt;/em>: whether the upgrade is worthwhile from a SQL feature standpoint.&lt;/p>
&lt;h2 id="5-個關鍵特性--pg-對比">Five key features, compared with PG&lt;/h2>
&lt;h3 id="特性-1ctecommon-table-expression">Feature 1: CTE (Common Table Expression)&lt;/h3>
&lt;p>Supported by both MySQL 8.0 and PG 8.4+.&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="c1">-- Works on both MySQL 8.0 and PG
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">WITH&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">order_summary&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">AS&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">user_id&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">SUM&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">amount&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">AS&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">total&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">FROM&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">orders&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">WHERE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">created_at&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;gt;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s1">&amp;#39;2026-01-01&amp;#39;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="k">GROUP&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">BY&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">user_id&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">u&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">name&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">os&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">total&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">FROM&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">users&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">u&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">JOIN&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">order_summary&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">os&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">ON&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">u&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">os&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">user_id&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="k">WHERE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">os&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="n">total&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">&amp;gt;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="mi">1000&lt;/span>&lt;span class="p">;&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Behavior differences&lt;/strong>:&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>This article is the implementation-layer deep dive that accompanies the <a href="/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL</a> overview. The overview already covers where MySQL sits in the OLTP landscape; this article focuses on the <em>modern SQL features of 8.0</em>: five key capabilities, each compared with its PostgreSQL counterpart.</p></blockquote>
<hr>
<p>The idea that "MySQL is the simple version of SQL" is out of date.</p>
<p>The idea has a reasonable origin: in the MySQL 5.x era there were no CTEs, window functions had to be emulated with hacks, recursive queries could not be written, JSON handling often meant substring manipulation on strings, and complex analytical queries had to be shipped off to PostgreSQL or Snowflake. For roughly ten years MySQL lacked the advanced SQL features that PostgreSQL already had.</p>
<p>MySQL 8.0 (GA in 2018) changed that. CTEs, window functions, lateral derived tables, JSON_TABLE, hash join, atomic DDL, and roles all arrived over the 8.0 release series. <strong>This is not "finally catching up with PG"; it is the first time MySQL qualifies for a serious SQL engineering discussion</strong>. There are caveats, though: each feature's <em>implementation behavior</em> differs <em>subtly</em> from its PostgreSQL counterpart, so PG experience cannot be assumed to transfer directly.</p>
<p>For readers coming from PostgreSQL to evaluate MySQL, this article is a <em>feature-parity check</em>: which 8.0 features are genuinely production-ready and which look good on paper but have implementation gaps. For existing MySQL 5.7 users, it is the <em>concrete ROI of upgrading from 5.7 to 8.0</em>: whether the upgrade is worthwhile from a SQL feature standpoint.</p>
<h2 id="5-個關鍵特性--pg-對比">Five key features, compared with PG</h2>
<h3 id="特性-1ctecommon-table-expression">Feature 1: CTE (Common Table Expression)</h3>
<p>Supported by both MySQL 8.0 and PG 8.4+.</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- Works on both MySQL 8.0 and PG
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">WITH</span><span class="w"> </span><span class="n">order_summary</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">    </span><span class="k">SELECT</span><span class="w"> </span><span class="n">user_id</span><span class="p">,</span><span class="w"> </span><span class="k">SUM</span><span class="p">(</span><span class="n">amount</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">total</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="k">WHERE</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="s1">&#39;2026-01-01&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">    </span><span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">user_id</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">u</span><span class="p">.</span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">os</span><span class="p">.</span><span class="n">total</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">users</span><span class="w"> </span><span class="n">u</span><span class="w"> </span><span class="k">JOIN</span><span class="w"> </span><span class="n">order_summary</span><span class="w"> </span><span class="n">os</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">u</span><span class="p">.</span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">os</span><span class="p">.</span><span class="n">user_id</span><span class="w">
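</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="n">os</span><span class="p">.</span><span class="n">total</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">1000</span><span class="p">;</span></span></span></code></pre></div><p><strong>Behavior differences</strong>:</p>
<ul>
<li><strong>MySQL 8.0</strong>: a CTE is handled like a derived table. The optimizer <em>merges it into the outer query</em> when it can and <em>materializes</em> it otherwise (for example when it contains GROUP BY or aggregates, as above). A materialized CTE is materialized once per query, even when it is referenced several times</li>
<li><strong>PostgreSQL (&lt; 12)</strong>: a CTE is an <em>optimization fence by default</em> (materialization barrier), so the planner does not push predicates into it</li>
<li><strong>PostgreSQL (12+)</strong>: a non-recursive, side-effect-free CTE that is referenced once is inlined by default, which brings the behavior close to MySQL's; the <code>MATERIALIZED</code> / <code>NOT MATERIALIZED</code> keywords make the choice explicit</li>
</ul>
<p>For PG 12+ users, CTE experience largely carries over to MySQL. For users on PG 11 or earlier, CTEs behave differently from MySQL, so query plans need to be re-examined.</p>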
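<p>As a minimal sketch of the PG 12+ keywords, the <code>order_summary</code> CTE from the example above can be pinned to either behavior. The keywords are PostgreSQL-specific, and the filter value <code>42</code> is only an illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- PostgreSQL 12+: keep the pre-12 fence (compute the CTE once, no predicate pushdown)
WITH order_summary AS MATERIALIZED (
    SELECT user_id, SUM(amount) AS total
    FROM orders
    GROUP BY user_id
)
SELECT * FROM order_summary WHERE user_id = 42;

-- PostgreSQL 12+: ask the planner to inline the CTE so the outer filter can reach orders
WITH order_summary AS NOT MATERIALIZED (
    SELECT user_id, SUM(amount) AS total
    FROM orders
    GROUP BY user_id
)
SELECT * FROM order_summary WHERE user_id = 42;
</code></pre></div>
<p>Running both under <code>EXPLAIN</code> is a quick way to see whether the filter on <code>user_id</code> is applied before or after the aggregation.</p>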
<p><strong>Recursive CTE</strong>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">WITH</span><span class="w"> </span><span class="k">RECURSIVE</span><span class="w"> </span><span class="n">org_chart</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">    </span><span class="k">SELECT</span><span class="w"> </span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">manager_id</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">depth</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="k">FROM</span><span class="w"> </span><span class="n">employees</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">manager_id</span><span class="w"> </span><span class="k">IS</span><span class="w"> </span><span class="k">NULL</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">    </span><span class="k">UNION</span><span class="w"> </span><span class="k">ALL</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">    </span><span class="k">SELECT</span><span class="w"> </span><span class="n">e</span><span class="p">.</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">e</span><span class="p">.</span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">e</span><span class="p">.</span><span class="n">manager_id</span><span class="p">,</span><span class="w"> </span><span class="n">oc</span><span class="p">.</span><span class="n">depth</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="mi">1</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">    </span><span class="k">FROM</span><span class="w"> </span><span class="n">employees</span><span class="w"> </span><span class="n">e</span><span class="w"> </span><span class="k">JOIN</span><span class="w"> </span><span class="n">org_chart</span><span class="w"> </span><span class="n">oc</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">e</span><span class="p">.</span><span class="n">manager_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">oc</span><span class="p">.</span><span class="n">id</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">org_chart</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">depth</span><span class="w"> </span><span class="o">&lt;=</span><span class="w"> </span><span class="mi">10</span><span class="p">;</span></span></span></code></pre></div><p>Both engines support this, but MySQL 8.0 enforces a <em>recursion depth limit</em> (<code>cte_max_recursion_depth</code>, default 1000; PG has no built-in limit). MySQL aborts the statement with an error once the limit is exceeded, so deep hierarchical queries (depth &gt; 1000) need the limit raised explicitly.</p>
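<p>A minimal sketch of raising the limit in MySQL 8.0, either for the session or for a single statement. The CTE name <code>seq</code> and the value 10000 are illustrative assumptions:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- MySQL 8.0: raise the limit for the current session only
SET SESSION cte_max_recursion_depth = 10000;

-- MySQL 8.0: or scope the change to one statement with the SET_VAR optimizer hint
WITH RECURSIVE seq (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n &lt; 5000
)
SELECT /*+ SET_VAR(cte_max_recursion_depth = 10000) */ COUNT(*) AS row_count
FROM seq;
</code></pre></div>
<p>The statement-level hint leaves the session default in place, so a runaway recursion elsewhere still hits the 1000-level guard.</p>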
<h3 id="特性-2window-function">Feature 2: Window Function</h3>
<p>Supported by both MySQL 8.0 and PG 8.4+, with syntax that follows the SQL standard.</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">    </span><span class="n">order_id</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">user_id</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">    </span><span class="n">amount</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">    </span><span class="k">SUM</span><span class="p">(</span><span class="n">amount</span><span class="p">)</span><span class="w"> </span><span class="n">OVER</span><span class="w"> </span><span class="p">(</span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">user_id</span><span class="w"> </span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">created_at</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">running_total</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">    </span><span class="n">RANK</span><span class="p">()</span><span class="w"> </span><span class="n">OVER</span><span class="w"> </span><span class="p">(</span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">user_id</span><span class="w"> </span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">amount</span><span class="w"> </span><span class="k">DESC</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">rank_in_user</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="p">;</span></span></span></code></pre></div><p><strong>Behavior differences</strong>:</p>
<ul>
<li><strong>Execution plan</strong>: MySQL 8.0 uses a <em>window iterator</em> that sorts within each partition and adds an in-memory window buffer. PostgreSQL has a more mature <em>WindowAgg node</em> and handles complex frame specs better</li>
<li><strong>Frame spec support</strong>: both support ROWS and RANGE frames. The <em>GROUPS</em> frame unit is available only in PG (11+); MySQL 8.0 parses <code>GROUPS</code> but rejects it with an error, and the same applies to the frame <code>EXCLUDE</code> clause</li>
<li><strong>Spill behavior on large data</strong>: when the sort behind a MySQL window function exceeds <code>sort_buffer_size</code> (default 256 KB) it spills to disk and performance drops sharply. PG uses <code>work_mem</code> (default 4 MB), which is roomier but also spills</li>
</ul>
<p>For users who have long written complex reporting queries with PG window functions: MySQL 8.0 can run them, but the <em>performance tuning</em> effort is substantial and it is not a drop-in replacement.</p>
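<p>The sketch below shows a portable explicit frame plus the per-session memory knobs named above. It reuses the <code>orders</code> columns from the earlier example; the window name <code>w</code> and the buffer sizes are illustrative assumptions, not recommendations:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Works on both MySQL 8.0 and PG: 7-row moving sum per user.
-- The explicit ROWS frame replaces the default RANGE ... UNBOUNDED PRECEDING frame.
SELECT
    order_id,
    user_id,
    amount,
    SUM(amount) OVER w AS moving_sum_7
FROM orders
WINDOW w AS (
    PARTITION BY user_id
    ORDER BY created_at
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
);

-- MySQL only: enlarge the per-session sort buffer before a heavy report
SET SESSION sort_buffer_size = 4194304;  -- 4 MB

-- PostgreSQL only: the corresponding per-session setting
SET work_mem = '64MB';
</code></pre></div>
<p>Both settings apply per session, which keeps a reporting connection from changing memory behavior for OLTP traffic.</p>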
<h3 id="特性-3json_tablepg-主要賣點對比">Feature 3: JSON_TABLE (compared with PG's main selling point)</h3>
<p>JSON support is the comparison that PG users raise most often.</p>
<p><strong>JSON_TABLE in MySQL 8.0</strong>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">.</span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">.</span><span class="n">price</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="n">t</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">     </span><span class="n">JSON_TABLE</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">         </span><span class="n">t</span><span class="p">.</span><span class="n">metadata</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">         </span><span class="s1">&#39;$.variants[*]&#39;</span><span class="w"> </span><span class="n">COLUMNS</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">             </span><span class="n">name</span><span class="w"> </span><span class="nb">VARCHAR</span><span class="p">(</span><span class="mi">50</span><span class="p">)</span><span class="w"> </span><span class="n">PATH</span><span class="w"> </span><span class="s1">&#39;$.name&#39;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">             </span><span class="n">price</span><span class="w"> </span><span class="nb">DECIMAL</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span><span class="mi">2</span><span class="p">)</span><span class="w"> </span><span class="n">PATH</span><span class="w"> </span><span class="s1">&#39;$.price&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">         </span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">     </span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">j</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">category</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;shoes&#39;</span><span class="p">;</span></span></span></code></pre></div><p>JSON_TABLE expands the array elements inside a JSON document into <em>relational rows</em>, which can then be used with JOIN / WHERE / GROUP BY. It is defined by the SQL:2016 standard.</p>
<p><strong>PostgreSQL counterparts</strong>:</p>
<p>PG 17+ has <code>JSON_TABLE</code> (SQL:2016 standard, with largely the same syntax as MySQL), but historically PG users have taken two different routes:</p>
<ol>
<li>
<p><strong>JSONB operators</strong> (PG 9.4+):</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">metadata</span><span class="o">-&gt;</span><span class="s1">&#39;variants&#39;</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">variants</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">@&gt;</span><span class="w"> </span><span class="s1">&#39;{&#34;category&#34;: &#34;shoes&#34;}&#39;</span><span class="p">;</span></span></span></code></pre></div></li>
<li>
<p><strong>jsonb_path_query</strong> (PG 12+):</p>





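<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">v</span><span class="o">-&gt;&gt;</span><span class="s1">&#39;name&#39;</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="p">(</span><span class="n">v</span><span class="o">-&gt;&gt;</span><span class="s1">&#39;price&#39;</span><span class="p">)::</span><span class="nb">numeric</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">price</span><span class="w">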
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="n">t</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">     </span><span class="n">jsonb_path_query</span><span class="p">(</span><span class="n">t</span><span class="p">.</span><span class="n">metadata</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;$.variants[*]&#39;</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">v</span><span class="p">;</span></span></span></code></pre></div></li>
</ol>
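<p>For comparison with the two historical routes, here is a sketch of the PG 17+ <code>JSON_TABLE</code> form of the MySQL query shown earlier. It assumes <code>products.metadata</code> is a <code>jsonb</code> column and swaps in PG column types:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- PostgreSQL 17+: same shape as the MySQL JSON_TABLE query, with PG types
SELECT t.id, j.name, j.price
FROM products t,
     JSON_TABLE(
         t.metadata,
         '$.variants[*]' COLUMNS (
             name  text    PATH '$.name',
             price numeric PATH '$.price'
         )
     ) AS j
WHERE t.category = 'shoes';
</code></pre></div>
<p>Only the column type names differ from the MySQL version, which is the portability argument for the standard syntax.</p>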
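<p><strong>Core differences</strong>:</p>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th>MySQL JSON_TABLE</th>
          <th>PG JSONB operator</th>
          <th>PG jsonb_path_query</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Index</td>
          <td>Typically needs a <em>generated column + regular index</em> (or a functional index from 8.0.13, or a multi-valued index on arrays from 8.0.17); there is no GIN-style index over arbitrary JSON paths</td>
          <td><strong>GIN index directly over JSONB</strong></td>
          <td>Can use a GIN index when the filter is written with the jsonpath operators <code>@?</code> / <code>@@</code></td>
      </tr>
      <tr>
          <td>Storage</td>
          <td>Native JSON type in an internal binary format; storage footprint roughly that of LONGTEXT / LONGBLOB</td>
          <td>JSONB = binary, TOAST-compressed when large, index-friendly</td>
          <td>Same as left</td>
      </tr>
      <tr>
          <td>Query efficiency (complex paths)</td>
          <td>Medium (needs a generated column to speed up)</td>
          <td>High (direct GIN index)</td>
          <td>High</td>
      </tr>
      <tr>
          <td>SQL standard alignment</td>
          <td>High (JSON_TABLE is standard)</td>
          <td>Low (JSONB operators are PG-specific)</td>
          <td>Medium (jsonpath is standard)</td>
      </tr>
      <tr>
          <td>Large JSON (&gt; 1 MB)</td>
          <td>Can be stored, but queries are slow</td>
          <td>JSONB with TOAST compression</td>
          <td>Same as left</td>
      </tr>
  </tbody>
</table>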
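<p><strong>Selection guidance</strong>:</p>
<ul>
<li><strong>JSON as incidental storage in MySQL</strong> (documents stored alongside relational data): JSON_TABLE is sufficient; paired with generated columns and indexes, it is production-ready</li>
<li><strong>Document-heavy workloads</strong> (many JSON-driven queries, complex paths, high selectivity): PG JSONB with a GIN index remains the <em>clear winner</em>; the alternative is to use MongoDB directly</li>
<li><strong>MySQL 8.0 JSON is not a replacement for PG JSONB</strong>: JSON_TABLE gives <em>SQL standard alignment</em> and good portability, but <em>indexing remains the weak point</em></li>
</ul>
<p>On the claim that "JSON is PG's main selling point": JSONB binary storage plus GIN indexing is a <em>structural advantage</em> for PG in JSON workloads. MySQL 8.0 added JSON_TABLE but <em>did not close the gap at the index layer</em>. Since 8.0, JSON is <em>no longer a deal-breaker for MySQL</em> (unlike the 5.7 era, when it was disqualifying), but it is still not MySQL's home turf.</p>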
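<p>The index-layer gap can be sketched with the DDL each engine needs. The JSON key <code>brand</code>, the column name, and the index names are hypothetical; <code>products.metadata</code> is assumed to be <code>JSON</code> in MySQL and <code>jsonb</code> in PG:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- MySQL 8.0: expose one JSON path as a generated column, then index that column
ALTER TABLE products
    ADD COLUMN brand VARCHAR(50)
        GENERATED ALWAYS AS (metadata-&gt;&gt;'$.brand') STORED,
    ADD INDEX idx_products_brand (brand);

-- The query must filter on the generated column to use the index
SELECT id FROM products WHERE brand = 'acme';

-- PostgreSQL: one GIN index serves containment queries on any key in the document
CREATE INDEX idx_products_metadata ON products USING GIN (metadata jsonb_path_ops);

SELECT id FROM products WHERE metadata @&gt; '{"brand": "acme"}';
</code></pre></div>
<p>In MySQL every path that needs to be fast gets its own column and index, whereas the single PG index covers keys that were not anticipated at design time.</p>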
<h3 id="特性-4lateral-derived-table">Feature 4: Lateral Derived Table</h3>
<p>Supported by both MySQL 8.0.14+ and PG 9.3+.</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- For each user, fetch their 5 most recent orders
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">u</span><span class="p">.</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">recent</span><span class="p">.</span><span class="o">*</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">users</span><span class="w"> </span><span class="n">u</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="k">LEFT</span><span class="w"> </span><span class="k">JOIN</span><span class="w"> </span><span class="k">LATERAL</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">    </span><span class="k">SELECT</span><span class="w"> </span><span class="n">order_id</span><span class="p">,</span><span class="w"> </span><span class="n">amount</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">    </span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="w"> </span><span class="n">o</span><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w">    </span><span class="k">WHERE</span><span class="w"> </span><span class="n">o</span><span class="p">.</span><span class="n">user_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">u</span><span class="p">.</span><span class="n">id</span><span class="w">
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="w">    </span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="k">DESC</span><span class="w"> </span><span class="k">LIMIT</span><span class="w"> </span><span class="mi">5</span><span class="w">
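</span></span></span><span class="line"><span class="ln">9</span><span class="cl"><span class="w"></span><span class="p">)</span><span class="w"> </span><span class="n">recent</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="k">true</span><span class="p">;</span></span></span></code></pre></div><p>LATERAL lets the subquery <em>reference a column of a preceding table</em> (<code>u.id</code>). An ordinary derived table in <code>FROM</code> cannot see columns of the other tables in the same <code>FROM</code> clause, so this per-user <code>LIMIT 5</code> cannot be written as a plain derived table.</p>
<p><strong>Behavioral differences</strong>:</p>
<ul>
<li>MySQL 8.0: LATERAL arrived late (8.0.14), the optimizer's plans for it are still evolving, and complex lateral queries can end up with suboptimal plans</li>
<li>PostgreSQL: LATERAL has been mature for years (9.3+), the planner folds it into ordinary join planning, and it is efficient</li>
</ul>
<p>For users with PG experience who write reporting queries with LATERAL: MySQL 8.0 can run them, but reaching the best plan sometimes takes optimizer hints.</p>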
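<h3 id="特性-5hash-join">Feature 5: Hash Join</h3>
<p>MySQL 8.0.18+; PG has had it for a long time.</p>
<p><strong>Before MySQL 8.0.18</strong>: only <em>nested loop join</em> (including its block nested loop variant) was available, so a large-table JOIN without a usable index degenerated toward n × m row comparisons. 8.0.18 added hash join, which the optimizer picks automatically in place of block nested loop.</p>
<p><strong>Note</strong>: MySQL 8.0 does <em>not use hash join for every join</em>. In 8.0.18 it applies only to equi-joins where <em>no index</em> can serve the join condition; from 8.0.20 it replaces block nested loop more broadly, including non-equi joins. The <code>optimizer_switch='hash_join=on'</code> flag only has an effect in 8.0.18. A common misjudgment: a complex join is expected to use hash join, the optimizer picks a nested loop with a poor access path instead, and the query effectively never finishes.</p>
<p><strong>PG counterpart</strong>: PG has always had hash join, its planner considers it for a wide range of joins by default, and <em>parallel hash join</em> (PG 11+) speeds up large-table JOINs.</p>
<p>MySQL's hash join <em>fills a gap</em>; it does not put MySQL <em>on par</em>. For complex OLAP queries MySQL is still weaker than PG.</p>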
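<h2 id="其他-80-特性一句話帶過">Other 8.0 features (one line each)</h2>
<ul>
<li><strong>Atomic DDL</strong>: CREATE TABLE / DROP / ALTER become atomic and crash-safe, so crash recovery no longer leaves orphan tables (PG DDL has long been transactional)</li>
<li><strong>Role-based access control</strong>: privileges are granted to a role and users are granted the role, replacing per-user grant lists (PG has long had a role system)</li>
<li><strong>CHECK constraint enforcement</strong>: 5.7 parses CHECK but does not enforce it; 8.0 (from 8.0.16) actually enforces it (PG has always enforced)</li>
<li><strong>Invisible index</strong>: the index exists and is maintained, but the optimizer ignores it for now; useful for staged query-plan testing, such as checking the impact of dropping an index (no native PG counterpart)</li>
<li><strong>Resource Group</strong>: threads can be assigned to a resource group with its own CPU affinity and priority (no native PG counterpart)</li>
<li><strong>Generated column</strong>: available since MySQL 5.7 and strengthened in 8.0; usable as a workaround to speed up JSON path lookups</li>
</ul>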
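<p>As a rough illustration of two items in the list above, the sketch below toggles an invisible index and adds an enforced CHECK constraint. The <code>orders</code> table, its <code>amount</code> column, and the index and constraint names are assumptions made for this example, not taken from a real schema.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical table, index, and constraint names, for illustration only.
-- Hide an existing index from the optimizer without dropping it (MySQL 8.0+).
ALTER TABLE orders ALTER INDEX idx_orders_status INVISIBLE;

-- Re-run the affected queries. If the plans are still acceptable, the index is a
-- drop candidate; otherwise make it visible again, which avoids a rebuild.
ALTER TABLE orders ALTER INDEX idx_orders_status VISIBLE;

-- CHECK constraints are enforced from 8.0.16: a violating INSERT or UPDATE is rejected.
ALTER TABLE orders ADD CONSTRAINT chk_orders_amount_positive CHECK (amount &gt; 0);
</code></pre></div>
<p>Adding a CHECK constraint to an existing table validates the current rows, so on a large table it is worth trying on a copy first.</p>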
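<h2 id="配置-step-by-step從-57--80-sql-feature-升級">Configuration step by step (upgrading SQL features from 5.7 to 8.0)</h2>
<p>On 8.0 already, every feature above is available without extra configuration. When moving from 5.7 to 8.0, go through the following:</p>
<ol>
<li><strong><code>character_set_server=utf8mb4</code></strong>: 8.0 defaults to utf8mb4 (5.7 defaulted to latin1); mismatched character sets cause subtle differences in query behavior</li>
<li><strong><code>default_authentication_plugin=mysql_native_password</code></strong>: 8.0 defaults to caching_sha2_password, which older clients cannot connect with; use mysql_native_password during the cluster upgrade to stay compatible</li>
<li><strong><code>optimizer_switch='hash_join=on'</code></strong>: confirm hash join is enabled; it is ON by default, and from 8.0.19 this flag no longer has any effect</li>
<li><strong><code>cte_max_recursion_depth=10000</code></strong>: raise it (default 1000) when complex recursive CTEs need it</li>
<li><strong>Re-review all ORM-generated SQL</strong>: 8.0 adds keywords (WINDOW, RANK, LATERAL and others became reserved words), so identifiers that worked in 5.7 can become syntax errors</li>
</ol>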
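<p>The checklist above can be verified from a SQL session. The statements below are a minimal sketch for a MySQL 8.0 server; the values returned depend on the server's configuration, so none are shown here.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Inspect the current values of the settings listed above.
SELECT @@character_set_server,
       @@default_authentication_plugin,
       @@cte_max_recursion_depth;

-- Raise the recursion limit for the current session only.
SET SESSION cte_max_recursion_depth = 10000;

-- List the reserved words known to this server (INFORMATION_SCHEMA.KEYWORDS, 8.0.13+),
-- to compare against identifiers used by the application or ORM.
SELECT WORD FROM INFORMATION_SCHEMA.KEYWORDS WHERE RESERVED = 1;
</code></pre></div>
<p>Comparing the reserved-word list against table and column names before the upgrade is cheaper than discovering a syntax error in production.</p>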
<h2 id="5-個-production-踩雷">5 production pitfalls</h2>
<h3 id="1-cte-引用兩次--跑兩次">1. A CTE referenced twice can run twice</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">WITH</span><span class="w"> </span><span class="n">expensive</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="p">...</span><span class="w"> </span><span class="n">heavy</span><span class="w"> </span><span class="n">aggregation</span><span class="w"> </span><span class="p">...)</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">expensive</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="p">...</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">UNION</span><span class="w"> </span><span class="k">ALL</span><span class="w">
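</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">expensive</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">other_condition</span><span class="p">;</span></span></span></code></pre></div><p>The expectation is that the CTE runs once; in practice the work can be done twice and the query time doubles. Whether that happens depends on the plan: when MySQL materializes a CTE it does so once per query, but a CTE that the optimizer merges into the outer query is expanded at every reference. Check <code>EXPLAIN</code> rather than assuming either behavior.</p>
<p>Fixes:</p>
<ul>
<li>Load the CTE result into a <em>temporary table</em> first and run the two SELECTs against it (manual materialization). MySQL cannot reference the same TEMPORARY table twice within one query, so issue them as separate statements</li>
<li>In PG, use the <code>MATERIALIZED</code> keyword (PG 12+). MySQL has no such keyword; the closest tools are the <code>NO_MERGE</code> optimizer hint or a manual temp table</li>
</ul>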
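<p>The manual materialization fix above might look like the following. This is a sketch under assumptions: the aggregation (per-user totals over an <code>orders</code> table) and the name <code>tmp_user_totals</code> are placeholders standing in for the elided heavy query.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical stand-in for the expensive aggregation: computed exactly once.
CREATE TEMPORARY TABLE tmp_user_totals AS
SELECT user_id, SUM(amount) AS total_amount
FROM orders
GROUP BY user_id;

-- Two separate statements, because MySQL cannot open the same
-- TEMPORARY table twice within a single query.
SELECT * FROM tmp_user_totals WHERE total_amount &gt;= 1000;
SELECT * FROM tmp_user_totals WHERE total_amount &lt; 10;

-- Temporary tables disappear with the session; dropping explicitly frees space sooner.
DROP TEMPORARY TABLE tmp_user_totals;
</code></pre></div>
<p>The trade-off is that the application has to combine the two result sets itself instead of relying on <code>UNION ALL</code>.</p>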
<h3 id="2-window-function-大-partition-spill-到-disk">2. Window function over large partitions spills to disk</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="n">order_id</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">       </span><span class="k">SUM</span><span class="p">(</span><span class="n">amount</span><span class="p">)</span><span class="w"> </span><span class="n">OVER</span><span class="w"> </span><span class="p">(</span><span class="n">PARTITION</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">user_id</span><span class="w"> </span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">created_at</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">orders</span><span class="p">;</span><span class="w">  </span><span class="c1">-- 100 million rows</span></span></span></code></pre></div><p><code>sort_buffer_size</code> defaults to 256 KB (a size in bytes, not a row count). Once the rows to be sorted no longer fit in that buffer, the sort spills to disk and execution goes from seconds to minutes.</p>
<p>Fixes:</p>
<ul>
<li>Raise <code>sort_buffer_size</code> (it is allocated per connection, so do not set it too high: connections × buffer eats RAM)</li>
<li>Add an index on <code>(user_id, created_at)</code> so the optimizer can read rows in index order and skip the extra sort</li>
</ul>
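<p>Both fixes above can be tried from a session before changing any server-wide setting. The index name and the 4 MB buffer size below are arbitrary choices for illustration; the right values depend on the workload.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Index name is illustrative; the columns match the window's PARTITION BY / ORDER BY.
ALTER TABLE orders ADD INDEX idx_orders_user_created (user_id, created_at);

-- If a sort is still needed, raise the buffer for this session only
-- (value in bytes; 4 MB is an arbitrary example, not a recommendation).
SET SESSION sort_buffer_size = 4194304;

-- Inspect the plan to see whether a separate sort step remains.
EXPLAIN FORMAT=TREE
SELECT order_id,
       SUM(amount) OVER (PARTITION BY user_id ORDER BY created_at)
FROM orders;
</code></pre></div>
<p>Setting the buffer per session keeps the memory cost limited to the connections that actually run the reporting query.</p>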
<h3 id="3-json_table-跟-generated-column-取捨錯誤">3. Choosing wrongly between JSON_TABLE and a generated column</h3>
<p>Running JSON_TABLE directly on every query:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="n">JSON_TABLE</span><span class="p">(</span><span class="n">metadata</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;$.variants[*]&#39;</span><span class="w"> </span><span class="n">COLUMNS</span><span class="w"> </span><span class="p">(...));</span></span></span></code></pre></div><p>Every query re-expands the JSON for every row with no index to help, so queries on large tables are slow.</p>
<p>Fixes:</p>
<ul>
<li>
<p>Build a generated column for the <em>frequently queried JSON paths</em>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">products</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">ADD</span><span class="w"> </span><span class="k">COLUMN</span><span class="w"> </span><span class="n">category</span><span class="w"> </span><span class="nb">VARCHAR</span><span class="p">(</span><span class="mi">50</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">GENERATED</span><span class="w"> </span><span class="n">ALWAYS</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="p">(</span><span class="n">JSON_UNQUOTE</span><span class="p">(</span><span class="n">metadata</span><span class="o">-&gt;</span><span class="s1">&#39;$.category&#39;</span><span class="p">))</span><span class="w"> </span><span class="n">STORED</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="k">ADD</span><span class="w"> </span><span class="k">INDEX</span><span class="w"> </span><span class="n">idx_category</span><span class="w"> </span><span class="p">(</span><span class="n">category</span><span class="p">);</span></span></span></code></pre></div></li>
<li>
<p>Use JSON_TABLE for <em>ad-hoc queries</em>, not on the hot path</p>
</li>
<li>
<p>Compared with PG JSONB GIN: PG needs no generated column built up front, because a GIN index goes directly over the JSONB value</p>
</li>
</ul>
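<h3 id="4-hash-join-沒觸發--optimizer-預估錯-row-count">4. Hash join not triggered: the optimizer misestimates row counts</h3>
<p>A large-table JOIN is expected to use hash join, but MySQL runs a nested loop and the query never finishes. Common causes:</p>
<ul>
<li>Stale table statistics (<code>ANALYZE TABLE</code> was never run)</li>
<li>The join condition is not an equality (<code>a.id &lt; b.id</code> and the like); before 8.0.20 only equi-joins qualify for hash join</li>
<li>One side has a LIMIT, so the optimizer estimates a small set and picks nested loop</li>
</ul>
<p>Fixes:</p>
<ul>
<li>Run <code>ANALYZE TABLE</code> to refresh statistics</li>
<li>Use <code>EXPLAIN ANALYZE</code> to compare actual row counts against the estimates</li>
<li>Steer the plan with optimizer hints: <code>/*+ HASH_JOIN(t1 t2) */</code> works on 8.0.18 only; from 8.0.20 the <code>BNL</code> / <code>NO_BNL</code> hints control hash join instead</li>
</ul>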
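<p>The first two fixes above can be combined into a short diagnostic session. The sketch reuses the <code>users</code> and <code>orders</code> tables from the LATERAL example earlier in the article; the selected columns are illustrative.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Refresh optimizer statistics for both sides of the join.
ANALYZE TABLE users, orders;

-- EXPLAIN ANALYZE (8.0.18+) actually executes the query and prints, for each plan node,
-- the estimated rows next to the actual rows. A hash join appears as a
-- "hash join" node in the tree; a nested loop appears as "Nested loop".
EXPLAIN ANALYZE
SELECT u.id, o.order_id, o.amount
FROM users u
JOIN orders o ON o.user_id = u.id;
</code></pre></div>
<p>Because <code>EXPLAIN ANALYZE</code> runs the statement, it is safer to try it on a replica or with a restrictive <code>WHERE</code> clause when the original query is the one that never finishes.</p>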
<h3 id="5-recursive-cte-深度上限--production-query-突然-fail">5. Recursive CTE depth limit: a production query suddenly fails</h3>
<p><code>cte_max_recursion_depth</code> defaults to 1000; an organization hierarchy or tree query that needs more than 1000 recursion steps fails outright (error 3636, <code>ER_CTE_MAX_RECURSION_DEPTH</code>).</p>
<p>Fixes:</p>
<ul>
<li>Assess the real hierarchy depth and set <code>cte_max_recursion_depth=10000</code> or higher</li>
<li>Or add <code>WHERE depth &lt; N</code> to the query so it stops early (rather than relying on the implicit limit)</li>
<li>For very large hierarchies (social follow graphs and the like), switch to a <em>graph DB</em> (Neo4j); graph workloads are not where MySQL recursive CTEs shine</li>
</ul>
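<p>The explicit depth guard mentioned above can be sketched as follows. The <code>employees(id, manager_id)</code> table and the limit of 50 levels are assumptions for illustration; <code>depth</code> is a counter column carried through the recursion, not a built-in.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical employees(id, manager_id) table.
WITH RECURSIVE org AS (
    -- Anchor member: the top of the hierarchy starts at depth 1.
    SELECT id, manager_id, 1 AS depth
    FROM employees
    WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: one level further down per iteration.
    SELECT e.id, e.manager_id, org.depth + 1
    FROM employees e
    JOIN org ON e.manager_id = org.id
    WHERE org.depth &lt; 50   -- explicit guard: never descend past 50 levels
)
SELECT * FROM org;
</code></pre></div>
<p>With the guard in place, a cycle in the data (an employee who is indirectly their own manager) produces a bounded result instead of running into the server limit.</p>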
<h2 id="mysql-80-vs-pg-sql-特性-cross-reference">MySQL 8.0 vs PG SQL feature cross-reference</h2>
<table>
  <thead>
      <tr>
          <th>Feature</th>
          <th>MySQL 8.0</th>
          <th>PostgreSQL</th>
          <th>Difference</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>CTE</td>
          <td>8.0+</td>
          <td>8.4+</td>
          <td>PG has supported it since 2009, MySQL only since 2018, about 9 years later</td>
      </tr>
      <tr>
          <td>Recursive CTE</td>
          <td>8.0+ (depth-limited)</td>
          <td>8.4+ (unlimited)</td>
          <td>PG has no depth limit</td>
      </tr>
      <tr>
          <td>Window function</td>
          <td>8.0+</td>
          <td>8.4+</td>
          <td>Frame specs differ slightly between the two (for example, when GROUPS frames arrived)</td>
      </tr>
      <tr>
          <td>Lateral</td>
          <td>8.0.14+</td>
          <td>9.3+</td>
          <td>PG plans are more mature</td>
      </tr>
      <tr>
          <td>JSON_TABLE</td>
          <td>8.0+</td>
          <td>17+</td>
          <td>MySQL was 6 years earlier (SQL:2016 standard)</td>
      </tr>
      <tr>
          <td>JSONB index</td>
          <td>No whole-document index (per-path generated-column indexes; multi-valued indexes from 8.0.17)</td>
          <td>GIN index over JSONB</td>
          <td><strong>PG structural advantage</strong></td>
      </tr>
      <tr>
          <td>Hash join</td>
          <td>8.0.18+</td>
          <td>Long-standing</td>
          <td>PG also has parallel hash join</td>
      </tr>
      <tr>
          <td>Atomic DDL</td>
          <td>8.0+</td>
          <td>Long-standing</td>
          <td>PG DDL has always been atomic</td>
      </tr>
      <tr>
          <td>Common keywords</td>
          <td>Caught up</td>
          <td>Complete</td>
          <td>-</td>
      </tr>
      <tr>
          <td>Role-based access control</td>
          <td>8.0+</td>
          <td>Long-standing</td>
          <td>-</td>
      </tr>
      <tr>
          <td>Materialized view</td>
          <td>None native</td>
          <td>9.3+</td>
          <td><strong>PG structural advantage</strong> (MySQL emulates it with triggers / scheduled refresh)</td>
      </tr>
      <tr>
          <td>Partial index</td>
          <td>None</td>
          <td>Long-standing</td>
          <td><strong>PG structural advantage</strong></td>
      </tr>
      <tr>
          <td>Expression index</td>
          <td>8.0.13+</td>
          <td>Long-standing</td>
          <td>Added later in MySQL</td>
      </tr>
      <tr>
          <td>Full-text search</td>
          <td>Built in (InnoDB 5.6+)</td>
          <td>Built in (tsvector)</td>
          <td>PG full-text search is more mature</td>
      </tr>
      <tr>
          <td>Foreign data wrapper</td>
          <td>None native</td>
          <td>Long-standing (FDW)</td>
          <td><strong>PG structural advantage</strong></td>
      </tr>
  </tbody>
</table>
<p>8.0 closed most of the gaps at the <em>syntax layer</em>; the <em>storage / index / extensibility layers</em> remain PG's structural advantage. For orgs that prioritize SQL engineering depth, PG still leads; for orgs that prioritize ecosystem / replication / sharding, MySQL is no longer disqualified.</p>
<h2 id="跟其他模組整合">Integration with other modules</h2>
<h3 id="跟-innodb-tuning">With InnoDB Tuning</h3>
<p>In InnoDB a JSON column is stored in a binary format with storage characteristics similar to LONGBLOB / LONGTEXT, and large JSON values go to off-page storage. With <code>innodb_default_row_format=DYNAMIC</code> a long value is stored fully off-page; the older Antelope formats (REDUNDANT / COMPACT) keep a 768-byte prefix in the row. The buffer pool is less friendly to such large values, so a workload with big JSON documents may need a larger buffer pool. See <a href="/blog/backend/01-database/vendors/mysql/innodb-tuning/" data-link-title="MySQL InnoDB Tuning：為什麼一個 100 GB DB 在 64 GB RAM server 上 query 慢 5 倍" data-link-desc="InnoDB 是 MySQL 預設 storage engine、預設值給 256 MB buffer pool（早期 default）。本文從一個常見痛點開場（DB &gt; RAM 但 server 仍 swap）、走 4 個 critical knob（buffer pool / redo log / flush method / IO capacity）、各自如何影響讀寫吞吐、配置 step-by-step、5 production 踩雷（buffer pool warm-up / log file 大小 / 設 sync_binlog=0 換速度 / IO scheduler / undo log 膨脹）、跟 SSD / NVMe / EBS 的 IO 假設">InnoDB Tuning</a>.</p>
<h3 id="跟-query-optimization">With Query Optimization</h3>
<p>8.0's new hash join and lateral derived tables make <em>EXPLAIN ANALYZE</em> output more complex. Optimizing complex queries requires familiarity with the <em>new plan node types</em>. See the <em>Query Optimization deep dive</em> article (to be written).</p>
<h3 id="跟-online-schema-change">With Online Schema Change</h3>
<p>Schema changes on JSON columns and generated columns work fine through gh-ost / pt-osc, but ALTER on a large table with JSON is slower than for ordinary columns (every row is re-serialized). See <a href="/blog/backend/01-database/vendors/mysql/online-schema-change-tools/" data-link-title="MySQL Online Schema Change：gh-ost 跟 pt-online-schema-change 兩條完全不同的 ghost table 路徑" data-link-desc="MySQL ALTER TABLE 可能鎖整張表，production 需要 online schema change 流程。gh-ost（GitHub）跟 pt-online-schema-change（Percona）都用 ghost table 解決、但底層機制完全不同：pt-osc 用 trigger 同步、gh-ost 用 binlog stream 同步。本文走兩工具機制對照表 → trigger vs binlog 各自取捨 → 配置 step-by-step → 5 production 踩雷（trigger overhead / binlog 延遲 / FK constraint / hot trigger lock / 切換瞬間 deadlock）→ 何時用哪一個">Online Schema Change Tools</a>.</p>
<h3 id="跟-replication">With Replication</h3>
<p>For writes that use window functions / CTE / JSON_TABLE (for example <code>INSERT ... SELECT</code>), row-based binlog records the <em>resulting rows</em>, not the <em>query itself</em>. A replica therefore does not re-run the window function when applying the event, and efficiency is fine. See <a href="/blog/backend/01-database/vendors/mysql/replication-topology/" data-link-title="MySQL Replication Topology：async / semi-sync / GTID 不是三選一、是三個 trade-off 軸的疊加" data-link-desc="MySQL replication 不是「選 async 還是 semi-sync」、是 *durability / latency / consistency* 三個 trade-off 軸的疊加；GTID 是跨 mode 的 infrastructure layer、不是第三種 mode。本文走 3 軸取捨模型 → async / semi-sync 行為對比 → GTID 替代 binlog-position 的好處 → 配置 step-by-step → 5 production 踩雷（lag 暴衝 / semi-sync 退回 async / GTID gap / Loss-Less semi-sync 真的 loss-less / chained replication 雪崩）→ 跟 Aurora MySQL / Vitess / ProxySQL / Orchestrator 整合">Replication Topology</a>.</p>
<h2 id="何時-sql-特性是-mysql-選型-driver">When SQL features drive the choice of MySQL</h2>
<ul>
<li><strong>SQL-standard alignment and cross-vendor portability matter</strong>: MySQL 8.0 JSON_TABLE and window functions follow the standard, while some PG capabilities (JSONB operators) are PG-only, so MySQL is slightly more portable here</li>
<li><strong>JSON workload &lt; 20% of queries</strong>: MySQL 8.0 plus generated columns is enough; no need to switch to PG for JSON</li>
<li><strong>JSON workload &gt; 50% of queries, with complex paths / aggregation</strong>: PG JSONB with GIN is still the winner; consider PG or MongoDB</li>
<li><strong>Materialized view / FDW / partial index are required</strong>: PG still leads; do not assume SQL feature parity means MySQL covers everything</li>
<li><strong>Existing MySQL investment plus growing SQL engineering depth</strong>: upgrade to 8.0 and train the team on the new features rather than switching vendors</li>
</ul>
<h2 id="相關連結">Related links</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL vendor overview</a></li>
<li><a href="/blog/backend/01-database/vendors/mysql/innodb-tuning/" data-link-title="MySQL InnoDB Tuning：為什麼一個 100 GB DB 在 64 GB RAM server 上 query 慢 5 倍" data-link-desc="InnoDB 是 MySQL 預設 storage engine、預設值給 256 MB buffer pool（早期 default）。本文從一個常見痛點開場（DB &gt; RAM 但 server 仍 swap）、走 4 個 critical knob（buffer pool / redo log / flush method / IO capacity）、各自如何影響讀寫吞吐、配置 step-by-step、5 production 踩雷（buffer pool warm-up / log file 大小 / 設 sync_binlog=0 換速度 / IO scheduler / undo log 膨脹）、跟 SSD / NVMe / EBS 的 IO 假設">InnoDB Tuning</a> (how JSON columns affect the buffer pool)</li>
<li><a href="/blog/backend/01-database/vendors/mysql/online-schema-change-tools/" data-link-title="MySQL Online Schema Change：gh-ost 跟 pt-online-schema-change 兩條完全不同的 ghost table 路徑" data-link-desc="MySQL ALTER TABLE 可能鎖整張表，production 需要 online schema change 流程。gh-ost（GitHub）跟 pt-online-schema-change（Percona）都用 ghost table 解決、但底層機制完全不同：pt-osc 用 trigger 同步、gh-ost 用 binlog stream 同步。本文走兩工具機制對照表 → trigger vs binlog 各自取捨 → 配置 step-by-step → 5 production 踩雷（trigger overhead / binlog 延遲 / FK constraint / hot trigger lock / 切換瞬間 deadlock）→ 何時用哪一個">Online Schema Change Tools</a> (ALTER on large tables with JSON columns)</li>
<li><a href="/blog/backend/01-database/vendors/mysql/replication-topology/" data-link-title="MySQL Replication Topology：async / semi-sync / GTID 不是三選一、是三個 trade-off 軸的疊加" data-link-desc="MySQL replication 不是「選 async 還是 semi-sync」、是 *durability / latency / consistency* 三個 trade-off 軸的疊加；GTID 是跨 mode 的 infrastructure layer、不是第三種 mode。本文走 3 軸取捨模型 → async / semi-sync 行為對比 → GTID 替代 binlog-position 的好處 → 配置 step-by-step → 5 production 踩雷（lag 暴衝 / semi-sync 退回 async / GTID gap / Loss-Less semi-sync 真的 loss-less / chained replication 雪崩）→ 跟 Aurora MySQL / Vitess / ProxySQL / Orchestrator 整合">Replication Topology</a> (ROW-format binlog and window functions)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/sql-features-baseline/" data-link-title="PostgreSQL SQL Features：PG 早就有的、MySQL 8.0 才補的、PG 仍領先的" data-link-desc="PG 在 SQL features 上長期領先 MySQL — CTE / window function / lateral / partial index / FTS / JSONB / GIN index / materialized view 在 PG 早 5-15 年。MySQL 8.0（2018）補多數但 *index / storage / extension* 層仍是 PG 結構優勢。本文整理 PG 早期就有的特性、MySQL 8.0 補的差異、PG 仍領先的、跟 MySQL modern-sql-features sibling 反向視角">PostgreSQL SQL Features Baseline</a> (the reverse view from PG: which features PG had 5-15 years earlier, and where PG still leads after MySQL 8.0 caught up)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/jsonb-deep-dive/" data-link-title="PostgreSQL JSONB Deep Dive：Binary Storage &#43; GIN Index 為什麼是結構性優勢" data-link-desc="PG JSONB（9.4&#43;）是 *binary 儲存的 JSON*、可直接 GIN index、是 PG 在 JSON workload 的結構性優勢、跟 MongoDB / MySQL 8.0 JSON_TABLE 比仍領先。本文走 JSON vs JSONB 差異、GIN index 機制（jsonb_ops vs jsonb_path_ops）、operator &#43; path query、partial JSONB indexing、5 production 踩雷（大 JSONB 跟 TOAST / nested update / index 選錯 op class / jsonb_path_query 跟 jsonb_path_exists 行為差 / partial index 條件搞錯）、何時用 JSONB vs 拆 column">PostgreSQL JSONB Deep Dive</a> (PG sibling article: binary storage + GIN index compared with MySQL JSON_TABLE)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL vendor page</a> (source for the JSON / SQL feature comparison)</li>
<li><a href="/blog/backend/01-database/vendors/mongodb/" data-link-title="MongoDB" data-link-desc="Document database 代表、Atlas managed、跨雲可用、許多大規模平台從 MongoDB 起家">MongoDB vendor page</a> (alternative for document-heavy workloads)</li>
<li>Official: <a href="https://dev.mysql.com/doc/refman/8.0/en/mysql-nutshell.html">MySQL 8.0 What&rsquo;s New</a></li>
</ul>
]]></content:encoded></item><item><title>PostgreSQL JSONB Deep Dive：Binary Storage + GIN Index 為什麼是結構性優勢</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/jsonb-deep-dive/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/jsonb-deep-dive/</guid><description>&lt;blockquote>
&lt;p>This article is an implementation-layer deep dive that sits under the &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL&lt;/a> overview. The overview already covers where PG sits in the OLTP landscape; this article focuses on &lt;em>JSONB&lt;/em>: the structural advantage of binary storage plus GIN index.&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;h2 id="json-vs-jsonb選-jsonb">JSON vs JSONB: choose JSONB&lt;/h2>
&lt;p>PG 9.2 added the &lt;code>JSON&lt;/code> type and 9.4 added &lt;code>JSONB&lt;/code>. Use JSONB in nearly every case:&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Dimension&lt;/th>
 &lt;th>JSON&lt;/th>
 &lt;th>JSONB&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Storage&lt;/td>
 &lt;td>Plain text (stored verbatim)&lt;/td>
 &lt;td>Binary decomposed format&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Parse cost&lt;/td>
 &lt;td>Parsed on every query&lt;/td>
 &lt;td>Parsed once at insert&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Index support&lt;/td>
 &lt;td>Limited (functional index)&lt;/td>
 &lt;td>GIN / functional / partial all work&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Operator support&lt;/td>
 &lt;td>Limited (-&amp;gt; / -&amp;gt;&amp;gt;)&lt;/td>
 &lt;td>Full (@&amp;gt; / ? / ?| / @? and more)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Duplicate key&lt;/td>
 &lt;td>Kept (verbatim)&lt;/td>
 &lt;td>Only the last one is kept (normalized)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Key order&lt;/td>
 &lt;td>Kept&lt;/td>
 &lt;td>Not kept&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Whitespace&lt;/td>
 &lt;td>Kept&lt;/td>
 &lt;td>Not kept&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>The only real drawback of JSONB is that &lt;em>binary storage does not preserve key order, whitespace, or duplicate keys&lt;/em>. The vast majority of applications do not care about any of these.&lt;/p>
&lt;p>From the &lt;em>application semantics&lt;/em> point of view, JSONB is &lt;em>the right type&lt;/em> for JSON in PG, and JSON is &lt;em>legacy / niche&lt;/em>.&lt;/p>
&lt;h2 id="jsonb-gin-index核心結構性優勢">JSONB GIN Index: the core structural advantage&lt;/h2>
&lt;p>PG's GIN (Generalized Inverted Index) can build an inverted index over all keys and values inside a JSONB value:&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">TABLE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">products&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">SERIAL&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">PRIMARY&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">KEY&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="n">metadata&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">JSONB&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="p">);&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="c1">-- GIN index
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">CREATE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">INDEX&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">idx_products_metadata&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">ON&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">products&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">USING&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">GIN&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">metadata&lt;/span>&lt;span class="p">);&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Once the index exists, JSONB queries are accelerated by GIN:&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sql" data-lang="sql">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1">-- @&amp;gt; (contains) uses GIN
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">FROM&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">products&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">WHERE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">metadata&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">@&amp;gt;&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s1">&amp;#39;{&amp;#34;category&amp;#34;: &amp;#34;shoes&amp;#34;}&amp;#39;&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="c1">-- ? (has key) uses GIN
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">FROM&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">products&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">WHERE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">metadata&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">?&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s1">&amp;#39;discount&amp;#39;&lt;/span>&lt;span class="p">;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="c1">-- ?| (has any of these keys) uses GIN
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">8&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="k">SELECT&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">FROM&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">products&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">WHERE&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">metadata&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">?|&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="nb">array&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;discount&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s1">&amp;#39;promotion&amp;#39;&lt;/span>&lt;span class="p">];&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Compared with MongoDB indexes, PG does not need JSON path indexes &lt;em>defined up front&lt;/em>: &lt;code>USING GIN (metadata)&lt;/code> covers containment queries on &lt;em>any path in the whole JSONB document&lt;/em>.&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>This article is an implementation-layer deep dive that sits under the <a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL</a> overview. The overview already covers where PG sits in the OLTP landscape; this article focuses on <em>JSONB</em>: the structural advantage of binary storage plus GIN index.</p></blockquote>
<hr>
<h2 id="json-vs-jsonb選-jsonb">JSON vs JSONB: choose JSONB</h2>
<p>PG 9.2 added the <code>JSON</code> type and 9.4 added <code>JSONB</code>. Use JSONB in nearly every case:</p>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th>JSON</th>
          <th>JSONB</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Storage</td>
          <td>Plain text (stored verbatim)</td>
          <td>Binary decomposed format</td>
      </tr>
      <tr>
          <td>Parse cost</td>
          <td>Parsed on every query</td>
          <td>Parsed once at insert</td>
      </tr>
      <tr>
          <td>Index support</td>
          <td>Limited (functional index)</td>
          <td>GIN / functional / partial all work</td>
      </tr>
      <tr>
          <td>Operator support</td>
          <td>Limited (-&gt; / -&gt;&gt;)</td>
          <td>Full (@&gt; / ? / ?| / @? and more)</td>
      </tr>
      <tr>
          <td>Duplicate key</td>
          <td>Kept (verbatim)</td>
          <td>Only the last one is kept (normalized)</td>
      </tr>
      <tr>
          <td>Key order</td>
          <td>Kept</td>
          <td>Not kept</td>
      </tr>
      <tr>
          <td>Whitespace</td>
          <td>Kept</td>
          <td>Not kept</td>
      </tr>
  </tbody>
</table>
<p>The only real drawback of JSONB is that <em>binary storage does not preserve key order, whitespace, or duplicate keys</em>. The vast majority of applications do not care about any of these.</p>
<p>From the <em>application semantics</em> point of view, JSONB is <em>the right type</em> for JSON in PG, and JSON is <em>legacy / niche</em>.</p>
<h2 id="jsonb-gin-index核心結構性優勢">JSONB GIN Index: the core structural advantage</h2>
<p>PG's GIN (Generalized Inverted Index) can build an inverted index over all keys and values inside a JSONB value:</p>
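<p>The normalization differences in the table above are easy to see by casting the same literal to both types. The literal below is made up for illustration; the commented results follow from the documented behavior of the two types.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- json keeps the input text verbatim; jsonb parses and normalizes it.
SELECT '{"b": 1,   "a": 2, "a": 3}'::json  AS as_json,
       '{"b": 1,   "a": 2, "a": 3}'::jsonb AS as_jsonb;
-- as_json  : {"b": 1,   "a": 2, "a": 3}   (whitespace, key order, duplicate key all kept)
-- as_jsonb : {"a": 3, "b": 1}             (last duplicate wins, keys reordered, whitespace dropped)
</code></pre></div>
<p>If an application relies on any of the three preserved properties, such as signing the exact JSON text, that is the niche where the <code>JSON</code> type still fits.</p>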





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">    </span><span class="n">id</span><span class="w"> </span><span class="nb">SERIAL</span><span class="w"> </span><span class="k">PRIMARY</span><span class="w"> </span><span class="k">KEY</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">    </span><span class="n">metadata</span><span class="w"> </span><span class="n">JSONB</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="c1">-- GIN index
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">INDEX</span><span class="w"> </span><span class="n">idx_products_metadata</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">USING</span><span class="w"> </span><span class="n">GIN</span><span class="w"> </span><span class="p">(</span><span class="n">metadata</span><span class="p">);</span></span></span></code></pre></div><p>Once the index exists, JSONB queries are accelerated by GIN:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- @&gt; (contains) uses GIN
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">@&gt;</span><span class="w"> </span><span class="s1">&#39;{&#34;category&#34;: &#34;shoes&#34;}&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- ? (has key) uses GIN
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="s1">&#39;discount&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="w"></span><span class="c1">-- ?| (has any of these keys) uses GIN
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">?|</span><span class="w"> </span><span class="nb">array</span><span class="p">[</span><span class="s1">&#39;discount&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;promotion&#39;</span><span class="p">];</span></span></span></code></pre></div><p>Unlike MongoDB indexes, PG does not require <em>defining the indexed JSON paths up front</em>: with the default operator class, <code>USING GIN (metadata)</code> covers <em>any path in the whole JSONB document</em>.</p>
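<p>To illustrate the "any path" point, here is a minimal sketch that runs containment against a nested array element with the same index. The sample row and the <code>color</code> key are assumptions for illustration, not part of the schema above.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Assumed sample row:
--   {&#34;category&#34;: &#34;shoes&#34;, &#34;variants&#34;: [{&#34;color&#34;: &#34;red&#34;, &#34;price&#34;: 120}]}
-- Containment descends into nested objects and arrays; no per-path index is declared.
SELECT id
FROM products
WHERE metadata @&gt; &#39;{&#34;variants&#34;: [{&#34;color&#34;: &#34;red&#34;}]}&#39;;

-- Check the plan. On a very small table the planner may still pick a sequential scan.
EXPLAIN
SELECT id
FROM products
WHERE metadata @&gt; &#39;{&#34;variants&#34;: [{&#34;color&#34;: &#34;red&#34;}]}&#39;;
</code></pre></div>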
<h3 id="jsonb_ops-vs-jsonb_path_ops"><code>jsonb_ops</code> vs <code>jsonb_path_ops</code></h3>
<p>PG GIN offers two <em>operator classes</em> for JSONB:</p>
<table>
  <thead>
      <tr>
          <th>Dimension</th>
          <th><code>jsonb_ops</code> (default)</th>
          <th><code>jsonb_path_ops</code></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>What is indexed</td>
          <td>Keys and values, each as separate entries</td>
          <td>Only path → value pairs (hashed)</td>
      </tr>
      <tr>
          <td>Index size</td>
          <td>Larger</td>
          <td>Smaller (roughly half, depending on the data)</td>
      </tr>
      <tr>
          <td>Supported operators</td>
          <td><code>@&gt; / ? / ?| / ?&amp; / @? / @@</code></td>
          <td><code>@&gt; / @? / @@</code> (no key-existence operators)</td>
      </tr>
      <tr>
          <td>Best for</td>
          <td>Mixed query patterns</td>
          <td>Containment-only (or jsonpath-only) workloads</td>
      </tr>
  </tbody>
</table>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- jsonb_ops (default)
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">INDEX</span><span class="w"> </span><span class="n">idx_meta_default</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">USING</span><span class="w"> </span><span class="n">GIN</span><span class="w"> </span><span class="p">(</span><span class="n">metadata</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- jsonb_path_ops (smaller and faster, but no ? / ?| / ?&amp; support)
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">INDEX</span><span class="w"> </span><span class="n">idx_meta_path</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">USING</span><span class="w"> </span><span class="n">GIN</span><span class="w"> </span><span class="p">(</span><span class="n">metadata</span><span class="w"> </span><span class="n">jsonb_path_ops</span><span class="p">);</span></span></span></code></pre></div><p><strong>How to choose</strong>:</p>
<ul>
<li>Only <code>@&gt;</code> containment queries → <code>jsonb_path_ops</code> (smaller, faster index)</li>
<li><code>?</code> / <code>?|</code> / <code>?&amp;</code> key-existence queries → <code>jsonb_ops</code> (default)</li>
</ul>
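<p>The size difference depends on the data, so it is worth measuring rather than assuming. A rough sketch, reusing the index names from the example above: build both on a copy of production-like data, compare, then drop the one you do not need.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Build both operator classes on the same column.
CREATE INDEX IF NOT EXISTS idx_meta_default ON products USING GIN (metadata);
CREATE INDEX IF NOT EXISTS idx_meta_path    ON products USING GIN (metadata jsonb_path_ops);

-- Compare on-disk size of the two indexes.
SELECT pg_size_pretty(pg_relation_size(&#39;idx_meta_default&#39;)) AS jsonb_ops_size,
       pg_size_pretty(pg_relation_size(&#39;idx_meta_path&#39;))    AS jsonb_path_ops_size;
</code></pre></div>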
<h2 id="operator--path-query">Operator + Path Query</h2>
<p>JSONB provides a rich set of operators plus jsonpath:</p>
<h3 id="operator">Operator</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- Extract value（returns jsonb）
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="s1">&#39;name&#39;</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w"></span><span class="c1">-- Extract text（returns text）
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">-&gt;&gt;</span><span class="w"> </span><span class="s1">&#39;name&#39;</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w"></span><span class="c1">-- Path extract
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">#&gt;</span><span class="w"> </span><span class="s1">&#39;{variants, 0, price}&#39;</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">#&gt;&gt;</span><span class="w"> </span><span class="s1">&#39;{variants, 0, price}&#39;</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="p">;</span><span class="w">  </span><span class="c1">-- returns text
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w"></span><span class="c1">-- Containment (uses the GIN index)
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">@&gt;</span><span class="w"> </span><span class="s1">&#39;{&#34;category&#34;: &#34;shoes&#34;, &#34;active&#34;: true}&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w"></span><span class="c1">-- Reverse containment
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="s1">&#39;{&#34;sub&#34;: &#34;value&#34;}&#39;</span><span class="w"> </span><span class="o">&lt;@</span><span class="w"> </span><span class="n">metadata</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w"></span><span class="c1">-- Key existence
</span></span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="s1">&#39;discount&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="w"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">?|</span><span class="w"> </span><span class="nb">array</span><span class="p">[</span><span class="s1">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;b&#39;</span><span class="p">];</span><span class="w">  </span><span class="c1">-- any of the keys
</span></span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">?&amp;</span><span class="w"> </span><span class="nb">array</span><span class="p">[</span><span class="s1">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;b&#39;</span><span class="p">];</span><span class="w">  </span><span class="c1">-- all of the keys</span></span></span></code></pre></div><h3 id="jsonpathpg-12">jsonpath (PG 12+)</h3>
<p>The SQL/JSON path language (jsonpath) is part of the SQL standard (SQL:2016); PG supports it from version 12:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1">-- jsonb_path_query: expands the path matches, one row per match
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">jsonb_path_query</span><span class="p">(</span><span class="n">metadata</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;$.variants[*].price&#39;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w"></span><span class="c1">-- jsonb_path_exists: returns boolean
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="n">jsonb_path_exists</span><span class="p">(</span><span class="n">metadata</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;$.variants[*] ? (@.price &gt; 100)&#39;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w"></span><span class="c1">-- jsonb_path_query_array: returns all matches as one JSONB array
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">jsonb_path_query_array</span><span class="p">(</span><span class="n">metadata</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;$.tags[*]&#39;</span><span class="p">)</span><span class="w">
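</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w"></span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="p">;</span></span></span></code></pre></div><p>Because the path language is standardized, jsonpath expressions are more portable across vendors than PG-specific operators, although the <code>jsonb_path_*</code> function names themselves are PostgreSQL's own.</p>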
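<p>Two further jsonpath conveniences, sketched under the same assumed <code>variants</code> / <code>price</code> document shape: passing values through the optional <code>vars</code> argument instead of concatenating them into the path string, and <code>jsonb_path_query_first</code> when only one match per row is wanted. The variable name <code>min_price</code> is illustrative.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Pass the threshold as a jsonpath variable (third argument is a JSONB object).
SELECT id
FROM products
WHERE jsonb_path_exists(
        metadata,
        &#39;$.variants[*] ? (@.price &gt; $min_price)&#39;,
        &#39;{&#34;min_price&#34;: 100}&#39;
      );

-- jsonb_path_query_first returns the first match, or NULL when nothing matches,
-- so the result keeps exactly one row per product.
SELECT id,
       jsonb_path_query_first(metadata, &#39;$.variants[*].price&#39;) AS first_price
FROM products;
</code></pre></div>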
<h2 id="partial-jsonb-index">Partial JSONB Index</h2>
<p>When queries <em>only touch a subset of rows</em>, build a partial index:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Index metadata for active products only (assumes products has a status column)
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">CREATE</span><span class="w"> </span><span class="k">INDEX</span><span class="w"> </span><span class="n">idx_active_products_metadata</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">ON</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">USING</span><span class="w"> </span><span class="n">GIN</span><span class="w"> </span><span class="p">(</span><span class="n">metadata</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;active&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="w"></span><span class="c1">-- Query active products + JSONB filter
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w">
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;active&#39;</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">@&gt;</span><span class="w"> </span><span class="s1">&#39;{&#34;category&#34;: &#34;shoes&#34;}&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">9</span><span class="cl"><span class="w"></span><span class="c1">-- → the planner can use the partial GIN index</span></span></span></code></pre></div><p>A partial index is smaller than a full GIN index in proportion to the rows it excludes, costs less to maintain on writes, and is more likely to stay in cache.</p>
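<p>The <code>products</code> table defined earlier has no <code>status</code> column, so the partial index above presupposes one. A minimal setup sketch, assuming a plain text column with <code>'active'</code> as the default, plus two checks on how much the partial index actually saves:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Assumption: status is a plain text column; adjust the type and default to your schema.
ALTER TABLE products
    ADD COLUMN IF NOT EXISTS status TEXT NOT NULL DEFAULT &#39;active&#39;;

-- How many rows fall inside vs outside the partial index predicate.
SELECT status, count(*) AS row_count
FROM products
GROUP BY status;

-- Size of the partial index (run after the CREATE INDEX shown above).
SELECT pg_size_pretty(pg_relation_size(&#39;idx_active_products_metadata&#39;)) AS partial_index_size;
</code></pre></div>
<p>If nearly every row is <code>active</code>, the partial index saves little and a full index is simpler.</p>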
<h2 id="5-個-production-踩雷">5 Production Pitfalls</h2>
<h3 id="1-大-jsonb--toast--性能崩潰">1. Large JSONB + TOAST: performance collapse</h3>
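<p>Once a row grows past roughly 2 KB, PG moves large values such as JSONB into TOAST (PG's out-of-line storage, compressed by default). Every query that reads the column must then <em>de-TOAST</em> it (fetch the chunks from the TOAST table and reassemble them). For large JSONB (&gt; 50 KB), queries that touch the column can run 10-100x slower.</p>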
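<p>Fixes:</p>
<ul>
<li>Move <em>large attributes into their own columns</em> (e.g. <code>description TEXT</code> instead of a key in metadata)</li>
<li>Put an <em>expression index on hot JSON paths</em> so filtering does not have to read the whole JSONB value for every row</li>
<li>Monitor the JSONB size distribution with <code>pg_column_size(metadata)</code> and look for outliers</li>
<li>For truly large documents (&gt; 1 MB), consider a separate table or object storage</li>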
</ul>
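<p>The monitoring and hot-path fixes above could look roughly like this. The percentiles, the <code>LIMIT</code>, and the choice of <code>category</code> as the hot key are illustrative assumptions.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Distribution of stored (possibly compressed) JSONB sizes, in bytes.
SELECT percentile_disc(ARRAY[0.5, 0.95, 0.99])
           WITHIN GROUP (ORDER BY pg_column_size(metadata)) AS p50_p95_p99_bytes,
       max(pg_column_size(metadata))                        AS max_bytes
FROM products;

-- The largest rows: candidates for splitting attributes out.
SELECT id, pg_column_size(metadata) AS bytes
FROM products
ORDER BY bytes DESC
LIMIT 20;

-- B-tree expression index on one hot key, so equality filters on it
-- can find matching rows without scanning every JSONB value.
CREATE INDEX idx_products_category ON products ((metadata -&gt;&gt; &#39;category&#39;));
</code></pre></div>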
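<h3 id="2-nested-update--整個-jsonb-重寫">2. Nested update: the whole JSONB value is rewritten</h3>
<p>PG has no <em>in-place partial update</em> for JSONB. Changing a nested key means reading the whole value, modifying it, and writing the whole value back (the statement itself is still atomic):</p>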





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">UPDATE</span><span class="w"> </span><span class="n">products</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="k">SET</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">jsonb_set</span><span class="p">(</span><span class="n">metadata</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;{discount}&#39;</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;0.2&#39;</span><span class="p">::</span><span class="n">jsonb</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="k">WHERE</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">100</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w"></span><span class="c1">-- Equivalent to: read metadata, change discount, write the whole metadata back</span></span></span></code></pre></div><p>With <em>large JSONB + frequent updates</em>, write throughput becomes the limit. This differs from MongoDB's <code>$set</code> operator, which performs a <em>partial document update</em>.</p>
<p>Fixes:</p>
<ul>
<li>Move <em>frequently updated nested keys</em> into their own columns</li>
<li>Batch updates in the application layer (collect several changes, apply them in one UPDATE)</li>
<li>Adopt the mental model that PG JSONB is <em>immutable-replace</em>, not <em>mutable in-place</em></li>
</ul>
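<p>A sketch of the first two fixes, assuming <code>discount</code> is the hot key; the <code>promotion</code> and <code>badge</code> keys are made up for the batching example. An UPDATE that leaves <code>metadata</code> untouched still creates a new row version, but the unchanged out-of-line JSONB value does not have to be rewritten.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- 1. Promote the frequently updated key to a real column.
ALTER TABLE products ADD COLUMN discount NUMERIC;

-- Backfill the column and remove the key from the document in one pass.
-- Both SET expressions read the old row value.
UPDATE products
SET discount = (metadata -&gt;&gt; &#39;discount&#39;)::numeric,
    metadata = metadata - &#39;discount&#39;
WHERE metadata ? &#39;discount&#39;;

-- Later updates change only the small column.
UPDATE products SET discount = 0.2 WHERE id = 100;

-- 2. For keys that stay in JSONB, merge several changes in one statement
--    (|| is a shallow merge) so the value is rewritten once, not once per key.
UPDATE products
SET metadata = metadata || &#39;{&#34;promotion&#34;: &#34;summer&#34;, &#34;badge&#34;: &#34;new&#34;}&#39;::jsonb
WHERE id = 100;
</code></pre></div>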
<h3 id="3-index-選錯-op-class---query-走-full-scan">3. Wrong index op class: <code>?</code> queries fall back to a full scan</h3>
<p>With only a <code>jsonb_path_ops</code> index, a <code>?</code> key-existence query runs as a <em>sequential scan</em> (the index cannot serve it). The application just sees a slow query; only EXPLAIN reveals that the index is unused.</p>
<p>Fixes:</p>
<ul>
<li>Settle the <em>application query pattern</em> at design time: only <code>@&gt;</code>, or also <code>?</code></li>
<li>Mixed query patterns → <code>jsonb_ops</code> (default)</li>
<li>Pure containment → <code>jsonb_path_ops</code> (saves index size)</li>
<li>If unsure, start with the default and optimize after observing production</li>
</ul>
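<p>A quick way to catch this before production is to EXPLAIN one representative query per operator. The sketch assumes only the <code>jsonb_path_ops</code> index <code>idx_meta_path</code> from earlier exists on <code>metadata</code>; actual plans depend on table size and statistics.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Key existence: jsonb_path_ops cannot serve ?, so this is planned as a
-- Seq Scan with a Filter on metadata.
EXPLAIN
SELECT * FROM products WHERE metadata ? &#39;discount&#39;;

-- Containment: eligible for the index. On a large enough table you would
-- typically see a Bitmap Index Scan on idx_meta_path.
EXPLAIN
SELECT * FROM products WHERE metadata @&gt; &#39;{&#34;category&#34;: &#34;shoes&#34;}&#39;;

-- If key-existence queries matter, add the default operator class as well.
CREATE INDEX IF NOT EXISTS idx_meta_default ON products USING GIN (metadata);
</code></pre></div>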
<h3 id="4-jsonb_path_query-跟-jsonb_path_exists-行為差">4. <code>jsonb_path_query</code> vs <code>jsonb_path_exists</code>: different behavior</h3>
<ul>
<li><code>jsonb_path_query(metadata, '$.variants[*].price')</code>: expands the matches and returns one row per match</li>
<li><code>jsonb_path_exists(metadata, '$.variants[*]')</code>: returns a boolean (true if anything matches)</li>
</ul>
<p>An application that wants to <em>filter rows</em> but reaches for the former ends up writing:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Wrong: returns one row per variant for each product, so the row count balloons
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="n">jsonb_path_query</span><span class="p">(</span><span class="n">metadata</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;$.variants[*].price&#39;</span><span class="p">)</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="p">;</span></span></span></code></pre></div><p>It should be:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">-- Right: only filters products
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">jsonb_path_exists</span><span class="p">(</span><span class="n">metadata</span><span class="p">,</span><span class="w"> </span><span class="s1">&#39;$.variants[*] ? (@.price &gt; 100)&#39;</span><span class="p">);</span></span></span></code></pre></div><p>Fixes:</p>
<ul>
<li>Distinguish <em>exists, which filters rows</em> from <em>query, which expands rows</em></li>
<li>Filter with <code>jsonb_path_exists</code> or the <code>@&gt;</code> operator</li>
<li>Expand with <code>jsonb_path_query</code>, combined with <code>LATERAL</code> or a subquery</li>
</ul>
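<p>When expansion is what you actually want, a <code>LATERAL</code> join makes the one-row-per-match behavior explicit and lets you aggregate back to one row per product. A minimal sketch, assuming every <code>price</code> is a JSON number (the cast fails otherwise):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Expand on purpose: one output row per variant price, keyed by product id.
SELECT p.id, v.price
FROM products AS p
CROSS JOIN LATERAL jsonb_path_query(p.metadata, &#39;$.variants[*].price&#39;) AS v(price);

-- Collapse back to one row per product by aggregating the expanded values.
SELECT p.id, max(v.price::numeric) AS max_price
FROM products AS p
CROSS JOIN LATERAL jsonb_path_query(p.metadata, &#39;$.variants[*].price&#39;) AS v(price)
GROUP BY p.id;
</code></pre></div>
<p>Products without any variant drop out of both results, because the lateral function returns no rows for them.</p>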
<h3 id="5-partial-index-條件不對齊-query">5. Partial index predicate does not match the query</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">CREATE</span><span class="w"> </span><span class="k">INDEX</span><span class="w"> </span><span class="n">idx_active_metadata</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">USING</span><span class="w"> </span><span class="n">GIN</span><span class="w"> </span><span class="p">(</span><span class="n">metadata</span><span class="p">)</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;active&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w"></span><span class="c1">-- Application query that leaves out the status predicate
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"></span><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">@&gt;</span><span class="w"> </span><span class="s1">&#39;{&#34;category&#34;: &#34;shoes&#34;}&#39;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="w"></span><span class="c1">-- → partial index not used (the planner cannot assume status=&#39;active&#39;)</span></span></span></code></pre></div><p>Fixes:</p>
<ul>
<li>
<p>The application query <em>must include the partial index's WHERE condition</em>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">products</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">status</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">&#39;active&#39;</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">metadata</span><span class="w"> </span><span class="o">@&gt;</span><span class="w"> </span><span class="s1">&#39;...&#39;</span><span class="p">;</span></span></span></code></pre></div></li>
<li>
<p>Confirm the planner uses the partial index: GIN indexes are read through bitmap scans, so <code>EXPLAIN</code> should show <code>Bitmap Index Scan on idx_active_metadata</code></p>
</li>
<li>
<p>A partial index that does not match the query pattern is wasted space and write cost</p>
</li>
</ul>
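<p>One more way the predicate can fail to line up is a parameterized <code>status</code>. Partial index matching happens at planning time, so when a prepared statement is planned generically the planner cannot prove that the parameter implies <code>status = 'active'</code>. A sketch, with <code>find_shoes</code> as a hypothetical statement name:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Literal predicate: matches the partial index definition.
EXPLAIN
SELECT * FROM products
WHERE status = &#39;active&#39; AND metadata @&gt; &#39;{&#34;category&#34;: &#34;shoes&#34;}&#39;;

-- Parameterized predicate: under a generic plan, $1 is unknown at planning time,
-- so idx_active_metadata cannot be chosen.
PREPARE find_shoes(text) AS
SELECT * FROM products
WHERE status = $1 AND metadata @&gt; &#39;{&#34;category&#34;: &#34;shoes&#34;}&#39;;

EXPLAIN EXECUTE find_shoes(&#39;active&#39;);
</code></pre></div>
<p>Keeping the partial index condition as a literal in the SQL text avoids the problem.</p>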
<h2 id="何時用-jsonb-vs-拆-column">When to use JSONB vs separate columns</h2>
<table>
  <thead>
      <tr>
          <th>Scenario</th>
          <th>Choice</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Irregular schema (user-generated metadata / customization)</td>
          <td>JSONB</td>
      </tr>
      <tr>
          <td>Semi-structured + 5-10 frequently queried keys</td>
          <td>JSONB + partial GIN index</td>
      </tr>
      <tr>
          <td>Regular schema, stable set of columns</td>
          <td>Separate columns (faster, easier to index)</td>
      </tr>
      <tr>
          <td>Nested structure + frequent expanding queries</td>
          <td>JSONB + jsonb_path_query</td>
      </tr>
      <tr>
          <td>Large documents (&gt; 1 KB) + frequent updates</td>
          <td>Separate columns or a separate table</td>
      </tr>
      <tr>
          <td>Fully schemaless workload</td>
          <td>Consider MongoDB instead of PG</td>
      </tr>
  </tbody>
</table>
<p>JSONB is <em>PG's tool for semi-structured data</em>, not <em>a MongoDB replacement</em>. For <em>mostly structured data plus a little JSON</em>, JSONB fits well; for <em>mostly JSON / complex nested aggregation</em>, MongoDB remains the specialist choice.</p>
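<p>The "mostly structured plus a little JSON" case might look like the hybrid layout below. This is a hypothetical table, not one from the article: stable, frequently queried attributes are real columns, and only the irregular remainder lives in JSONB.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Hypothetical hybrid layout (all names are illustrative).
CREATE TABLE catalog_items (
    id         BIGSERIAL PRIMARY KEY,
    name       TEXT NOT NULL,
    category   TEXT NOT NULL,
    price      NUMERIC(10, 2) NOT NULL,
    status     TEXT NOT NULL DEFAULT &#39;active&#39;,
    attributes JSONB NOT NULL DEFAULT &#39;{}&#39;::jsonb,
    -- Keep the free-form part an object so key lookups behave predictably.
    CHECK (jsonb_typeof(attributes) = &#39;object&#39;)
);

-- Regular columns get ordinary B-tree indexes.
CREATE INDEX idx_catalog_items_category ON catalog_items (category);

-- The irregular part gets a partial GIN index for containment queries.
CREATE INDEX idx_catalog_items_attributes
    ON catalog_items USING GIN (attributes jsonb_path_ops)
    WHERE status = &#39;active&#39;;
</code></pre></div>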
<h2 id="跟其他模組整合">Integration with other modules</h2>
<h3 id="跟-query-optimization">With Query Optimization</h3>
<p>Planner behavior for JSONB queries:</p>
<ul>
<li><code>@&gt;</code> containment uses GIN with both jsonb_ops and jsonb_path_ops</li>
<li><code>?</code> uses GIN only with jsonb_ops</li>
<li>The function form <code>jsonb_path_exists(...)</code> does not use a GIN index; the equivalent jsonpath operators <code>@?</code> / <code>@@</code> can, with either operator class</li>
<li>Check EXPLAIN to confirm the right index is used; see <a href="/blog/backend/01-database/vendors/postgresql/query-optimization/" data-link-title="PostgreSQL Query Optimization：EXPLAIN ANALYZE / pg_hint_plan / auto_explain 三層工具跟 4 個 case" data-link-desc="PG query 慢的根因常是 *planner 選錯 plan 或 statistics 過時*。本文從 4 個 production case 開場（seq scan vs index / hash vs nested loop / 多 column 統計缺 / parallel query 沒觸發）、走 EXPLAIN / EXPLAIN ANALYZE / auto_explain 三層工具、pg_hint_plan extension 跟 planner GUC 取捨、5 production 踩雷（ANALYZE 過時 / multi-column statistics / cost-base setting 不對齊硬體 / random_page_cost SSD 沒調 / parallel query 配置）、跟 MySQL query-optimization sibling 對比">Query Optimization</a></li>
</ul>
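<p>To make the function-versus-operator point concrete, the sketch below writes the same jsonpath predicate both ways. The GIN index helps mainly with path expressions built from accessors and equality tests, so the example uses <code>==</code>; the <code>color</code> key is an assumed sample field.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Function form: evaluated row by row, not matched against the GIN index.
SELECT id FROM products
WHERE jsonb_path_exists(metadata, &#39;$.variants[*] ? (@.color == &#34;red&#34;)&#39;);

-- Operator form of the same predicate: eligible for a GIN index.
SELECT id FROM products
WHERE metadata @? &#39;$.variants[*] ? (@.color == &#34;red&#34;)&#39;;

-- @@ evaluates a jsonpath predicate and is also GIN-eligible.
SELECT id FROM products
WHERE metadata @@ &#39;$.category == &#34;shoes&#34;&#39;;
</code></pre></div>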
<h3 id="跟-sql-features-baseline">With SQL Features Baseline</h3>
<p>JSONB is one of PG's structural leads; see <a href="/blog/backend/01-database/vendors/postgresql/sql-features-baseline/" data-link-title="PostgreSQL SQL Features：PG 早就有的、MySQL 8.0 才補的、PG 仍領先的" data-link-desc="PG 在 SQL features 上長期領先 MySQL — CTE / window function / lateral / partial index / FTS / JSONB / GIN index / materialized view 在 PG 早 5-15 年。MySQL 8.0（2018）補多數但 *index / storage / extension* 層仍是 PG 結構優勢。本文整理 PG 早期就有的特性、MySQL 8.0 補的差異、PG 仍領先的、跟 MySQL modern-sql-features sibling 反向視角">SQL Features Baseline</a>.</p>
<h3 id="跟-mvcc--lock-model">With MVCC + Lock Model</h3>
<p>A JSONB UPDATE rewrites the whole column value, and every update creates a new tuple: the same MVCC behavior as any other row update. See <a href="/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/" data-link-title="PostgreSQL MVCC &#43; Lock Model：為什麼 PG 比 MySQL 少 deadlock、但 vacuum 是別的代價" data-link-desc="PG 用 *MVCC-heavy &#43; 少 explicit lock* 的並行控制、跟 MySQL InnoDB 的 *lock-based*（record / gap / next-key）相反。本文走 MVCC 機制（tuple version &#43; xmin/xmax &#43; visibility）、PG 4 種 lock（row-level / table-level / advisory / predicate）、預測 SERIALIZABLE 行為、5 production 踩雷（idle transaction 卡 vacuum / SELECT FOR UPDATE 跨 transaction / advisory lock 沒釋放 / bloat 不是 vacuum 問題 / predicate lock 在 SSI 下 rollback）、跟 MySQL lock-contention sibling 對比">MVCC + Lock Model</a>.</p>
<h3 id="跟-mysql-json_table">With MySQL JSON_TABLE</h3>
<p>MySQL 8.0 JSON_TABLE resembles PG jsonpath (both follow the SQL standard), but the <em>indexing mechanism</em> is completely different:</p>
<ul>
<li>PG: JSONB + one GIN index over the whole column</li>
<li>MySQL: JSON column + generated column + an index on the generated column</li>
</ul>
<p>PG's JSONB GIN index is a <em>structural lead</em> that MySQL is unlikely to match in the short term. See <a href="/blog/backend/01-database/vendors/mysql/modern-sql-features/" data-link-title="MySQL 8.0 Modern SQL：CTE / window function / JSON_TABLE 不是「終於跟上 PG」、是進入 SQL 工程深度的入場券" data-link-desc="MySQL 8.0 在 SQL 特性上 *終於補齊* CTE、window function、lateral derived table、JSON_TABLE、hash join 等現代 SQL 特性。本文走 5 個關鍵特性、各自實際 production 場景、跟 PostgreSQL 對應特性的行為差異（特別是 JSON_TABLE vs PG JSONB / jsonb_path_query）、配置 / migration 注意事項、5 production 踩雷（CTE 不 materialize / window function 大量 sort spill / JSON_TABLE 跟 generated column 取捨 / hash join 預設沒開 / recursive CTE 深度上限）">MySQL Modern SQL Features</a>.</p>
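<p>Side by side, the two indexing approaches could be sketched as follows. The MySQL half assumes a <code>products</code> table with a <code>metadata JSON</code> column; the column length and index names are illustrative.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- MySQL 8.0: each path you want indexed needs its own generated column.
ALTER TABLE products
    ADD COLUMN category VARCHAR(64)
        GENERATED ALWAYS AS (metadata-&gt;&gt;&#39;$.category&#39;) STORED,
    ADD INDEX idx_products_category (category);

-- PostgreSQL: one GIN index serves containment queries on any key.
CREATE INDEX IF NOT EXISTS idx_products_metadata ON products USING GIN (metadata);
</code></pre></div>
<p>In MySQL, adding a new queryable key is a schema change; in PG the existing GIN index already covers it.</p>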
<h2 id="觀測-metric">Metrics to watch</h2>
<ul>
<li><code>pg_column_size(metadata)</code>: per-row JSONB size distribution</li>
<li><code>pg_relation_size('idx_name')</code>: size of the JSONB GIN index</li>
<li><code>pg_stat_user_indexes.idx_scan</code>: how many times the JSONB index has been scanned</li>
<li>TOAST table size: <code>SELECT pg_relation_size(reltoastrelid) FROM pg_class WHERE relname='products'</code></li>
</ul>
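<p>These metrics can be pulled together in two small queries. A sketch against the <code>products</code> table used throughout; an index with a large size and an <code>idx_scan</code> that stays near zero is a candidate for removal.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Usage count and size for every index on products.
SELECT indexrelname,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = &#39;products&#39;
ORDER BY idx_scan;

-- Heap size vs TOAST size for the table.
SELECT pg_size_pretty(pg_relation_size(c.oid))           AS heap_size,
       pg_size_pretty(pg_relation_size(c.reltoastrelid)) AS toast_size
FROM pg_class AS c
WHERE c.relname = &#39;products&#39;
  AND c.reltoastrelid &lt;&gt; 0;
</code></pre></div>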
<h2 id="相關連結">Related links</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/postgresql/" data-link-title="PostgreSQL" data-link-desc="多用途 OLTP 主流關聯式資料庫、MVCC、豐富 SQL 特性、是 Aurora / Cosmos DB / Spanner / CockroachDB / Aurora DSQL 的相容目標">PostgreSQL vendor overview</a></li>
<li><a href="/blog/backend/01-database/vendors/postgresql/sql-features-baseline/" data-link-title="PostgreSQL SQL Features：PG 早就有的、MySQL 8.0 才補的、PG 仍領先的" data-link-desc="PG 在 SQL features 上長期領先 MySQL — CTE / window function / lateral / partial index / FTS / JSONB / GIN index / materialized view 在 PG 早 5-15 年。MySQL 8.0（2018）補多數但 *index / storage / extension* 層仍是 PG 結構優勢。本文整理 PG 早期就有的特性、MySQL 8.0 補的差異、PG 仍領先的、跟 MySQL modern-sql-features sibling 反向視角">PG SQL Features Baseline</a> (JSONB is one of PG's structural leads)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/query-optimization/" data-link-title="PostgreSQL Query Optimization：EXPLAIN ANALYZE / pg_hint_plan / auto_explain 三層工具跟 4 個 case" data-link-desc="PG query 慢的根因常是 *planner 選錯 plan 或 statistics 過時*。本文從 4 個 production case 開場（seq scan vs index / hash vs nested loop / 多 column 統計缺 / parallel query 沒觸發）、走 EXPLAIN / EXPLAIN ANALYZE / auto_explain 三層工具、pg_hint_plan extension 跟 planner GUC 取捨、5 production 踩雷（ANALYZE 過時 / multi-column statistics / cost-base setting 不對齊硬體 / random_page_cost SSD 沒調 / parallel query 配置）、跟 MySQL query-optimization sibling 對比">PG Query Optimization</a> (making sure JSONB indexes are actually used)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/mvcc-lock-model/" data-link-title="PostgreSQL MVCC &#43; Lock Model：為什麼 PG 比 MySQL 少 deadlock、但 vacuum 是別的代價" data-link-desc="PG 用 *MVCC-heavy &#43; 少 explicit lock* 的並行控制、跟 MySQL InnoDB 的 *lock-based*（record / gap / next-key）相反。本文走 MVCC 機制（tuple version &#43; xmin/xmax &#43; visibility）、PG 4 種 lock（row-level / table-level / advisory / predicate）、預測 SERIALIZABLE 行為、5 production 踩雷（idle transaction 卡 vacuum / SELECT FOR UPDATE 跨 transaction / advisory lock 沒釋放 / bloat 不是 vacuum 問題 / predicate lock 在 SSI 下 rollback）、跟 MySQL lock-contention sibling 對比">PG MVCC + Lock Model</a> (JSONB updates and MVCC)</li>
<li><a href="/blog/backend/01-database/vendors/mysql/modern-sql-features/" data-link-title="MySQL 8.0 Modern SQL：CTE / window function / JSON_TABLE 不是「終於跟上 PG」、是進入 SQL 工程深度的入場券" data-link-desc="MySQL 8.0 在 SQL 特性上 *終於補齊* CTE、window function、lateral derived table、JSON_TABLE、hash join 等現代 SQL 特性。本文走 5 個關鍵特性、各自實際 production 場景、跟 PostgreSQL 對應特性的行為差異（特別是 JSON_TABLE vs PG JSONB / jsonb_path_query）、配置 / migration 注意事項、5 production 踩雷（CTE 不 materialize / window function 大量 sort spill / JSON_TABLE 跟 generated column 取捨 / hash join 預設沒開 / recursive CTE 深度上限）">MySQL Modern SQL Features</a> (JSON_TABLE vs JSONB comparison)</li>
<li><a href="/blog/backend/01-database/vendors/mongodb/" data-link-title="MongoDB" data-link-desc="Document database 代表、Atlas managed、跨雲可用、許多大規模平台從 MongoDB 起家">MongoDB vendor</a> (the alternative for pure document workloads)</li>
<li>Official docs: <a href="https://www.postgresql.org/docs/current/functions-json.html">PG JSON Functions</a> / <a href="https://www.postgresql.org/docs/current/datatype-json.html#JSON-INDEXING">JSONB Indexing</a></li>
</ul>
]]></content:encoded></item><item><title>MySQL Document Store / X Protocol</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/document-store-x-protocol/</link><pubDate>Fri, 22 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/document-store-x-protocol/</guid><description>&lt;p>The core responsibility of this MySQL Document Store / X Protocol article is to explain how MySQL provides a JSON document workflow inside its relational engine. &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/document-store/" data-link-title="Document Store" data-link-desc="說明以 JSON 文件與彈性 schema 提供資料存取的模式，以及它仍需的治理邊界">Document Store&lt;/a> lets an application work with collections through X Protocol and a CRUD API, but the data still lives under MySQL's storage, transaction, backup, and permission models.&lt;/p>
&lt;p>The reading anchor for this article: Document Store is a document access pattern inside MySQL, not a full replacement for a dedicated document database such as MongoDB. It suits flexible JSON that sits alongside a relational schema; it does not suit hiding the primary data model inside ungoverned JSON.&lt;/p>
&lt;p>The official documentation is the reference that pins down the X Protocol claims made here. Before implementing, check &lt;a href="https://dev.mysql.com/doc/refman/en/document-store.html">MySQL 8.4 Document Store&lt;/a>; this article was last checked on 2026-05-22.&lt;/p>
&lt;h2 id="responsibility-boundary">Responsibility Boundary&lt;/h2>
&lt;p>The core responsibility of this section is to make the relationship between Document Store and SQL tables explicit.&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Aspect&lt;/th>
 &lt;th>Document Store&lt;/th>
 &lt;th>SQL table / JSON column&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Access API&lt;/td>
 &lt;td>X Protocol, CRUD-style API&lt;/td>
 &lt;td>SQL, JSON functions&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Storage&lt;/td>
 &lt;td>MySQL InnoDB&lt;/td>
 &lt;td>MySQL InnoDB&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Transaction&lt;/td>
 &lt;td>MySQL transaction&lt;/td>
 &lt;td>MySQL transaction&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Governance&lt;/td>
 &lt;td>Still needs backup, roles, audit, and migration&lt;/td>
 &lt;td>Still needs schema / index review&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Query power&lt;/td>
 &lt;td>document-friendly access&lt;/td>
 &lt;td>SQL joins, indexes, optimizer&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
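&lt;p>The value of Document Store is lower development friction for flexible objects. It does not remove the responsibility for data contracts, indexes, migration, backup, and audit.&lt;/p>
&lt;h2 id="suitable-use-cases">Suitable Use Cases&lt;/h2>
&lt;p>The core responsibility of this section is to identify where the document pattern reasonably fits.&lt;/p>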
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Scenario&lt;/th>
 &lt;th>Why it fits&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Profile / preference&lt;/td>
 &lt;td>Fields change quickly; few query predicates&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Integration payload&lt;/td>
 &lt;td>The external JSON must be stored verbatim&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Feature flag / config&lt;/td>
 &lt;td>Read-heavy, write-light; schema changes frequently&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Hybrid relational + JSON&lt;/td>
 &lt;td>Relational core with flexible parts&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Prototype&lt;/td>
 &lt;td>Explore fields first, then relationalize gradually&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
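&lt;p>Document Store fits best for localized flexible data. If core queries depend heavily on joins, aggregation, or transactional invariants, move the stable fields back into the relational schema.&lt;/p>
&lt;h2 id="query-and-index">Query and Index&lt;/h2>
&lt;p>The core responsibility of this section is to keep JSON queries from becoming an unobservable black box.&lt;/p>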
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Concern&lt;/th>
 &lt;th>What to review&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Frequent filters&lt;/td>
 &lt;td>Whether a generated column / functional index is needed&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Sort / pagination&lt;/td>
 &lt;td>Whether it can use an index&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Schema drift&lt;/td>
 &lt;td>document version / validation&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Large document&lt;/td>
 &lt;td>update amplification, network payload&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Analytics&lt;/td>
 &lt;td>Whether to ETL into OLAP / a warehouse&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
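&lt;p>MySQL can index JSON queries through generated columns. A production service should write its frequently used JSON paths into the query contract so that queries do not scan whole documents every time.&lt;/p></description><content:encoded><![CDATA[<p>The core responsibility of this MySQL Document Store / X Protocol article is to explain how MySQL provides a JSON document workflow inside its relational engine. <a href="/blog/backend/knowledge-cards/document-store/" data-link-title="Document Store" data-link-desc="說明以 JSON 文件與彈性 schema 提供資料存取的模式，以及它仍需的治理邊界">Document Store</a> lets an application work with collections through X Protocol and a CRUD API, but the data still lives under MySQL's storage, transaction, backup, and permission models.</p>
<p>The reading anchor for this article: Document Store is a document access pattern inside MySQL, not a full replacement for a dedicated document database such as MongoDB. It suits flexible JSON that sits alongside a relational schema; it does not suit hiding the primary data model inside ungoverned JSON.</p>
<p>The official documentation is the reference that pins down the X Protocol claims made here. Before implementing, check <a href="https://dev.mysql.com/doc/refman/en/document-store.html">MySQL 8.4 Document Store</a>; this article was last checked on 2026-05-22.</p>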
<h2 id="responsibility-boundary">Responsibility Boundary</h2>
<p>The core responsibility of this section is to make the relationship between Document Store and SQL tables explicit.</p>
<table>
  <thead>
      <tr>
          <th>Aspect</th>
          <th>Document Store</th>
          <th>SQL table / JSON column</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Access API</td>
          <td>X Protocol, CRUD-style API</td>
          <td>SQL, JSON functions</td>
      </tr>
      <tr>
          <td>Storage</td>
          <td>MySQL InnoDB</td>
          <td>MySQL InnoDB</td>
      </tr>
      <tr>
          <td>Transaction</td>
          <td>MySQL transaction</td>
          <td>MySQL transaction</td>
      </tr>
      <tr>
          <td>Governance</td>
          <td>Still needs backup, roles, audit, and migration</td>
          <td>Still needs schema / index review</td>
      </tr>
      <tr>
          <td>Query power</td>
          <td>document-friendly access</td>
          <td>SQL joins, indexes, optimizer</td>
      </tr>
  </tbody>
</table>
<p>The value of Document Store is lower development friction for flexible objects. It does not remove the responsibility for data contracts, indexes, migration, backup, and audit.</p>
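<p>Because a collection is stored as an ordinary InnoDB table, the same documents remain reachable from plain SQL. A minimal sketch, assuming a hypothetical collection <code>profiles</code> in a hypothetical schema <code>app</code>, whose documents carry hypothetical <code>plan</code> and <code>displayName</code> keys; collection tables expose the document in a <code>doc</code> JSON column and its identifier in <code>_id</code>.</p>

```sql
-- Read a Document Store collection through SQL instead of the CRUD API.
-- The ->> operator extracts a JSON path and unquotes the result.
SELECT _id,
       doc->>'$.displayName' AS display_name
FROM app.profiles
WHERE doc->>'$.plan' = 'pro';
```

<p>This is also why the governance column in the table applies to both sides: backup, roles, and audit act on the underlying table regardless of which API wrote the document.</p>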
<h2 id="suitable-use-cases">Suitable Use Cases</h2>
<p>The core responsibility of this section is to identify where the document pattern reasonably fits.</p>
<table>
  <thead>
      <tr>
          <th>Scenario</th>
          <th>Why it fits</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Profile / preference</td>
          <td>Fields change quickly; few query predicates</td>
      </tr>
      <tr>
          <td>Integration payload</td>
          <td>The external JSON must be stored verbatim</td>
      </tr>
      <tr>
          <td>Feature flag / config</td>
          <td>Read-heavy, write-light; schema changes frequently</td>
      </tr>
      <tr>
          <td>Hybrid relational + JSON</td>
          <td>Relational core with flexible parts</td>
      </tr>
      <tr>
          <td>Prototype</td>
          <td>Explore fields first, then relationalize gradually</td>
      </tr>
  </tbody>
</table>
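<p>Document Store fits best for localized flexible data. If core queries depend heavily on joins, aggregation, or transactional invariants, move the stable fields back into the relational schema.</p>
<h2 id="query-and-index">Query and Index</h2>
<p>The core responsibility of this section is to keep JSON queries from becoming an unobservable black box.</p>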
<table>
  <thead>
      <tr>
          <th>Concern</th>
          <th>What to review</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Frequent filters</td>
          <td>Whether a generated column / functional index is needed</td>
      </tr>
      <tr>
          <td>Sort / pagination</td>
          <td>Whether it can use an index</td>
      </tr>
      <tr>
          <td>Schema drift</td>
          <td>document version / validation</td>
      </tr>
      <tr>
          <td>Large document</td>
          <td>update amplification, network payload</td>
      </tr>
      <tr>
          <td>Analytics</td>
          <td>Whether to ETL into OLAP / a warehouse</td>
      </tr>
  </tbody>
</table>
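<p>MySQL can index JSON queries through generated columns. A production service should write its frequently used JSON paths into the query contract so that queries do not scan whole documents every time.</p>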
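<p>The generated-column approach can be sketched as follows. The <code>profiles</code> table, the <code>$.plan</code> path, and the <code>VARCHAR(32)</code> length are hypothetical names and sizes chosen for illustration; a real table should size the column from the values the path actually holds.</p>

```sql
-- Hypothetical table holding one JSON document per row.
CREATE TABLE profiles (
  id  BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  doc JSON NOT NULL
);

-- Expose the frequently filtered JSON path as a generated column.
ALTER TABLE profiles
  ADD COLUMN plan VARCHAR(32)
    GENERATED ALWAYS AS (doc->>'$.plan') STORED;

-- Index the generated column so filters on it avoid scanning every document.
CREATE INDEX idx_profiles_plan ON profiles (plan);

-- Check the plan: the key column of the EXPLAIN output should name the index.
EXPLAIN SELECT id FROM profiles WHERE plan = 'pro';
```

<p>Querying the generated column by name keeps the JSON path in one place, which makes it easier to treat the path as part of the query contract.</p>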
<h2 id="migration-boundary">Migration Boundary</h2>
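<p>The core responsibility of this section is to keep document data evolvable. Document fields are flexible, but the application still depends on certain keys; once a key enters a workflow, it needs versioning and validation.</p>
<p>Minimum governance:</p>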
<ol>
<li>Document version field.</li>
<li>Required-key validation at the application boundary.</li>
<li>Backfill script for each new required key.</li>
<li>Index review for each promoted key.</li>
<li>Export / backup restore validation.</li>
</ol>
<p>When a JSON key becomes a join key, permission key, or reporting key, evaluate moving it into a relational column.</p>
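<p>The backfill and version steps in the list above could look like the following sketch. It assumes a hypothetical <code>profiles</code> table with an <code>id</code> primary key and a <code>doc</code> JSON column, a hypothetical new required key <code>locale</code> with default <code>'en'</code>, and a hypothetical version field named <code>schema_version</code>.</p>

```sql
-- Backfill: documents that lack the new required key get a default value,
-- and their version field is bumped in the same statement.
UPDATE profiles
SET doc = JSON_SET(doc, '$.locale', 'en', '$.schema_version', 2)
WHERE NOT JSON_CONTAINS_PATH(doc, 'one', '$.locale');

-- Verification: list documents that are still missing any required key.
SELECT id
FROM profiles
WHERE NOT JSON_CONTAINS_PATH(doc, 'all', '$.locale', '$.schema_version');
```

<p>An empty result from the verification query is a reasonable precondition before the application starts rejecting documents that lack the key.</p>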
<h2 id="no-go-conditions">No-Go Conditions</h2>
<p>The core responsibility of this section is to mark the limits of Document Store.</p>
<table>
  <thead>
      <tr>
          <th>Signal</th>
          <th>Suggested route</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Primary data is all nested documents</td>
          <td>MongoDB / document database evaluation</td>
      </tr>
      <tr>
          <td>Heavy document aggregation</td>
          <td>OLAP / search / document-oriented engine</td>
      </tr>
      <tr>
          <td>A JSON path has become a core index</td>
          <td>relationalize the key or use a generated column</td>
      </tr>
      <tr>
          <td>Complex joins across documents are needed</td>
          <td>relational schema</td>
      </tr>
      <tr>
          <td>Schema governance is needed</td>
          <td>migration + validation</td>
      </tr>
  </tbody>
</table>
<p>Document Store should serve the flexible edge, not replace data modeling. Once a flexible area stabilizes, bring it under schema governance.</p>
<h2 id="下一步路由">Next steps</h2>
<p>After Document Store / X Protocol, read <a href="../modern-sql-features/">Modern SQL Features</a> for JSON and SQL capabilities; if the primary data model is document-shaped, read <a href="/blog/backend/01-database/vendors/mongodb/" data-link-title="MongoDB" data-link-desc="Document database 代表、Atlas managed、跨雲可用、許多大規模平台從 MongoDB 起家">MongoDB</a>; for migration to PostgreSQL JSONB, see <a href="../migrate-to-postgresql/">MySQL to PostgreSQL</a>.</p>
]]></content:encoded></item></channel></rss>