<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Innodb on Tarragon</title><link>https://tarrragon.github.io/blog/tags/innodb/</link><description>Recent content in Innodb on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Tue, 19 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/innodb/index.xml" rel="self" type="application/rss+xml"/><item><title>MySQL InnoDB Tuning：為什麼一個 100 GB DB 在 64 GB RAM server 上 query 慢 5 倍</title><link>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/innodb-tuning/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/innodb-tuning/</guid><description>&lt;blockquote>
&lt;p>This article is the implementation-layer deep dive for the &lt;a href="https://tarrragon.github.io/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL&lt;/a> overview. The overview already covers where MySQL sits in the OLTP landscape; this article focuses on &lt;em>InnoDB engine tuning&lt;/em>: the 4 knobs with the largest impact and the production behavior each one drives.&lt;/p>&lt;/blockquote>
&lt;hr>
&lt;h2 id="開場常見痛點">Opening: a common pain point&lt;/h2>
&lt;p>Take a 100 GB MySQL database on a server with 64 GB of RAM, where p99 query latency climbs from 5 ms to 50 ms. The first guess is server overload, but CPU is &amp;lt; 30% and disk IO sits at 50 IOPS. So why is it slow?&lt;/p>
&lt;p>Run &lt;code>SHOW VARIABLES LIKE 'innodb_buffer_pool_size'&lt;/code> and the answer is &lt;code>134217728&lt;/code> (128 MB). On a 64 GB server the buffer pool holds only 128 MB, so roughly 99.9% of the 100 GB dataset cannot be cached and queries keep going back to disk. CPU is idle and the disk is not saturated because &lt;em>MySQL itself is not using the RAM&lt;/em>: running a 100 GB database on InnoDB defaults is effectively disk-only mode.&lt;/p>
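&lt;p>This case shows the core of InnoDB tuning: MySQL's defaults are &lt;em>sized for small machines&lt;/em> (a 128 MB buffer pool), so the more RAM a production server has, the further the defaults sit from optimal.&lt;/p>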
&lt;h2 id="4-個-critical-knob">4 critical knobs&lt;/h2>
&lt;p>In roughly 90% of production cases, tuning these 4 resolves most InnoDB performance problems:&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Knob&lt;/th>
 &lt;th>Default&lt;/th>
 &lt;th>Production recommendation&lt;/th>
 &lt;th>Affects&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;code>innodb_buffer_pool_size&lt;/code>&lt;/td>
 &lt;td>128 MB&lt;/td>
 &lt;td>50-75% of system RAM (75% on a dedicated server)&lt;/td>
 &lt;td>Read performance (whether data fits in RAM)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>innodb_log_file_size&lt;/code>&lt;/td>
 &lt;td>48 MB (×2 files)&lt;/td>
 &lt;td>1-4 GB (by write throughput; on 8.0.30+ use &lt;code>innodb_redo_log_capacity&lt;/code>)&lt;/td>
 &lt;td>Write performance (flush frequency)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>innodb_flush_log_at_trx_commit&lt;/code>&lt;/td>
 &lt;td>1 (full ACID)&lt;/td>
 &lt;td>1 (finance / orders) / 2 (high throughput, can tolerate 1 second of loss)&lt;/td>
 &lt;td>Write throughput vs durability&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>innodb_io_capacity&lt;/code> + &lt;code>_max&lt;/code>&lt;/td>
 &lt;td>200 / 2000&lt;/td>
 &lt;td>SSD: 2000 / 20000; NVMe: 10000 / 40000&lt;/td>
 &lt;td>Flush rate (matched to storage)&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>Other knobs (&lt;code>innodb_thread_concurrency&lt;/code> / &lt;code>innodb_buffer_pool_instances&lt;/code> / &lt;code>innodb_read_io_threads&lt;/code> and so on) matter too, but in most cases &lt;em>getting these 4 right first&lt;/em> matters more than fine-tuning the other 20.&lt;/p>
&lt;h2 id="knob-1buffer-pool--把-working-set-拉進-ram">Knob 1: Buffer pool, pulling the working set into RAM&lt;/h2>
&lt;p>The &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/buffer-pool/" data-link-title="Buffer Pool" data-link-desc="說明資料庫如何用記憶體快取磁碟頁，以降低 I/O 並影響查詢效能">InnoDB buffer pool&lt;/a> is a &lt;em>page cache&lt;/em>: 16 KB pages read from disk stay cached in RAM, so the next query that needs them reads straight from RAM. The larger the buffer pool, the higher the cache hit ratio and the lower the disk IO.&lt;/p></description><content:encoded><![CDATA[<blockquote>
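<p>This article is the implementation-layer deep dive for the <a href="/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL</a> overview. The overview already covers where MySQL sits in the OLTP landscape; this article focuses on <em>InnoDB engine tuning</em>: the 4 knobs with the largest impact and the production behavior each one drives.</p></blockquote>
<hr>
<h2 id="開場常見痛點">Opening: a common pain point</h2>
<p>Take a 100 GB MySQL database on a server with 64 GB of RAM, where p99 query latency climbs from 5 ms to 50 ms. The first guess is server overload, but CPU is &lt; 30% and disk IO sits at 50 IOPS. So why is it slow?</p>
<p>Run <code>SHOW VARIABLES LIKE 'innodb_buffer_pool_size'</code> and the answer is <code>134217728</code> (128 MB). On a 64 GB server the buffer pool holds only 128 MB, so roughly 99.9% of the 100 GB dataset cannot be cached and queries keep going back to disk. CPU is idle and the disk is not saturated because <em>MySQL itself is not using the RAM</em>: running a 100 GB database on InnoDB defaults is effectively disk-only mode.</p>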
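<p>This case shows the core of InnoDB tuning: MySQL's defaults are <em>sized for small machines</em> (a 128 MB buffer pool), so the more RAM a production server has, the further the defaults sit from optimal.</p>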
<h2 id="4-個-critical-knob">4 critical knobs</h2>
<p>In roughly 90% of production cases, tuning these 4 resolves most InnoDB performance problems:</p>
<table>
  <thead>
      <tr>
          <th>Knob</th>
          <th>Default</th>
          <th>Production recommendation</th>
          <th>Affects</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>innodb_buffer_pool_size</code></td>
          <td>128 MB</td>
          <td>50-75% of system RAM (75% on a dedicated server)</td>
          <td>Read performance (whether data fits in RAM)</td>
      </tr>
      <tr>
          <td><code>innodb_log_file_size</code></td>
          <td>48 MB (×2 files)</td>
          <td>1-4 GB (by write throughput; on 8.0.30+ use <code>innodb_redo_log_capacity</code>)</td>
          <td>Write performance (flush frequency)</td>
      </tr>
      <tr>
          <td><code>innodb_flush_log_at_trx_commit</code></td>
          <td>1 (full ACID)</td>
          <td>1 (finance / orders) / 2 (high throughput, can tolerate 1 second of loss)</td>
          <td>Write throughput vs durability</td>
      </tr>
      <tr>
          <td><code>innodb_io_capacity</code> + <code>_max</code></td>
          <td>200 / 2000</td>
          <td>SSD: 2000 / 20000; NVMe: 10000 / 40000</td>
          <td>Flush rate (matched to storage)</td>
      </tr>
  </tbody>
</table>
<p>Other knobs (<code>innodb_thread_concurrency</code> / <code>innodb_buffer_pool_instances</code> / <code>innodb_read_io_threads</code> and so on) matter too, but in most cases <em>getting these 4 right first</em> matters more than fine-tuning the other 20.</p>
<h2 id="knob-1buffer-pool--把-working-set-拉進-ram">Knob 1: Buffer pool, pulling the working set into RAM</h2>
<p>The <a href="/blog/backend/knowledge-cards/buffer-pool/" data-link-title="Buffer Pool" data-link-desc="說明資料庫如何用記憶體快取磁碟頁，以降低 I/O 並影響查詢效能">InnoDB buffer pool</a> is a <em>page cache</em>: 16 KB pages read from disk stay cached in RAM, so the next query that needs them reads straight from RAM. The larger the buffer pool, the higher the cache hit ratio and the lower the disk IO.</p>
<p><strong>Sizing</strong>:</p>
<ul>
<li><em>Dedicated MySQL server</em>: 70-80% of RAM (the remaining 20-30% goes to the OS, other MySQL structures, and connection buffers)</li>
<li><em>Shared server</em>: 30-50% of RAM (depending on what other processes need)</li>
<li><em>Container / Kubernetes</em>: 70% of the container memory limit (not host RAM)</li>
</ul>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 64 GB RAM dedicated server</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="na">innodb_buffer_pool_size</span> <span class="o">=</span> <span class="s">48G</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="na">innodb_buffer_pool_instances</span> <span class="o">=</span> <span class="s">8  # split into 8 instances to reduce mutex contention (6 GB per instance)</span></span></span></code></pre></div><p><strong>Buffer pool warm-up</strong>: after a restart the buffer pool is empty, and hot data has to be pulled back from disk gradually. Since 5.7, MySQL by default <em>dumps the buffer pool LRU page list to disk at shutdown</em> and <em>restores it automatically at startup</em>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="ln">1</span><span class="cl"><span class="na">innodb_buffer_pool_dump_at_shutdown</span> <span class="o">=</span> <span class="s">1</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="na">innodb_buffer_pool_load_at_startup</span> <span class="o">=</span> <span class="s">1</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="na">innodb_buffer_pool_dump_pct</span> <span class="o">=</span> <span class="s">75  # dump only the hottest 75% of the page list</span></span></span></code></pre></div><p>Without this warm-up, query latency stays elevated for the first hour after a restart and the application sees a p99 spike.</p>
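<p>The sizing rules above can be sketched in Python as a rough calculator. This is a minimal sketch, not an official formula: the helper name <code>buffer_pool_bytes</code> and the midpoint fractions are assumptions made for illustration. The result is rounded down to a multiple of <code>innodb_buffer_pool_chunk_size</code> (128 MB by default) times <code>innodb_buffer_pool_instances</code>, because MySQL adjusts the configured value to such a multiple anyway.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># Fractions follow the sizing list above; picking the midpoints is an assumption.
SIZING_FRACTION = {
    "dedicated": 0.75,  # 70-80% of RAM
    "shared": 0.40,     # 30-50% of RAM
    "container": 0.70,  # 70% of the container memory limit
}

CHUNK_BYTES = 128 * 1024 * 1024  # default innodb_buffer_pool_chunk_size
GIB = 1024 ** 3


def buffer_pool_bytes(memory_bytes, deployment, instances=8):
    """Suggest innodb_buffer_pool_size for a host or container memory limit."""
    target = int(memory_bytes * SIZING_FRACTION[deployment])
    # Round down to a whole number of (chunk size * instances) units.
    unit = CHUNK_BYTES * instances
    return target // unit * unit


print(buffer_pool_bytes(64 * GIB, "dedicated") // GIB)  # 48
</code></pre></div>
<p>For small hosts, passing <code>instances=1</code> keeps the rounding unit at a single chunk so the suggestion does not round down to zero.</p>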
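<h2 id="knob-2redo-log--flush-頻率跟寫吞吐">Knob 2: Redo log, flush frequency and write throughput</h2>
<p>InnoDB writes <em>go to the redo log first (sequential writes)</em> and reach the data files asynchronously later (random writes). When the redo log fills up, InnoDB is forced to flush data files, and write throughput drops while that flush runs.</p>
<p><code>innodb_log_file_size</code> sets the size of each log file (2 files by default):</p>
<ul>
<li>5.7: default 48 MB × 2 = 96 MB total</li>
<li>8.0: default is still 48 MB × 2; 8.0.30+ switches to the dynamic <code>innodb_redo_log_capacity</code> (default 100 MB total)</li>
</ul>
<p>On a server handling 5K writes per second (WPS), the default capacity may force a flush <em>about once a minute</em>, stalling write throughput again and again. Raising the total to 1-4 GB stretches that to roughly one flush every 30 minutes, and write throughput stabilizes.</p>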





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="ln">1</span><span class="cl"><span class="na">innodb_log_file_size</span> <span class="o">=</span> <span class="s">2G       # 1-4 GB for write-heavy servers</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="na">innodb_log_files_in_group</span> <span class="o">=</span> <span class="s">2   # the default of 2 is enough</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="na">innodb_log_buffer_size</span> <span class="o">=</span> <span class="s">64M    # RAM buffer for log records before they are written to disk</span></span></span></code></pre></div><p><strong>Trade-off</strong>: the larger the log files, the longer crash recovery can take (after a crash InnoDB has to replay the redo written since the last checkpoint). A 1 GB log usually recovers in &lt; 1 minute; 4 GB can take 5 minutes or more. On SSD / NVMe the trade-off is mild; on HDD it needs attention.</p>
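<p>Improvements in MySQL 8.0: from 8.0.30 the redo capacity can be resized at runtime through <code>innodb_redo_log_capacity</code> (no restart needed), and <em>dedicated redo log writer threads</em> reduce mutex contention.</p>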
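<p>One way to pick a value inside the 1-4 GB range is to measure how much redo the server actually writes. The sketch below assumes two readings of the <code>Innodb_os_log_written</code> status counter taken a known number of seconds apart, ideally at peak load; the helper name <code>redo_capacity_bytes</code> and the sample numbers are hypothetical, and the 30-minute window mirrors the flush interval mentioned above.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">MIB = 1024 ** 2


def redo_capacity_bytes(log_written_start, log_written_end, interval_s, window_s=1800):
    """Estimate the redo capacity that absorbs window_s seconds of writes.

    The two samples are Innodb_os_log_written readings taken
    interval_s seconds apart.
    """
    bytes_per_second = (log_written_end - log_written_start) / interval_s
    return int(bytes_per_second * window_s)


# Hypothetical samples: 60 MiB of redo written during a 60 second interval.
needed = redo_capacity_bytes(0, 60 * MIB, interval_s=60)
print(needed // MIB)  # 1800
</code></pre></div>
<p>With these sample numbers the estimate is about 1.8 GB of total redo, which lands inside the 1-4 GB range recommended above.</p>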
<h2 id="knob-3flush-method--acid-vs-吞吐">Knob 3: Flush method, ACID vs throughput</h2>
<p><code>innodb_flush_log_at_trx_commit</code> controls <em>whether each transaction commit flushes the log to disk</em>:</p>
<ul>
<li><code>1</code> (default): fsync the log file on every commit → <em>zero data loss on crash</em></li>
<li><code>2</code>: write to the log file on every commit (the write lands in the OS cache; fsync runs about once per second instead of per commit) → <em>a mysqld crash loses nothing, an OS crash can lose about 1 second</em></li>
<li><code>0</code>: write and fsync about once per second → <em>any crash can lose about 1 second</em></li>
</ul>
<p><code>sync_binlog</code> is the counterpart for the binlog (not the InnoDB log):</p>
<ul>
<li><code>1</code> (recommended): fsync the binlog on every commit</li>
<li><code>0</code>: rely on the OS to sync, so binlog events are easily lost → risk for replication / CDC</li>
</ul>
<p><strong>Production combinations</strong>:</p>
<table>
  <thead>
      <tr>
          <th>Use case</th>
          <th><code>innodb_flush_log_at_trx_commit</code></th>
          <th><code>sync_binlog</code></th>
          <th>Write throughput</th>
          <th>Crash data loss</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Finance / orders / payments</td>
          <td>1</td>
          <td>1</td>
          <td>baseline</td>
          <td>0</td>
      </tr>
      <tr>
          <td>General web application</td>
          <td>1</td>
          <td>1</td>
          <td>baseline</td>
          <td>0</td>
      </tr>
      <tr>
          <td>High write throughput, tolerates 1 s of loss</td>
          <td>2</td>
          <td>1</td>
          <td>+30-50%</td>
          <td>Up to 1 s on OS crash</td>
      </tr>
      <tr>
          <td>Dev / test</td>
          <td>2</td>
          <td>0</td>
          <td>+50-100%</td>
          <td>Not a concern</td>
      </tr>
      <tr>
          <td>Do not use</td>
          <td>0</td>
          <td>0</td>
          <td>+100%</td>
          <td>Data loss on any crash</td>
      </tr>
  </tbody>
</table>
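<p>Most production systems run <code>1 + 1</code>: slower, but <em>simple and predictable</em>. Before switching to <code>2 + 1</code>, confirm explicitly that the system <em>can tolerate 1 second of data loss</em>, usually as part of a Disaster Recovery Plan review.</p>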
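<p>The two setting lists above can be folded into a small lookup that a config review script might use. This is an illustrative sketch only: the function <code>describe_durability</code> and the wording of each entry are assumptions, and only the values discussed in this section (0, 1, 2 for the redo log, 0 and 1 for the binlog) are covered.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># Loss exposure per value of innodb_flush_log_at_trx_commit, as listed above.
REDO_LOSS = {
    1: "none",
    2: "about 1 second on OS crash",
    0: "about 1 second on any crash",
}

# Loss exposure per value of sync_binlog (only 0 and 1 are covered here).
BINLOG_LOSS = {
    1: "none",
    0: "recent binlog events on OS crash",
}


def describe_durability(flush_log_at_trx_commit, sync_binlog):
    """Summarize what a crash can lose for a given pair of settings."""
    return {
        "redo_log": REDO_LOSS[flush_log_at_trx_commit],
        "binlog": BINLOG_LOSS[sync_binlog],
        # Only 1 + 1 keeps both logs fully durable.
        "full_durability": (flush_log_at_trx_commit, sync_binlog) == (1, 1),
    }


print(describe_durability(2, 1)["redo_log"])  # about 1 second on OS crash
</code></pre></div>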
<h2 id="knob-4io-capacity--適配儲存">Knob 4: IO capacity, matching the storage</h2>
<p>InnoDB's background flush rate is bounded by <code>innodb_io_capacity</code>:</p>
<ul>
<li><code>innodb_io_capacity</code> (normal): target IOPS for background flushing</li>
<li><code>innodb_io_capacity_max</code> (burst): ceiling for emergency flushing</li>
</ul>
<p><strong>Values by storage type</strong>:</p>
<table>
  <thead>
      <tr>
          <th>Storage</th>
          <th>IOPS capability</th>
          <th><code>innodb_io_capacity</code></th>
          <th><code>innodb_io_capacity_max</code></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>7200 RPM HDD</td>
          <td>~80 IOPS</td>
          <td>100</td>
          <td>200</td>
      </tr>
      <tr>
          <td>SSD (SATA)</td>
          <td>10K-50K IOPS</td>
          <td>2000</td>
          <td>20000</td>
      </tr>
      <tr>
          <td>NVMe SSD</td>
          <td>100K-500K IOPS</td>
          <td>10000</td>
          <td>40000</td>
      </tr>
      <tr>
          <td>EBS gp3</td>
          <td>3000-16000 IOPS</td>
          <td>5000</td>
          <td>16000</td>
      </tr>
      <tr>
          <td>EBS io2</td>
          <td>50K-256K IOPS</td>
          <td>20000</td>
          <td>60000</td>
      </tr>
  </tbody>
</table>
<p>The default <code>200 / 2000</code> is <em>sized for HDD</em>. On an SSD / NVMe server the defaults mean InnoDB throttles itself: flushing is slow and writes become the bottleneck.</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># NVMe SSD server</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="na">innodb_io_capacity</span> <span class="o">=</span> <span class="s">10000</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="na">innodb_io_capacity_max</span> <span class="o">=</span> <span class="s">40000</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="na">innodb_flush_neighbors</span> <span class="o">=</span> <span class="s">0  # NVMe does not need neighboring pages flushed as a group</span></span></span></code></pre></div><h2 id="5-個-production-踩雷">5 production pitfalls</h2>
<h3 id="1-buffer-pool-沒-warm-up--重啟後-1-小時-p99-飆">1. Buffer pool without warm-up: p99 spikes for 1 hour after restart</h3>
<p>After a MySQL restart (OS upgrade / config change / failover) the buffer pool is empty, every query reads from disk the first time, p99 latency jumps 5-10x, and the application sees timeouts.</p>
<p>Fixes:</p>
<ul>
<li>Enable <code>innodb_buffer_pool_dump_at_shutdown=1</code> + <code>innodb_buffer_pool_load_at_startup=1</code></li>
<li>After a crash with <em>no graceful shutdown</em> (OOM / kernel panic) no dump was written, so the first hour after restart is still rough</li>
<li>Before restarting an important server, dump manually: <code>SET GLOBAL innodb_buffer_pool_dump_now=ON</code></li>
<li>Where a cold cache is not tolerable, <em>pre-warm the new primary</em> before failover (replay queries to pull hot data into the buffer pool)</li>
</ul>
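<h3 id="2-log-file-size-設太小--checkpoint-storm">2. Log file size too small: checkpoint storm</h3>
<p>With the default <code>innodb_log_file_size=48M</code>, a write-heavy server fills the log about once a minute, and each forced flush turns into a <em>checkpoint storm</em>: write throughput drops 50% and p99 spikes. The warning sign is <code>Innodb_log_waits</code> staying &gt; 0.</p>
<p>Fixes:</p>
<ul>
<li>Monitor <code>SHOW STATUS LIKE 'Innodb_log_waits'</code>; it should stay close to 0 over time</li>
<li>Raise <code>innodb_log_file_size</code> to 1-4 GB (by write throughput)</li>
<li>8.0.30+ can resize at runtime through <code>innodb_redo_log_capacity</code>; 5.7 requires a <em>clean shutdown</em> before the change, so dump the buffer pool first (to avoid a cold cache after the restart)</li>
</ul>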
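<h3 id="3-sync_binlog0-換速度--replication-永久-broken-風險">3. Trading <code>sync_binlog=0</code> for speed: risk of permanently broken replication</h3>
<p>Dev / staging sets <code>sync_binlog=0</code> to speed up writes, the config later gets copied to production, and production ends up with <code>sync_binlog=0</code> as well. After an OS crash the binlog is missing the last few seconds of transactions, the replica's GTID set diverges from the primary's, replication breaks, and the replica has to be <em>rebuilt from a base backup</em> (hours of recovery).</p>
<p>Fixes:</p>
<ul>
<li><em>Always run production with <code>sync_binlog=1</code></em>; do not sacrifice binlog durability for write throughput</li>
<li>Keep dev / staging config separate from production; do not copy config files across directly</li>
<li>After a replica loses contact, <em>re-attach it automatically with GTID</em> (not binlog position). This still needs a complete binlog, so <code>sync_binlog=0</code> remains a risk</li>
</ul>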
<h3 id="4-io-scheduler--不是-innodb-tuning-但影響大">4. IO scheduler: not InnoDB tuning, but high impact</h3>
<p>The choice of Linux IO scheduler (<code>noop</code> / <code>deadline</code> / <code>cfq</code>) matters a lot on SSD / NVMe:</p>
<ul>
<li><code>cfq</code> (the traditional default for spinning disks): a serious bottleneck on SSD</li>
<li><code>deadline</code>: better on SSD, but has a latency cap</li>
<li><code>noop</code> / <code>none</code>: best for NVMe (lets the device manage its own queue)</li>
</ul>
<p><strong>Production check</strong>:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">cat /sys/block/sda/queue/scheduler
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"># expected: [none] mq-deadline (NVMe)</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="c1"># bad:      noop deadline [cfq] (cfq is the wrong choice)</span></span></span></code></pre></div><p>This is not an InnoDB knob, but it can shift InnoDB IO behavior by &gt; 30%. Confirm the OS-level IO scheduler is right before tuning InnoDB.</p>
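<p>The same check can be scripted across a fleet. The sketch below is an assumption-laden illustration: it relies only on the sysfs convention shown above, where the active scheduler appears in square brackets, and the helper names <code>active_scheduler</code> and <code>read_scheduler</code> are made up for this example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import re
from pathlib import Path


def active_scheduler(scheduler_line):
    """Return the scheduler shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", scheduler_line)
    return match.group(1) if match else None


def read_scheduler(device):
    """Read the active scheduler for a block device name such as 'sda'."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())


print(active_scheduler("noop deadline [cfq]"))  # cfq
print(active_scheduler("[none] mq-deadline"))   # none
</code></pre></div>
<p>A monitoring job could call <code>read_scheduler</code> per data volume and alert when the result is <code>cfq</code>.</p>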
<h3 id="5-undo-log-膨脹--purge-跟不上">5. Undo log bloat: purge cannot keep up</h3>
<p>The undo log keeps <em>old row versions that a future rollback may need</em>. A long transaction (hours-level) keeps undo accumulating because it cannot be purged, until the InnoDB tablespace has grown by several GB and the disk fills up.</p>
<p>Signals:</p>
<ul>
<li><code>History list length</code> in <code>SHOW ENGINE INNODB STATUS</code> keeps growing (normal &lt; 1000, abnormal in the millions)</li>
<li><code>trx_rseg_history_len</code> in <code>information_schema.innodb_metrics</code></li>
</ul>
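<p>The first signal can be extracted from the status text with a short parser. This is a minimal sketch: it assumes the output of <code>SHOW ENGINE INNODB STATUS</code> has already been fetched as a string, the helper names are hypothetical, and the middle bucket name <code>growing</code> is an assumption layered on the two thresholds quoted above.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import bisect
import re


def history_list_length(innodb_status_text):
    """Extract N from the 'History list length N' line, or None if missing."""
    match = re.search(r"History list length (\d+)", innodb_status_text)
    return int(match.group(1)) if match else None


def purge_state(length, normal_limit=1000, alarm_limit=1_000_000):
    """Bucket a history list length using the thresholds quoted above."""
    # bisect_right returns 0 below normal_limit, 1 between the limits, 2 beyond.
    bucket = bisect.bisect_right([normal_limit, alarm_limit], length)
    return ("normal", "growing", "abnormal")[bucket]


sample = "Purge done for trx's n:o ...\nHistory list length 2345\n"
print(purge_state(history_list_length(sample)))  # growing
</code></pre></div>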
<p>Fixes:</p>
<ul>
<li>Find long-running transactions: <code>SELECT * FROM information_schema.innodb_trx WHERE trx_started &lt; NOW() - INTERVAL 1 HOUR</code></li>
<li>KILL the transaction (with care; it may point to an application bug)</li>
<li>8.0+ uses separate undo tablespaces (<code>innodb_undo_tablespaces</code>), which keep undo out of the main tablespace and can be truncated</li>
</ul>
<h2 id="容量規劃要點">Capacity planning essentials</h2>
<p>For a server with 64 GB RAM, NVMe SSD, 5K WPS, and a 100 GB database:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># my.cnf production-ready baseline</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="k">[mysqld]</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="c1"># Buffer pool (75% RAM)</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="na">innodb_buffer_pool_size</span> <span class="o">=</span> <span class="s">48G</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="na">innodb_buffer_pool_instances</span> <span class="o">=</span> <span class="s">8</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="na">innodb_buffer_pool_dump_at_shutdown</span> <span class="o">=</span> <span class="s">1</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="na">innodb_buffer_pool_load_at_startup</span> <span class="o">=</span> <span class="s">1</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="c1"># Redo log</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="na">innodb_log_file_size</span> <span class="o">=</span> <span class="s">2G</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="na">innodb_log_files_in_group</span> <span class="o">=</span> <span class="s">2</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="na">innodb_log_buffer_size</span> <span class="o">=</span> <span class="s">64M</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">
</span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="c1"># Flush behavior</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="na">innodb_flush_log_at_trx_commit</span> <span class="o">=</span> <span class="s">1</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="na">sync_binlog</span> <span class="o">=</span> <span class="s">1</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="na">innodb_flush_method</span> <span class="o">=</span> <span class="s">O_DIRECT  # bypass the OS page cache to avoid double caching</span>
</span></span><span class="line"><span class="ln">18</span><span class="cl">
</span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="c1"># IO capacity (NVMe)</span>
</span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="na">innodb_io_capacity</span> <span class="o">=</span> <span class="s">10000</span>
</span></span><span class="line"><span class="ln">21</span><span class="cl"><span class="na">innodb_io_capacity_max</span> <span class="o">=</span> <span class="s">40000</span>
</span></span><span class="line"><span class="ln">22</span><span class="cl"><span class="na">innodb_flush_neighbors</span> <span class="o">=</span> <span class="s">0</span>
</span></span><span class="line"><span class="ln">23</span><span class="cl"><span class="na">innodb_lru_scan_depth</span> <span class="o">=</span> <span class="s">1024</span>
</span></span><span class="line"><span class="ln">24</span><span class="cl">
</span></span><span class="line"><span class="ln">25</span><span class="cl"><span class="c1"># Concurrency</span>
</span></span><span class="line"><span class="ln">26</span><span class="cl"><span class="na">innodb_thread_concurrency</span> <span class="o">=</span> <span class="s">0  # 0 = no limit (recommended on 8.0+)</span>
</span></span><span class="line"><span class="ln">27</span><span class="cl"><span class="na">innodb_read_io_threads</span> <span class="o">=</span> <span class="s">8</span>
</span></span><span class="line"><span class="ln">28</span><span class="cl"><span class="na">innodb_write_io_threads</span> <span class="o">=</span> <span class="s">8</span>
</span></span><span class="line"><span class="ln">29</span><span class="cl">
</span></span><span class="line"><span class="ln">30</span><span class="cl"><span class="c1"># Extras</span>
</span></span><span class="line"><span class="ln">31</span><span class="cl"><span class="na">innodb_file_per_table</span> <span class="o">=</span> <span class="s">1</span>
</span></span><span class="line"><span class="ln">32</span><span class="cl"><span class="na">innodb_strict_mode</span> <span class="o">=</span> <span class="s">1</span></span></span></code></pre></div><p>Across server specs, <code>buffer_pool_size</code> and <code>io_capacity</code> scale with the hardware; the other knobs change little.</p>
<h2 id="跟其他模組整合">Integration with other modules</h2>
<h3 id="跟-replication-topology">With replication topology</h3>
<p><code>sync_binlog=1</code> + <code>innodb_flush_log_at_trx_commit=1</code> is the <em>durability baseline</em>, and it determines the <em>primary durability</em> that <a href="/blog/backend/01-database/vendors/mysql/replication-topology/" data-link-title="MySQL Replication Topology：async / semi-sync / GTID 不是三選一、是三個 trade-off 軸的疊加" data-link-desc="MySQL replication 不是「選 async 還是 semi-sync」、是 *durability / latency / consistency* 三個 trade-off 軸的疊加；GTID 是跨 mode 的 infrastructure layer、不是第三種 mode。本文走 3 軸取捨模型 → async / semi-sync 行為對比 → GTID 替代 binlog-position 的好處 → 配置 step-by-step → 5 production 踩雷（lag 暴衝 / semi-sync 退回 async / GTID gap / Loss-Less semi-sync 真的 loss-less / chained replication 雪崩）→ 跟 Aurora MySQL / Vitess / ProxySQL / Orchestrator 整合">Replication Topology</a> builds on. Semi-sync adds <em>cross-server durability</em> on top of that baseline.</p>
<h3 id="跟-proxysql">With ProxySQL</h3>
<p>ProxySQL connection pooling lowers <em>MySQL connection overhead</em>, but <em>each connection</em> still consumes 8-10 MB of RAM (thread stack + session buffers). With the buffer pool at 75% of RAM, the remaining 25% has to cover connections, temporary buffers, and the OS. Too many connections squeeze the buffer pool.</p>
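<p>The memory budget described above can be checked with simple arithmetic. The sketch below is illustrative only: <code>max_connections_budget</code> is a hypothetical helper, the 10 MiB per-connection figure is the upper end of the range above, and the 8 GiB reserve for the OS and other MySQL structures is an assumed number, not a recommendation.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">GIB, MIB = 1024 ** 3, 1024 ** 2


def max_connections_budget(ram_bytes, buffer_pool_bytes, per_conn_bytes, reserved_bytes):
    """Connections that fit after the buffer pool and a fixed reserve."""
    headroom = ram_bytes - buffer_pool_bytes - reserved_bytes
    # Never report a negative budget when the buffer pool is oversized.
    return max(0, headroom // per_conn_bytes)


# 64 GiB host, 48 GiB buffer pool, 10 MiB per connection, 8 GiB reserved.
print(max_connections_budget(64 * GIB, 48 * GIB, 10 * MIB, 8 * GIB))  # 819
</code></pre></div>
<p>If the pool's configured connection ceiling is far above this kind of estimate, either the ceiling or the buffer pool size probably needs revisiting.</p>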
<p>See <a href="/blog/backend/01-database/vendors/mysql/proxysql-config/" data-link-title="MySQL ProxySQL 配置：connection / query / route / response 四段 lifecycle 跟 query rule 設計" data-link-desc="ProxySQL 是 MySQL 生態的 connection pool &#43; query routing 標準。本文走 connection → query parse → route → response 四段 lifecycle、query rule engine 的 rule chain 設計、Hostgroup / Server / User 三層 schema、配置 step-by-step（讀寫分離 &#43; replica lag-aware routing）、5 production 踩雷（query rule 順序錯亂 / connection 漂移 / write 路由到 replica / runtime / disk schema drift / mirror traffic 副作用）、跟 Replication / Orchestrator / HAProxy 整合">ProxySQL configuration</a>.</p>
<h3 id="跟-aurora-mysql">With Aurora MySQL</h3>
<p>Aurora rewrites the InnoDB storage layer, so most of the knobs above are <em>managed by Aurora</em>:</p>
<ul>
<li>Buffer pool size: allocated automatically per Aurora compute instance</li>
<li>Redo log: Aurora uses its own distributed log, so <code>innodb_log_file_size</code> does not apply</li>
<li><code>sync_binlog</code> / <code>innodb_flush_log_at_trx_commit</code>: the Aurora storage layer guarantees durability, so these application-level knobs have little effect</li>
</ul>
<p>Aurora users can still tune <code>innodb_buffer_pool_size</code> and similar settings, but the operational question shifts from InnoDB internals to <em>choosing an Aurora instance class</em>. See the <a href="/blog/backend/01-database/vendors/aurora/" data-link-title="AWS Aurora" data-link-desc="AWS managed PostgreSQL / MySQL、storage / compute 分離、&#43;75% 效能改善的 production 證據">Aurora vendor page</a>.</p>
<h3 id="跟-osc-tool">With OSC tools</h3>
<p>InnoDB tuning does not directly change how OSC tools behave, but when the <em>log file size is too small</em>, gh-ost / pt-osc writes to the ghost table easily trigger a checkpoint storm and slow down the whole schema migration. See <a href="/blog/backend/01-database/vendors/mysql/online-schema-change-tools/" data-link-title="MySQL Online Schema Change：gh-ost 跟 pt-online-schema-change 兩條完全不同的 ghost table 路徑" data-link-desc="MySQL ALTER TABLE 可能鎖整張表，production 需要 online schema change 流程。gh-ost（GitHub）跟 pt-online-schema-change（Percona）都用 ghost table 解決、但底層機制完全不同：pt-osc 用 trigger 同步、gh-ost 用 binlog stream 同步。本文走兩工具機制對照表 → trigger vs binlog 各自取捨 → 配置 step-by-step → 5 production 踩雷（trigger overhead / binlog 延遲 / FK constraint / hot trigger lock / 切換瞬間 deadlock）→ 何時用哪一個">Online Schema Change Tools</a>.</p>
<h2 id="觀測-metric">Metrics to observe</h2>
<p><code>SHOW STATUS LIKE</code> + Performance Schema expose:</p>
<ul>
<li><code>Innodb_buffer_pool_read_requests</code> / <code>_reads</code> → cache hit ratio = <code>1 - reads/read_requests</code>, should be &gt; 99%</li>
<li><code>Innodb_log_waits</code> → waits for log buffer space, used here as a proxy for redo / checkpoint pressure; should be 0</li>
<li><code>Innodb_log_write_requests</code> / <code>_writes</code> → log buffer efficiency</li>
<li><code>Innodb_rows_inserted</code> / <code>_updated</code> / <code>_read</code> → workload shape</li>
<li><code>Innodb_row_lock_waits</code> / <code>_time</code> → lock contention</li>
</ul>
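<p>The hit ratio formula in the first bullet can be computed directly from the two counters. The sketch below assumes the counter values have already been read from <code>SHOW STATUS</code>; both helper names are hypothetical. Because the counters are cumulative since server start, the second helper computes the ratio over a sampling interval, which tends to reflect current behavior better than the lifetime figure.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def buffer_pool_hit_ratio(read_requests, reads):
    """Return 1 - reads/read_requests, or None before any logical read."""
    if read_requests == 0:
        return None
    return 1 - reads / read_requests


def interval_hit_ratio(before, after):
    """Hit ratio between two samples, each a (read_requests, reads) tuple."""
    return buffer_pool_hit_ratio(after[0] - before[0], after[1] - before[1])


print(f"{buffer_pool_hit_ratio(1_000_000, 5_000):.2%}")  # 99.50%
</code></pre></div>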
<p>Feed these into <a href="/blog/backend/04-observability/vendors/datadog/" data-link-title="Datadog" data-link-desc="All-in-one SaaS 觀測平台、APM / Logs / Metrics / RUM / Security">Datadog</a> / <a href="/blog/backend/04-observability/vendors/prometheus/" data-link-title="Prometheus" data-link-desc="Pull-based metrics 主流 OSS、PromQL 與 alerting">Prometheus</a> through <a href="https://github.com/prometheus/mysqld_exporter">mysqld_exporter</a> / <a href="https://www.percona.com/software/database-tools/percona-monitoring-and-management">Percona Monitoring</a> to track trends over time.</p>
<h2 id="相關連結">Related links</h2>
<ul>
<li><a href="/blog/backend/01-database/vendors/mysql/" data-link-title="MySQL" data-link-desc="高併發網路服務常用關聯式資料庫、Vitess / PlanetScale 分片生態、GitHub / Shopify / Facebook 規模驗證">MySQL vendor overview</a></li>
<li><a href="/blog/backend/01-database/vendors/mysql/replication-topology/" data-link-title="MySQL Replication Topology：async / semi-sync / GTID 不是三選一、是三個 trade-off 軸的疊加" data-link-desc="MySQL replication 不是「選 async 還是 semi-sync」、是 *durability / latency / consistency* 三個 trade-off 軸的疊加；GTID 是跨 mode 的 infrastructure layer、不是第三種 mode。本文走 3 軸取捨模型 → async / semi-sync 行為對比 → GTID 替代 binlog-position 的好處 → 配置 step-by-step → 5 production 踩雷（lag 暴衝 / semi-sync 退回 async / GTID gap / Loss-Less semi-sync 真的 loss-less / chained replication 雪崩）→ 跟 Aurora MySQL / Vitess / ProxySQL / Orchestrator 整合">MySQL Replication Topology</a> (how <code>sync_binlog</code> interacts with replication)</li>
<li><a href="/blog/backend/01-database/vendors/mysql/proxysql-config/" data-link-title="MySQL ProxySQL 配置：connection / query / route / response 四段 lifecycle 跟 query rule 設計" data-link-desc="ProxySQL 是 MySQL 生態的 connection pool &#43; query routing 標準。本文走 connection → query parse → route → response 四段 lifecycle、query rule engine 的 rule chain 設計、Hostgroup / Server / User 三層 schema、配置 step-by-step（讀寫分離 &#43; replica lag-aware routing）、5 production 踩雷（query rule 順序錯亂 / connection 漂移 / write 路由到 replica / runtime / disk schema drift / mirror traffic 副作用）、跟 Replication / Orchestrator / HAProxy 整合">MySQL ProxySQL configuration</a> (connections compete with the buffer pool for RAM)</li>
<li><a href="/blog/backend/01-database/vendors/aurora/" data-link-title="AWS Aurora" data-link-desc="AWS managed PostgreSQL / MySQL、storage / compute 分離、&#43;75% 效能改善的 production 證據">Aurora vendor page</a> (managed MySQL, part of InnoDB tuning is handed off)</li>
<li><a href="/blog/backend/01-database/vendors/postgresql/autovacuum-tuning/" data-link-title="PostgreSQL autovacuum tuning：為什麼你的 autovacuum 永遠追不上 bloat" data-link-desc="MVCC 怎麼產生 dead tuple、autovacuum cost-based throttle 為什麼預設保守、per-table tuning 怎麼設、5 個 production 踩雷（cost_limit 太低 / 長 transaction blocks vacuum / anti-wraparound 在 peak / partition vacuum 滿 worker / index bloat 沒處理）、跟 partitioning &#43; monitoring 整合">PostgreSQL Autovacuum Tuning</a> (the PG sibling: internal tuning for a different engine)</li>
<li>Official: <a href="https://dev.mysql.com/doc/refman/8.0/en/innodb-default-se.html">InnoDB Configuration</a> / <a href="https://www.percona.com/blog/mysql-101-tuning-mysql-after-installation/">Percona Tuning Guide</a></li>
</ul>
]]></content:encoded></item></channel></rss>