<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>k6 on Tarragon</title><link>https://tarrragon.github.io/blog/backend/06-reliability/vendors/k6/</link><description>Recent content in k6 on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Fri, 01 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/backend/06-reliability/vendors/k6/index.xml" rel="self" type="application/rss+xml"/><item><title>k6: Threshold CI Gate and Scenario Design</title><link>https://tarrragon.github.io/blog/backend/06-reliability/vendors/k6/threshold-ci-gate-and-scenario-design/</link><pubDate>Tue, 23 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/06-reliability/vendors/k6/threshold-ci-gate-and-scenario-design/</guid><description>&lt;h2 id="問題情境">Problem context&lt;/h2>
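&lt;p>A load test produces a large volume of metrics, but a CI pipeline needs a pass/fail signal. Without thresholds to turn metrics into a verdict, performance regressions are found only when someone looks at a dashboard, and by then they have usually accumulated over several releases.&lt;/p>
&lt;p>On the other hand, a threshold verdict is only as good as the realism of the workload model. When results from &lt;code>--vus 10 --duration 30s&lt;/code> differ too much from the structure of production traffic, passing the thresholds does not show that production is safe.&lt;/p>
&lt;p>This article covers two questions: how to set thresholds so the CI gate is reliable, and how to design scenarios so the workload resembles real traffic.&lt;/p>
&lt;h2 id="threshold-設計">Threshold design&lt;/h2>
&lt;p>Thresholds turn load test metrics into a pass/fail signal for CI. k6 exits with code 0 when every threshold passes and with a non-zero code when any threshold fails, so the CI pipeline can decide on the exit code alone.&lt;/p>
&lt;h3 id="多指標-threshold">Multi-metric thresholds&lt;/h3>
&lt;p>A threshold on a single metric misses risks. Normal latency with a high error rate means the system is dropping requests; normal throughput with high latency means queues are starting to build. A complete threshold set covers at least three dimensions:&lt;/p>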





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-javascript" data-lang="javascript">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="kr">export&lt;/span> &lt;span class="kr">const&lt;/span> &lt;span class="nx">options&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl"> &lt;span class="nx">thresholds&lt;/span>&lt;span class="o">:&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl"> &lt;span class="nx">http_req_duration&lt;/span>&lt;span class="o">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;p(95)&amp;lt;500&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;p(99)&amp;lt;1000&amp;#39;&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl"> &lt;span class="nx">http_req_failed&lt;/span>&lt;span class="o">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;rate&amp;lt;0.01&amp;#39;&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl"> &lt;span class="nx">http_reqs&lt;/span>&lt;span class="o">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;rate&amp;gt;100&amp;#39;&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl"> &lt;span class="p">},&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">&lt;span class="p">};&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Latency thresholds use percentiles rather than the average: in an average, the many fast requests dilute the long tail, while p95/p99 stay closer to the worst experience users actually notice.&lt;/p>
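&lt;h3 id="門檻來源">Where threshold values come from&lt;/h3>
&lt;p>Threshold values start from the production baseline. Take the p95/p99 latency and error rate of the last 7 to 30 days from the observability system (Grafana / Datadog), then add the acceptable degradation margin (typically 10-20%) to get the threshold. Values that are too tight let CI environment noise trigger false positives; values that are too loose let real regressions slip through.&lt;/p>
&lt;p>Calibration cadence: realign with the production baseline every month or after each major architecture change, so the thresholds do not drift away from the real system.&lt;/p>
&lt;h3 id="path-level-threshold">Path-level thresholds&lt;/h3>
&lt;p>Different API paths have different performance profiles. The checkout path may tolerate far less latency than the listing path. k6 tags every request made inside a group with that group's name, so thresholds can be set per path:&lt;/p>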





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-javascript" data-lang="javascript">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="kr">import&lt;/span> &lt;span class="p">{&lt;/span> &lt;span class="nx">group&lt;/span> &lt;span class="p">}&lt;/span> &lt;span class="nx">from&lt;/span> &lt;span class="s1">&amp;#39;k6&amp;#39;&lt;/span>&lt;span class="p">;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="kr">export&lt;/span> &lt;span class="k">default&lt;/span> &lt;span class="kd">function&lt;/span> &lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"> &lt;span class="nx">group&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s1">&amp;#39;checkout&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="kd">function&lt;/span> &lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl"> &lt;span class="c1">// checkout requests
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span> &lt;span class="p">});&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl"> &lt;span class="nx">group&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s1">&amp;#39;listing&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="kd">function&lt;/span> &lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl"> &lt;span class="c1">// listing requests
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span> &lt;span class="p">});&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl">&lt;span class="kr">export&lt;/span> &lt;span class="kr">const&lt;/span> &lt;span class="nx">options&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl"> &lt;span class="nx">thresholds&lt;/span>&lt;span class="o">:&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl"> &lt;span class="s1">&amp;#39;http_req_duration{group:::checkout}&amp;#39;&lt;/span>&lt;span class="o">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;p(95)&amp;lt;300&amp;#39;&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl"> &lt;span class="s1">&amp;#39;http_req_duration{group:::listing}&amp;#39;&lt;/span>&lt;span class="o">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;p(95)&amp;lt;800&amp;#39;&lt;/span>&lt;span class="p">],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl"> &lt;span class="p">},&lt;/span>
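&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl">&lt;span class="p">};&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Path-level thresholds refine the gate's granularity from overall performance to the performance of critical paths.&lt;/p>
&lt;h2 id="scenario-設計">Scenario design&lt;/h2>
&lt;p>Scenarios make the load test's traffic structure resemble production. k6 ships several scenario executors; the five below are the most relevant here (the iteration-count executors shared-iterations and per-vu-iterations are left out), and the choice depends on which variable you want to control.&lt;/p>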
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Executor&lt;/th>
 &lt;th>Controlled variable&lt;/th>
 &lt;th>Typical use&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>constant-vus&lt;/td>
 &lt;td>Concurrent virtual users&lt;/td>
 &lt;td>Simple smoke test&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>ramping-vus&lt;/td>
 &lt;td>Concurrent virtual users, changing by stage&lt;/td>
 &lt;td>Stepped ramp-up to find saturation&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>constant-arrival-rate&lt;/td>
 &lt;td>Fixed iteration rate (equals RPS at one request per iteration)&lt;/td>
 &lt;td>CI regression (stable input)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>ramping-arrival-rate&lt;/td>
 &lt;td>Iteration rate, changing by stage&lt;/td>
 &lt;td>Simulating production peak/off-peak&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>externally-controlled&lt;/td>
 &lt;td>VU count, changed at runtime through k6's REST API or CLI&lt;/td>
 &lt;td>Load steered by an external controller, for example alongside production traffic replay&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
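&lt;h3 id="executor-選擇判準">Choosing an executor&lt;/h3>
&lt;p>constant-vus is the simplest, but its throughput moves with response time: when the server slows down, RPS drops automatically and hides the real pressure. constant-arrival-rate keeps the arrival rate steady, which gives thresholds a consistent basis for comparison, but it needs enough preAllocatedVUs; when no VU is free, k6 skips the iteration (counted in the dropped_iterations metric) and the actual rate falls below the target.&lt;/p></description><content:encoded><![CDATA[<h2 id="問題情境">Problem context</h2>
<p>A load test produces a large volume of metrics, but a CI pipeline needs a pass/fail signal. Without thresholds to turn metrics into a verdict, performance regressions are found only when someone looks at a dashboard, and by then they have usually accumulated over several releases.</p>
<p>On the other hand, a threshold verdict is only as good as the realism of the workload model. When results from <code>--vus 10 --duration 30s</code> differ too much from the structure of production traffic, passing the thresholds does not show that production is safe.</p>
<p>This article covers two questions: how to set thresholds so the CI gate is reliable, and how to design scenarios so the workload resembles real traffic.</p>
<h2 id="threshold-設計">Threshold design</h2>
<p>Thresholds turn load test metrics into a pass/fail signal for CI. k6 exits with code 0 when every threshold passes and with a non-zero code when any threshold fails, so the CI pipeline can decide on the exit code alone.</p>
<h3 id="多指標-threshold">Multi-metric thresholds</h3>
<p>A threshold on a single metric misses risks. Normal latency with a high error rate means the system is dropping requests; normal throughput with high latency means queues are starting to build. A complete threshold set covers at least three dimensions:</p>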





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="ln">1</span><span class="cl"><span class="kr">export</span> <span class="kr">const</span> <span class="nx">options</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">  <span class="nx">thresholds</span><span class="o">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">    <span class="nx">http_req_duration</span><span class="o">:</span> <span class="p">[</span><span class="s1">&#39;p(95)&lt;500&#39;</span><span class="p">,</span> <span class="s1">&#39;p(99)&lt;1000&#39;</span><span class="p">],</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">    <span class="nx">http_req_failed</span><span class="o">:</span>   <span class="p">[</span><span class="s1">&#39;rate&lt;0.01&#39;</span><span class="p">],</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">    <span class="nx">http_reqs</span><span class="o">:</span>         <span class="p">[</span><span class="s1">&#39;rate&gt;100&#39;</span><span class="p">],</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="p">};</span></span></span></code></pre></div><p>Latency thresholds use percentiles rather than the average: in an average, the many fast requests dilute the long tail, while p95/p99 stay closer to the worst experience users actually notice.</p>
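<p>To make the dilution concrete, here is a minimal sketch in plain JavaScript (it runs outside k6). The sample data is invented for illustration, and <code>percentile</code> uses the simple nearest-rank method, which may differ slightly from how k6 computes percentiles internally.</p>

```javascript
// Nearest-rank percentile over a list of latency samples (ms).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function mean(samples) {
  return samples.reduce((sum, value) => sum + value, 0) / samples.length;
}

// Hypothetical run: 90 fast requests at 100 ms, 10 slow ones at 2000 ms.
const latencies = [...Array(90).fill(100), ...Array(10).fill(2000)];

const avgMs = mean(latencies);           // the 90 fast requests pull this down
const p95Ms = percentile(latencies, 95); // lands inside the slow tail

console.log({ avgMs, p95Ms });
```

<p>With this data an <code>avg</code> limit of 500 ms would pass even though one request in ten takes two seconds; a <code>p(95)</code> limit of 500 ms would fail.</p>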
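<h3 id="門檻來源">Where threshold values come from</h3>
<p>Threshold values start from the production baseline. Take the p95/p99 latency and error rate of the last 7 to 30 days from the observability system (Grafana / Datadog), then add the acceptable degradation margin (typically 10-20%) to get the threshold. Values that are too tight let CI environment noise trigger false positives; values that are too loose let real regressions slip through.</p>
<p>Calibration cadence: realign with the production baseline every month or after each major architecture change, so the thresholds do not drift away from the real system.</p>
<h3 id="path-level-threshold">Path-level thresholds</h3>
<p>Different API paths have different performance profiles. The checkout path may tolerate far less latency than the listing path. k6 tags every request made inside a group with that group's name, so thresholds can be set per path:</p>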
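<p>The baseline-plus-margin rule can be sketched as a small helper. The function name <code>thresholdsFromBaseline</code>, the shape of the <code>baseline</code> object, and the numbers are assumptions for illustration; only the threshold expression syntax is k6's.</p>

```javascript
// baseline: values read from the observability system (hypothetical shape).
// marginPct: acceptable degradation in percent, e.g. 15 for 15%.
function thresholdsFromBaseline(baseline, marginPct) {
  const withMargin = (value) => (value * (100 + marginPct)) / 100;
  // Latency limits are rounded up to whole milliseconds.
  const latencyLimit = (value) => Math.ceil(withMargin(value));
  // Error-rate limits keep three significant digits.
  const rateLimit = (value) => Number(withMargin(value).toPrecision(3));
  return {
    http_req_duration: [
      `p(95)<${latencyLimit(baseline.p95Ms)}`,
      `p(99)<${latencyLimit(baseline.p99Ms)}`,
    ],
    http_req_failed: [`rate<${rateLimit(baseline.errorRate)}`],
  };
}

// Hypothetical 30-day production baseline with a 15% margin.
const thresholds = thresholdsFromBaseline(
  { p95Ms: 400, p99Ms: 800, errorRate: 0.005 },
  15
);

console.log(thresholds);
```

<p>In a k6 script the result would be assigned to <code>options.thresholds</code>. Keeping the baseline in one place makes the monthly recalibration a change to one input.</p>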





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kr">import</span> <span class="p">{</span> <span class="nx">group</span> <span class="p">}</span> <span class="nx">from</span> <span class="s1">&#39;k6&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="kr">export</span> <span class="k">default</span> <span class="kd">function</span> <span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">  <span class="nx">group</span><span class="p">(</span><span class="s1">&#39;checkout&#39;</span><span class="p">,</span> <span class="kd">function</span> <span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="c1">// checkout requests
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="c1"></span>  <span class="p">});</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">  <span class="nx">group</span><span class="p">(</span><span class="s1">&#39;listing&#39;</span><span class="p">,</span> <span class="kd">function</span> <span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">    <span class="c1">// listing requests
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="c1"></span>  <span class="p">});</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="kr">export</span> <span class="kr">const</span> <span class="nx">options</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">  <span class="nx">thresholds</span><span class="o">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">    <span class="s1">&#39;http_req_duration{group:::checkout}&#39;</span><span class="o">:</span> <span class="p">[</span><span class="s1">&#39;p(95)&lt;300&#39;</span><span class="p">],</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">    <span class="s1">&#39;http_req_duration{group:::listing}&#39;</span><span class="o">:</span>  <span class="p">[</span><span class="s1">&#39;p(95)&lt;800&#39;</span><span class="p">],</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="p">};</span></span></span></code></pre></div><p>Path-level thresholds refine the gate's granularity from overall performance to the performance of critical paths.</p>
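<p>When the list of critical paths grows, the per-group keys can be generated from one budget table instead of being typed by hand. This is a sketch in plain JavaScript; <code>pathThresholds</code> and the budget values are hypothetical.</p>

```javascript
// Build per-group latency thresholds from a map of group name -> p95 budget (ms).
function pathThresholds(budgets) {
  const thresholds = {};
  for (const [groupName, p95Ms] of Object.entries(budgets)) {
    // k6 prefixes group tag values with "::", hence the three colons in the key.
    thresholds[`http_req_duration{group:::${groupName}}`] = [`p(95)<${p95Ms}`];
  }
  return thresholds;
}

// Hypothetical budgets for two critical paths.
const perPath = pathThresholds({ checkout: 300, listing: 800 });

console.log(perPath);
```

<p>The output has the same keys as the hand-written example above, so it can be spread into <code>options.thresholds</code> next to the global limits.</p>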
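<h2 id="scenario-設計">Scenario design</h2>
<p>Scenarios make the load test's traffic structure resemble production. k6 ships several scenario executors; the five below are the most relevant here (the iteration-count executors shared-iterations and per-vu-iterations are left out), and the choice depends on which variable you want to control.</p>
<table>
  <thead>
      <tr>
          <th>Executor</th>
          <th>Controlled variable</th>
          <th>Typical use</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>constant-vus</td>
          <td>Concurrent virtual users</td>
          <td>Simple smoke test</td>
      </tr>
      <tr>
          <td>ramping-vus</td>
          <td>Concurrent virtual users, changing by stage</td>
          <td>Stepped ramp-up to find saturation</td>
      </tr>
      <tr>
          <td>constant-arrival-rate</td>
          <td>Fixed iteration rate (equals RPS at one request per iteration)</td>
          <td>CI regression (stable input)</td>
      </tr>
      <tr>
          <td>ramping-arrival-rate</td>
          <td>Iteration rate, changing by stage</td>
          <td>Simulating production peak/off-peak</td>
      </tr>
      <tr>
          <td>externally-controlled</td>
          <td>VU count, changed at runtime through k6's REST API or CLI</td>
          <td>Load steered by an external controller, for example alongside production traffic replay</td>
      </tr>
  </tbody>
</table>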
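<h3 id="executor-選擇判準">Choosing an executor</h3>
<p>constant-vus is the simplest, but its throughput moves with response time: when the server slows down, RPS drops automatically and hides the real pressure. constant-arrival-rate keeps the arrival rate steady, which gives thresholds a consistent basis for comparison, but it needs enough preAllocatedVUs; when no VU is free, k6 skips the iteration (counted in the dropped_iterations metric) and the actual rate falls below the target.</p>
<p>For CI regression tests, constant-arrival-rate is the recommended executor: with a fixed input, outputs are comparable, and only then do differences between versions mean something.</p>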
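<p>A rough way to size preAllocatedVUs is Little's law: the number of busy VUs is about the arrival rate multiplied by the time one iteration takes. The sketch below is plain JavaScript; the function name and the 1.5 headroom factor are assumptions, not k6 defaults.</p>

```javascript
// Estimate VUs needed to sustain an arrival rate.
// ratePerSecond: target iterations per second.
// iterationSeconds: expected duration of one iteration under load.
// headroom: safety factor for slow responses (assumed default 1.5).
function estimatePreAllocatedVUs(ratePerSecond, iterationSeconds, headroom = 1.5) {
  return Math.ceil(ratePerSecond * iterationSeconds * headroom);
}

// 200 iterations/s with iterations of about 0.5 s each.
const vusAtHalfSecond = estimatePreAllocatedVUs(200, 0.5);
// The same rate when each iteration takes about 1 s.
const vusAtOneSecond = estimatePreAllocatedVUs(200, 1);

console.log({ vusAtHalfSecond, vusAtOneSecond });
```

<p>The estimate rises when the server slows down, which is exactly when an undersized pool starts dropping iterations, so it is worth sizing against the slowest iteration time you expect the gate to tolerate.</p>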
<h3 id="production-traffic-shape-對齊">Aligning with the production traffic shape</h3>
<p>Use ramping-arrival-rate to reproduce the shape of production traffic:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kr">export</span> <span class="kr">const</span> <span class="nx">options</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">  <span class="nx">scenarios</span><span class="o">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">    <span class="nx">peak_simulation</span><span class="o">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">      <span class="nx">executor</span><span class="o">:</span> <span class="s1">&#39;ramping-arrival-rate&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">      <span class="nx">startRate</span><span class="o">:</span> <span class="mi">50</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">      <span class="nx">stages</span><span class="o">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">        <span class="p">{</span> <span class="nx">target</span><span class="o">:</span> <span class="mi">200</span><span class="p">,</span> <span class="nx">duration</span><span class="o">:</span> <span class="s1">&#39;2m&#39;</span> <span class="p">},</span>  <span class="c1">// ramp up
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="c1"></span>        <span class="p">{</span> <span class="nx">target</span><span class="o">:</span> <span class="mi">200</span><span class="p">,</span> <span class="nx">duration</span><span class="o">:</span> <span class="s1">&#39;5m&#39;</span> <span class="p">},</span>  <span class="c1">// sustain peak
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="c1"></span>        <span class="p">{</span> <span class="nx">target</span><span class="o">:</span> <span class="mi">50</span><span class="p">,</span>  <span class="nx">duration</span><span class="o">:</span> <span class="s1">&#39;1m&#39;</span> <span class="p">},</span>  <span class="c1">// ramp down
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="c1"></span>      <span class="p">],</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">      <span class="nx">preAllocatedVUs</span><span class="o">:</span> <span class="mi">300</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="p">};</span></span></span></code></pre></div><p>The shape parameters (startRate / target / duration) are derived from the peak window in the production access log. Shopify's BFCM preparation aligns the game day load test scenario with the actual peak shape: a short burst combined with a high write ratio needs a scenario designed specifically to cover it.</p>
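<p>One way to derive those parameters is to reduce the access log to a few (minute, RPS) turning points and convert them into stages. The sketch below assumes that reduction has already been done; <code>shapeFromProfile</code> and the input format are hypothetical.</p>

```javascript
// points: turning points of the traffic curve, ordered by time,
// each { minute, rps } (hypothetical format extracted from an access log).
function shapeFromProfile(points) {
  const stages = [];
  for (let i = 1; i < points.length; i += 1) {
    const minutes = points[i].minute - points[i - 1].minute;
    stages.push({ target: points[i].rps, duration: `${minutes}m` });
  }
  return { startRate: points[0].rps, stages };
}

// Ramp from 50 to 200 RPS in 2 minutes, hold 5 minutes, ramp down in 1 minute.
const shape = shapeFromProfile([
  { minute: 0, rps: 50 },
  { minute: 2, rps: 200 },
  { minute: 7, rps: 200 },
  { minute: 8, rps: 50 },
]);

console.log(JSON.stringify(shape));
```

<p>The result carries the <code>startRate</code> and <code>stages</code> fields of a ramping-arrival-rate scenario; <code>executor</code> and <code>preAllocatedVUs</code> still have to be added.</p>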
<h3 id="cohort-模擬">Cohort simulation</h3>
<p>Production traffic is not a single type. Run several scenarios in parallel to simulate different cohorts:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kr">export</span> <span class="kr">const</span> <span class="nx">options</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">  <span class="nx">scenarios</span><span class="o">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">    <span class="nx">read_traffic</span><span class="o">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">      <span class="nx">executor</span><span class="o">:</span> <span class="s1">&#39;constant-arrival-rate&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">      <span class="nx">rate</span><span class="o">:</span> <span class="mi">150</span><span class="p">,</span> <span class="nx">exec</span><span class="o">:</span> <span class="s1">&#39;readFlow&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">      <span class="nx">preAllocatedVUs</span><span class="o">:</span> <span class="mi">200</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">      <span class="nx">duration</span><span class="o">:</span> <span class="s1">&#39;5m&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">    <span class="nx">write_traffic</span><span class="o">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">      <span class="nx">executor</span><span class="o">:</span> <span class="s1">&#39;constant-arrival-rate&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">      <span class="nx">rate</span><span class="o">:</span> <span class="mi">30</span><span class="p">,</span> <span class="nx">exec</span><span class="o">:</span> <span class="s1">&#39;writeFlow&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">      <span class="nx">preAllocatedVUs</span><span class="o">:</span> <span class="mi">50</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">      <span class="nx">duration</span><span class="o">:</span> <span class="s1">&#39;5m&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">
</span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="kr">export</span> <span class="kd">function</span> <span class="nx">readFlow</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* GET requests */</span> <span class="p">}</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="kr">export</span> <span class="kd">function</span> <span class="nx">writeFlow</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* POST requests */</span> <span class="p">}</span></span></span></code></pre></div><p>The read/write ratio is derived from production access logs or APM data. A skewed ratio misplaces the bottleneck: a read-heavy model will not surface the lock contention caused by writes.</p>
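<p>The per-cohort rates can be computed from request counts instead of being hard-coded. In this sketch the counts, the cohort descriptor fields, and the function name <code>cohortScenarios</code> are assumptions; the scenario fields it emits are the ones used in the example above plus <code>timeUnit</code>.</p>

```javascript
// Split a total arrival rate across cohorts in proportion to their
// request counts over some production window (hypothetical input).
function cohortScenarios(totalRate, cohorts, duration) {
  const totalRequests = cohorts.reduce((sum, c) => sum + c.requests, 0);
  const scenarios = {};
  for (const cohort of cohorts) {
    scenarios[cohort.name] = {
      executor: 'constant-arrival-rate',
      rate: Math.round((totalRate * cohort.requests) / totalRequests),
      timeUnit: '1s',
      duration,
      exec: cohort.exec,
      preAllocatedVUs: cohort.preAllocatedVUs,
    };
  }
  return scenarios;
}

// Hypothetical counts: 150,000 reads and 30,000 writes in the sampled window.
const scenarios = cohortScenarios(
  180,
  [
    { name: 'read_traffic', exec: 'readFlow', requests: 150000, preAllocatedVUs: 200 },
    { name: 'write_traffic', exec: 'writeFlow', requests: 30000, preAllocatedVUs: 50 },
  ],
  '5m'
);

console.log(scenarios);
```

<p>Recomputing the rates from fresh counts keeps the read/write mix tied to production data whenever the scenario is recalibrated.</p>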
<h3 id="資料驅動">Data-driven tests</h3>
<p>Load test data through SharedArray so that each VU does not hold its own copy and waste memory:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="ln">1</span><span class="cl"><span class="kr">import</span> <span class="p">{</span> <span class="nx">SharedArray</span> <span class="p">}</span> <span class="nx">from</span> <span class="s1">&#39;k6/data&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="kr">const</span> <span class="nx">users</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">SharedArray</span><span class="p">(</span><span class="s1">&#39;users&#39;</span><span class="p">,</span> <span class="kd">function</span> <span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">  <span class="k">return</span> <span class="nx">JSON</span><span class="p">.</span><span class="nx">parse</span><span class="p">(</span><span class="nx">open</span><span class="p">(</span><span class="s1">&#39;./users.json&#39;</span><span class="p">));</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="p">});</span></span></span></code></pre></div><p>The data can come from a production sample (after anonymization) or from synthetic generation. Its distribution needs to be close to production: ID ranges, key distribution, and payload sizes all affect query plans and cache behavior.</p>
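<h2 id="ci-整合實務">CI integration in practice</h2>
<h3 id="fast-path每次-push">Fast path (every push)</h3>
<p>A fixed scenario with a short duration (30s to 2min), using constant-arrival-rate to detect regressions. Thresholds sit at the production baseline + 10%. This tier exists to catch obvious regressions quickly; it does not need to simulate the full peak.</p>
<h3 id="slow-pathmerge-gate">Slow path (merge gate)</h3>
<p>The full scenario with a longer duration (5-15min), including multiple cohorts and ramping. Thresholds cover path-level metrics. This tier exists to verify in depth how a change behaves under near-real pressure.</p>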
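<p>Both tiers can live in one script and be selected by an environment variable. The sketch below is plain JavaScript; the profile names, rates, and the <code>PROFILE</code> variable are assumptions. In a k6 script the last step would be <code>export const options = optionsFor(__ENV.PROFILE || 'fast');</code>, with the value passed as <code>k6 run -e PROFILE=slow script.js</code>.</p>

```javascript
// Hypothetical fast and slow profiles for the two CI tiers.
const profiles = {
  fast: {
    executor: 'constant-arrival-rate',
    rate: 100,
    timeUnit: '1s',
    duration: '1m',
    preAllocatedVUs: 150,
  },
  slow: {
    executor: 'ramping-arrival-rate',
    startRate: 50,
    timeUnit: '1s',
    stages: [
      { target: 200, duration: '2m' },
      { target: 200, duration: '5m' },
      { target: 50, duration: '1m' },
    ],
    preAllocatedVUs: 300,
  },
};

// Return k6 options for the named profile; fail loudly on a typo so the
// gate never silently runs the wrong tier.
function optionsFor(profileName) {
  const scenario = profiles[profileName];
  if (!scenario) {
    throw new Error(`unknown profile: ${profileName}`);
  }
  return { scenarios: { [profileName]: scenario } };
}

console.log(optionsFor('fast'));
```

<p>Keeping both profiles in one file means the request logic is shared and only the load shape differs between the push check and the merge gate.</p>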
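<h3 id="結果留存">Keeping results</h3>
<p>By default k6 prints its end-of-test summary to stdout. In CI, use the <code>--out</code> flag to send results to a time-series backend (InfluxDB / Prometheus Remote Write / Grafana Cloud k6) so historical trends can be queried. Comparing trends catches slow drift: results that stay within the threshold but keep getting worse.</p>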
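<p>Slow drift can be checked with a simple trend fit over stored results. This sketch assumes the p95 values of recent runs have already been queried from the time-series backend; the history, the 500 ms limit, and the function name are invented for illustration.</p>

```javascript
// Least-squares slope of values against run index (units per run).
function slopePerRun(values) {
  const n = values.length;
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((sum, v) => sum + v, 0) / n;
  let numerator = 0;
  let denominator = 0;
  values.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) ** 2;
  });
  return numerator / denominator;
}

// Hypothetical p95 latencies (ms) from the last five runs, oldest first.
const p95History = [400, 410, 420, 430, 440];
const limitMs = 500;

const allUnderLimit = p95History.every((v) => v < limitMs);
const driftMsPerRun = slopePerRun(p95History);

console.log({ allUnderLimit, driftMsPerRun });
```

<p>Every run here passes the threshold, yet the fit shows latency growing by a steady amount per run; at that pace the limit is only a few runs away.</p>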
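<p>LinkedIn's automated load testing practice ties load test results to capacity forecasting: the trend of the saturation point over time directly drives scaling decisions.</p>
<h2 id="邊界與陷阱">Limits and pitfalls</h2>
<p><strong>Threshold variance</strong>: differences between CI runs (noisy neighbors on shared runners, network jitter, GC pauses) make the same code produce different results from run to run. Controls: dedicated runners to remove the neighbor effect, warmup iterations whose results are discarded, and taking the median of several runs. If the variance exceeds the degradation margin built into the threshold, the gate's verdict cannot be trusted.</p>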
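<p>The last check can be made explicit by running unchanged code several times and comparing the spread with the margin. The run values, the 10% margin, and the use of range over median as the spread measure are assumptions for this sketch.</p>

```javascript
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Spread of repeated runs on unchanged code, relative to their median.
function relativeSpread(values) {
  return (Math.max(...values) - Math.min(...values)) / median(values);
}

// Hypothetical p95 latencies (ms) from five runs of the same commit.
const runsP95 = [410, 400, 460, 405, 395];
const degradationMargin = 0.1; // threshold = baseline + 10%

const medianP95 = median(runsP95);
const spread = relativeSpread(runsP95);
const gateTrustworthy = spread < degradationMargin;

console.log({ medianP95, spread, gateTrustworthy });
```

<p>Here the runs differ by more than the 10% margin, so a single run crossing the threshold says little; the runner noise needs to be reduced before the gate is enforced.</p>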
<p><strong>Thresholds too loose or too tight</strong>: a threshold that always passes makes the gate meaningless; frequent false positives teach the team to ignore CI results. Either way the gate loses its value. The calibration test: over the last 30 days of threshold results, were all regressions that truly needed attention caught, with a false positive rate below 5%?</p>
<p><strong>Scenario drift from production</strong>: the structure of production traffic changes as the product evolves. Periodically (every month or after each major feature launch) recalibrate the scenario's RPS, cohort ratios, and data distribution against access logs, so the model does not drift further with every run.</p>
<h2 id="整合路由">Where this fits</h2>
<ul>
<li>Upstream concept: workload model design in <a href="/blog/backend/06-reliability/load-testing/" data-link-title="6.2 load test" data-link-desc="把 production 流量結構轉成可重播壓力情境，定位 saturation 轉折與容量邊界">6.2 load testing</a></li>
<li>Downstream capability: baseline management and regression localization in <a href="/blog/backend/06-reliability/performance-regression-gate/" data-link-title="6.13 Performance Regression Gate" data-link-desc="把效能 baseline 從一次性壓測變成持續對齊的 release gate，涵蓋 baseline 設定、判讀方法、variance 控制與退化定位">6.13 performance regression gate</a></li>
<li>Parallel vendors: <a href="/blog/backend/06-reliability/vendors/gatling/" data-link-title="Gatling" data-link-desc="JVM-based load test、Scala / Java / Kotlin DSL、強型別 scenario、HAR-driven recording">Gatling</a>, <a href="/blog/backend/06-reliability/vendors/locust/" data-link-title="Locust" data-link-desc="Python-based load test、distributed、易擴展">Locust</a>, <a href="/blog/backend/06-reliability/vendors/jmeter/" data-link-title="Apache JMeter" data-link-desc="老牌 load test 工具、GUI &#43; plugins">JMeter</a></li>
<li>Case studies: <a href="/blog/backend/06-reliability/cases/shopify/bfcm-capacity-and-game-day/" data-link-title="Shopify：BFCM 容量治理與 Game Day 驗證節奏" data-link-desc="把季節性流量峰值轉成年度可靠性流程，透過容量模型、演練與隔離策略提前吸收風險。">Shopify BFCM capacity governance</a> (game day load tests aligned with the peak shape), <a href="/blog/backend/06-reliability/cases/linkedin/automated-load-testing-and-capacity-forecasting/" data-link-title="LinkedIn：Automated Load Testing 與 Capacity Forecasting" data-link-desc="持續壓測驅動容量預測：用自動化回饋取代一次性壓測的容量規劃。">LinkedIn Automated Load Testing</a> (continuous load testing driving capacity forecasts)</li>
</ul>
]]></content:encoded></item></channel></rss>