<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Runtime on Tarragon</title><link>https://tarrragon.github.io/blog/tags/runtime/</link><description>Recent content in Runtime on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Fri, 26 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/runtime/index.xml" rel="self" type="application/rss+xml"/><item><title>3.1 GC 與 memory limit</title><link>https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/gc-memory-limit/</link><pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/gc-memory-limit/</guid><description>&lt;p>GC 與 memory limit 的核心關係是：Go runtime 會根據 heap 成長決定何時執行 GC，而 memory limit 讓 runtime 有一個軟性記憶體目標。Memory limit 不是硬性上限，也不是 leak 修復工具；它是讓 runtime 更早回應記憶體壓力的控制訊號。&lt;/p>
&lt;h2 id="本章目標">本章目標&lt;/h2>
&lt;p>學完本章後，你將能夠：&lt;/p>
&lt;ol>
&lt;li>理解 heap growth、GOGC 與 GC 頻率的關係&lt;/li>
&lt;li>判斷 &lt;code>debug.SetMemoryLimit&lt;/code> 能解決什麼、不能解決什麼&lt;/li>
&lt;li>從環境變數設定服務 memory limit&lt;/li>
&lt;li>用 runtime &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/metrics/" data-link-title="Metrics" data-link-desc="說明指標如何描述服務趨勢、容量與健康狀態">metrics&lt;/a> 觀察調整效果&lt;/li>
&lt;li>分辨 GC 壓力、長期保留與真正 leak&lt;/li>
&lt;/ol>
&lt;hr>
&lt;h2 id="觀察長時間服務的記憶體問題通常是趨勢問題">【觀察】長時間服務的記憶體問題通常是趨勢問題&lt;/h2>
&lt;p>記憶體診斷的核心觀察是趨勢。Heap 是否持續上升、GC 後是否下降、goroutine 是否增加、某個操作後是否留下無法回收的資料，這些都比「現在用了多少 MB」更重要。&lt;/p>
&lt;p>常見現象：&lt;/p>
&lt;ul>
&lt;li>啟動後 heap 穩定在某個區間：通常正常。&lt;/li>
&lt;li>每次高峰後 heap 都能下降：可能是短暫配置。&lt;/li>
&lt;li>GC 後 heap 仍持續上升：可能有長期保留或 leak。&lt;/li>
&lt;li>GC 次數快速增加且 CPU 升高：可能是 allocation 壓力。&lt;/li>
&lt;li>goroutine 與 heap 同時增加：可能是 goroutine leak 或 send &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/buffer/" data-link-title="Buffer" data-link-desc="說明系統如何用暫存空間吸收短暫速度差與尖峰流量">buffer&lt;/a> 堆積。&lt;/li>
&lt;/ul>
&lt;p>Memory limit 可以幫 runtime 更積極控制 heap，但不能替代趨勢判讀。&lt;/p>
&lt;h2 id="判讀gc-控制的是-heap-成長">【判讀】GC 控制的是 heap 成長&lt;/h2>
&lt;p>Go GC 的核心目標是回收不再被引用的 heap 物件。Runtime 會根據 &lt;code>GOGC&lt;/code> 控制下一次 GC 觸發點。&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="nv">GOGC&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="m">100&lt;/span> go run ./cmd/server&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>GOGC=100&lt;/code> 大致表示 heap 在上次 GC 後成長約 100% 時觸發下一次 GC。數字越小，GC 越頻繁，記憶體通常較低但 CPU 成本較高；數字越大，GC 較少，記憶體通常較高但 CPU 成本較低。&lt;/p>
&lt;p>這是取捨，不是調大或調小就一定更好。CPU 緊繃的服務可能不能承受過低 &lt;code>GOGC&lt;/code>；記憶體緊繃的服務可能不能承受過高 &lt;code>GOGC&lt;/code>。&lt;/p>
&lt;h2 id="判讀memory-limit-是-runtime-軟目標">【判讀】memory limit 是 runtime 軟目標&lt;/h2>
&lt;p>&lt;code>debug.SetMemoryLimit&lt;/code> 的核心用途是告訴 Go runtime 希望整體記憶體使用量靠近某個目標。當 runtime 接近目標時，會更積極回收 heap。&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="kd">func&lt;/span> &lt;span class="nf">configureRuntime&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl"> &lt;span class="kd">const&lt;/span> &lt;span class="nx">limit&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="mi">512&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">20&lt;/span> &lt;span class="c1">// 512 MiB&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl"> &lt;span class="nx">debug&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">SetMemoryLimit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">limit&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="p">}&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>這不是作業系統層級的硬限制。程式仍可能短暫超過這個值，特別是有大量非 Go heap 記憶體、cgo、mmap、大型 byte slice 尖峰或外部 library 配置時。&lt;/p>
&lt;p>Memory limit 適合容器、桌面常駐服務、背景 worker、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket&lt;/a> server 這類需要避免吃掉過多資源的服務。若部署平台已有 memory limit，Go runtime 的 limit 通常應略低於平台限制，留給非 Go heap 與系統開銷。&lt;/p>
&lt;h2 id="執行設定值應來自部署環境">【執行】設定值應來自部署環境&lt;/h2>
&lt;p>Memory limit 的核心配置原則是由部署環境決定，而不是寫死在 library 裡。應用入口可以讀取環境變數，解析後設定 runtime。&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="kd">func&lt;/span> &lt;span class="nf">ConfigureMemoryLimitFromEnv&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="kt">error&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl"> &lt;span class="nx">raw&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nx">os&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Getenv&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;APP_MEMORY_LIMIT_MB&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="nx">raw&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="s">&amp;#34;&amp;#34;&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="kc">nil&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl"> &lt;span class="nx">mb&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">err&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nx">strconv&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Atoi&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">raw&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="nx">err&lt;/span> &lt;span class="o">!=&lt;/span> &lt;span class="kc">nil&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="nx">fmt&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Errorf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;parse APP_MEMORY_LIMIT_MB: %w&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">err&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="nx">mb&lt;/span> &lt;span class="o">&amp;lt;=&lt;/span> &lt;span class="mi">0&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="nx">fmt&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Errorf&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;APP_MEMORY_LIMIT_MB must be positive&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl"> &lt;span class="nx">debug&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">SetMemoryLimit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">int64&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">mb&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">&amp;lt;&amp;lt;&lt;/span> &lt;span class="mi">20&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="kc">nil&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">18&lt;/span>&lt;span class="cl">&lt;span class="p">}&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>錯誤應在啟動時明確失敗。服務若用錯誤設定悄悄運行，後續記憶體行為會很難解釋。&lt;/p></description><content:encoded><![CDATA[<p>GC 與 memory limit 的核心關係是：Go runtime 會根據 heap 成長決定何時執行 GC，而 memory limit 讓 runtime 有一個軟性記憶體目標。Memory limit 不是硬性上限，也不是 leak 修復工具；它是讓 runtime 更早回應記憶體壓力的控制訊號。</p>
<h2 id="本章目標">本章目標</h2>
<p>學完本章後，你將能夠：</p>
<ol>
<li>理解 heap growth、GOGC 與 GC 頻率的關係</li>
<li>判斷 <code>debug.SetMemoryLimit</code> 能解決什麼、不能解決什麼</li>
<li>從環境變數設定服務 memory limit</li>
<li>用 runtime <a href="/blog/backend/knowledge-cards/metrics/" data-link-title="Metrics" data-link-desc="說明指標如何描述服務趨勢、容量與健康狀態">metrics</a> 觀察調整效果</li>
<li>分辨 GC 壓力、長期保留與真正 leak</li>
</ol>
<hr>
<h2 id="觀察長時間服務的記憶體問題通常是趨勢問題">【觀察】長時間服務的記憶體問題通常是趨勢問題</h2>
<p>記憶體診斷的核心觀察是趨勢。Heap 是否持續上升、GC 後是否下降、goroutine 是否增加、某個操作後是否留下無法回收的資料，這些都比「現在用了多少 MB」更重要。</p>
<p>常見現象：</p>
<ul>
<li>啟動後 heap 穩定在某個區間：通常正常。</li>
<li>每次高峰後 heap 都能下降：可能是短暫配置。</li>
<li>GC 後 heap 仍持續上升：可能有長期保留或 leak。</li>
<li>GC 次數快速增加且 CPU 升高：可能是 allocation 壓力。</li>
<li>goroutine 與 heap 同時增加：可能是 goroutine leak 或 send <a href="/blog/backend/knowledge-cards/buffer/" data-link-title="Buffer" data-link-desc="說明系統如何用暫存空間吸收短暫速度差與尖峰流量">buffer</a> 堆積。</li>
</ul>
<p>Memory limit 可以幫 runtime 更積極控制 heap，但不能替代趨勢判讀。</p>
<h2 id="判讀gc-控制的是-heap-成長">【判讀】GC 控制的是 heap 成長</h2>
<p>Go GC 的核心目標是回收不再被引用的 heap 物件。Runtime 會根據 <code>GOGC</code> 控制下一次 GC 觸發點。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="nv">GOGC</span><span class="o">=</span><span class="m">100</span> go run ./cmd/server</span></span></code></pre></div><p><code>GOGC=100</code> 大致表示 heap 在上次 GC 後成長約 100% 時觸發下一次 GC。數字越小，GC 越頻繁，記憶體通常較低但 CPU 成本較高；數字越大，GC 較少，記憶體通常較高但 CPU 成本較低。</p>
<p>這是取捨，不是調大或調小就一定更好。CPU 緊繃的服務可能不能承受過低 <code>GOGC</code>；記憶體緊繃的服務可能不能承受過高 <code>GOGC</code>。</p>
<h2 id="判讀memory-limit-是-runtime-軟目標">【判讀】memory limit 是 runtime 軟目標</h2>
<p><code>debug.SetMemoryLimit</code> 的核心用途是告訴 Go runtime 希望整體記憶體使用量靠近某個目標。當 runtime 接近目標時，會更積極回收 heap。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln">1</span><span class="cl"><span class="kd">func</span> <span class="nf">configureRuntime</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">    <span class="kd">const</span> <span class="nx">limit</span> <span class="p">=</span> <span class="mi">512</span> <span class="o">&lt;&lt;</span> <span class="mi">20</span> <span class="c1">// 512 MiB</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">    <span class="nx">debug</span><span class="p">.</span><span class="nf">SetMemoryLimit</span><span class="p">(</span><span class="nx">limit</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>這不是作業系統層級的硬限制。程式仍可能短暫超過這個值，特別是有大量非 Go heap 記憶體、cgo、mmap、大型 byte slice 尖峰或外部 library 配置時。</p>
<p>Memory limit 適合容器、桌面常駐服務、背景 worker、<a href="/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket</a> server 這類需要避免吃掉過多資源的服務。若部署平台已有 memory limit，Go runtime 的 limit 通常應略低於平台限制，留給非 Go heap 與系統開銷。</p>
<h2 id="執行設定值應來自部署環境">【執行】設定值應來自部署環境</h2>
<p>Memory limit 的核心配置原則是由部署環境決定，而不是寫死在 library 裡。應用入口可以讀取環境變數，解析後設定 runtime。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kd">func</span> <span class="nf">ConfigureMemoryLimitFromEnv</span><span class="p">()</span> <span class="kt">error</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="nx">raw</span> <span class="o">:=</span> <span class="nx">os</span><span class="p">.</span><span class="nf">Getenv</span><span class="p">(</span><span class="s">&#34;APP_MEMORY_LIMIT_MB&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">    <span class="k">if</span> <span class="nx">raw</span> <span class="o">==</span> <span class="s">&#34;&#34;</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">        <span class="k">return</span> <span class="kc">nil</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    <span class="nx">mb</span><span class="p">,</span> <span class="nx">err</span> <span class="o">:=</span> <span class="nx">strconv</span><span class="p">.</span><span class="nf">Atoi</span><span class="p">(</span><span class="nx">raw</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">    <span class="k">if</span> <span class="nx">err</span> <span class="o">!=</span> <span class="kc">nil</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">        <span class="k">return</span> <span class="nx">fmt</span><span class="p">.</span><span class="nf">Errorf</span><span class="p">(</span><span class="s">&#34;parse APP_MEMORY_LIMIT_MB: %w&#34;</span><span class="p">,</span> <span class="nx">err</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="k">if</span> <span class="nx">mb</span> <span class="o">&lt;=</span> <span class="mi">0</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">        <span class="k">return</span> <span class="nx">fmt</span><span class="p">.</span><span class="nf">Errorf</span><span class="p">(</span><span class="s">&#34;APP_MEMORY_LIMIT_MB must be positive&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">
</span></span><span class="line"><span class="ln">16</span><span class="cl">    <span class="nx">debug</span><span class="p">.</span><span class="nf">SetMemoryLimit</span><span class="p">(</span><span class="nb">int64</span><span class="p">(</span><span class="nx">mb</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">20</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">    <span class="k">return</span> <span class="kc">nil</span>
</span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>錯誤應在啟動時明確失敗。服務若用錯誤設定悄悄運行，後續記憶體行為會很難解釋。</p>
<h2 id="策略runtime-metrics-用來看調整是否有效">【策略】runtime metrics 用來看調整是否有效</h2>
<p>Runtime metrics 的核心用途是驗證調整效果。只改 <code>GOGC</code> 或 memory limit，不看 heap 與 GC 趨勢，容易變成憑感覺調參。</p>
<p>簡單方式可以用 <code>runtime.ReadMemStats</code>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln">1</span><span class="cl"><span class="kd">func</span> <span class="nf">ReadHeapAlloc</span><span class="p">()</span> <span class="kt">uint64</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">    <span class="kd">var</span> <span class="nx">stats</span> <span class="nx">runtime</span><span class="p">.</span><span class="nx">MemStats</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">    <span class="nx">runtime</span><span class="p">.</span><span class="nf">ReadMemStats</span><span class="p">(</span><span class="o">&amp;</span><span class="nx">stats</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">    <span class="k">return</span> <span class="nx">stats</span><span class="p">.</span><span class="nx">HeapAlloc</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>較完整的服務可以使用 <code>runtime/metrics</code>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln">1</span><span class="cl"><span class="kd">func</span> <span class="nf">ReadRuntimeSamples</span><span class="p">()</span> <span class="p">[]</span><span class="nx">metrics</span><span class="p">.</span><span class="nx">Sample</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">    <span class="nx">samples</span> <span class="o">:=</span> <span class="p">[]</span><span class="nx">metrics</span><span class="p">.</span><span class="nx">Sample</span><span class="p">{</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">        <span class="p">{</span><span class="nx">Name</span><span class="p">:</span> <span class="s">&#34;/memory/classes/heap/objects:bytes&#34;</span><span class="p">},</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">        <span class="p">{</span><span class="nx">Name</span><span class="p">:</span> <span class="s">&#34;/gc/cycles/total:gc-cycles&#34;</span><span class="p">},</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">        <span class="p">{</span><span class="nx">Name</span><span class="p">:</span> <span class="s">&#34;/sched/goroutines:goroutines&#34;</span><span class="p">},</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl">    <span class="nx">metrics</span><span class="p">.</span><span class="nf">Read</span><span class="p">(</span><span class="nx">samples</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">8</span><span class="cl">    <span class="k">return</span> <span class="nx">samples</span>
</span></span><span class="line"><span class="ln">9</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>觀察時要看趨勢：調整後 heap 峰值是否下降、GC 次數是否合理、CPU 是否上升、goroutine 是否仍持續增加。</p>
<h2 id="判讀memory-limit-不能修正仍被引用的資料">【判讀】memory limit 不能修正仍被引用的資料</h2>
<p>Memory limit 的核心邊界是它只能影響 GC 行為，不能讓仍被引用的物件消失。若程式把資料一直留在 map、slice、cache、goroutine 或 send buffer 裡，GC 不能回收。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln">1</span><span class="cl"><span class="kd">var</span> <span class="nx">cache</span> <span class="p">=</span> <span class="kd">map</span><span class="p">[</span><span class="kt">string</span><span class="p">][]</span><span class="kt">byte</span><span class="p">{}</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="kd">func</span> <span class="nf">SavePayload</span><span class="p">(</span><span class="nx">id</span> <span class="kt">string</span><span class="p">,</span> <span class="nx">payload</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">    <span class="nx">cache</span><span class="p">[</span><span class="nx">id</span><span class="p">]</span> <span class="p">=</span> <span class="nx">payload</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>如果 <code>cache</code> 沒有大小限制、<a href="/blog/backend/knowledge-cards/ttl/" data-link-title="TTL" data-link-desc="說明資料過期時間如何影響快取新鮮度、成本與一致性">TTL</a> 或刪除策略，memory limit 只會讓 GC 更常跑，但資料仍被 <code>cache</code> 引用。真正修正是設計 cache 淘汰、分頁、快照大小限制或資料釋放路徑。</p>
<p>因此遇到 heap 持續上升時，下一步是用 pprof 確認誰保留了記憶體。</p>
<h2 id="策略判斷是-gc-壓力還是長期保留">【策略】判斷是 GC 壓力還是長期保留</h2>
<p>記憶體問題的核心分流是：物件被大量配置但很快回收，還是物件被長期保留。</p>
<table>
  <thead>
      <tr>
          <th>現象</th>
          <th>可能問題</th>
          <th>下一步</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>alloc_space</code> 高，<code>inuse_space</code> 不高</td>
          <td>短命配置多，GC 壓力大</td>
          <td>找熱路徑 allocation</td>
      </tr>
      <tr>
          <td><code>inuse_space</code> 持續上升</td>
          <td>長期保留或 leak</td>
          <td>看 heap profile retainers</td>
      </tr>
      <tr>
          <td>goroutine 數量同步上升</td>
          <td>goroutine leak 或 <a href="/blog/backend/knowledge-cards/queue/" data-link-title="Queue" data-link-desc="說明 queue 如何保存等待處理的工作並形成容量邊界">queue</a> 堆積</td>
          <td>看 goroutine profile</td>
      </tr>
      <tr>
          <td>GC 次數暴增但 heap 仍高</td>
          <td>memory limit 壓力或資料保留</td>
          <td>檢查 cache/map/buffer</td>
      </tr>
  </tbody>
</table>
<p>這個分流會決定後續工具。GC 參數能緩解壓力，但保留資料要回到資料結構與 lifecycle 修。</p>
<h2 id="測試runtime-設定函式可以獨立測解析">【測試】runtime 設定函式可以獨立測解析</h2>
<p>Runtime 本身不需要在單元測試中反覆調參。應把環境解析邏輯獨立出來，測試輸入與錯誤即可。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kd">func</span> <span class="nf">ParseMemoryLimitMB</span><span class="p">(</span><span class="nx">raw</span> <span class="kt">string</span><span class="p">)</span> <span class="p">(</span><span class="kt">int64</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="k">if</span> <span class="nx">raw</span> <span class="o">==</span> <span class="s">&#34;&#34;</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">        <span class="k">return</span> <span class="mi">0</span><span class="p">,</span> <span class="kc">nil</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="nx">mb</span><span class="p">,</span> <span class="nx">err</span> <span class="o">:=</span> <span class="nx">strconv</span><span class="p">.</span><span class="nf">Atoi</span><span class="p">(</span><span class="nx">raw</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="k">if</span> <span class="nx">err</span> <span class="o">!=</span> <span class="kc">nil</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">        <span class="k">return</span> <span class="mi">0</span><span class="p">,</span> <span class="nx">fmt</span><span class="p">.</span><span class="nf">Errorf</span><span class="p">(</span><span class="s">&#34;parse memory limit: %w&#34;</span><span class="p">,</span> <span class="nx">err</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">    <span class="k">if</span> <span class="nx">mb</span> <span class="o">&lt;=</span> <span class="mi">0</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">        <span class="k">return</span> <span class="mi">0</span><span class="p">,</span> <span class="nx">fmt</span><span class="p">.</span><span class="nf">Errorf</span><span class="p">(</span><span class="s">&#34;memory limit must be positive&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="k">return</span> <span class="nb">int64</span><span class="p">(</span><span class="nx">mb</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">20</span><span class="p">,</span> <span class="kc">nil</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>測試：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln">1</span><span class="cl"><span class="kd">func</span> <span class="nf">TestParseMemoryLimitMB</span><span class="p">(</span><span class="nx">t</span> <span class="o">*</span><span class="nx">testing</span><span class="p">.</span><span class="nx">T</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">    <span class="nx">got</span><span class="p">,</span> <span class="nx">err</span> <span class="o">:=</span> <span class="nf">ParseMemoryLimitMB</span><span class="p">(</span><span class="s">&#34;512&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">    <span class="k">if</span> <span class="nx">err</span> <span class="o">!=</span> <span class="kc">nil</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">        <span class="nx">t</span><span class="p">.</span><span class="nf">Fatalf</span><span class="p">(</span><span class="s">&#34;parse memory limit: %v&#34;</span><span class="p">,</span> <span class="nx">err</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">    <span class="k">if</span> <span class="nx">got</span> <span class="o">!=</span> <span class="mi">512</span><span class="o">&lt;&lt;</span><span class="mi">20</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl">        <span class="nx">t</span><span class="p">.</span><span class="nf">Fatalf</span><span class="p">(</span><span class="s">&#34;limit = %d, want %d&#34;</span><span class="p">,</span> <span class="nx">got</span><span class="p">,</span> <span class="nb">int64</span><span class="p">(</span><span class="mi">512</span><span class="o">&lt;&lt;</span><span class="mi">20</span><span class="p">))</span>
</span></span><span class="line"><span class="ln">8</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">9</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>這讓設定邏輯可測，而不需要在每個測試中真的改 runtime 狀態。</p>
<h2 id="本章不處理">本章不處理</h2>
<p>本章先處理單一 Go process 如何判讀 heap、GC 與 memory limit；平台 OOM 與部署合約，會在下列章節再往外延伸：</p>
<ul>
<li><a href="/blog/go-advanced/07-distributed-operations/deployment-contracts/" data-link-title="7.5 Kubernetes、systemd 與 load balancer 合約" data-link-desc="理解部署平台如何影響 Go 服務的 shutdown、health 與資源限制">Go 進階：Kubernetes、systemd 與 load balancer 合約</a></li>
<li><a href="/blog/backend/05-deployment-platform/" data-link-title="模組五：部署平台與網路入口" data-link-desc="整理 Kubernetes、systemd、load balancer、container 與服務生命週期合約">Backend：部署平台與網路入口</a></li>
</ul>
<h2 id="和-go-教材的關係">和 Go 教材的關係</h2>
<p>這一章承接的是 runtime 壓力、allocation 與 pprof 診斷；如果你要先回看語言教材，可以讀：</p>
<ul>
<li><a href="/blog/go-advanced/03-runtime-profiling/allocation/" data-link-title="3.4 資料結構與 allocation 壓力" data-link-desc="分析列表、歷史資料與 WebSocket payload 的配置成本">Go：資料結構與 allocation 壓力</a></li>
<li><a href="/blog/go-advanced/03-runtime-profiling/goroutine-leak/" data-link-title="3.3 goroutine leak 偵測" data-link-desc="判斷背景工作與 client pump 是否正確退出">Go：goroutine leak 偵測</a></li>
<li><a href="/blog/go/07-refactoring/state-boundary/" data-link-title="7.4 狀態管理的安全邊界" data-link-desc="用 lock、copy 與 API 限制保護共享狀態">Go：狀態管理的安全邊界</a></li>
<li><a href="/blog/go/06-practical/new-background-worker/" data-link-title="6.4 如何新增背景工作流程" data-link-desc="接入 context、channel 與 shutdown">Go：如何新增背景工作流程</a></li>
</ul>
<h2 id="小結">小結</h2>
<p>GC 控制 heap 回收節奏，memory limit 給 runtime 一個記憶體軟目標。合理設定能降低長時間服務的資源風險，但不能修正 cache、map、slice、goroutine 或 buffer 長期持有資料。診斷時先看趨勢，再用 pprof 區分 GC 壓力與長期保留。</p>
]]></content:encoded></item><item><title>Runtime 版本升級</title><link>https://tarrragon.github.io/blog/infra/upgrade/runtime-version-upgrade/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/upgrade/runtime-version-upgrade/</guid><description>&lt;p>Runtime 版本升級改變的是既有程式碼的執行環境。程式碼是針對某個版本的行為寫的——函式存不存在、預設值是什麼、型別檢查嚴不嚴格——新版本可能移除函式、改變預設行為、引入更嚴格的型別系統。升級的工作量不在「切換版本」這個動作本身（多數環境只需要改一個設定），而在「讓既有程式碼在新版本下行為正確」的驗證與修正。&lt;/p>
&lt;p>本篇以 PHP 為主要範例（legacy 升級最常見的情境），Node.js 和 Python 的對應工具在各段併列。&lt;/p>
&lt;h2 id="相容性評估">相容性評估&lt;/h2>
&lt;p>升級前要先知道「現有程式碼跟新版本有多少不相容」。不相容的類型分四種：&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>類型&lt;/th>
 &lt;th>範例（PHP 7→8）&lt;/th>
 &lt;th>影響&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>移除的函式&lt;/td>
 &lt;td>&lt;code>each()&lt;/code>、&lt;code>create_function()&lt;/code>、&lt;code>mysql_*&lt;/code> 系列&lt;/td>
 &lt;td>呼叫直接 fatal error&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>改變的預設行為&lt;/td>
 &lt;td>&lt;code>error_reporting&lt;/code> 預設含 &lt;code>E_DEPRECATED&lt;/code>、字串比較更嚴格&lt;/td>
 &lt;td>行為靜默改變、不一定報錯&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>更嚴格的型別&lt;/td>
 &lt;td>內部函式的參數型別檢查從警告升級為 TypeError&lt;/td>
 &lt;td>之前能跑的呼叫現在拋例外&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>擴充模組可用性&lt;/td>
 &lt;td>&lt;code>json&lt;/code> 從可選變內建、&lt;code>mcrypt&lt;/code> 已移除&lt;/td>
 &lt;td>部分功能無法使用&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;h3 id="php-相容性掃描">PHP 相容性掃描&lt;/h3>
&lt;p>PHPCompatibility 是 PHP_CodeSniffer 的規則集，可以自動掃描程式碼裡哪些寫法在目標版本不相容：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># 安裝&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">composer global require phpcompatibility/php-compatibility
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># 掃描：目標版本 8.0&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">phpcs --standard&lt;span class="o">=&lt;/span>PHPCompatibility &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="se">&lt;/span> --runtime-set testVersion 8.0 &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">&lt;span class="se">&lt;/span> --extensions&lt;span class="o">=&lt;/span>php &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">8&lt;/span>&lt;span class="cl">&lt;span class="se">&lt;/span> -p &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">9&lt;/span>&lt;span class="cl">&lt;span class="se">&lt;/span> src/&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>掃描結果會列出每一處不相容的位置、原因和嚴重度。常見的命中包括：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">FILE: src/legacy/Database.php
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">----------------------------------------------------------------------
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">FOUND 3 ERRORS:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl"> 42 | ERROR | Function mysql_connect() is removed since PHP 7.0
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl"> 89 | ERROR | Function each() is removed since PHP 8.0
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">156 | ERROR | Curly brace access syntax is deprecated since PHP 7.4
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">----------------------------------------------------------------------&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>php -l&lt;/code> 可以做基本的語法檢查，但它只抓語法錯誤、抓不到 deprecated 函式和行為變更。PHPCompatibility 掃描的覆蓋面更廣。&lt;/p>
&lt;h3 id="php-升級的高頻修改項">PHP 升級的高頻修改項&lt;/h3>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>項目&lt;/th>
 &lt;th>PHP 5.6→7.x&lt;/th>
 &lt;th>PHP 7.x→8.x&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>資料庫連線&lt;/td>
 &lt;td>&lt;code>mysql_*&lt;/code> → &lt;code>mysqli_*&lt;/code> 或 PDO&lt;/td>
 &lt;td>—&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>陣列遍歷&lt;/td>
 &lt;td>—&lt;/td>
 &lt;td>&lt;code>each()&lt;/code> → &lt;code>foreach&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>字串存取&lt;/td>
 &lt;td>—&lt;/td>
 &lt;td>&lt;code>$str{0}&lt;/code> → &lt;code>$str[0]&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>錯誤處理&lt;/td>
 &lt;td>&lt;code>set_error_handler&lt;/code> 行為變更&lt;/td>
 &lt;td>內部函式 TypeError 取代 warning&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>建構函式&lt;/td>
 &lt;td>同名建構函式 deprecated&lt;/td>
 &lt;td>同名建構函式 removed&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>正則表達式&lt;/td>
 &lt;td>&lt;code>ereg_*&lt;/code> → &lt;code>preg_*&lt;/code>&lt;/td>
 &lt;td>—&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>加密&lt;/td>
 &lt;td>&lt;code>mcrypt_*&lt;/code> → &lt;code>openssl_*&lt;/code> 或 sodium&lt;/td>
 &lt;td>—&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;h3 id="nodejs-相容性掃描">Node.js 相容性掃描&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># 用 nvm 切換版本後跑測試&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">nvm install &lt;span class="m">20&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">nvm use &lt;span class="m">20&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">npm &lt;span class="nb">test&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="c1"># 檢查 package.json 的 engines 欄位&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">cat package.json &lt;span class="p">|&lt;/span> jq &lt;span class="s1">&amp;#39;.engines&amp;#39;&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Node.js 的 breaking change 集中在 V8 引擎行為（&lt;code>Buffer&lt;/code> 建構式、&lt;code>fs&lt;/code> 的 callback 簽章）和原生模組的 ABI 相容性。如果專案用了原生模組（&lt;code>node-gyp&lt;/code> 編譯的），版本升級後要重新 &lt;code>npm rebuild&lt;/code>。&lt;/p>
&lt;h3 id="python-相容性掃描">Python 相容性掃描&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># Python 2→3：用 2to3 掃描&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">2to3 --no-diffs -w src/
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># Python 3.x 小版本：用 pyupgrade&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">pip install pyupgrade
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">pyupgrade --py310-plus src/**/*.py&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Python 2→3 的修改量通常很大（print 語法、unicode 處理、dict 方法），是接近重寫等級的升級。Python 3.x 之間的升級相對溫和，主要是 deprecation 移除和 typing 語法的演進。&lt;/p></description><content:encoded><![CDATA[<p>Runtime 版本升級改變的是既有程式碼的執行環境。程式碼是針對某個版本的行為寫的——函式存不存在、預設值是什麼、型別檢查嚴不嚴格——新版本可能移除函式、改變預設行為、引入更嚴格的型別系統。升級的工作量不在「切換版本」這個動作本身（多數環境只需要改一個設定），而在「讓既有程式碼在新版本下行為正確」的驗證與修正。</p>
<p>本篇以 PHP 為主要範例（legacy 升級最常見的情境），Node.js 和 Python 的對應工具在各段併列。</p>
<h2 id="相容性評估">相容性評估</h2>
<p>升級前要先知道「現有程式碼跟新版本有多少不相容」。不相容的類型分四種：</p>
<table>
  <thead>
      <tr>
          <th>類型</th>
          <th>範例（PHP 7→8）</th>
          <th>影響</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>移除的函式</td>
          <td><code>each()</code>、<code>create_function()</code>、<code>mysql_*</code> 系列</td>
          <td>呼叫直接 fatal error</td>
      </tr>
      <tr>
          <td>改變的預設行為</td>
          <td><code>error_reporting</code> 預設含 <code>E_DEPRECATED</code>、字串比較更嚴格</td>
          <td>行為靜默改變、不一定報錯</td>
      </tr>
      <tr>
          <td>更嚴格的型別</td>
          <td>內部函式的參數型別檢查從警告升級為 TypeError</td>
          <td>之前能跑的呼叫現在拋例外</td>
      </tr>
      <tr>
          <td>擴充模組可用性</td>
          <td><code>json</code> 從可選變內建、<code>mcrypt</code> 已移除</td>
          <td>部分功能無法使用</td>
      </tr>
  </tbody>
</table>
<h3 id="php-相容性掃描">PHP 相容性掃描</h3>
<p>PHPCompatibility 是 PHP_CodeSniffer 的規則集，可以自動掃描程式碼裡哪些寫法在目標版本不相容：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 安裝</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">composer global require phpcompatibility/php-compatibility
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># 掃描：目標版本 8.0</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">phpcs --standard<span class="o">=</span>PHPCompatibility <span class="se">\
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="se"></span>  --runtime-set testVersion 8.0 <span class="se">\
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="se"></span>  --extensions<span class="o">=</span>php <span class="se">\
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="se"></span>  -p <span class="se">\
</span></span></span><span class="line"><span class="ln">9</span><span class="cl"><span class="se"></span>  src/</span></span></code></pre></div><p>掃描結果會列出每一處不相容的位置、原因和嚴重度。常見的命中包括：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">FILE: src/legacy/Database.php
</span></span><span class="line"><span class="ln">2</span><span class="cl">----------------------------------------------------------------------
</span></span><span class="line"><span class="ln">3</span><span class="cl">FOUND 3 ERRORS:
</span></span><span class="line"><span class="ln">4</span><span class="cl"> 42 | ERROR | Function mysql_connect() is removed since PHP 7.0
</span></span><span class="line"><span class="ln">5</span><span class="cl"> 89 | ERROR | Function each() is removed since PHP 8.0
</span></span><span class="line"><span class="ln">6</span><span class="cl">156 | ERROR | Curly brace access syntax is deprecated since PHP 7.4
</span></span><span class="line"><span class="ln">7</span><span class="cl">----------------------------------------------------------------------</span></span></code></pre></div><p><code>php -l</code> 可以做基本的語法檢查，但它只抓語法錯誤、抓不到 deprecated 函式和行為變更。PHPCompatibility 掃描的覆蓋面更廣。</p>
<h3 id="php-升級的高頻修改項">PHP 升級的高頻修改項</h3>
<table>
  <thead>
      <tr>
          <th>項目</th>
          <th>PHP 5.6→7.x</th>
          <th>PHP 7.x→8.x</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>資料庫連線</td>
          <td><code>mysql_*</code> → <code>mysqli_*</code> 或 PDO</td>
          <td>—</td>
      </tr>
      <tr>
          <td>陣列遍歷</td>
          <td>—</td>
          <td><code>each()</code> → <code>foreach</code></td>
      </tr>
      <tr>
          <td>字串存取</td>
          <td>—</td>
          <td><code>$str{0}</code> → <code>$str[0]</code></td>
      </tr>
      <tr>
          <td>錯誤處理</td>
          <td><code>set_error_handler</code> 行為變更</td>
          <td>內部函式 TypeError 取代 warning</td>
      </tr>
      <tr>
          <td>建構函式</td>
          <td>同名建構函式 deprecated</td>
          <td>同名建構函式 removed</td>
      </tr>
      <tr>
          <td>正則表達式</td>
          <td><code>ereg_*</code> → <code>preg_*</code></td>
          <td>—</td>
      </tr>
      <tr>
          <td>加密</td>
          <td><code>mcrypt_*</code> → <code>openssl_*</code> 或 sodium</td>
          <td>—</td>
      </tr>
  </tbody>
</table>
<h3 id="nodejs-相容性掃描">Node.js 相容性掃描</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 用 nvm 切換版本後跑測試</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">nvm install <span class="m">20</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">nvm use <span class="m">20</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">npm <span class="nb">test</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">
</span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="c1"># 檢查 package.json 的 engines 欄位</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl">cat package.json <span class="p">|</span> jq <span class="s1">&#39;.engines&#39;</span></span></span></code></pre></div><p>Node.js 的 breaking change 集中在 V8 引擎行為（<code>Buffer</code> 建構式、<code>fs</code> 的 callback 簽章）和原生模組的 ABI 相容性。如果專案用了原生模組（<code>node-gyp</code> 編譯的），版本升級後要重新 <code>npm rebuild</code>。</p>
<h3 id="python-相容性掃描">Python 相容性掃描</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Python 2→3：用 2to3 掃描</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">2to3 --no-diffs -w src/
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># Python 3.x 小版本：用 pyupgrade</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">pip install pyupgrade
</span></span><span class="line"><span class="ln">6</span><span class="cl">pyupgrade --py310-plus src/**/*.py</span></span></code></pre></div><p>Python 2→3 的修改量通常很大（print 語法、unicode 處理、dict 方法），是接近重寫等級的升級。Python 3.x 之間的升級相對溫和，主要是 deprecation 移除和 typing 語法的演進。</p>
<h2 id="本地驗證">本地驗證</h2>
<p>相容性掃描找出的是靜態分析能偵測的不相容。執行期的行為變更（如字串比較規則改變、排序穩定性改變）只有跑起來才看得到。</p>
<h3 id="建立目標版本的本地環境">建立目標版本的本地環境</h3>
<p>用 Docker 建一個精確匹配目標版本的環境：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="nt">services</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w">  </span><span class="nt">app</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">    </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">php:8.2-apache</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">volumes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">      </span>- <span class="l">./src:/var/www/html</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">    </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">      </span>- <span class="s2">&#34;8080:80&#34;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">  </span><span class="nt">db</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">    </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">mysql:8.0</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">    </span><span class="nt">environment</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">      </span><span class="nt">MYSQL_ROOT_PASSWORD</span><span class="p">:</span><span class="w"> </span><span class="l">localdev</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">      </span><span class="nt">MYSQL_DATABASE</span><span class="p">:</span><span class="w"> </span><span class="l">app</span></span></span></code></pre></div><p>如果不用 Docker，MAMP Pro 或 Laragon 可以切換 PHP 版本。關鍵是本地環境的 runtime 版本要跟升級目標完全一致——PHP 8.0 跟 8.2 之間也有差異。</p>
<h3 id="驗證策略">驗證策略</h3>
<p>有測試套件的專案跑測試套件。沒有測試套件的專案（legacy 專案的常態）按照這個優先序手動驗證：</p>
<ol>
<li><strong>首頁能載入</strong>：最基本的 smoke test，確認 PHP 不 fatal error</li>
<li><strong>登入流程</strong>：session 處理是版本升級最常出問題的區域</li>
<li><strong>資料庫操作</strong>：CRUD 的每一種至少各跑一次</li>
<li><strong>金流 / 第三方 API</strong>：callback URL 和 API 呼叫是否正常</li>
<li><strong>表單提交</strong>：file upload、驗證邏輯</li>
</ol>
<p>PHP 升級時把 <code>error_reporting</code> 開到最大：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-php" data-lang="php"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1">// 開發環境設定（不要在 prod 開）
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"></span><span class="nx">error_reporting</span><span class="p">(</span><span class="k">E_ALL</span><span class="p">);</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="nx">ini_set</span><span class="p">(</span><span class="s1">&#39;display_errors&#39;</span><span class="p">,</span> <span class="s1">&#39;1&#39;</span><span class="p">);</span></span></span></code></pre></div><p>所有 notice、warning、deprecation 都要修——它們在下一個版本可能升級為 error。</p>
<h3 id="第三方依賴相容性">第三方依賴相容性</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Composer：檢查哪些套件需要更新</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">composer outdated
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># 檢查各套件是否支援目標 PHP 版本</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">composer why-not php 8.2</span></span></code></pre></div><p><code>composer why-not</code> 會列出哪些套件的 <code>require.php</code> 限制不允許目標版本。這些套件要先升級到支援新版本的版號，才能升 PHP。</p>
<p>如果某個套件已經不再維護且不支援新 PHP 版本，要評估替代方案或 fork 修改。這個評估的工作量可能佔整個升級的大部分時間。</p>
<h2 id="分批部署策略">分批部署策略</h2>
<h3 id="有獨立環境控制的情境vps--雲端">有獨立環境控制的情境（VPS / 雲端）</h3>
<p>最安全的策略是建一套平行環境跑新版本：</p>
<ol>
<li>用新 PHP 版本建一台新的 VM 或容器</li>
<li>部署相同的程式碼</li>
<li>匯入 prod 資料庫的副本</li>
<li>在新環境跑完整驗證</li>
<li>DNS 或 load balancer 切換流量到新環境</li>
<li>舊環境保留一段時間作為 rollback 目標</li>
</ol>
<p>rollback 是把流量切回舊環境。舊環境在確認新環境穩定之前不要關——保留期至少一週。</p>
<h3 id="面板管理主機無-ssh的情境">面板管理主機（無 SSH）的情境</h3>
<p>面板管理主機（cPanel / Plesk）的 PHP 版本切換通常是 per-domain 的設定：</p>
<ul>
<li><strong>cPanel</strong>：MultiPHP Manager，選域名 → 選 PHP 版本 → Apply</li>
<li><strong>Plesk</strong>：PHP Settings → PHP version 下拉選單</li>
</ul>
<p>切換是即時生效的，rollback 也是即時的（選回舊版本）。但沒有「平行環境驗證」的能力——除非主機商提供 staging subdomain 可以先測。</p>
<p>面板管理主機的升級策略：</p>
<ol>
<li>如果有 staging subdomain：先在 staging 切換版本、驗證、再切 prod</li>
<li>如果沒有：選流量最低的時段切換（如凌晨），切換後立刻驗證關鍵流程，出問題立刻切回</li>
<li>切換前備份（FTP mirror + DB dump），確認 rollback 路徑存在</li>
</ol>
<h3 id="wordpress--框架的版本矩陣">WordPress / 框架的版本矩陣</h3>
<p>WordPress 和主流框架有明確的 PHP 版本支援矩陣。升級 PHP 前要先確認框架版本是否支援目標 PHP 版本：</p>
<table>
  <thead>
      <tr>
          <th>框架</th>
          <th>查詢方式</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>WordPress</td>
          <td><a href="https://wordpress.org/about/requirements/">官方需求頁</a></td>
      </tr>
      <tr>
          <td>Laravel</td>
          <td>各版本 <code>composer.json</code> 的 <code>require.php</code></td>
      </tr>
      <tr>
          <td>Symfony</td>
          <td><a href="https://symfony.com/releases">Release and support</a> 頁面</td>
      </tr>
  </tbody>
</table>
<p>如果框架不支援目標 PHP 版本，要先升級框架。框架升級和 PHP 升級不要同時做——先升框架、驗證穩定、再升 PHP，每一步都有獨立的 rollback 點。</p>
<h2 id="常見的升級陷阱">常見的升級陷阱</h2>
<h3 id="session-序列化格式">Session 序列化格式</h3>
<p>PHP 的 session 序列化格式在某些版本之間有變更。版本切換後舊 session 檔案可能無法反序列化，使用者會被強制登出。處理方式：</p>
<ul>
<li>在維護窗口切換版本（使用者預期重新登入）</li>
<li>或在切換前清除所有 session 檔案</li>
</ul>
<h3 id="opcache-快取">opcache 快取</h3>
<p>PHP 的 opcache 會快取編譯後的 bytecode。版本切換後如果 opcache 沒清，可能用舊版本編譯的 bytecode 跑在新版本上。切換後的第一件事：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># CLI 方式清除（如果有 SSH）</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">php -r <span class="s2">&#34;opcache_reset();&#34;</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># 或重啟 PHP-FPM / Apache</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">systemctl restart php8.2-fpm</span></span></code></pre></div><h3 id="composer-的-php-版本鎖定">Composer 的 PHP 版本鎖定</h3>
<p><code>composer.lock</code> 裡的套件版本是根據當時的 PHP 版本解析的。PHP 版本變了之後，要重新 <code>composer update</code> 讓 Composer 用新版本重新解析依賴。但 <code>composer update</code> 可能升級其他套件——較安全的做法是 <code>composer update --lock</code> 只更新 lock file 的 metadata、不升級套件版本。</p>
<h3 id="隱性的行為變更">隱性的行為變更</h3>
<p>PHP 8.0 起，字串跟數字的比較規則改了（<code>0 == &quot;foo&quot;</code> 從 <code>true</code> 變 <code>false</code>）。這類變更不會報錯、不會拋例外，程式碼照跑但行為不同。靜態分析抓不到，只有業務邏輯測試能覆蓋。</p>
<p>如果沒有測試套件，至少在切換後的一週內密切監控錯誤日誌和業務指標（訂單數、登入數、API 錯誤率），用業務指標的異常作為行為變更的偵測手段。</p>
<h2 id="時程與管理層溝通">時程與管理層溝通</h2>
<table>
  <thead>
      <tr>
          <th>升級類型</th>
          <th>典型時程</th>
          <th>主要成本來源</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>PHP 小版本（8.0→8.2）</td>
          <td>2-5 天</td>
          <td>依賴更新 + 測試</td>
      </tr>
      <tr>
          <td>PHP 跨大版本（7.4→8.x）</td>
          <td>1-2 週</td>
          <td>函式替換 + 行為驗證</td>
      </tr>
      <tr>
          <td>PHP 跳代（5.6→8.x）</td>
          <td>4-8 週</td>
          <td>大量程式碼修改 + 框架升級</td>
      </tr>
      <tr>
          <td>Node.js 大版本</td>
          <td>3-5 天</td>
          <td>原生模組重編 + API 變更</td>
      </tr>
      <tr>
          <td>Python 2→3</td>
          <td>8-16 週</td>
          <td>接近重寫等級</td>
      </tr>
  </tbody>
</table>
<p>向管理層溝通時要說明：「升級 runtime 版本不只是在伺服器改一個設定。程式碼裡用到的函式和行為在新版本有不同的定義，需要逐一修改和驗證。時程取決於程式碼用了多少舊版本的專屬功能。」</p>
<p>成本參考：PHP 版本升級本身的工具和環境不花錢（PHPCompatibility 開源、Docker 免費、cPanel 版本切換內建）。成本全在工程師時間。</p>
<h2 id="跨分類引用">跨分類引用</h2>
<ul>
<li>→ <a href="/blog/infra/upgrade/upgrade-framework/" data-link-title="升級的共通操作框架" data-link-desc="任何環境或系統升級的四階段模型：差異評估、平行環境驗證、分批切換、退役舊環境，以及貫穿全程的升級紀律">升級的共通操作框架</a>：四階段模型（評估 → 平行環境 → 切換 → 退役）</li>
<li>→ <a href="/blog/infra/takeover/legacy-php-security-audit/" data-link-title="Legacy PHP 的安全盤點" data-link-desc="接手 legacy PHP 專案後的系統性安全審查：credential 掃描、PHP 版本風險、常見漏洞模式的 grep 偵測、.htaccess 防線、檔案權限、外部依賴與掃描工具">Legacy PHP 的安全盤點</a>：PHP 版本風險評估與漏洞掃描</li>
<li>→ <a href="/blog/infra/takeover/legacy-code-versioning-deployment/" data-link-title="程式碼版控與 FTP 部署紀律" data-link-desc="無 SSH 環境的 PHP 專案的程式碼怎麼從 FTP 拉回來建 Git repo、設定檔怎麼分離、FTP 部署怎麼建立可追蹤的流程、以及怎麼用 CI 取代手動上傳">程式碼版控與 FTP 部署紀律</a>：升級前的 Git 基準線與 rollback 策略</li>
</ul>
]]></content:encoded></item><item><title>3.2 pprof 基礎診斷流程</title><link>https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/pprof/</link><pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/pprof/</guid><description>&lt;p>pprof 的核心用途是用實際執行資料定位效能問題。它能協助觀察 heap、goroutine、CPU、block、mutex 與 &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/trace/" data-link-title="Trace" data-link-desc="說明 trace 如何重建跨服務請求的路徑、耗時與依賴關係">trace&lt;/a>，讓工程師從「感覺哪裡慢」改成「依 profile 判斷哪裡有壓力」。&lt;/p>
&lt;h2 id="本章目標">本章目標&lt;/h2>
&lt;p>學完本章後，你將能夠：&lt;/p>
&lt;ol>
&lt;li>安全地條件啟用 pprof endpoint&lt;/li>
&lt;li>判斷 heap、goroutine、CPU、block、mutex、trace 各自回答什麼問題&lt;/li>
&lt;li>用 &lt;code>go tool pprof&lt;/code> 取得 profile 並閱讀 &lt;code>top&lt;/code>&lt;/li>
&lt;li>區分 &lt;code>inuse_space&lt;/code> 與 &lt;code>alloc_space&lt;/code>&lt;/li>
&lt;li>把 profile 結果連回程式設計邊界&lt;/li>
&lt;/ol>
&lt;hr>
&lt;h2 id="觀察效能問題需要先問對問題">【觀察】效能問題需要先問對問題&lt;/h2>
&lt;p>pprof 診斷的核心起點是先確認你要回答哪個問題。不同 profile 回答不同問題，拿錯工具會讓分析變成猜測。&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>問題&lt;/th>
 &lt;th>優先 profile&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>記憶體持續上升&lt;/td>
 &lt;td>heap &lt;code>inuse_space&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>GC 壓力高、配置很多&lt;/td>
 &lt;td>heap &lt;code>alloc_space&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>goroutine 數量持續增加&lt;/td>
 &lt;td>goroutine profile&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>CPU 使用率高&lt;/td>
 &lt;td>CPU profile&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>goroutine 常卡在 channel 或 syscall&lt;/td>
 &lt;td>goroutine / trace&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>mutex 等待嚴重&lt;/td>
 &lt;td>mutex profile&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>channel/send/receive 阻塞多&lt;/td>
 &lt;td>block profile&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>Profile 不是一次全抓就會自動給答案。先問清楚問題，再抓對應資料，分析成本會低很多。&lt;/p>
&lt;h2 id="判讀pprof-endpoint-是受控診斷入口">【判讀】pprof endpoint 是受控診斷入口&lt;/h2>
&lt;p>pprof endpoint 的核心安全責任是受控地暴露診斷資訊。它可能包含 goroutine stack、函式名稱、路徑、記憶體配置模式與部分請求脈絡；正式服務應把 &lt;code>/debug/pprof/&lt;/code> 放在明確啟用、內部網路或驗證保護之後。&lt;/p>
&lt;p>條件啟用範例：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="kn">import&lt;/span> &lt;span class="nx">_&lt;/span> &lt;span class="s">&amp;#34;net/http/pprof&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="kd">func&lt;/span> &lt;span class="nf">RegisterDebugEndpoints&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">mux&lt;/span> &lt;span class="o">*&lt;/span>&lt;span class="nx">http&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">ServeMux&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="nx">os&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Getenv&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;APP_PPROF&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">!=&lt;/span> &lt;span class="s">&amp;#34;1&amp;#34;&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">8&lt;/span>&lt;span class="cl"> &lt;span class="nx">mux&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Handle&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;/debug/pprof/&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">http&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">DefaultServeMux&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">9&lt;/span>&lt;span class="cl">&lt;span class="p">}&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>實務上還可以只綁定 localhost、掛在內部管理 port、加上身份驗證，或只在開發與診斷環境啟用。重點是 pprof 要受控，而不是跟公開 API 一起裸露。&lt;/p>
&lt;h2 id="執行heap-profile-看記憶體保留與配置壓力">【執行】heap profile 看記憶體保留與配置壓力&lt;/h2>
&lt;p>Heap profile 的核心問題是「哪些物件佔用或配置了記憶體」。當服務記憶體持續上升時，heap profile 是第一個常用工具。&lt;/p>
&lt;p>看目前仍被保留的記憶體：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">go tool pprof http://localhost:8080/debug/pprof/heap&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>進入 pprof 後：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">(pprof) top&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>inuse_space&lt;/code> 代表目前仍被保留的記憶體，適合分析 leak、cache、map、slice、send &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/buffer/" data-link-title="Buffer" data-link-desc="說明系統如何用暫存空間吸收短暫速度差與尖峰流量">buffer&lt;/a>、長期持有資料。&lt;/p>
&lt;p>看累積配置量：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">go tool pprof -alloc_space http://localhost:8080/debug/pprof/heap&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>alloc_space&lt;/code> 代表累積配置量，適合分析 JSON marshal、slice append、短命 object、熱路徑反覆配置造成的 GC 壓力。&lt;/p>
&lt;h2 id="判讀heap-profile-要連回資料結構">【判讀】heap profile 要連回資料結構&lt;/h2>
&lt;p>Heap profile 的核心解讀是問「誰持有資料」或「誰反覆配置」。看到某個函式在 top 裡，下一步要回到資料結構與生命週期。&lt;/p>
&lt;p>常見對應：&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>profile 現象&lt;/th>
 &lt;th>可能設計問題&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>map 持續佔用&lt;/td>
 &lt;td>cache 沒有淘汰或 key 無限制成長&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>slice/history 佔用高&lt;/td>
 &lt;td>history 無上限或 list 回傳太大&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>JSON marshal alloc 高&lt;/td>
 &lt;td>高頻推送每個 client 重複 marshal&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>bytes.Buffer 配置高&lt;/td>
 &lt;td>熱路徑重複建立 buffer&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">websocket&lt;/a> message 佔用高&lt;/td>
 &lt;td>send buffer 滿載或慢 client&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>Profile 給的是線索，不是最終修正。修正要回到資料模型、copy boundary、buffer policy 或 cache policy。&lt;/p>
&lt;h2 id="執行goroutine-profile-看存活與卡住路徑">【執行】goroutine profile 看存活與卡住路徑&lt;/h2>
&lt;p>Goroutine profile 的核心問題是「哪些 goroutine 還活著，以及它們卡在哪裡」。它常用來診斷 goroutine leak、channel 等待、鎖等待與 network read 阻塞。&lt;/p></description><content:encoded><![CDATA[<p>pprof 的核心用途是用實際執行資料定位效能問題。它能協助觀察 heap、goroutine、CPU、block、mutex 與 <a href="/blog/backend/knowledge-cards/trace/" data-link-title="Trace" data-link-desc="說明 trace 如何重建跨服務請求的路徑、耗時與依賴關係">trace</a>，讓工程師從「感覺哪裡慢」改成「依 profile 判斷哪裡有壓力」。</p>
<h2 id="本章目標">本章目標</h2>
<p>學完本章後，你將能夠：</p>
<ol>
<li>安全地條件啟用 pprof endpoint</li>
<li>判斷 heap、goroutine、CPU、block、mutex、trace 各自回答什麼問題</li>
<li>用 <code>go tool pprof</code> 取得 profile 並閱讀 <code>top</code></li>
<li>區分 <code>inuse_space</code> 與 <code>alloc_space</code></li>
<li>把 profile 結果連回程式設計邊界</li>
</ol>
<hr>
<h2 id="觀察效能問題需要先問對問題">【觀察】效能問題需要先問對問題</h2>
<p>pprof 診斷的核心起點是先確認你要回答哪個問題。不同 profile 回答不同問題，拿錯工具會讓分析變成猜測。</p>
<table>
  <thead>
      <tr>
          <th>問題</th>
          <th>優先 profile</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>記憶體持續上升</td>
          <td>heap <code>inuse_space</code></td>
      </tr>
      <tr>
          <td>GC 壓力高、配置很多</td>
          <td>heap <code>alloc_space</code></td>
      </tr>
      <tr>
          <td>goroutine 數量持續增加</td>
          <td>goroutine profile</td>
      </tr>
      <tr>
          <td>CPU 使用率高</td>
          <td>CPU profile</td>
      </tr>
      <tr>
          <td>goroutine 常卡在 channel 或 syscall</td>
          <td>goroutine / trace</td>
      </tr>
      <tr>
          <td>mutex 等待嚴重</td>
          <td>mutex profile</td>
      </tr>
      <tr>
          <td>channel/send/receive 阻塞多</td>
          <td>block profile</td>
      </tr>
  </tbody>
</table>
<p>Profile 不是一次全抓就會自動給答案。先問清楚問題，再抓對應資料，分析成本會低很多。</p>
<h2 id="判讀pprof-endpoint-是受控診斷入口">【判讀】pprof endpoint 是受控診斷入口</h2>
<p>pprof endpoint 的核心安全責任是受控地暴露診斷資訊。它可能包含 goroutine stack、函式名稱、路徑、記憶體配置模式與部分請求脈絡；正式服務應把 <code>/debug/pprof/</code> 放在明確啟用、內部網路或驗證保護之後。</p>
<p>條件啟用範例：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln">1</span><span class="cl"><span class="kn">import</span> <span class="nx">_</span> <span class="s">&#34;net/http/pprof&#34;</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="kd">func</span> <span class="nf">RegisterDebugEndpoints</span><span class="p">(</span><span class="nx">mux</span> <span class="o">*</span><span class="nx">http</span><span class="p">.</span><span class="nx">ServeMux</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">    <span class="k">if</span> <span class="nx">os</span><span class="p">.</span><span class="nf">Getenv</span><span class="p">(</span><span class="s">&#34;APP_PPROF&#34;</span><span class="p">)</span> <span class="o">!=</span> <span class="s">&#34;1&#34;</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">        <span class="k">return</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl">
</span></span><span class="line"><span class="ln">8</span><span class="cl">    <span class="nx">mux</span><span class="p">.</span><span class="nf">Handle</span><span class="p">(</span><span class="s">&#34;/debug/pprof/&#34;</span><span class="p">,</span> <span class="nx">http</span><span class="p">.</span><span class="nx">DefaultServeMux</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">9</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>實務上還可以只綁定 localhost、掛在內部管理 port、加上身份驗證，或只在開發與診斷環境啟用。重點是 pprof 要受控，而不是跟公開 API 一起裸露。</p>
<h2 id="執行heap-profile-看記憶體保留與配置壓力">【執行】heap profile 看記憶體保留與配置壓力</h2>
<p>Heap profile 的核心問題是「哪些物件佔用或配置了記憶體」。當服務記憶體持續上升時，heap profile 是第一個常用工具。</p>
<p>看目前仍被保留的記憶體：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">go tool pprof http://localhost:8080/debug/pprof/heap</span></span></code></pre></div><p>進入 pprof 後：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">(pprof) top</span></span></code></pre></div><p><code>inuse_space</code> 代表目前仍被保留的記憶體，適合分析 leak、cache、map、slice、send <a href="/blog/backend/knowledge-cards/buffer/" data-link-title="Buffer" data-link-desc="說明系統如何用暫存空間吸收短暫速度差與尖峰流量">buffer</a>、長期持有資料。</p>
<p>看累積配置量：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">go tool pprof -alloc_space http://localhost:8080/debug/pprof/heap</span></span></code></pre></div><p><code>alloc_space</code> 代表累積配置量，適合分析 JSON marshal、slice append、短命 object、熱路徑反覆配置造成的 GC 壓力。</p>
<h2 id="判讀heap-profile-要連回資料結構">【判讀】heap profile 要連回資料結構</h2>
<p>Heap profile 的核心解讀是問「誰持有資料」或「誰反覆配置」。看到某個函式在 top 裡，下一步要回到資料結構與生命週期。</p>
<p>常見對應：</p>
<table>
  <thead>
      <tr>
          <th>profile 現象</th>
          <th>可能設計問題</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>map 持續佔用</td>
          <td>cache 沒有淘汰或 key 無限制成長</td>
      </tr>
      <tr>
          <td>slice/history 佔用高</td>
          <td>history 無上限或 list 回傳太大</td>
      </tr>
      <tr>
          <td>JSON marshal alloc 高</td>
          <td>高頻推送每個 client 重複 marshal</td>
      </tr>
      <tr>
          <td>bytes.Buffer 配置高</td>
          <td>熱路徑重複建立 buffer</td>
      </tr>
      <tr>
          <td><a href="/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">websocket</a> message 佔用高</td>
          <td>send buffer 滿載或慢 client</td>
      </tr>
  </tbody>
</table>
<p>Profile 給的是線索，不是最終修正。修正要回到資料模型、copy boundary、buffer policy 或 cache policy。</p>
<h2 id="執行goroutine-profile-看存活與卡住路徑">【執行】goroutine profile 看存活與卡住路徑</h2>
<p>Goroutine profile 的核心問題是「哪些 goroutine 還活著，以及它們卡在哪裡」。它常用來診斷 goroutine leak、channel 等待、鎖等待與 network read 阻塞。</p>
<p>互動模式：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">go tool pprof http://localhost:8080/debug/pprof/goroutine</span></span></code></pre></div><p>文字 stack：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">curl <span class="s2">&#34;http://localhost:8080/debug/pprof/goroutine?debug=2&#34;</span></span></span></code></pre></div><p>若大量 goroutine 卡在同一個 channel receive、send、network read、ticker loop，通常代表某個退出條件、close path、<a href="/blog/backend/knowledge-cards/deadline/" data-link-title="Deadline" data-link-desc="說明整體操作的截止時間如何沿著服務邊界傳遞">deadline</a> 或 unregister 設計有問題。</p>
<p>Goroutine profile 不只看數量。少量但卡在錯誤位置的 goroutine，也可能代表資源沒有被釋放。</p>
<h2 id="執行cpu-profile-看熱路徑">【執行】CPU profile 看熱路徑</h2>
<p>CPU profile 的核心問題是「程式把 CPU 時間花在哪裡」。它需要採樣一段時間，適合分析 CPU 使用率高或 request latency 異常。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">go tool pprof <span class="s2">&#34;http://localhost:8080/debug/pprof/profile?seconds=30&#34;</span></span></span></code></pre></div><p>常用指令：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">(pprof) top
</span></span><span class="line"><span class="ln">2</span><span class="cl">(pprof) list Encode</span></span></code></pre></div><p>CPU profile 要搭配流量情境解讀。低流量時抓到的 profile 可能沒有代表性；高流量時則要注意診斷本身也會造成額外負擔。</p>
<p>若 top 顯示大量時間花在 JSON encode、sort、lock、regex 或 compression，下一步應回到對應熱路徑，判斷是否可以減少工作、快取結果、改資料結構或降低呼叫頻率。</p>
<h2 id="策略block-與-mutex-profile-需要先啟用取樣">【策略】block 與 mutex profile 需要先啟用取樣</h2>
<p>Block/mutex profile 的核心用途是分析等待，而不是分析 CPU 計算。它們通常需要在程式中設定取樣比例。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln">1</span><span class="cl"><span class="kd">func</span> <span class="nf">ConfigureBlockingProfiles</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">    <span class="nx">runtime</span><span class="p">.</span><span class="nf">SetBlockProfileRate</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">    <span class="nx">runtime</span><span class="p">.</span><span class="nf">SetMutexProfileFraction</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>Block profile 看 goroutine 在同步原語上阻塞的時間，例如 channel send/receive、select、mutex。Mutex profile 看鎖競爭。</p>
<p>啟用取樣有成本，不一定要常駐開最高強度。診斷時可以條件啟用，或在壓測環境中使用。</p>
<h2 id="執行trace-看排程與延遲">【執行】trace 看排程與延遲</h2>
<p>Trace 的核心用途是觀察 goroutine 排程、network block、syscall、GC pause 與延遲事件。它比單一 profile 更完整，但也更重。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">curl -o trace.out <span class="s2">&#34;http://localhost:8080/debug/pprof/trace?seconds=5&#34;</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">go tool trace trace.out</span></span></code></pre></div><p>Trace 適合用在你已經知道有延遲問題，但 heap、CPU、goroutine profile 都不足以解釋時。它能顯示 goroutine 何時 runnable、何時 blocked、何時被排程。</p>
<p>Trace 檔案可能很大，不適合長時間收集。通常先抓短時間，確認問題窗口後再精準分析。</p>
<h2 id="策略診斷流程要先保留現場">【策略】診斷流程要先保留現場</h2>
<p>pprof 診斷的核心流程是先保留現場，再改程式。若你先重啟服務或調參，可能會清掉最重要的證據。</p>
<p>建議流程：</p>
<ol>
<li>記錄當下流量、版本、操作、時間區間。</li>
<li>讀 runtime <a href="/blog/backend/knowledge-cards/metrics/" data-link-title="Metrics" data-link-desc="說明指標如何描述服務趨勢、容量與健康狀態">metrics</a>：heap、GC、goroutine、<a href="/blog/backend/knowledge-cards/queue/" data-link-title="Queue" data-link-desc="說明 queue 如何保存等待處理的工作並形成容量邊界">queue</a> 長度。</li>
<li>依問題抓 profile：heap、goroutine、CPU 或 trace。</li>
<li>用 profile 找出函式與 stack pattern。</li>
<li>回到程式碼確認資料結構、goroutine lifecycle 或 hot path。</li>
<li>修改後用相同情境再抓一次 profile 驗證。</li>
</ol>
<p>這個流程能避免「看到 top 第一名就改」的衝動。Profile 需要和情境一起讀，才不會誤判。</p>
<h2 id="本章不處理">本章不處理</h2>
<p>本章先處理單一服務內的 profile 讀法；商用 APM 與分散式 tracing，會在下列章節再往外延伸：</p>
<ul>
<li><a href="/blog/go-advanced/07-distributed-operations/observability-pipeline/" data-link-title="7.4 Observability pipeline、metrics 與 tracing" data-link-desc="把 structured log、metric、trace 與 profile 組成可操作的診斷系統">Go 進階：Observability pipeline、metrics 與 tracing</a></li>
<li><a href="/blog/backend/04-observability/" data-link-title="模組四：可觀測性平台" data-link-desc="整理 log、metric、trace、dashboard 與 alert 的後端操作實務">Backend：可觀測性平台</a></li>
</ul>
<h2 id="和-go-教材的關係">和 Go 教材的關係</h2>
<p>這一章承接的是 goroutine、allocation 與 runtime metrics；如果你要先回看語言教材，可以讀：</p>
<ul>
<li><a href="/blog/go/04-concurrency/goroutine/" data-link-title="4.1 goroutine：輕量並發工作" data-link-desc="用 goroutine 啟動並發工作，並設計清楚的退出條件">Go：goroutine：輕量並發工作</a></li>
<li><a href="/blog/go/04-concurrency/select/" data-link-title="4.3 select：同時等待多種事件" data-link-desc="用 select 建立事件迴圈">Go：select：同時等待多種事件</a></li>
<li><a href="/blog/go/06-practical/new-background-worker/" data-link-title="6.4 如何新增背景工作流程" data-link-desc="接入 context、channel 與 shutdown">Go：如何新增背景工作流程</a></li>
<li><a href="/blog/go/07-refactoring/state-boundary/" data-link-title="7.4 狀態管理的安全邊界" data-link-desc="用 lock、copy 與 API 限制保護共享狀態">Go：狀態管理的安全邊界</a></li>
</ul>
<h2 id="小結">小結</h2>
<p>pprof 是診斷工具，不是公開 API。Heap profile 看保留與配置，goroutine profile 看存活與卡住路徑，CPU profile 看熱點，block/mutex profile 看等待，trace 看排程與延遲。好的診斷流程會先問對問題、抓對 profile，再把結果連回資料結構、goroutine lifecycle 與服務行為。</p>
]]></content:encoded></item><item><title>3.3 goroutine leak 偵測</title><link>https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/goroutine-leak/</link><pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/goroutine-leak/</guid><description>&lt;p>Goroutine leak 偵測的核心目標是確認已經沒有存在價值的 goroutine 能被停止。它通常是生命週期問題：誰取消、誰 close、誰解除 I/O 阻塞、誰停止 ticker。&lt;/p>
&lt;h2 id="本章目標">本章目標&lt;/h2>
&lt;p>學完本章後，你將能夠：&lt;/p>
&lt;ol>
&lt;li>分辨合理長期 goroutine 與 goroutine leak&lt;/li>
&lt;li>用 context、done channel、connection close 設計退出路徑&lt;/li>
&lt;li>用 pprof goroutine profile 判讀卡住 stack&lt;/li>
&lt;li>測試 worker、ticker、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket&lt;/a> pump 是否能退出&lt;/li>
&lt;li>從 leak pattern 回到 ownership 修正&lt;/li>
&lt;/ol>
&lt;hr>
&lt;h2 id="觀察goroutine-leak-是生命週期沒有結束">【觀察】goroutine leak 是生命週期沒有結束&lt;/h2>
&lt;p>Goroutine leak 的核心定義是某個 goroutine 已經沒有存在價值，卻仍然活著。它可能卡在 channel receive、channel send、network read、ticker、mutex 或永遠不會觸發的 select case。&lt;/p>
&lt;p>反模式：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="kd">func&lt;/span> &lt;span class="nf">StartWorker&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">jobs&lt;/span> &lt;span class="o">&amp;lt;-&lt;/span>&lt;span class="kd">chan&lt;/span> &lt;span class="nx">Job&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl"> &lt;span class="k">go&lt;/span> &lt;span class="kd">func&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="nx">job&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="k">range&lt;/span> &lt;span class="nx">jobs&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl"> &lt;span class="nf">process&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">job&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl"> &lt;span class="p">}()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">&lt;span class="p">}&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>這個 worker 只有在 &lt;code>jobs&lt;/code> 被關閉時才會退出。若呼叫端永遠不關閉 &lt;code>jobs&lt;/code>，而 worker 也沒有 context，這個 goroutine 可能永久存在。&lt;/p>
&lt;p>長期存在不一定是 leak。HTTP server accept loop、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/metrics/" data-link-title="Metrics" data-link-desc="說明指標如何描述服務趨勢、容量與健康狀態">metrics&lt;/a> exporter、background scheduler 都可能合理存在；問題是它們是否有明確停止條件，且 shutdown 時是否真的會走到。&lt;/p>
&lt;h2 id="判讀每個-goroutine-都要有退出原因">【判讀】每個 goroutine 都要有退出原因&lt;/h2>
&lt;p>Goroutine lifecycle 的核心檢查是每個 goroutine 都能回答三個問題：&lt;/p>
&lt;ol>
&lt;li>誰要求它停止？&lt;/li>
&lt;li>它如果卡在 channel 或 I/O，如何被喚醒？&lt;/li>
&lt;li>它停止後如何讓測試或上層知道？&lt;/li>
&lt;/ol>
&lt;p>若三題任一題答不出來，就有 leak 風險。&lt;/p>
&lt;p>例如 worker 應該有 context：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="kd">func&lt;/span> &lt;span class="nf">RunWorker&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">ctx&lt;/span> &lt;span class="nx">context&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">Context&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">jobs&lt;/span> &lt;span class="o">&amp;lt;-&lt;/span>&lt;span class="kd">chan&lt;/span> &lt;span class="nx">Job&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl"> &lt;span class="k">select&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"> &lt;span class="k">case&lt;/span> &lt;span class="o">&amp;lt;-&lt;/span>&lt;span class="nx">ctx&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Done&lt;/span>&lt;span class="p">():&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl"> &lt;span class="k">case&lt;/span> &lt;span class="nx">job&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">ok&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="o">&amp;lt;-&lt;/span>&lt;span class="nx">jobs&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="p">!&lt;/span>&lt;span class="nx">ok&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl"> &lt;span class="nf">process&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">job&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">&lt;span class="p">}&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>這個 worker 有兩條退出路徑：上層取消 context，或 jobs channel 被關閉。這比只依賴 channel close 更容易整合到服務 shutdown。&lt;/p>
&lt;h2 id="策略io-阻塞需要-deadline-或-close">【策略】I/O 阻塞需要 deadline 或 close&lt;/h2>
&lt;p>I/O goroutine 的核心風險是 context 本身不一定能打斷底層阻塞呼叫。WebSocket read、TCP read、file watcher、外部 API call 都要確認是否支援 context、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/deadline/" data-link-title="Deadline" data-link-desc="說明整體操作的截止時間如何沿著服務邊界傳遞">deadline&lt;/a> 或 close。&lt;/p>
&lt;p>WebSocket read pump 常見退出方式：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="kd">func&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="nx">c&lt;/span> &lt;span class="o">*&lt;/span>&lt;span class="nx">Client&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="nf">readPump&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">ctx&lt;/span> &lt;span class="nx">context&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">Context&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">hub&lt;/span> &lt;span class="o">*&lt;/span>&lt;span class="nx">Hub&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">router&lt;/span> &lt;span class="nx">MessageRouter&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl"> &lt;span class="k">defer&lt;/span> &lt;span class="kd">func&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl"> &lt;span class="nx">hub&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">unregister&lt;/span> &lt;span class="o">&amp;lt;-&lt;/span> &lt;span class="nx">c&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"> &lt;span class="p">}()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl"> &lt;span class="kd">var&lt;/span> &lt;span class="nx">message&lt;/span> &lt;span class="nx">ClientMessage&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="nx">err&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nx">c&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">conn&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">ReadJSON&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">&amp;amp;&lt;/span>&lt;span class="nx">message&lt;/span>&lt;span class="p">);&lt;/span> &lt;span class="nx">err&lt;/span> &lt;span class="o">!=&lt;/span> &lt;span class="kc">nil&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl"> &lt;span class="k">return&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl"> &lt;span class="nx">_&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="nx">router&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Route&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">ctx&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">c&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">message&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">&lt;span class="p">}&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>若 &lt;code>ReadJSON&lt;/code> 卡住，context 取消不一定直接讓它返回。實務上需要 read deadline、connection close 或 heartbeat 讓 read pump 有機會退出。&lt;/p>
&lt;h2 id="執行done-channel-讓測試能觀察退出">【執行】done channel 讓測試能觀察退出&lt;/h2>
&lt;p>測試 goroutine 是否退出的核心問題是需要可觀察訊號。&lt;code>done&lt;/code> channel 可以在 goroutine 結束時 close。&lt;/p></description><content:encoded><![CDATA[<p>Goroutine leak 偵測的核心目標是確認已經沒有存在價值的 goroutine 能被停止。它通常是生命週期問題：誰取消、誰 close、誰解除 I/O 阻塞、誰停止 ticker。</p>
<h2 id="本章目標">本章目標</h2>
<p>學完本章後，你將能夠：</p>
<ol>
<li>分辨合理長期 goroutine 與 goroutine leak</li>
<li>用 context、done channel、connection close 設計退出路徑</li>
<li>用 pprof goroutine profile 判讀卡住 stack</li>
<li>測試 worker、ticker、<a href="/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket</a> pump 是否能退出</li>
<li>從 leak pattern 回到 ownership 修正</li>
</ol>
<hr>
<h2 id="觀察goroutine-leak-是生命週期沒有結束">【觀察】goroutine leak 是生命週期沒有結束</h2>
<p>Goroutine leak 的核心定義是某個 goroutine 已經沒有存在價值，卻仍然活著。它可能卡在 channel receive、channel send、network read、ticker、mutex 或永遠不會觸發的 select case。</p>
<p>反模式：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln">1</span><span class="cl"><span class="kd">func</span> <span class="nf">StartWorker</span><span class="p">(</span><span class="nx">jobs</span> <span class="o">&lt;-</span><span class="kd">chan</span> <span class="nx">Job</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">    <span class="k">go</span> <span class="kd">func</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">        <span class="k">for</span> <span class="nx">job</span> <span class="o">:=</span> <span class="k">range</span> <span class="nx">jobs</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">            <span class="nf">process</span><span class="p">(</span><span class="nx">job</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">    <span class="p">}()</span>
</span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>這個 worker 只有在 <code>jobs</code> 被關閉時才會退出。若呼叫端永遠不關閉 <code>jobs</code>，而 worker 也沒有 context，這個 goroutine 可能永久存在。</p>
<p>長期存在不一定是 leak。HTTP server accept loop、<a href="/blog/backend/knowledge-cards/metrics/" data-link-title="Metrics" data-link-desc="說明指標如何描述服務趨勢、容量與健康狀態">metrics</a> exporter、background scheduler 都可能合理存在；問題是它們是否有明確停止條件，且 shutdown 時是否真的會走到。</p>
<h2 id="判讀每個-goroutine-都要有退出原因">【判讀】每個 goroutine 都要有退出原因</h2>
<p>Goroutine lifecycle 的核心檢查是每個 goroutine 都能回答三個問題：</p>
<ol>
<li>誰要求它停止？</li>
<li>它如果卡在 channel 或 I/O，如何被喚醒？</li>
<li>它停止後如何讓測試或上層知道？</li>
</ol>
<p>若三題任一題答不出來，就有 leak 風險。</p>
<p>例如 worker 應該有 context：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kd">func</span> <span class="nf">RunWorker</span><span class="p">(</span><span class="nx">ctx</span> <span class="nx">context</span><span class="p">.</span><span class="nx">Context</span><span class="p">,</span> <span class="nx">jobs</span> <span class="o">&lt;-</span><span class="kd">chan</span> <span class="nx">Job</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="k">for</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">        <span class="k">select</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">        <span class="k">case</span> <span class="o">&lt;-</span><span class="nx">ctx</span><span class="p">.</span><span class="nf">Done</span><span class="p">():</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">            <span class="k">return</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">        <span class="k">case</span> <span class="nx">job</span><span class="p">,</span> <span class="nx">ok</span> <span class="o">:=</span> <span class="o">&lt;-</span><span class="nx">jobs</span><span class="p">:</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">            <span class="k">if</span> <span class="p">!</span><span class="nx">ok</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">                <span class="k">return</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">            <span class="nf">process</span><span class="p">(</span><span class="nx">job</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>這個 worker 有兩條退出路徑：上層取消 context，或 jobs channel 被關閉。這比只依賴 channel close 更容易整合到服務 shutdown。</p>
<h2 id="策略io-阻塞需要-deadline-或-close">【策略】I/O 阻塞需要 deadline 或 close</h2>
<p>I/O goroutine 的核心風險是 context 本身不一定能打斷底層阻塞呼叫。WebSocket read、TCP read、file watcher、外部 API call 都要確認是否支援 context、<a href="/blog/backend/knowledge-cards/deadline/" data-link-title="Deadline" data-link-desc="說明整體操作的截止時間如何沿著服務邊界傳遞">deadline</a> 或 close。</p>
<p>WebSocket read pump 常見退出方式：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kd">func</span> <span class="p">(</span><span class="nx">c</span> <span class="o">*</span><span class="nx">Client</span><span class="p">)</span> <span class="nf">readPump</span><span class="p">(</span><span class="nx">ctx</span> <span class="nx">context</span><span class="p">.</span><span class="nx">Context</span><span class="p">,</span> <span class="nx">hub</span> <span class="o">*</span><span class="nx">Hub</span><span class="p">,</span> <span class="nx">router</span> <span class="nx">MessageRouter</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="k">defer</span> <span class="kd">func</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">        <span class="nx">hub</span><span class="p">.</span><span class="nx">unregister</span> <span class="o">&lt;-</span> <span class="nx">c</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="p">}()</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="k">for</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">        <span class="kd">var</span> <span class="nx">message</span> <span class="nx">ClientMessage</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">        <span class="k">if</span> <span class="nx">err</span> <span class="o">:=</span> <span class="nx">c</span><span class="p">.</span><span class="nx">conn</span><span class="p">.</span><span class="nf">ReadJSON</span><span class="p">(</span><span class="o">&amp;</span><span class="nx">message</span><span class="p">);</span> <span class="nx">err</span> <span class="o">!=</span> <span class="kc">nil</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">            <span class="k">return</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">        <span class="nx">_</span> <span class="p">=</span> <span class="nx">router</span><span class="p">.</span><span class="nf">Route</span><span class="p">(</span><span class="nx">ctx</span><span class="p">,</span> <span class="nx">c</span><span class="p">,</span> <span class="nx">message</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>若 <code>ReadJSON</code> 卡住，context 取消不一定直接讓它返回。實務上需要 read deadline、connection close 或 heartbeat 讓 read pump 有機會退出。</p>
<h2 id="執行done-channel-讓測試能觀察退出">【執行】done channel 讓測試能觀察退出</h2>
<p>測試 goroutine 是否退出的核心問題是需要可觀察訊號。<code>done</code> channel 可以在 goroutine 結束時 close。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kd">func</span> <span class="nf">RunWorker</span><span class="p">(</span><span class="nx">ctx</span> <span class="nx">context</span><span class="p">.</span><span class="nx">Context</span><span class="p">,</span> <span class="nx">jobs</span> <span class="o">&lt;-</span><span class="kd">chan</span> <span class="nx">Job</span><span class="p">,</span> <span class="nx">done</span> <span class="kd">chan</span><span class="o">&lt;-</span> <span class="kd">struct</span><span class="p">{})</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="k">defer</span> <span class="nb">close</span><span class="p">(</span><span class="nx">done</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="k">for</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">        <span class="k">select</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">        <span class="k">case</span> <span class="o">&lt;-</span><span class="nx">ctx</span><span class="p">.</span><span class="nf">Done</span><span class="p">():</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">            <span class="k">return</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">        <span class="k">case</span> <span class="nx">job</span><span class="p">,</span> <span class="nx">ok</span> <span class="o">:=</span> <span class="o">&lt;-</span><span class="nx">jobs</span><span class="p">:</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">            <span class="k">if</span> <span class="p">!</span><span class="nx">ok</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">                <span class="k">return</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">            <span class="nf">process</span><span class="p">(</span><span class="nx">job</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>測試：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kd">func</span> <span class="nf">TestRunWorkerStops</span><span class="p">(</span><span class="nx">t</span> <span class="o">*</span><span class="nx">testing</span><span class="p">.</span><span class="nx">T</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="nx">ctx</span><span class="p">,</span> <span class="nx">cancel</span> <span class="o">:=</span> <span class="nx">context</span><span class="p">.</span><span class="nf">WithCancel</span><span class="p">(</span><span class="nx">context</span><span class="p">.</span><span class="nf">Background</span><span class="p">())</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">    <span class="nx">jobs</span> <span class="o">:=</span> <span class="nb">make</span><span class="p">(</span><span class="kd">chan</span> <span class="nx">Job</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="nx">done</span> <span class="o">:=</span> <span class="nb">make</span><span class="p">(</span><span class="kd">chan</span> <span class="kd">struct</span><span class="p">{})</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="k">go</span> <span class="nf">RunWorker</span><span class="p">(</span><span class="nx">ctx</span><span class="p">,</span> <span class="nx">jobs</span><span class="p">,</span> <span class="nx">done</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    <span class="nf">cancel</span><span class="p">()</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">    <span class="k">select</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">    <span class="k">case</span> <span class="o">&lt;-</span><span class="nx">done</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">    <span class="k">case</span> <span class="o">&lt;-</span><span class="nx">time</span><span class="p">.</span><span class="nf">After</span><span class="p">(</span><span class="nx">time</span><span class="p">.</span><span class="nx">Second</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">        <span class="nx">t</span><span class="p">.</span><span class="nf">Fatalf</span><span class="p">(</span><span class="s">&#34;worker did not stop&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p><a href="/blog/backend/knowledge-cards/timeout/" data-link-title="Timeout" data-link-desc="說明等待外部操作的時間上限如何保護資源與使用者體驗">Timeout</a> 是測試保護，不是功能本身。真正的退出訊號是 <code>done</code> 被關閉。</p>
<h2 id="執行ticker-必須停止">【執行】ticker 必須停止</h2>
<p>Ticker leak 的核心原因是建立 ticker 後沒有呼叫 <code>Stop</code>。Ticker 會持有 runtime 資源；長時間服務若反覆建立不停止，會累積不必要成本。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kd">func</span> <span class="nf">RunCleanup</span><span class="p">(</span><span class="nx">ctx</span> <span class="nx">context</span><span class="p">.</span><span class="nx">Context</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="nx">ticker</span> <span class="o">:=</span> <span class="nx">time</span><span class="p">.</span><span class="nf">NewTicker</span><span class="p">(</span><span class="nx">time</span><span class="p">.</span><span class="nx">Minute</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">    <span class="k">defer</span> <span class="nx">ticker</span><span class="p">.</span><span class="nf">Stop</span><span class="p">()</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="k">for</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">        <span class="k">select</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">        <span class="k">case</span> <span class="o">&lt;-</span><span class="nx">ctx</span><span class="p">.</span><span class="nf">Done</span><span class="p">():</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">            <span class="k">return</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">        <span class="k">case</span> <span class="o">&lt;-</span><span class="nx">ticker</span><span class="p">.</span><span class="nx">C</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">            <span class="nf">cleanup</span><span class="p">()</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p><code>defer ticker.Stop()</code> 應緊跟在成功建立 ticker 後。這樣不管函式因 context、錯誤或 channel 關閉退出，ticker 都會被停止。</p>
<p><code>time.After</code> 在一次性 timeout 很方便，但在高頻迴圈裡反覆建立 timer 可能造成額外配置。需要重複觸發時，優先使用 <code>Ticker</code> 或可重設的 <code>Timer</code> 並明確停止。</p>
<h2 id="判讀pprof-goroutine-profile-看-stack-pattern">【判讀】pprof goroutine profile 看 stack pattern</h2>
<p>Goroutine profile 的核心價值是顯示 goroutine stack。當 goroutine 數量持續上升時，先看它們卡在哪裡。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">curl <span class="s2">&#34;http://localhost:8080/debug/pprof/goroutine?debug=2&#34;</span></span></span></code></pre></div><p>常見 pattern：</p>
<table>
  <thead>
      <tr>
          <th>stack 類型</th>
          <th>可能原因</th>
          <th>回到哪個邊界</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>channel receive</td>
          <td>上游不會再送，也沒 close/context</td>
          <td>channel ownership</td>
      </tr>
      <tr>
          <td>channel send</td>
          <td>下游不再接收或 <a href="/blog/backend/knowledge-cards/buffer/" data-link-title="Buffer" data-link-desc="說明系統如何用暫存空間吸收短暫速度差與尖峰流量">buffer</a> 滿</td>
          <td><a href="/blog/backend/knowledge-cards/backpressure/" data-link-title="Backpressure" data-link-desc="說明下游處理速度不足時系統如何讓上游依下游能力送出工作">backpressure</a> / unregister</td>
      </tr>
      <tr>
          <td>network read</td>
          <td>沒有 deadline 或 connection 未 close</td>
          <td>heartbeat / I/O lifecycle</td>
      </tr>
      <tr>
          <td>ticker loop</td>
          <td>context 沒接上或 ticker 未 stop</td>
          <td>select loop lifecycle</td>
      </tr>
      <tr>
          <td>mutex lock</td>
          <td>鎖競爭或死鎖</td>
          <td>shared state owner</td>
      </tr>
  </tbody>
</table>
<p>看到 stack 後，下一步是回到對應 lifecycle 設計：誰負責停止，誰負責釋放阻塞點。</p>
<h2 id="策略websocket-pump-leak-要看-readwriteunregister-三方">【策略】WebSocket pump leak 要看 read/write/unregister 三方</h2>
<p>WebSocket goroutine leak 的核心常見原因是 read pump、write pump、hub unregister 沒有形成閉環。</p>
<p>目標流程：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl">read pump error 或 connection close
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">        │
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">        ▼
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">hub unregister
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">        │
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">        ├── close client.send
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">        └── close conn
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">        │
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">        ▼
</span></span><span class="line"><span class="ln">10</span><span class="cl">write pump exits</span></span></code></pre></div><p>若 hub 沒有 close <code>send</code>，write pump 可能一直等。若 connection 沒有 close，read pump 可能卡在 read。若 unregister 不是 idempotent，重複 close 可能 panic。</p>
<p>Goroutine profile 若顯示大量 goroutine 卡在 <code>writePump</code> 的 send receive，通常要檢查 <code>client.send</code> 是否會被 close。若卡在 <code>ReadJSON</code>，要檢查 read deadline、heartbeat 與 connection close。</p>
<h2 id="測試用-goroutine-數量做粗略回歸檢查">【測試】用 goroutine 數量做粗略回歸檢查</h2>
<p>Goroutine 數量測試的核心用途是粗略檢查是否有明顯 leak。它不是精準證明，因為 Go runtime 與測試環境本身也會有 goroutine。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="kd">func</span> <span class="nf">TestNoObviousGoroutineLeak</span><span class="p">(</span><span class="nx">t</span> <span class="o">*</span><span class="nx">testing</span><span class="p">.</span><span class="nx">T</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">    <span class="nx">before</span> <span class="o">:=</span> <span class="nx">runtime</span><span class="p">.</span><span class="nf">NumGoroutine</span><span class="p">()</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="nx">ctx</span><span class="p">,</span> <span class="nx">cancel</span> <span class="o">:=</span> <span class="nx">context</span><span class="p">.</span><span class="nf">WithCancel</span><span class="p">(</span><span class="nx">context</span><span class="p">.</span><span class="nf">Background</span><span class="p">())</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="nx">done</span> <span class="o">:=</span> <span class="nb">make</span><span class="p">(</span><span class="kd">chan</span> <span class="kd">struct</span><span class="p">{})</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="k">go</span> <span class="kd">func</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">        <span class="k">defer</span> <span class="nb">close</span><span class="p">(</span><span class="nx">done</span><span class="p">)</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">        <span class="nf">RunWorker</span><span class="p">(</span><span class="nx">ctx</span><span class="p">,</span> <span class="nb">make</span><span class="p">(</span><span class="kd">chan</span> <span class="nx">Job</span><span class="p">))</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">    <span class="p">}()</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">
</span></span><span class="line"><span class="ln">11</span><span class="cl">    <span class="nf">cancel</span><span class="p">()</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">    <span class="k">select</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">    <span class="k">case</span> <span class="o">&lt;-</span><span class="nx">done</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">    <span class="k">case</span> <span class="o">&lt;-</span><span class="nx">time</span><span class="p">.</span><span class="nf">After</span><span class="p">(</span><span class="nx">time</span><span class="p">.</span><span class="nx">Second</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">        <span class="nx">t</span><span class="p">.</span><span class="nf">Fatalf</span><span class="p">(</span><span class="s">&#34;worker did not stop&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">
</span></span><span class="line"><span class="ln">18</span><span class="cl">    <span class="nf">eventually</span><span class="p">(</span><span class="nx">t</span><span class="p">,</span> <span class="nx">time</span><span class="p">.</span><span class="nx">Second</span><span class="p">,</span> <span class="kd">func</span><span class="p">()</span> <span class="kt">bool</span> <span class="p">{</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl">        <span class="k">return</span> <span class="nx">runtime</span><span class="p">.</span><span class="nf">NumGoroutine</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="nx">before</span><span class="o">+</span><span class="mi">2</span>
</span></span><span class="line"><span class="ln">20</span><span class="cl">    <span class="p">})</span>
</span></span><span class="line"><span class="ln">21</span><span class="cl"><span class="p">}</span></span></span></code></pre></div><p>這類測試要留緩衝，避免因 runtime 或其他測試 goroutine 造成假失敗。更可靠的測試仍是等待明確 <code>done</code> 訊號。</p>
<h2 id="判讀goroutine-leak-修正要改停止路徑">【判讀】goroutine leak 修正要改停止路徑</h2>
<p>Goroutine leak 的核心修正是補上停止路徑。</p>
<p>常見修正：</p>
<ul>
<li>加入 <code>ctx.Done()</code> case。</li>
<li>關閉由自己擁有的 output channel。</li>
<li>由 coordinator 等 sender 完成再 close。</li>
<li>對 network read/write 設定 deadline。</li>
<li>shutdown 時 close connection。</li>
<li>ticker 建立後 <code>defer Stop()</code>。</li>
<li>hub unregister 時 close send channel。</li>
</ul>
<p>修正後要用測試證明退出路徑真的會發生，再用 pprof 或 goroutine count 驗證趨勢。</p>
<h2 id="本章不處理">本章不處理</h2>
<p>本章先處理 goroutine 的啟動、停止與阻塞邊界；更完整的 worker 全域治理，會在下列章節再往外延伸：</p>
<ul>
<li><a href="/blog/go-advanced/01-concurrency-patterns/channel-ownership/" data-link-title="1.1 channel ownership 與關閉責任" data-link-desc="判斷誰能送出、接收與關閉 channel">Go 進階：channel ownership 與關閉責任</a></li>
<li><a href="/blog/backend/knowledge-cards/worker-pool/" data-link-title="Worker Pool" data-link-desc="說明一組 worker 如何限制同時處理量並保護下游資源">Go 進階：bounded worker pool</a></li>
<li><a href="/blog/go-advanced/01-concurrency-patterns/select-loop/" data-link-title="1.2 select loop 的生命週期設計" data-link-desc="理解長時間運行 goroutine 如何同時處理事件、ticker 與取消">Go 進階：select loop 的生命週期設計</a></li>
</ul>
<h2 id="和-go-教材的關係">和 Go 教材的關係</h2>
<p>這一章承接的是 goroutine lifecycle、channel 與 shutdown；如果你要先回看語言教材，可以讀：</p>
<ul>
<li><a href="/blog/go/04-concurrency/goroutine/" data-link-title="4.1 goroutine：輕量並發工作" data-link-desc="用 goroutine 啟動並發工作，並設計清楚的退出條件">Go：goroutine：輕量並發工作</a></li>
<li><a href="/blog/go/04-concurrency/channel/" data-link-title="4.2 channel：資料傳遞與 backpressure " data-link-desc="理解 channel 如何在 goroutine 之間傳遞資料並形成 backpressure ">Go：channel：資料傳遞與 backpressure </a></li>
<li><a href="/blog/go/04-concurrency/select/" data-link-title="4.3 select：同時等待多種事件" data-link-desc="用 select 建立事件迴圈">Go：select：同時等待多種事件</a></li>
<li><a href="/blog/go/06-practical/new-background-worker/" data-link-title="6.4 如何新增背景工作流程" data-link-desc="接入 context、channel 與 shutdown">Go：如何新增背景工作流程</a></li>
</ul>
<h2 id="小結">小結</h2>
<p>Goroutine leak 是生命週期問題。每個長期 goroutine 都應知道誰能停止它、如何解除阻塞、如何讓測試觀察退出。Context、done channel、deadline、connection close、ticker stop 與 hub unregister 是主要工具。pprof goroutine profile 則用來確認還活著的 goroutine 卡在哪個邊界。</p>
]]></content:encoded></item><item><title>模組三：Runtime 與效能診斷</title><link>https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/</link><pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/</guid><description>&lt;p>Runtime 診斷的核心目標是用資料判斷服務壓力來源。Go 服務長時間運行後，問題常出現在 heap 成長、GC 壓力、goroutine 數量、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket&lt;/a> &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/buffer/" data-link-title="Buffer" data-link-desc="說明系統如何用暫存空間吸收短暫速度差與尖峰流量">buffer&lt;/a> 堆積、JSON 配置與共享狀態保留；診斷流程應先看趨勢，再用 profile 定位來源。&lt;/p>
&lt;p>本模組承接前面的並發、WebSocket 與測試可靠性：如果 goroutine lifecycle、send buffer、repository copy boundary 沒設計好，runtime 訊號會在 heap profile、goroutine profile、CPU profile 或 allocation profile 中反映出來。&lt;/p>
&lt;h2 id="章節列表">章節列表&lt;/h2>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>章節&lt;/th>
 &lt;th>主題&lt;/th>
 &lt;th>關鍵收穫&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/gc-memory-limit/" data-link-title="3.1 GC 與 memory limit" data-link-desc="理解 debug.SetMemoryLimit 在長時間服務中的用途">3.1&lt;/a>&lt;/td>
 &lt;td>GC 與 memory limit&lt;/td>
 &lt;td>理解 heap、GOGC、memory limit 與 runtime &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/metrics/" data-link-title="Metrics" data-link-desc="說明指標如何描述服務趨勢、容量與健康狀態">metrics&lt;/a> 的關係&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/pprof/" data-link-title="3.2 pprof 基礎診斷流程" data-link-desc="用 pprof endpoint 診斷 heap、goroutine 與 CPU 問題">3.2&lt;/a>&lt;/td>
 &lt;td>pprof 基礎診斷流程&lt;/td>
 &lt;td>用 heap、goroutine、CPU、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/trace/" data-link-title="Trace" data-link-desc="說明 trace 如何重建跨服務請求的路徑、耗時與依賴關係">trace&lt;/a> profile 定位壓力來源&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/goroutine-leak/" data-link-title="3.3 goroutine leak 偵測" data-link-desc="判斷背景工作與 client pump 是否正確退出">3.3&lt;/a>&lt;/td>
 &lt;td>goroutine leak 偵測&lt;/td>
 &lt;td>從 stack pattern 回到 context、close、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/deadline/" data-link-title="Deadline" data-link-desc="說明整體操作的截止時間如何沿著服務邊界傳遞">deadline&lt;/a> 與 ticker lifecycle&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;a href="https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/allocation/" data-link-title="3.4 資料結構與 allocation 壓力" data-link-desc="分析列表、歷史資料與 WebSocket payload 的配置成本">3.4&lt;/a>&lt;/td>
 &lt;td>資料結構與 allocation 壓力&lt;/td>
 &lt;td>區分必要 copy、安全邊界與可優化熱路徑配置&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;h2 id="本模組使用的範例主題">本模組使用的範例主題&lt;/h2>
&lt;p>本模組使用虛構的即時通知服務作為範例。範例包含 WebSocket client lifecycle、background worker、repository list、JSON push payload 與 cache。&lt;/p>
&lt;p>範例只用來展示 Go runtime 診斷方法，不假設讀者正在維護任何特定專案。&lt;/p>
&lt;h2 id="本模組的-go-核心概念">本模組的 Go 核心概念&lt;/h2>
&lt;ul>
&lt;li>用 &lt;code>runtime.ReadMemStats&lt;/code> 或 &lt;code>runtime/metrics&lt;/code> 觀察 heap、GC 與 goroutine 趨勢。&lt;/li>
&lt;li>用 &lt;code>debug.SetMemoryLimit&lt;/code> 給 runtime 軟性記憶體目標。&lt;/li>
&lt;li>用 pprof 分析 heap、goroutine、CPU、block、mutex 與 trace。&lt;/li>
&lt;li>用 goroutine profile 找出卡在 channel、network read、ticker、mutex 的路徑。&lt;/li>
&lt;li>用 &lt;code>alloc_space&lt;/code> 與 &lt;code>inuse_space&lt;/code> 區分配置壓力與保留記憶體。&lt;/li>
&lt;li>用資料結構設計降低不必要 allocation，但保留必要 copy boundary。&lt;/li>
&lt;/ul>
&lt;h2 id="學習重點">學習重點&lt;/h2>
&lt;p>學完本模組後，你應該能判斷：&lt;/p>
&lt;ol>
&lt;li>記憶體問題是 GC 壓力、長期保留，還是短暫尖峰&lt;/li>
&lt;li>什麼情境適合調整 memory limit，什麼情境應該找 leak&lt;/li>
&lt;li>heap、goroutine、CPU、trace 各自回答什麼問題&lt;/li>
&lt;li>goroutine leak 應回到哪個 lifecycle 邊界修&lt;/li>
&lt;li>allocation 優化何時值得做，何時會破壞安全邊界&lt;/li>
&lt;/ol>
&lt;h2 id="本模組不處理">本模組不處理&lt;/h2>
&lt;p>本模組不討論分散式 tracing 平台、完整監控系統或雲端特定 profiler。這些工具可以接在本模組之後；本模組先建立 Go runtime 原生訊號與 pprof 的診斷思路。後續可接 &lt;a href="https://tarrragon.github.io/blog/go-advanced/07-distributed-operations/observability-pipeline/" data-link-title="7.4 Observability pipeline、metrics 與 tracing" data-link-desc="把 structured log、metric、trace 與 profile 組成可操作的診斷系統">Observability pipeline、metrics 與 tracing&lt;/a>。&lt;/p></description><content:encoded><![CDATA[<p>Runtime 診斷的核心目標是用資料判斷服務壓力來源。Go 服務長時間運行後，問題常出現在 heap 成長、GC 壓力、goroutine 數量、<a href="/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket</a> <a href="/blog/backend/knowledge-cards/buffer/" data-link-title="Buffer" data-link-desc="說明系統如何用暫存空間吸收短暫速度差與尖峰流量">buffer</a> 堆積、JSON 配置與共享狀態保留；診斷流程應先看趨勢，再用 profile 定位來源。</p>
<p>本模組承接前面的並發、WebSocket 與測試可靠性：如果 goroutine lifecycle、send buffer、repository copy boundary 沒設計好，runtime 訊號會在 heap profile、goroutine profile、CPU profile 或 allocation profile 中反映出來。</p>
<h2 id="章節列表">章節列表</h2>
<table>
  <thead>
      <tr>
          <th>章節</th>
          <th>主題</th>
          <th>關鍵收穫</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="/blog/go-advanced/03-runtime-profiling/gc-memory-limit/" data-link-title="3.1 GC 與 memory limit" data-link-desc="理解 debug.SetMemoryLimit 在長時間服務中的用途">3.1</a></td>
          <td>GC 與 memory limit</td>
          <td>理解 heap、GOGC、memory limit 與 runtime <a href="/blog/backend/knowledge-cards/metrics/" data-link-title="Metrics" data-link-desc="說明指標如何描述服務趨勢、容量與健康狀態">metrics</a> 的關係</td>
      </tr>
      <tr>
          <td><a href="/blog/go-advanced/03-runtime-profiling/pprof/" data-link-title="3.2 pprof 基礎診斷流程" data-link-desc="用 pprof endpoint 診斷 heap、goroutine 與 CPU 問題">3.2</a></td>
          <td>pprof 基礎診斷流程</td>
          <td>用 heap、goroutine、CPU、<a href="/blog/backend/knowledge-cards/trace/" data-link-title="Trace" data-link-desc="說明 trace 如何重建跨服務請求的路徑、耗時與依賴關係">trace</a> profile 定位壓力來源</td>
      </tr>
      <tr>
          <td><a href="/blog/go-advanced/03-runtime-profiling/goroutine-leak/" data-link-title="3.3 goroutine leak 偵測" data-link-desc="判斷背景工作與 client pump 是否正確退出">3.3</a></td>
          <td>goroutine leak 偵測</td>
          <td>從 stack pattern 回到 context、close、<a href="/blog/backend/knowledge-cards/deadline/" data-link-title="Deadline" data-link-desc="說明整體操作的截止時間如何沿著服務邊界傳遞">deadline</a> 與 ticker lifecycle</td>
      </tr>
      <tr>
          <td><a href="/blog/go-advanced/03-runtime-profiling/allocation/" data-link-title="3.4 資料結構與 allocation 壓力" data-link-desc="分析列表、歷史資料與 WebSocket payload 的配置成本">3.4</a></td>
          <td>資料結構與 allocation 壓力</td>
          <td>區分必要 copy、安全邊界與可優化熱路徑配置</td>
      </tr>
  </tbody>
</table>
<h2 id="本模組使用的範例主題">本模組使用的範例主題</h2>
<p>本模組使用虛構的即時通知服務作為範例。範例包含 WebSocket client lifecycle、background worker、repository list、JSON push payload 與 cache。</p>
<p>範例只用來展示 Go runtime 診斷方法，不假設讀者正在維護任何特定專案。</p>
<h2 id="本模組的-go-核心概念">本模組的 Go 核心概念</h2>
<ul>
<li>用 <code>runtime.ReadMemStats</code> 或 <code>runtime/metrics</code> 觀察 heap、GC 與 goroutine 趨勢。</li>
<li>用 <code>debug.SetMemoryLimit</code> 給 runtime 軟性記憶體目標。</li>
<li>用 pprof 分析 heap、goroutine、CPU、block、mutex 與 trace。</li>
<li>用 goroutine profile 找出卡在 channel、network read、ticker、mutex 的路徑。</li>
<li>用 <code>alloc_space</code> 與 <code>inuse_space</code> 區分配置壓力與保留記憶體。</li>
<li>用資料結構設計降低不必要 allocation，但保留必要 copy boundary。</li>
</ul>
<h2 id="學習重點">學習重點</h2>
<p>學完本模組後，你應該能判斷：</p>
<ol>
<li>記憶體問題是 GC 壓力、長期保留，還是短暫尖峰</li>
<li>什麼情境適合調整 memory limit，什麼情境應該找 leak</li>
<li>heap、goroutine、CPU、trace 各自回答什麼問題</li>
<li>goroutine leak 應回到哪個 lifecycle 邊界修</li>
<li>allocation 優化何時值得做，何時會破壞安全邊界</li>
</ol>
<h2 id="本模組不處理">本模組不處理</h2>
<p>本模組不討論分散式 tracing 平台、完整監控系統或雲端特定 profiler。這些工具可以接在本模組之後；本模組先建立 Go runtime 原生訊號與 pprof 的診斷思路。後續可接 <a href="/blog/go-advanced/07-distributed-operations/observability-pipeline/" data-link-title="7.4 Observability pipeline、metrics 與 tracing" data-link-desc="把 structured log、metric、trace 與 profile 組成可操作的診斷系統">Observability pipeline、metrics 與 tracing</a>。</p>
<h2 id="學習時間">學習時間</h2>
<p>預計 3-4 小時</p>
]]></content:encoded></item><item><title>Go 進階指南</title><link>https://tarrragon.github.io/blog/go-advanced/</link><pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/go-advanced/</guid><description>&lt;p>本系列是接在入門教學之後的延伸路線，目標是把 Go 的並發模式、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket&lt;/a> 服務架構、runtime 診斷、狀態邊界與生產環境可觀測性講到能真正用在服務上。語法細節留在入門篇；進階篇聚焦長時間服務會遇到的設計壓力。&lt;/p>
&lt;h2 id="目標讀者">目標讀者&lt;/h2>
&lt;ul>
&lt;li>已完成 &lt;a href="https://tarrragon.github.io/blog/go/" data-link-title="Go 入門實戰指南" data-link-desc="理解 Go 語言精神與核心開發能力">Go 入門實戰指南&lt;/a> 的工程師&lt;/li>
&lt;li>想深入理解 Go 並發模型與 runtime 行為的開發者&lt;/li>
&lt;li>需要維護長時間運行服務的人&lt;/li>
&lt;li>想把 Go 服務從「能跑」提升到「可觀測、可測、可演進」的人&lt;/li>
&lt;/ul>
&lt;h2 id="學習目標">學習目標&lt;/h2>
&lt;ol>
&lt;li>掌握 goroutine、channel、mutex 的進階使用邊界&lt;/li>
&lt;li>理解 &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket&lt;/a> client lifecycle、heartbeat、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/buffer/" data-link-title="Buffer" data-link-desc="說明系統如何用暫存空間吸收短暫速度差與尖峰流量">buffer&lt;/a> 與慢客戶端問題&lt;/li>
&lt;li>使用 pprof、runtime 記憶體限制與結構化日誌診斷服務&lt;/li>
&lt;li>設計 event-driven service 的資料邊界與去重策略&lt;/li>
&lt;li>建立並發測試、整合測試與可重現的時間控制&lt;/li>
&lt;li>能評估 Go 服務在生產環境的風險與操作策略&lt;/li>
&lt;li>知道單一 Go 服務延伸到跨節點與平台整合時，哪些責任會轉移到資料庫、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/queue/" data-link-title="Queue" data-link-desc="說明 queue 如何保存等待處理的工作並形成容量邊界">queue&lt;/a>、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/broker/" data-link-title="Broker" data-link-desc="說明 broker 在訊息傳遞系統中負責保存、路由與交付訊息">broker&lt;/a>、observability pipeline 與部署平台&lt;/li>
&lt;/ol>
&lt;h2 id="共用術語">共用術語&lt;/h2>
&lt;p>進階篇延續入門篇的 action、command、domain event、repository、port、adapter、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/projection/" data-link-title="Projection" data-link-desc="說明從事件流或資料變更推算出查詢用讀取視圖的轉換機制">projection&lt;/a> 等詞彙。若你需要確認這些詞在這套教材中的責任邊界，可以先回到 &lt;a href="https://tarrragon.github.io/blog/go/glossary/" data-link-title="Go 教材核心術語" data-link-desc="整理 Go 入門與進階篇共用的架構、事件、狀態與邊界詞彙">Go 教材核心術語&lt;/a>。&lt;/p>
&lt;h2 id="與-backend-教材的分工">與 Backend 教材的分工&lt;/h2>
&lt;p>Go 進階篇處理單一 Go 服務內部的高階能力：goroutine lifecycle、WebSocket pump、runtime 診斷、event boundary、race test、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/graceful-shutdown/" data-link-title="Graceful Shutdown" data-link-desc="說明服務停止前如何排空流量、完成工作與保存狀態">graceful shutdown&lt;/a> 與 diagnostics endpoint。當內容開始碰到資料庫、Redis、RabbitMQ、Kafka、OpenTelemetry、Kubernetes 或 CI 平台操作時，就應該轉到跨語言的 &lt;a href="https://tarrragon.github.io/blog/backend/" data-link-title="Backend 服務實務指南" data-link-desc="用跨語言教學路線整理資料庫、快取、訊息佇列、觀測、部署、可靠性、資安、事故與容量等後端服務能力">Backend 服務實務指南&lt;/a>。&lt;/p>
&lt;p>模組七保留在進階篇裡，因為它要回答「Go 服務在跨出去以前，還需要先把哪些 port、訊號與測試合約準備好」。外部系統本身的選型與部署細節，則放在 Backend，讓不同語言都能共用同一套實作知識。&lt;/p>
&lt;h2 id="教學模組">教學模組&lt;/h2>
&lt;h3 id="模組一進階並發模式">&lt;a href="https://tarrragon.github.io/blog/go-advanced/01-concurrency-patterns/" data-link-title="模組一：進階並發模式" data-link-desc="channel ownership、select loop、非阻塞送出、共享狀態、worker pool 與 rate limiting">模組一：進階並發模式&lt;/a>&lt;/h3>
&lt;p>從服務實例理解 fan-in、&lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/fan-out/" data-link-title="Fan-out" data-link-desc="說明單一事件同時分發給多個下游的訊息拓撲">fan-out&lt;/a>、取消傳播與 &lt;a href="https://tarrragon.github.io/blog/backend/knowledge-cards/backpressure/" data-link-title="Backpressure" data-link-desc="說明下游處理速度不足時系統如何讓上游依下游能力送出工作">backpressure&lt;/a> ，先把並發語意說清楚。&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/01-concurrency-patterns/channel-ownership/" data-link-title="1.1 channel ownership 與關閉責任" data-link-desc="判斷誰能送出、接收與關閉 channel">channel ownership 與關閉責任&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/01-concurrency-patterns/select-loop/" data-link-title="1.2 select loop 的生命週期設計" data-link-desc="理解長時間運行 goroutine 如何同時處理事件、ticker 與取消">select loop 的生命週期設計&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/01-concurrency-patterns/non-blocking-send/" data-link-title="1.3 非阻塞送出與事件丟棄策略" data-link-desc="設計 channel 滿載時的服務行為">非阻塞送出與事件丟棄策略&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/01-concurrency-patterns/shared-state/" data-link-title="1.4 共享狀態與複製邊界" data-link-desc="用 lock 與 copy 保護長期服務的狀態資料">共享狀態與複製邊界&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/01-concurrency-patterns/worker-pool/" data-link-title="1.5 bounded worker pool" data-link-desc="限制同時執行的 goroutine 數量，讓背景工作有明確容量邊界">bounded worker pool&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/01-concurrency-patterns/rate-limit/" data-link-title="1.6 rate limiting 與 backpressure " data-link-desc="用本地速率限制與 backpressure 策略保護服務入口與下游依賴">rate limiting 與 backpressure &lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="模組二websocket-服務架構">&lt;a href="https://tarrragon.github.io/blog/go-advanced/02-networking-websocket/" data-link-title="模組二：WebSocket 服務架構" data-link-desc="WebSocket client lifecycle、heartbeat、訂閱路由與慢客戶端管理">模組二：WebSocket 服務架構&lt;/a>&lt;/h3>
&lt;p>把 WebSocket server 的連線、訂閱、推送與錯誤處理拆成可維護的邊界。&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/02-networking-websocket/read-write-pump/" data-link-title="2.1 read pump / write pump 模式" data-link-desc="分離 WebSocket 讀取、寫入與心跳">read pump / write pump 模式&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/02-networking-websocket/heartbeat-deadline/" data-link-title="2.2 heartbeat、deadline 與連線清理" data-link-desc="用 ping/pong 和 deadline 偵測失效連線">heartbeat、deadline 與連線清理&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/02-networking-websocket/subscription-routing/" data-link-title="2.3 訂閱模型與訊息路由" data-link-desc="將 client action 對應到主題訂閱狀態">訂閱模型與訊息路由&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tarrragon.github.io/blog/go-advanced/02-networking-websocket/slow-client/" data-link-title="2.4 慢客戶端與 send buffer 管理" data-link-desc="控制推送佇列與記憶體風險">慢客戶端與 send buffer 管理&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="模組三runtime-與效能診斷">&lt;a href="https://tarrragon.github.io/blog/go-advanced/03-runtime-profiling/" data-link-title="模組三：Runtime 與效能診斷" data-link-desc="GC、memory limit、pprof、goroutine leak 與 allocation 壓力">模組三：Runtime 與效能診斷&lt;/a>&lt;/h3>
&lt;p>理解 Go runtime 如何在長時間運行服務中影響記憶體與排程行為。&lt;/p></description><content:encoded><![CDATA[<p>本系列是接在入門教學之後的延伸路線，目標是把 Go 的並發模式、<a href="/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket</a> 服務架構、runtime 診斷、狀態邊界與生產環境可觀測性講到能真正用在服務上。語法細節留在入門篇；進階篇聚焦長時間服務會遇到的設計壓力。</p>
<h2 id="目標讀者">目標讀者</h2>
<ul>
<li>已完成 <a href="/blog/go/" data-link-title="Go 入門實戰指南" data-link-desc="理解 Go 語言精神與核心開發能力">Go 入門實戰指南</a> 的工程師</li>
<li>想深入理解 Go 並發模型與 runtime 行為的開發者</li>
<li>需要維護長時間運行服務的人</li>
<li>想把 Go 服務從「能跑」提升到「可觀測、可測、可演進」的人</li>
</ul>
<h2 id="學習目標">學習目標</h2>
<ol>
<li>掌握 goroutine、channel、mutex 的進階使用邊界</li>
<li>理解 <a href="/blog/backend/knowledge-cards/websocket/" data-link-title="WebSocket" data-link-desc="說明 WebSocket 如何提供長連線雙向即時通訊">WebSocket</a> client lifecycle、heartbeat、<a href="/blog/backend/knowledge-cards/buffer/" data-link-title="Buffer" data-link-desc="說明系統如何用暫存空間吸收短暫速度差與尖峰流量">buffer</a> 與慢客戶端問題</li>
<li>使用 pprof、runtime 記憶體限制與結構化日誌診斷服務</li>
<li>設計 event-driven service 的資料邊界與去重策略</li>
<li>建立並發測試、整合測試與可重現的時間控制</li>
<li>能評估 Go 服務在生產環境的風險與操作策略</li>
<li>知道單一 Go 服務延伸到跨節點與平台整合時，哪些責任會轉移到資料庫、<a href="/blog/backend/knowledge-cards/queue/" data-link-title="Queue" data-link-desc="說明 queue 如何保存等待處理的工作並形成容量邊界">queue</a>、<a href="/blog/backend/knowledge-cards/broker/" data-link-title="Broker" data-link-desc="說明 broker 在訊息傳遞系統中負責保存、路由與交付訊息">broker</a>、observability pipeline 與部署平台</li>
</ol>
<h2 id="共用術語">共用術語</h2>
<p>進階篇延續入門篇的 action、command、domain event、repository、port、adapter、<a href="/blog/backend/knowledge-cards/projection/" data-link-title="Projection" data-link-desc="說明從事件流或資料變更推算出查詢用讀取視圖的轉換機制">projection</a> 等詞彙。若你需要確認這些詞在這套教材中的責任邊界，可以先回到 <a href="/blog/go/glossary/" data-link-title="Go 教材核心術語" data-link-desc="整理 Go 入門與進階篇共用的架構、事件、狀態與邊界詞彙">Go 教材核心術語</a>。</p>
<h2 id="與-backend-教材的分工">與 Backend 教材的分工</h2>
<p>Go 進階篇處理單一 Go 服務內部的高階能力：goroutine lifecycle、WebSocket pump、runtime 診斷、event boundary、race test、<a href="/blog/backend/knowledge-cards/graceful-shutdown/" data-link-title="Graceful Shutdown" data-link-desc="說明服務停止前如何排空流量、完成工作與保存狀態">graceful shutdown</a> 與 diagnostics endpoint。當內容開始碰到資料庫、Redis、RabbitMQ、Kafka、OpenTelemetry、Kubernetes 或 CI 平台操作時，就應該轉到跨語言的 <a href="/blog/backend/" data-link-title="Backend 服務實務指南" data-link-desc="用跨語言教學路線整理資料庫、快取、訊息佇列、觀測、部署、可靠性、資安、事故與容量等後端服務能力">Backend 服務實務指南</a>。</p>
<p>模組七保留在進階篇裡，因為它要回答「Go 服務在跨出去以前，還需要先把哪些 port、訊號與測試合約準備好」。外部系統本身的選型與部署細節，則放在 Backend，讓不同語言都能共用同一套實作知識。</p>
<h2 id="教學模組">教學模組</h2>
<h3 id="模組一進階並發模式"><a href="/blog/go-advanced/01-concurrency-patterns/" data-link-title="模組一：進階並發模式" data-link-desc="channel ownership、select loop、非阻塞送出、共享狀態、worker pool 與 rate limiting">模組一：進階並發模式</a></h3>
<p>從服務實例理解 fan-in、<a href="/blog/backend/knowledge-cards/fan-out/" data-link-title="Fan-out" data-link-desc="說明單一事件同時分發給多個下游的訊息拓撲">fan-out</a>、取消傳播與 <a href="/blog/backend/knowledge-cards/backpressure/" data-link-title="Backpressure" data-link-desc="說明下游處理速度不足時系統如何讓上游依下游能力送出工作">backpressure</a> ，先把並發語意說清楚。</p>
<ul>
<li><a href="/blog/go-advanced/01-concurrency-patterns/channel-ownership/" data-link-title="1.1 channel ownership 與關閉責任" data-link-desc="判斷誰能送出、接收與關閉 channel">channel ownership 與關閉責任</a></li>
<li><a href="/blog/go-advanced/01-concurrency-patterns/select-loop/" data-link-title="1.2 select loop 的生命週期設計" data-link-desc="理解長時間運行 goroutine 如何同時處理事件、ticker 與取消">select loop 的生命週期設計</a></li>
<li><a href="/blog/go-advanced/01-concurrency-patterns/non-blocking-send/" data-link-title="1.3 非阻塞送出與事件丟棄策略" data-link-desc="設計 channel 滿載時的服務行為">非阻塞送出與事件丟棄策略</a></li>
<li><a href="/blog/go-advanced/01-concurrency-patterns/shared-state/" data-link-title="1.4 共享狀態與複製邊界" data-link-desc="用 lock 與 copy 保護長期服務的狀態資料">共享狀態與複製邊界</a></li>
<li><a href="/blog/go-advanced/01-concurrency-patterns/worker-pool/" data-link-title="1.5 bounded worker pool" data-link-desc="限制同時執行的 goroutine 數量，讓背景工作有明確容量邊界">bounded worker pool</a></li>
<li><a href="/blog/go-advanced/01-concurrency-patterns/rate-limit/" data-link-title="1.6 rate limiting 與 backpressure " data-link-desc="用本地速率限制與 backpressure 策略保護服務入口與下游依賴">rate limiting 與 backpressure </a></li>
</ul>
<h3 id="模組二websocket-服務架構"><a href="/blog/go-advanced/02-networking-websocket/" data-link-title="模組二：WebSocket 服務架構" data-link-desc="WebSocket client lifecycle、heartbeat、訂閱路由與慢客戶端管理">模組二：WebSocket 服務架構</a></h3>
<p>把 WebSocket server 的連線、訂閱、推送與錯誤處理拆成可維護的邊界。</p>
<ul>
<li><a href="/blog/go-advanced/02-networking-websocket/read-write-pump/" data-link-title="2.1 read pump / write pump 模式" data-link-desc="分離 WebSocket 讀取、寫入與心跳">read pump / write pump 模式</a></li>
<li><a href="/blog/go-advanced/02-networking-websocket/heartbeat-deadline/" data-link-title="2.2 heartbeat、deadline 與連線清理" data-link-desc="用 ping/pong 和 deadline 偵測失效連線">heartbeat、deadline 與連線清理</a></li>
<li><a href="/blog/go-advanced/02-networking-websocket/subscription-routing/" data-link-title="2.3 訂閱模型與訊息路由" data-link-desc="將 client action 對應到主題訂閱狀態">訂閱模型與訊息路由</a></li>
<li><a href="/blog/go-advanced/02-networking-websocket/slow-client/" data-link-title="2.4 慢客戶端與 send buffer 管理" data-link-desc="控制推送佇列與記憶體風險">慢客戶端與 send buffer 管理</a></li>
</ul>
<h3 id="模組三runtime-與效能診斷"><a href="/blog/go-advanced/03-runtime-profiling/" data-link-title="模組三：Runtime 與效能診斷" data-link-desc="GC、memory limit、pprof、goroutine leak 與 allocation 壓力">模組三：Runtime 與效能診斷</a></h3>
<p>理解 Go runtime 如何在長時間運行服務中影響記憶體與排程行為。</p>
<ul>
<li><a href="/blog/go-advanced/03-runtime-profiling/gc-memory-limit/" data-link-title="3.1 GC 與 memory limit" data-link-desc="理解 debug.SetMemoryLimit 在長時間服務中的用途">GC 與 memory limit</a></li>
<li><a href="/blog/go-advanced/03-runtime-profiling/pprof/" data-link-title="3.2 pprof 基礎診斷流程" data-link-desc="用 pprof endpoint 診斷 heap、goroutine 與 CPU 問題">pprof 基礎診斷流程</a></li>
<li><a href="/blog/go-advanced/03-runtime-profiling/goroutine-leak/" data-link-title="3.3 goroutine leak 偵測" data-link-desc="判斷背景工作與 client pump 是否正確退出">goroutine leak 偵測</a></li>
<li><a href="/blog/go-advanced/03-runtime-profiling/allocation/" data-link-title="3.4 資料結構與 allocation 壓力" data-link-desc="分析列表、歷史資料與 WebSocket payload 的配置成本">資料結構與 allocation 壓力</a></li>
</ul>
<h3 id="模組四架構邊界與事件系統"><a href="/blog/go-advanced/04-architecture-boundaries/" data-link-title="模組四：架構邊界與事件系統" data-link-desc="用事件驅動架構拆解事件來源、處理流程、狀態邊界與即時推送">模組四：架構邊界與事件系統</a></h3>
<p>用事件驅動架構拆解服務責任，讓來源、處理與狀態不再混在一起。</p>
<ul>
<li><a href="/blog/go-advanced/04-architecture-boundaries/component-boundaries/" data-link-title="4.1 事件來源、處理流程與狀態邊界" data-link-desc="分辨事件來源、事件融合、處理流程、狀態真相與推送邊界">事件來源、處理流程與狀態邊界</a></li>
<li><a href="/blog/go-advanced/04-architecture-boundaries/dedup-key/" data-link-title="4.2 事件去重與語義鍵設計" data-link-desc="用 entity ID、event type、來源語意與時間窗口建立去重鍵">事件去重與語義鍵設計</a></li>
<li><a href="/blog/go-advanced/04-architecture-boundaries/source-of-truth/" data-link-title="4.3 Source of Truth：狀態邊界" data-link-desc="集中狀態更新、保護可變資料、設計查詢 projection">Source of Truth：狀態邊界</a></li>
<li><a href="/blog/go-advanced/04-architecture-boundaries/event-fusion/" data-link-title="4.4 多來源 event 融合" data-link-desc="合併 HTTP、queue、timer 與外部事件來源">多來源 event 融合</a></li>
</ul>
<h3 id="模組五測試與可靠性"><a href="/blog/go-advanced/05-testing-reliability/" data-link-title="模組五：測試與可靠性" data-link-desc="時間控制、WebSocket integration test、race check 與 table-driven test">模組五：測試與可靠性</a></h3>
<p>針對並發服務建立能真正揭露風險的測試，而不是只追求覆蓋率。</p>
<ul>
<li><a href="/blog/go-advanced/05-testing-reliability/time-control/" data-link-title="5.1 時間注入與狀態轉移測試" data-link-desc="讓時間相關邏輯可重現">時間注入與狀態轉移測試</a></li>
<li><a href="/blog/go-advanced/05-testing-reliability/websocket-integration/" data-link-title="5.2 WebSocket integration test" data-link-desc="驗證 client/server 實際互動">WebSocket integration test</a></li>
<li><a href="/blog/go-advanced/05-testing-reliability/race-check/" data-link-title="5.3 race condition 檢查" data-link-desc="用 go test -race 找資料競爭">race condition 檢查</a></li>
<li><a href="/blog/go-advanced/05-testing-reliability/table-tests/" data-link-title="5.4 table-driven test 的設計邊界" data-link-desc="避免測試資料混雜太多概念">table-driven test 的設計邊界</a></li>
</ul>
<h3 id="模組六生產操作"><a href="/blog/go-advanced/06-production-operations/" data-link-title="模組六：生產操作" data-link-desc="graceful shutdown、健康檢查、結構化日誌與 feature gate">模組六：生產操作</a></h3>
<p>把本地服務推向可維護、可診斷、可部署的操作狀態。</p>
<ul>
<li><a href="/blog/go-advanced/06-production-operations/graceful-shutdown/" data-link-title="6.1 graceful shutdown 與 signal handling" data-link-desc="用 signal 與 context 傳遞停止訊號">graceful shutdown 與 signal handling</a></li>
<li><a href="/blog/go-advanced/06-production-operations/health-diagnostics/" data-link-title="6.2 健康檢查與診斷 endpoint" data-link-desc="區分服務可用性與工程診斷入口">健康檢查與診斷 endpoint</a></li>
<li><a href="/blog/go-advanced/06-production-operations/log-fields/" data-link-title="6.3 結構化日誌欄位設計" data-link-desc="讓 log 可 grep、可聚合、可追蹤">結構化日誌欄位設計</a></li>
<li><a href="/blog/go-advanced/06-production-operations/feature-gate/" data-link-title="6.4 版本偵測與 feature gate" data-link-desc="依版本與環境能力啟用功能">版本偵測與 feature gate</a></li>
</ul>
<h3 id="模組七跨節點與平台整合"><a href="/blog/go-advanced/07-distributed-operations/" data-link-title="模組七：跨節點與平台整合" data-link-desc="把單一 Go 服務延伸到資料庫、queue、跨節點 WebSocket、可觀測性與部署平台">模組七：跨節點與平台整合</a></h3>
<p>承接各章「本章不處理」的延伸邊界，把單一服務往外擴張時必須補上的責任整理成一條清楚路線。</p>
<ul>
<li><a href="/blog/go-advanced/07-distributed-operations/database-transactions/" data-link-title="7.1 資料庫 transaction 與 schema migration" data-link-desc="把 repository 邊界延伸到資料庫交易、migration 與一致性語意">資料庫 transaction 與 schema migration</a></li>
<li><a href="/blog/go-advanced/07-distributed-operations/outbox-idempotency/" data-link-title="7.2 Durable queue、outbox 與 idempotency" data-link-desc="設計跨 process 事件傳遞的可靠性與去重邊界">Durable queue、outbox 與 idempotency</a></li>
<li><a href="/blog/go-advanced/07-distributed-operations/cross-node-websocket/" data-link-title="7.3 跨節點 WebSocket、presence 與重連協定" data-link-desc="把單一 server 的 WebSocket hub 擴展到多節點推送與連線狀態">跨節點 WebSocket、presence 與重連協定</a></li>
<li><a href="/blog/go-advanced/07-distributed-operations/observability-pipeline/" data-link-title="7.4 Observability pipeline、metrics 與 tracing" data-link-desc="把 structured log、metric、trace 與 profile 組成可操作的診斷系統">Observability pipeline、metrics 與 tracing</a></li>
<li><a href="/blog/go-advanced/07-distributed-operations/deployment-contracts/" data-link-title="7.5 Kubernetes、systemd 與 load balancer 合約" data-link-desc="理解部署平台如何影響 Go 服務的 shutdown、health 與資源限制">Kubernetes、systemd 與 load balancer 合約</a></li>
<li><a href="/blog/go-advanced/07-distributed-operations/reliability-pipeline/" data-link-title="7.6 CI、fuzz、load test 與 chaos testing" data-link-desc="把單元測試與整合測試擴展成服務可靠性驗證流程">CI、fuzz、load test 與 chaos testing</a></li>
</ul>
<h2 id="學習路徑">學習路徑</h2>
<h3 id="路徑-a並發服務維護者">路徑 A：並發服務維護者</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">模組一 → 模組四 → 模組五</span></span></code></pre></div><p>重點：事件流、共享狀態、並發測試。</p>
<h3 id="路徑-bwebsocketapi-開發者">路徑 B：WebSocket/API 開發者</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">模組二 → 模組六 → 模組五</span></span></code></pre></div><p>重點：連線生命週期、訊息路由、操作診斷。</p>
<h3 id="路徑-c效能與可靠性工程師">路徑 C：效能與可靠性工程師</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">模組三 → 模組五 → 模組六</span></span></code></pre></div><p>重點：pprof、goroutine leak、race check、服務操作。</p>
<h3 id="路徑-d完整學習">路徑 D：完整學習</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">模組一 → 模組二 → 模組三 → 模組四 → 模組五 → 模組六 → 模組七</span></span></code></pre></div><p>按順序學習，建立完整的 Go 長時間運行服務模型。</p>
<h2 id="主題延伸地圖">主題延伸地圖</h2>
<p>進階篇的章節會反覆碰到 <a href="/blog/backend/knowledge-cards/log/" data-link-title="Log" data-link-desc="說明 log 如何記錄單一事件的上下文並支援事故排查">log</a>、time、state、event、WebSocket 與 testing。這些主題會在不同服務壓力下承擔不同責任；主題延伸地圖用來幫讀者辨識每一層的分工。</p>
<table>
  <thead>
      <tr>
          <th>主題</th>
          <th>單一 process 內的設計</th>
          <th>生產操作</th>
          <th>跨節點邊界</th>
          <th>Backend 實作</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>並發與容量</td>
          <td><a href="/blog/go-advanced/01-concurrency-patterns/channel-ownership/" data-link-title="1.1 channel ownership 與關閉責任" data-link-desc="判斷誰能送出、接收與關閉 channel">channel ownership</a>、<a href="/blog/go-advanced/01-concurrency-patterns/select-loop/" data-link-title="1.2 select loop 的生命週期設計" data-link-desc="理解長時間運行 goroutine 如何同時處理事件、ticker 與取消">select loop</a>、<a href="/blog/go-advanced/01-concurrency-patterns/non-blocking-send/" data-link-title="1.3 非阻塞送出與事件丟棄策略" data-link-desc="設計 channel 滿載時的服務行為">非阻塞送出</a></td>
          <td><a href="/blog/go-advanced/05-testing-reliability/race-check/" data-link-title="5.3 race condition 檢查" data-link-desc="用 go test -race 找資料競爭">race condition 檢查</a>、<a href="/blog/go-advanced/06-production-operations/graceful-shutdown/" data-link-title="6.1 graceful shutdown 與 signal handling" data-link-desc="用 signal 與 context 傳遞停止訊號">graceful shutdown</a></td>
          <td><a href="/blog/go-advanced/07-distributed-operations/reliability-pipeline/" data-link-title="7.6 CI、fuzz、load test 與 chaos testing" data-link-desc="把單元測試與整合測試擴展成服務可靠性驗證流程">可靠性驗證流程</a></td>
          <td><a href="/blog/backend/06-reliability/" data-link-title="模組六：可靠性驗證流程" data-link-desc="用 SRE 領域詞彙建問題節點、以服務級案例庫累積驗證脈絡，先建概念與案例庫再進實作交接">Backend：可靠性驗證</a></td>
      </tr>
      <tr>
          <td>WebSocket</td>
          <td><a href="/blog/go-advanced/02-networking-websocket/read-write-pump/" data-link-title="2.1 read pump / write pump 模式" data-link-desc="分離 WebSocket 讀取、寫入與心跳">read/write pump</a>、<a href="/blog/go-advanced/02-networking-websocket/heartbeat-deadline/" data-link-title="2.2 heartbeat、deadline 與連線清理" data-link-desc="用 ping/pong 和 deadline 偵測失效連線">heartbeat</a>、<a href="/blog/go-advanced/02-networking-websocket/slow-client/" data-link-title="2.4 慢客戶端與 send buffer 管理" data-link-desc="控制推送佇列與記憶體風險">慢客戶端</a></td>
          <td><a href="/blog/go-advanced/05-testing-reliability/websocket-integration/" data-link-title="5.2 WebSocket integration test" data-link-desc="驗證 client/server 實際互動">WebSocket integration test</a>、<a href="/blog/go-advanced/06-production-operations/health-diagnostics/" data-link-title="6.2 健康檢查與診斷 endpoint" data-link-desc="區分服務可用性與工程診斷入口">health diagnostics</a></td>
          <td><a href="/blog/go-advanced/07-distributed-operations/cross-node-websocket/" data-link-title="7.3 跨節點 WebSocket、presence 與重連協定" data-link-desc="把單一 server 的 WebSocket hub 擴展到多節點推送與連線狀態">跨節點 WebSocket</a></td>
          <td><a href="/blog/backend/02-cache-redis/" data-link-title="模組二：快取與 Redis" data-link-desc="整理快取策略、Redis 資料型別與分散式狀態輔助能力">Backend：快取與 Redis</a>、<a href="/blog/backend/03-message-queue/" data-link-title="模組三：訊息佇列與事件傳遞" data-link-desc="整理 durable queue、broker、retry、outbox 與 idempotency 的後端實務">訊息佇列</a></td>
      </tr>
      <tr>
          <td>Runtime 診斷</td>
          <td><a href="/blog/go-advanced/03-runtime-profiling/gc-memory-limit/" data-link-title="3.1 GC 與 memory limit" data-link-desc="理解 debug.SetMemoryLimit 在長時間服務中的用途">GC 與 memory limit</a>、<a href="/blog/go-advanced/03-runtime-profiling/pprof/" data-link-title="3.2 pprof 基礎診斷流程" data-link-desc="用 pprof endpoint 診斷 heap、goroutine 與 CPU 問題">pprof</a>、<a href="/blog/go-advanced/03-runtime-profiling/goroutine-leak/" data-link-title="3.3 goroutine leak 偵測" data-link-desc="判斷背景工作與 client pump 是否正確退出">goroutine leak</a></td>
          <td><a href="/blog/go-advanced/06-production-operations/health-diagnostics/" data-link-title="6.2 健康檢查與診斷 endpoint" data-link-desc="區分服務可用性與工程診斷入口">diagnostics endpoint</a></td>
          <td><a href="/blog/go-advanced/07-distributed-operations/observability-pipeline/" data-link-title="7.4 Observability pipeline、metrics 與 tracing" data-link-desc="把 structured log、metric、trace 與 profile 組成可操作的診斷系統">Observability pipeline</a>、<a href="/blog/go-advanced/07-distributed-operations/deployment-contracts/" data-link-title="7.5 Kubernetes、systemd 與 load balancer 合約" data-link-desc="理解部署平台如何影響 Go 服務的 shutdown、health 與資源限制">部署平台合約</a></td>
          <td><a href="/blog/backend/04-observability/" data-link-title="模組四：可觀測性平台" data-link-desc="整理 log、metric、trace、dashboard 與 alert 的後端操作實務">Backend：可觀測性平台</a>、<a href="/blog/backend/05-deployment-platform/" data-link-title="模組五：部署平台與網路入口" data-link-desc="整理 Kubernetes、systemd、load balancer、container 與服務生命週期合約">部署平台</a></td>
      </tr>
      <tr>
          <td>事件與狀態</td>
          <td><a href="/blog/go-advanced/04-architecture-boundaries/component-boundaries/" data-link-title="4.1 事件來源、處理流程與狀態邊界" data-link-desc="分辨事件來源、事件融合、處理流程、狀態真相與推送邊界">component boundaries</a>、<a href="/blog/go-advanced/04-architecture-boundaries/source-of-truth/" data-link-title="4.3 Source of Truth：狀態邊界" data-link-desc="集中狀態更新、保護可變資料、設計查詢 projection">source of truth</a>、<a href="/blog/go-advanced/04-architecture-boundaries/event-fusion/" data-link-title="4.4 多來源 event 融合" data-link-desc="合併 HTTP、queue、timer 與外部事件來源">event fusion</a></td>
          <td><a href="/blog/go-advanced/06-production-operations/log-fields/" data-link-title="6.3 結構化日誌欄位設計" data-link-desc="讓 log 可 grep、可聚合、可追蹤">結構化日誌欄位</a></td>
          <td><a href="/blog/go-advanced/07-distributed-operations/outbox-idempotency/" data-link-title="7.2 Durable queue、outbox 與 idempotency" data-link-desc="設計跨 process 事件傳遞的可靠性與去重邊界">outbox 與 idempotency</a>、<a href="/blog/go-advanced/07-distributed-operations/database-transactions/" data-link-title="7.1 資料庫 transaction 與 schema migration" data-link-desc="把 repository 邊界延伸到資料庫交易、migration 與一致性語意">資料庫 transaction</a></td>
          <td><a href="/blog/backend/01-database/" data-link-title="模組一：資料庫與持久化" data-link-desc="整理 SQL、transaction、migration 與 repository adapter 的後端實務">Backend：資料庫</a>、<a href="/blog/backend/03-message-queue/" data-link-title="模組三：訊息佇列與事件傳遞" data-link-desc="整理 durable queue、broker、retry、outbox 與 idempotency 的後端實務">訊息佇列</a></td>
      </tr>
      <tr>
          <td>測試分層</td>
          <td><a href="/blog/go-advanced/05-testing-reliability/time-control/" data-link-title="5.1 時間注入與狀態轉移測試" data-link-desc="讓時間相關邏輯可重現">時間控制</a>、<a href="/blog/go-advanced/05-testing-reliability/table-tests/" data-link-title="5.4 table-driven test 的設計邊界" data-link-desc="避免測試資料混雜太多概念">table-driven test</a></td>
          <td><a href="/blog/go-advanced/05-testing-reliability/race-check/" data-link-title="5.3 race condition 檢查" data-link-desc="用 go test -race 找資料競爭">race check</a>、<a href="/blog/go-advanced/05-testing-reliability/websocket-integration/" data-link-title="5.2 WebSocket integration test" data-link-desc="驗證 client/server 實際互動">integration test</a></td>
          <td><a href="/blog/go-advanced/07-distributed-operations/reliability-pipeline/" data-link-title="7.6 CI、fuzz、load test 與 chaos testing" data-link-desc="把單元測試與整合測試擴展成服務可靠性驗證流程">可靠性驗證流程</a></td>
          <td><a href="/blog/backend/06-reliability/" data-link-title="模組六：可靠性驗證流程" data-link-desc="用 SRE 領域詞彙建問題節點、以服務級案例庫累積驗證脈絡，先建概念與案例庫再進實作交接">Backend：可靠性驗證</a></td>
      </tr>
  </tbody>
</table>
<h2 id="先備知識">先備知識</h2>
<p>本系列假設你已經完成 <a href="/blog/go/" data-link-title="Go 入門實戰指南" data-link-desc="理解 Go 語言精神與核心開發能力">Go 入門實戰指南</a> 的基礎部分，因為下面這些章節會直接沿用那些概念：</p>
<ul>
<li><a href="/blog/go/03-stdlib/" data-link-title="模組三：標準庫實戰" data-link-desc="使用 fmt、time、encoding/json、net/http、log/slog、context、defer、flag 與 os/env 解決實務問題">模組三：標準庫實戰</a></li>
<li><a href="/blog/go/04-concurrency/" data-link-title="模組四：並發模型" data-link-desc="從 goroutine、channel、select 與 RWMutex 理解 Go 並發模型">模組四：並發模型</a></li>
<li><a href="/blog/go/05-error-testing/" data-link-title="模組五：錯誤處理與測試" data-link-desc="用明確錯誤路徑、testing、table-driven test 與時間注入驗證 Go 程式">模組五：錯誤處理與測試</a></li>
</ul>
<h2 id="每章結構">每章結構</h2>
<p>每章都採用「由淺到深」的結構，先說明問題，再切到設計與實作：</p>
<ol>
<li><strong>原理層</strong>：這個機制解決什麼問題</li>
<li><strong>設計層</strong>：在服務架構中如何切責任</li>
<li><strong>實作層</strong>：用簡化範例程式碼看具體做法</li>
<li><strong>實戰檢查</strong>：維護時要確認哪些風險</li>
</ol>
<hr>
<p><em>文件版本：v0.1.0</em>
<em>最後更新：2026-04-22</em>
<em>系列狀態：核心初稿完成，延伸模組規劃中</em></p>
]]></content:encoded></item></channel></rss>