<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Dns on Tarragon</title><link>https://tarrragon.github.io/blog/tags/dns/</link><description>Recent content in Dns on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Thu, 02 Jul 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/dns/index.xml" rel="self" type="application/rss+xml"/><item><title>Install-Time Package and Network Troubleshooting: pacman / DNS / mirror / keyring</title><link>https://tarrragon.github.io/blog/linux/install/package-and-network-troubleshooting/</link><pubDate>Thu, 02 Jul 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/linux/install/package-and-network-troubleshooting/</guid><description>&lt;p>After the OS is installed, the first run of the package manager to fetch what bootstrap needs most often hits one class of failure: packages will not install. The first diagnostic step is to split it into two entirely different layers: &lt;strong>cannot connect (network / DNS / mirror)&lt;/strong>, or &lt;strong>connected but rejected (the package manager's own state)&lt;/strong>. The two layers differ in their checking tools, root causes, and fixes, so identify the layer before digging further; otherwise you end up applying a DNS fix to an expired signature. This article uses Arch's &lt;code>pacman&lt;/code> as the main case (pitfalls hit while testing in this series' VMs); the same concepts map onto other distributions' package managers.&lt;/p>
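&lt;h2 id="第一步分連不到還是連得到但被拒">Step 1: "cannot connect" or "connected but rejected"?&lt;/h2>
&lt;p>The error message alone is enough to pick the layer, no guessing needed:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>The message mentions an unresolvable hostname, a connection timeout, or a failure retrieving a file&lt;/strong> → cannot connect; check network / DNS / mirror.&lt;/li>
&lt;li>&lt;strong>The message mentions database lock, signature, trust, conflicting, or partial&lt;/strong> → the mirror was reached and answered; the problem is the package manager's state.&lt;/li>
&lt;/ul>
&lt;p>The test is one question: "Did it actually reach the mirror?" Signatures, dependencies, and db state only come into play once it has; if it never connected, none of them are relevant yet. On a freshly installed minimal system the former is the usual case: the network configuration is not in place yet.&lt;/p>
&lt;h2 id="連不到那層從實體介面往上查到域名">The "cannot connect" layer: from the physical interface up to the domain name&lt;/h2>
&lt;p>A broken network can fail at several layers. Check each one from the bottom up and the broken layer becomes obvious. This chain shares its origin with the network checks in &lt;a href="../minimal-install-verify/">verification after a minimal install&lt;/a>; here the focus is the symptom "fetching packages fails":&lt;/p>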





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">ip -brief a &lt;span class="c1"># 1. Got an IP? Interface is UP and has an address&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">ping -c1 8.8.8.8 &lt;span class="c1"># 2. Is the IP layer reachable? (hits an IP directly, bypassing DNS)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">getent hosts archlinux.org &lt;span class="c1"># 3. Does the domain name resolve?&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">timedatectl &lt;span class="c1"># 4. Is the clock right? (affects signature verification in the next layer)&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Step 2 passes, step 3 fails = a DNS problem.&lt;/strong> This is where a minimal install most typically lands: the IP layer works (&lt;code>ping 8.8.8.8&lt;/code> gets a reply) but names do not resolve because &lt;code>/etc/resolv.conf&lt;/code> has no nameserver yet. pacman then stalls resolving the mirror hostname. The fix is to give the system a resolver; as a stopgap, write &lt;code>nameserver 1.1.1.1&lt;/code> straight into &lt;code>/etc/resolv.conf&lt;/code>. First check what the file is (&lt;code>ls -l /etc/resolv.conf&lt;/code>): on systems running &lt;code>systemd-resolved&lt;/code> or NetworkManager it is a symlink (or a file) managed by those services, hand edits get overwritten, and the lasting fix is to set DNS through that network management service; on a bare Arch minimal install with none of these services enabled it is usually a plain file, and a hand edit persists.&lt;/p>
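&lt;p>&lt;strong>Mirror timeout / fetch failure&lt;/strong>: DNS works, but a particular mirror is slow or down. Switch &lt;code>/etc/pacman.d/mirrorlist&lt;/code> to a geographically close, fast mirror (in testing, speeds differed several-fold between mirrors). This ties back to the mirror choice in &lt;a href="../install-option-decisions/">reading install options&lt;/a>: pick the wrong mirror at install time and this is where it turns slow.&lt;/p>
&lt;h2 id="連得到但被拒那層pacman-自己的狀態">The "connected but rejected" layer: pacman's own state&lt;/h2>
&lt;p>If the mirror was reached and answered but the operation still fails, the problem lies in pacman's local state or in signature verification. Each case has a clear symptom and fix:&lt;/p>
&lt;h3 id="database-lock上次沒清乾淨的殘留">database lock: leftovers from an unclean exit&lt;/h3>
&lt;p>&lt;code>error: failed to init transaction (unable to lock database)&lt;/code>. pacman uses the lock file &lt;code>/var/lib/pacman/db.lck&lt;/code> to guarantee that only one pacman touches the database at a time; if the previous pacman run was interrupted (power loss, Ctrl+C, crash) before clearing the lock, the file stays behind. &lt;strong>First confirm that no pacman is actually running&lt;/strong> (&lt;code>pgrep -x pacman&lt;/code>), and only then delete the lock file:&lt;/p>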





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">pgrep -x pacman &lt;span class="o">&amp;amp;&amp;amp;&lt;/span> &lt;span class="nb">echo&lt;/span> &lt;span class="s2">&amp;#34;pacman is running, do not delete&amp;#34;&lt;/span> &lt;span class="o">||&lt;/span> sudo rm /var/lib/pacman/db.lck&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The check-then-delete order matters: if you delete the lock blindly while another pacman really is running, two processes writing the database at once will corrupt it.&lt;/p>
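&lt;h3 id="簽章--keyring-過期十之八九是時間不對">Signature / keyring expiry: nine times out of ten the clock is wrong&lt;/h3>
&lt;p>&lt;code>invalid or corrupted package (PGP signature)&lt;/code> or &lt;code>signature is unknown trust&lt;/code>. pacman verifies the GPG signature of every package, and the most common root cause of a verification failure is a &lt;strong>wrong system clock&lt;/strong>; this is why the check chain above includes &lt;code>timedatectl&lt;/code> as its fourth step. A large clock offset (a freshly installed VM, an old machine with a dead motherboard battery) skews the validity-period check, so a perfectly valid signature is judged expired. Sync the clock first:&lt;/p>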





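&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">sudo timedatectl set-ntp &lt;span class="nb">true&lt;/span> &lt;span class="c1"># Enable NTP time sync (over SSH on a minimal system there is no interactive polkit agent, so it is refused without sudo)&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Only if it still fails with a correct clock is the keyring itself the problem (archlinux-keyring too old): update it with &lt;code>sudo pacman -Sy archlinux-keyring&lt;/code>, and if necessary run &lt;code>sudo pacman-key --refresh-keys&lt;/code>. Follow the keyring update with a full &lt;code>sudo pacman -Su&lt;/code> right away so the &lt;code>-Sy&lt;/code> does not leave the system partially upgraded. The order is clock first, keyring second, because with a wrong clock even the keyring update fails.&lt;/p>
&lt;h3 id="partial-upgrade只同步不升級造成的相依斷裂">partial upgrade: broken dependencies from syncing without upgrading&lt;/h3>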
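&lt;p>&lt;code>conflicting dependencies&lt;/code>, or the system misbehaves after installing some package. The root cause is running only &lt;code>pacman -Sy&lt;/code> (sync the database) on a rolling distribution and then installing a new package without &lt;code>-u&lt;/code> (upgrade existing packages): the new package depends on newer libraries, the system still has the old ones, and the dependencies no longer line up. Arch supports only full upgrades: &lt;strong>always use &lt;code>pacman -Syu&lt;/code>, and never install anything after a bare &lt;code>-Sy&lt;/code>&lt;/strong>. This one rule eliminates the whole class of failures.&lt;/p></description><content:encoded><![CDATA[<p>After the OS is installed, the first run of the package manager to fetch what bootstrap needs most often hits one class of failure: packages will not install. The first diagnostic step is to split it into two entirely different layers: <strong>cannot connect (network / DNS / mirror)</strong>, or <strong>connected but rejected (the package manager's own state)</strong>. The two layers differ in their checking tools, root causes, and fixes, so identify the layer before digging further; otherwise you end up applying a DNS fix to an expired signature. This article uses Arch's <code>pacman</code> as the main case (pitfalls hit while testing in this series' VMs); the same concepts map onto other distributions' package managers.</p>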
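<h2 id="第一步分連不到還是連得到但被拒">Step 1: "cannot connect" or "connected but rejected"?</h2>
<p>The error message alone is enough to pick the layer, no guessing needed:</p>
<ul>
<li><strong>The message mentions an unresolvable hostname, a connection timeout, or a failure retrieving a file</strong> → cannot connect; check network / DNS / mirror (a 404 that persists across mirrors is the exception; see the stale db section).</li>
<li><strong>The message mentions database lock, signature, trust, conflicting, or partial</strong> → the mirror was reached and answered; the problem is the package manager's state.</li>
</ul>
<p>The test is one question: "Did it actually reach the mirror?" Signatures, dependencies, and db state only come into play once it has; if it never connected, none of them are relevant yet. On a freshly installed minimal system the former is the usual case: the network configuration is not in place yet.</p>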
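<p>The layer split above can be sketched as a small shell function. This is a minimal illustration, not part of pacman: the function name <code>classify_pacman_error</code> and the keyword lists are assumptions chosen to mirror the two bullets, and real error text may need more patterns.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: map one pacman error line to a layer.
# Prints "network", "pacman-state", or "unknown".
classify_pacman_error() {
  local msg
  # Lowercase the message so the patterns below are case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    # A 404 means the mirror answered; the article treats it as a stale db.
    *"404"*)
      echo "pacman-state" ;;
    *"could not resolve host"*|*"timed out"*|*"timeout"*|*"failed retrieving file"*)
      echo "network" ;;
    *"unable to lock database"*|*"signature"*|*"trust"*|*"conflicting"*)
      echo "pacman-state" ;;
    *)
      echo "unknown" ;;
  esac
}
```

<p>Usage: <code>classify_pacman_error "$(sudo pacman -Syu 2>&amp;1 | grep '^error' | head -n1)"</code>. The 404 pattern is checked first on purpose, so that a "failed retrieving file" line ending in 404 is not misread as a connectivity problem.</p>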
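<h2 id="連不到那層從實體介面往上查到域名">The "cannot connect" layer: from the physical interface up to the domain name</h2>
<p>A broken network can fail at several layers. Check each one from the bottom up and the broken layer becomes obvious. This chain shares its origin with the network checks in <a href="../minimal-install-verify/">verification after a minimal install</a>; here the focus is the symptom "fetching packages fails":</p>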





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">ip -brief a              <span class="c1"># 1. Got an IP? Interface is UP and has an address</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">ping -c1 8.8.8.8         <span class="c1"># 2. Is the IP layer reachable? (hits an IP directly, bypassing DNS)</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">getent hosts archlinux.org   <span class="c1"># 3. Does the domain name resolve?</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">timedatectl              <span class="c1"># 4. Is the clock right? (affects signature verification in the next layer)</span></span></span></code></pre></div><p><strong>Step 2 passes, step 3 fails = a DNS problem.</strong> This is where a minimal install most typically lands: the IP layer works (<code>ping 8.8.8.8</code> gets a reply) but names do not resolve because <code>/etc/resolv.conf</code> has no nameserver yet. pacman then stalls resolving the mirror hostname. The fix is to give the system a resolver; as a stopgap, write <code>nameserver 1.1.1.1</code> straight into <code>/etc/resolv.conf</code>. First check what the file is (<code>ls -l /etc/resolv.conf</code>): on systems running <code>systemd-resolved</code> or NetworkManager it is a symlink (or a file) managed by those services, hand edits get overwritten, and the lasting fix is to set DNS through that network management service; on a bare Arch minimal install with none of these services enabled it is usually a plain file, and a hand edit persists.</p>
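<p>The "check what the file is before editing it" step can be sketched as a function. The name <code>resolv_conf_kind</code> is hypothetical and the path argument exists only so the function can be tried on a scratch file; it reports the file type and does not detect which service manages it.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether resolv.conf is a symlink, a plain
# file, or missing. Defaults to /etc/resolv.conf when no path is given.
resolv_conf_kind() {
  local path="${1:-/etc/resolv.conf}"
  if [ -L "$path" ]; then
    # A symlink usually means a network service owns the file.
    echo "symlink -> $(readlink "$path")"
  elif [ -f "$path" ]; then
    echo "regular file"
  else
    echo "missing"
  fi
}
```

<p>Usage: run <code>resolv_conf_kind</code> with no argument. A "symlink" answer suggests configuring DNS through the managing service; "regular file" suggests a hand edit may persist, though a service such as NetworkManager can still rewrite a plain file.</p>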
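<p><strong>Mirror timeout / fetch failure</strong>: DNS works, but a particular mirror is slow or down. Switch <code>/etc/pacman.d/mirrorlist</code> to a geographically close, fast mirror (in testing, speeds differed several-fold between mirrors). This ties back to the mirror choice in <a href="../install-option-decisions/">reading install options</a>: pick the wrong mirror at install time and this is where it turns slow.</p>
<h2 id="連得到但被拒那層pacman-自己的狀態">The "connected but rejected" layer: pacman's own state</h2>
<p>If the mirror was reached and answered but the operation still fails, the problem lies in pacman's local state or in signature verification. Each case has a clear symptom and fix:</p>
<h3 id="database-lock上次沒清乾淨的殘留">database lock: leftovers from an unclean exit</h3>
<p><code>error: failed to init transaction (unable to lock database)</code>. pacman uses the lock file <code>/var/lib/pacman/db.lck</code> to guarantee that only one pacman touches the database at a time; if the previous pacman run was interrupted (power loss, Ctrl+C, crash) before clearing the lock, the file stays behind. <strong>First confirm that no pacman is actually running</strong> (<code>pgrep -x pacman</code>), and only then delete the lock file:</p>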





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">pgrep -x pacman <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="s2">&#34;pacman is running, do not delete&#34;</span> <span class="o">||</span> sudo rm /var/lib/pacman/db.lck</span></span></code></pre></div><p>The check-then-delete order matters: if you delete the lock blindly while another pacman really is running, two processes writing the database at once will corrupt it.</p>
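<p>The same check-then-delete logic can be written with an explicit <code>if</code>, which avoids a quirk of the <code>A &amp;&amp; B || C</code> form (C also runs if B fails). A minimal sketch: the function name <code>remove_stale_lock</code> and its two parameters are hypothetical, added so the logic can be tried on a scratch file; on a real system it would be run as root.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete a lock file only when no process with the
# given exact name is running.
#   $1 = lock file path (default: /var/lib/pacman/db.lck)
#   $2 = process name to look for (default: pacman)
remove_stale_lock() {
  local lock="${1:-/var/lib/pacman/db.lck}"
  local proc="${2:-pacman}"
  if pgrep -x "$proc" >/dev/null 2>&1; then
    echo "$proc is running; leaving $lock alone"
    return 1
  fi
  if [ -e "$lock" ]; then
    rm -- "$lock" && echo "removed stale lock $lock"
  else
    echo "no lock file at $lock"
  fi
}
```

<p>Usage: <code>sudo bash -c "$(declare -f remove_stale_lock); remove_stale_lock"</code>, or paste the body into a root shell. The non-zero return when pacman is running lets a calling script stop instead of continuing.</p>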
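<h3 id="簽章--keyring-過期十之八九是時間不對">Signature / keyring expiry: nine times out of ten the clock is wrong</h3>
<p><code>invalid or corrupted package (PGP signature)</code> or <code>signature is unknown trust</code>. pacman verifies the GPG signature of every package, and the most common root cause of a verification failure is a <strong>wrong system clock</strong>; this is why the check chain above includes <code>timedatectl</code> as its fourth step. A large clock offset (a freshly installed VM, an old machine with a dead motherboard battery) skews the validity-period check, so a perfectly valid signature is judged expired. Sync the clock first:</p>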





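<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">sudo timedatectl set-ntp <span class="nb">true</span>     <span class="c1"># Enable NTP time sync (over SSH on a minimal system there is no interactive polkit agent, so it is refused without sudo)</span></span></span></code></pre></div><p>Only if it still fails with a correct clock is the keyring itself the problem (archlinux-keyring too old): update it with <code>sudo pacman -Sy archlinux-keyring</code>, and if necessary run <code>sudo pacman-key --refresh-keys</code>. Follow the keyring update with a full <code>sudo pacman -Su</code> right away so the <code>-Sy</code> does not leave the system partially upgraded. The order is clock first, keyring second, because with a wrong clock even the keyring update fails.</p>
<h3 id="partial-upgrade只同步不升級造成的相依斷裂">partial upgrade: broken dependencies from syncing without upgrading</h3>
<p><code>conflicting dependencies</code>, or the system misbehaves after installing some package. The root cause is running only <code>pacman -Sy</code> (sync the database) on a rolling distribution and then installing a new package without <code>-u</code> (upgrade existing packages): the new package depends on newer libraries, the system still has the old ones, and the dependencies no longer line up. Arch supports only full upgrades: <strong>always use <code>pacman -Syu</code>, and never install anything after a bare <code>-Sy</code></strong>. This one rule eliminates the whole class of failures.</p>
<h3 id="stale-db-404裝機當下的資料庫已經過期">stale db 404: the database from install time is already out of date</h3>
<p><code>error: failed retrieving file '...' 404</code>, and it stays the same across several mirrors. This is a timing trap specific to rolling distributions: Arch mirrors do not keep old package files, so the filenames referenced by the package database that shipped with the ISO at install time may be rotated out within days; the database says the file exists, the mirror no longer has it. The fix shares its root with the previous item: <code>pacman -Syu</code> syncs the database to the latest first, and once the filenames match the download succeeds. This is also why "always <code>-Syu</code>" is an iron rule on Arch rather than just a suggestion.</p>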
<h2 id="判讀總表">Diagnosis summary table</h2>
<table>
  <thead>
      <tr>
          <th>Symptom</th>
          <th>Layer</th>
          <th>Authoritative check</th>
          <th>Fix</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Hostname does not resolve</td>
          <td>Network</td>
          <td><code>getent hosts &lt;domain&gt;</code></td>
          <td>Set a resolver (mind the symlink)</td>
      </tr>
      <tr>
          <td>ping to an IP works, domain does not</td>
          <td>DNS</td>
          <td><code>ping 8.8.8.8</code> vs <code>getent</code></td>
          <td>Set <code>/etc/resolv.conf</code> or the network management service</td>
      </tr>
      <tr>
          <td>Mirror slow / timeout</td>
          <td>Network</td>
          <td>Switch mirrors and compare speed</td>
          <td>Edit mirrorlist</td>
      </tr>
      <tr>
          <td>unable to lock database</td>
          <td>pacman</td>
          <td><code>pgrep -x pacman</code></td>
          <td>Delete db.lck after confirming none is running</td>
      </tr>
      <tr>
          <td>PGP signature / unknown trust</td>
          <td>pacman</td>
          <td><code>timedatectl</code> (sync the clock first)</td>
          <td>Sync the clock → (still failing) update the keyring</td>
      </tr>
      <tr>
          <td>conflicting / partial</td>
          <td>pacman</td>
          <td>Was only <code>-Sy</code> run?</td>
          <td><code>pacman -Syu</code> (always full)</td>
      </tr>
      <tr>
          <td>retrieving file 404 (multiple mirrors)</td>
          <td>pacman</td>
          <td>Same 404 on another mirror? (rolling stale db)</td>
          <td><code>pacman -Syu</code> to sync, then install</td>
      </tr>
  </tbody>
</table>
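<h2 id="下一步">Next steps</h2>
<ul>
<li>The full version of the network checks used here is in <a href="../minimal-install-verify/">Tool verification and gap-filling after a minimal install</a>.</li>
<li>For the mirror / locale / time zone decisions at install time, see <a href="../install-option-decisions/">Reading Linux install options</a>.</li>
<li>For reading cross-distribution differences ("what is this package name or this flag called on another distribution"), see <a href="../platform-divergence-map/">A reading map of platform and distribution differences</a>.</li>
<li>If the packages downloaded but the bootstrap script itself fails and needs debugging, see <a href="../observable-bootstrap/">Debuggable bootstrap</a>.</li>
<li>Package problems that only appear once the system is running (AUR build failures, soname breakage in <code>-bin</code> packages, and so on) belong to debugging; see <a href="../../debug/">Linux debugging and diagnostics</a>.</li>
</ul>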
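]]></content:encoded></item><item><title>ACM Certificates, DNS, and HTTPS Setup</title><link>https://tarrragon.github.io/blog/infra/05-core-services/acm-tls-dns-setup/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/05-core-services/acm-tls-dns-setup/</guid><description>&lt;p>HTTPS needs three components working together: a DNS zone that manages the domain's records, a TLS certificate that proves ownership of the domain, and an entry point (the ALB listener) that terminates TLS connections with that certificate. In IaC each is an independent resource, but creation order has dependencies: the zone must exist first, only then can the certificate be validated through DNS, and only after validation can it be attached to the listener. Writing this chain into Terraform puts certificate request, validation, and renewal under version control, a structural way to avoid "nobody noticed until the certificate expired".&lt;/p>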
&lt;h2 id="route-53-hosted-zone">Route 53 Hosted Zone&lt;/h2>
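&lt;p>A hosted zone is the set of DNS records Route 53 uses to manage a domain. Once the zone is created, Route 53 assigns a set of NS (Name Server) records, and those name servers answer DNS queries for the domain.&lt;/p>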
&lt;h3 id="public-vs-private-zone">Public vs Private Zone&lt;/h3>
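&lt;p>A public hosted zone corresponds to a domain resolvable from the internet (such as &lt;code>example.com&lt;/code>) and holds the A / CNAME / MX records for public-facing services. A private hosted zone resolves only inside the designated VPCs and is used for internal service discovery (for example &lt;code>db.internal.example.com&lt;/code> resolving to the private IP of RDS). Most projects need both: the public zone for external traffic, the private zone for internal service-to-service connections.&lt;/p>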





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-hcl" data-lang="hcl">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="k">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_route53_zone&amp;#34; &amp;#34;public&amp;#34;&lt;/span> {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">&lt;span class="n"> name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;example.com&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="n"> tags&lt;/span> &lt;span class="o">=&lt;/span>&lt;span class="n"> { Environment&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;production&amp;#34;&lt;/span> }
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">&lt;span class="k">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_route53_zone&amp;#34; &amp;#34;private&amp;#34;&lt;/span> {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">&lt;span class="n"> name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;internal.example.com&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl"> &lt;span class="k">vpc&lt;/span> {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="n"> vpc_id&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="k">aws_vpc&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="k">main&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="k">id&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl"> }
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">&lt;span class="n"> tags&lt;/span> &lt;span class="o">=&lt;/span>&lt;span class="n"> { Environment&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;production&amp;#34;&lt;/span> }
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl">}&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="子網域-delegation">Subdomain delegation&lt;/h3>
&lt;p>When dev / staging / prod each use a separate account, each account creates its own hosted zone to manage its subdomain (such as &lt;code>dev.example.com&lt;/code>). The parent domain's zone needs an NS record set pointing at the subdomain's zone; this is called delegation.&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-hcl" data-lang="hcl">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="k">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_route53_record&amp;#34; &amp;#34;dev_ns&amp;#34;&lt;/span> {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="n"> zone_id&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="k">aws_route53_zone&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="k">public&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="k">zone_id&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="n"> name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;dev.example.com&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="n"> type&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;NS&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">&lt;span class="n"> ttl&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="m">300&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="n"> records&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="k">aws_route53_zone&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="k">dev&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="k">name_servers&lt;/span>
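&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">}&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The delegation NS record points at the name servers of the child account's zone. All DNS records under the subdomain (such as &lt;code>api.dev.example.com&lt;/code>) are managed by the child account's zone; the parent account does not need to configure them one by one. Cross-account delegation requires each side's Terraform to manage its own zone, with the NS record living in the parent account's state. In that setup the parent configuration cannot reference &lt;code>aws_route53_zone.dev&lt;/code> directly as the snippet does; the name server list has to be passed in from the child account, for example as a variable or a remote state output.&lt;/p>
&lt;p>To check the setup: the name servers returned by &lt;code>dig dev.example.com NS&lt;/code> should be the NS of the child account's zone, not the parent's. If the parent's NS come back, delegation has not taken effect and records under the subdomain will not resolve.&lt;/p>
&lt;h2 id="acm-憑證申請與-dns-驗證">ACM certificate request and DNS validation&lt;/h2>
&lt;p>AWS Certificate Manager (ACM) provides free TLS certificates on the condition that domain ownership is validated through DNS or email. DNS validation is the IaC-friendly option: ACM asks for a CNAME record under the domain, with the record value supplied by ACM, and the certificate is issued automatically once validation passes.&lt;/p>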





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-hcl" data-lang="hcl">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="k">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_acm_certificate&amp;#34; &amp;#34;main&amp;#34;&lt;/span> {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">&lt;span class="n"> domain_name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;example.com&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="n"> subject_alternative_names&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;*.example.com&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">&lt;span class="n"> validation_method&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;DNS&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl"> &lt;span class="k">lifecycle&lt;/span> {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">&lt;span class="n"> create_before_destroy&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kt">true&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl"> }
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="n"> tags&lt;/span> &lt;span class="o">=&lt;/span>&lt;span class="n"> { Environment&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;production&amp;#34;&lt;/span> }
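&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">}&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Adding &lt;code>*.example.com&lt;/code> to &lt;code>subject_alternative_names&lt;/code> lets one certificate cover every first-level subdomain (such as &lt;code>api.example.com&lt;/code> and &lt;code>admin.example.com&lt;/code>), so no separate certificate is needed per subdomain. A wildcard matches a single label only, so &lt;code>api.dev.example.com&lt;/code> is not covered.&lt;/p></description><content:encoded><![CDATA[<p>HTTPS needs three components working together: a DNS zone that manages the domain's records, a TLS certificate that proves ownership of the domain, and an entry point (the ALB listener) that terminates TLS connections with that certificate. In IaC each is an independent resource, but creation order has dependencies: the zone must exist first, only then can the certificate be validated through DNS, and only after validation can it be attached to the listener. Writing this chain into Terraform puts certificate request, validation, and renewal under version control, a structural way to avoid "nobody noticed until the certificate expired".</p>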
<h2 id="route-53-hosted-zone">Route 53 Hosted Zone</h2>
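<p>A hosted zone is the set of DNS records Route 53 uses to manage a domain. Once the zone is created, Route 53 assigns a set of NS (Name Server) records, and those name servers answer DNS queries for the domain.</p>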
<h3 id="public-vs-private-zone">Public vs Private Zone</h3>
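<p>A public hosted zone corresponds to a domain resolvable from the internet (such as <code>example.com</code>) and holds the A / CNAME / MX records for public-facing services. A private hosted zone resolves only inside the designated VPCs and is used for internal service discovery (for example <code>db.internal.example.com</code> resolving to the private IP of RDS). Most projects need both: the public zone for external traffic, the private zone for internal service-to-service connections.</p>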





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_route53_zone&#34; &#34;public&#34;</span> {
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">  name</span> <span class="o">=</span> <span class="s2">&#34;example.com&#34;</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">  tags</span> <span class="o">=</span><span class="n"> { Environment</span> <span class="o">=</span> <span class="s2">&#34;production&#34;</span> }
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">}
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_route53_zone&#34; &#34;private&#34;</span> {
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="n">  name</span> <span class="o">=</span> <span class="s2">&#34;internal.example.com&#34;</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">  <span class="k">vpc</span> {
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="n">    vpc_id</span> <span class="o">=</span> <span class="k">aws_vpc</span><span class="p">.</span><span class="k">main</span><span class="p">.</span><span class="k">id</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">  }
</span></span><span class="line"><span class="ln">12</span><span class="cl">
</span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="n">  tags</span> <span class="o">=</span><span class="n"> { Environment</span> <span class="o">=</span> <span class="s2">&#34;production&#34;</span> }
</span></span><span class="line"><span class="ln">14</span><span class="cl">}</span></span></code></pre></div><h3 id="子網域-delegation">Subdomain delegation</h3>
<p>When dev / staging / prod each use a separate account, each account creates its own hosted zone to manage its subdomain (such as <code>dev.example.com</code>). The parent domain's zone needs an NS record set pointing at the subdomain's zone; this is called delegation.</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_route53_record&#34; &#34;dev_ns&#34;</span> {
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="n">  zone_id</span> <span class="o">=</span> <span class="k">aws_route53_zone</span><span class="p">.</span><span class="k">public</span><span class="p">.</span><span class="k">zone_id</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="n">  name</span>    <span class="o">=</span> <span class="s2">&#34;dev.example.com&#34;</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="n">  type</span>    <span class="o">=</span> <span class="s2">&#34;NS&#34;</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="n">  ttl</span>     <span class="o">=</span> <span class="m">300</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="n">  records</span> <span class="o">=</span> <span class="k">aws_route53_zone</span><span class="p">.</span><span class="k">dev</span><span class="p">.</span><span class="k">name_servers</span>
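</span></span><span class="line"><span class="ln">7</span><span class="cl">}</span></span></code></pre></div><p>The delegation NS record points at the name servers of the child account's zone. All DNS records under the subdomain (such as <code>api.dev.example.com</code>) are managed by the child account's zone; the parent account does not need to configure them one by one. Cross-account delegation requires each side's Terraform to manage its own zone, with the NS record living in the parent account's state. In that setup the parent configuration cannot reference <code>aws_route53_zone.dev</code> directly as the snippet does; the name server list has to be passed in from the child account, for example as a variable or a remote state output.</p>
<p>To check the setup: the name servers returned by <code>dig dev.example.com NS</code> should be the NS of the child account's zone, not the parent's. If the parent's NS come back, delegation has not taken effect and records under the subdomain will not resolve.</p>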
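<p>The <code>dig</code> comparison can be scripted. A minimal sketch: the helper names <code>normalize_ns</code> and <code>ns_sets_match</code> are hypothetical, and the normalization (lowercasing, dropping trailing dots, ignoring order) assumes that <code>dig</code> prints fully qualified names with a trailing dot while the expected list may not carry one.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helpers: compare two whitespace-separated NS lists as sets.
normalize_ns() {
  # $1 is intentionally unquoted so it splits into one name per line.
  printf '%s\n' $1 | tr '[:upper:]' '[:lower:]' | sed 's/\.$//' | sort -u
}

# Succeeds (exit 0) when both lists hold the same name servers.
ns_sets_match() {
  [ "$(normalize_ns "$1")" = "$(normalize_ns "$2")" ]
}
```

<p>Usage: <code>ns_sets_match "$(dig +short NS dev.example.com)" "$expected_ns" &amp;&amp; echo "delegation OK"</code>, where <code>expected_ns</code> is an assumed variable holding the child zone's name servers, for example copied from that zone's Terraform output.</p>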
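<h2 id="acm-憑證申請與-dns-驗證">ACM certificate request and DNS validation</h2>
<p>AWS Certificate Manager (ACM) provides free TLS certificates on the condition that domain ownership is validated through DNS or email. DNS validation is the IaC-friendly option: ACM asks for a CNAME record under the domain, with the record value supplied by ACM, and the certificate is issued automatically once validation passes.</p>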





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_acm_certificate&#34; &#34;main&#34;</span> {
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">  domain_name</span>               <span class="o">=</span> <span class="s2">&#34;example.com&#34;</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">  subject_alternative_names</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;*.example.com&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="n">  validation_method</span>         <span class="o">=</span> <span class="s2">&#34;DNS&#34;</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">  <span class="k">lifecycle</span> {
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="n">    create_before_destroy</span> <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">  }
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="n">  tags</span> <span class="o">=</span><span class="n"> { Environment</span> <span class="o">=</span> <span class="s2">&#34;production&#34;</span> }
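</span></span><span class="line"><span class="ln">11</span><span class="cl">}</span></span></code></pre></div><p>Adding <code>*.example.com</code> to <code>subject_alternative_names</code> lets one certificate cover every first-level subdomain (such as <code>api.example.com</code> and <code>admin.example.com</code>), so no separate certificate is needed per subdomain. A wildcard matches a single label only, so <code>api.dev.example.com</code> is not covered.</p>
<h3 id="dns-驗證記錄">DNS validation records</h3>
<p>After the certificate is requested, ACM produces a set of CNAME records for validation. Creating them in Route 53 automatically with Terraform removes the manual step from the validation flow:</p>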





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_route53_record&#34; &#34;cert_validation&#34;</span> {
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">  for_each</span> <span class="o">=</span> {
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">    for dvo in aws_acm_certificate.main.domain_validation_options : dvo.domain_name</span> <span class="o">=</span><span class="err">&gt;</span> {
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="n">      name</span>   <span class="o">=</span> <span class="k">dvo</span><span class="p">.</span><span class="k">resource_record_name</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="n">      record</span> <span class="o">=</span> <span class="k">dvo</span><span class="p">.</span><span class="k">resource_record_value</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="n">      type</span>   <span class="o">=</span> <span class="k">dvo</span><span class="p">.</span><span class="k">resource_record_type</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    }
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">  }
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="n">  zone_id</span> <span class="o">=</span> <span class="k">aws_route53_zone</span><span class="p">.</span><span class="k">public</span><span class="p">.</span><span class="k">zone_id</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="n">  name</span>    <span class="o">=</span> <span class="k">each</span><span class="p">.</span><span class="k">value</span><span class="p">.</span><span class="k">name</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="n">  type</span>    <span class="o">=</span> <span class="k">each</span><span class="p">.</span><span class="k">value</span><span class="p">.</span><span class="k">type</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="n">  ttl</span>     <span class="o">=</span> <span class="m">300</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="n">  records</span> <span class="o">=</span> <span class="p">[</span><span class="k">each</span><span class="p">.</span><span class="k">value</span><span class="p">.</span><span class="k">record</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">
</span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="n">  allow_overwrite</span> <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">}
</span></span><span class="line"><span class="ln">18</span><span class="cl">
</span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_acm_certificate_validation&#34; &#34;main&#34;</span> {
</span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="n">  certificate_arn</span>         <span class="o">=</span> <span class="k">aws_acm_certificate</span><span class="p">.</span><span class="k">main</span><span class="p">.</span><span class="k">arn</span>
</span></span><span class="line"><span class="ln">21</span><span class="cl"><span class="n">  validation_record_fqdns</span> <span class="o">=</span> <span class="p">[</span><span class="k">for</span> <span class="k">record</span> <span class="k">in</span> <span class="k">aws_route53_record</span><span class="p">.</span><span class="k">cert_validation</span> <span class="err">:</span> <span class="k">record</span><span class="p">.</span><span class="k">fqdn</span><span class="p">]</span>
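</span></span><span class="line"><span class="ln">22</span><span class="cl">}</span></span></code></pre></div><p>The <code>aws_acm_certificate_validation</code> resource does not count as a successful apply until ACM confirms that validation has passed. If a DNS record is wrong or the zone&rsquo;s NS delegation is broken, the resource blocks until it times out. The first thing to check is whether the validation CNAME record resolves from public DNS.</p>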
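<p>The resolution check above can be sketched with <code>dig</code> and the AWS CLI. This is a minimal sketch under assumptions: the record name <code>_0123abcd.example.com</code> and the zone ID <code>Z0123456789ABC</code> are made-up placeholders (ACM generates the real record name), and <code>1.1.1.1</code> is only one example of a public resolver.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"># Placeholder record name: use the one ACM generated for your certificate.
</span></span><span class="line"><span class="ln">2</span><span class="cl"># An empty answer means public resolvers cannot see the validation record.
</span></span><span class="line"><span class="ln">3</span><span class="cl">dig +short CNAME _0123abcd.example.com @1.1.1.1
</span></span><span class="line"><span class="ln">4</span><span class="cl">
</span></span><span class="line"><span class="ln">5</span><span class="cl"># Name servers the public internet sees for the domain (the delegation).
</span></span><span class="line"><span class="ln">6</span><span class="cl">dig +short NS example.com @1.1.1.1
</span></span><span class="line"><span class="ln">7</span><span class="cl">
</span></span><span class="line"><span class="ln">8</span><span class="cl"># Name servers Route 53 assigned to the hosted zone (placeholder zone ID).
</span></span><span class="line"><span class="ln">9</span><span class="cl">aws route53 get-hosted-zone --id Z0123456789ABC --query &#39;DelegationSet.NameServers&#39;</span></span></code></pre></div><p>If the two NS lists differ, the domain is still delegated to a different name server set, and ACM cannot see records created in this zone.</p>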
<h3 id="create_before_destroy">create_before_destroy</h3>
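<p><code>lifecycle { create_before_destroy = true }</code> makes Terraform create the new certificate before deleting the old one whenever the certificate has to be replaced (for example when adding a SAN or changing the domain). Without it, Terraform defaults to destroy-then-create: it tries to delete the old certificate while the ALB listener still references it. ACM normally rejects deleting a certificate that is in use, so the apply tends to stall or fail; and if the old certificate did disappear first, HTTPS would be down until the new one finished validation (which can take minutes to tens of minutes).</p>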
<h2 id="alb-https-listener">ALB HTTPS Listener</h2>
<p>Once the certificate is validated, attach it to the ALB&rsquo;s HTTPS listener:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_lb_listener&#34; &#34;https&#34;</span> {
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">  load_balancer_arn</span> <span class="o">=</span> <span class="k">aws_lb</span><span class="p">.</span><span class="k">main</span><span class="p">.</span><span class="k">arn</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">  port</span>              <span class="o">=</span> <span class="m">443</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="n">  protocol</span>          <span class="o">=</span> <span class="s2">&#34;HTTPS&#34;</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="n">  ssl_policy</span>        <span class="o">=</span> <span class="s2">&#34;ELBSecurityPolicy-TLS13-1-2-2021-06&#34;</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="n">  certificate_arn</span>   <span class="o">=</span> <span class="k">aws_acm_certificate_validation</span><span class="p">.</span><span class="k">main</span><span class="p">.</span><span class="k">certificate_arn</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">  <span class="k">default_action</span> {
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="n">    type</span>             <span class="o">=</span> <span class="s2">&#34;forward&#34;</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="n">    target_group_arn</span> <span class="o">=</span> <span class="k">aws_lb_target_group</span><span class="p">.</span><span class="k">app</span><span class="p">.</span><span class="k">arn</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">  }
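</span></span><span class="line"><span class="ln">12</span><span class="cl">}</span></span></code></pre></div><p><code>ssl_policy</code> determines the TLS versions and cipher suites. <code>ELBSecurityPolicy-TLS13-1-2-2021-06</code> supports TLS 1.2 and 1.3 and disables the older protocol versions known to be insecure. The choice is a trade-off between compatibility and security: a TLS 1.3-only policy is the strictest but can lock out older clients, so most deployments use a 1.2 + 1.3 policy.</p>
<p><code>certificate_arn</code> references <code>aws_acm_certificate_validation</code> rather than <code>aws_acm_certificate</code> directly, which ensures the listener is created only after the certificate has been validated.</p>
<h3 id="http--https-重導">HTTP → HTTPS redirect</h3>
<p>Also create an HTTP listener that redirects all port 80 traffic to 443:</p>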





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_lb_listener&#34; &#34;http_redirect&#34;</span> {
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">  load_balancer_arn</span> <span class="o">=</span> <span class="k">aws_lb</span><span class="p">.</span><span class="k">main</span><span class="p">.</span><span class="k">arn</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">  port</span>              <span class="o">=</span> <span class="m">80</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="n">  protocol</span>          <span class="o">=</span> <span class="s2">&#34;HTTP&#34;</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">  <span class="k">default_action</span> {
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="n">    type</span> <span class="o">=</span> <span class="s2">&#34;redirect&#34;</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">    <span class="k">redirect</span> {
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="n">      port</span>        <span class="o">=</span> <span class="s2">&#34;443&#34;</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="n">      protocol</span>    <span class="o">=</span> <span class="s2">&#34;HTTPS&#34;</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="n">      status_code</span> <span class="o">=</span> <span class="s2">&#34;HTTP_301&#34;</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">    }
</span></span><span class="line"><span class="ln">13</span><span class="cl">  }
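</span></span><span class="line"><span class="ln">14</span><span class="cl">}</span></span></code></pre></div><p>A 301 permanent redirect lets browsers remember to go straight to HTTPS on later visits. The security group still has to allow inbound port 80; otherwise the redirect never happens, because the client&rsquo;s connection to port 80 is blocked and it sees a connection failure instead of a redirect response.</p>
<h2 id="多網域與-san-憑證">Multiple domains and SAN certificates</h2>
<p>By default, an ACM certificate covers up to 10 domain names (the primary name plus its SANs, Subject Alternative Names); the limit is an adjustable quota. Most setups only need the apex domain plus a wildcard (<code>example.com</code> + <code>*.example.com</code>). If you have several different root domains (such as <code>example.com</code> and <code>example-app.com</code>), they can go on the same certificate:</p>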
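<p>A quick way to confirm the behavior from outside is a header-only request with <code>curl</code>. This is a sketch that assumes a hostname such as <code>api.example.com</code> already points at the ALB:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"># -s: silent, -I: headers only, --max-time: give up after 5 seconds
</span></span><span class="line"><span class="ln">2</span><span class="cl">curl -sI --max-time 5 http://api.example.com/</span></span></code></pre></div><p>A 301 status with a <code>Location</code> header that starts with <code>https://</code> means the redirect listener works; a timeout points at the security group (or the network path) blocking port 80.</p>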





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_acm_certificate&#34; &#34;multi_domain&#34;</span> {
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">  domain_name</span>               <span class="o">=</span> <span class="s2">&#34;example.com&#34;</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">  subject_alternative_names</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    <span class="s2">&#34;*.example.com&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    <span class="s2">&#34;example-app.com&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="s2">&#34;*.example-app.com&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">  <span class="p">]</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="n">  validation_method</span> <span class="o">=</span> <span class="s2">&#34;DNS&#34;</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">
</span></span><span class="line"><span class="ln">10</span><span class="cl">  <span class="k">lifecycle</span> {
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="n">    create_before_destroy</span> <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">  }
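</span></span><span class="line"><span class="ln">13</span><span class="cl">}</span></span></code></pre></div><p>Each SAN domain needs its own DNS validation record (a wildcard typically shares the record of its base domain). If the domains live in different hosted zones, each validation record has to be created in its own zone.</p>
<p>When you need more than 10 SANs, or when certificates for different domains have to be managed independently (different teams own different domains), attach extra certificates with <code>aws_lb_listener_certificate</code>:</p>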





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_lb_listener_certificate&#34; &#34;additional&#34;</span> {
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="n">  listener_arn</span>    <span class="o">=</span> <span class="k">aws_lb_listener</span><span class="p">.</span><span class="k">https</span><span class="p">.</span><span class="k">arn</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="n">  certificate_arn</span> <span class="o">=</span> <span class="k">aws_acm_certificate</span><span class="p">.</span><span class="k">other_domain</span><span class="p">.</span><span class="k">arn</span>
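</span></span><span class="line"><span class="ln">4</span><span class="cl">}</span></span></code></pre></div><p>The ALB picks the matching certificate automatically based on SNI (Server Name Indication).</p>
<h2 id="穩定的-dns-別名記錄">Stable DNS alias records</h2>
<p>An ALB&rsquo;s DNS name changes when the ALB is recreated, so public-facing services should not use it directly. Use a Route 53 alias record to point a stable domain name at the ALB:</p>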





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_route53_record&#34; &#34;app&#34;</span> {
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">  zone_id</span> <span class="o">=</span> <span class="k">aws_route53_zone</span><span class="p">.</span><span class="k">public</span><span class="p">.</span><span class="k">zone_id</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">  name</span>    <span class="o">=</span> <span class="s2">&#34;api.example.com&#34;</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="n">  type</span>    <span class="o">=</span> <span class="s2">&#34;A&#34;</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">  <span class="k">alias</span> {
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="n">    name</span>                   <span class="o">=</span> <span class="k">aws_lb</span><span class="p">.</span><span class="k">main</span><span class="p">.</span><span class="k">dns_name</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="n">    zone_id</span>                <span class="o">=</span> <span class="k">aws_lb</span><span class="p">.</span><span class="k">main</span><span class="p">.</span><span class="k">zone_id</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="n">    evaluate_target_health</span> <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">  }
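</span></span><span class="line"><span class="ln">11</span><span class="cl">}</span></span></code></pre></div><p>Queries against alias records are free (standard A/CNAME queries cost $0.40 per million; alias queries to AWS resources are not charged), and alias records work at the zone apex (such as <code>example.com</code>, where a plain CNAME is not allowed). <code>evaluate_target_health = true</code> lets Route 53 stop answering with this record when the ALB is unhealthy, which is meant to be combined with failover routing.</p>
<h2 id="憑證續期監控">Monitoring certificate renewal</h2>
<p>ACM renews DNS-validated certificates automatically, provided the validation CNAME record still exists and resolves. As long as that record has not been deleted, ACM starts the renewal 60 days before expiry.</p>
<p>Common reasons for automatic renewal to fail: the validation CNAME record was deleted by hand, the hosted zone&rsquo;s NS delegation broke, or the zone itself was deleted and recreated so that its NS set changed. Put a CloudWatch alarm on the days remaining until expiry so that a failed renewal is noticed early:</p>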





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_cloudwatch_metric_alarm&#34; &#34;cert_expiry&#34;</span> {
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">  alarm_name</span>          <span class="o">=</span> <span class="s2">&#34;acm-cert-expiry-${aws_acm_certificate.main.domain_name}&#34;</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">  comparison_operator</span> <span class="o">=</span> <span class="s2">&#34;LessThanThreshold&#34;</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="n">  evaluation_periods</span>  <span class="o">=</span> <span class="m">1</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="n">  metric_name</span>         <span class="o">=</span> <span class="s2">&#34;DaysToExpiry&#34;</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="n">  namespace</span>           <span class="o">=</span> <span class="s2">&#34;AWS/CertificateManager&#34;</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="n">  period</span>              <span class="o">=</span> <span class="m">86400</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="n">  statistic</span>           <span class="o">=</span> <span class="s2">&#34;Minimum&#34;</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="n">  threshold</span>           <span class="o">=</span> <span class="m">30</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="n">  alarm_actions</span>       <span class="o">=</span> <span class="p">[</span><span class="k">aws_sns_topic</span><span class="p">.</span><span class="k">oncall</span><span class="p">.</span><span class="k">arn</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="n">  dimensions</span> <span class="o">=</span> {
</span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="n">    CertificateArn</span> <span class="o">=</span> <span class="k">aws_acm_certificate</span><span class="p">.</span><span class="k">main</span><span class="p">.</span><span class="k">arn</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl">  }
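</span></span><span class="line"><span class="ln">15</span><span class="cl">}</span></span></code></pre></div><p>The alarm fires when the certificate has fewer than 30 days left. Normally ACM starts renewing 60 days out and finishes well before that, so a 30-day alert means automatic renewal has failed and someone needs to check the validation records.</p>
<h2 id="跨分類引用">Cross-references</h2>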
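<p>When the alarm fires, ACM itself can report how the renewal is going. A sketch with the AWS CLI, assuming the certificate ARN is available in a shell variable named <code>CERT_ARN</code> (a name chosen here for illustration):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"># Show status, expiry date, and renewal state for one certificate.
</span></span><span class="line"><span class="ln">2</span><span class="cl"># RenewalSummary is only present once ACM has attempted a renewal.
</span></span><span class="line"><span class="ln">3</span><span class="cl">aws acm describe-certificate --certificate-arn &#34;$CERT_ARN&#34; \
</span></span><span class="line"><span class="ln">4</span><span class="cl">  --query &#39;Certificate.{Status:Status,NotAfter:NotAfter,Eligibility:RenewalEligibility,Renewal:RenewalSummary.RenewalStatus}&#39;</span></span></code></pre></div><p>A renewal status of <code>PENDING_VALIDATION</code> usually means ACM could not find the validation CNAME, which matches the failure causes listed above.</p>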
<ul>
<li>→ <a href="/blog/infra/05-core-services/loadbalancer-alb/" data-link-title="入口上 IaC — ALB、TLS 與健康檢查" data-link-desc="Application Load Balancer 的 listener、target group、健康檢查閾值設計，以及用 ACM 把 TLS 憑證的簽發、驗證與掛載整條鏈寫進版本控制">Entry point as IaC: ALB</a>: full configuration of ALB listeners, target groups, and health checks</li>
<li>→ <a href="/blog/infra/03-network-foundation/" data-link-title="模組三：網路地基 — VPC 與分層" data-link-desc="VPC、public / private subnet 切分、route table、NAT、security group 設計">Module 3: Network foundation</a>: the public subnet and security group design the ALB lives in</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/" data-link-title="模組七：infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">Module 7: infra through the PR workflow</a>: certificate and DNS changes go through PR review</li>
</ul>
]]></content:encoded></item><item><title>Foundational services for air-gapped environments: DNS, NTP, CA, and Secret Management</title><link>https://tarrragon.github.io/blog/infra/air-gapped/air-gapped-infrastructure-services/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/air-gapped/air-gapped-infrastructure-services/</guid><description>&lt;p>In an air-gapped environment, GitLab, &lt;a href="https://tarrragon.github.io/blog/infra/knowledge-cards/harbor/" data-link-title="Harbor" data-link-desc="開源的 container image registry，支援映像掃描、RBAC、複製，斷網環境取代 Docker Hub 的方案">Harbor&lt;/a>, &lt;a href="https://tarrragon.github.io/blog/infra/knowledge-cards/prometheus/" data-link-title="Prometheus" data-link-desc="開源的 metrics 收集與告警系統，用 pull 模式從 target 拉取指標，斷網環境的預設監控方案">Prometheus&lt;/a>, and Nexus share the same prerequisites: they need name resolution (&lt;a href="https://tarrragon.github.io/blog/infra/knowledge-cards/dns/" data-link-title="DNS" data-link-desc="Domain Name System — 把域名轉成 IP 位址的系統，以及 A record、CNAME、NS、TTL 的角色">DNS&lt;/a>) to find each other, time synchronization (NTP) so that logs and certificates stay valid, &lt;a href="https://tarrragon.github.io/blog/infra/knowledge-cards/ssl-tls/" data-link-title="SSL / TLS" data-link-desc="加密 client 與 server 之間通訊的協定，讓 HTTPS 成為可能。TLS 是 SSL 的後繼者，但 SSL 憑證的稱呼仍廣泛使用">TLS&lt;/a> certificates (a CA) to serve HTTPS, and secret storage (&lt;a href="https://tarrragon.github.io/blog/infra/knowledge-cards/vault/" data-link-title="HashiCorp Vault" data-link-desc="機密管理系統，集中存放密碼、API key、TLS 私鑰，提供存取控制、稽核和自動輪替">Vault&lt;/a>) to manage passwords and tokens safely. These four are the "services behind the services": without them, the other self-hosted services either fail to start or are stuck with insecure plaintext HTTP.&lt;/p>
&lt;h2 id="internal-dns內部名稱解析">Internal DNS: internal name resolution&lt;/h2>
&lt;p>An air-gapped environment has no public DNS to rely on. If internal services reference each other by IP address (GitLab to PostgreSQL, Harbor to its storage backend), every IP change means another round of configuration edits. Internal DNS lets services reference each other by hostname (&lt;code>gitlab.internal&lt;/code>, &lt;code>harbor.internal&lt;/code>), so an IP change is made in one place, the DNS zone.&lt;/p>
&lt;h3 id="coredns-vs-bind">CoreDNS vs BIND&lt;/h3>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Aspect&lt;/th>
 &lt;th>CoreDNS&lt;/th>
 &lt;th>BIND&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Configuration&lt;/td>
 &lt;td>Corefile (declarative, short)&lt;/td>
 &lt;td>named.conf (traditional, long)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Deployment&lt;/td>
 &lt;td>Single binary / container&lt;/td>
 &lt;td>System package&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Best suited for&lt;/td>
 &lt;td>Native Kubernetes integration, lightweight&lt;/td>
 &lt;td>Complex DNS requirements (split-horizon, DNSSEC)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Learning curve&lt;/td>
 &lt;td>Low&lt;/td>
 &lt;td>Medium to high&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>CoreDNS is enough for most air-gapped environments: the zone file lives on disk and a few lines of Corefile get it running.&lt;/p>
&lt;h3 id="最小設定">Minimal configuration&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl"># Corefile
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">internal:53 {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl"> file /etc/coredns/zones/internal.zone
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"> log
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl"> errors
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">.:53 {
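&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl"> template ANY ANY {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl"> rcode NXDOMAIN
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl"> }
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl"> log
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">}&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The first block handles queries for the &lt;code>internal&lt;/code> domain and answers them from the zone file. The second block catches every other query. An air-gapped environment has no upstream DNS to forward to, so the &lt;code>template&lt;/code> plugin answers all non-internal names with NXDOMAIN right away instead of letting them time out; &lt;code>forward&lt;/code> is not an option here because it needs a reachable upstream.&lt;/p>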





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">; /etc/coredns/zones/internal.zone
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">$ORIGIN internal.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">@ IN SOA ns1.internal. admin.internal. (
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"> 2026062601 ; serial
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl"> 3600 ; refresh
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl"> 600 ; retry
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl"> 86400 ; expire
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl"> 60 ; minimum
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl"> IN NS ns1.internal.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">ns1 IN A 10.0.1.10
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl">gitlab IN A 10.0.1.20
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">harbor IN A 10.0.1.21
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl">vault IN A 10.0.1.22
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl">nexus IN A 10.0.1.23
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl">prom IN A 10.0.1.24
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl">grafana IN A 10.0.1.25
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">18&lt;/span>&lt;span class="cl">ntp IN A 10.0.1.11&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>To add a service, add one A record line, bump the serial, and reload CoreDNS (&lt;code>kill -SIGUSR1 $(pidof coredns)&lt;/code> or restart the container). Incrementing the serial keeps changes traceable, and the &lt;code>file&lt;/code> plugin also relies on a changed serial to decide when to re-read the zone.&lt;/p></description><content:encoded><![CDATA[<p>In an air-gapped environment, GitLab, <a href="/blog/infra/knowledge-cards/harbor/" data-link-title="Harbor" data-link-desc="開源的 container image registry，支援映像掃描、RBAC、複製，斷網環境取代 Docker Hub 的方案">Harbor</a>, <a href="/blog/infra/knowledge-cards/prometheus/" data-link-title="Prometheus" data-link-desc="開源的 metrics 收集與告警系統，用 pull 模式從 target 拉取指標，斷網環境的預設監控方案">Prometheus</a>, and Nexus share the same prerequisites: they need name resolution (<a href="/blog/infra/knowledge-cards/dns/" data-link-title="DNS" data-link-desc="Domain Name System — 把域名轉成 IP 位址的系統，以及 A record、CNAME、NS、TTL 的角色">DNS</a>) to find each other, time synchronization (NTP) so that logs and certificates stay valid, <a href="/blog/infra/knowledge-cards/ssl-tls/" data-link-title="SSL / TLS" data-link-desc="加密 client 與 server 之間通訊的協定，讓 HTTPS 成為可能。TLS 是 SSL 的後繼者，但 SSL 憑證的稱呼仍廣泛使用">TLS</a> certificates (a CA) to serve HTTPS, and secret storage (<a href="/blog/infra/knowledge-cards/vault/" data-link-title="HashiCorp Vault" data-link-desc="機密管理系統，集中存放密碼、API key、TLS 私鑰，提供存取控制、稽核和自動輪替">Vault</a>) to manage passwords and tokens safely. These four are the "services behind the services": without them, the other self-hosted services either fail to start or are stuck with insecure plaintext HTTP.</p>
<h2 id="internal-dns內部名稱解析">Internal DNS: internal name resolution</h2>
<p>An air-gapped environment has no public DNS to rely on. If internal services reference each other by IP address (GitLab to PostgreSQL, Harbor to its storage backend), every IP change means another round of configuration edits. Internal DNS lets services reference each other by hostname (<code>gitlab.internal</code>, <code>harbor.internal</code>), so an IP change is made in one place, the DNS zone.</p>
<h3 id="coredns-vs-bind">CoreDNS vs BIND</h3>
<table>
  <thead>
      <tr>
          <th>Aspect</th>
          <th>CoreDNS</th>
          <th>BIND</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Configuration</td>
          <td>Corefile (declarative, short)</td>
          <td>named.conf (traditional, long)</td>
      </tr>
      <tr>
          <td>Deployment</td>
          <td>Single binary / container</td>
          <td>System package</td>
      </tr>
      <tr>
          <td>Best suited for</td>
          <td>Native Kubernetes integration, lightweight</td>
          <td>Complex DNS requirements (split-horizon, DNSSEC)</td>
      </tr>
      <tr>
          <td>Learning curve</td>
          <td>Low</td>
          <td>Medium to high</td>
      </tr>
  </tbody>
</table>
<p>CoreDNS is enough for most air-gapped environments: the zone file lives on disk and a few lines of Corefile get it running.</p>
<h3 id="最小設定">Minimal configuration</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl"># Corefile
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">internal:53 {
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">    file /etc/coredns/zones/internal.zone
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">    log
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">    errors
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">}
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">.:53 {
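</span></span><span class="line"><span class="ln"> 9</span><span class="cl">    template ANY ANY {
</span></span><span class="line"><span class="ln">10</span><span class="cl">        rcode NXDOMAIN
</span></span><span class="line"><span class="ln">11</span><span class="cl">    }
</span></span><span class="line"><span class="ln">12</span><span class="cl">    log
</span></span><span class="line"><span class="ln">13</span><span class="cl">}</span></span></code></pre></div><p>The first block handles queries for the <code>internal</code> domain and answers them from the zone file. The second block catches every other query. An air-gapped environment has no upstream DNS to forward to, so the <code>template</code> plugin answers all non-internal names with NXDOMAIN right away instead of letting them time out; <code>forward</code> is not an option here because it needs a reachable upstream.</p>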





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl">; /etc/coredns/zones/internal.zone
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">$ORIGIN internal.
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">@       IN SOA  ns1.internal. admin.internal. (
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">        2026062601 ; serial
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">        3600       ; refresh
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">        600        ; retry
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">        86400      ; expire
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">        60         ; minimum
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">)
</span></span><span class="line"><span class="ln">10</span><span class="cl">        IN NS   ns1.internal.
</span></span><span class="line"><span class="ln">11</span><span class="cl">ns1     IN A    10.0.1.10
</span></span><span class="line"><span class="ln">12</span><span class="cl">gitlab  IN A    10.0.1.20
</span></span><span class="line"><span class="ln">13</span><span class="cl">harbor  IN A    10.0.1.21
</span></span><span class="line"><span class="ln">14</span><span class="cl">vault   IN A    10.0.1.22
</span></span><span class="line"><span class="ln">15</span><span class="cl">nexus   IN A    10.0.1.23
</span></span><span class="line"><span class="ln">16</span><span class="cl">prom    IN A    10.0.1.24
</span></span><span class="line"><span class="ln">17</span><span class="cl">grafana IN A    10.0.1.25
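</span></span><span class="line"><span class="ln">18</span><span class="cl">ntp     IN A    10.0.1.11</span></span></code></pre></div><p>To add a service, add one A record line, bump the serial, and reload CoreDNS (<code>kill -SIGUSR1 $(pidof coredns)</code> or restart the container). Incrementing the serial keeps changes traceable, and the <code>file</code> plugin also relies on a changed serial to decide when to re-read the zone.</p>
<h3 id="客戶端設定">Client configuration</h3>
<p>Point each machine&rsquo;s <code>/etc/resolv.conf</code> at the CoreDNS IP:</p>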
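<p>Bumping the serial by hand is easy to forget. As a small sketch, the helper below computes the next serial in the <code>YYYYMMDDNN</code> form used in the zone file above. The function name <code>next_serial</code> is made up for this example, and it assumes fewer than 100 changes per day (a counter of 99 would roll over into the date digits).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln"> 1</span><span class="cl"># next_serial CURRENT [TODAY]: print the next YYYYMMDDNN serial.
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"># TODAY defaults to the current date; pass it explicitly for repeatable results.
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">next_serial() {
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">  current=$1
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">  today=${2:-$(date +%Y%m%d)}
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">  case &#34;$current&#34; in
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">    &#34;$today&#34;??) echo $((current + 1)) ;;  # same day: bump the two-digit counter
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">    *)          echo &#34;${today}01&#34; ;;      # new day: restart the counter at 01
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">  esac
</span></span><span class="line"><span class="ln">10</span><span class="cl">}
</span></span><span class="line"><span class="ln">11</span><span class="cl">
</span></span><span class="line"><span class="ln">12</span><span class="cl">next_serial 2026062601 20260626   # prints 2026062602
</span></span><span class="line"><span class="ln">13</span><span class="cl">next_serial 2026062601 20260702   # prints 2026070201</span></span></code></pre></div><p>The printed value goes into the SOA record before CoreDNS is reloaded; keeping the date in the serial makes it easy to see when the zone last changed.</p>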





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">nameserver 10.0.1.10
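</span></span><span class="line"><span class="ln">2</span><span class="cl">search internal</span></span></code></pre></div><p>If the environment has a DHCP server, put the DNS server address in the DHCP options so that newly joined machines pick it up automatically. Without DHCP, push it with a provisioning script or an Ansible playbook.</p>
<h2 id="ntp內部時間同步">NTP: internal time synchronization</h2>
<p>Clock drift in an air-gapped environment causes three kinds of problems: log timestamps fall out of order, so incidents cannot be lined up across machines; TLS certificate validity checks go wrong, so legitimate certificates are rejected; and time-sensitive authentication protocols such as Kerberos fail outright. A connected environment gets its time from <code>pool.ntp.org</code>; an air-gapped one needs its own time source.</p>
<h3 id="chrony-作為-ntp-server">chrony as the NTP server</h3>
<p>chrony suits unstable or isolated networks better than the traditional ntpd: its clock discipline keeps drift compensation reasonably accurate even after long periods without an external time source.</p>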





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># /etc/chrony.conf (NTP server side)</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"># Air-gapped: no upstream NTP, fall back to the local clock as a last resort</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="nb">local</span> stratum <span class="m">10</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">allow 10.0.0.0/8
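</span></span><span class="line"><span class="ln">5</span><span class="cl">driftfile /var/lib/chrony/drift</span></span></code></pre></div><p><code>local stratum 10</code> declares "I am a time source myself, but only at stratum 10": a high stratum number that marks the server as far from any reference clock and therefore low in accuracy and priority. The chrony configuration on the other machines points at this server:</p>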





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># /etc/chrony.conf (client)</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">server ntp.internal iburst
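</span></span><span class="line"><span class="ln">3</span><span class="cl">makestep 1.0 <span class="m">3</span></span></span></code></pre></div><p><code>iburst</code> speeds up the initial synchronization at boot, and <code>makestep 1.0 3</code> lets chrony step the clock (instead of slewing it gradually) when the offset exceeds 1 second, but only during the first three updates, which corrects a large offset at startup.</p>
<h3 id="高精度需求">High-precision requirements</h3>
<p>If the environment has strict time-accuracy requirements (financial trading, industrial control systems), the NTP server needs a hardware time source: a GPS receiver or an atomic clock module. A GPS antenna needs no network connection, only a spot with a view of the satellites (a rooftop or a window). chrony supports PPS (Pulse Per Second) input and can reach microsecond-level accuracy.</p>
<p>Most air-gapped environments do not need that level of accuracy: millisecond-level agreement (chrony&rsquo;s default behavior) is enough for log correlation and TLS validation.</p>
<h2 id="internal-ca內部憑證簽發">Internal CA: issuing internal certificates</h2>
<p>Every internal HTTPS service in an air-gapped environment needs a TLS certificate. The Let&rsquo;s Encrypt ACME challenges require validation over the internet, so they cannot be used here. The alternative is to run an internal CA (Certificate Authority) and issue certificates yourself.</p>
<h3 id="step-casmallstep">step-ca (Smallstep)</h3>
<p>step-ca is a lightweight CA server that supports the ACME protocol: internal services can request and renew certificates automatically with the same flow they would use for Let&rsquo;s Encrypt, except that the ACME server is step-ca on the internal network.</p>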
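<p>To confirm that a client is actually following the internal server, chrony&rsquo;s own CLI is enough. A short sketch, run on a client:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">chronyc tracking     # current reference source, stratum, and measured offset
</span></span><span class="line"><span class="ln">2</span><span class="cl">chronyc sources -v   # configured sources with a legend; * marks the selected one</span></span></code></pre></div><p>On a healthy client the reference should be the internal server, and the reported stratum should be one higher than the server&rsquo;s (11 with the configuration above).</p>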





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Initialize the CA</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">step ca init --name<span class="o">=</span><span class="s2">&#34;Internal CA&#34;</span> --dns<span class="o">=</span><span class="s2">&#34;ca.internal&#34;</span> <span class="se">\
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="se"></span>  --address<span class="o">=</span><span class="s2">&#34;:443&#34;</span> --provisioner<span class="o">=</span><span class="s2">&#34;admin&#34;</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"># Start the CA server</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">step-ca <span class="k">$(</span>step path<span class="k">)</span>/config/ca.json</span></span></code></pre></div><p>Initialization generates key pairs for the root CA and the intermediate CA. The root CA&rsquo;s private key is the root of the entire trust chain, so it needs the highest level of protection (offline storage, access logging).</p>
<h3 id="憑證簽發流程">Certificate issuance flow</h3>
<p>Services request certificates from step-ca, either with the step CLI or with an ACME client:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Request a certificate with the step CLI (manual)</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">step ca certificate <span class="s2">&#34;gitlab.internal&#34;</span> gitlab.crt gitlab.key
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># 用 ACME 自動續期（搭配 certbot 或 step 的 renewal daemon）</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">step ca renew --daemon gitlab.crt gitlab.key</span></span></code></pre></div><p>certbot 也能配合 step-ca 使用——把 ACME server URL 從 Let&rsquo;s Encrypt 改成 <code>https://ca.internal/acme/acme/directory</code>。已有 certbot 自動續期腳本的服務只要改一行設定。</p>
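<p>The <code>/acme/acme/directory</code> path assumes that an ACME provisioner named <code>acme</code> exists on the CA, which the <code>step ca init</code> command shown earlier does not create. A sketch of the two steps, where <code>/path/to/root_ca.crt</code> is a placeholder for wherever the root certificate was copied:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"># On the CA host: add an ACME provisioner named &#34;acme&#34;, then restart step-ca.
</span></span><span class="line"><span class="ln">2</span><span class="cl">step ca provisioner add acme --type ACME
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"># On the service host: point certbot at step-ca and trust the internal root CA.
</span></span><span class="line"><span class="ln">5</span><span class="cl">REQUESTS_CA_BUNDLE=/path/to/root_ca.crt certbot certonly --standalone \
</span></span><span class="line"><span class="ln">6</span><span class="cl">  -d gitlab.internal --server https://ca.internal/acme/acme/directory</span></span></code></pre></div><p>The provisioner name is the middle path segment, so a provisioner with a different name changes the directory URL accordingly.</p>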
<h3 id="root-ca-分發">Root CA distribution</h3>
<p>Every machine and every service must trust the internal CA&rsquo;s root certificate:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Debian/Ubuntu</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">cp root_ca.crt /usr/local/share/ca-certificates/internal-ca.crt
</span></span><span class="line"><span class="ln">3</span><span class="cl">update-ca-certificates
</span></span><span class="line"><span class="ln">4</span><span class="cl">
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="c1"># RHEL/CentOS</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl">cp root_ca.crt /etc/pki/ca-trust/source/anchors/internal-ca.crt
</span></span><span class="line"><span class="ln">7</span><span class="cl">update-ca-trust</span></span></code></pre></div><p>The Docker daemon also needs to trust the internal CA (otherwise <code>docker pull harbor.internal/image</code> fails with a TLS error):</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">mkdir -p /etc/docker/certs.d/harbor.internal
</span></span><span class="line"><span class="ln">2</span><span class="cl">cp root_ca.crt /etc/docker/certs.d/harbor.internal/ca.crt
</span></span><span class="line"><span class="ln">3</span><span class="cl">systemctl restart docker</span></span></code></pre></div><p>Pushing the root CA to every machine in bulk with an Ansible playbook is the standard approach for the initial deployment.</p>
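<p>Because every host ends up trusting whatever file gets copied, it can help to compare the certificate&rsquo;s fingerprint with a value recorded out of band before installing it. A minimal sketch for Debian/Ubuntu, assuming <code>openssl</code> is available on the target; the function names are hypothetical:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"># Print the SHA-256 fingerprint of a PEM certificate as lowercase hex, no colons.
cert_fingerprint() {
  openssl x509 -in "$1" -noout -fingerprint -sha256 |
    cut -d= -f2 | tr -d ':' | tr 'A-F' 'a-f'
}

# Install the root CA only if its fingerprint matches the expected value.
install_root_ca() {
  local cert="$1" expected="$2"
  if [ "$(cert_fingerprint "$cert")" != "$expected" ]; then
    echo "fingerprint mismatch, refusing to install $cert"
    return 1
  fi
  cp "$cert" /usr/local/share/ca-certificates/internal-ca.crt
  update-ca-certificates
}
</code></pre></div><p>The same check could run as a task in the Ansible playbook before the copy step.</p>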
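<h3 id="cfssl-作為替代">cfssl as an alternative</h3>
<p>cfssl (Cloudflare&rsquo;s PKI toolkit) is simpler than step-ca but has no ACME automation: every certificate is issued by hand. It suits small environments with only 5-10 services that do not need automatic renewal.</p>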
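<h2 id="secret-managementhashicorp-vault">Secret Management: HashiCorp Vault</h2>
<p>Secrets such as database passwords, API tokens, and TLS private keys need a central, secure store. An air-gapped environment cannot use AWS Secrets Manager or GCP Secret Manager, so HashiCorp Vault is the most common self-hosted option.</p>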
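<h3 id="斷網環境的-vault-初始化">Initializing Vault in an air-gapped environment</h3>
<p>Vault starts sealed and must be unsealed before it can serve requests. Cloud environments usually auto-unseal with AWS KMS or GCP Cloud KMS. An air-gapped environment has no cloud KMS, so it falls back to Shamir&rsquo;s Secret Sharing: initialization produces N unseal keys, and M of them are required to unlock Vault at startup (a typical setup is 5 keys, any 3 of which can unseal).</p>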





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Initialize Vault (5 key shares, threshold of 3)</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">vault operator init -key-shares<span class="o">=</span><span class="m">5</span> -key-threshold<span class="o">=</span><span class="m">3</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># Unseal (run 3 times, each time with a different key)</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">vault operator unseal &lt;key-1&gt;
</span></span><span class="line"><span class="ln">6</span><span class="cl">vault operator unseal &lt;key-2&gt;
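</span></span><span class="line"><span class="ln">7</span><span class="cl">vault operator unseal &lt;key-3&gt;</span></span></code></pre></div><p>The 5 unseal keys go to 5 different custodians. No single person can unlock Vault alone, which is a deliberate security design. Vault has to be unsealed again after every restart, so rehearse the procedure for storing and retrieving the unseal keys ahead of time.</p>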
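<p>For the rehearsal, a small helper that reports the seal state can be handy. The sketch below relies on the exit code of <code>vault status</code> (0 when unsealed, 2 when sealed, 1 on error); the function name is hypothetical:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"># Map the exit code of `vault status` to a one-word seal state.
vault_seal_state() {
  local rc=0 output
  # Capture the status table so only the one-word result is printed
  output="$(vault status)" || rc=$?
  case "$rc" in
    0) echo "unsealed" ;;
    2) echo "sealed" ;;
    *) echo "error" ;;
  esac
}
</code></pre></div><p>After a restart, <code>vault_seal_state</code> would be expected to print <code>sealed</code> until the third key has been entered. Running <code>vault operator unseal</code> without an argument prompts for the key, which keeps it out of shell history.</p>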
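<h3 id="機器身分認證">Machine identity authentication</h3>
<p>A service must authenticate before it can read secrets from Vault. Cloud environments use IAM roles; air-gapped environments use AppRole: each service receives a role_id and a secret_id and exchanges them for a short-lived token.</p>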





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># Create the AppRole</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl">vault auth <span class="nb">enable</span> approle
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">vault write auth/approle/role/gitlab <span class="se">\
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="se"></span>  <span class="nv">token_ttl</span><span class="o">=</span>1h <span class="se">\
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="se"></span>  <span class="nv">token_max_ttl</span><span class="o">=</span>4h <span class="se">\
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="se"></span>  <span class="nv">policies</span><span class="o">=</span>gitlab-secrets
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="c1"># Service side: obtain a token</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">vault write auth/approle/login <span class="se">\
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="se"></span>  <span class="nv">role_id</span><span class="o">=</span><span class="s2">&#34;</span><span class="nv">$ROLE_ID</span><span class="s2">&#34;</span> <span class="se">\
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="se"></span>  <span class="nv">secret_id</span><span class="o">=</span><span class="s2">&#34;</span><span class="nv">$SECRET_ID</span><span class="s2">&#34;</span></span></span></code></pre></div><p>The secret_id is itself a secret. On first deployment, a Vault admin hands it to the service manually or pushes it through an Ansible encrypted variable.</p>
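<p>The role above refers to a <code>gitlab-secrets</code> policy and to <code>$ROLE_ID</code> / <code>$SECRET_ID</code> without showing where they come from. A minimal sketch of the admin side, assuming a KV v2 engine mounted at <code>secret/</code> and a <code>gitlab/</code> prefix (both assumptions); the function names are hypothetical:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"># Admin side: define the policy the role refers to.
# The path assumes a KV v2 engine mounted at secret/ (an assumption).
create_gitlab_policy() {
  printf 'path "secret/data/gitlab/*" {\n  capabilities = ["read"]\n}\n' |
    vault policy write gitlab-secrets -
}

# Admin side: read the role_id and mint a fresh secret_id for the role.
fetch_approle_credentials() {
  ROLE_ID="$(vault read -field=role_id auth/approle/role/gitlab/role-id)"
  SECRET_ID="$(vault write -f -field=secret_id auth/approle/role/gitlab/secret-id)"
}

# Service side: exchange the pair for a short-lived token and print it.
login_with_approle() {
  vault write -field=token auth/approle/login \
    role_id="$ROLE_ID" secret_id="$SECRET_ID"
}
</code></pre></div><p>The service would typically export the printed value as <code>VAULT_TOKEN</code> and log in again before the 1h <code>token_ttl</code> runs out.</p>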
<h3 id="儲存後端">Storage backend</h3>
<p>Vault needs a persistent storage backend. Cloud deployments use DynamoDB or Consul; the options in an air-gapped environment are:</p>
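<table>
  <thead>
      <tr>
          <th>Backend</th>
          <th>Use case</th>
          <th>Characteristics</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Filesystem</td>
          <td>Single node, small scale</td>
          <td>Simplest, but no HA</td>
      </tr>
      <tr>
          <td>PostgreSQL</td>
          <td>Environments that already run PostgreSQL</td>
          <td>Reuses existing infrastructure</td>
      </tr>
      <tr>
          <td>Consul</td>
          <td>Environments that need HA</td>
          <td>Long-standing HA pairing for Vault; recent Vault releases also include Integrated Storage (Raft), which provides HA without an external service</td>
      </tr>
  </tbody>
</table>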
<h2 id="部署順序的相互依賴">Deployment order and interdependencies</h2>
<p>The four services form a dependency chain:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">DNS → NTP → CA → Vault
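</span></span><span class="line"><span class="ln">2</span><span class="cl"> ↑_________________↓ (the Vault FQDN needs DNS resolution)</span></span></code></pre></div><p>DNS starts first, because every other service relies on it to resolve hostnames. NTP follows, because the CA needs accurate time when it issues certificates; otherwise the notBefore/notAfter checks go wrong. The CA comes next, because Vault&rsquo;s HTTPS listener needs a TLS certificate. Vault comes last, since it depends on both DNS and TLS.</p>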
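<p>DNS and the CA have a circular dependency: the CA needs DNS resolution when it issues certificates (for ACME challenges or the SANs in a CSR), but should the DNS server itself use TLS? The way out is to run DNS in plaintext (no TLS) on first start, then have the CA issue a certificate for the DNS server once it is up, and switch to DNS-over-TLS afterwards. Most internal networks can simply keep DNS in plaintext: unencrypted DNS queries inside an intranet are common practice and the risk is manageable.</p>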
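<p>The dependency order can double as a troubleshooting order. The sketch below walks the chain and stops at the first broken layer. It assumes <code>dig</code>, <code>chronyc</code>, <code>step</code> and <code>vault</code> are installed and already configured on the host; the function names are hypothetical:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"># Run one check and report the result; returns non-zero if the check fails.
check() {
  local name="$1"
  shift
  if "$@"; then
    echo "ok: $name"
  else
    echo "FAILED: $name"
    return 1
  fi
}

# True if the name resolves to at least one record.
resolves() { [ -n "$(dig +short "$1")" ]; }

# True if chrony reports a synchronized clock.
ntp_synced() { chronyc tracking | grep -q 'Leap status *: Normal'; }

# Walk the chain in dependency order: DNS, NTP, CA, Vault.
preflight() {
  check "DNS resolves ca.internal" resolves ca.internal || return 1
  check "NTP synchronized" ntp_synced || return 1
  check "CA healthy" step ca health || return 1
  check "Vault unsealed" vault status || return 1
}
</code></pre></div><p>Stopping at the first failure keeps the output focused on the lowest broken layer, which is usually the real cause of the failures above it.</p>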
<h2 id="時程與維護">Timeline and maintenance</h2>
<table>
  <thead>
      <tr>
          <th>Service</th>
          <th>Initial deployment</th>
          <th>Ongoing maintenance</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>CoreDNS</td>
          <td>2-4 hours</td>
          <td>Add zone records when new services appear (minutes)</td>
      </tr>
      <tr>
          <td>chrony</td>
          <td>1-2 hours</td>
          <td>Almost none (drift compensation runs automatically)</td>
      </tr>
      <tr>
          <td>step-ca</td>
          <td>3-4 hours</td>
          <td>Certificate expiry monitoring and renewal (near zero once automated)</td>
      </tr>
      <tr>
          <td>Vault</td>
          <td>4-8 hours</td>
          <td>Unseal key management, policy updates, backups</td>
      </tr>
  </tbody>
</table>
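<p>The four services add up to roughly 10-18 hours of initial deployment, or about 1.5-2 working days. After that, day-to-day maintenance concentrates on Vault (unseal key management and policy upkeep) and on DNS zone updates. Certificate renewal on the CA is close to zero maintenance if it is automated with ACME.</p>
<p>A framing for the conversation with management: &ldquo;These four services are the foundation for everything else. Without them, other services cannot find each other (DNS), their clocks disagree (NTP), their traffic is unencrypted (CA), and their passwords sit in config files (Vault). Deploy them once, and they run almost on their own afterwards.&rdquo;</p>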
<h2 id="跨分類引用">Cross-category references</h2>
<ul>
<li>→ <a href="/blog/infra/air-gapped/air-gapped-principles/" data-link-title="斷網環境的通用原則" data-link-desc="離線套件管理、內容搬運、變更追蹤的共通操作模式 — 所有斷網情境都要先建立的基礎能力">General principles for air-gapped environments</a>: common operating patterns for content ferry and offline package management</li>
<li>→ <a href="/blog/infra/air-gapped/air-gapped-iac/" data-link-title="斷網環境的 IaC" data-link-desc="Terraform provider mirror、離線 plugin cache、本地 state backend、沒有雲端時的 plan/apply 流程與內網 CI">IaC in air-gapped environments</a>: Vault as the secret backend for Terraform</li>
<li>→ <a href="/blog/infra/air-gapped/air-gapped-container/" data-link-title="斷網環境的容器與映像管理" data-link-desc="Private registry 架設、映像搬運（docker save/load、skopeo）、base image 更新週期、離線漏洞掃描">Containers and image management in air-gapped environments</a>: Harbor depends on DNS and TLS, and image pulls need to trust the internal CA</li>
<li>→ <a href="/blog/infra/02-identity-credentials/" data-link-title="模組二：身分與憑證地基 — IAM 與 OIDC" data-link-desc="IAM role / policy 設計、最小權限，以及用 OIDC 短期憑證取代長期 access key">Module 2: Identity and credential foundations</a>: Vault plays the role that Secrets Manager plays in the cloud</li>
<li>→ <a href="/blog/infra/08-governance-habits/" data-link-title="模組八：治理好習慣 — 規模長大後不失控的最小節奏" data-link-desc="tagging 規範、secrets 不進 code、成本可見性、最小可行節奏，規模長大後不失控">Module 8: Good governance habits</a>: the rule that secrets stay out of code is implemented with Vault in air-gapped environments</li>
</ul>
]]></content:encoded></item></channel></rss>