<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Ci-Cd on Tarragon</title><link>https://tarrragon.github.io/blog/tags/ci-cd/</link><description>Recent content in Ci-Cd on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Fri, 26 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/tags/ci-cd/index.xml" rel="self" type="application/rss+xml"/><item><title>infra 走 PR 流程與自動化護欄</title><link>https://tarrragon.github.io/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/</guid><description>&lt;p>infra 變更要走跟 application code 一樣的流程：開分支、提 PR、跑檢查、review diff、合併、發布。這條原則把基礎設施變更從「某個人在自己終端機 apply」轉成「團隊可審查的紀錄」，是 IaC 真正兌現價值的地方，也是解開「只有我懂 infra」這個單點依賴的關鍵。基礎設施跟程式碼一樣會出錯、會需要回溯、會交接給別人，所以它需要同一套保護機制。&lt;/p>
&lt;h2 id="infra-變更走-code-流程">infra 變更走 code 流程&lt;/h2>
&lt;p>infra 變更的標準路徑是 PR → plan → review diff → 合併 → apply。這個順序的核心責任是把「執行前先看清楚要改什麼」變成強制步驟，而不是 apply 之後才從事故裡發現改錯了。每個環節各自承擔一段審查責任，少掉任一段，infra 就退回到不可審查的狀態。&lt;/p>
&lt;h3 id="plan-是整條鏈最關鍵的一環">plan 是整條鏈最關鍵的一環&lt;/h3>
&lt;p>&lt;code>terraform plan&lt;/code> 把當前 state、雲端實際資源、與目標設定三方比對，產出一份「會新增 / 修改 / 刪除哪些資源」的 diff。這份 diff 是 review 的對象：reviewer 直接看 plan 算出來的實際變更，而非讀 HCL 自行想像結果。&lt;/p>
&lt;p>plan 輸出裡最關鍵的判讀訊號是操作類型。&lt;code>+&lt;/code> 是新增，&lt;code>~&lt;/code> 是就地更新，&lt;code>-&lt;/code> 是銷毀，&lt;code>-/+&lt;/code> 是先銷毀再重建。前兩者多數情境是安全的，後兩者需要逐行細看。改一個看似無害的欄位可能觸發整個資源重建（&lt;code>-/+&lt;/code>），例如某些雲資源的 &lt;code>name&lt;/code> 或 &lt;code>identifier&lt;/code> 是 immutable 屬性，改它的唯一方式就是銷毀再建。對有狀態的服務（RDS、帶資料的 EBS volume），&lt;code>-/+&lt;/code> 代表資料遺失或停機。Review 階段抓到這個 &lt;code>-/+&lt;/code>，比 apply 到一半才發現便宜太多。&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl"># plan 輸出中要特別警惕的標記
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl"># forces replacement — 某個 immutable 屬性被修改，將觸發銷毀重建
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl"># must be replaced — 跟上面同義，Terraform 新版的表達方式
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"># will be destroyed — 資源將被刪除
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl"> # aws_db_instance.primary must be replaced
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl"> -/+ resource &amp;#34;aws_db_instance&amp;#34; &amp;#34;primary&amp;#34; {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl"> ~ identifier = &amp;#34;app-prod&amp;#34; -&amp;gt; &amp;#34;app-production&amp;#34; # forces replacement
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl"> ...
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl"> }&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="把-plan-結果貼回-pr">把 plan 結果貼回 PR&lt;/h3>
&lt;p>把 plan 結果貼回 PR 是讓 review 真正生效的做法。流程上，PR 觸發 CI 跑 plan，plan 輸出回貼成 PR comment，reviewer 連同程式碼 diff 一起看；approve 後才允許合併，合併才觸發 apply。&lt;/p>
&lt;p>這裡有個取捨：plan 與 apply 之間若隔了很久，雲端實際狀態可能已經漂移（有人手動改了、或別的 PR 先 apply 了），導致 apply 時的 plan 跟 review 時看到的不一致。應對方式分保守與務實兩種。保守做法是 apply 前重跑一次 plan 並比對結果 — 一致才繼續，不一致就中斷。務實做法是在合併觸發 apply 時自動跑 plan 並只在無 destroy / replace 時自動執行，有 destroy / replace 就停下來要人確認。多數團隊從務實做法開始，到遇過一次 plan-apply 不一致的事故後才升級到保守做法。&lt;/p>
&lt;h3 id="apply-失敗的回退邊界">apply 失敗的回退邊界&lt;/h3>
&lt;p>infra apply 不像程式碼部署可以直接 rollback 到上一版 image — 中途失敗時部分資源已經建立、state 可能處於半完成狀態。例如 apply 建了一個新 subnet 但在建 route table 時 timeout，此時 subnet 存在於雲端和 state 裡，route table 只在雲端不在 state 裡（或反過來），下一次 plan 的計算基礎就不精準。&lt;/p></description><content:encoded><![CDATA[<p>infra 變更要走跟 application code 一樣的流程：開分支、提 PR、跑檢查、review diff、合併、發布。這條原則把基礎設施變更從「某個人在自己終端機 apply」轉成「團隊可審查的紀錄」，是 IaC 真正兌現價值的地方，也是解開「只有我懂 infra」這個單點依賴的關鍵。基礎設施跟程式碼一樣會出錯、會需要回溯、會交接給別人，所以它需要同一套保護機制。</p>
<h2 id="infra-變更走-code-流程">infra 變更走 code 流程</h2>
<p>infra 變更的標準路徑是 PR → plan → review diff → 合併 → apply。這個順序的核心責任是把「執行前先看清楚要改什麼」變成強制步驟，而不是 apply 之後才從事故裡發現改錯了。每個環節各自承擔一段審查責任，少掉任一段，infra 就退回到不可審查的狀態。</p>
<h3 id="plan-是整條鏈最關鍵的一環">plan 是整條鏈最關鍵的一環</h3>
<p><code>terraform plan</code> 把當前 state、雲端實際資源、與目標設定三方比對，產出一份「會新增 / 修改 / 刪除哪些資源」的 diff。這份 diff 是 review 的對象：reviewer 直接看 plan 算出來的實際變更，而非讀 HCL 自行想像結果。</p>
<p>plan 輸出裡最關鍵的判讀訊號是操作類型。<code>+</code> 是新增，<code>~</code> 是就地更新，<code>-</code> 是銷毀，<code>-/+</code> 是先銷毀再重建。前兩者多數情境是安全的，後兩者需要逐行細看。改一個看似無害的欄位可能觸發整個資源重建（<code>-/+</code>），例如某些雲資源的 <code>name</code> 或 <code>identifier</code> 是 immutable 屬性，改它的唯一方式就是銷毀再建。對有狀態的服務（RDS、帶資料的 EBS volume），<code>-/+</code> 代表資料遺失或停機。Review 階段抓到這個 <code>-/+</code>，比 apply 到一半才發現便宜太多。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl"># plan 輸出中要特別警惕的標記
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"># forces replacement  — 某個 immutable 屬性被修改，將觸發銷毀重建
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"># must be replaced    — 跟上面同義，Terraform 新版的表達方式
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"># will be destroyed   — 資源將被刪除
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">  # aws_db_instance.primary must be replaced
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">  -/+ resource &#34;aws_db_instance&#34; &#34;primary&#34; {
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">      ~ identifier = &#34;app-prod&#34; -&gt; &#34;app-production&#34;  # forces replacement
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">        ...
</span></span><span class="line"><span class="ln">10</span><span class="cl">    }</span></span></code></pre></div><h3 id="把-plan-結果貼回-pr">把 plan 結果貼回 PR</h3>
<p>把 plan 結果貼回 PR 是讓 review 真正生效的做法。流程上，PR 觸發 CI 跑 plan，plan 輸出回貼成 PR comment，reviewer 連同程式碼 diff 一起看；approve 後才允許合併，合併才觸發 apply。</p>
<p>這裡有個取捨：plan 與 apply 之間若隔了很久，雲端實際狀態可能已經漂移（有人手動改了、或別的 PR 先 apply 了），導致 apply 時的 plan 跟 review 時看到的不一致。應對方式分保守與務實兩種。保守做法是 apply 前重跑一次 plan 並比對結果 — 一致才繼續，不一致就中斷。務實做法是在合併觸發 apply 時自動跑 plan 並只在無 destroy / replace 時自動執行，有 destroy / replace 就停下來要人確認。多數團隊從務實做法開始，到遇過一次 plan-apply 不一致的事故後才升級到保守做法。</p>
<h3 id="apply-失敗的回退邊界">apply 失敗的回退邊界</h3>
<p>infra apply 不像程式碼部署可以直接 rollback 到上一版 image — 中途失敗時部分資源已經建立、state 可能處於半完成狀態。例如 apply 建了一個新 subnet 但在建 route table 時 timeout，此時 subnet 存在於雲端和 state 裡，route table 只在雲端不在 state 裡（或反過來），下一次 plan 的計算基礎就不精準。</p>
<p>應對的紀律是：apply 失敗後，先跑一次 <code>terraform plan</code> 確認 state 與現實的差距，再決定是修正 code 重新 apply 還是手動清理殘留資源後 <code>terraform state rm</code>。在清理之前不要再改 code、不要連發第二次 apply — 第二次 apply 在不確定的 state 上跑，可能把問題擴大。</p>
<p>PR 流程的價值在這裡不只是事前審查，也是事後可追溯：每次變更都對應一個 commit 與一個 PR，要回溯時知道是哪次改的、為什麼改、誰 review 的。</p>
<h2 id="fmt-與-validate最便宜的第一道檢查">fmt 與 validate：最便宜的第一道檢查</h2>
<p><code>fmt</code> 與 <code>validate</code> 是進到任何安全掃描之前的基礎檢查，責任是擋掉格式不一致與語法 / 型別錯誤這類不需要動腦判斷的問題。它們跑得快（通常不到五秒）、沒有誤判空間，適合放在 CI 最前面當作快速 fail 的關卡。</p>
<p><code>terraform fmt -check</code> 驗證程式碼是否符合標準排版。它本身不影響基礎設施行為，價值在於消除 diff 噪音：當每個人的編輯器縮排習慣不同，PR diff 會混入大量純排版變動，把真正的邏輯變更淹沒，reviewer 更容易看漏。統一格式後，diff 裡剩下的就是語意變更。在本地開發階段配合 editor plugin 或 pre-commit hook 在存檔時自動 fmt，讓 CI 的 fmt check 幾乎不會再 fail — 它存在的意義是攔住那些沒裝 plugin 的人。</p>
<p><code>validate</code> 則檢查設定在語法與內部一致性上是否成立 — reference 到不存在的變數、型別不匹配、必填參數缺漏、module 呼叫的 source 解析不了，這些在 validate 階段就會報錯，不必等到 plan 連線雲端才發現。validate 需要先跑 <code>terraform init</code>，但可以用 <code>-backend=false</code> 跳過連線 state backend，這樣在 CI 裡不需要雲端憑證就能跑完。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c"># .github/workflows/terraform.yml — plan 前的基礎檢查</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform fmt -check -recursive</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init -backend=false</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform validate</span></span></span></code></pre></div><p>判讀上，fmt 與 validate 失敗代表的是「這份 code 還沒準備好被認真 review」，屬於作者自己該先修掉的問題，不該佔用 reviewer 注意力。把它們設成 CI 必過的 gate，作者在本地就會先跑、先修，PR 送出時已經是乾淨的。</p>
<h2 id="tflint--checkov--tfsec抓壞寫法與安全漏洞">tflint / checkov / tfsec：抓壞寫法與安全漏洞</h2>
<p>fmt 與 validate 確認 code「語法正確」，但語法正確的設定仍然可能是危險的設定。tflint、checkov、tfsec 這類靜態掃描工具承擔的是「語意正確」這層：在不實際建立資源的前提下，從 HCL 裡比對已知的壞寫法與安全反模式，把問題擋在 plan 之前。它們補的是 reviewer 肉眼容易漏掉的盲區 — 人會看漏一個 <code>0.0.0.0/0</code>，規則不會。</p>
<h3 id="三者的側重">三者的側重</h3>
<table>
  <thead>
      <tr>
          <th>工具</th>
          <th>側重領域</th>
          <th>典型命中</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>tflint</td>
          <td>provider 層正確性與慣例</td>
          <td>棄用參數、region 不存在的 instance type、命名違規</td>
      </tr>
      <tr>
          <td>checkov</td>
          <td>安全與合規（CIS benchmark 導向）</td>
          <td>S3 公開、未加密、缺少 log、IAM 過寬</td>
      </tr>
      <tr>
          <td>tfsec</td>
          <td>安全反模式（HCL 結構導向）</td>
          <td>敏感埠全開、未加密、hardcode secret</td>
      </tr>
  </tbody>
</table>
<p>checkov 與 tfsec 的覆蓋範圍有重疊（都會掃 S3 公開與 SG 全開），差別在規則來源與報告格式。checkov 的規則對標 CIS benchmark 和多雲合規框架（AWS、Azure、GCP、Kubernetes），tfsec 更專注在 Terraform HCL 結構。兩者跑在一起時，重複的命中可以用其中一個的 skip 標記豁免。</p>
<h3 id="兩個最常攔下的反模式">兩個最常攔下的反模式</h3>
<p><strong>S3 bucket 對外公開</strong>。一個漏設 <code>block_public_access</code> 或 ACL 寫成 <code>public-read</code> 的 bucket，會讓裡面的物件對整個網際網路可讀。這類設定在 HCL 裡只是一兩行，肉眼 review 時很容易因為「看起來像樣板」而放過，但後果是資料外洩。checkov 規則 <code>CKV_AWS_19</code>（S3 bucket 未啟用 server-side encryption）和 <code>CKV_AWS_53</code>（block public access 未全開）會標記這類漏洞：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># checkov 會攔下的寫法 — 缺少 block_public_access
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">resource</span> <span class="s2">&#34;aws_s3_bucket&#34; &#34;data&#34;</span> {
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">  bucket</span> <span class="o">=</span> <span class="s2">&#34;acme-customer-data&#34;</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">}<span class="c1">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="c1">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="c1"># 正確寫法 — 顯式關閉公開存取
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="c1"></span><span class="k">resource</span> <span class="s2">&#34;aws_s3_bucket_public_access_block&#34; &#34;data&#34;</span> {
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="n">  bucket</span>                  <span class="o">=</span> <span class="k">aws_s3_bucket</span><span class="p">.</span><span class="k">data</span><span class="p">.</span><span class="k">id</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="n">  block_public_acls</span>       <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="n">  block_public_policy</span>     <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="n">  ignore_public_acls</span>      <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="n">  restrict_public_buckets</span> <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">}</span></span></code></pre></div><p><strong>Security group 對全世界開放</strong>。一條 ingress 寫成 <code>cidr_blocks = [&quot;0.0.0.0/0&quot;]</code> 加上 port 22 或 3306，等於把 SSH 或資料庫埠暴露給全網掃描器。tfsec 與 checkov 都會標記這種「敏感埠 + 全開 CIDR」的組合。這條規則跟<a href="/blog/infra/03-network-foundation/" data-link-title="模組三：網路地基 — VPC 與分層" data-link-desc="VPC、public / private subnet 切分、route table、NAT、security group 設計">模組三：網路地基</a>講的 security group 收斂原則是同一件事的兩端 — 模組三教怎麼把規則寫對，本章用靜態掃描確保寫錯時擋得下來。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 三道掃描串在一起，任一 fail 就中斷</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">tflint --recursive
</span></span><span class="line"><span class="ln">3</span><span class="cl">checkov -d . --quiet --compact
</span></span><span class="line"><span class="ln">4</span><span class="cl">tfsec . --soft-fail<span class="o">=</span>false</span></span></code></pre></div><h3 id="命中是候選不是判決">命中是候選不是判決</h3>
<p>判讀這些工具的命中時，要區分「真漏洞」與「情境合理的例外」。並非每個 <code>0.0.0.0/0</code> 都是錯 — 一個對外的 HTTPS load balancer 在 port 443 開全網是設計本意。所以掃描的命中是候選不是判決。</p>
<p>多數工具支援用行內註解標記豁免。checkov 用 <code>#checkov:skip=CKV_AWS_260:ALB 443 對外是設計本意</code>，tfsec 用 <code>#tfsec:ignore:aws-elb-alb-not-public</code>。豁免的紀律是：每個 skip 都要寫理由、要在 PR 裡可見。沒有理由的 skip 跟關掉整條規則沒有差別 — review 時看到無理由的 skip 應該當成跟看到裸 <code>0.0.0.0/0</code> 一樣的警報。</p>
<p>把例外顯式化、留下為什麼豁免的紀錄，比關掉整條規則安全。隨時間累積的 skip 也要定期盤點：某個當初合理的例外，在架構演進後可能已經不再合理。</p>
<h2 id="atlantis-與-github-actions自動化-plan-與-apply">Atlantis 與 GitHub Actions：自動化 plan 與 apply</h2>
<p>把上述流程自動化，需要一個能監聽 PR 事件、在對的時機跑 plan 與 apply 的執行層。兩種常見做法是直接用 CI 平台（如 GitHub Actions）寫 workflow，或用 Atlantis 這類專為 Terraform PR 流程設計的工具。</p>
<h3 id="atlantis">Atlantis</h3>
<p>Atlantis 是一個常駐服務，掛在 git 平台的 webhook 上。PR 開啟時它自動跑 <code>plan</code> 並把結果貼回 PR comment，reviewer approve 後在 PR 留言 <code>atlantis apply</code>，它才執行 apply 並回報結果。它的價值在於把「誰能 apply、apply 前要不要 approve、plan 結果在哪看」這些規則收斂成一致的、可設定的流程。</p>
<p>Atlantis 內建的 state lock 語意在多 PR 並行時特別有用：當兩個 PR 都改到同一個 Terraform project，第二個 PR 的 plan 會被 lock 擋住，直到第一個 apply 完成或 PR 關閉。這避免了兩個 PR 各自拿到的 plan 基於不同的 state 快照、apply 時互相覆蓋的問題。用 GitHub Actions 要自己實作這個 lock 邏輯（通常靠 Terraform 自己的 state lock + workflow concurrency group），複雜度高得多。</p>
<p>Atlantis 的代價是它本身是一個要部署、要升級、要保護的常駐服務 — 它持有對雲端的寫入權限，所以它的部署環境必須嚴格控制存取。</p>
<h3 id="github-actions">GitHub Actions</h3>
<p>GitHub Actions workflow 的優點是不必額外維運服務、跟既有 CI 共用同一套 runner。缺點是 apply 的 gating 邏輯要自己用 workflow 條件拼出來。一個完整的 workflow 通常分成兩個 job：PR 觸發 plan job（跑 fmt / validate / scan / plan、把結果貼回 PR），合併到 main 才觸發 apply job。</p>
<p>無論哪種執行層，自動化的 apply 都需要對雲端的寫入權限，而這個權限怎麼來是整條管線的安全根基。這裡正是<a href="/blog/infra/02-identity-credentials/" data-link-title="模組二：身分與憑證地基 — IAM 與 OIDC" data-link-desc="IAM role / policy 設計、最小權限，以及用 OIDC 短期憑證取代長期 access key">模組二：身分與憑證地基</a>鋪設的 OIDC 兌現的地方 — 管線不該存放長期的 access key，而是在 runner 執行時用 OIDC 向雲端換取短期 token。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c"># 合併到主幹後，用 OIDC 換短期憑證再 apply（呼應模組二）</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">apply</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">if</span><span class="p">:</span><span class="w"> </span><span class="l">github.ref == &#39;refs/heads/main&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">    </span><span class="nt">permissions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">      </span><span class="nt">id-token</span><span class="p">:</span><span class="w"> </span><span class="l">write  </span><span class="w"> </span><span class="c"># 允許 runner 取得 OIDC token</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">      </span><span class="nt">contents</span><span class="p">:</span><span class="w"> </span><span class="l">read</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">aws-actions/configure-aws-credentials@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">          </span><span class="nt">role-to-assume</span><span class="p">:</span><span class="w"> </span><span class="l">arn:aws:iam::123456789012:role/infra-apply</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">          </span><span class="nt">aws-region</span><span class="p">:</span><span class="w"> </span><span class="l">ap-northeast-1</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform apply -auto-approve</span></span></span></code></pre></div><h3 id="選型判準">選型判準</h3>
<table>
  <thead>
      <tr>
          <th>考量</th>
          <th>GitHub Actions</th>
          <th>Atlantis</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>維運成本</td>
          <td>無額外服務</td>
          <td>需部署 + 升級常駐服務</td>
      </tr>
      <tr>
          <td>state lock</td>
          <td>靠 Terraform 自身 + concurrency</td>
          <td>內建 project lock、跨 PR 互斥</td>
      </tr>
      <tr>
          <td>apply gating</td>
          <td>自己用 environment rule 拼</td>
          <td>內建 approve + <code>atlantis apply</code> 語意</td>
      </tr>
      <tr>
          <td>跨 repo 一致</td>
          <td>每 repo 各自寫 workflow</td>
          <td>一套 server config 管所有 repo</td>
      </tr>
      <tr>
          <td>適合規模</td>
          <td>少量 repo、簡單流程</td>
          <td>多 repo、需統一 apply 治理</td>
      </tr>
  </tbody>
</table>
<p>判讀自動 apply 的邊界：對會觸發資源重建或刪除的高風險 plan，多數團隊會保留人工 apply 的關卡（Atlantis 的手動 <code>atlantis apply</code>、或 workflow 加 environment protection rule 要人按確認），不讓這類變更在合併瞬間無人看管地執行。自動化的目的是消除重複勞動與人為遺漏，不是把判斷也一起省掉。</p>
<h2 id="知識留在-code而不是留在個人腦中">知識留在 code，而不是留在個人腦中</h2>
<p>走完整套 PR 流程後，infra 的真正收穫是知識從個人的記憶移到了 repo 裡。每一次「為什麼這個 security group 開這個埠」「為什麼這台機器選這個 instance type」的決策，都以 code + PR 描述 + review 討論的形式留下，新人讀 repo 就能還原當初的判斷，不必去問那個「只有他懂 infra」的人。基礎設施可被閱讀，等於它可被交接。PR 流程上線後，管理層可以從 repo 的 PR merge 歷史與 plan comment 確認所有 infra 變更都經過提案與審查——這本身就是稽核要求的變更紀錄證據，不需要額外產出。</p>
<h3 id="git-revert-的能力與邊界">git revert 的能力與邊界</h3>
<p>可 revert 是 PR 流程最直接的兌現。當某次變更引發問題，回退手段是 <code>git revert</code> 那個 commit 再走一次 PR 流程，讓基礎設施回到變更前的設定 — 跟回退一段壞掉的程式碼是同一個動作。對照手動操作的舊狀態：回退靠的是當事人記得自己改了什麼、手動在 Console 改回去，記錯或人不在就無從回退。把變更歷史留在 git，回退就從「依賴某人的記憶」變成「依賴版本紀錄」。</p>
<p>這份 revert 能力的邊界要講清楚。revert code 救得回的是「設定」，救不回已經被銷毀的狀態與資料：</p>
<ul>
<li>revert 掉一個刪除 RDS 的 commit，只是讓設定回到「該資源應該存在」。apply 時 Terraform 會試圖建一個新的空資料庫 — 但被刪掉的資料庫裡的資料不會跟著回來。</li>
<li>rename 或 replace 類的變更 revert 後，可能再觸發一次資源重建 — 因為 <code>identifier</code> 又改回去了，而 identifier 是 immutable 屬性。</li>
<li>apply 到一半失敗的 state 不能直接 revert code 修復，得先處理 state 與雲端現實的不一致。</li>
</ul>
<p>stateful 變更的真正回退仍然靠備份與快照，這正是<a href="/blog/infra/05-core-services/" data-link-title="模組五：核心服務上 IaC" data-link-desc="資料庫、運算、儲存、load balancer 怎麼寫進基礎設施程式碼，以及上線順序">模組五：核心服務上 IaC</a> stateful 處理與<a href="/blog/infra/08-governance-habits/" data-link-title="模組八：治理好習慣 — 規模長大後不失控的最小節奏" data-link-desc="tagging 規範、secrets 不進 code、成本可見性、最小可行節奏，規模長大後不失控">模組八：治理好習慣</a> secret / state 保護要顧的事。把 git revert 當「設定層回退」就誠實，把它當「資料層回退」就會在事故裡踩空。</p>
<h3 id="知識共享的判讀訊號">知識共享的判讀訊號</h3>
<p>判讀一個團隊是否確實把知識留在 code 的訊號：當主要負責 infra 的人請假，其他人能不能只靠讀 repo 就理解現狀並安全地改一個小設定。如果答案是「得等他回來」，那不論工具鏈多完整，知識還在個人腦中，PR 流程只是形式。這個訊號比任何工具設定都更能反映 infra 的成熟度。</p>
<p>讓知識真正從個人腦中搬進 repo 的方式，除了 PR 流程本身，還需要組織層的配合 — 刻意的 review 輪替、on-call 輪值、配對操作。這條路線在<a href="/blog/infra/09-driving-adoption/" data-link-title="模組九：怎麼把 infra 推動起來" data-link-desc="技術正確不等於推得動 — 信任赤字、期望值對齊、知識共享，infra 落地的組織課題">模組九：怎麼把 infra 推動起來</a>展開到組織層。本章解決的是技術機制 — code 留得住知識；模組九解決的是怎麼讓團隊實際願意走這套流程、把知識交出來。</p>
<h2 id="跨分類引用">跨分類引用</h2>
<ul>
<li>→ <a href="/blog/ci/" data-link-title="CI/CD 教學" data-link-desc="整理 CI/CD 的驗證、建置、發布 gate 與不同部署場域的流程差異，讓每次變更都能被穩定驗證與交付">CI/CD 教學</a>：infra 管線用的就是這套驗證 / 發布 gate，plan / apply 對應 build / deploy 階段</li>
<li>→ <a href="/blog/infra/02-identity-credentials/" data-link-title="模組二：身分與憑證地基 — IAM 與 OIDC" data-link-desc="IAM role / policy 設計、最小權限，以及用 OIDC 短期憑證取代長期 access key">模組二：身分與憑證地基</a>：管線用 OIDC 取得 apply 權限，本章是該章 OIDC 設計的回報兌現處</li>
<li>→ <a href="/blog/infra/03-network-foundation/" data-link-title="模組三：網路地基 — VPC 與分層" data-link-desc="VPC、public / private subnet 切分、route table、NAT、security group 設計">模組三：網路地基</a>：security group 收斂原則，本章用 tfsec / checkov 在 CI 攔下寫錯的全開規則</li>
<li>→ <a href="/blog/infra/05-core-services/" data-link-title="模組五：核心服務上 IaC" data-link-desc="資料庫、運算、儲存、load balancer 怎麼寫進基礎設施程式碼，以及上線順序">模組五：核心服務上 IaC</a>：stateful 資源的保護策略，git revert 救不回資料層</li>
<li>→ <a href="/blog/infra/08-governance-habits/" data-link-title="模組八：治理好習慣 — 規模長大後不失控的最小節奏" data-link-desc="tagging 規範、secrets 不進 code、成本可見性、最小可行節奏，規模長大後不失控">模組八：治理好習慣</a>：secret / state 保護</li>
<li>→ <a href="/blog/infra/09-driving-adoption/" data-link-title="模組九：怎麼把 infra 推動起來" data-link-desc="技術正確不等於推得動 — 信任赤字、期望值對齊、知識共享，infra 落地的組織課題">模組九：怎麼把 infra 推動起來</a>：本章把知識留在 code 的技術機制，在該章展開成組織層的採用與知識共享</li>
<li>→ <a href="/blog/backend/07-security-data-protection/" data-link-title="模組七：資安與資料保護" data-link-desc="以問題驅動方式擴充資安知識網：先定義服務環節問題，再以案例作為觸發式參考">backend 模組七：資安與資料保護</a>：S3 公開、敏感埠全開這類掃描攔截的反模式，對應的資料保護原則</li>
<li>→ <a href="/blog/infra/02-identity-credentials/team-access-management/" data-link-title="團隊權限分級與存取管理" data-link-desc="用 admin / operator / viewer 三級劃分團隊成員的雲端操作權限，設計臨時提權流程、定期 access review 節奏，以及 contractor 與外部 vendor 的存取邊界">團隊權限分級</a>：權限變更走 PR 流程，讓 policy 調整有審查紀錄</li>
<li>→ <a href="/blog/infra/08-governance-habits/handover-design/" data-link-title="職務交接與存取撤銷設計" data-link-desc="人員異動時的存取撤銷順序、credential rotation、最小交接清單，以及讓交接成本結構性降低的 infra 設計原則">職務交接設計</a>：PR 歷史是交接時的知識載體</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/terraform-ci-pipeline-setup/" data-link-title="Terraform CI Pipeline 設定指南" data-link-desc="用 GitHub Actions 建立完整的 Terraform CI pipeline：fmt → validate → tflint → plan → PR comment → apply，含 OIDC credential 與環境保護規則">Terraform CI Pipeline 設定指南</a>：GitHub Actions 完整 workflow</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/checkov-tfsec-rule-customization/" data-link-title="checkov 與 tfsec 規則配置" data-link-desc="靜態掃描工具的規則選擇策略、自訂規則、豁免管理、false positive 處理與 CI 整合，讓掃描從噪音來源變成可信的品質關卡">checkov 與 tfsec 規則配置</a>：規則選擇、豁免管理、CI 整合</li>
</ul>
]]></content:encoded></item><item><title>Terraform CI Pipeline 設定指南</title><link>https://tarrragon.github.io/blog/infra/07-infra-as-pr/terraform-ci-pipeline-setup/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/07-infra-as-pr/terraform-ci-pipeline-setup/</guid><description>&lt;p>Terraform 的 PR 流程要發揮價值，plan 和 apply 需要在 CI 裡自動執行，而非在工程師的本機跑。本篇用 GitHub Actions 建立一條完整的 pipeline：PR 開啟時跑檢查和 plan、plan 結果貼回 PR comment 讓 reviewer 看、合併到主幹後才 apply。整條管線的 credential 用 OIDC 取得短期 token（見 &lt;a href="https://tarrragon.github.io/blog/infra/02-identity-credentials/oidc-trust-policy-setup/" data-link-title="OIDC Trust Policy 設定指南" data-link-desc="GitHub Actions 與 AWS 之間的 OIDC 聯合設定：建立 provider、設計 trust policy 的 claim 收斂、plan 與 apply role 分離、常見錯誤排查">OIDC Trust Policy 設定&lt;/a>），不存任何長期 key。&lt;/p>
&lt;h2 id="pipeline-的兩個階段">Pipeline 的兩個階段&lt;/h2>
&lt;p>整條 pipeline 分成兩個觸發時機，各自承擔不同責任：&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>階段&lt;/th>
 &lt;th>觸發條件&lt;/th>
 &lt;th>責任&lt;/th>
 &lt;th>失敗時&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Plan&lt;/td>
 &lt;td>PR 開啟或更新&lt;/td>
 &lt;td>檢查格式、驗證語法、靜態掃描、產出 plan diff&lt;/td>
 &lt;td>PR 無法合併&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Apply&lt;/td>
 &lt;td>合併到 main&lt;/td>
 &lt;td>把 plan 過的變更套用到雲端&lt;/td>
 &lt;td>需要人工介入&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>兩個階段用不同的 IAM role：plan role 只有唯讀權限（能跑 &lt;code>terraform plan&lt;/code> 但不能改任何資源），apply role 有寫入權限。這個分離確保 PR 階段的任何 code 都沒辦法偷偷改動雲端資源。&lt;/p>
&lt;h2 id="plan-階段的完整-workflow">Plan 階段的完整 workflow&lt;/h2>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Terraform Plan&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">on&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">pull_request&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">paths&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="s1">&amp;#39;infra/**&amp;#39;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">permissions&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">id-token&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">write&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">contents&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">read&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">pull-requests&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">write&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">jobs&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">plan&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">runs-on&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">ubuntu-latest&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">defaults&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">working-directory&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">infra/environments/prod&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">18&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">19&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">steps&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">20&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">actions/checkout@v4&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">21&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">22&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">aws-actions/configure-aws-credentials@v4&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">23&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">with&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">24&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">role-to-assume&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">arn:aws:iam::123456789012:role/infra-plan&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">25&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">aws-region&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">ap-northeast-1&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">26&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">27&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">hashicorp/setup-terraform@v3&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">28&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">with&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">29&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">terraform_version&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">1.9.0&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">30&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">31&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Format check&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">32&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform fmt -check -recursive -diff&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">33&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">34&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Init&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">35&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform init -input=false&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">36&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">37&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Validate&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">38&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform validate&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">39&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">40&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">TFLint&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">41&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform-linters/setup-tflint@v4&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">42&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">with&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">43&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">tflint_version&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">latest&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">44&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">tflint --recursive --format compact&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">45&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">46&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Plan&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">47&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">id&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">plan&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">48&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">|&lt;/span>&lt;span class="sd">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">49&lt;/span>&lt;span class="cl">&lt;span class="sd"> terraform plan -no-color -input=false -out=tfplan \
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">50&lt;/span>&lt;span class="cl">&lt;span class="sd"> -detailed-exitcode 2&amp;gt;&amp;amp;1 | tee plan-output.txt&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">51&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">continue-on-error&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">true&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">52&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">53&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Comment plan on PR&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">54&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">actions/github-script@v7&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">55&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">with&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">56&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">script&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">|&lt;/span>&lt;span class="sd">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">57&lt;/span>&lt;span class="cl">&lt;span class="sd"> const fs = require(&amp;#39;fs&amp;#39;);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">58&lt;/span>&lt;span class="cl">&lt;span class="sd"> const plan = fs.readFileSync(&amp;#39;infra/environments/prod/plan-output.txt&amp;#39;, &amp;#39;utf8&amp;#39;);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">59&lt;/span>&lt;span class="cl">&lt;span class="sd"> const truncated = plan.length &amp;gt; 60000
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">60&lt;/span>&lt;span class="cl">&lt;span class="sd"> ? plan.substring(0, 60000) + &amp;#39;\n\n... (truncated)&amp;#39;
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">61&lt;/span>&lt;span class="cl">&lt;span class="sd"> : plan;
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">62&lt;/span>&lt;span class="cl">&lt;span class="sd"> await github.rest.issues.createComment({
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">63&lt;/span>&lt;span class="cl">&lt;span class="sd"> owner: context.repo.owner,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">64&lt;/span>&lt;span class="cl">&lt;span class="sd"> repo: context.repo.repo,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">65&lt;/span>&lt;span class="cl">&lt;span class="sd"> issue_number: context.issue.number,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">66&lt;/span>&lt;span class="cl">&lt;span class="sd"> body: `### Terraform Plan\n\`\`\`\n${truncated}\n\`\`\``
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">67&lt;/span>&lt;span class="cl">&lt;span class="sd"> });&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">68&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">69&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Fail if plan errored&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">70&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">if&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">steps.plan.outcome == &amp;#39;failure&amp;#39;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">71&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">exit 1&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="各步驟的職責">各步驟的職責&lt;/h3>
&lt;p>&lt;strong>Format check&lt;/strong> 驗證 HCL 是否符合標準排版。它不影響功能，但消除 diff 噪音——排版不一致時 PR diff 會混入純格式變更，reviewer 分不清哪些是邏輯改動。&lt;code>-diff&lt;/code> flag 讓 CI 輸出具體哪幾行不符合，作者在本地跑 &lt;code>terraform fmt&lt;/code> 就能修。&lt;/p></description><content:encoded><![CDATA[<p>Terraform 的 PR 流程要發揮價值，plan 和 apply 需要在 CI 裡自動執行，而非在工程師的本機跑。本篇用 GitHub Actions 建立一條完整的 pipeline：PR 開啟時跑檢查和 plan、plan 結果貼回 PR comment 讓 reviewer 看、合併到主幹後才 apply。整條管線的 credential 用 OIDC 取得短期 token（見 <a href="/blog/infra/02-identity-credentials/oidc-trust-policy-setup/" data-link-title="OIDC Trust Policy 設定指南" data-link-desc="GitHub Actions 與 AWS 之間的 OIDC 聯合設定：建立 provider、設計 trust policy 的 claim 收斂、plan 與 apply role 分離、常見錯誤排查">OIDC Trust Policy 設定</a>），不存任何長期 key。</p>
<h2 id="pipeline-的兩個階段">Pipeline 的兩個階段</h2>
<p>整條 pipeline 分成兩個觸發時機，各自承擔不同責任：</p>
<table>
  <thead>
      <tr>
          <th>階段</th>
          <th>觸發條件</th>
          <th>責任</th>
          <th>失敗時</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Plan</td>
          <td>PR 開啟或更新</td>
          <td>檢查格式、驗證語法、靜態掃描、產出 plan diff</td>
          <td>PR 無法合併</td>
      </tr>
      <tr>
          <td>Apply</td>
          <td>合併到 main</td>
          <td>把 plan 過的變更套用到雲端</td>
          <td>需要人工介入</td>
      </tr>
  </tbody>
</table>
<p>兩個階段用不同的 IAM role：plan role 只有唯讀權限（能跑 <code>terraform plan</code> 但不能改任何資源），apply role 有寫入權限。這個分離確保 PR 階段的任何 code 都沒辦法偷偷改動雲端資源。</p>
<h2 id="plan-階段的完整-workflow">Plan 階段的完整 workflow</h2>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Terraform Plan</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">on</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">pull_request</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">      </span>- <span class="s1">&#39;infra/**&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w"></span><span class="nt">permissions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">  </span><span class="nt">id-token</span><span class="p">:</span><span class="w"> </span><span class="l">write</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">  </span><span class="nt">contents</span><span class="p">:</span><span class="w"> </span><span class="l">read</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">  </span><span class="nt">pull-requests</span><span class="p">:</span><span class="w"> </span><span class="l">write</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">  </span><span class="nt">plan</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">    </span><span class="nt">defaults</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">      </span><span class="nt">run</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">        </span><span class="nt">working-directory</span><span class="p">:</span><span class="w"> </span><span class="l">infra/environments/prod</span><span class="w">
</span></span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">21</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">22</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">aws-actions/configure-aws-credentials@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">23</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">24</span><span class="cl"><span class="w">          </span><span class="nt">role-to-assume</span><span class="p">:</span><span class="w"> </span><span class="l">arn:aws:iam::123456789012:role/infra-plan</span><span class="w">
</span></span></span><span class="line"><span class="ln">25</span><span class="cl"><span class="w">          </span><span class="nt">aws-region</span><span class="p">:</span><span class="w"> </span><span class="l">ap-northeast-1</span><span class="w">
</span></span></span><span class="line"><span class="ln">26</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">27</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln">28</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">29</span><span class="cl"><span class="w">          </span><span class="nt">terraform_version</span><span class="p">:</span><span class="w"> </span><span class="m">1.9.0</span><span class="w">
</span></span></span><span class="line"><span class="ln">30</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">31</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Format check</span><span class="w">
</span></span></span><span class="line"><span class="ln">32</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform fmt -check -recursive -diff</span><span class="w">
</span></span></span><span class="line"><span class="ln">33</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">34</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Init</span><span class="w">
</span></span></span><span class="line"><span class="ln">35</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init -input=false</span><span class="w">
</span></span></span><span class="line"><span class="ln">36</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">37</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Validate</span><span class="w">
</span></span></span><span class="line"><span class="ln">38</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform validate</span><span class="w">
</span></span></span><span class="line"><span class="ln">39</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">40</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">TFLint</span><span class="w">
</span></span></span><span class="line"><span class="ln">41</span><span class="cl"><span class="w">        </span><span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">terraform-linters/setup-tflint@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">42</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">43</span><span class="cl"><span class="w">          </span><span class="nt">tflint_version</span><span class="p">:</span><span class="w"> </span><span class="l">latest</span><span class="w">
</span></span></span><span class="line"><span class="ln">44</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">tflint --recursive --format compact</span><span class="w">
</span></span></span><span class="line"><span class="ln">45</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">46</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Plan</span><span class="w">
</span></span></span><span class="line"><span class="ln">47</span><span class="cl"><span class="w">        </span><span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="l">plan</span><span class="w">
</span></span></span><span class="line"><span class="ln">48</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="ln">49</span><span class="cl"><span class="sd">          terraform plan -no-color -input=false -out=tfplan \
</span></span></span><span class="line"><span class="ln">50</span><span class="cl"><span class="sd">            -detailed-exitcode 2&gt;&amp;1 | tee plan-output.txt</span><span class="w">
</span></span></span><span class="line"><span class="ln">51</span><span class="cl"><span class="w">        </span><span class="nt">continue-on-error</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="ln">52</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">53</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Comment plan on PR</span><span class="w">
</span></span></span><span class="line"><span class="ln">54</span><span class="cl"><span class="w">        </span><span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/github-script@v7</span><span class="w">
</span></span></span><span class="line"><span class="ln">55</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">56</span><span class="cl"><span class="w">          </span><span class="nt">script</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="ln">57</span><span class="cl"><span class="sd">            const fs = require(&#39;fs&#39;);
</span></span></span><span class="line"><span class="ln">58</span><span class="cl"><span class="sd">            const plan = fs.readFileSync(&#39;infra/environments/prod/plan-output.txt&#39;, &#39;utf8&#39;);
</span></span></span><span class="line"><span class="ln">59</span><span class="cl"><span class="sd">            const truncated = plan.length &gt; 60000
</span></span></span><span class="line"><span class="ln">60</span><span class="cl"><span class="sd">              ? plan.substring(0, 60000) + &#39;\n\n... (truncated)&#39;
</span></span></span><span class="line"><span class="ln">61</span><span class="cl"><span class="sd">              : plan;
</span></span></span><span class="line"><span class="ln">62</span><span class="cl"><span class="sd">            await github.rest.issues.createComment({
</span></span></span><span class="line"><span class="ln">63</span><span class="cl"><span class="sd">              owner: context.repo.owner,
</span></span></span><span class="line"><span class="ln">64</span><span class="cl"><span class="sd">              repo: context.repo.repo,
</span></span></span><span class="line"><span class="ln">65</span><span class="cl"><span class="sd">              issue_number: context.issue.number,
</span></span></span><span class="line"><span class="ln">66</span><span class="cl"><span class="sd">              body: `### Terraform Plan\n\`\`\`\n${truncated}\n\`\`\``
</span></span></span><span class="line"><span class="ln">67</span><span class="cl"><span class="sd">            });</span><span class="w">
</span></span></span><span class="line"><span class="ln">68</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">69</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Fail if plan errored</span><span class="w">
</span></span></span><span class="line"><span class="ln">70</span><span class="cl"><span class="w">        </span><span class="nt">if</span><span class="p">:</span><span class="w"> </span><span class="l">steps.plan.outcome == &#39;failure&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln">71</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">exit 1</span></span></span></code></pre></div><h3 id="各步驟的職責">各步驟的職責</h3>
<p><strong>Format check</strong> 驗證 HCL 是否符合標準排版。它不影響功能，但消除 diff 噪音——排版不一致時 PR diff 會混入純格式變更，reviewer 分不清哪些是邏輯改動。<code>-diff</code> flag 讓 CI 輸出具體哪幾行不符合，作者在本地跑 <code>terraform fmt</code> 就能修。</p>
<p><strong>Init</strong> 初始化 provider 和 backend。<code>-input=false</code> 避免 CI 卡在等待互動式輸入。如果 backend 設定錯了（bucket 不存在、權限不足），這一步就會失敗，不會跑到後面浪費時間。</p>
<p><strong>Validate</strong> 檢查 HCL 的語法和內部一致性——變數沒宣告、型別不匹配、必填參數缺漏。它不連線雲端，只讀 code，所以不需要 AWS credential 也能跑（但放在 init 之後是因為 validate 需要 provider schema）。</p>
<p><strong>TFLint</strong> 做 provider 層的正確性檢查：instance type 在該 region 不存在、已棄用的參數、命名不符規範。它補的是 validate 抓不到的「語法對但值不對」的問題。</p>
<p><strong>Plan</strong> 是整條 pipeline 的核心產出。<code>-detailed-exitcode</code> 讓 exit code 區分三種狀態：0 = 無差異、1 = 錯誤、2 = 有差異。<code>-out=tfplan</code> 把 plan 結果存成二進位檔，apply 階段可以直接用這份 plan 執行，避免 plan 和 apply 之間的時間差導致不一致。</p>
<p><strong>Comment</strong> 把 plan 輸出貼回 PR，reviewer 看 code diff 的同時看到 plan 的實際變更。plan 輸出可能很長（幾百行），超過 GitHub comment 上限時截斷，但保留開頭（通常包含 add/change/destroy 的摘要行）。</p>
<h2 id="apply-階段">Apply 階段</h2>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Terraform Apply</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">on</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">push</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">branches</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">main]</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">      </span>- <span class="s1">&#39;infra/**&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="nt">permissions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">  </span><span class="nt">id-token</span><span class="p">:</span><span class="w"> </span><span class="l">write</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">  </span><span class="nt">contents</span><span class="p">:</span><span class="w"> </span><span class="l">read</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">  </span><span class="nt">apply</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">    </span><span class="nt">environment</span><span class="p">:</span><span class="w"> </span><span class="l">production</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">    </span><span class="nt">defaults</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">      </span><span class="nt">run</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="w">        </span><span class="nt">working-directory</span><span class="p">:</span><span class="w"> </span><span class="l">infra/environments/prod</span><span class="w">
</span></span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">21</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">22</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">23</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">aws-actions/configure-aws-credentials@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">24</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">25</span><span class="cl"><span class="w">          </span><span class="nt">role-to-assume</span><span class="p">:</span><span class="w"> </span><span class="l">arn:aws:iam::123456789012:role/infra-apply</span><span class="w">
</span></span></span><span class="line"><span class="ln">26</span><span class="cl"><span class="w">          </span><span class="nt">aws-region</span><span class="p">:</span><span class="w"> </span><span class="l">ap-northeast-1</span><span class="w">
</span></span></span><span class="line"><span class="ln">27</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">28</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln">29</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">30</span><span class="cl"><span class="w">          </span><span class="nt">terraform_version</span><span class="p">:</span><span class="w"> </span><span class="m">1.9.0</span><span class="w">
</span></span></span><span class="line"><span class="ln">31</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">32</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Init</span><span class="w">
</span></span></span><span class="line"><span class="ln">33</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init -input=false</span><span class="w">
</span></span></span><span class="line"><span class="ln">34</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">35</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Plan (verify)</span><span class="w">
</span></span></span><span class="line"><span class="ln">36</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform plan -no-color -input=false -detailed-exitcode</span><span class="w">
</span></span></span><span class="line"><span class="ln">37</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">38</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Apply</span><span class="w">
</span></span></span><span class="line"><span class="ln">39</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform apply -auto-approve -input=false</span></span></span></code></pre></div><h3 id="environment-protection-rule">environment protection rule</h3>
<p><code>environment: production</code> 這一行啟用 GitHub 的環境保護功能。在 repo 的 Settings → Environments → production 設定：</p>
<ul>
<li><strong>Required reviewers</strong>：指定至少一個人 approve 才能執行 apply job</li>
<li><strong>Wait timer</strong>：合併後等 N 分鐘才開始 apply（給人反應時間）</li>
<li><strong>Deployment branches</strong>：限定只有 main branch 能觸發</li>
</ul>
<p>這層保護讓高風險的變更（plan 顯示 destroy 或 replace）在 apply 前多一道人工確認。日常低風險變更（加一個 tag、調一個參數）可以直接通過。取捨點是：每次 apply 都要人按確認會拖慢頻繁的小變更，可以用 deployment rule 的條件只攔 production 環境。</p>
<h3 id="apply-階段重跑-plan-的理由">Apply 階段重跑 plan 的理由</h3>
<p>apply 之前重跑一次 plan，是為了驗證合併後的現實跟 PR review 時看到的一致。PR 從開啟到合併可能隔了幾小時或幾天，期間有人可能手動改了雲端資源（drift）或別的 PR 先 apply 了。重跑 plan 確認差異跟預期一致，不一致就停下來而非盲目 apply。</p>
<p>如果使用了 plan 階段的 <code>-out=tfplan</code> 保存 plan 檔，apply 可以改為 <code>terraform apply tfplan</code> 直接執行已 review 過的 plan。代價是 plan 檔需要跨 job 傳遞（GitHub Actions 的 artifact），且 plan 檔有時效——state 在 plan 之後被修改，apply 會拒絕執行。</p>
<h2 id="多環境的-pipeline-設計">多環境的 pipeline 設計</h2>
<p>管理 dev / staging / prod 三個環境時，pipeline 有兩種常見結構：</p>
<p><strong>單 workflow 加 matrix</strong>：一份 YAML 用 <code>strategy.matrix</code> 跑三個環境，每個環境有自己的 working directory 和 IAM role。好處是維護一份 YAML；代價是三個環境的 plan 都在同一次 PR run 裡，reviewer 要看三份 plan 輸出。</p>
<p><strong>每環境獨立 workflow</strong>：三份 YAML 各自觸發在對應環境目錄的變更上（<code>paths: ['infra/environments/dev/**']</code>）。好處是只有改到的環境才跑、PR comment 乾淨；代價是三份 YAML 有重複。</p>
<p>多數團隊起步時用單 workflow + matrix，環境數量超過三個或各環境的 apply 策略不同（dev 自動、prod 要 approval）時切到獨立 workflow。</p>
<h2 id="安全邊界">安全邊界</h2>
<p>CI pipeline 是 infra 變更的自動化執行者，它的安全性等同於 apply role 的權限。幾個邊界要守住：</p>
<p><strong>OIDC claim 收斂</strong>：apply role 的 trust policy 只允許特定 repo 的 main branch 假扮（見 <a href="/blog/infra/02-identity-credentials/oidc-trust-policy-setup/" data-link-title="OIDC Trust Policy 設定指南" data-link-desc="GitHub Actions 與 AWS 之間的 OIDC 聯合設定：建立 provider、設計 trust policy 的 claim 收斂、plan 與 apply role 分離、常見錯誤排查">OIDC Trust Policy 設定</a>）。如果 claim 只驗 repo 不驗 branch，任何人在 feature branch 推一個修改過的 workflow 就能觸發 apply。</p>
<p><strong>Workflow 修改的 review</strong>：<code>.github/workflows/</code> 底下的 YAML 變更應該跟 infra code 一樣走 PR review。修改 workflow 等於修改 pipeline 的行為——加一個 <code>terraform destroy</code> step 就能在合併時清掉整個環境。GitHub 的 CODEOWNERS 功能可以強制特定人 review workflow 變更。</p>
<p><strong>Secret 與 environment variable</strong>：OIDC 取代了存在 repo secrets 裡的 access key，但 workflow 可能還用到其他 secret（Terraform Cloud token、Slack webhook URL）。這些 secret 要限定在特定 environment 才能存取，不開放給所有 branch。</p>
<p>本篇聚焦 GitHub Actions。如果團隊選擇 Atlantis（常駐服務、內建 state lock 與 apply 語意），見<a href="/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/" data-link-title="infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">主文章的 Atlantis 段</a>的選型討論。</p>
<h2 id="跨分類引用">跨分類引用</h2>
<ul>
<li>→ <a href="/blog/infra/02-identity-credentials/oidc-trust-policy-setup/" data-link-title="OIDC Trust Policy 設定指南" data-link-desc="GitHub Actions 與 AWS 之間的 OIDC 聯合設定：建立 provider、設計 trust policy 的 claim 收斂、plan 與 apply role 分離、常見錯誤排查">OIDC Trust Policy 設定</a>：pipeline 的 credential 來源</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/checkov-tfsec-rule-customization/" data-link-title="checkov 與 tfsec 規則配置" data-link-desc="靜態掃描工具的規則選擇策略、自訂規則、豁免管理、false positive 處理與 CI 整合，讓掃描從噪音來源變成可信的品質關卡">checkov / tfsec 規則配置</a>：pipeline 裡的靜態安全掃描怎麼配</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/" data-link-title="infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">infra 走 PR 流程與自動化護欄</a>：pipeline 背後的審查原則</li>
<li>→ <a href="/blog/infra/04-environment-separation/" data-link-title="模組四：環境分離與模組化" data-link-desc="dev / staging / prod 切分、目錄結構 vs workspace、用可重用 module 避免環境漂移">模組四：環境分離與模組化</a>：多環境的目錄結構決定 pipeline 的 working directory</li>
</ul>
]]></content:encoded></item><item><title>checkov 與 tfsec 規則配置</title><link>https://tarrragon.github.io/blog/infra/07-infra-as-pr/checkov-tfsec-rule-customization/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/07-infra-as-pr/checkov-tfsec-rule-customization/</guid><description>&lt;p>checkov 和 tfsec 安裝後直接跑，通常會產出幾十到幾百條命中。全部修完不切實際、全部忽略又失去價值。這篇處理的是怎麼從「裝了工具」走到「工具的產出可信且可操作」——規則選擇、嚴重度過濾、豁免管理、自訂規則、CI 整合，以及 false positive 的處理流程。&lt;/p>
&lt;h2 id="規則選擇策略">規則選擇策略&lt;/h2>
&lt;p>兩個工具的內建規則集都超過數百條，涵蓋從加密設定到命名慣例。全開跑會讓命中清單長到沒人看。規則選擇的判準是「這條規則命中後，團隊會不會真的去修」——答案是不會的規則，開著只是製造噪音。&lt;/p>
&lt;h3 id="分層啟用">分層啟用&lt;/h3>
&lt;p>把規則分成三層逐步啟用，而非一次全開：&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>層次&lt;/th>
 &lt;th>規則類型&lt;/th>
 &lt;th>範例&lt;/th>
 &lt;th>啟用時機&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>地基層&lt;/td>
 &lt;td>資料外洩與權限失控&lt;/td>
 &lt;td>S3 public access、SG 0.0.0.0/0、IAM wildcard&lt;/td>
 &lt;td>day 1&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>營運層&lt;/td>
 &lt;td>加密與備份&lt;/td>
 &lt;td>RDS encryption、EBS encryption、backup retention&lt;/td>
 &lt;td>IaC 覆蓋率 &amp;gt;50%&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>規範層&lt;/td>
 &lt;td>命名、tagging、logging&lt;/td>
 &lt;td>缺 tag、缺 log group、resource naming&lt;/td>
 &lt;td>治理成熟後&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>地基層是即使其他規則都關掉也要開的——S3 bucket 對外公開（&lt;code>CKV_AWS_19&lt;/code>、&lt;code>CKV_AWS_53&lt;/code>）和 security group 全開（&lt;code>CKV_AWS_24&lt;/code>、&lt;code>CKV_AWS_25&lt;/code>）這類規則命中就是真問題。營運層在 IaC 覆蓋率夠高時啟用，否則會掃到大量不在 IaC 管理內的資源。規範層等團隊有能力消化命中量再開。&lt;/p>
&lt;h3 id="checkov-的規則過濾">checkov 的規則過濾&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># 只跑地基層規則&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">checkov -d . --check CKV_AWS_19,CKV_AWS_53,CKV_AWS_24,CKV_AWS_25,CKV_AWS_40,CKV_AWS_145
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># 或者用 framework 過濾（只掃 Terraform）&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">checkov -d . --framework terraform --compact --quiet&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>checkov 支援 &lt;code>--check&lt;/code>（白名單，只跑這些）和 &lt;code>--skip-check&lt;/code>（黑名單，跳過這些）。初期用 &lt;code>--check&lt;/code> 白名單比較可控——明確列出要跑的規則，而非從全集去扣。隨著團隊消化能力提升再擴大白名單。&lt;/p>
&lt;h3 id="tfsec-的嚴重度過濾">tfsec 的嚴重度過濾&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># 只報 CRITICAL 和 HIGH&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">tfsec . --minimum-severity HIGH
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># 排除特定規則&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">tfsec . --exclude aws-s3-specify-public-access-block&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>tfsec 的嚴重度分 CRITICAL / HIGH / MEDIUM / LOW。初期設 &lt;code>--minimum-severity HIGH&lt;/code> 把低嚴重度的過濾掉，減少噪音量。降低閾值的時機是 HIGH 以上的命中清零後。&lt;/p>
&lt;h2 id="豁免管理">豁免管理&lt;/h2>
&lt;p>不是每個命中都是錯——對外的 ALB 在 port 443 開 &lt;code>0.0.0.0/0&lt;/code> 是設計意圖、不是漏洞。豁免的重點是讓例外顯式化、有理由、可被 review。&lt;/p>
&lt;h3 id="行內豁免">行內豁免&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-hcl" data-lang="hcl">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="k">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_security_group_rule&amp;#34; &amp;#34;alb_https&amp;#34;&lt;/span> {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="n"> type&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;ingress&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="n"> from_port&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="m">443&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="n"> to_port&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="m">443&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">&lt;span class="n"> protocol&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;tcp&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="n"> cidr_blocks&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;0.0.0.0/0&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="c1">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">&lt;span class="c1"> #checkov:skip=CKV_AWS_24:ALB 的 HTTPS 入站需要對外開放
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">8&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>}&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>tfsec 的行內豁免：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-hcl" data-lang="hcl">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="k">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_security_group_rule&amp;#34; &amp;#34;alb_https&amp;#34;&lt;/span> {&lt;span class="c1">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="c1"> #tfsec:ignore:aws-ec2-no-public-ingress-sgr -- ALB HTTPS listener requires public access
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="n"> cidr_blocks&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;0.0.0.0/0&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">}&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>行內豁免的好處是理由跟程式碼在一起，review 時一眼可見。壞處是散落在各檔案裡，盤點所有豁免要 grep。&lt;/p>
&lt;h3 id="集中式豁免">集中式豁免&lt;/h3>
&lt;p>checkov 支援 &lt;code>.checkov.yaml&lt;/code> 集中管理豁免：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c"># .checkov.yaml&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">skip-check&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">CKV_AWS_24 &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># ALB public-facing SG rules&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">CKV_AWS_19 &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Legacy S3 buckets pending migration&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>集中式的好處是一個地方看到所有豁免，適合全域性的例外（如「這批 legacy S3 bucket 還沒遷完、暫時跳過 public access 檢查」）。壞處是理由離程式碼太遠，三個月後沒人記得為什麼跳過。&lt;/p>
&lt;h3 id="豁免紀律">豁免紀律&lt;/h3>
&lt;p>每個豁免都要寫理由（&lt;code>--&lt;/code> 之後的文字）。沒有理由的豁免等於靜默跳過——review 時看不出是故意的還是為了讓 CI 過而隨手加的。定期（每季度）跑一次豁免盤點：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># 盤點所有 checkov 豁免&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">grep -rn &lt;span class="s2">&amp;#34;checkov:skip&amp;#34;&lt;/span> --include&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;*.tf&amp;#34;&lt;/span> .
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># 盤點所有 tfsec 豁免&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">grep -rn &lt;span class="s2">&amp;#34;tfsec:ignore&amp;#34;&lt;/span> --include&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;*.tf&amp;#34;&lt;/span> .&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>每個命中問一句：當初跳過的原因還成立嗎？legacy 遷移完了嗎？臨時的例外變成永久的了嗎？&lt;/p></description><content:encoded><![CDATA[<p>checkov 和 tfsec 安裝後直接跑，通常會產出幾十到幾百條命中。全部修完不切實際、全部忽略又失去價值。這篇處理的是怎麼從「裝了工具」走到「工具的產出可信且可操作」——規則選擇、嚴重度過濾、豁免管理、自訂規則、CI 整合，以及 false positive 的處理流程。</p>
<h2 id="規則選擇策略">規則選擇策略</h2>
<p>兩個工具的內建規則集都超過數百條，涵蓋從加密設定到命名慣例。全開跑會讓命中清單長到沒人看。規則選擇的判準是「這條規則命中後，團隊會不會真的去修」——答案是不會的規則，開著只是製造噪音。</p>
<h3 id="分層啟用">分層啟用</h3>
<p>把規則分成三層逐步啟用，而非一次全開：</p>
<table>
  <thead>
      <tr>
          <th>層次</th>
          <th>規則類型</th>
          <th>範例</th>
          <th>啟用時機</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>地基層</td>
          <td>資料外洩與權限失控</td>
          <td>S3 public access、SG 0.0.0.0/0、IAM wildcard</td>
          <td>day 1</td>
      </tr>
      <tr>
          <td>營運層</td>
          <td>加密與備份</td>
          <td>RDS encryption、EBS encryption、backup retention</td>
          <td>IaC 覆蓋率 &gt;50%</td>
      </tr>
      <tr>
          <td>規範層</td>
          <td>命名、tagging、logging</td>
          <td>缺 tag、缺 log group、resource naming</td>
          <td>治理成熟後</td>
      </tr>
  </tbody>
</table>
<p>地基層是即使其他規則都關掉也要開的——S3 bucket 對外公開（<code>CKV_AWS_19</code>、<code>CKV_AWS_53</code>）和 security group 全開（<code>CKV_AWS_24</code>、<code>CKV_AWS_25</code>）這類規則命中就是真問題。營運層在 IaC 覆蓋率夠高時啟用，否則會掃到大量不在 IaC 管理內的資源。規範層等團隊有能力消化命中量再開。</p>
<h3 id="checkov-的規則過濾">checkov 的規則過濾</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 只跑地基層規則</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">checkov -d . --check CKV_AWS_19,CKV_AWS_53,CKV_AWS_24,CKV_AWS_25,CKV_AWS_40,CKV_AWS_145
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># 或者用 framework 過濾（只掃 Terraform）</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">checkov -d . --framework terraform --compact --quiet</span></span></code></pre></div><p>checkov 支援 <code>--check</code>（白名單，只跑這些）和 <code>--skip-check</code>（黑名單，跳過這些）。初期用 <code>--check</code> 白名單比較可控——明確列出要跑的規則，而非從全集去扣。隨著團隊消化能力提升再擴大白名單。</p>
<h3 id="tfsec-的嚴重度過濾">tfsec 的嚴重度過濾</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 只報 CRITICAL 和 HIGH</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">tfsec . --minimum-severity HIGH
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># 排除特定規則</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">tfsec . --exclude aws-s3-specify-public-access-block</span></span></code></pre></div><p>tfsec 的嚴重度分 CRITICAL / HIGH / MEDIUM / LOW。初期設 <code>--minimum-severity HIGH</code> 把低嚴重度的過濾掉，減少噪音量。降低閾值的時機是 HIGH 以上的命中清零後。</p>
<h2 id="豁免管理">豁免管理</h2>
<p>不是每個命中都是錯——對外的 ALB 在 port 443 開 <code>0.0.0.0/0</code> 是設計意圖、不是漏洞。豁免的重點是讓例外顯式化、有理由、可被 review。</p>
<h3 id="行內豁免">行內豁免</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_security_group_rule&#34; &#34;alb_https&#34;</span> {
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="n">  type</span>        <span class="o">=</span> <span class="s2">&#34;ingress&#34;</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="n">  from_port</span>   <span class="o">=</span> <span class="m">443</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="n">  to_port</span>     <span class="o">=</span> <span class="m">443</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="n">  protocol</span>    <span class="o">=</span> <span class="s2">&#34;tcp&#34;</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="n">  cidr_blocks</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;0.0.0.0/0&#34;</span><span class="p">]</span><span class="c1">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1">  #checkov:skip=CKV_AWS_24:ALB 的 HTTPS 入站需要對外開放
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="c1"></span>}</span></span></code></pre></div><p>tfsec 的行內豁免：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_security_group_rule&#34; &#34;alb_https&#34;</span> {<span class="c1">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1">  #tfsec:ignore:aws-ec2-no-public-ingress-sgr -- ALB HTTPS listener requires public access
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="c1"></span><span class="n">  cidr_blocks</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;0.0.0.0/0&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl">}</span></span></code></pre></div><p>行內豁免的好處是理由跟程式碼在一起，review 時一眼可見。壞處是散落在各檔案裡，盤點所有豁免要 grep。</p>
<h3 id="集中式豁免">集中式豁免</h3>
<p>checkov 支援 <code>.checkov.yaml</code> 集中管理豁免：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln">1</span><span class="cl"><span class="c"># .checkov.yaml</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="nt">skip-check</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">  </span>- <span class="l">CKV_AWS_24 </span><span class="w"> </span><span class="c"># ALB public-facing SG rules</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">  </span>- <span class="l">CKV_AWS_19 </span><span class="w"> </span><span class="c"># Legacy S3 buckets pending migration</span></span></span></code></pre></div><p>集中式的好處是一個地方看到所有豁免，適合全域性的例外（如「這批 legacy S3 bucket 還沒遷完、暫時跳過 public access 檢查」）。壞處是理由離程式碼太遠，三個月後沒人記得為什麼跳過。</p>
<h3 id="豁免紀律">豁免紀律</h3>
<p>每個豁免都要寫理由（<code>--</code> 之後的文字）。沒有理由的豁免等於靜默跳過——review 時看不出是故意的還是為了讓 CI 過而隨手加的。定期（每季度）跑一次豁免盤點：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 盤點所有 checkov 豁免</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">grep -rn <span class="s2">&#34;checkov:skip&#34;</span> --include<span class="o">=</span><span class="s2">&#34;*.tf&#34;</span> .
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># 盤點所有 tfsec 豁免</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">grep -rn <span class="s2">&#34;tfsec:ignore&#34;</span> --include<span class="o">=</span><span class="s2">&#34;*.tf&#34;</span> .</span></span></code></pre></div><p>每個命中問一句：當初跳過的原因還成立嗎？legacy 遷移完了嗎？臨時的例外變成永久的了嗎？</p>
<h2 id="自訂規則">自訂規則</h2>
<p>內建規則覆蓋通用安全實踐，但專案特有的規範（如「所有 RDS 必須有 <code>cost-center</code> tag」「所有 S3 bucket 名稱必須以公司前綴開頭」）需要自訂。</p>
<h3 id="checkov-自訂規則python">checkov 自訂規則（Python）</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># custom_checks/require_cost_center_tag.py</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="kn">from</span> <span class="nn">checkov.terraform.checks.resource.base_resource_check</span> <span class="kn">import</span> <span class="n">BaseResourceCheck</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="kn">from</span> <span class="nn">checkov.common.models.enums</span> <span class="kn">import</span> <span class="n">CheckResult</span><span class="p">,</span> <span class="n">CheckCategories</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="k">class</span> <span class="nc">CostCenterTagRequired</span><span class="p">(</span><span class="n">BaseResourceCheck</span><span class="p">):</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">        <span class="n">name</span> <span class="o">=</span> <span class="s2">&#34;Ensure cost-center tag is present&#34;</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">        <span class="nb">id</span> <span class="o">=</span> <span class="s2">&#34;CUSTOM_001&#34;</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">        <span class="n">supported_resources</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;aws_instance&#34;</span><span class="p">,</span> <span class="s2">&#34;aws_db_instance&#34;</span><span class="p">,</span> <span class="s2">&#34;aws_s3_bucket&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">        <span class="n">categories</span> <span class="o">=</span> <span class="p">[</span><span class="n">CheckCategories</span><span class="o">.</span><span class="n">GENERAL_SECURITY</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="n">name</span><span class="p">,</span> <span class="nb">id</span><span class="o">=</span><span class="nb">id</span><span class="p">,</span> <span class="n">categories</span><span class="o">=</span><span class="n">categories</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">                         <span class="n">supported_resources</span><span class="o">=</span><span class="n">supported_resources</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">
</span></span><span class="line"><span class="ln">14</span><span class="cl">    <span class="k">def</span> <span class="nf">scan_resource_conf</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">conf</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">        <span class="n">tags</span> <span class="o">=</span> <span class="n">conf</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;tags&#34;</span><span class="p">,</span> <span class="p">[{}])[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl">        <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">tags</span><span class="p">,</span> <span class="nb">dict</span><span class="p">)</span> <span class="ow">and</span> <span class="s2">&#34;cost-center&#34;</span> <span class="ow">in</span> <span class="n">tags</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">            <span class="k">return</span> <span class="n">CheckResult</span><span class="o">.</span><span class="n">PASSED</span>
</span></span><span class="line"><span class="ln">18</span><span class="cl">        <span class="k">return</span> <span class="n">CheckResult</span><span class="o">.</span><span class="n">FAILED</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl">
</span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="n">check</span> <span class="o">=</span> <span class="n">CostCenterTagRequired</span><span class="p">()</span></span></span></code></pre></div>




<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 跑自訂規則</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">checkov -d . --external-checks-dir ./custom_checks</span></span></code></pre></div><h3 id="tfsec-自訂規則yaml">tfsec 自訂規則（YAML）</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c"># .tfsec/custom_rules.yaml</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span>- <span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="l">CUSTOM_001</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l">S3 bucket name must start with company prefix</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">  </span><span class="nt">impact</span><span class="p">:</span><span class="w"> </span><span class="l">Non-standard naming breaks cross-account policies</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">  </span><span class="nt">resolution</span><span class="p">:</span><span class="w"> </span><span class="l">Add company prefix to bucket name</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">  </span><span class="nt">requiredTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">    </span>- <span class="l">resource</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">  </span><span class="nt">requiredLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">    </span>- <span class="l">aws_s3_bucket</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">  </span><span class="nt">severity</span><span class="p">:</span><span class="w"> </span><span class="l">MEDIUM</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">  </span><span class="nt">matchSpec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">bucket</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">    </span><span class="nt">action</span><span class="p">:</span><span class="w"> </span><span class="l">startsWith</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">    </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="l">acme-</span></span></span></code></pre></div><p>自訂規則的數量保持精簡——每條規則都是維護成本。只有「違反後會在後續流程造成問題」的規範值得寫成自動化規則，純粹的風格偏好留給 review 時口頭提醒。</p>
<h2 id="ci-整合">CI 整合</h2>
<p>把掃描接進 CI 的目標是「PR 合併前就攔下問題」，而非 apply 之後才發現。</p>
<h3 id="github-actions-範例">GitHub Actions 範例</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w">  </span><span class="nt">security-scan</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Run checkov</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">        </span><span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">bridgecrewio/checkov-action@v12</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">          </span><span class="nt">directory</span><span class="p">:</span><span class="w"> </span><span class="l">.</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">          </span><span class="nt">check</span><span class="p">:</span><span class="w"> </span><span class="l">CKV_AWS_19,CKV_AWS_53,CKV_AWS_24,CKV_AWS_25</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">          </span><span class="nt">quiet</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">          </span><span class="nt">compact</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">          </span><span class="nt">soft_fail</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Run tfsec</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">        </span><span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">aquasecurity/tfsec-action@v1</span><span class="w">
</span></span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="w">          </span><span class="nt">minimum_severity</span><span class="p">:</span><span class="w"> </span><span class="l">HIGH</span><span class="w">
</span></span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="w">          </span><span class="nt">soft_fail</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span></span></span></code></pre></div><p><code>soft_fail: false</code> 讓掃描命中時 CI 失敗、阻擋合併。初期可以先設 <code>soft_fail: true</code>（掃描報告但不阻擋），讓團隊觀察命中量，確認規則集合理後再切成強制。</p>
<h3 id="掃描結果回貼-pr">掃描結果回貼 PR</h3>
<p>checkov 和 tfsec 的 GitHub Actions 都支援把結果以 PR comment 回貼。讓 reviewer 在 PR 頁面直接看到掃描結果，不用去翻 CI log。checkov-action 預設會回貼；tfsec-action 需要額外的 <code>github_token</code> 設定。</p>
<h3 id="漸進式導入">漸進式導入</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">Week 1-2：soft_fail=true，觀察命中量和 false positive 率
</span></span><span class="line"><span class="ln">2</span><span class="cl">Week 3：修完所有真問題，豁免所有合理的 false positive
</span></span><span class="line"><span class="ln">3</span><span class="cl">Week 4：切 soft_fail=false，掃描變成強制 gate</span></span></code></pre></div><p>這個節奏讓團隊在掃描變成強制之前就清理完存量，避免「一開 hard fail 所有 PR 都過不了」的窘境。</p>
<h2 id="false-positive-處理">False positive 處理</h2>
<p>false positive 的處理有三條路，依復發頻率選：</p>
<table>
  <thead>
      <tr>
          <th>路徑</th>
          <th>適用情境</th>
          <th>做法</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>行內豁免</td>
          <td>單一資源的合理例外</td>
          <td>在該資源加 <code>checkov:skip</code> + 理由</td>
      </tr>
      <tr>
          <td>全域跳過</td>
          <td>整個規則不適用於此專案</td>
          <td>加進 <code>.checkov.yaml</code> skip-check</td>
      </tr>
      <tr>
          <td>自訂規則覆蓋</td>
          <td>內建規則的判準不適合</td>
          <td>寫自訂規則取代內建規則</td>
      </tr>
  </tbody>
</table>
<p>最常見的 false positive 是 ALB 的 public-facing security group（設計就是要開 443）和開發環境的寬鬆設定（dev 允許、prod 不允許）。後者可以用 checkov 的 <code>--var-file</code> 搭配環境變數區分——dev 跑寬鬆規則集、prod 跑嚴格規則集。</p>
<p>處理 false positive 時要抵抗「加 skip 讓 CI 過」的捷徑衝動。每個 skip 都要問：這是設計意圖（ALB 要開放）還是技術債（dev 環境暫時放寬）？前者寫永久豁免加理由，後者寫臨時豁免加 TODO 和預計修復時間。</p>
<h2 id="跨分類引用">跨分類引用</h2>
<ul>
<li>→ <a href="/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/" data-link-title="infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">infra 走 PR 流程與自動化護欄</a>：掃描在 PR 流程裡的定位與 plan/apply 的關係</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/terraform-ci-pipeline-setup/" data-link-title="Terraform CI Pipeline 設定指南" data-link-desc="用 GitHub Actions 建立完整的 Terraform CI pipeline：fmt → validate → tflint → plan → PR comment → apply，含 OIDC credential 與環境保護規則">Terraform CI Pipeline 設定</a>：掃描步驟怎麼嵌入完整的 CI workflow</li>
<li>→ <a href="/blog/infra/03-network-foundation/security-group-audit-cleanup/" data-link-title="Security Group 稽核與清理" data-link-desc="盤點所有 security group 規則、找出 0.0.0.0/0 全開與未使用的 SG、依賴檢查後安全刪除、自動化治理">模組三：Security Group 稽核與清理</a>：掃描命中 0.0.0.0/0 後的處理流程</li>
</ul>
]]></content:encoded></item><item><title>斷網環境的版本控制與 CI/CD</title><link>https://tarrragon.github.io/blog/infra/air-gapped/air-gapped-vcs-ci/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/air-gapped/air-gapped-vcs-ci/</guid><description>&lt;p>版本控制和 CI/CD 是所有 infra 操作的前提——程式碼要有地方存、變更要能被 review、build 和 deploy 要自動化。正常環境裡這些由 GitHub + GitHub Actions 提供，斷網環境裡這兩個服務都不存在，需要在內網自建替代品。&lt;/p>
&lt;h2 id="gitlab-ce-vs-gitea選型判準">GitLab CE vs Gitea：選型判準&lt;/h2>
&lt;p>兩個主流的自建版本控制方案定位不同：&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>維度&lt;/th>
 &lt;th>GitLab CE&lt;/th>
 &lt;th>Gitea&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>定位&lt;/td>
 &lt;td>VCS + CI + Container Registry + Issue Tracker 一體&lt;/td>
 &lt;td>純 VCS（輕量 Git 伺服器）&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>資源需求&lt;/td>
 &lt;td>4GB+ RAM、推薦 8GB&lt;/td>
 &lt;td>512MB RAM 即可運作&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>CI 內建&lt;/td>
 &lt;td>GitLab CI（&lt;code>.gitlab-ci.yml&lt;/code>）&lt;/td>
 &lt;td>無（搭配 Drone / Woodpecker / Jenkins）&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Container Registry&lt;/td>
 &lt;td>內建&lt;/td>
 &lt;td>無（搭配 Harbor）&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>安裝複雜度&lt;/td>
 &lt;td>中（Omnibus 包裝簡化了安裝、但設定項多）&lt;/td>
 &lt;td>低（單一二進位檔、啟動即可用）&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>維護負擔&lt;/td>
 &lt;td>高（PostgreSQL、Redis、Sidekiq 都在裡面）&lt;/td>
 &lt;td>低（SQLite 或 MySQL、無背景服務）&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>選型判準是團隊規模和需要的功能範圍。5 人以下、只需要 VCS + 輕量 CI 的團隊，Gitea + Drone 的組合維護成本低。10 人以上、需要 MR review + CI pipeline + Container Registry 一站到位的團隊，GitLab CE 的整合度值得它的資源消耗。&lt;/p>
&lt;p>接下來以 GitLab CE 為主線說明（功能最完整），Gitea 的差異在各段附註。&lt;/p>
&lt;h2 id="gitlab-ce-離線安裝">GitLab CE 離線安裝&lt;/h2>
&lt;p>GitLab Omnibus 包把所有依賴打包成單一安裝檔，不需要在目標機器上 &lt;code>apt install&lt;/code> 任何前置套件。&lt;/p>
&lt;h3 id="在外網機器下載安裝包">在外網機器下載安裝包&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># Ubuntu/Debian&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">wget https://packages.gitlab.com/gitlab/gitlab-ce/packages/ubuntu/jammy/gitlab-ce_17.0.0-ce.0_amd64.deb/download.deb
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># RHEL/CentOS&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">wget https://packages.gitlab.com/gitlab/gitlab-ce/packages/el/9/gitlab-ce-17.0.0-ce.0.el9.x86_64.rpm/download.rpm&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>把下載的 &lt;code>.deb&lt;/code> 或 &lt;code>.rpm&lt;/code> 透過&lt;a href="https://tarrragon.github.io/blog/infra/air-gapped/air-gapped-principles/" data-link-title="斷網環境的通用原則" data-link-desc="離線套件管理、內容搬運、變更追蹤的共通操作模式 — 所有斷網情境都要先建立的基礎能力">內容搬運機制&lt;/a>（USB、光碟、跨邊界傳輸站）帶進斷網環境。&lt;/p>
&lt;h3 id="在斷網機器安裝">在斷網機器安裝&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># Ubuntu/Debian&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">sudo dpkg -i gitlab-ce_17.0.0-ce.0_amd64.deb
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># RHEL/CentOS&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">sudo yum localinstall gitlab-ce-17.0.0-ce.0.el9.x86_64.rpm&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="離線設定">離線設定&lt;/h3>
&lt;p>安裝後編輯 &lt;code>/etc/gitlab/gitlab.rb&lt;/code>，把所有外部連線關掉：&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-ruby" data-lang="ruby">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="c1"># 設定內部域名（不是公網域名）&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">&lt;span class="n">external_url&lt;/span> &lt;span class="s1">&amp;#39;https://gitlab.internal.example.com&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">&lt;span class="c1"># 關閉 Gravatar（頭像服務、需要外網）&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">&lt;span class="n">gitlab_rails&lt;/span>&lt;span class="o">[&lt;/span>&lt;span class="s1">&amp;#39;gravatar_enabled&amp;#39;&lt;/span>&lt;span class="o">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kp">false&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">&lt;span class="c1"># 關閉 usage ping（回報使用統計到 GitLab Inc）&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">&lt;span class="n">gitlab_rails&lt;/span>&lt;span class="o">[&lt;/span>&lt;span class="s1">&amp;#39;usage_ping_enabled&amp;#39;&lt;/span>&lt;span class="o">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kp">false&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="c1"># 關閉 version check&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">&lt;span class="n">gitlab_rails&lt;/span>&lt;span class="o">[&lt;/span>&lt;span class="s1">&amp;#39;gitlab_check_on_connect&amp;#39;&lt;/span>&lt;span class="o">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kp">false&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">&lt;span class="c1"># 如果沒有內部 SMTP，用 sendmail 或關閉 email&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl">&lt;span class="n">gitlab_rails&lt;/span>&lt;span class="o">[&lt;/span>&lt;span class="s1">&amp;#39;smtp_enable&amp;#39;&lt;/span>&lt;span class="o">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="kp">false&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl">&lt;span class="c1"># TLS 憑證用內部 CA 簽發&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl">&lt;span class="n">nginx&lt;/span>&lt;span class="o">[&lt;/span>&lt;span class="s1">&amp;#39;ssl_certificate&amp;#39;&lt;/span>&lt;span class="o">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;/etc/gitlab/ssl/gitlab.crt&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">18&lt;/span>&lt;span class="cl">&lt;span class="n">nginx&lt;/span>&lt;span class="o">[&lt;/span>&lt;span class="s1">&amp;#39;ssl_certificate_key&amp;#39;&lt;/span>&lt;span class="o">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;/etc/gitlab/ssl/gitlab.key&amp;#34;&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>




&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">sudo gitlab-ctl reconfigure&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Gitea 的離線安裝更簡單：下載單一二進位檔 &lt;code>gitea&lt;/code>、設定 &lt;code>app.ini&lt;/code>、用 systemd 管理即可。&lt;/p>
&lt;h3 id="升級策略">升級策略&lt;/h3>
&lt;p>GitLab CE 的升級包也要從外部下載帶進來。升級前先備份（&lt;code>gitlab-backup create&lt;/code>），升級路徑要按 GitLab 的&lt;a href="https://docs.gitlab.com/ee/update/index.html#upgrade-paths">版本跳級規則&lt;/a>——不能任意跳版、某些大版本之間需要中繼版本。在斷網環境裡，每次升級要預先規劃中繼版本、一次帶進所有需要的安裝包。&lt;/p></description><content:encoded><![CDATA[<p>版本控制和 CI/CD 是所有 infra 操作的前提——程式碼要有地方存、變更要能被 review、build 和 deploy 要自動化。正常環境裡這些由 GitHub + GitHub Actions 提供，斷網環境裡這兩個服務都不存在，需要在內網自建替代品。</p>
<h2 id="gitlab-ce-vs-gitea選型判準">GitLab CE vs Gitea：選型判準</h2>
<p>兩個主流的自建版本控制方案定位不同：</p>
<table>
  <thead>
      <tr>
          <th>維度</th>
          <th>GitLab CE</th>
          <th>Gitea</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>定位</td>
          <td>VCS + CI + Container Registry + Issue Tracker 一體</td>
          <td>純 VCS（輕量 Git 伺服器）</td>
      </tr>
      <tr>
          <td>資源需求</td>
          <td>4GB+ RAM、推薦 8GB</td>
          <td>512MB RAM 即可運作</td>
      </tr>
      <tr>
          <td>CI 內建</td>
          <td>GitLab CI（<code>.gitlab-ci.yml</code>）</td>
          <td>無（搭配 Drone / Woodpecker / Jenkins）</td>
      </tr>
      <tr>
          <td>Container Registry</td>
          <td>內建</td>
          <td>無（搭配 Harbor）</td>
      </tr>
      <tr>
          <td>安裝複雜度</td>
          <td>中（Omnibus 包裝簡化了安裝、但設定項多）</td>
          <td>低（單一二進位檔、啟動即可用）</td>
      </tr>
      <tr>
          <td>維護負擔</td>
          <td>高（PostgreSQL、Redis、Sidekiq 都在裡面）</td>
          <td>低（SQLite 或 MySQL、無背景服務）</td>
      </tr>
  </tbody>
</table>
<p>選型判準是團隊規模和需要的功能範圍。5 人以下、只需要 VCS + 輕量 CI 的團隊，Gitea + Drone 的組合維護成本低。10 人以上、需要 MR review + CI pipeline + Container Registry 一站到位的團隊，GitLab CE 的整合度值得它的資源消耗。</p>
<p>接下來以 GitLab CE 為主線說明（功能最完整），Gitea 的差異在各段附註。</p>
<h2 id="gitlab-ce-離線安裝">GitLab CE 離線安裝</h2>
<p>GitLab Omnibus 包把所有依賴打包成單一安裝檔，不需要在目標機器上 <code>apt install</code> 任何前置套件。</p>
<h3 id="在外網機器下載安裝包">在外網機器下載安裝包</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Ubuntu/Debian</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">wget https://packages.gitlab.com/gitlab/gitlab-ce/packages/ubuntu/jammy/gitlab-ce_17.0.0-ce.0_amd64.deb/download.deb
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># RHEL/CentOS</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">wget https://packages.gitlab.com/gitlab/gitlab-ce/packages/el/9/gitlab-ce-17.0.0-ce.0.el9.x86_64.rpm/download.rpm</span></span></code></pre></div><p>把下載的 <code>.deb</code> 或 <code>.rpm</code> 透過<a href="/blog/infra/air-gapped/air-gapped-principles/" data-link-title="斷網環境的通用原則" data-link-desc="離線套件管理、內容搬運、變更追蹤的共通操作模式 — 所有斷網情境都要先建立的基礎能力">內容搬運機制</a>（USB、光碟、跨邊界傳輸站）帶進斷網環境。</p>
<h3 id="在斷網機器安裝">在斷網機器安裝</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Ubuntu/Debian</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">sudo dpkg -i gitlab-ce_17.0.0-ce.0_amd64.deb
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># RHEL/CentOS</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">sudo yum localinstall gitlab-ce-17.0.0-ce.0.el9.x86_64.rpm</span></span></code></pre></div><h3 id="離線設定">離線設定</h3>
<p>安裝後編輯 <code>/etc/gitlab/gitlab.rb</code>，把所有外部連線關掉：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-ruby" data-lang="ruby"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># 設定內部域名（不是公網域名）</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="n">external_url</span> <span class="s1">&#39;https://gitlab.internal.example.com&#39;</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl">
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="c1"># 關閉 Gravatar（頭像服務、需要外網）</span>
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="n">gitlab_rails</span><span class="o">[</span><span class="s1">&#39;gravatar_enabled&#39;</span><span class="o">]</span> <span class="o">=</span> <span class="kp">false</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">
</span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="c1"># 關閉 usage ping（回報使用統計到 GitLab Inc）</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="n">gitlab_rails</span><span class="o">[</span><span class="s1">&#39;usage_ping_enabled&#39;</span><span class="o">]</span> <span class="o">=</span> <span class="kp">false</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="c1"># 關閉 version check</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="n">gitlab_rails</span><span class="o">[</span><span class="s1">&#39;gitlab_check_on_connect&#39;</span><span class="o">]</span> <span class="o">=</span> <span class="kp">false</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">
</span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="c1"># 如果沒有內部 SMTP，用 sendmail 或關閉 email</span>
</span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="n">gitlab_rails</span><span class="o">[</span><span class="s1">&#39;smtp_enable&#39;</span><span class="o">]</span> <span class="o">=</span> <span class="kp">false</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">
</span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="c1"># TLS 憑證用內部 CA 簽發</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="n">nginx</span><span class="o">[</span><span class="s1">&#39;ssl_certificate&#39;</span><span class="o">]</span> <span class="o">=</span> <span class="s2">&#34;/etc/gitlab/ssl/gitlab.crt&#34;</span>
</span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="n">nginx</span><span class="o">[</span><span class="s1">&#39;ssl_certificate_key&#39;</span><span class="o">]</span> <span class="o">=</span> <span class="s2">&#34;/etc/gitlab/ssl/gitlab.key&#34;</span></span></span></code></pre></div>




<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl">sudo gitlab-ctl reconfigure</span></span></code></pre></div><p>Gitea 的離線安裝更簡單：下載單一二進位檔 <code>gitea</code>、設定 <code>app.ini</code>、用 systemd 管理即可。</p>
<h3 id="升級策略">升級策略</h3>
<p>GitLab CE 的升級包也要從外部下載帶進來。升級前先備份（<code>gitlab-backup create</code>），升級路徑要按 GitLab 的<a href="https://docs.gitlab.com/ee/update/index.html#upgrade-paths">版本跳級規則</a>——不能任意跳版、某些大版本之間需要中繼版本。在斷網環境裡，每次升級要預先規劃中繼版本、一次帶進所有需要的安裝包。</p>
<h2 id="ci-runner-離線設定">CI Runner 離線設定</h2>
<p>CI pipeline 在斷網環境裡跑的最大差異是 runner 不能即時拉依賴。</p>
<h3 id="runner-安裝與註冊">Runner 安裝與註冊</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 下載 runner 二進位檔（外網下載、帶進來）</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1"># https://docs.gitlab.com/runner/install/linux-manually.html</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl">sudo gitlab-runner register <span class="se">\
</span></span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="se"></span>  --url https://gitlab.internal.example.com <span class="se">\
</span></span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="se"></span>  --token <span class="nv">$RUNNER_TOKEN</span> <span class="se">\
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="se"></span>  --executor docker <span class="se">\
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="se"></span>  --docker-image alpine:3.20</span></span></code></pre></div><h3 id="executor-選擇">Executor 選擇</h3>
<table>
  <thead>
      <tr>
          <th>Executor</th>
          <th>隔離性</th>
          <th>前置條件</th>
          <th>斷網適用度</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>shell</td>
          <td>低（直接跑在 runner 機器上）</td>
          <td>無</td>
          <td>高（最簡單）</td>
      </tr>
      <tr>
          <td>docker</td>
          <td>高（每個 job 一個容器）</td>
          <td>需要 Docker + 預拉 image</td>
          <td>中（image 管理成本）</td>
      </tr>
      <tr>
          <td>kubernetes</td>
          <td>高（每個 job 一個 pod）</td>
          <td>需要 K8s cluster</td>
          <td>低（斷網 K8s 維護重）</td>
      </tr>
  </tbody>
</table>
<p>斷網環境推薦 shell executor（最少依賴）或 docker executor 搭配預拉好的 image。</p>
<h3 id="docker-executor-的-image-管理">Docker executor 的 image 管理</h3>
<p>Docker executor 的每個 job 都基於一個 base image。斷網環境裡這些 image 必須預先存在於內網的 <a href="/blog/infra/air-gapped/air-gapped-container/" data-link-title="斷網環境的容器與映像管理" data-link-desc="Private registry 架設、映像搬運（docker save/load、skopeo）、base image 更新週期、離線漏洞掃描">private registry</a>：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># runner 的 /etc/docker/daemon.json 指向內部 registry</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="o">{</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl">  <span class="s2">&#34;insecure-registries&#34;</span>: <span class="o">[</span><span class="s2">&#34;registry.internal:5000&#34;</span><span class="o">]</span>,
</span></span><span class="line"><span class="ln">4</span><span class="cl">  <span class="s2">&#34;registry-mirrors&#34;</span>: <span class="o">[</span><span class="s2">&#34;https://registry.internal:5000&#34;</span><span class="o">]</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="o">}</span></span></span></code></pre></div><p>CI pipeline 裡用到的每個 image（build 用的 golang/node/php、lint 用的 tflint/checkov、deploy 用的 awscli）都要事先搬進內部 registry。</p>
<h3 id="依賴快取">依賴快取</h3>
<p>沒有 npm registry / PyPI / Maven Central 可以拉，CI job 的依賴安裝必須用本地來源：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln">1</span><span class="cl"><span class="c"># .gitlab-ci.yml — 使用內部 Nexus 作為套件來源</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="nt">variables</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">  </span><span class="nt">NPM_CONFIG_REGISTRY</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;https://nexus.internal/repository/npm-proxy/&#34;</span><span class="w">
</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">  </span><span class="nt">PIP_INDEX_URL</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;https://nexus.internal/repository/pypi-proxy/simple/&#34;</span></span></span></code></pre></div><p>或者把 <code>node_modules</code> / <code>vendor</code> 打包成 CI artifact 快取，避免每次 job 都重新安裝。</p>
<h2 id="git-bundle-跨邊界傳輸">Git Bundle 跨邊界傳輸</h2>
<p>某些斷網環境不允許直接 <code>git push</code> 到內網 GitLab（例如開發在外網、部署在內網）。Git bundle 是把 commit 歷史打包成單一檔案的機制：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 外網開發機：打包最近的 commit</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">git bundle create changes.bundle main~5..main
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># 帶進斷網環境後</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">git bundle verify changes.bundle
</span></span><span class="line"><span class="ln">6</span><span class="cl">git fetch changes.bundle main:incoming
</span></span><span class="line"><span class="ln">7</span><span class="cl">git merge incoming</span></span></code></pre></div><p>bundle 檔案包含完整的 Git 物件（commit、tree、blob），可以通過任何檔案傳輸方式帶過邊界——USB、光碟、審批後的檔案傳輸閘道。</p>
<p>跨邊界傳輸的安全考量：bundle 的內容應該在傳入前被掃描（至少 <code>git bundle verify</code>），確認不包含預期外的分支或異常大的物件。某些高安全環境要求所有跨邊界檔案經過人工審批。</p>
<h2 id="mr-review-流程">MR Review 流程</h2>
<p>斷網環境的 MR（Merge Request）review 流程跟<a href="/blog/infra/07-infra-as-pr/" data-link-title="模組七：infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">模組七：infra 走 PR 流程</a>的原則相同——變更走 MR → CI 跑 plan → reviewer 看 diff + plan 輸出 → 合併 → apply。差別在於所有環節都在內網：</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c"># .gitlab-ci.yml — Terraform plan 貼回 MR comment</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">plan</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">stage</span><span class="p">:</span><span class="w"> </span><span class="l">plan</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">  </span><span class="nt">script</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span>- <span class="l">terraform init -plugin-dir=/opt/terraform/plugins</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">    </span>- <span class="l">terraform plan -no-color -out=plan.tfplan | tee plan.txt</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">    </span>- <span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="sd">      curl --request POST \
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="sd">        --header &#34;PRIVATE-TOKEN: $GITLAB_TOKEN&#34; \
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="sd">        --data-urlencode &#34;body=$(cat plan.txt)&#34; \
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="sd">        &#34;https://gitlab.internal/api/v4/projects/$CI_PROJECT_ID/merge_requests/$CI_MERGE_REQUEST_IID/notes&#34;</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">  </span><span class="nt">only</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">    </span>- <span class="l">merge_requests</span></span></span></code></pre></div><p>GitLab CI 的 <code>merge_requests</code> trigger 跟 GitHub Actions 的 <code>pull_request</code> 等價——MR 開啟或更新時自動跑 pipeline。</p>
<p>reviewer 在 GitLab 的 MR 頁面看 code diff + plan 輸出 comment，approve 後合併，合併觸發 apply pipeline。流程跟有網路時完全相同，只是所有元件（GitLab、runner、Terraform、provider plugin）都在內網。</p>
<h2 id="時程與維護">時程與維護</h2>
<table>
  <thead>
      <tr>
          <th>項目</th>
          <th>初始設定</th>
          <th>持續維護</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>GitLab CE 安裝 + 設定</td>
          <td>1 天</td>
          <td>每季升級（含帶包 + 備份 + 升級 + 驗證）~半天</td>
      </tr>
      <tr>
          <td>CI runner 設定</td>
          <td>半天</td>
          <td>image 更新隨 registry 同步</td>
      </tr>
      <tr>
          <td>Gitea + Drone（替代方案）</td>
          <td>半天</td>
          <td>極低（二進位更新即可）</td>
      </tr>
      <tr>
          <td>Git bundle 流程建立</td>
          <td>2 小時</td>
          <td>按需（有跨邊界需求時）</td>
      </tr>
  </tbody>
</table>
<p>GitLab CE 的主要維護成本在升級——斷網環境的升級不能一鍵 <code>apt upgrade</code>，要預先下載正確版本的安裝包帶進來。跳版規則讓這個過程比正常環境多一層規劃。</p>
<h2 id="跨分類引用">跨分類引用</h2>
<ul>
<li>→ <a href="/blog/infra/air-gapped/air-gapped-principles/" data-link-title="斷網環境的通用原則" data-link-desc="離線套件管理、內容搬運、變更追蹤的共通操作模式 — 所有斷網情境都要先建立的基礎能力">斷網環境的通用原則</a>：內容搬運、離線套件管理的共通模式</li>
<li>→ <a href="/blog/infra/air-gapped/air-gapped-container/" data-link-title="斷網環境的容器與映像管理" data-link-desc="Private registry 架設、映像搬運（docker save/load、skopeo）、base image 更新週期、離線漏洞掃描">斷網環境的容器與映像管理</a>：CI runner 的 Docker image 管理</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/" data-link-title="模組七：infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">模組七：infra 走 PR 流程</a>：MR review 流程的原則與護欄</li>
</ul>
]]></content:encoded></item><item><title>模組七：infra 走 PR 流程與自動化護欄</title><link>https://tarrragon.github.io/blog/infra/07-infra-as-pr/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/07-infra-as-pr/</guid><description>&lt;p>infra 變更要走跟 application code 一樣的流程：開分支、提 PR、跑檢查、review diff、合併、發布。這條原則把基礎設施變更從「某個人在自己終端機 apply」轉成「團隊可審查的紀錄」，是 IaC 真正兌現價值的地方，也是解開「只有我懂 infra」這個單點依賴的關鍵。基礎設施跟程式碼一樣會出錯、會需要回溯、會交接給別人，所以它需要同一套保護機制。&lt;/p>
&lt;h2 id="infra-變更走-code-流程">infra 變更走 code 流程&lt;/h2>
&lt;p>infra 變更的標準路徑是 PR → plan → review diff → 合併 → apply。這個順序的核心責任是把「執行前先看清楚要改什麼」變成強制步驟，而不是 apply 之後才從事故裡發現改錯了。每個環節各自承擔一段審查責任，少掉任一段，infra 就退回到不可審查的狀態。&lt;/p>
&lt;p>&lt;code>terraform plan&lt;/code> 是這條鏈裡最關鍵的一環。它把當前 state、雲端實際資源、與目標設定三方比對，產出一份「會新增 / 修改 / 刪除哪些資源」的 diff。這份 diff 是 review 的對象：reviewer 直接看 plan 算出來的實際變更，而非讀 HCL 自行想像結果。一個容易被低估的判讀訊號是 plan 裡的 &lt;code>destroy&lt;/code> 與 &lt;code>replace&lt;/code>（顯示為 &lt;code>-/+&lt;/code>）— 改一個看似無害的欄位（例如某些雲資源的 &lt;code>name&lt;/code>、或資料庫的 &lt;code>identifier&lt;/code>）可能觸發整個資源重建，對有狀態的服務代表資料遺失或停機。Review 階段抓到這個 &lt;code>-/+&lt;/code>，比 apply 到一半才發現便宜太多。&lt;/p>
&lt;p>把 plan 結果貼回 PR 是讓 review 真正生效的做法。流程上，PR 觸發 CI 跑 plan，plan 輸出回貼成 PR comment，reviewer 連同程式碼 diff 一起看；approve 後才允許合併，合併才觸發 apply。這裡有個取捨：plan 與 apply 之間若隔了很久，雲端實際狀態可能已經漂移（有人手動改了、或別的 PR 先 apply 了），導致 apply 時的 plan 跟 review 時看到的不一致。多數團隊在 apply 階段會重跑一次 plan 並要求它與 review 時一致，代價是流程多一道、但換到「review 看到的就是實際執行的」這個保證。&lt;/p>
&lt;p>風險邊界落在 apply 失敗的回退上。infra apply 不像程式碼部署可以直接 rollback 到上一版 image — 中途失敗時部分資源已經建立、state 可能處於半完成狀態。所以 PR 流程的價值不只在事前審查，也在事後可追溯：每次變更都對應一個 commit 與一個 PR，要回溯時知道是哪次改的、為什麼改、誰 review 的。&lt;/p>
&lt;h2 id="fmt-與-validate最便宜的第一道檢查">fmt 與 validate：最便宜的第一道檢查&lt;/h2>
&lt;p>&lt;code>fmt&lt;/code> 與 &lt;code>validate&lt;/code> 是進到任何安全掃描之前的基礎檢查，責任是擋掉格式不一致與語法 / 型別錯誤這類不需要動腦判斷的問題。它們跑得快、沒有誤判空間，適合放在 CI 最前面當作快速 fail 的關卡。&lt;/p>
&lt;p>&lt;code>terraform fmt -check&lt;/code> 驗證程式碼是否符合標準排版。它本身不影響基礎設施行為，價值在於消除 diff 噪音：當每個人的編輯器縮排習慣不同，PR diff 會混入大量純排版變動，把真正的邏輯變更淹沒，reviewer 更容易看漏。統一格式後，diff 裡剩下的就是語意變更。&lt;code>validate&lt;/code> 則檢查設定在語法與內部一致性上是否成立 — reference 到不存在的變數、型別不匹配、必填參數缺漏，這些在 validate 階段就會報錯，不必等到 plan 連線雲端才發現。&lt;/p>
&lt;p>判讀上，fmt 與 validate 失敗代表的是「這份 code 還沒準備好被認真 review」，屬於作者自己該先修掉的問題，不該佔用 reviewer 注意力。把它們設成 CI 必過的 gate，作者在本地就會先跑、先修，PR 送出時已經是乾淨的。&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="c"># .github/workflows/terraform.yml — plan 前的基礎檢查&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">jobs&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">validate&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">runs-on&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">ubuntu-latest&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">steps&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">actions/checkout@v4&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">hashicorp/setup-terraform@v3&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform fmt -check -recursive&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform init -backend=false&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform validate&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="tflint--checkov--tfsec抓壞寫法與安全漏洞">tflint / checkov / tfsec：抓壞寫法與安全漏洞&lt;/h2>
&lt;p>fmt 與 validate 確認 code「語法正確」，但語法正確的設定仍然可能是危險的設定。tflint、checkov、tfsec 這類靜態掃描工具承擔的是「語意正確」這層：在不實際建立資源的前提下，從 HCL 裡比對已知的壞寫法與安全反模式，把問題擋在 plan 之前。它們補的是 reviewer 肉眼容易漏掉的盲區 — 人會看漏一個 &lt;code>0.0.0.0/0&lt;/code>，規則不會。&lt;/p></description><content:encoded><![CDATA[<p>infra 變更要走跟 application code 一樣的流程：開分支、提 PR、跑檢查、review diff、合併、發布。這條原則把基礎設施變更從「某個人在自己終端機 apply」轉成「團隊可審查的紀錄」，是 IaC 真正兌現價值的地方，也是解開「只有我懂 infra」這個單點依賴的關鍵。基礎設施跟程式碼一樣會出錯、會需要回溯、會交接給別人，所以它需要同一套保護機制。</p>
<h2 id="infra-變更走-code-流程">infra 變更走 code 流程</h2>
<p>infra 變更的標準路徑是 PR → plan → review diff → 合併 → apply。這個順序的核心責任是把「執行前先看清楚要改什麼」變成強制步驟，而不是 apply 之後才從事故裡發現改錯了。每個環節各自承擔一段審查責任，少掉任一段，infra 就退回到不可審查的狀態。</p>
<p><code>terraform plan</code> 是這條鏈裡最關鍵的一環。它把當前 state、雲端實際資源、與目標設定三方比對，產出一份「會新增 / 修改 / 刪除哪些資源」的 diff。這份 diff 是 review 的對象：reviewer 直接看 plan 算出來的實際變更，而非讀 HCL 自行想像結果。一個容易被低估的判讀訊號是 plan 裡的 <code>destroy</code> 與 <code>replace</code>（顯示為 <code>-/+</code>）— 改一個看似無害的欄位（例如某些雲資源的 <code>name</code>、或資料庫的 <code>identifier</code>）可能觸發整個資源重建，對有狀態的服務代表資料遺失或停機。Review 階段抓到這個 <code>-/+</code>，比 apply 到一半才發現便宜太多。</p>
<p>把 plan 結果貼回 PR 是讓 review 真正生效的做法。流程上，PR 觸發 CI 跑 plan，plan 輸出回貼成 PR comment，reviewer 連同程式碼 diff 一起看；approve 後才允許合併，合併才觸發 apply。這裡有個取捨：plan 與 apply 之間若隔了很久，雲端實際狀態可能已經漂移（有人手動改了、或別的 PR 先 apply 了），導致 apply 時的 plan 跟 review 時看到的不一致。多數團隊在 apply 階段會重跑一次 plan 並要求它與 review 時一致，代價是流程多一道、但換到「review 看到的就是實際執行的」這個保證。</p>
<p>風險邊界落在 apply 失敗的回退上。infra apply 不像程式碼部署可以直接 rollback 到上一版 image — 中途失敗時部分資源已經建立、state 可能處於半完成狀態。所以 PR 流程的價值不只在事前審查，也在事後可追溯：每次變更都對應一個 commit 與一個 PR，要回溯時知道是哪次改的、為什麼改、誰 review 的。</p>
<h2 id="fmt-與-validate最便宜的第一道檢查">fmt 與 validate：最便宜的第一道檢查</h2>
<p><code>fmt</code> 與 <code>validate</code> 是進到任何安全掃描之前的基礎檢查，責任是擋掉格式不一致與語法 / 型別錯誤這類不需要動腦判斷的問題。它們跑得快、沒有誤判空間，適合放在 CI 最前面當作快速 fail 的關卡。</p>
<p><code>terraform fmt -check</code> 驗證程式碼是否符合標準排版。它本身不影響基礎設施行為，價值在於消除 diff 噪音：當每個人的編輯器縮排習慣不同，PR diff 會混入大量純排版變動，把真正的邏輯變更淹沒，reviewer 更容易看漏。統一格式後，diff 裡剩下的就是語意變更。<code>validate</code> 則檢查設定在語法與內部一致性上是否成立 — reference 到不存在的變數、型別不匹配、必填參數缺漏，這些在 validate 階段就會報錯，不必等到 plan 連線雲端才發現。</p>
<p>判讀上，fmt 與 validate 失敗代表的是「這份 code 還沒準備好被認真 review」，屬於作者自己該先修掉的問題，不該佔用 reviewer 注意力。把它們設成 CI 必過的 gate，作者在本地就會先跑、先修，PR 送出時已經是乾淨的。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c"># .github/workflows/terraform.yml — plan 前的基礎檢查</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform fmt -check -recursive</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init -backend=false</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform validate</span></span></span></code></pre></div><h2 id="tflint--checkov--tfsec抓壞寫法與安全漏洞">tflint / checkov / tfsec：抓壞寫法與安全漏洞</h2>
<p>fmt 與 validate 確認 code「語法正確」，但語法正確的設定仍然可能是危險的設定。tflint、checkov、tfsec 這類靜態掃描工具承擔的是「語意正確」這層：在不實際建立資源的前提下，從 HCL 裡比對已知的壞寫法與安全反模式，把問題擋在 plan 之前。它們補的是 reviewer 肉眼容易漏掉的盲區 — 人會看漏一個 <code>0.0.0.0/0</code>，規則不會。</p>
<p>這三者的側重不同，組合起來覆蓋面才完整。tflint 偏向 provider 層的正確性與慣例規範：用了已棄用的參數、instance type 在該 region 不存在、命名不符規範。checkov 與 tfsec 偏向安全與合規：掃的是會造成資料外洩或權限過大的設定。兩個最常被它們攔下、也最常釀成真實事故的模式，值得單獨說明。</p>
<p>第一個是 S3 bucket 對外公開。一個漏設 <code>block_public_access</code> 或 ACL 寫成 <code>public-read</code> 的 bucket，會讓裡面的物件對整個網際網路可讀。這類設定在 HCL 裡只是一兩行，肉眼 review 時很容易因為「看起來像樣板」而放過，但後果是資料外洩。checkov 有專門規則比對 bucket 的 public access 設定，命中就讓 CI fail，逼作者在合併前說明或修正。</p>
<p>第二個是 security group 對全世界開放。一條 ingress 寫成 <code>cidr_blocks = [&quot;0.0.0.0/0&quot;]</code> 加上 port 22 或 3306，等於把 SSH 或資料庫埠暴露給全網掃描器。tfsec 與 checkov 都會標記這種「敏感埠 + 全開 CIDR」的組合。這條規則跟模組三：網路地基講的 security group 收斂原則是同一件事的兩端 — 模組三教怎麼把規則寫對，本章用靜態掃描確保寫錯時擋得下來。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># 三道掃描串在一起，任一 fail 就中斷</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">tflint --recursive
</span></span><span class="line"><span class="ln">3</span><span class="cl">checkov -d . --quiet --compact
</span></span><span class="line"><span class="ln">4</span><span class="cl">tfsec . --soft-fail<span class="o">=</span>false</span></span></code></pre></div><p>判讀這些工具的命中時，要區分「真漏洞」與「情境合理的例外」。並非每個 <code>0.0.0.0/0</code> 都是錯 — 一個對外的 HTTPS load balancer 在 port 443 開全網是設計本意。所以這些掃描的命中是候選不是判決：多數工具支援用行內註解標記豁免（例如 checkov 的 <code>#checkov:skip</code>），代價是豁免要寫理由、要被 review，避免變成無聲略過。把例外顯式化、留下為什麼豁免的紀錄，比關掉整條規則安全。</p>
<h2 id="atlantis-與-github-actions自動化-plan-與-apply">Atlantis 與 GitHub Actions：自動化 plan 與 apply</h2>
<p>把上述流程自動化，需要一個能監聽 PR 事件、在對的時機跑 plan 與 apply 的執行層。兩種常見做法是直接用 CI 平台（如 GitHub Actions）寫 workflow，或用 Atlantis 這類專為 Terraform PR 流程設計的工具。Atlantis 是一個常駐服務，掛在 git 平台的 webhook 上：PR 開啟時它自動跑 <code>plan</code> 並把結果貼回 PR comment，reviewer approve 後在 PR 留言 <code>atlantis apply</code>，它才執行 apply 並回報結果。它的價值在於把「誰能 apply、apply 前要不要 approve、plan 結果在哪看」這些規則收斂成一致的、可設定的流程，而不是散落在各 repo 各自的 workflow 腳本裡。</p>
<p>選哪一種是機會成本的取捨。GitHub Actions workflow 的優點是不必額外維運一個服務、跟既有 CI 共用同一套權限與 runner；缺點是 apply 的 gating 邏輯（approve 後才能 apply、apply lock 避免兩個 PR 同時改同一份 state）要自己用 workflow 條件拼出來。Atlantis 的優點是這些 gating 與 state lock 是內建語意、跨多 repo 一致；缺點是它本身是一個要部署、要升級、要保護的常駐服務。團隊 repo 少、流程簡單時 Actions 划算；管理大量 Terraform repo、需要統一 apply 治理時 Atlantis 划算。</p>
<p>無論哪種執行層，自動化的 apply 都需要對雲端的寫入權限，而這個權限怎麼來是整條管線的安全根基。這裡正是模組二：身分與憑證地基鋪設的 OIDC 兌現的地方 — 管線不該存放長期的 access key，而是在 runner 執行時用 OIDC 向雲端換取短期 token。模組二講的是怎麼建立這個信任關係，本章是它的回報處：因為有了 OIDC，自動 apply 才能在不持有靜態憑證的前提下安全執行，憑證外洩的攻擊面從「一把長期金鑰」縮到「單次執行的短期 token」。</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c"># 合併到主幹後，用 OIDC 換短期憑證再 apply（呼應模組二）</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">apply</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">if</span><span class="p">:</span><span class="w"> </span><span class="l">github.ref == &#39;refs/heads/main&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">    </span><span class="nt">permissions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">      </span><span class="nt">id-token</span><span class="p">:</span><span class="w"> </span><span class="l">write  </span><span class="w"> </span><span class="c"># 允許 runner 取得 OIDC token</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">      </span><span class="nt">contents</span><span class="p">:</span><span class="w"> </span><span class="l">read</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">aws-actions/configure-aws-credentials@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">          </span><span class="nt">role-to-assume</span><span class="p">:</span><span class="w"> </span><span class="l">arn:aws:iam::123456789012:role/infra-apply</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">          </span><span class="nt">aws-region</span><span class="p">:</span><span class="w"> </span><span class="l">ap-northeast-1</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform apply -auto-approve</span></span></span></code></pre></div><p>判讀自動 apply 的邊界時，要留意它不適合所有變更。對會觸發資源重建或刪除的高風險 plan，多數團隊會保留人工 apply 的關卡（Atlantis 的手動 <code>atlantis apply</code>、或 workflow 加 environment protection rule 要人按確認），不讓這類變更在合併瞬間無人看管地執行。自動化的目的是消除重複勞動與人為遺漏，不是把判斷也一起省掉。</p>
<h2 id="知識留在-code而不是留在個人腦中">知識留在 code，而不是留在個人腦中</h2>
<p>走完整套 PR 流程後，infra 的真正收穫是知識從個人的記憶移到了 repo 裡。每一次「為什麼這個 security group 開這個埠」「為什麼這台機器選這個 instance type」的決策，都以 code + PR 描述 + review 討論的形式留下，新人讀 repo 就能還原當初的判斷，不必去問那個「只有他懂 infra」的人。這是這個模組從第一章開始累積的目的地：基礎設施可被閱讀，等於它可被交接。</p>
<p>可 revert 是這套機制最直接的兌現。當某次變更引發問題，回退手段是 <code>git revert</code> 那個 commit 再走一次 PR 流程，讓基礎設施回到變更前的設定 — 跟回退一段壞掉的程式碼是同一個動作。對照「只有我懂 infra」的舊狀態：那時候回退靠的是當事人記得自己改了什麼、手動在 console 改回去，記錯或人不在就無從回退。把變更歷史留在 git，回退就從「依賴某人的記憶」變成「依賴版本紀錄」。</p>
<p>這份 revert 能力的邊界要講清楚，跟本章前面講的 apply 半完成 state 是同一個誠實。revert code 救得回的是「設定」，救不回已經被銷毀的狀態與資料：revert 掉一個刪除 stateful 資源的 commit，只是讓設定回到「該資源存在」，但被刪掉的資料庫內容不會跟著回來；rename 或 replace 類的變更 revert 後，可能再觸發一次資源重建。所以 stateful 變更的真正回退仍然靠備份與快照，這正是模組五 stateful 處理與模組八 secret / state 保護要顧的事。把 git revert 當「設定層回退」就誠實，把它當「資料層回退」就會在事故裡踩空。</p>
<p>這條知識共享的路線會在模組九：怎麼把 infra 推動起來展開到組織層。本章解決的是技術機制 — code 留得住知識；模組九解決的是怎麼讓一個習慣手動操作的團隊真的願意走這套流程、把知識交出來。技術上能審查、能回溯、能交接是前提，但讓團隊實際採用它是另一層問題。</p>
<p>判讀一個團隊是否真的把知識留在 code 的訊號很具體：當主要負責 infra 的人請假，其他人能不能只靠讀 repo 就理解現狀並安全地改一個小設定。如果答案是「得等他回來」，那不論工具鏈多完整，知識還在個人腦中，PR 流程只是形式。這個訊號比任何工具設定都更能反映 infra 的成熟度。</p>
<h2 id="章節文章">章節文章</h2>
<table>
  <thead>
      <tr>
          <th>文章</th>
          <th>主題</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/" data-link-title="infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">infra 走 PR 流程與自動化護欄</a></td>
          <td>PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接</td>
      </tr>
      <tr>
          <td><a href="/blog/infra/07-infra-as-pr/terraform-ci-pipeline-setup/" data-link-title="Terraform CI Pipeline 設定指南" data-link-desc="用 GitHub Actions 建立完整的 Terraform CI pipeline：fmt → validate → tflint → plan → PR comment → apply，含 OIDC credential 與環境保護規則">Terraform CI Pipeline 設定指南</a></td>
          <td>GitHub Actions 完整 workflow（fmt → validate → tflint → plan → PR comment → apply）、OIDC credential、環境保護規則</td>
      </tr>
      <tr>
          <td><a href="/blog/infra/07-infra-as-pr/checkov-tfsec-rule-customization/" data-link-title="checkov 與 tfsec 規則配置" data-link-desc="靜態掃描工具的規則選擇策略、自訂規則、豁免管理、false positive 處理與 CI 整合，讓掃描從噪音來源變成可信的品質關卡">checkov 與 tfsec 規則配置</a></td>
          <td>三階段漸進啟用、規則選擇策略、inline vs 集中式豁免管理、自訂規則、false positive 處理</td>
      </tr>
  </tbody>
</table>
<h2 id="跨分類引用">跨分類引用</h2>
<ul>
<li>→ <a href="/blog/ci/" data-link-title="CI/CD 教學" data-link-desc="整理 CI/CD 的驗證、建置、發布 gate 與不同部署場域的流程差異，讓每次變更都能被穩定驗證與交付">CI/CD 教學</a>：infra 管線用的就是這套驗證 / 發布 gate，plan / apply 對應 build / deploy 階段</li>
<li>→ <a href="/blog/infra/02-identity-credentials/" data-link-title="模組二：身分與憑證地基 — IAM 與 OIDC" data-link-desc="IAM role / policy 設計、最小權限，以及用 OIDC 短期憑證取代長期 access key">模組二：身分與憑證地基</a>：管線用 OIDC 取得 apply 權限，本章是該章 OIDC 設計的回報兌現處</li>
<li>→ <a href="/blog/infra/03-network-foundation/" data-link-title="模組三：網路地基 — VPC 與分層" data-link-desc="VPC、public / private subnet 切分、route table、NAT、security group 設計">模組三：網路地基</a>：security group 收斂原則，本章用 tfsec / checkov 在 CI 攔下寫錯的全開規則</li>
<li>→ <a href="/blog/infra/09-driving-adoption/" data-link-title="模組九：怎麼把 infra 推動起來" data-link-desc="技術正確不等於推得動 — 信任赤字、期望值對齊、知識共享，infra 落地的組織課題">模組九：怎麼把 infra 推動起來</a>：本章把知識留在 code 的技術機制，在該章展開成組織層的採用與知識共享</li>
<li>→ <a href="/blog/backend/07-security-data-protection/" data-link-title="模組七：資安與資料保護" data-link-desc="以問題驅動方式擴充資安知識網：先定義服務環節問題，再以案例作為觸發式參考">backend 模組七：資安與資料保護</a>：S3 公開、敏感埠全開這類掃描攔截的反模式，對應的資料保護原則</li>
</ul>
]]></content:encoded></item><item><title>Conftest</title><link>https://tarrragon.github.io/blog/backend/07-security-data-protection/vendors/conftest/</link><pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/backend/07-security-data-protection/vendors/conftest/</guid><description>&lt;p>Conftest 是 &lt;em>OPA CLI wrapper for static config policy check&lt;/em>、Open Policy Agent project 旗下的 CLI 工具、Apache 2.0 OSS、無商業版。它的核心定位是 &lt;em>CI-time policy gate&lt;/em>、有別於 admission runtime：在 git commit / PR / merge 階段、用 Rego policy 對 config file（Terraform HCL / K8s YAML / Dockerfile / JSON / TOML / INI / serverless.yml）做 static check、把 misconfiguration 攔在 deploy 之前。跟 &lt;a href="https://tarrragon.github.io/blog/backend/07-security-data-protection/vendors/opa/" data-link-title="Open Policy Agent (OPA)" data-link-desc="CNCF general-purpose policy engine、Rego Datalog-like 語言、decoupled decision &amp;#43; enforcement、跨 K8s / API / Terraform / SQL 統一 policy">OPA&lt;/a> / &lt;a href="https://tarrragon.github.io/blog/backend/07-security-data-protection/vendors/gatekeeper/" data-link-title="OPA Gatekeeper" data-link-desc="OPA 官方 K8s admission controller、ConstraintTemplate &amp;#43; Constraint 兩層、Rego policy &amp;#43; Audit &amp;#43; Mutation">Gatekeeper&lt;/a> / &lt;a href="https://tarrragon.github.io/blog/backend/07-security-data-protection/vendors/trivy/" data-link-title="Trivy" data-link-desc="Aqua Security 開源 all-in-one scanner：Container / Filesystem / K8s / IaC &amp;#43; Secret &amp;#43; License &amp;#43; SBOM、Apache 2.0、CI 友善">Trivy&lt;/a> Config 的差異在 &lt;em>執行時機 + 客製化方式&lt;/em>、規則表達力反而相近。&lt;/p>
&lt;h2 id="服務定位">服務定位&lt;/h2>
&lt;p>Conftest 是 OPA 生態中 &lt;em>最輕量的 CI-time tool&lt;/em> — 拿一份 Rego policy + 一份 config file、跑 &lt;code>conftest test&lt;/code> 就出 violation report。它不需要 cluster、不需要 daemon、不接 admission webhook、只是個 binary。跟 &lt;a href="https://tarrragon.github.io/blog/backend/07-security-data-protection/vendors/opa/" data-link-title="Open Policy Agent (OPA)" data-link-desc="CNCF general-purpose policy engine、Rego Datalog-like 語言、decoupled decision &amp;#43; enforcement、跨 K8s / API / Terraform / SQL 統一 policy">OPA&lt;/a> 比、OPA 是 &lt;em>runtime decision engine&lt;/em>（HTTP server / library / sidecar 提供 policy decision）、Conftest 只是 &lt;em>CLI 跑 once、結束即關&lt;/em>。跟 &lt;a href="https://tarrragon.github.io/blog/backend/07-security-data-protection/vendors/gatekeeper/" data-link-title="OPA Gatekeeper" data-link-desc="OPA 官方 K8s admission controller、ConstraintTemplate &amp;#43; Constraint 兩層、Rego policy &amp;#43; Audit &amp;#43; Mutation">Gatekeeper&lt;/a> 比、Gatekeeper 是 &lt;em>K8s admission controller runtime&lt;/em>、會在 kubectl apply 時攔下違規；Conftest 是在 PR 階段就攔下、deploy 前就 fail CI。跟 &lt;a href="https://tarrragon.github.io/blog/backend/07-security-data-protection/vendors/kyverno/" data-link-title="Kyverno" data-link-desc="K8s-native policy engine、YAML policy（非 Rego）、五類 rule（Validate / Mutate / Generate / Verify Images / Cleanup）、CNCF Incubating">Kyverno&lt;/a> 比、Kyverno 是 K8s-only 的 admission policy（YAML 語法）、Conftest 跨多 config format（不只 K8s）且用 Rego。跟 &lt;a href="https://tarrragon.github.io/blog/backend/07-security-data-protection/vendors/trivy/" data-link-title="Trivy" data-link-desc="Aqua Security 開源 all-in-one scanner：Container / Filesystem / K8s / IaC &amp;#43; Secret &amp;#43; License &amp;#43; SBOM、Apache 2.0、CI 友善">Trivy&lt;/a> Config 比、Trivy Config 是 &lt;em>built-in misconfig rule&lt;/em>（開箱即用、預定義常見 anti-pattern）、Conftest 是 &lt;em>自己寫 Rego policy&lt;/em>（客製化彈性大但要寫 rule）。&lt;/p></description><content:encoded><![CDATA[<p>Conftest 是 <em>OPA CLI wrapper for static config policy check</em>、Open Policy Agent project 旗下的 CLI 工具、Apache 2.0 OSS、無商業版。它的核心定位是 <em>CI-time policy gate</em>、有別於 admission runtime：在 git commit / PR / merge 階段、用 Rego policy 對 config file（Terraform HCL / K8s YAML / Dockerfile / JSON / TOML / INI / serverless.yml）做 static check、把 misconfiguration 攔在 deploy 之前。跟 <a href="/blog/backend/07-security-data-protection/vendors/opa/" data-link-title="Open Policy Agent (OPA)" data-link-desc="CNCF general-purpose policy engine、Rego Datalog-like 語言、decoupled decision &#43; enforcement、跨 K8s / API / Terraform / SQL 統一 policy">OPA</a> / <a href="/blog/backend/07-security-data-protection/vendors/gatekeeper/" data-link-title="OPA Gatekeeper" data-link-desc="OPA 官方 K8s admission controller、ConstraintTemplate &#43; Constraint 兩層、Rego policy &#43; Audit &#43; Mutation">Gatekeeper</a> / <a href="/blog/backend/07-security-data-protection/vendors/trivy/" data-link-title="Trivy" data-link-desc="Aqua Security 開源 all-in-one scanner：Container / Filesystem / K8s / IaC &#43; Secret &#43; License &#43; SBOM、Apache 2.0、CI 友善">Trivy</a> Config 的差異在 <em>執行時機 + 客製化方式</em>、規則表達力反而相近。</p>
<h2 id="服務定位">服務定位</h2>
<p>Conftest 是 OPA 生態中 <em>最輕量的 CI-time tool</em> — 拿一份 Rego policy + 一份 config file、跑 <code>conftest test</code> 就出 violation report。它不需要 cluster、不需要 daemon、不接 admission webhook、只是個 binary。跟 <a href="/blog/backend/07-security-data-protection/vendors/opa/" data-link-title="Open Policy Agent (OPA)" data-link-desc="CNCF general-purpose policy engine、Rego Datalog-like 語言、decoupled decision &#43; enforcement、跨 K8s / API / Terraform / SQL 統一 policy">OPA</a> 比、OPA 是 <em>runtime decision engine</em>（HTTP server / library / sidecar 提供 policy decision）、Conftest 只是 <em>CLI 跑 once、結束即關</em>。跟 <a href="/blog/backend/07-security-data-protection/vendors/gatekeeper/" data-link-title="OPA Gatekeeper" data-link-desc="OPA 官方 K8s admission controller、ConstraintTemplate &#43; Constraint 兩層、Rego policy &#43; Audit &#43; Mutation">Gatekeeper</a> 比、Gatekeeper 是 <em>K8s admission controller runtime</em>、會在 kubectl apply 時攔下違規；Conftest 是在 PR 階段就攔下、deploy 前就 fail CI。跟 <a href="/blog/backend/07-security-data-protection/vendors/kyverno/" data-link-title="Kyverno" data-link-desc="K8s-native policy engine、YAML policy（非 Rego）、五類 rule（Validate / Mutate / Generate / Verify Images / Cleanup）、CNCF Incubating">Kyverno</a> 比、Kyverno 是 K8s-only 的 admission policy（YAML 語法）、Conftest 跨多 config format（不只 K8s）且用 Rego。跟 <a href="/blog/backend/07-security-data-protection/vendors/trivy/" data-link-title="Trivy" data-link-desc="Aqua Security 開源 all-in-one scanner：Container / Filesystem / K8s / IaC &#43; Secret &#43; License &#43; SBOM、Apache 2.0、CI 友善">Trivy</a> Config 比、Trivy Config 是 <em>built-in misconfig rule</em>（開箱即用、預定義常見 anti-pattern）、Conftest 是 <em>自己寫 Rego policy</em>（客製化彈性大但要寫 rule）。</p>
<p>關鍵張力：<em>CI-time static check</em> ↔ <em>runtime admission enforcement</em> 是兩種互補機制、不是替代。CI 抓在 deploy 之前、但 manifest 不一定都走 PR（kubectl apply 直接打 cluster 就漏接）；admission 抓 runtime 寫入、但 deploy 後才 fail 已經慢。production 通常 CI（Conftest / Trivy Config）+ admission（Gatekeeper / Kyverno）雙層覆蓋。</p>
<h2 id="本章目標">本章目標</h2>
<p>讀完本頁、讀者能判斷：</p>
<ol>
<li>Conftest 在 policy-as-code stack 中承擔哪一段（CI gate）、跟 admission runtime 怎麼分工</li>
<li>Rego policy directory / <code>conftest test</code> / <code>conftest verify</code> / Bundle / Combine flag 的 ownership 跟工程化做法</li>
<li>Conftest vs Trivy Config vs Checkov vs OPA + custom CI wrapper 的取捨</li>
<li>何時用 Conftest、何時走 Trivy Config（不想學 Rego）或 Gatekeeper（runtime enforcement）</li>
</ol>
<h2 id="最短判讀路徑">最短判讀路徑</h2>
<p>判斷 Conftest 導入是否健康、最少看四件事：</p>
<ul>
<li><strong>Policy directory 走版控</strong>：Rego files（<code>policy/*.rego</code>）跟 application code 同 repo、或抽到中央 policy repo + Bundle 拉取、PR review 才能改 policy</li>
<li><strong><code>conftest verify</code> 是否跑</strong>：Rego policy 本身有單元測試（<code>*_test.rego</code>）、policy 改動有 test coverage、不是寫完就上 production CI</li>
<li><strong>CI integration 點</strong>：跑在 PR check / merge gate、fail 阻斷 merge、不是只跑 warning 沒人看</li>
<li><strong>跟 admission 是否雙層</strong>：CI fail 之外、cluster 也裝 <a href="/blog/backend/07-security-data-protection/vendors/gatekeeper/" data-link-title="OPA Gatekeeper" data-link-desc="OPA 官方 K8s admission controller、ConstraintTemplate &#43; Constraint 兩層、Rego policy &#43; Audit &#43; Mutation">Gatekeeper</a> / <a href="/blog/backend/07-security-data-protection/vendors/kyverno/" data-link-title="Kyverno" data-link-desc="K8s-native policy engine、YAML policy（非 Rego）、五類 rule（Validate / Mutate / Generate / Verify Images / Cleanup）、CNCF Incubating">Kyverno</a> 接 runtime；否則 kubectl apply 繞過 CI 就破口</li>
</ul>
<p>四件事任一缺失、就是 <a href="/blog/backend/07-security-data-protection/supply-chain-integrity-and-artifact-trust/" data-link-title="7.12 供應鏈完整性與 Artifact 信任" data-link-desc="定義 build provenance、artifact 信任與交付鏈風險問題">Supply Chain Integrity</a> 邊界的待補項目。</p>
<h2 id="日常操作與決策形狀">日常操作與決策形狀</h2>
<p><strong>Policy directory（Rego files）</strong>：Conftest 預設讀 <code>./policy/</code> 目錄下所有 <code>*.rego</code> 檔。Policy 用 <code>deny[msg]</code> / <code>warn[msg]</code> / <code>violation[msg]</code> rule 表達 — <code>deny</code> 失敗整個 test、<code>warn</code> 只 print 不 fail、<code>violation</code> 給結構化輸出（含 metadata 給後續 SOAR 處理）。慣例是一個 policy 檔對一個 anti-pattern（<code>policy/k8s_privileged.rego</code> / <code>policy/terraform_public_s3.rego</code>）、不混寫。</p>
<p><strong><code>conftest test</code> command</strong>：<code>conftest test deployment.yaml</code> / <code>conftest test --policy ./custom-policy terraform.plan.json</code> 是日常入口。支援 <code>--all-namespaces</code>（K8s 多 manifest）、<code>--input</code>（強制 parser 類型）、<code>--combine</code>（跨檔 check）、<code>--output json|tap|table</code>（CI 報表格式）。CI integration 通常 <code>conftest test infra/**/*.yaml --output github</code> 直接 emit GitHub Actions annotation。</p>
<p><strong>Parser（多 format 支援）</strong>：Conftest 原生支援 HCL（Terraform）/ YAML / JSON / TOML / INI / Dockerfile / CUE / Jsonnet / EDN / XML / VCL（Fastly）/ Cyclonedx SBOM 等。同一份 Rego 可跑多 format — parser 把不同 format normalize 成 Rego input JSON、policy 寫 <code>input.spec.containers[_].securityContext.privileged == true</code> 不管原本是 YAML 還是 JSON。這是 Conftest 比 Checkov / Trivy Config 客製化彈性更大的關鍵：同一個 policy 引擎處理跨 format misconfig。</p>
<p><strong>Combine flag（跨檔 check）</strong>：<code>conftest test --combine *.yaml</code> 把多檔合併成單一 input array、policy 可跨檔 reference。實務場景：K8s deployment 必須有對應 service + configmap + networkpolicy、缺一就 fail；Terraform module 跨檔 reference（VPC + subnet + security group）必須一致。沒有 Combine 就只能單檔檢查、跨檔 invariant 抓不到。</p>
<p><strong><code>conftest verify</code>（policy unit test）</strong>：Policy 本身要有測試 — <code>policy/k8s_privileged_test.rego</code> 寫 <code>test_privileged_denied</code> / <code>test_non_privileged_allowed</code>、<code>conftest verify</code> 跑這些測試。Policy 改動先跑 verify、再 merge policy 到 production CI。沒做 verify 的 policy 是「policy 自己 broken 沒人發現」的常見破口。</p>
<p><strong>Bundle（OPA bundle 拉 policy）</strong>：<code>conftest pull</code> 從 OCI registry / HTTP / git / S3 拉 policy bundle、policy 集中在 central repo、各 service repo 不重複維護。Bundle 包含 Rego files + data files + manifest、可簽章驗證（配 <a href="https://docs.sigstore.dev/">Sigstore cosign</a>）。大組織通常 platform team 維護 policy bundle、application team 在 CI 拉最新版本跑。</p>
<p><strong>CI integration</strong>：GitHub Actions（<code>open-policy-agent/conftest-action</code>）/ GitLab CI / Jenkins / CircleCI 都有現成 step。跑點通常在 PR check 階段（review 前 fail fast）+ merge gate（防止繞過）。失敗訊息含 file / line / policy name、SOC / SRE 看 annotation 就知道哪行違規。</p>
<h2 id="核心取捨表">核心取捨表</h2>
<table>
  <thead>
      <tr>
          <th>取捨維度</th>
          <th>Conftest</th>
          <th>Trivy Config</th>
          <th>Checkov</th>
          <th>OPA + custom CI wrapper</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>規則來源</td>
          <td>自己寫 Rego（彈性大、要學 Rego）</td>
          <td>內建 misconfig rule（開箱即用）</td>
          <td>內建 + 自訂 Python rule</td>
          <td>自己寫 Rego + 自己包 CI</td>
      </tr>
      <tr>
          <td>學習曲線</td>
          <td>中 — Rego 語法 + Conftest CLI</td>
          <td>緩 — <code>trivy config</code> 一個指令</td>
          <td>緩 — 內建 rule、自訂 Python 稍重</td>
          <td>陡 — Rego + 自己組 CI 跑點</td>
      </tr>
      <tr>
          <td>Format 支援</td>
          <td>廣 — Terraform / K8s / Dockerfile 等</td>
          <td>中 — Terraform / K8s / CloudFormation</td>
          <td>中 — Terraform / K8s / Serverless</td>
          <td>看自己包</td>
      </tr>
      <tr>
          <td>客製彈性</td>
          <td>高 — 任意 Rego policy</td>
          <td>低 — 內建 rule、客製要寫 plugin</td>
          <td>中 — Python custom rule</td>
          <td>高</td>
      </tr>
      <tr>
          <td>跨檔 check</td>
          <td>強 — <code>--combine</code> flag</td>
          <td>弱 — 主要單檔</td>
          <td>中</td>
          <td>看自己寫</td>
      </tr>
      <tr>
          <td>Policy 共享</td>
          <td>OPA Bundle（OCI / git / HTTP）</td>
          <td>Trivy DB（中央更新）</td>
          <td>Checkov rule pack</td>
          <td>自己管</td>
      </tr>
      <tr>
          <td>計費</td>
          <td>OSS Apache 2.0</td>
          <td>OSS（Aqua 商業版有加值）</td>
          <td>OSS（Bridgecrew 商業版）</td>
          <td>OSS（OPA）</td>
      </tr>
      <tr>
          <td>適合場景</td>
          <td>客製化 policy、Rego 已用、跨 format</td>
          <td>開箱即用、團隊不想學 Rego</td>
          <td>Terraform-heavy、Python team 熟</td>
          <td>OPA 已是 runtime、CI 想複用 policy</td>
      </tr>
  </tbody>
</table>
<p>選 Conftest 的核心訴求：<em>組織有 custom policy 需求</em> + <em>已用 OPA / Rego（admission 走 Gatekeeper、CI 走 Conftest 統一語言）</em> + <em>跨多 config format 需要同一個 policy 引擎</em>。如果只是要 K8s privileged container / Terraform public S3 這類常見 anti-pattern 攔截、直接 Trivy Config 開箱即用更划算。</p>
<h2 id="進階主題">進階主題</h2>
<p><strong><code>conftest verify</code>（policy unit test lifecycle）</strong>：Policy 是 code、code 要有測試。<code>policy/k8s_privileged_test.rego</code> 寫 <code>test_privileged_denied { count(deny) &gt; 0 with input as {...} }</code>、CI 跑 <code>conftest verify ./policy</code> 把 policy test 當 unit test。Policy change 走 PR → verify pass → 部署到 central bundle → application CI 拉新版本。沒 verify 的 policy 是「沒人知道 policy 自己壞掉、所有 application CI 都 silently pass」的 systemic 風險。</p>
<p><strong>Bundle 從 OCI registry pull + 簽章驗證</strong>：<code>conftest pull oci://registry.example.com/policy-bundle:v1.2.3</code> 從 OCI registry 拉 policy bundle、policy distribution 走 container image 同一套 supply chain。配 <a href="https://docs.sigstore.dev/">Sigstore cosign</a> 簽章驗證、policy bundle 也走 <a href="/blog/backend/07-security-data-protection/supply-chain-integrity-and-artifact-trust/" data-link-title="7.12 供應鏈完整性與 Artifact 信任" data-link-desc="定義 build provenance、artifact 信任與交付鏈風險問題">7.12 供應鏈完整性</a> 的 release gate 邏輯 — policy 本身就是 artifact、需要 signing + verification。</p>
<p><strong>跟 <a href="/blog/backend/07-security-data-protection/vendors/trivy/" data-link-title="Trivy" data-link-desc="Aqua Security 開源 all-in-one scanner：Container / Filesystem / K8s / IaC &#43; Secret &#43; License &#43; SBOM、Apache 2.0、CI 友善">Trivy</a> Config 混用</strong>：實務上 <em>Trivy Config 跑預設 rule</em>（CIS / NSA / OWASP baseline、開箱即用）+ <em>Conftest 跑客製 rule</em>（organization-specific：必須有特定 label、必須走特定 namespace、必須引用特定 ConfigMap）。兩者 CI 階段都跑、報表合併。不是二選一、是 baseline + custom 的分工。</p>
<p><strong>跟 admission 雙層</strong>：CI 階段 Conftest fail 之外、cluster 也裝 <a href="/blog/backend/07-security-data-protection/vendors/gatekeeper/" data-link-title="OPA Gatekeeper" data-link-desc="OPA 官方 K8s admission controller、ConstraintTemplate &#43; Constraint 兩層、Rego policy &#43; Audit &#43; Mutation">Gatekeeper</a> 接 admission。Gatekeeper 用 ConstraintTemplate（也是 Rego）、同一份 Rego policy 理論上 CI / runtime 共用 — 但實務上 admission 場景跟 static check 場景的 input shape 不同（admission 拿 AdmissionReview object、static 拿 raw manifest）、policy 通常分兩份維護或寫 abstraction layer 共用。</p>
<h2 id="排錯與失敗快速判讀">排錯與失敗快速判讀</h2>
<ul>
<li><strong>Policy pass 但 production 還是 misconfig</strong>：CI 沒卡在 merge gate、或有 <code>kubectl apply</code> 繞過 PR — 加 admission controller（<a href="/blog/backend/07-security-data-protection/vendors/gatekeeper/" data-link-title="OPA Gatekeeper" data-link-desc="OPA 官方 K8s admission controller、ConstraintTemplate &#43; Constraint 兩層、Rego policy &#43; Audit &#43; Mutation">Gatekeeper</a> / <a href="/blog/backend/07-security-data-protection/vendors/kyverno/" data-link-title="Kyverno" data-link-desc="K8s-native policy engine、YAML policy（非 Rego）、五類 rule（Validate / Mutate / Generate / Verify Images / Cleanup）、CNCF Incubating">Kyverno</a>）做 runtime 雙層</li>
<li><strong>Rego policy 自己 broken / silently pass</strong>：沒寫 <code>*_test.rego</code> + <code>conftest verify</code> — 補 policy unit test、CI 跑 verify 才 promote</li>
<li><strong><code>conftest test</code> 跑出 0 violations 但 manifest 有問題</strong>：policy directory 沒讀對、或 parser 自動偵測選錯 — 顯式 <code>--policy ./policy --input yaml</code></li>
<li><strong>跨檔 invariant 抓不到</strong>（deployment 沒對應 service）：忘記用 <code>--combine</code> flag — 改 <code>conftest test --combine *.yaml</code></li>
<li><strong>Bundle 拉到舊版本 / policy drift</strong>：沒固定 bundle tag、用 <code>latest</code> 漂移 — bundle reference 用 digest（<code>sha256:...</code>）或 immutable tag</li>
<li><strong>False positive 多 / team 開始 ignore CI</strong>：policy 寫得太寬、沒考慮合理例外 — Rego 加 exception list（<code>data.exceptions</code>）、走 <a href="/blog/backend/07-security-data-protection/blue-team/" data-link-title="7.B 防守者視角（藍隊）與控制面驗證" data-link-desc="從防守者角度整理控制面、偵測路由、驗證策略與演練回寫">Exception Workflow</a> lifecycle</li>
<li><strong>Policy 散落各 application repo / 維護不一致</strong>：沒走 central bundle — 抽 policy 到中央 repo、各 application 拉 bundle</li>
</ul>
<h2 id="何時改走其他服務">何時改走其他服務</h2>
<table>
  <thead>
      <tr>
          <th>需求形狀</th>
          <th>改走</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>開箱即用、不想學 Rego</td>
          <td><a href="/blog/backend/07-security-data-protection/vendors/trivy/" data-link-title="Trivy" data-link-desc="Aqua Security 開源 all-in-one scanner：Container / Filesystem / K8s / IaC &#43; Secret &#43; License &#43; SBOM、Apache 2.0、CI 友善">Trivy Config</a></td>
      </tr>
      <tr>
          <td>K8s admission runtime</td>
          <td><a href="/blog/backend/07-security-data-protection/vendors/gatekeeper/" data-link-title="OPA Gatekeeper" data-link-desc="OPA 官方 K8s admission controller、ConstraintTemplate &#43; Constraint 兩層、Rego policy &#43; Audit &#43; Mutation">Gatekeeper</a> / <a href="/blog/backend/07-security-data-protection/vendors/kyverno/" data-link-title="Kyverno" data-link-desc="K8s-native policy engine、YAML policy（非 Rego）、五類 rule（Validate / Mutate / Generate / Verify Images / Cleanup）、CNCF Incubating">Kyverno</a></td>
      </tr>
      <tr>
          <td>Runtime application policy</td>
          <td><a href="/blog/backend/07-security-data-protection/vendors/opa/" data-link-title="Open Policy Agent (OPA)" data-link-desc="CNCF general-purpose policy engine、Rego Datalog-like 語言、decoupled decision &#43; enforcement、跨 K8s / API / Terraform / SQL 統一 policy">OPA</a></td>
      </tr>
      <tr>
          <td>Terraform-heavy + Python team</td>
          <td>Checkov / tfsec</td>
      </tr>
      <tr>
          <td>Cloud-native CNAPP</td>
          <td>Wiz / Prisma Cloud</td>
      </tr>
  </tbody>
</table>
<h2 id="不在本頁內的主題">不在本頁內的主題</h2>
<ul>
<li>Rego 完整語法 reference、<code>every</code> / <code>walk</code> / built-in function 進階用法</li>
<li>OPA Bundle 的 server-side 實作（policy publish pipeline）</li>
<li>Conftest 跟 Open Policy Agent runtime 的內部架構差異</li>
<li>Sigstore cosign 的 keyless signing flow 細節</li>
</ul>
<h2 id="案例回寫">案例回寫</h2>
<p>Conftest 在 07 案例庫沒有直接 vendor-level 事件、但所有 supply chain case 都是 CI-time policy gate 的對照：</p>
<table>
  <thead>
      <tr>
          <th>案例</th>
          <th>跟 Conftest 的關係（對照啟示）</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="/blog/backend/07-security-data-protection/red-team/cases/supply-chain/solarwinds-2020-sunburst/" data-link-title="7.R7.2.1 SolarWinds 2020：更新鏈被濫用" data-link-desc="合法更新流程遭植入後，攻擊者如何長期潛伏與橫向擴散">SolarWinds 2020 Sunburst</a></td>
          <td>Conftest 在 CI 階段檢查 Terraform / K8s manifest 是否符合 image signing policy（image 必須來自 signed registry、必須有 cosign attestation）</td>
      </tr>
      <tr>
          <td><a href="/blog/backend/07-security-data-protection/red-team/cases/supply-chain/log4shell-cve-2021-44228-component-chain/" data-link-title="7.R7.2.7 Log4Shell 2021：共用元件風險與修補鏈" data-link-desc="共用元件漏洞如何同步影響多服務，並迫使團隊建立依賴治理 workflow">Log4Shell CVE-2021-44228</a></td>
          <td>Conftest 配 SBOM 檔案做 CI-time vulnerable component check、補 admission 之前的 gate（image 含 log4j-core &lt;2.17 直接 PR fail）</td>
      </tr>
      <tr>
          <td><a href="/blog/backend/07-security-data-protection/supply-chain-integrity-and-artifact-trust/" data-link-title="7.12 供應鏈完整性與 Artifact 信任" data-link-desc="定義 build provenance、artifact 信任與交付鏈風險問題">7.12 供應鏈完整性 (section)</a></td>
          <td>Conftest 是 release gate 在 CI 階段的 policy enforcement 工具、跟 admission 雙層覆蓋、policy bundle 本身也走 cosign 簽章 supply chain</td>
      </tr>
  </tbody>
</table>
<h2 id="下一步路由">下一步路由</h2>
<ul>
<li>上游：<a href="/blog/backend/07-security-data-protection/supply-chain-integrity-and-artifact-trust/" data-link-title="7.12 供應鏈完整性與 Artifact 信任" data-link-desc="定義 build provenance、artifact 信任與交付鏈風險問題">7.12 供應鏈完整性與 artifact 信任</a></li>
<li>平行：<a href="/blog/backend/07-security-data-protection/vendors/opa/" data-link-title="Open Policy Agent (OPA)" data-link-desc="CNCF general-purpose policy engine、Rego Datalog-like 語言、decoupled decision &#43; enforcement、跨 K8s / API / Terraform / SQL 統一 policy">OPA</a>、<a href="/blog/backend/07-security-data-protection/vendors/gatekeeper/" data-link-title="OPA Gatekeeper" data-link-desc="OPA 官方 K8s admission controller、ConstraintTemplate &#43; Constraint 兩層、Rego policy &#43; Audit &#43; Mutation">Gatekeeper</a>、<a href="/blog/backend/07-security-data-protection/vendors/kyverno/" data-link-title="Kyverno" data-link-desc="K8s-native policy engine、YAML policy（非 Rego）、五類 rule（Validate / Mutate / Generate / Verify Images / Cleanup）、CNCF Incubating">Kyverno</a>、<a href="/blog/backend/07-security-data-protection/vendors/trivy/" data-link-title="Trivy" data-link-desc="Aqua Security 開源 all-in-one scanner：Container / Filesystem / K8s / IaC &#43; Secret &#43; License &#43; SBOM、Apache 2.0、CI 友善">Trivy</a></li>
<li>跨類：<a href="/blog/backend/07-security-data-protection/vendors/github-advanced-security/" data-link-title="GitHub Advanced Security" data-link-desc="GitHub 內建 4 大模組：Code Scanning (CodeQL) &#43; Secret Scanning &#43; Dependency Review &#43; Dependabot、跟 PR / Security tab 深度整合">GitHub Advanced Security</a>（CI security check pipeline 共用）</li>
<li>官方：<a href="https://www.conftest.dev/">Conftest Documentation</a></li>
</ul>
]]></content:encoded></item></channel></rss>