<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Module 7: Infra Through the PR Flow with Automated Guardrails on Tarragon</title><link>https://tarrragon.github.io/blog/infra/07-infra-as-pr/</link><description>Recent content in Module 7: Infra Through the PR Flow with Automated Guardrails on Tarragon</description><generator>Hugo -- gohugo.io</generator><language>zh-TW</language><copyright>Tarragon (CC BY 4.0)</copyright><lastBuildDate>Fri, 26 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://tarrragon.github.io/blog/infra/07-infra-as-pr/index.xml" rel="self" type="application/rss+xml"/><item><title>Infra Through the PR Flow with Automated Guardrails</title><link>https://tarrragon.github.io/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/</guid><description>&lt;p>Infra changes should follow the same flow as application code: branch, open a PR, run checks, review the diff, merge, release. This principle turns an infrastructure change from "someone ran apply in their own terminal" into "a record the team can review". It is where IaC actually pays off, and it is the key to removing the "only I understand the infra" single point of dependency. Infrastructure breaks, needs to be traced back, and gets handed over just like code does, so it needs the same safeguards.&lt;/p>
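&lt;h2 id="infra-變更走-code-流程">Infra changes follow the code flow&lt;/h2>
&lt;p>The standard path for an infra change is PR → plan → review diff → merge → apply. The point of this order is to make "see exactly what will change before executing" a mandatory step, rather than discovering a wrong change from an incident after apply. Each stage carries part of the review responsibility; drop any one of them and infra falls back to being unreviewable.&lt;/p>
&lt;h3 id="plan-是整條鏈最關鍵的一環">plan is the most critical link in the chain&lt;/h3>
&lt;p>&lt;code>terraform plan&lt;/code> compares three things (the current state, the real cloud resources, and the target configuration) and produces a diff of which resources will be added, changed, or deleted. That diff is what gets reviewed: the reviewer reads the actual changes plan computed instead of reading HCL and imagining the result.&lt;/p>
&lt;p>The most important signal in plan output is the action type. &lt;code>+&lt;/code> is create, &lt;code>~&lt;/code> is update in place, &lt;code>-&lt;/code> is destroy, and &lt;code>-/+&lt;/code> is destroy and then recreate. The first two are safe in most situations; the last two need a line-by-line read. Changing a harmless-looking field can trigger a full resource replacement (&lt;code>-/+&lt;/code>): for example, the &lt;code>name&lt;/code> or &lt;code>identifier&lt;/code> of some cloud resources is immutable, and the only way to change it is to destroy and recreate. For stateful services (RDS, an EBS volume holding data), &lt;code>-/+&lt;/code> means data loss or downtime. Catching that &lt;code>-/+&lt;/code> during review is far cheaper than finding it halfway through apply.&lt;/p>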





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl"># Markers in plan output that call for extra caution
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl"># forces replacement: marks the attribute whose change triggers destroy and recreate
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl"># must be replaced: the resource header for the same event, shown above the attribute list
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl"># will be destroyed: the resource will be deleted
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl"> # aws_db_instance.primary must be replaced
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl"> -/+ resource &amp;#34;aws_db_instance&amp;#34; &amp;#34;primary&amp;#34; {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl"> ~ identifier = &amp;#34;app-prod&amp;#34; -&amp;gt; &amp;#34;app-production&amp;#34; # forces replacement
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl"> ...
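&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl"> }&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="把-plan-結果貼回-pr">Post the plan result back to the PR&lt;/h3>
&lt;p>Posting the plan result back to the PR is what makes review effective. In practice, the PR triggers CI to run plan, the plan output is posted as a PR comment, and the reviewer reads it alongside the code diff. Merge is allowed only after approval, and only the merge triggers apply.&lt;/p>
&lt;p>There is a trade-off here. If a long time passes between plan and apply, the real cloud state may have drifted (someone changed it by hand, or another PR applied first), so the plan at apply time no longer matches what was reviewed. There are two responses, a conservative one and a pragmatic one. The conservative approach reruns plan before apply and compares the results: continue if they match, abort if they do not. The pragmatic approach reruns plan when the merge triggers apply and proceeds automatically only if there is no destroy or replace; if there is, it stops and asks a human to confirm. Most teams start with the pragmatic approach and move to the conservative one after their first plan/apply mismatch incident.&lt;/p>
&lt;h3 id="apply-失敗的回退邊界">Rollback boundaries when apply fails&lt;/h3>
&lt;p>Unlike an application deployment, an infra apply cannot simply roll back to the previous image: when it fails midway, some resources already exist and the state may be half finished. For example, apply creates a new subnet and then times out while creating the route table. The subnet now exists in both the cloud and the state, while the route table exists only in the cloud and not in the state (or the other way around), so the next plan starts from an inaccurate baseline.&lt;/p></description><content:encoded><![CDATA[<p>Infra changes should follow the same flow as application code: branch, open a PR, run checks, review the diff, merge, release. This principle turns an infrastructure change from "someone ran apply in their own terminal" into "a record the team can review". It is where IaC actually pays off, and it is the key to removing the "only I understand the infra" single point of dependency. Infrastructure breaks, needs to be traced back, and gets handed over just like code does, so it needs the same safeguards.</p>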
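<h2 id="infra-變更走-code-流程">Infra changes follow the code flow</h2>
<p>The standard path for an infra change is PR → plan → review diff → merge → apply. The point of this order is to make "see exactly what will change before executing" a mandatory step, rather than discovering a wrong change from an incident after apply. Each stage carries part of the review responsibility; drop any one of them and infra falls back to being unreviewable.</p>
<h3 id="plan-是整條鏈最關鍵的一環">plan is the most critical link in the chain</h3>
<p><code>terraform plan</code> compares three things (the current state, the real cloud resources, and the target configuration) and produces a diff of which resources will be added, changed, or deleted. That diff is what gets reviewed: the reviewer reads the actual changes plan computed instead of reading HCL and imagining the result.</p>
<p>The most important signal in plan output is the action type. <code>+</code> is create, <code>~</code> is update in place, <code>-</code> is destroy, and <code>-/+</code> is destroy and then recreate. The first two are safe in most situations; the last two need a line-by-line read. Changing a harmless-looking field can trigger a full resource replacement (<code>-/+</code>): for example, the <code>name</code> or <code>identifier</code> of some cloud resources is immutable, and the only way to change it is to destroy and recreate. For stateful services (RDS, an EBS volume holding data), <code>-/+</code> means data loss or downtime. Catching that <code>-/+</code> during review is far cheaper than finding it halfway through apply.</p>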





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln"> 1</span><span class="cl"># Markers in plan output that call for extra caution
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"># forces replacement: marks the attribute whose change triggers destroy and recreate
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"># must be replaced: the resource header for the same event, shown above the attribute list
</span></span><span class="line"><span class="ln"> 4</span><span class="cl"># will be destroyed: the resource will be deleted
</span></span><span class="line"><span class="ln"> 5</span><span class="cl">
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">  # aws_db_instance.primary must be replaced
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">  -/+ resource &#34;aws_db_instance&#34; &#34;primary&#34; {
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">      ~ identifier = &#34;app-prod&#34; -&gt; &#34;app-production&#34;  # forces replacement
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">        ...
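</span></span><span class="line"><span class="ln">10</span><span class="cl">    }</span></span></code></pre></div><h3 id="把-plan-結果貼回-pr">Post the plan result back to the PR</h3>
<p>Posting the plan result back to the PR is what makes review effective. In practice, the PR triggers CI to run plan, the plan output is posted as a PR comment, and the reviewer reads it alongside the code diff. Merge is allowed only after approval, and only the merge triggers apply.</p>
<p>There is a trade-off here. If a long time passes between plan and apply, the real cloud state may have drifted (someone changed it by hand, or another PR applied first), so the plan at apply time no longer matches what was reviewed. There are two responses, a conservative one and a pragmatic one. The conservative approach reruns plan before apply and compares the results: continue if they match, abort if they do not. The pragmatic approach reruns plan when the merge triggers apply and proceeds automatically only if there is no destroy or replace; if there is, it stops and asks a human to confirm. Most teams start with the pragmatic approach and move to the conservative one after their first plan/apply mismatch incident.</p>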
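<p>The pragmatic gate can be sketched as a small shell check on the plan summary line. This is a minimal sketch under stated assumptions, not the article's pipeline: the function names <code>destroy_count</code> and <code>apply_decision</code> are hypothetical, and the plan text is assumed to come from <code>terraform plan -no-color</code> so that the <code>Plan: ... to destroy.</code> line is plain text. It relies on a replacement being counted as one add plus one destroy in that summary.</p>

```bash
#!/usr/bin/env bash
# Sketch: decide whether a plan may be auto-applied, based on the summary
# line printed by `terraform plan -no-color`, for example:
#   Plan: 1 to add, 0 to change, 1 to destroy.
# A replacement (-/+) shows up as one add plus one destroy, so a non-zero
# destroy count covers both destroys and replacements.

# destroy_count (hypothetical helper) reads plan text on stdin and prints
# the destroy count. It prints 0 when there is no "Plan:" line, which is
# the case for a "No changes." plan.
destroy_count() {
  local n
  n=$(grep -E '^Plan: ' | grep -Eo '[0-9]+ to destroy' | grep -Eo '^[0-9]+' | head -n 1)
  echo "${n:-0}"
}

# apply_decision (hypothetical helper) prints "auto" when nothing would be
# destroyed or replaced, and "manual" otherwise.
apply_decision() {
  if [ "$(destroy_count)" -eq 0 ]; then
    echo "auto"
  else
    echo "manual"
  fi
}

# Demo with a hard-coded summary line instead of a real plan.
sample_plan='Plan: 1 to add, 0 to change, 1 to destroy.'
printf '%s\n' "$sample_plan" | apply_decision
```

<p>In a workflow, the input would be the output of <code>terraform plan -no-color</code>, and apply would run unattended only when the result is <code>auto</code>. Reading the JSON form from <code>terraform show -json</code> is sturdier than matching text, at the cost of an extra parsing tool.</p>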
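<h3 id="apply-失敗的回退邊界">Rollback boundaries when apply fails</h3>
<p>Unlike an application deployment, an infra apply cannot simply roll back to the previous image: when it fails midway, some resources already exist and the state may be half finished. For example, apply creates a new subnet and then times out while creating the route table. The subnet now exists in both the cloud and the state, while the route table exists only in the cloud and not in the state (or the other way around), so the next plan starts from an inaccurate baseline.</p>
<p>The discipline is: after a failed apply, first run <code>terraform plan</code> to see the gap between state and reality, then decide whether to fix the code and apply again, or to clean up the leftover resources by hand and then run <code>terraform state rm</code>. Until that cleanup is done, do not change the code again and do not fire a second apply: a second apply running on an uncertain state can make the problem bigger.</p>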
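<p>The "plan first" step after a failed apply can be scripted with the <code>-detailed-exitcode</code> flag of <code>terraform plan</code>, which exits with 0 when there is no diff, 1 on error, and 2 when changes are pending. The sketch below is illustrative only; the function names and the messages are hypothetical.</p>

```bash
#!/usr/bin/env bash
# Sketch: turn the exit status of `terraform plan -detailed-exitcode`
# into a next step after a failed apply.
#   0 = no diff, 1 = error, 2 = changes pending

# describe_plan_status (hypothetical helper) maps an exit status to advice.
describe_plan_status() {
  case "$1" in
    0) echo "clean: no pending changes" ;;
    2) echo "diff: changes pending, review them before applying again" ;;
    *) echo "error: plan failed, inspect state and credentials first" ;;
  esac
}

# check_after_failed_apply (hypothetical helper) runs the plan in the
# current working directory and reports what the exit status means.
check_after_failed_apply() {
  local status=0
  terraform plan -detailed-exitcode -input=false -no-color || status=$?
  describe_plan_status "$status"
}
```

<p>Calling <code>check_after_failed_apply</code> from the Terraform working directory prints the plan followed by one summary line, which makes the "do not apply again yet" decision explicit rather than a judgment made while reading scrollback.</p>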
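<p>The value of the PR flow here is not only review before the fact but traceability after it: every change maps to a commit and a PR, so when tracing back you know which change it was, why it was made, and who reviewed it.</p>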
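<h2 id="fmt-與-validate最便宜的第一道檢查">fmt and validate: the cheapest first check</h2>
<p><code>fmt</code> and <code>validate</code> are the baseline checks that run before any security scan. Their job is to block inconsistent formatting and syntax or type errors, problems that need no judgment. They are fast (usually under five seconds) and leave no room for false positives, which makes them a good fail-fast gate at the front of CI.</p>
<p><code>terraform fmt -check</code> verifies that the code follows the canonical layout. It does not affect infrastructure behavior; its value is removing diff noise. When everyone's editor indents differently, PR diffs fill up with pure formatting changes that bury the real logic change, and reviewers miss things more easily. With one format, what remains in the diff is the semantic change. Locally, pair it with an editor plugin or a pre-commit hook that runs fmt automatically, so the CI fmt check almost never fails; it exists to catch people who have not installed the plugin.</p>
<p><code>validate</code> checks whether the configuration holds up in syntax and internal consistency: references to undeclared variables, type mismatches, and missing required arguments are reported at this stage, without waiting for plan to contact the cloud. validate needs <code>terraform init</code> to run first (which is also where a module <code>source</code> that cannot be resolved surfaces), but <code>-backend=false</code> skips the connection to the state backend, so the whole check runs in CI without cloud credentials.</p>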





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c"># .github/workflows/terraform.yml: baseline checks before plan</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">validate</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform fmt -check -recursive</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init -backend=false</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform validate</span></span></span></code></pre></div><p>A fmt or validate failure means "this code is not ready for serious review yet". It is the author's problem to fix first and should not take up reviewer attention. Make both a required CI gate and authors will run and fix them locally, so the PR arrives clean.</p>
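<p>The same three commands can run locally before a commit. The sketch below assumes a hand-written git hook; the function name <code>run_basic_checks</code> and the hook location are illustrative choices, and a team using a hook manager would wire the commands in through that tool instead.</p>

```bash
#!/usr/bin/env bash
# Sketch of a local check that mirrors the CI gate above.
# Assumed layout: saved as .git/hooks/pre-commit and made executable,
# with the repository root as the Terraform working directory.

# run_basic_checks (hypothetical helper) runs fmt, init without a backend,
# and validate in order, stopping at the first failure.
run_basic_checks() {
  terraform fmt -check -recursive || return 1
  terraform init -backend=false -input=false || return 1
  terraform validate || return 1
}

# In the hook file itself, finish with:
#   run_basic_checks || exit 1
```

<p>Keeping the local commands identical to the CI step means a green local run predicts a green gate, which is the point of treating these failures as the author's job.</p>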
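<h2 id="tflint--checkov--tfsec抓壞寫法與安全漏洞">tflint / checkov / tfsec: catching bad patterns and security holes</h2>
<p>fmt and validate confirm that the code is syntactically correct, but a syntactically correct configuration can still be a dangerous one. Static scanners such as tflint, checkov, and tfsec cover the semantic layer: without creating any resources, they match HCL against known bad patterns and security anti-patterns and stop problems before plan. They cover the blind spots a reviewer's eye tends to miss: a person can overlook a <code>0.0.0.0/0</code>, a rule will not.</p>
<h3 id="三者的側重">Where each tool focuses</h3>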
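<table>
  <thead>
      <tr>
          <th>Tool</th>
          <th>Focus</th>
          <th>Typical hits</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>tflint</td>
          <td>Provider-level correctness and conventions</td>
          <td>Deprecated arguments, invalid instance types, naming violations</td>
      </tr>
      <tr>
          <td>checkov</td>
          <td>Security and compliance (CIS benchmark oriented)</td>
          <td>Public S3, missing encryption, missing logging, overly broad IAM</td>
      </tr>
      <tr>
          <td>tfsec</td>
          <td>Security anti-patterns (HCL structure oriented)</td>
          <td>Sensitive ports wide open, missing encryption, hardcoded secrets</td>
      </tr>
  </tbody>
</table>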
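<p>checkov and tfsec overlap (both scan for public S3 buckets and wide-open security groups); they differ in rule sources and report format. checkov's rules track CIS benchmarks and multi-cloud compliance frameworks (AWS, Azure, GCP, Kubernetes), while tfsec focuses more on Terraform HCL structure. When both run together, duplicate hits can be waived with a skip marker in one of them.</p>
<h3 id="兩個最常攔下的反模式">The two anti-patterns caught most often</h3>
<p><strong>S3 bucket exposed publicly.</strong> A bucket with no <code>aws_s3_bucket_public_access_block</code>, or with its ACL set to <code>public-read</code>, makes its objects readable by the whole internet. In HCL this is only a line or two, easy to wave through in review because it "looks like boilerplate", yet the consequence is a data leak. checkov flags the exposure with <code>CKV_AWS_20</code> (ACL allows public read) and <code>CKV_AWS_53</code> through <code>CKV_AWS_56</code> (the four public access block settings). <code>CKV_AWS_19</code> (server-side encryption not enabled) often fires on the same bare bucket, but it concerns encryption rather than public exposure:</p>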





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># Flagged by checkov: the bucket has no public access block
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="c1"></span><span class="k">resource</span> <span class="s2">&#34;aws_s3_bucket&#34; &#34;data&#34;</span> {
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="n">  bucket</span> <span class="o">=</span> <span class="s2">&#34;acme-customer-data&#34;</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">}<span class="c1">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="c1">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="c1"># Correct: explicitly block public access
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="c1"></span><span class="k">resource</span> <span class="s2">&#34;aws_s3_bucket_public_access_block&#34; &#34;data&#34;</span> {
</span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="n">  bucket</span>                  <span class="o">=</span> <span class="k">aws_s3_bucket</span><span class="p">.</span><span class="k">data</span><span class="p">.</span><span class="k">id</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="n">  block_public_acls</span>       <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="n">  block_public_policy</span>     <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="n">  ignore_public_acls</span>      <span class="o">=</span> <span class="kt">true</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="n">  restrict_public_buckets</span> <span class="o">=</span> <span class="kt">true</span>
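</span></span><span class="line"><span class="ln">13</span><span class="cl">}</span></span></code></pre></div><p><strong>Security group open to the world.</strong> An ingress rule with <code>cidr_blocks = [&quot;0.0.0.0/0&quot;]</code> on port 22 or 3306 exposes SSH or the database port to every scanner on the internet. Both tfsec and checkov flag this combination of a sensitive port and a wide-open CIDR. This rule and the security group tightening principle in <a href="/blog/infra/03-network-foundation/" data-link-title="模組三：網路地基 — VPC 與分層" data-link-desc="VPC、public / private subnet 切分、route table、NAT、security group 設計">Module 3: Network Foundations</a> are two ends of the same thing: Module 3 teaches how to write the rules correctly, and this chapter uses static scanning to make sure a wrong rule gets stopped.</p>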





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Three scans in sequence; under set -e (the default for CI run steps) any failure stops the run</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">tflint --recursive
</span></span><span class="line"><span class="ln">3</span><span class="cl">checkov -d . --quiet --compact
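</span></span><span class="line"><span class="ln">4</span><span class="cl">tfsec . --soft-fail<span class="o">=</span>false</span></span></code></pre></div><h3 id="命中是候選不是判決">A hit is a candidate, not a verdict</h3>
<p>When reading these hits, separate real vulnerabilities from exceptions that make sense in context. Not every <code>0.0.0.0/0</code> is wrong: a public HTTPS load balancer open to the internet on port 443 is the design intent. So a scan hit is a candidate, not a verdict.</p>
<p>Most tools support inline comments to mark a waiver. checkov uses <code>#checkov:skip=CKV_AWS_260:public ALB on 443 is by design</code>, tfsec uses <code>#tfsec:ignore:aws-elb-alb-not-public</code>. The discipline for waivers: every skip states a reason and is visible in the PR. A skip without a reason is no different from turning the rule off, so in review an unexplained skip should raise the same alarm as a bare <code>0.0.0.0/0</code>.</p>
<p>Making exceptions explicit and recording why they were waived is safer than disabling the whole rule. Skips that accumulate over time also need a periodic inventory: an exception that was reasonable at the time may no longer be reasonable after the architecture evolves.</p>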
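<p>The "every skip states a reason" rule can be checked mechanically. The sketch below is one possible approach, not a feature of checkov itself: the function name <code>find_unjustified_skips</code> is hypothetical, and it only looks at checkov's inline form <code>#checkov:skip=ID:reason</code> in <code>.tf</code> files. tfsec ignores carry no reason field, so they would need a separate convention such as a comment on the adjacent line.</p>

```bash
#!/usr/bin/env bash
# Sketch: list checkov skip comments that carry no reason.
# checkov's inline form is:  #checkov:skip=<CHECK_ID>:<reason>

# find_unjustified_skips (hypothetical helper) prints file:line:text for
# every checkov skip under the given directory whose reason is missing or
# empty. Like grep, it returns non-zero when nothing is found.
find_unjustified_skips() {
  local dir="${1:-.}"
  grep -rnE --include='*.tf' \
    'checkov:skip=[A-Za-z0-9_]+:?[[:space:]]*$' "$dir"
}

# Example use as a CI step (fails the job when a bare skip exists):
#   if find_unjustified_skips infra; then
#     echo "add a reason to each checkov skip" >&2
#     exit 1
#   fi
```

<p>The same grep without the end-of-line anchor gives the full list of skips, which is a reasonable starting point for the periodic inventory.</p>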
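<h2 id="atlantis-與-github-actions自動化-plan-與-apply">Atlantis and GitHub Actions: automating plan and apply</h2>
<p>Automating the flow above needs an execution layer that listens for PR events and runs plan and apply at the right moments. The two common options are writing workflows directly on a CI platform (such as GitHub Actions), or using a tool built for the Terraform PR flow, such as Atlantis.</p>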
<h3 id="atlantis">Atlantis</h3>
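<p>Atlantis is a long-running service attached to the git platform's webhooks. When a PR opens it runs <code>plan</code> automatically and posts the result as a PR comment; after the reviewer approves, someone comments <code>atlantis apply</code> on the PR, and only then does it run apply and report back. Its value is folding the rules of "who can apply, whether approval is required before apply, where the plan result lives" into one consistent, configurable flow.</p>
<p>Atlantis's built-in locking is especially useful when several PRs are open at once. When two PRs both touch the same Terraform project, the second PR's plan is blocked by the lock until the first PR is merged or closed (or its lock is released by hand). This avoids two PRs each holding a plan based on a different state snapshot and overwriting each other at apply time. With GitHub Actions you have to build this locking yourself (usually Terraform's own state lock plus a workflow concurrency group), which is considerably more complex.</p>
<p>The cost of Atlantis is that it is itself a long-running service to deploy, upgrade, and protect. It holds write access to the cloud, so access to its deployment environment must be tightly controlled.</p>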
<h3 id="github-actions">GitHub Actions</h3>
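<p>The advantage of a GitHub Actions workflow is that there is no extra service to operate and it shares runners with the existing CI. The drawback is that apply gating has to be assembled from workflow conditions. A complete workflow usually has two jobs: the PR triggers a plan job (fmt / validate / scan / plan, with the result posted back to the PR), and only a merge to main triggers the apply job.</p>
<p>Whichever execution layer you pick, automated apply needs write access to the cloud, and where that access comes from is the security foundation of the whole pipeline. This is where the OIDC groundwork from <a href="/blog/infra/02-identity-credentials/" data-link-title="模組二：身分與憑證地基 — IAM 與 OIDC" data-link-desc="IAM role / policy 設計、最小權限，以及用 OIDC 短期憑證取代長期 access key">Module 2: Identity and Credential Foundations</a> pays off: the pipeline should not store a long-lived access key; instead, the runner exchanges an OIDC token for short-lived cloud credentials at run time.</p>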





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c"># After merge to main, exchange OIDC for short-lived credentials, then apply (see Module 2)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">apply</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">if</span><span class="p">:</span><span class="w"> </span><span class="l">github.ref == &#39;refs/heads/main&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">    </span><span class="nt">permissions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">      </span><span class="nt">id-token</span><span class="p">:</span><span class="w"> </span><span class="l">write  </span><span class="w"> </span><span class="c"># lets the runner request an OIDC token</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">      </span><span class="nt">contents</span><span class="p">:</span><span class="w"> </span><span class="l">read</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">aws-actions/configure-aws-credentials@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">          </span><span class="nt">role-to-assume</span><span class="p">:</span><span class="w"> </span><span class="l">arn:aws:iam::123456789012:role/infra-apply</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">          </span><span class="nt">aws-region</span><span class="p">:</span><span class="w"> </span><span class="l">ap-northeast-1</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform apply -auto-approve</span></span></span></code></pre></div><h3 id="選型判準">Selection criteria</h3>
<table>
  <thead>
      <tr>
          <th>Consideration</th>
          <th>GitHub Actions</th>
          <th>Atlantis</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Operational cost</td>
          <td>No extra service</td>
          <td>Long-running service to deploy and upgrade</td>
      </tr>
      <tr>
          <td>Locking</td>
          <td>Terraform's own state lock + concurrency group</td>
          <td>Built-in project lock, mutual exclusion across PRs</td>
      </tr>
      <tr>
          <td>apply gating</td>
          <td>Assembled by hand from environment rules</td>
          <td>Built-in approve + <code>atlantis apply</code> semantics</td>
      </tr>
      <tr>
          <td>Cross-repo consistency</td>
          <td>Each repo writes its own workflow</td>
          <td>One server config governs all repos</td>
      </tr>
      <tr>
          <td>Best fit</td>
          <td>Few repos, simple flow</td>
          <td>Many repos, unified apply governance</td>
      </tr>
  </tbody>
</table>
<p>On the boundary of automatic apply: for high-risk plans that would replace or delete resources, most teams keep a manual apply gate (Atlantis's manual <code>atlantis apply</code>, or an environment protection rule on the workflow that requires someone to click confirm), so that such changes never run unattended at the moment of merge. Automation is meant to remove repetitive work and human omissions, not to remove judgment along with them.</p>
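<h2 id="知識留在-code而不是留在個人腦中">Knowledge lives in code, not in one person's head</h2>
<p>Once the full PR flow is in place, the real gain is that knowledge moves from individual memory into the repo. Every decision such as "why does this security group open this port" or "why this instance type for this machine" is kept as code plus PR description plus review discussion, so a newcomer can reconstruct the original reasoning from the repo without asking the one person who "understands the infra". Infrastructure that can be read is infrastructure that can be handed over. With the PR flow live, management can confirm from the repo's PR merge history and plan comments that every infra change went through proposal and review. That history is itself the change record audits ask for, with nothing extra to produce.</p>
<h3 id="git-revert-的能力與邊界">What git revert can and cannot do</h3>
<p>Being revertible is the most direct payoff of the PR flow. When a change causes problems, the rollback is to <code>git revert</code> that commit and go through the PR flow again, returning the infrastructure to its previous configuration, the same motion as rolling back broken code. Compare the old manual world: rollback depended on the person remembering what they changed and undoing it by hand in the Console, and a wrong memory or an absent person meant no rollback. With change history in git, rollback shifts from depending on someone's memory to depending on version history.</p>
<p>The boundary of this revert capability needs to be stated clearly. Reverting code restores configuration; it does not restore state and data that have already been destroyed:</p>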
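<ul>
<li>Reverting a commit that deleted an RDS instance only returns the configuration to "this resource should exist". On apply, Terraform will try to create a new, empty database; the data in the deleted one does not come back.</li>
<li>Reverting a rename or replace change may trigger another replacement, because <code>identifier</code> changes back and identifier is an immutable attribute.</li>
<li>A state left behind by a half-failed apply cannot be fixed by reverting code; the mismatch between state and the real cloud has to be resolved first.</li>
</ul>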
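<p>Real rollback for stateful changes still depends on backups and snapshots, which is what the stateful handling in <a href="/blog/infra/05-core-services/" data-link-title="模組五：核心服務上 IaC" data-link-desc="資料庫、運算、儲存、load balancer 怎麼寫進基礎設施程式碼，以及上線順序">Module 5: Core Services in IaC</a> and the secret / state protection in <a href="/blog/infra/08-governance-habits/" data-link-title="模組八：治理好習慣 — 規模長大後不失控的最小節奏" data-link-desc="tagging 規範、secrets 不進 code、成本可見性、最小可行節奏，規模長大後不失控">Module 8: Governance Habits</a> are about. Treating git revert as a configuration-level rollback is honest; treating it as a data-level rollback is how you fall through in an incident.</p>
<h3 id="知識共享的判讀訊號">Signals that knowledge is actually shared</h3>
<p>The signal for whether a team has really put its knowledge in code: when the main infra owner is on leave, can others understand the current setup and safely change a small setting just by reading the repo? If the answer is "we have to wait until they are back", then however complete the toolchain, the knowledge is still in one person's head and the PR flow is a formality. This signal reflects infra maturity better than any tool configuration.</p>
<p>Moving knowledge from heads into the repo takes more than the PR flow itself; it also needs organizational support: deliberate review rotation, on-call rotation, and paired operations. <a href="/blog/infra/09-driving-adoption/" data-link-title="模組九：怎麼把 infra 推動起來" data-link-desc="技術正確不等於推得動 — 信任赤字、期望值對齊、知識共享，infra 落地的組織課題">Module 9: Getting Infra Adopted</a> develops this at the organizational level. This chapter covers the technical mechanism, that code can retain knowledge; Module 9 covers how to get the team to actually follow the flow and hand the knowledge over.</p>
<h2 id="跨分類引用">Cross-references</h2>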
<ul>
<li>→ <a href="/blog/ci/" data-link-title="CI/CD 教學" data-link-desc="整理 CI/CD 的驗證、建置、發布 gate 與不同部署場域的流程差異，讓每次變更都能被穩定驗證與交付">CI/CD Tutorial</a>: the infra pipeline uses the same validation and release gates; plan / apply map to the build / deploy stages</li>
<li>→ <a href="/blog/infra/02-identity-credentials/" data-link-title="模組二：身分與憑證地基 — IAM 與 OIDC" data-link-desc="IAM role / policy 設計、最小權限，以及用 OIDC 短期憑證取代長期 access key">Module 2: Identity and Credential Foundations</a>: the pipeline obtains apply permission through OIDC; this chapter is where that OIDC design pays off</li>
<li>→ <a href="/blog/infra/03-network-foundation/" data-link-title="模組三：網路地基 — VPC 與分層" data-link-desc="VPC、public / private subnet 切分、route table、NAT、security group 設計">Module 3: Network Foundations</a>: the security group tightening principle; this chapter uses tfsec / checkov in CI to stop wide-open rules written by mistake</li>
<li>→ <a href="/blog/infra/05-core-services/" data-link-title="模組五：核心服務上 IaC" data-link-desc="資料庫、運算、儲存、load balancer 怎麼寫進基礎設施程式碼，以及上線順序">Module 5: Core Services in IaC</a>: protection strategy for stateful resources; git revert cannot restore the data layer</li>
<li>→ <a href="/blog/infra/08-governance-habits/" data-link-title="模組八：治理好習慣 — 規模長大後不失控的最小節奏" data-link-desc="tagging 規範、secrets 不進 code、成本可見性、最小可行節奏，規模長大後不失控">Module 8: Governance Habits</a>: secret / state protection</li>
<li>→ <a href="/blog/infra/09-driving-adoption/" data-link-title="模組九：怎麼把 infra 推動起來" data-link-desc="技術正確不等於推得動 — 信任赤字、期望值對齊、知識共享，infra 落地的組織課題">Module 9: Getting Infra Adopted</a>: the technical mechanism for keeping knowledge in code, described here, is developed there into organizational adoption and knowledge sharing</li>
<li>→ <a href="/blog/backend/07-security-data-protection/" data-link-title="模組七：資安與資料保護" data-link-desc="以問題驅動方式擴充資安知識網：先定義服務環節問題，再以案例作為觸發式參考">backend Module 7: Security and Data Protection</a>: the data protection principles behind the anti-patterns the scans catch, such as public S3 and wide-open sensitive ports</li>
<li>→ <a href="/blog/infra/02-identity-credentials/team-access-management/" data-link-title="團隊權限分級與存取管理" data-link-desc="用 admin / operator / viewer 三級劃分團隊成員的雲端操作權限，設計臨時提權流程、定期 access review 節奏，以及 contractor 與外部 vendor 的存取邊界">Team Permission Tiers</a>: permission changes go through the PR flow, so policy adjustments leave a review record</li>
<li>→ <a href="/blog/infra/08-governance-habits/handover-design/" data-link-title="職務交接與存取撤銷設計" data-link-desc="人員異動時的存取撤銷順序、credential rotation、最小交接清單，以及讓交接成本結構性降低的 infra 設計原則">Handover Design</a>: PR history is the knowledge carrier during a handover</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/terraform-ci-pipeline-setup/" data-link-title="Terraform CI Pipeline 設定指南" data-link-desc="用 GitHub Actions 建立完整的 Terraform CI pipeline：fmt → validate → tflint → plan → PR comment → apply，含 OIDC credential 與環境保護規則">Terraform CI Pipeline Setup Guide</a>: the complete GitHub Actions workflow</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/checkov-tfsec-rule-customization/" data-link-title="checkov 與 tfsec 規則配置" data-link-desc="靜態掃描工具的規則選擇策略、自訂規則、豁免管理、false positive 處理與 CI 整合，讓掃描從噪音來源變成可信的品質關卡">checkov and tfsec Rule Configuration</a>: rule selection, waiver management, CI integration</li>
</ul>
]]></content:encoded></item><item><title>Terraform CI Pipeline Setup Guide</title><link>https://tarrragon.github.io/blog/infra/07-infra-as-pr/terraform-ci-pipeline-setup/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/07-infra-as-pr/terraform-ci-pipeline-setup/</guid><description>&lt;p>For the Terraform PR flow to deliver value, plan and apply need to run automatically in CI rather than on an engineer's machine. This article builds a complete pipeline with GitHub Actions: checks and plan run when a PR opens, the plan result is posted as a PR comment for the reviewer, and apply runs only after merge to main. The pipeline obtains short-lived credentials through OIDC (see &lt;a href="https://tarrragon.github.io/blog/infra/02-identity-credentials/oidc-trust-policy-setup/" data-link-title="OIDC Trust Policy 設定指南" data-link-desc="GitHub Actions 與 AWS 之間的 OIDC 聯合設定：建立 provider、設計 trust policy 的 claim 收斂、plan 與 apply role 分離、常見錯誤排查">OIDC Trust Policy Setup&lt;/a>) and stores no long-lived key.&lt;/p>
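&lt;h2 id="pipeline-的兩個階段">The two stages of the pipeline&lt;/h2>
&lt;p>The pipeline has two trigger points, each with its own responsibility:&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Stage&lt;/th>
 &lt;th>Trigger&lt;/th>
 &lt;th>Responsibility&lt;/th>
 &lt;th>On failure&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Plan&lt;/td>
 &lt;td>PR opened or updated&lt;/td>
 &lt;td>Check formatting, validate syntax, run static scans, produce the plan diff&lt;/td>
 &lt;td>PR cannot be merged&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Apply&lt;/td>
 &lt;td>Merge to main&lt;/td>
 &lt;td>Apply the planned changes to the cloud&lt;/td>
 &lt;td>Manual intervention required&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>The two stages use different IAM roles: the plan role is read-only (it can run &lt;code>terraform plan&lt;/code> but cannot change any resource), and the apply role has write access. This separation ensures that no code at the PR stage can quietly modify cloud resources.&lt;/p>
&lt;h2 id="plan-階段的完整-workflow">The complete workflow for the plan stage&lt;/h2>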





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="line">&lt;span class="ln"> 1&lt;/span>&lt;span class="cl">&lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Terraform Plan&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 2&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">on&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 3&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">pull_request&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 4&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">paths&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 5&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="s1">&amp;#39;infra/**&amp;#39;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 6&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 7&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">permissions&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 8&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">id-token&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">write&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln"> 9&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">contents&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">read&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">10&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">pull-requests&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">write&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">11&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">12&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">jobs&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">13&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">plan&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">14&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">runs-on&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">ubuntu-latest&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">15&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">defaults&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">16&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">17&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">working-directory&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">infra/environments/prod&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">18&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">19&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">steps&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">20&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">actions/checkout@v4&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">21&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">22&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">aws-actions/configure-aws-credentials@v4&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">23&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">with&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">24&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">role-to-assume&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">arn:aws:iam::123456789012:role/infra-plan&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">25&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">aws-region&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">ap-northeast-1&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">26&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">27&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">hashicorp/setup-terraform@v3&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">28&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">with&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">29&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">terraform_version&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="m">1.9.0&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">30&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">31&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Format check&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">32&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform fmt -check -recursive -diff&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">33&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">34&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Init&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">35&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform init -input=false&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">36&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">37&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Validate&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">38&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform validate&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">39&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">40&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">TFLint&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">41&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">terraform-linters/setup-tflint@v4&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">42&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">with&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">43&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">tflint_version&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">latest&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">44&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">tflint --recursive --format compact&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">45&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">46&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Plan&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">47&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">id&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">plan&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">48&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">|&lt;/span>&lt;span class="sd">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">49&lt;/span>&lt;span class="cl">&lt;span class="sd"> terraform plan -no-color -input=false -out=tfplan \
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">50&lt;/span>&lt;span class="cl">&lt;span class="sd"> -detailed-exitcode 2&amp;gt;&amp;amp;1 | tee plan-output.txt&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">51&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">continue-on-error&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="kc">true&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">52&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">53&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Comment plan on PR&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">54&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">uses&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">actions/github-script@v7&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">55&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">with&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">56&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">script&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="p">|&lt;/span>&lt;span class="sd">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">57&lt;/span>&lt;span class="cl">&lt;span class="sd"> const fs = require(&amp;#39;fs&amp;#39;);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">58&lt;/span>&lt;span class="cl">&lt;span class="sd"> const plan = fs.readFileSync(&amp;#39;infra/environments/prod/plan-output.txt&amp;#39;, &amp;#39;utf8&amp;#39;);
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">59&lt;/span>&lt;span class="cl">&lt;span class="sd"> const truncated = plan.length &amp;gt; 60000
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">60&lt;/span>&lt;span class="cl">&lt;span class="sd"> ? plan.substring(0, 60000) + &amp;#39;\n\n... (truncated)&amp;#39;
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">61&lt;/span>&lt;span class="cl">&lt;span class="sd"> : plan;
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">62&lt;/span>&lt;span class="cl">&lt;span class="sd"> await github.rest.issues.createComment({
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">63&lt;/span>&lt;span class="cl">&lt;span class="sd"> owner: context.repo.owner,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">64&lt;/span>&lt;span class="cl">&lt;span class="sd"> repo: context.repo.repo,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">65&lt;/span>&lt;span class="cl">&lt;span class="sd"> issue_number: context.issue.number,
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">66&lt;/span>&lt;span class="cl">&lt;span class="sd"> body: `### Terraform Plan\n\`\`\`\n${truncated}\n\`\`\``
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">67&lt;/span>&lt;span class="cl">&lt;span class="sd"> });&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">68&lt;/span>&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">69&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">Fail if plan errored&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">70&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">if&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">steps.plan.outcome == &amp;#39;failure&amp;#39;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">71&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">run&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">exit 1&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="各步驟的職責">各步驟的職責&lt;/h3>
&lt;p>&lt;strong>Format check&lt;/strong> 驗證 HCL 是否符合標準排版。它不影響功能，但消除 diff 噪音——排版不一致時 PR diff 會混入純格式變更，reviewer 分不清哪些是邏輯改動。&lt;code>-diff&lt;/code> flag 讓 CI 輸出具體哪幾行不符合，作者在本地跑 &lt;code>terraform fmt&lt;/code> 就能修。&lt;/p></description><content:encoded><![CDATA[<p>Terraform 的 PR 流程要發揮價值，plan 和 apply 需要在 CI 裡自動執行，而非在工程師的本機跑。本篇用 GitHub Actions 建立一條完整的 pipeline：PR 開啟時跑檢查和 plan、plan 結果貼回 PR comment 讓 reviewer 看、合併到主幹後才 apply。整條管線的 credential 用 OIDC 取得短期 token（見 <a href="/blog/infra/02-identity-credentials/oidc-trust-policy-setup/" data-link-title="OIDC Trust Policy 設定指南" data-link-desc="GitHub Actions 與 AWS 之間的 OIDC 聯合設定：建立 provider、設計 trust policy 的 claim 收斂、plan 與 apply role 分離、常見錯誤排查">OIDC Trust Policy 設定</a>），不存任何長期 key。</p>
<h2 id="pipeline-的兩個階段">The two stages of the pipeline</h2>
<p>The pipeline has two trigger points, each with its own responsibility:</p>
<table>
  <thead>
      <tr>
          <th>Stage</th>
          <th>Trigger</th>
          <th>Responsibility</th>
          <th>On failure</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Plan</td>
          <td>PR opened or updated</td>
          <td>Format check, syntax validation, static analysis, plan diff</td>
          <td>PR cannot be merged</td>
      </tr>
      <tr>
          <td>Apply</td>
          <td>Merge to main</td>
          <td>Applies the planned changes to the cloud</td>
          <td>Manual intervention required</td>
      </tr>
  </tbody>
</table>
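<p>The two stages use different IAM roles: the plan role has read-only permissions (it can run <code>terraform plan</code> but cannot modify any resource), while the apply role has write permissions. This separation ensures that no code in a PR can quietly change cloud resources.</p>
<h2 id="plan-階段的完整-workflow">The full workflow for the plan stage</h2>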





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Terraform Plan</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">on</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">pull_request</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">      </span>- <span class="s1">&#39;infra/**&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w"></span><span class="nt">permissions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">  </span><span class="nt">id-token</span><span class="p">:</span><span class="w"> </span><span class="l">write</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">  </span><span class="nt">contents</span><span class="p">:</span><span class="w"> </span><span class="l">read</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">  </span><span class="nt">pull-requests</span><span class="p">:</span><span class="w"> </span><span class="l">write</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">  </span><span class="nt">plan</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">    </span><span class="nt">defaults</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">      </span><span class="nt">run</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">        </span><span class="nt">working-directory</span><span class="p">:</span><span class="w"> </span><span class="l">infra/environments/prod</span><span class="w">
</span></span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">21</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">22</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">aws-actions/configure-aws-credentials@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">23</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">24</span><span class="cl"><span class="w">          </span><span class="nt">role-to-assume</span><span class="p">:</span><span class="w"> </span><span class="l">arn:aws:iam::123456789012:role/infra-plan</span><span class="w">
</span></span></span><span class="line"><span class="ln">25</span><span class="cl"><span class="w">          </span><span class="nt">aws-region</span><span class="p">:</span><span class="w"> </span><span class="l">ap-northeast-1</span><span class="w">
</span></span></span><span class="line"><span class="ln">26</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">27</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln">28</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">29</span><span class="cl"><span class="w">          </span><span class="nt">terraform_version</span><span class="p">:</span><span class="w"> </span><span class="m">1.9.0</span><span class="w">
</span></span></span><span class="line"><span class="ln">30</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">31</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Format check</span><span class="w">
</span></span></span><span class="line"><span class="ln">32</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform fmt -check -recursive -diff</span><span class="w">
</span></span></span><span class="line"><span class="ln">33</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">34</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Init</span><span class="w">
</span></span></span><span class="line"><span class="ln">35</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init -input=false</span><span class="w">
</span></span></span><span class="line"><span class="ln">36</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">37</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Validate</span><span class="w">
</span></span></span><span class="line"><span class="ln">38</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform validate</span><span class="w">
</span></span></span><span class="line"><span class="ln">39</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">40</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">TFLint</span><span class="w">
</span></span></span><span class="line"><span class="ln">41</span><span class="cl"><span class="w">        </span><span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">terraform-linters/setup-tflint@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">42</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">43</span><span class="cl"><span class="w">          </span><span class="nt">tflint_version</span><span class="p">:</span><span class="w"> </span><span class="l">latest</span><span class="w">
</span></span></span><span class="line"><span class="ln">44</span><span class="cl"><span class="w">      </span>- <span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">tflint --recursive --format compact</span><span class="w">
</span></span></span><span class="line"><span class="ln">45</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">46</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Plan</span><span class="w">
</span></span></span><span class="line"><span class="ln">47</span><span class="cl"><span class="w">        </span><span class="nt">id</span><span class="p">:</span><span class="w"> </span><span class="l">plan</span><span class="w">
</span></span></span><span class="line"><span class="ln">48</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="ln">49</span><span class="cl"><span class="sd">          terraform plan -no-color -input=false -out=tfplan \
</span></span></span><span class="line"><span class="ln">50</span><span class="cl"><span class="sd">            -detailed-exitcode 2&gt;&amp;1 | tee plan-output.txt</span><span class="w">
</span></span></span><span class="line"><span class="ln">51</span><span class="cl"><span class="w">        </span><span class="nt">continue-on-error</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="ln">52</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">53</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Comment plan on PR</span><span class="w">
</span></span></span><span class="line"><span class="ln">54</span><span class="cl"><span class="w">        </span><span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/github-script@v7</span><span class="w">
</span></span></span><span class="line"><span class="ln">55</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">56</span><span class="cl"><span class="w">          </span><span class="nt">script</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="ln">57</span><span class="cl"><span class="sd">            const fs = require(&#39;fs&#39;);
</span></span></span><span class="line"><span class="ln">58</span><span class="cl"><span class="sd">            const plan = fs.readFileSync(&#39;infra/environments/prod/plan-output.txt&#39;, &#39;utf8&#39;);
</span></span></span><span class="line"><span class="ln">59</span><span class="cl"><span class="sd">            const truncated = plan.length &gt; 60000
</span></span></span><span class="line"><span class="ln">60</span><span class="cl"><span class="sd">              ? plan.substring(0, 60000) + &#39;\n\n... (truncated)&#39;
</span></span></span><span class="line"><span class="ln">61</span><span class="cl"><span class="sd">              : plan;
</span></span></span><span class="line"><span class="ln">62</span><span class="cl"><span class="sd">            await github.rest.issues.createComment({
</span></span></span><span class="line"><span class="ln">63</span><span class="cl"><span class="sd">              owner: context.repo.owner,
</span></span></span><span class="line"><span class="ln">64</span><span class="cl"><span class="sd">              repo: context.repo.repo,
</span></span></span><span class="line"><span class="ln">65</span><span class="cl"><span class="sd">              issue_number: context.issue.number,
</span></span></span><span class="line"><span class="ln">66</span><span class="cl"><span class="sd">              body: `### Terraform Plan\n\`\`\`\n${truncated}\n\`\`\``
</span></span></span><span class="line"><span class="ln">67</span><span class="cl"><span class="sd">            });</span><span class="w">
</span></span></span><span class="line"><span class="ln">68</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">69</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Fail if plan errored</span><span class="w">
</span></span></span><span class="line"><span class="ln">70</span><span class="cl"><span class="w">        </span><span class="nt">if</span><span class="p">:</span><span class="w"> </span><span class="l">steps.plan.outputs.exitcode == &#39;1&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln">71</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">exit 1</span></span></span></code></pre></div><h3 id="各步驟的職責">What each step does</h3>
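<p><strong>Format check</strong> verifies that the HCL follows the canonical layout. It does not affect behavior, but it removes diff noise: when formatting is inconsistent, the PR diff mixes in pure formatting changes and reviewers cannot tell which lines are logic changes. The <code>-diff</code> flag makes CI print exactly which lines are off, and the author can fix them by running <code>terraform fmt</code> locally.</p>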
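<p><strong>Init</strong> initializes the providers and the backend. <code>-input=false</code> keeps CI from hanging on an interactive prompt. If the backend is misconfigured (missing bucket, insufficient permissions), this step fails right away, so the later steps do not run and waste time.</p>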
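<p><strong>Validate</strong> checks HCL syntax and internal consistency: undeclared variables, type mismatches, missing required arguments. It never contacts the cloud and only reads code, so it runs without AWS credentials (it sits after init because validate needs the provider schemas).</p>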
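<p>Because validate only needs provider schemas, these checks could in principle live in a job that holds no cloud credentials at all. A minimal sketch, assuming a hypothetical job name <code>static-checks</code> and that initializing without the remote backend is acceptable for this directory:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">  # Hypothetical job: no configure-aws-credentials step, no id-token needed.
  static-checks:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infra/environments/prod
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.9.0

      - name: Format check
        run: terraform fmt -check -recursive -diff

      # -backend=false installs providers and modules but skips the remote
      # state backend, so no bucket access is required.
      - name: Init (no backend)
        run: terraform init -input=false -backend=false

      - name: Validate
        run: terraform validate</code></pre></div>
<p>Splitting the job this way keeps the credentialed plan job focused on the one step that actually needs to read cloud state.</p>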
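<p><strong>TFLint</strong> does provider-level correctness checks: an instance type that is not valid, deprecated arguments, names that break the naming convention. It covers what validate cannot catch: code whose syntax is right but whose values are wrong.</p>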
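<p>TFLint's provider rules ship as plugins, so they only take effect once the plugins are installed. A sketch of the lint steps with an explicit install step, assuming a <code>.tflint.hcl</code> in the working directory that declares the provider ruleset (that file is an assumption and is not shown in this article):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">      - uses: terraform-linters/setup-tflint@v4
        with:
          tflint_version: latest

      # Installs the plugins declared in .tflint.hcl (assumed to exist).
      - name: TFLint init
        run: tflint --init

      - name: TFLint
        run: tflint --recursive --format compact</code></pre></div>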
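<p><strong>Plan</strong> is the pipeline's core output. <code>-detailed-exitcode</code> makes the exit code distinguish three states: 0 = no changes, 1 = error, 2 = changes present. Exit code 2 is therefore not a failure, and because the command is piped through <code>tee</code>, the step's own status follows <code>tee</code> unless <code>pipefail</code> is set, so a failure gate has to look at terraform's exit code rather than the step outcome. <code>-out=tfplan</code> saves the plan as a binary file that the apply stage can execute directly, which avoids drift between plan time and apply time.</p>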
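<p>The three exit codes can also be captured in the shell itself. The following is a sketch rather than the workflow's definitive form: the output name <code>tf_exit</code> is hypothetical, and the comment step is assumed to sit between the two steps shown.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">      - name: Plan
        id: plan
        shell: bash
        run: |
          # Do not abort on a non-zero pipeline; the code is recorded below.
          set +e
          terraform plan -no-color -input=false -out=tfplan \
            -detailed-exitcode 2&gt;&amp;1 | tee plan-output.txt
          # PIPESTATUS[0] is terraform's status; $? would be tee's.
          echo "tf_exit=${PIPESTATUS[0]}" &gt;&gt; "$GITHUB_OUTPUT"

      # Only exit code 1 is an error; 0 (no changes) and 2 (changes) pass.
      - name: Fail if plan errored
        if: steps.plan.outputs.tf_exit == '1'
        run: exit 1</code></pre></div>
<p>Recording the code as a step output keeps the comment step running in every case while still failing the job on a real plan error.</p>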
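<p><strong>Comment</strong> posts the plan output back to the PR, so reviewers see the actual planned changes alongside the code diff. Plan output can be long (hundreds of lines); when it would exceed the GitHub comment limit, the script truncates it and keeps the beginning. Terraform prints the <code>Plan: N to add, N to change, N to destroy</code> summary line at the end of the output, so a truncated comment loses that summary.</p>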
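<p>Truncation drops everything past the cutoff, so one option is to keep the full text as a workflow artifact next to the comment. A minimal sketch, assuming the artifact name <code>plan-output</code> (hypothetical) and that the team is comfortable storing plan text as an artifact:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml">      # The path is relative to the repository root: defaults.run.working-directory
      # applies to run steps only, not to uses steps.
      - name: Upload full plan output
        uses: actions/upload-artifact@v4
        with:
          name: plan-output
          path: infra/environments/prod/plan-output.txt</code></pre></div>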
<h2 id="apply-階段">Apply stage</h2>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Terraform Apply</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">on</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">  </span><span class="nt">push</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">branches</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">main]</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">    </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">      </span>- <span class="s1">&#39;infra/**&#39;</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w"></span><span class="nt">permissions</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">  </span><span class="nt">id-token</span><span class="p">:</span><span class="w"> </span><span class="l">write</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">  </span><span class="nt">contents</span><span class="p">:</span><span class="w"> </span><span class="l">read</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">  </span><span class="nt">apply</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">    </span><span class="nt">environment</span><span class="p">:</span><span class="w"> </span><span class="l">production</span><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">    </span><span class="nt">defaults</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">      </span><span class="nt">run</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="w">        </span><span class="nt">working-directory</span><span class="p">:</span><span class="w"> </span><span class="l">infra/environments/prod</span><span class="w">
</span></span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">21</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">22</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">23</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">aws-actions/configure-aws-credentials@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln">24</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">25</span><span class="cl"><span class="w">          </span><span class="nt">role-to-assume</span><span class="p">:</span><span class="w"> </span><span class="l">arn:aws:iam::123456789012:role/infra-apply</span><span class="w">
</span></span></span><span class="line"><span class="ln">26</span><span class="cl"><span class="w">          </span><span class="nt">aws-region</span><span class="p">:</span><span class="w"> </span><span class="l">ap-northeast-1</span><span class="w">
</span></span></span><span class="line"><span class="ln">27</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">28</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">hashicorp/setup-terraform@v3</span><span class="w">
</span></span></span><span class="line"><span class="ln">29</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">30</span><span class="cl"><span class="w">          </span><span class="nt">terraform_version</span><span class="p">:</span><span class="w"> </span><span class="m">1.9.0</span><span class="w">
</span></span></span><span class="line"><span class="ln">31</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">32</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Init</span><span class="w">
</span></span></span><span class="line"><span class="ln">33</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform init -input=false</span><span class="w">
</span></span></span><span class="line"><span class="ln">34</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">35</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Plan (verify)</span><span class="w">
</span></span></span><span class="line"><span class="ln">36</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform plan -no-color -input=false -detailed-exitcode</span><span class="w">
</span></span></span><span class="line"><span class="ln">37</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">38</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Apply</span><span class="w">
</span></span></span><span class="line"><span class="ln">39</span><span class="cl"><span class="w">        </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l">terraform apply -auto-approve -input=false</span></span></span></code></pre></div><h3 id="environment-protection-rule">environment protection rule</h3>
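<p>The <code>environment: production</code> line attaches the job to a GitHub environment and its protection rules. Configure them in the repo under Settings → Environments → production:</p>
<ul>
<li><strong>Required reviewers</strong>: at least one designated person must approve before the apply job runs</li>
<li><strong>Wait timer</strong>: apply starts only N minutes after the merge (giving people time to react)</li>
<li><strong>Deployment branches</strong>: only the main branch is allowed to deploy</li>
</ul>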
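<p>This protection puts a manual confirmation in front of apply, which matters most for high-risk changes (a plan that shows destroy or replace). Required reviewers gate every run that targets the environment, though, so routine low-risk changes (adding a tag, tuning a parameter) wait for the same approval. That is the trade-off: one click per apply slows down frequent small changes. One way to limit the cost is to attach the protection rules to the production environment only and leave the other environments ungated.</p>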
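<p>Whether a plan contains a destroy or a replace can also be checked mechanically. A minimal sketch, assuming the plan has been exported to JSON with <code>terraform show -json</code> and relying on the <code>resource_changes[].change.actions</code> field of that output; the function name <code>risky_changes</code> and the sample plan are illustrative and not part of the workflow above:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import json


def risky_changes(plan):
    """Return (address, actions) pairs for every change that deletes a resource.

    Terraform reports a replace as ["delete", "create"] or ["create", "delete"],
    so looking for "delete" covers both destroy and replace.
    """
    risky = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        if "delete" in actions:
            risky.append((change["address"], tuple(actions)))
    return risky


# Hypothetical plan excerpt, shaped like `terraform show -json` output.
sample_plan = json.loads("""
{
  "resource_changes": [
    {"address": "aws_s3_bucket.logs", "change": {"actions": ["update"]}},
    {"address": "aws_db_instance.main", "change": {"actions": ["delete", "create"]}},
    {"address": "aws_instance.old", "change": {"actions": ["delete"]}},
    {"address": "aws_sns_topic.alerts", "change": {"actions": ["no-op"]}}
  ]
}
""")

for address, actions in risky_changes(sample_plan):
    print(address, "/".join(actions))</code></pre></div>
<p>A pipeline could print this list in the PR comment or use it to decide how prominently to ask for approval. It looks at action types only, so it complements the human review of the diff rather than replacing it.</p>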
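<h3 id="apply-階段重跑-plan-的理由">Why the apply stage re-runs plan</h3>
<p>Re-running plan right before apply verifies that the post-merge reality still matches what reviewers saw in the PR. Hours or days can pass between opening a PR and merging it; in that window someone may have changed cloud resources by hand (drift), or another PR may have been applied first. The re-run confirms that the diff is still the expected one. If it is not, the pipeline should stop rather than apply blindly.</p>
<p>If the plan stage saved its plan with <code>-out=tfplan</code>, apply can instead run <code>terraform apply tfplan</code> and execute exactly the plan that was reviewed. The cost is that the plan file has to travel between jobs (as a GitHub Actions artifact), and that it goes stale: if the state changes after the plan was created, apply refuses to run it.</p>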
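<p>The "stop if the re-run differs" rule can be sketched as a comparison between two JSON plans. This assumes both the reviewed plan and the re-run plan are available as <code>terraform show -json</code> output; <code>plan_fingerprint</code>, <code>plan_diverged</code>, and the sample data are hypothetical names used for illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def plan_fingerprint(plan):
    """Reduce a plan to the set of (address, actions) pairs that change something."""
    fingerprint = set()
    for change in plan.get("resource_changes", []):
        actions = tuple(change.get("change", {}).get("actions", []))
        if actions and actions != ("no-op",):
            fingerprint.add((change["address"], actions))
    return fingerprint


def plan_diverged(reviewed, rerun):
    """Return the changes that appear in only one of the two plans, sorted."""
    return sorted(plan_fingerprint(reviewed) ^ plan_fingerprint(rerun))


# Hypothetical data: the re-run plan picked up a replace nobody reviewed.
reviewed = {"resource_changes": [
    {"address": "aws_s3_bucket.logs", "change": {"actions": ["update"]}},
]}
rerun = {"resource_changes": [
    {"address": "aws_s3_bucket.logs", "change": {"actions": ["update"]}},
    {"address": "aws_db_instance.main", "change": {"actions": ["delete", "create"]}},
]}

unexpected = plan_diverged(reviewed, rerun)
if unexpected:
    print("plan changed since review:", unexpected)</code></pre></div>
<p>The fingerprint compares resource addresses and action types, not attribute-level values, so it catches new or vanished changes but not a change whose details shifted. Applying the saved plan file avoids that gap at the price of the artifact handling described above.</p>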
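<h2 id="多環境的-pipeline-設計">Multi-environment pipeline design</h2>
<p>When managing dev / staging / prod, the pipeline usually takes one of two shapes:</p>
<p><strong>Single workflow with a matrix</strong>: one YAML file uses <code>strategy.matrix</code> to run all three environments, each with its own working directory and IAM role. The upside is a single file to maintain; the cost is that every PR run produces plans for all three environments, so reviewers have three plan outputs to read.</p>
<p><strong>One workflow per environment</strong>: three YAML files, each triggered only by changes under its own environment directory (<code>paths: ['infra/environments/dev/**']</code>). The upside is that only the touched environment runs and PR comments stay clean; the cost is duplication across the three files.</p>
<p>Most teams start with a single workflow plus matrix, then switch to separate workflows once there are more than three environments or the apply policy differs per environment (dev applies automatically, prod requires approval).</p>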
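<p>Either structure needs a way to decide which environments a change touches. A rough sketch, assuming the <code>infra/environments/&lt;env&gt;/</code> layout used by the workflow above; <code>environments_touched</code> is an illustrative helper, and the file list would come from the PR's changed files:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">from pathlib import PurePosixPath

# Directory layout assumed from the workflow's working-directory setting.
ENV_ROOT = PurePosixPath("infra/environments")


def environments_touched(changed_files):
    """Return the sorted names of environments with at least one changed file."""
    touched = set()
    for name in changed_files:
        try:
            relative = PurePosixPath(name).relative_to(ENV_ROOT)
        except ValueError:
            continue  # the file lives outside infra/environments
        if len(relative.parts) > 1:  # an environment directory plus a path below it
            touched.add(relative.parts[0])
    return sorted(touched)


# Hypothetical list of files changed in a PR.
changed = [
    "infra/environments/dev/main.tf",
    "infra/environments/prod/rds.tf",
    "infra/modules/vpc/main.tf",
    "README.md",
]
print(environments_touched(changed))</code></pre></div>
<p>Changes under shared modules fall outside this mapping; a real pipeline would probably treat them as touching every environment that uses the module.</p>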
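<h2 id="安全邊界">Security boundaries</h2>
<p>The CI pipeline is the automated executor of infra changes, so its blast radius is whatever the apply role is permitted to do. A few boundaries have to hold:</p>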
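<p><strong>Narrow the OIDC claims</strong>: the apply role's trust policy should let only the main branch of one specific repo assume it (see <a href="/blog/infra/02-identity-credentials/oidc-trust-policy-setup/" data-link-title="OIDC Trust Policy 設定指南" data-link-desc="GitHub Actions 與 AWS 之間的 OIDC 聯合設定：建立 provider、設計 trust policy 的 claim 收斂、plan 與 apply role 分離、常見錯誤排查">OIDC Trust Policy setup</a>). If the policy checks the repo but not the branch, anyone can push a modified workflow to a feature branch and trigger an apply. Because the apply job above declares <code>environment: production</code>, GitHub issues its <code>sub</code> claim in the <code>repo:OWNER/REPO:environment:production</code> form, so the trust policy has to match that subject and the main-only restriction is carried by the environment's deployment-branch rule.</p>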
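<p>A quick way to audit this boundary is to scan the trust policy for subject conditions that are missing or too broad. The sketch below assumes an AWS trust policy already loaded as a Python dict and the GitHub OIDC condition key <code>token.actions.githubusercontent.com:sub</code>; <code>subject_problems</code> and the <code>example-org/infra</code> repo are made up for illustration, and the exact subject string depends on how the job runs (a branch ref, or an environment name when the job declares one):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">SUB_KEY = "token.actions.githubusercontent.com:sub"


def subject_problems(trust_policy):
    """Describe statements whose `sub` condition is missing or uses a wildcard."""
    problems = []
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also allows a single statement object
        statements = [statements]
    for index, statement in enumerate(statements):
        patterns = []
        for conditions in statement.get("Condition", {}).values():
            value = conditions.get(SUB_KEY, [])
            patterns.extend([value] if isinstance(value, str) else value)
        if not patterns:
            problems.append(f"statement {index}: no condition on {SUB_KEY}")
        for pattern in patterns:
            if "*" in pattern:
                problems.append(f"statement {index}: wildcard in {pattern}")
    return problems


# Hypothetical policies: the first accepts any subject from the repo,
# the second pins one exact subject.
loose_policy = {"Statement": [{
    "Effect": "Allow",
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {"StringLike": {SUB_KEY: "repo:example-org/infra:*"}},
}]}
strict_policy = {"Statement": [{
    "Effect": "Allow",
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {"StringEquals": {SUB_KEY: "repo:example-org/infra:environment:production"}},
}]}

for name, policy in (("loose", loose_policy), ("strict", strict_policy)):
    print(name, subject_problems(policy) or "ok")</code></pre></div>
<p>A wildcard is not always wrong, so the output is better read as a list of places to look at than as a verdict.</p>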
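<p><strong>Review workflow changes</strong>: YAML changes under <code>.github/workflows/</code> should go through PR review just like infra code. Editing a workflow edits the pipeline's behavior: one added <code>terraform destroy</code> step can wipe an entire environment on merge. GitHub's CODEOWNERS feature can require review from specific people for workflow changes.</p>
<p><strong>Secrets and environment variables</strong>: OIDC replaces the access keys that used to sit in repo secrets, but the workflow may still use other secrets (a Terraform Cloud token, a Slack webhook URL). Scope these to a specific environment so that only jobs running in that environment can read them, instead of exposing them to every branch.</p>
<p>This article focuses on GitHub Actions. If the team chooses Atlantis (a long-running service with built-in state locking and apply semantics), see the selection discussion in <a href="/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/" data-link-title="infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">the Atlantis section of the main article</a>.</p>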
<h2 id="跨分類引用">Cross-category references</h2>
<ul>
<li>→ <a href="/blog/infra/02-identity-credentials/oidc-trust-policy-setup/" data-link-title="OIDC Trust Policy 設定指南" data-link-desc="GitHub Actions 與 AWS 之間的 OIDC 聯合設定：建立 provider、設計 trust policy 的 claim 收斂、plan 與 apply role 分離、常見錯誤排查">OIDC Trust Policy setup</a>: where the pipeline's credentials come from</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/checkov-tfsec-rule-customization/" data-link-title="checkov 與 tfsec 規則配置" data-link-desc="靜態掃描工具的規則選擇策略、自訂規則、豁免管理、false positive 處理與 CI 整合，讓掃描從噪音來源變成可信的品質關卡">checkov / tfsec rule configuration</a>: how to configure the static security scans in the pipeline</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/" data-link-title="infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">Infra through the PR flow with automated guardrails</a>: the review principles behind the pipeline</li>
<li>→ <a href="/blog/infra/04-environment-separation/" data-link-title="模組四：環境分離與模組化" data-link-desc="dev / staging / prod 切分、目錄結構 vs workspace、用可重用 module 避免環境漂移">Module 4: environment separation and modularization</a>: the multi-environment directory layout determines the pipeline's working directory</li>
</ul>
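]]></content:encoded></item><item><title>checkov and tfsec rule configuration</title><link>https://tarrragon.github.io/blog/infra/07-infra-as-pr/checkov-tfsec-rule-customization/</link><pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate><guid>https://tarrragon.github.io/blog/infra/07-infra-as-pr/checkov-tfsec-rule-customization/</guid><description>&lt;p>Run checkov and tfsec straight after installing them and they typically report anywhere from dozens to hundreds of findings. Fixing all of them is unrealistic; ignoring all of them throws the value away. This article covers how to get from "the tools are installed" to "the tools' output is trustworthy and actionable": rule selection, severity filtering, exemption management, custom rules, CI integration, and a process for handling false positives.&lt;/p>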
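&lt;h2 id="規則選擇策略">Rule selection strategy&lt;/h2>
&lt;p>Both tools ship built-in rule sets numbering in the hundreds, covering everything from encryption settings to naming conventions. Run them all and the findings list grows so long that nobody reads it. The selection criterion is "when this rule fires, will the team actually fix it?" A rule whose answer is no only produces noise.&lt;/p>
&lt;h3 id="分層啟用">Tiered rollout&lt;/h3>
&lt;p>Split the rules into three tiers and enable them step by step instead of turning everything on at once:&lt;/p>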
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Tier&lt;/th>
 &lt;th>Rule type&lt;/th>
 &lt;th>Examples&lt;/th>
 &lt;th>When to enable&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Foundation&lt;/td>
 &lt;td>Data exposure and runaway permissions&lt;/td>
 &lt;td>S3 public access, SG 0.0.0.0/0, IAM wildcard&lt;/td>
 &lt;td>day 1&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Operations&lt;/td>
 &lt;td>Encryption and backup&lt;/td>
 &lt;td>RDS encryption, EBS encryption, backup retention&lt;/td>
 &lt;td>IaC coverage &amp;gt;50%&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Conventions&lt;/td>
 &lt;td>Naming, tagging, logging&lt;/td>
 &lt;td>missing tags, missing log group, resource naming&lt;/td>
 &lt;td>once governance has matured&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
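&lt;p>The foundation tier stays on even if every other rule is turned off: rules for publicly exposed S3 buckets (&lt;code>CKV_AWS_19&lt;/code>, &lt;code>CKV_AWS_53&lt;/code>) and wide-open security groups (&lt;code>CKV_AWS_24&lt;/code>, &lt;code>CKV_AWS_25&lt;/code>) point at a real problem whenever they fire. Check IDs are easy to mix up, so confirm what each one covers with &lt;code>checkov --list&lt;/code> before pinning it. Enable the operations tier once IaC coverage is high enough; before that, the scan hits large numbers of resources that are not yet under IaC management. Hold the conventions tier until the team can absorb the volume of findings.&lt;/p>
&lt;h3 id="checkov-的規則過濾">Rule filtering in checkov&lt;/h3>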





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># Run only the foundation-tier rules&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">checkov -d . --check CKV_AWS_19,CKV_AWS_53,CKV_AWS_24,CKV_AWS_25,CKV_AWS_40,CKV_AWS_145
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># Or filter by framework (scan Terraform only)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">checkov -d . --framework terraform --compact --quiet&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>checkov supports &lt;code>--check&lt;/code> (an allowlist: run only these) and &lt;code>--skip-check&lt;/code> (a denylist: skip these). Early on, the &lt;code>--check&lt;/code> allowlist is easier to control: the rules to run are named explicitly instead of being subtracted from the full set. Extend the allowlist as the team's capacity to absorb findings grows.&lt;/p>
&lt;h3 id="tfsec-的嚴重度過濾">Severity filtering in tfsec&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># Report only CRITICAL and HIGH&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">tfsec . --minimum-severity HIGH
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># Exclude specific rules&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">tfsec . --exclude aws-s3-specify-public-access-block&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>tfsec grades findings as CRITICAL / HIGH / MEDIUM / LOW. Start with &lt;code>--minimum-severity HIGH&lt;/code> to filter out the lower severities and cut the noise. Lower the threshold once the findings at HIGH and above are down to zero.&lt;/p>
&lt;h2 id="豁免管理">Exemption management&lt;/h2>
&lt;p>Not every finding is a defect: a public-facing ALB that opens port 443 to &lt;code>0.0.0.0/0&lt;/code> is design intent, not a vulnerability. The point of an exemption is to make the exception explicit, justified, and reviewable.&lt;/p>
&lt;h3 id="行內豁免">Inline exemptions&lt;/h3>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-hcl" data-lang="hcl">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="k">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_security_group_rule&amp;#34; &amp;#34;alb_https&amp;#34;&lt;/span> {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="n"> type&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;ingress&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="n"> from_port&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="m">443&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="n"> to_port&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="m">443&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">&lt;span class="n"> protocol&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;tcp&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">6&lt;/span>&lt;span class="cl">&lt;span class="n"> cidr_blocks&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;0.0.0.0/0&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="c1">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">7&lt;/span>&lt;span class="cl">&lt;span class="c1"> #checkov:skip=CKV_AWS_24:ALB HTTPS ingress has to be open to the internet
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">8&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>}&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The inline exemption in tfsec:&lt;/p>





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-hcl" data-lang="hcl">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="k">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_security_group_rule&amp;#34; &amp;#34;alb_https&amp;#34;&lt;/span> {&lt;span class="c1">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="c1"> #tfsec:ignore:aws-ec2-no-public-ingress-sgr -- ALB HTTPS listener requires public access
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="c1">&lt;/span>&lt;span class="n"> cidr_blocks&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;0.0.0.0/0&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>
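&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">}&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The upside of inline exemptions is that the reason sits next to the code and is visible at a glance during review. The downside is that they are scattered across files, so taking inventory means grepping.&lt;/p>
&lt;h3 id="集中式豁免">Centralized exemptions&lt;/h3>
&lt;p>checkov supports managing exemptions centrally in &lt;code>.checkov.yaml&lt;/code>:&lt;/p>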





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c"># .checkov.yaml&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">skip-check&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">CKV_AWS_24 &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># ALB public-facing SG rules&lt;/span>&lt;span class="w">
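&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="l">CKV_AWS_19 &lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c"># Legacy S3 buckets pending migration&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The upside of the centralized file is that every exemption is visible in one place, which suits global exceptions (for example, "this batch of legacy S3 buckets has not been migrated yet, so skip the public access check for now"). The downside is that the reason lives far from the code, and three months later nobody remembers why the check was skipped.&lt;/p>
&lt;h3 id="豁免紀律">Exemption discipline&lt;/h3>
&lt;p>Every exemption needs a written reason (the text after the second colon in a &lt;code>checkov:skip&lt;/code> comment, or after &lt;code>--&lt;/code> in the tfsec example). An exemption without a reason is a silent skip: a reviewer cannot tell whether it was deliberate or added offhand just to get CI green. Run an exemption inventory on a regular schedule (quarterly):&lt;/p>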





&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="ln">1&lt;/span>&lt;span class="cl">&lt;span class="c1"># List every checkov exemption&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">2&lt;/span>&lt;span class="cl">grep -rn &lt;span class="s2">&amp;#34;checkov:skip&amp;#34;&lt;/span> --include&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;*.tf&amp;#34;&lt;/span> .
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">3&lt;/span>&lt;span class="cl">
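&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">4&lt;/span>&lt;span class="cl">&lt;span class="c1"># List every tfsec exemption&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="ln">5&lt;/span>&lt;span class="cl">grep -rn &lt;span class="s2">&amp;#34;tfsec:ignore&amp;#34;&lt;/span> --include&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;*.tf&amp;#34;&lt;/span> .&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>For each hit, ask: does the original reason for skipping still hold? Has the legacy migration finished? Has a temporary exception quietly become permanent?&lt;/p></description><content:encoded><![CDATA[<p>Run checkov and tfsec straight after installing them and they typically report anywhere from dozens to hundreds of findings. Fixing all of them is unrealistic; ignoring all of them throws the value away. This article covers how to get from "the tools are installed" to "the tools' output is trustworthy and actionable": rule selection, severity filtering, exemption management, custom rules, CI integration, and a process for handling false positives.</p>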
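<h2 id="規則選擇策略">Rule selection strategy</h2>
<p>Both tools ship built-in rule sets numbering in the hundreds, covering everything from encryption settings to naming conventions. Run them all and the findings list grows so long that nobody reads it. The selection criterion is "when this rule fires, will the team actually fix it?" A rule whose answer is no only produces noise.</p>
<h3 id="分層啟用">Tiered rollout</h3>
<p>Split the rules into three tiers and enable them step by step instead of turning everything on at once:</p>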
<table>
  <thead>
      <tr>
          <th>Tier</th>
          <th>Rule type</th>
          <th>Examples</th>
          <th>When to enable</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Foundation</td>
          <td>Data exposure and runaway permissions</td>
          <td>S3 public access, SG 0.0.0.0/0, IAM wildcard</td>
          <td>day 1</td>
      </tr>
      <tr>
          <td>Operations</td>
          <td>Encryption and backup</td>
          <td>RDS encryption, EBS encryption, backup retention</td>
          <td>IaC coverage &gt;50%</td>
      </tr>
      <tr>
          <td>Conventions</td>
          <td>Naming, tagging, logging</td>
          <td>missing tags, missing log group, resource naming</td>
          <td>once governance has matured</td>
      </tr>
  </tbody>
</table>
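<p>The foundation tier stays on even if every other rule is turned off: rules for publicly exposed S3 buckets (<code>CKV_AWS_19</code>, <code>CKV_AWS_53</code>) and wide-open security groups (<code>CKV_AWS_24</code>, <code>CKV_AWS_25</code>) point at a real problem whenever they fire. Check IDs are easy to mix up, so confirm what each one covers with <code>checkov --list</code> before pinning it. Enable the operations tier once IaC coverage is high enough; before that, the scan hits large numbers of resources that are not yet under IaC management. Hold the conventions tier until the team can absorb the volume of findings.</p>
<h3 id="checkov-的規則過濾">Rule filtering in checkov</h3>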





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Run only the foundation-tier rules</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">checkov -d . --check CKV_AWS_19,CKV_AWS_53,CKV_AWS_24,CKV_AWS_25,CKV_AWS_40,CKV_AWS_145
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># Or filter by framework (scan Terraform only)</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">checkov -d . --framework terraform --compact --quiet</span></span></code></pre></div><p>checkov supports <code>--check</code> (an allowlist: run only these) and <code>--skip-check</code> (a denylist: skip these). Early on, the <code>--check</code> allowlist is easier to control: the rules to run are named explicitly instead of being subtracted from the full set. Extend the allowlist as the team's capacity to absorb findings grows.</p>
<h3 id="tfsec-的嚴重度過濾">Severity filtering in tfsec</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Report only CRITICAL and HIGH</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">tfsec . --minimum-severity HIGH
</span></span><span class="line"><span class="ln">3</span><span class="cl">
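</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># Exclude specific rules</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">tfsec . --exclude aws-s3-specify-public-access-block</span></span></code></pre></div><p>tfsec grades findings as CRITICAL / HIGH / MEDIUM / LOW. Start with <code>--minimum-severity HIGH</code> to filter out the lower severities and cut the noise. Lower the threshold once the findings at HIGH and above are down to zero.</p>
<h2 id="豁免管理">Exemption management</h2>
<p>Not every finding is a defect: a public-facing ALB that opens port 443 to <code>0.0.0.0/0</code> is design intent, not a vulnerability. The point of an exemption is to make the exception explicit, justified, and reviewable.</p>
<h3 id="行內豁免">Inline exemptions</h3>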





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_security_group_rule&#34; &#34;alb_https&#34;</span> {
</span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="n">  type</span>        <span class="o">=</span> <span class="s2">&#34;ingress&#34;</span>
</span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="n">  from_port</span>   <span class="o">=</span> <span class="m">443</span>
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="n">  to_port</span>     <span class="o">=</span> <span class="m">443</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl"><span class="n">  protocol</span>    <span class="o">=</span> <span class="s2">&#34;tcp&#34;</span>
</span></span><span class="line"><span class="ln">6</span><span class="cl"><span class="n">  cidr_blocks</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;0.0.0.0/0&#34;</span><span class="p">]</span><span class="c1">
</span></span></span><span class="line"><span class="ln">7</span><span class="cl"><span class="c1">  #checkov:skip=CKV_AWS_24:ALB HTTPS ingress has to be open to the internet
</span></span></span><span class="line"><span class="ln">8</span><span class="cl"><span class="c1"></span>}</span></span></code></pre></div><p>The inline exemption in tfsec:</p>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-hcl" data-lang="hcl"><span class="line"><span class="ln">1</span><span class="cl"><span class="k">resource</span> <span class="s2">&#34;aws_security_group_rule&#34; &#34;alb_https&#34;</span> {<span class="c1">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="c1">  #tfsec:ignore:aws-ec2-no-public-ingress-sgr -- ALB HTTPS listener requires public access
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="c1"></span><span class="n">  cidr_blocks</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;0.0.0.0/0&#34;</span><span class="p">]</span>
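</span></span><span class="line"><span class="ln">4</span><span class="cl">}</span></span></code></pre></div><p>The upside of inline exemptions is that the reason sits next to the code and is visible at a glance during review. The downside is that they are scattered across files, so taking inventory means grepping.</p>
<h3 id="集中式豁免">Centralized exemptions</h3>
<p>checkov supports managing exemptions centrally in <code>.checkov.yaml</code>:</p>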





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln">1</span><span class="cl"><span class="c"># .checkov.yaml</span><span class="w">
</span></span></span><span class="line"><span class="ln">2</span><span class="cl"><span class="w"></span><span class="nt">skip-check</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">3</span><span class="cl"><span class="w">  </span>- <span class="l">CKV_AWS_24 </span><span class="w"> </span><span class="c"># ALB public-facing SG rules</span><span class="w">
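</span></span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="w">  </span>- <span class="l">CKV_AWS_19 </span><span class="w"> </span><span class="c"># Legacy S3 buckets pending migration</span></span></span></code></pre></div><p>The upside of the centralized file is that every exemption is visible in one place, which suits global exceptions (for example, "this batch of legacy S3 buckets has not been migrated yet, so skip the public access check for now"). The downside is that the reason lives far from the code, and three months later nobody remembers why the check was skipped.</p>
<h3 id="豁免紀律">Exemption discipline</h3>
<p>Every exemption needs a written reason (the text after the second colon in a <code>checkov:skip</code> comment, or after <code>--</code> in the tfsec example). An exemption without a reason is a silent skip: a reviewer cannot tell whether it was deliberate or added offhand just to get CI green. Run an exemption inventory on a regular schedule (quarterly):</p>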





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># List every checkov exemption</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">grep -rn <span class="s2">&#34;checkov:skip&#34;</span> --include<span class="o">=</span><span class="s2">&#34;*.tf&#34;</span> .
</span></span><span class="line"><span class="ln">3</span><span class="cl">
</span></span><span class="line"><span class="ln">4</span><span class="cl"><span class="c1"># List every tfsec exemption</span>
</span></span><span class="line"><span class="ln">5</span><span class="cl">grep -rn <span class="s2">&#34;tfsec:ignore&#34;</span> --include<span class="o">=</span><span class="s2">&#34;*.tf&#34;</span> .</span></span></code></pre></div><p>For each hit, ask: does the original reason for skipping still hold? Has the legacy migration finished? Has a temporary exception quietly become permanent?</p>
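<p>The two <code>grep</code> commands list exemptions but do not show which ones lack a reason. A small sketch that does both, assuming the comment formats shown earlier (<code>checkov:skip=ID:reason</code> and <code>tfsec:ignore:rule -- reason</code>); the regular expressions and the <code>find_exemptions</code> helper are illustrative and may need adjusting to a team's own conventions:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import re
import tempfile
from pathlib import Path

# Group 1 captures the rule ID, group 2 the optional reason text.
CHECKOV_SKIP = re.compile(r"checkov:skip=([A-Za-z0-9_]+)(?::(.*))?")
TFSEC_IGNORE = re.compile(r"tfsec:ignore:([A-Za-z0-9_-]+)(?:\s+--\s*(.*))?")


def find_exemptions(root):
    """Yield (file name, line number, tool, rule, reason) for each inline exemption."""
    for path in sorted(Path(root).rglob("*.tf")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, 1):
            for tool, pattern in (("checkov", CHECKOV_SKIP), ("tfsec", TFSEC_IGNORE)):
                match = pattern.search(line)
                if match:
                    reason = (match.group(2) or "").strip()
                    yield (path.name, number, tool, match.group(1), reason)


# Demo against a throwaway directory holding one hypothetical Terraform file.
with tempfile.TemporaryDirectory() as root:
    Path(root, "alb.tf").write_text(
        "#tfsec:ignore:aws-ec2-no-public-ingress-sgr -- ALB HTTPS listener requires public access\n"
        "#checkov:skip=CKV_AWS_24\n",
        encoding="utf-8",
    )
    exemptions = list(find_exemptions(root))

for file, number, tool, rule, reason in exemptions:
    print(f"{file}:{number} {tool} {rule} {reason or 'MISSING REASON'}")</code></pre></div>
<p>Exemptions kept in <code>.checkov.yaml</code> are not covered here; that file is short enough to review by hand during the same quarterly pass.</p>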
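<h2 id="自訂規則">Custom rules</h2>
<p>The built-in rules cover general security practice, but project-specific conventions (such as "every RDS instance must carry a <code>cost-center</code> tag" or "every S3 bucket name must start with the company prefix") need custom rules.</p>
<h3 id="checkov-自訂規則python">Custom checkov rules (Python)</h3>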





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c1"># custom_checks/require_cost_center_tag.py</span>
</span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="kn">from</span> <span class="nn">checkov.terraform.checks.resource.base_resource_check</span> <span class="kn">import</span> <span class="n">BaseResourceCheck</span>
</span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="kn">from</span> <span class="nn">checkov.common.models.enums</span> <span class="kn">import</span> <span class="n">CheckResult</span><span class="p">,</span> <span class="n">CheckCategories</span>
</span></span><span class="line"><span class="ln"> 4</span><span class="cl">
</span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="k">class</span> <span class="nc">CostCenterTagRequired</span><span class="p">(</span><span class="n">BaseResourceCheck</span><span class="p">):</span>
</span></span><span class="line"><span class="ln"> 6</span><span class="cl">    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
</span></span><span class="line"><span class="ln"> 7</span><span class="cl">        <span class="n">name</span> <span class="o">=</span> <span class="s2">&#34;Ensure cost-center tag is present&#34;</span>
</span></span><span class="line"><span class="ln"> 8</span><span class="cl">        <span class="nb">id</span> <span class="o">=</span> <span class="s2">&#34;CUSTOM_001&#34;</span>
</span></span><span class="line"><span class="ln"> 9</span><span class="cl">        <span class="n">supported_resources</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;aws_instance&#34;</span><span class="p">,</span> <span class="s2">&#34;aws_db_instance&#34;</span><span class="p">,</span> <span class="s2">&#34;aws_s3_bucket&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">10</span><span class="cl">        <span class="n">categories</span> <span class="o">=</span> <span class="p">[</span><span class="n">CheckCategories</span><span class="o">.</span><span class="n">GENERAL_SECURITY</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">11</span><span class="cl">        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="n">name</span><span class="p">,</span> <span class="nb">id</span><span class="o">=</span><span class="nb">id</span><span class="p">,</span> <span class="n">categories</span><span class="o">=</span><span class="n">categories</span><span class="p">,</span>
</span></span><span class="line"><span class="ln">12</span><span class="cl">                         <span class="n">supported_resources</span><span class="o">=</span><span class="n">supported_resources</span><span class="p">)</span>
</span></span><span class="line"><span class="ln">13</span><span class="cl">
</span></span><span class="line"><span class="ln">14</span><span class="cl">    <span class="k">def</span> <span class="nf">scan_resource_conf</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">conf</span><span class="p">):</span>
</span></span><span class="line"><span class="ln">15</span><span class="cl">        <span class="n">tags</span> <span class="o">=</span> <span class="n">conf</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;tags&#34;</span><span class="p">,</span> <span class="p">[{}])[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="ln">16</span><span class="cl">        <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">tags</span><span class="p">,</span> <span class="nb">dict</span><span class="p">)</span> <span class="ow">and</span> <span class="s2">&#34;cost-center&#34;</span> <span class="ow">in</span> <span class="n">tags</span><span class="p">:</span>
</span></span><span class="line"><span class="ln">17</span><span class="cl">            <span class="k">return</span> <span class="n">CheckResult</span><span class="o">.</span><span class="n">PASSED</span>
</span></span><span class="line"><span class="ln">18</span><span class="cl">        <span class="k">return</span> <span class="n">CheckResult</span><span class="o">.</span><span class="n">FAILED</span>
</span></span><span class="line"><span class="ln">19</span><span class="cl">
</span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="n">check</span> <span class="o">=</span> <span class="n">CostCenterTagRequired</span><span class="p">()</span></span></span></code></pre></div>
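<p>checkov hands <code>scan_resource_conf</code> a <code>conf</code> dict in which each attribute value is wrapped in a list, which is why the check reads <code>[0]</code>. The sketch below isolates that lookup in a plain function so it can be exercised without installing checkov; <code>has_cost_center</code> and the sample dicts are illustrative, and the unresolved-variable shape is an assumption about how such a reference may arrive:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def has_cost_center(conf):
    """Mirror the tag lookup of the check above on a checkov-style conf dict."""
    tags = conf.get("tags", [{}])[0]
    return isinstance(tags, dict) and "cost-center" in tags


# Sample confs with every attribute value wrapped in a list.
tagged = {"instance_type": ["t3.micro"], "tags": [{"cost-center": "platform"}]}
untagged = {"instance_type": ["t3.micro"]}
unresolved = {"tags": ["${var.common_tags}"]}  # assumed shape of an unresolved reference

for name, conf in (("tagged", tagged), ("untagged", untagged), ("unresolved", unresolved)):
    print(name, "PASSED" if has_cost_center(conf) else "FAILED")</code></pre></div>
<p>The unresolved case fails here, which matches the check above; whether it should instead be reported as <code>CheckResult.UNKNOWN</code> is a policy decision for the team.</p>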




<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="ln">1</span><span class="cl"><span class="c1"># Run with the custom rules loaded</span>
</span></span><span class="line"><span class="ln">2</span><span class="cl">checkov -d . --external-checks-dir ./custom_checks</span></span></code></pre></div><h3 id="tfsec-自訂規則yaml">Custom tfsec rules (YAML)</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="c"># .tfsec/custom_tfchecks.yaml (tfsec loads only *_tfchecks.yaml / *_tfchecks.json)</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w"></span><span class="nt">checks</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w"></span>- <span class="nt">code</span><span class="p">:</span><span class="w"> </span><span class="l">CUSTOM_001</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">  </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l">S3 bucket name must start with company prefix</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">  </span><span class="nt">impact</span><span class="p">:</span><span class="w"> </span><span class="l">Non-standard naming breaks cross-account policies</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">  </span><span class="nt">resolution</span><span class="p">:</span><span class="w"> </span><span class="l">Add company prefix to bucket name</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">  </span><span class="nt">requiredTypes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">    </span>- <span class="l">resource</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">  </span><span class="nt">requiredLabels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">    </span>- <span class="l">aws_s3_bucket</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">  </span><span class="nt">severity</span><span class="p">:</span><span class="w"> </span><span class="l">MEDIUM</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">  </span><span class="nt">matchSpec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">    </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">bucket</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">    </span><span class="nt">action</span><span class="p">:</span><span class="w"> </span><span class="l">startsWith</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">    </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="l">acme-</span></span></span></code></pre></div><p>Keep the set of custom rules small: every rule carries a maintenance cost. Only conventions whose violation causes problems later in the pipeline are worth automating; pure style preferences are better left to a comment during review.</p>
<h2 id="ci-整合">CI integration</h2>
<p>The goal of wiring scans into CI is to stop problems before the PR merges, rather than discovering them after apply.</p>
<h3 id="github-actions-範例">GitHub Actions example</h3>





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="ln"> 1</span><span class="cl"><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 2</span><span class="cl"><span class="w">  </span><span class="nt">security-scan</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 3</span><span class="cl"><span class="w">    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l">ubuntu-latest</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 4</span><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 5</span><span class="cl"><span class="w">      </span>- <span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">actions/checkout@v4</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 6</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln"> 7</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Run checkov</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 8</span><span class="cl"><span class="w">        </span><span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">bridgecrewio/checkov-action@v12</span><span class="w">
</span></span></span><span class="line"><span class="ln"> 9</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">10</span><span class="cl"><span class="w">          </span><span class="nt">directory</span><span class="p">:</span><span class="w"> </span><span class="l">.</span><span class="w">
</span></span></span><span class="line"><span class="ln">11</span><span class="cl"><span class="w">          </span><span class="nt">check</span><span class="p">:</span><span class="w"> </span><span class="l">CKV_AWS_19,CKV_AWS_53,CKV_AWS_24,CKV_AWS_25</span><span class="w">
</span></span></span><span class="line"><span class="ln">12</span><span class="cl"><span class="w">          </span><span class="nt">quiet</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="ln">13</span><span class="cl"><span class="w">          </span><span class="nt">compact</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="ln">14</span><span class="cl"><span class="w">          </span><span class="nt">soft_fail</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span></span></span><span class="line"><span class="ln">15</span><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="ln">16</span><span class="cl"><span class="w">      </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">Run tfsec</span><span class="w">
</span></span></span><span class="line"><span class="ln">17</span><span class="cl"><span class="w">        </span><span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l">aquasecurity/tfsec-action@v1</span><span class="w">
</span></span></span><span class="line"><span class="ln">18</span><span class="cl"><span class="w">        </span><span class="nt">with</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="ln">19</span><span class="cl"><span class="w">          </span><span class="nt">additional_args</span><span class="p">:</span><span class="w"> </span><span class="l">--minimum-severity HIGH</span><span class="w">
</span></span></span><span class="line"><span class="ln">20</span><span class="cl"><span class="w">          </span><span class="nt">soft_fail</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span></span></span></code></pre></div><p><code>soft_fail: false</code> makes CI fail and blocks the merge whenever a scan finds a violation. Early on, you can set <code>soft_fail: true</code> (findings are reported but do not block) so the team can watch the hit volume, then switch to enforcing once the rule set looks reasonable.</p>
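<h3 id="掃描結果回貼-pr">Posting scan results back to the PR</h3>
<p>Both checkov and tfsec have GitHub Actions integrations that can post results back to the PR as a comment, so reviewers see the findings on the PR page instead of digging through CI logs. Exact behavior depends on the action and its version, so confirm against each README: for tfsec, PR comments come from the separate <code>aquasecurity/tfsec-pr-commenter-action</code>, which requires a <code>github_token</code>; for checkov-action, check whether your version comments on its own or whether a follow-up step has to post its output.</p>
<h3 id="漸進式導入">Gradual rollout</h3>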





<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="ln">1</span><span class="cl">Week 1-2: soft_fail=true; observe hit volume and false positive rate
</span></span><span class="line"><span class="ln">2</span><span class="cl">Week 3: fix every real issue; suppress every justified false positive
</span></span><span class="line"><span class="ln">3</span><span class="cl">Week 4: switch to soft_fail=false; scanning becomes a hard gate</span></span></code></pre></div><p>This cadence lets the team clear the existing backlog before scanning becomes mandatory, which avoids the situation where turning on hard fail blocks every PR at once.</p>
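<p>To make the Week 1-2 observation step concrete, here is a minimal Python sketch that counts findings per check ID from a checkov JSON report. It assumes the report (as produced by <code>checkov -d . -o json</code>) is either a single object or a list of objects, each keeping its findings under <code>results.failed_checks</code> with a <code>check_id</code> field. The function names and the sample data are hypothetical, for illustration only.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import json
from collections import Counter


def failed_checks(report):
    """Yield failed-check records from a parsed checkov JSON report.

    Assumption: the report is either one object or a list of objects,
    each holding its findings under results.failed_checks.
    """
    reports = report if isinstance(report, list) else [report]
    for item in reports:
        for check in item.get("results", {}).get("failed_checks", []):
            yield check


def hits_per_check(report):
    """Count failed findings per check ID, most frequent first."""
    counts = Counter(check["check_id"] for check in failed_checks(report))
    return counts.most_common()


# Hypothetical sample standing in for the output of: checkov -d . -o json
sample = json.loads("""
{"check_type": "terraform",
 "results": {"failed_checks": [
   {"check_id": "CKV_AWS_24", "resource": "aws_security_group.alb"},
   {"check_id": "CKV_AWS_19", "resource": "aws_s3_bucket.logs"},
   {"check_id": "CKV_AWS_24", "resource": "aws_security_group.bastion"}
 ]}}
""")

for check_id, count in hits_per_check(sample):
    print(check_id, count)
</code></pre></div>
<p>A check that dominates the count is usually either a real systemic problem or a rule that does not fit the project, which is the distinction Week 3 has to settle before the gate becomes mandatory.</p>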
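<h2 id="false-positive-處理">Handling false positives</h2>
<p>There are three ways to handle a false positive; choose based on how often it recurs:</p>
<table>
  <thead>
      <tr>
          <th>Path</th>
          <th>When to use</th>
          <th>How</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Inline suppression</td>
          <td>A justified exception for a single resource</td>
          <td>Add a <code>checkov:skip</code> comment with a reason on that resource</td>
      </tr>
      <tr>
          <td>Global skip</td>
          <td>The rule does not apply to this project at all</td>
          <td>Add it to <code>skip-check</code> in <code>.checkov.yaml</code></td>
      </tr>
      <tr>
          <td>Custom rule override</td>
          <td>The built-in rule's criteria do not fit</td>
          <td>Write a custom rule to replace the built-in one</td>
      </tr>
  </tbody>
</table>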
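<p>The most common false positives are the public-facing security group on an ALB (opening 443 is by design) and relaxed settings in development environments (allowed in dev, not in prod). For the latter, checkov's <code>--var-file</code> lets the scan evaluate each environment's actual variable values, and a separate config file per environment (passed with <code>--config-file</code>) lets dev run a looser rule set while prod runs the strict one.</p>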
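<p>When handling false positives, resist the shortcut of adding a skip just to get CI green. For every skip, ask: is this design intent (the ALB is meant to be open) or technical debt (dev is temporarily relaxed)? The former gets a permanent suppression with a reason; the latter gets a temporary suppression with a TODO and a target fix date.</p>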
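<p>The permanent-versus-temporary distinction can be checked mechanically. The sketch below is one possible audit, not a checkov feature: it scans Terraform text for <code>checkov:skip=CHECK_ID:reason</code> comments and labels each one as missing a reason, temporary (the reason mentions TODO), or permanent. The TODO convention, the function name, and the <code>CKV_EXAMPLE_*</code> IDs are assumptions made up for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import re

# Assumed comment shape: checkov:skip=CHECK_ID:reason (checkov accepts a skip without a reason)
SKIP_PATTERN = re.compile(r"checkov:skip=([A-Za-z0-9_]+)(?::(.*))?")


def audit_skips(text):
    """Classify each checkov:skip comment in a Terraform file's text.

    Returns (line_number, check_id, verdict) tuples, where verdict is
    "missing reason", "temporary" (reason mentions TODO), or "permanent".
    """
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = SKIP_PATTERN.search(line)
        if match is None:
            continue
        check_id = match.group(1)
        reason = (match.group(2) or "").strip()
        if not reason:
            verdict = "missing reason"
        elif "TODO" in reason:
            verdict = "temporary"
        else:
            verdict = "permanent"
        findings.append((number, check_id, verdict))
    return findings


# Hypothetical Terraform snippet with placeholder check IDs, used only to exercise the audit
sample_tf = """
resource "aws_security_group" "alb" {
  # checkov:skip=CKV_EXAMPLE_1: public ALB, 443 is open by design
  name = "alb"
}
resource "aws_security_group" "dev_db" {
  # checkov:skip=CKV_EXAMPLE_2: TODO(2026-09) tighten once the dev VPN is in place
  # checkov:skip=CKV_EXAMPLE_3
  name = "dev-db"
}
"""

for finding in audit_skips(sample_tf):
    print(finding)
</code></pre></div>
<p>Run in CI, a check like this could fail the build on any "missing reason" verdict and list the "temporary" ones, so that technical-debt suppressions stay visible instead of quietly becoming permanent.</p>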
<h2 id="跨分類引用">Cross-references</h2>
<ul>
<li>→ <a href="/blog/infra/07-infra-as-pr/plan-review-apply-guardrails/" data-link-title="infra 走 PR 流程與自動化護欄" data-link-desc="infra 變更走 PR → plan → review diff → 合併 → apply，配 fmt / validate / tflint / checkov / tfsec 與 Atlantis 自動化，讓基礎設施可審查、可回溯、可交接">Infra through the PR workflow with automated guardrails</a>: where scanning sits in the PR workflow and how it relates to plan/apply</li>
<li>→ <a href="/blog/infra/07-infra-as-pr/terraform-ci-pipeline-setup/" data-link-title="Terraform CI Pipeline 設定指南" data-link-desc="用 GitHub Actions 建立完整的 Terraform CI pipeline：fmt → validate → tflint → plan → PR comment → apply，含 OIDC credential 與環境保護規則">Terraform CI Pipeline setup</a>: how the scan steps fit into a complete CI workflow</li>
<li>→ <a href="/blog/infra/03-network-foundation/security-group-audit-cleanup/" data-link-title="Security Group 稽核與清理" data-link-desc="盤點所有 security group 規則、找出 0.0.0.0/0 全開與未使用的 SG、依賴檢查後安全刪除、自動化治理">Module 3: Security Group audit and cleanup</a>: what to do after a scan flags 0.0.0.0/0</li>
</ul>
]]></content:encoded></item></channel></rss>