Applications

2026-05-14 Case Study: customer support agent, from task decomposition to eval. Ties the Module 4 principles into one end-to-end case: observe → decompose → design workflow → instrument trace → design eval → iterate. Each stage notes which chapter it draws on.
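2026-05-11 4.1 RAG principles: the retrieval + augmentation pattern. Why models need external knowledge, semantic vs. lexical similarity, the fundamental trade-offs of chunking, and the root causes of retrieval failure.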
2026-05-14 4.2 RAG retrieval enhancement: query rewriting / HyDE / multi-step / context packing. Query-side enhancement (rewriting / expansion / HyDE), multi-step iterative retrieval, post-retrieval context packing (dedup / ordering / summarization), and adaptive retrieval: the next layer of tools when vanilla RAG is not enough.
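2026-05-11 4.3 Tool use principles: how LLMs interact with the outside world. Structured output as the bridge that brings LLMs into engineered systems, the trade-offs of function calling, and why tool use collapses on small local models.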
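2026-05-11 4.4 Agent architecture principles. The structure of the agent loop, failure modes, which tasks suit agents and which do not, and the collaboration model with human review.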
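2026-05-14 4.5 Human-AI collaboration topology: when humans step in, and how. Centaur vs. Cyborg working modes, the jagged frontier, the three HITL trigger points (pre-act / mid-stream / post-hoc), and designing confirmation flows that avoid rubber-stamping.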
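2026-05-11 4.6 Application-layer protocols: function calling / structured output / MCP. Three concepts that are often conflated: a model capability, a sampling constraint, and a server protocol. Covers how the three differ in layer and how they combine.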
2026-05-11 4.7 Workflow orchestration patterns. Pipeline / router / parallel / reflection: four basic patterns for composing multiple LLM calls, and the conditions under which they degenerate.
2026-05-14 4.8 Multi-agent topology: flat / hierarchical / agent-as-tool. Judging when to move from a multi-call workflow to a multi-agent system, flat vs. hierarchical topology, agent-as-tool from the MCP perspective, and the trade-off between specialization and orchestration overhead.
2026-05-12 4.9 Resource estimation principles for production deployment. From local single-user to production multi-tenant: the design trade-offs across concurrent users, cost model, observability, SLA, and capacity planning.
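2026-05-12 4.10 Principles for managing derived artifacts: what goes into git and what does not. The three artifact classes of an LLM application (source / derived / external) mapped to git / build cache / registry, and the reproducibility / cost / sharing trade-offs in production deployment.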
2026-05-12 4.11 Long context engineering. How to use 128K / 1M context models: claimed vs. effective context, lost-in-the-middle, context design strategies, and the long context vs. RAG trade-off.
2026-05-12 4.12 Inside embedding models: training, selection, in-domain fine-tuning. How embedding models are trained (contrastive learning + hard negative mining), how to choose one (MTEB / size / domain), and when to fine-tune your own.
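2026-05-14 4.13 An eval design coordinate system: three axes, eight octants, when to test what. The three axes of eval design (objective↔subjective / component↔end-to-end / quantitative↔qualitative), the eval tools that match each of the eight octants, signals that an axis was chosen wrongly, and the relationship to benchmarking / LLM-as-judge / tracing.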
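2026-05-12 4.14 Benchmarking and evaluation methodology. Reading model card benchmark numbers, building an in-house benchmark for your own workflow, and a complete methodology for measuring local inference speed.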
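2026-05-12 4.15 Vision in the coding workflow: how a local VLM plugs into writing code. VLM use cases in coding workflows, local VLM selection, division of labor with cloud VLMs, and the current state of Continue.dev / Ollama integration.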
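2026-05-12 4.16 Static / serverless RAG deployment: architecture choices and security trade-offs. How to do RAG without a backend: four deployment options, the API key exposure problem, CORS / abuse / third-party trust, and the routing to Module 6.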
2026-05-12 4.17 Coding agent harness: scaffold / context engineering / subagent. The internal design of a coding agent: scaffold vs. harness layering, the 25% context budget rule, subagent topology, and the mapping to Claude Code / Cursor / Aider.
2026-05-12 4.18 Prompt caching in engineering practice: the biggest lever on cost / latency. How the prompt cache works, cache_control design, cache patterns for coding agents and long-context use, anti-patterns, and cache-miss signals.
2026-05-12 4.19 Layered agent memory architecture. How an agent manages long-term state beyond the context window: five layers (working / short-term / long-term episodic / semantic / procedural), when to write, retrieval design, and failure modes.
2026-05-12 4.20 LLM tracing and observability. OpenTelemetry GenAI semantic conventions, structured span design, cost / latency monitoring, the failure debugging flow, and the connection to LLM-as-judge eval.
2026-05-12 4.21 LLM-as-Judge evaluation methods. Production eval methods in which an LLM evaluates an LLM: rubric design, pairwise / direct scoring, mitigating the three major biases, closing the loop with traces, and calibration.
2026-07-01 4.22 RAG storage engineering: selection guidance from pickle to vector database. Choosing a RAG storage backend: at what scale to upgrade from in-memory to a vector DB, and how the dependency chain narrows the options.
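2026-05-14 4.0 The prompt technique spectrum: classification, trade-offs, composition patterns. Classification and trade-offs of prompt techniques such as zero-shot / few-shot, chain-of-thought, role / template, and reflection; when to stack them and when not to; and the boundaries with fine-tuning / RAG / chaining.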