CtxFST CH25 - The Graph Layer Is Swappable: Why It Doesn't Matter Whether Phase 1-4 Runs on CtxFST or LightRAG, and Why Phase 5-7 Is What Counts

CtxFST CH25: The Graph Layer Is Swappable; What Matters Is the Runtime and the Planner

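If you have read CH24, a natural question may come up:

Entity graphs, graph expansion, relation-aware retrieval... can't LightRAG do all of that too?

The answer is: yes, it can.

But what really matters behind that question is not "which graph layer is better". It is this:

Graph construction and retrieval are only the first half. What lets an agent actually reason, plan, and write state back is the second half.

This chapter spells that out.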

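First, cut the architecture in half

The whole CtxFST + OpenClaw upgrade plan, Phase 1 through Phase 7, splits cleanly down the middle into two halves:

First half: Phase 1-4 (Graph Construction + Retrieval)

Phase 1: Parser - parses the structure of .ctxfst.md
Phase 2: Index - builds the storage schema for entities / chunks / edges
Phase 3: Entity-aware retrieval - matches entities from the query and pulls the related chunks
Phase 4: Graph expansion - follows relation edges to expand one-hop neighbors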
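
To make Phase 4 concrete, here is a minimal sketch of one-hop expansion. The `Edge` shape and the `expandOneHop` name are hypothetical, and following edges in both directions is an assumption; the article does not say how the real implementation treats direction.

```typescript
// Hypothetical edge shape; the real storage schema is not shown in the article.
interface Edge {
  from: string;
  to: string;
  type: string;
}

// Returns entities exactly one hop away from the matched entities.
// Assumption: edges are followed in both directions.
function expandOneHop(matched: string[], edges: Edge[]): string[] {
  const seeds = new Set<string>(matched);
  const expanded: string[] = [];
  for (const edge of edges) {
    const candidates: string[] = [];
    if (seeds.has(edge.from)) candidates.push(edge.to);
    if (seeds.has(edge.to)) candidates.push(edge.from);
    for (const id of candidates) {
      // Skip the seeds themselves and anything already collected.
      if (!seeds.has(id) && !expanded.includes(id)) expanded.push(id);
    }
  }
  return expanded;
}

const sampleEdges: Edge[] = [
  { from: "analyze-resume", to: "resume-parsing", type: "REQUIRES" },
  { from: "resume-parsing", to: "pdf-extract", type: "REQUIRES" },
];
console.log(expandOneHop(["analyze-resume"], sampleEdges));
```

Because only the original matches act as seeds, `pdf-extract` (two hops away) is not pulled in.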

Second half: Phase 5-7 (Prompt Assembly + Runtime + Planning)

Phase 5: Prompt adapter - assembles structured prompt sections by priority
Phase 6: Runtime state - world state, precondition check, postcondition writeback
Phase 7: Planner / routing - goal-aware routing, skill chain search, explainable next action

The interface between the two halves is a data structure called ContextPack:

ContextPack {
  query: string
  matched_entities: EntityMatch[]
  entity_chunks: ChunkHit[]
  vector_chunks: ChunkHit[]
  keyword_chunks: ChunkHit[]
  expanded_entities: ExpandedEntity[]
  graph_chunks: ChunkHit[]
  fused_chunks: FusedChunkHit[]
}

As long as the first half can produce a ContextPack, the second half does not care what sits underneath.
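
One way to picture this seam is a small interface that any graph layer implements. The sketch below is illustrative only: `GraphLayer`, `ContextPackLite`, and `exactMatchLayer` are hypothetical names, and the pack is trimmed to three fields because the article does not define `EntityMatch` or `ChunkHit`.

```typescript
// Simplified stand-ins for the real types (assumptions, not the actual schema).
interface ChunkHit {
  chunkId: string;
  score: number;
}

interface ContextPackLite {
  query: string;
  matched_entities: string[];
  fused_chunks: ChunkHit[];
}

// Hypothetical seam: anything that answers a query with a pack.
interface GraphLayer {
  retrieve(query: string): ContextPackLite;
}

// Toy exact-match layer standing in for the CtxFST side.
function exactMatchLayer(entityChunks: Record<string, string[]>): GraphLayer {
  return {
    retrieve(query: string): ContextPackLite {
      const tokens = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
      const matched = Object.keys(entityChunks).filter((id) => tokens.includes(id));
      const fused: ChunkHit[] = [];
      for (const id of matched) {
        for (const chunkId of entityChunks[id]) fused.push({ chunkId, score: 1 });
      }
      return { query, matched_entities: matched, fused_chunks: fused };
    },
  };
}

// Downstream code depends only on the interface, never on the implementation.
function summarize(layer: GraphLayer, query: string): string {
  const pack = layer.retrieve(query);
  return `${pack.matched_entities.length} entities, ${pack.fused_chunks.length} chunks`;
}

const demoLayer = exactMatchLayer({ docker: ["c1", "c2"], python: ["c3"] });
console.log(summarize(demoLayer, "deploy with Docker"));
```

A LightRAG-backed adapter would implement the same `retrieve` signature, which is what lets `summarize` (and, by analogy, Phase 5-7) stay unchanged.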

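First half: CtxFST vs LightRAG

Today the first half (Phase 1-4) builds the graph from hand-written .ctxfst.md files using CtxFST. LightRAG could replace that entire half. The difference is in the trade-offs:

CtxFST (the current approach)

Entities and relations are hand-written by the author in .ctxfst.md
The graph lives in in-memory SQLite, built at query time and thrown away afterwards
Lookup is exact token match; no embeddings needed
No LLM ingestion, no vector DB, no external services
Fully offline and fully deterministic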

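LightRAG (the alternative)

Entities and relations are extracted automatically by an LLM (the LLM is called at ingestion time)
The graph lives in an external DB (Neo4j / nano-vectordb)
Queries use vector similarity + graph traversal
Zero manual annotation: drop documents in and it works
Requires an embedding model + LLM + vector DB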
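Comparison table

	CtxFST	LightRAG
Needs LLM ingestion	No	Yes (entity extraction costs tokens)
Needs embeddings	No	Yes
Needs external services	No (SQLite only)	Yes (vector DB)
Entity quality	Hand-written, highly deterministic	Depends on LLM extraction quality
Works offline	Yes	No
Result determinism	Deterministic (exact token match)	Non-deterministic (vector similarity)
Maintenance cost	High (hand-written)	Low (automatic)
Scales	Limited (hand-writing does not scale)	Yes

Conclusion: if the memory is small, clearly structured, and needs offline determinism, use CtxFST. If the memory is large and you want automation, use LightRAG.

Both can produce a ContextPack, so the second half does not need to change.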

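The second half is the real dividing line

Whether you use CtxFST or LightRAG for the graph layer, you have to build Phase 5-7 yourself. And Phase 5-7 is what takes the agent from "can look things up" to "can understand the world".

What exists today (Phase 5: Prompt Adapter)

Phase 5 is done. It turns a ContextPack into structured prompt sections: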

## Missing Preconditions        (priority 100)
## Active States                (priority 90)
## Relevant Entities            (priority 80)
## Supporting Chunks            (priority 70/50/30)
## Related Entities (Graph)     (priority 60)
## Suggested Next Actions       (priority 40)

Sections are trimmed against a token budget so the prompt stays within the context window.
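
A minimal sketch of priority-ordered assembly with budget trimming might look like the following. The `Section` shape, the drop-whole-section policy, and the 4-characters-per-token estimate are all assumptions made for illustration; the actual Phase 5 adapter may trim differently.

```typescript
interface Section {
  title: string;
  priority: number;
  body: string;
}

// Crude token estimate (assumption): roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Renders sections from highest to lowest priority and skips any section
// that would push the total past the budget.
function assemblePrompt(sections: Section[], budget: number): string {
  const ordered = sections.slice().sort((a, b) => b.priority - a.priority);
  const kept: string[] = [];
  let used = 0;
  for (const section of ordered) {
    const rendered = `## ${section.title}\n${section.body}`;
    const cost = estimateTokens(rendered);
    if (used + cost > budget) continue;
    kept.push(rendered);
    used += cost;
  }
  return kept.join("\n\n");
}

const prompt = assemblePrompt(
  [
    { title: "Suggested Next Actions", priority: 40, body: "x".repeat(400) },
    { title: "Missing Preconditions", priority: 100, body: "state:resume-uploaded" },
    { title: "Active States", priority: 90, body: "none" },
  ],
  20,
);
console.log(prompt);
```

With a budget of 20 estimated tokens, the two high-priority sections fit and the oversized low-priority section is dropped.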

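But this is still a read-only, static query. The system can assemble a good prompt, but it cannot:

Track which conditions have already been satisfied
Record which skills have already run
Decide the next step from the current state
Write execution results back as new graph edges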

Not built yet (Phase 6: Runtime State)

Phase 6 adds a session-scoped state machine:

WorldState {
  goal: string
  active_states: string[]      // conditions currently satisfied
  completed_skills: string[]   // skills already executed
  blocked_by: string[]         // what is blocking progress
}

Plus three core operations:

checkPreconditions(skill) - before execution, check that the entity's preconditions are all in active_states
applyPostconditions(skill) - after a successful execution, write the postconditions into active_states
Runtime event writeback - write COMPLETED / BLOCKED_BY back as graph edges
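
Since Phase 6 is not built yet, the following is only a sketch of how the three operations could fit together. The `Skill` and `RuntimeEdge` shapes, the `runSkill` helper, the `"session"` source node, and the choice to store blocked skill ids in `blocked_by` are all assumptions.

```typescript
interface WorldState {
  goal: string;
  active_states: string[];
  completed_skills: string[];
  blocked_by: string[];
}

interface Skill {
  id: string;
  preconditions: string[];
  postconditions: string[];
}

// Runtime events expressed as graph edges (hypothetical shape).
interface RuntimeEdge {
  from: string;
  type: "COMPLETED" | "BLOCKED_BY";
  to: string;
}

// Returns the preconditions that are not yet in active_states.
function checkPreconditions(skill: Skill, state: WorldState): string[] {
  return skill.preconditions.filter((p) => !state.active_states.includes(p));
}

// Returns a new state with the skill's postconditions activated.
function applyPostconditions(skill: Skill, state: WorldState): WorldState {
  const active = state.active_states.slice();
  for (const p of skill.postconditions) {
    if (!active.includes(p)) active.push(p);
  }
  return {
    ...state,
    active_states: active,
    completed_skills: state.completed_skills.concat(skill.id),
    blocked_by: state.blocked_by.filter((id) => id !== skill.id),
  };
}

// Pre-check, then either record the block or (assuming the skill itself
// succeeded) apply postconditions. Events are returned for writeback.
function runSkill(skill: Skill, state: WorldState): { state: WorldState; events: RuntimeEdge[] } {
  const missing = checkPreconditions(skill, state);
  if (missing.length > 0) {
    const blocked = state.blocked_by.includes(skill.id)
      ? state.blocked_by
      : state.blocked_by.concat(skill.id);
    const events = missing.map((m): RuntimeEdge => ({ from: skill.id, type: "BLOCKED_BY", to: m }));
    return { state: { ...state, blocked_by: blocked }, events };
  }
  const done: RuntimeEdge = { from: "session", type: "COMPLETED", to: skill.id };
  return { state: applyPostconditions(skill, state), events: [done] };
}
```

Returning a new state object instead of mutating in place keeps each step easy to log and replay, which fits the writeback idea, though a real implementation could just as well mutate a store.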

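Only once this step is done do the preconditions / postconditions in .ctxfst.md go from "decorative metadata" to "state fields the runtime can act on".

Not built yet (Phase 7: Planner / Routing)

Phase 7 adds a goal-aware decision engine:

Goal-aware routing - find the shortest feasible path (skill chain) through the graph for a given goal
Relation-aware weighting - REQUIRES / LEADS_TO drive the ranking, so SIMILAR does not pull it off course
Down-weighting completed skills - do not recommend what has already been done
Blocked-aware routing - recommend steps that clear a blockage first
Explainable next action - explain why this step is recommended

Only once this step is done does the agent truly have multi-step planning.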
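
The article leaves the planning algorithm open. As one possible starting point, the sketch below uses plain breadth-first search over sets of active states to find a shortest skill chain. `planSkillChain` and the `Skill` shape are hypothetical, completed skills are excluded outright rather than down-weighted, and relation weighting is not modeled.

```typescript
interface Skill {
  id: string;
  preconditions: string[];
  postconditions: string[];
}

// Breadth-first search: each node is a set of active states plus the chain
// of skills that produced it. Returns the shortest chain, or null.
function planSkillChain(
  goal: string,
  active: string[],
  skills: Skill[],
  completed: string[] = [],
): string[] | null {
  const start = Array.from(new Set(active)).sort();
  const queue: { states: string[]; chain: string[] }[] = [{ states: start, chain: [] }];
  const visited = new Set<string>([start.join("|")]);
  for (let i = 0; i < queue.length; i++) {
    const node = queue[i];
    if (node.states.includes(goal)) return node.chain;
    for (const skill of skills) {
      // Simplification: skip skills already done or already in this chain.
      if (completed.includes(skill.id) || node.chain.includes(skill.id)) continue;
      if (!skill.preconditions.every((p) => node.states.includes(p))) continue;
      const next = Array.from(new Set(node.states.concat(skill.postconditions))).sort();
      const key = next.join("|");
      if (visited.has(key)) continue;
      visited.add(key);
      queue.push({ states: next, chain: node.chain.concat(skill.id) });
    }
  }
  return null;
}

const demoSkills: Skill[] = [
  { id: "upload-resume", preconditions: [], postconditions: ["state:resume-uploaded"] },
  { id: "analyze-resume", preconditions: ["state:resume-uploaded"], postconditions: ["state:resume-parsed"] },
  { id: "write-report", preconditions: ["state:resume-parsed"], postconditions: ["state:analysis-complete"] },
];
console.log(planSkillChain("state:analysis-complete", [], demoSkills));
```

The returned chain doubles as a rudimentary explanation: each step is there because its postconditions unlock the next step's preconditions.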

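How the five capabilities map to phases

Laid out in full:

Capability	Phase	Status
Understand the world through an entity graph	Phase 2-4	Done (CtxFST, or swap in LightRAG)
Track current conditions with world state	Phase 6	Not started
Decide the next step with graph-aware routing	Phase 7	Not started
Search skill chains with multi-step planning	Phase 7	Not started
Write execution results back as graph edges and runtime state	Phase 6	Not started

The first is the read path (can read, can look things up). The other four are the write path + decision path (can track, can plan, can write back).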

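LightRAG does not do Phase 6-7 either

This is an important point to internalize:

LightRAG's graph is read-only too.

LightRAG can build a graph automatically and do graph-aware retrieval, but it will not:

Automatically update graph state after the agent finishes a skill
Check whether preconditions are satisfied
Plan what to do next
Track world state changes within a session

So whichever you pick for the graph layer, CtxFST or LightRAG, the Phase 6-7 runtime state + planner has to be built separately.

The only difference is whether the graph is constructed by hand or by LLM extraction.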

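Three things to watch when using LightRAG

If you decide to replace Phase 1-4 with LightRAG, three areas need special handling:

1. Operational Relation Semantics

The relation types LightRAG extracts automatically are usually semantic (related_to, part_of, mentioned_in), not operational (REQUIRES, LEADS_TO, BLOCKED_BY).

The Phase 7 planner needs the latter. So you will probably need an operational annotation layer on top of LightRAG extraction, adding operational semantics by hand or with an LLM.

2. Uncertain entity quality

LightRAG relies on an LLM to extract entities, so the result is not a closed enum. The same concept may be extracted under different names, and important entities may be missed.

The Phase 6-7 runtime state and planner need reliable entity identity. If entity names are inconsistent, checkPreconditions() and applyPostconditions() break down.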
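
One possible mitigation, sketched here under assumptions, is to resolve every extracted name to a canonical id through an alias index before it reaches the runtime. `EntityDef`, `buildAliasIndex`, `resolveEntity`, and the normalization rule (trim, lowercase, spaces and underscores to hyphens) are hypothetical; the article only says that CtxFST entities carry `id` and `aliases`.

```typescript
interface EntityDef {
  id: string;
  aliases: string[];
}

// Assumed normalization: trim, lowercase, collapse spaces/underscores to "-".
function normalizeName(name: string): string {
  return name.trim().toLowerCase().replace(/[\s_]+/g, "-");
}

// Maps every normalized id and alias to its canonical entity id.
function buildAliasIndex(entities: EntityDef[]): Record<string, string> {
  const index: Record<string, string> = {};
  for (const entity of entities) {
    index[normalizeName(entity.id)] = entity.id;
    for (const alias of entity.aliases) index[normalizeName(alias)] = entity.id;
  }
  return index;
}

// Returns the canonical id, or null when the name is unknown.
function resolveEntity(name: string, index: Record<string, string>): string | null {
  const key = normalizeName(name);
  return Object.prototype.hasOwnProperty.call(index, key) ? index[key] : null;
}

const aliasIndex = buildAliasIndex([
  { id: "analyze-resume", aliases: ["Resume Analysis", "CV analysis"] },
]);
console.log(resolveEntity("  resume analysis ", aliasIndex));
```

Names that resolve to `null` can be queued for manual review instead of silently creating a second identity for the same concept.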

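3. External dependencies

LightRAG needs an embedding model + an LLM for ingestion + a vector DB. CtxFST today is purely local SQLite and fully offline. If your use case has to run offline, this is a hard constraint.

With LightRAG, how do you add World State?

LightRAG does not provide world state itself; you have to add a layer of your own.

What a LightRAG graph looks like

[Entity: Analyze Resume] --related_to--> [Entity: Resume Parsing]
[Entity: Resume Parsing] --related_to--> [Entity: PDF Extract]

Purely semantic relations, no state.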

What it needs to look like once world state is added

[Entity: Analyze Resume]
  preconditions: [state:resume-uploaded]
  postconditions: [state:resume-parsed]
  status: blocked (missing state:resume-uploaded)

[State: resume-uploaded] = inactive
[State: resume-parsed]   = inactive

Session: {
  goal: state:analysis-complete
  active_states: []
  completed_skills: []
  blocked_by: [entity:analyze-resume]
}

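Three approaches

Approach	How	Cost
A. Add properties to the LightRAG graph	Add preconditions / postconditions properties to entity nodes and create separate state nodes	Requires changing LightRAG's schema, or forking it
B. LightRAG + an external state layer	LightRAG handles graph retrieval; a separate SQLite/Redis store handles world state	Two stores, but each stays independent and neither intrudes on the other
C. Skip LightRAG's graph store and use only its extraction	LLM extraction → write into your own graph + state store	Maximum freedom, but you end up using only LightRAG's ingestion pipeline

B is the most pragmatic

Reasons:

LightRAG's graph store was not designed for mutable runtime state
World state is read and written far more often than the graph is constructed (every tool execution needs a check + writeback)
LightRAG has no concept of session isolation

So world state is not added "on top of" LightRAG; it is added "beside" it. LightRAG answers "which entities and relations exist in this world", and the world state store answers "where this session currently stands".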
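
To illustrate the "beside, not on top" idea, here is a small in-memory stand-in for the state store. `WorldStateStore` and its methods are hypothetical; a real version would sit on SQLite or Redis, but the point is the same: every read and write is keyed by session id, and the graph layer is never touched.

```typescript
interface SessionState {
  active_states: string[];
  completed_skills: string[];
  blocked_by: string[];
}

// In-memory stand-in for the SQLite/Redis store (assumption).
class WorldStateStore {
  private sessions: Map<string, SessionState> = new Map();

  // Returns the state for a session, creating an empty one on first access.
  get(sessionId: string): SessionState {
    let state = this.sessions.get(sessionId);
    if (state === undefined) {
      state = { active_states: [], completed_skills: [], blocked_by: [] };
      this.sessions.set(sessionId, state);
    }
    return state;
  }

  // Marks a state as active for one session only.
  activate(sessionId: string, stateId: string): void {
    const state = this.get(sessionId);
    if (!state.active_states.includes(stateId)) state.active_states.push(stateId);
  }
}

const store = new WorldStateStore();
store.activate("session-a", "state:resume-uploaded");
console.log(store.get("session-a").active_states.length, store.get("session-b").active_states.length);
```

Session isolation falls out of the keying: activating a state in `session-a` leaves `session-b` empty.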

Architecture diagram

┌──────────────────────┐  ┌──────────────────────┐
│  LightRAG            │  │  World State Store    │
│  (graph + retrieval) │  │  (SQLite / Redis)     │
│                      │  │                       │
│  entities            │  │  active_states[]      │
│  relations           │  │  completed_skills[]   │
│  chunks              │  │  blocked_by[]         │
│  vector index        │  │  runtime_events[]     │
│                      │  │  session_id scoped    │
└──────┬───────────────┘  └──────┬────────────────┘
       │                         │
       └────────┬────────────────┘
                ▼
        ContextPack + WorldState
                │
                ▼
        Phase 5: Prompt Adapter
                │
                ▼
        Phase 7: Planner / Routing

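Bidirectional converters: making the graph layer truly swappable

We said earlier that "the graph layer is swappable", but if the data formats do not interoperate, swapping is actually expensive. What really makes CtxFST and LightRAG interchangeable is a pair of conversion tools, one for each direction:

`lightrag-to-ctxfst`: LightRAG → `.ctxfst.md`

Converts the entities / relations that LightRAG extracts automatically into .ctxfst.md format.

Input: LightRAG graph store (entities + relations + chunks)

Output: .ctxfst.md files

Conversion logic: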

LightRAG entity        → CtxFST entities[].id / type / aliases
LightRAG relation      → CtxFST relations[] (needs type mapping, see below)
LightRAG chunk         → CtxFST chunks[] (keeps the chunk-entity mapping)
(missing fields)       → preconditions / postconditions left empty for manual completion

Relation type mapping:

Extracted by LightRAG	Mapped to CtxFST
`related_to`	`SIMILAR` (default, weakest)
`part_of` / `contains`	`REQUIRES` (the sub-concept depends on the parent concept)
`leads_to` / `causes` / `enables`	`LEADS_TO`
`depends_on` / `requires`	`REQUIRES`
Anything else	`SIMILAR` (fallback)
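
The table translates almost directly into a lookup with a fallback. In this sketch the `mapRelationType` name and the trim-and-lowercase normalization are assumptions; the mapping entries themselves come from the table above.

```typescript
type CtxfstRelation = "SIMILAR" | "REQUIRES" | "LEADS_TO";

// Entries taken from the mapping table; anything not listed falls back to SIMILAR.
const RELATION_MAP: Record<string, CtxfstRelation> = {
  related_to: "SIMILAR",
  part_of: "REQUIRES",
  contains: "REQUIRES",
  leads_to: "LEADS_TO",
  causes: "LEADS_TO",
  enables: "LEADS_TO",
  depends_on: "REQUIRES",
  requires: "REQUIRES",
};

// Assumed normalization: trim and lowercase before lookup.
function mapRelationType(raw: string): CtxfstRelation {
  const key = raw.trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(RELATION_MAP, key) ? RELATION_MAP[key] : "SIMILAR";
}

console.log(mapRelationType("part_of"), mapRelationType("mentioned_in"));
```

Keeping the fallback at the weakest relation means an unrecognized type can never create a false `REQUIRES` or `LEADS_TO` edge for the planner to follow.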

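Uses:

Run LightRAG extraction over a large batch of documents to produce a first draft of .ctxfst.md
Manually review it and fill in preconditions / postconditions / operational semantics
From there, use CtxFST's offline, deterministic retrieval pipeline

In effect, the LLM writes the draft and you edit it.

`ctxfst-to-lightrag`: `.ctxfst.md` → LightRAG

Imports hand-written .ctxfst.md files into LightRAG's graph store.

Input: .ctxfst.md files (one or more)

Output: entities / relations / chunks in the LightRAG graph store

Conversion logic: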

CtxFST entities[].id   → LightRAG entity node
CtxFST entities[].aliases → LightRAG entity aliases / embeddings
CtxFST relations[]     → LightRAG relation edges
CtxFST chunks[]        → LightRAG chunks + vector embeddings
CtxFST preconditions   → LightRAG entity property (custom field)
CtxFST postconditions  → LightRAG entity property (custom field)

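Uses:

Import hand-written, high-quality entity definitions into LightRAG and get fuzzy matching from vector retrieval
Mix manually curated operational semantics into LightRAG's graph
Run A/B comparisons: for the same set of entities, which works better, CtxFST exact match or LightRAG vector retrieval

Why bidirectional conversion matters so much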

                    lightrag-to-ctxfst
    LightRAG  ──────────────────────────►  .ctxfst.md
    (auto)    ◄──────────────────────────  (curated)
                    ctxfst-to-lightrag

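With bidirectional converters, you can:

Automatic first, manual second - LightRAG extraction → convert to .ctxfst.md → add operational semantics by hand → feed the CtxFST pipeline
Manual first, automatic second - hand-write .ctxfst.md → import into LightRAG → use vector retrieval for fuzzy matching
Mix the two - hand-write core entities (high determinism) and auto-extract peripheral entities (low maintenance cost)
Switch at any time - if LightRAG stops being maintained one day, all the data is in .ctxfst.md; if hand-writing stops scaling, import into LightRAG

The data is not locked into either side. That is what "swappable graph layer" really means.

What gets lost in conversion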

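Direction	What is lost	Why
LightRAG → CtxFST	vector embeddings	CtxFST does not use embeddings; it uses exact match
LightRAG → CtxFST	confidence scores from LLM extraction	`.ctxfst.md` has no confidence field
CtxFST → LightRAG	the semantics of preconditions / postconditions	LightRAG can only store them as custom properties and will not use them for runtime checks
CtxFST → LightRAG	operational relation semantics (the difference between `REQUIRES` and `SIMILAR`)	LightRAG's retrieval does not weight by relation type

What is lost does not affect core functionality: vector embeddings get recomputed on the LightRAG side, and operational semantics only matter on the CtxFST side. What each direction drops is exactly what the destination does not need.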

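The most pragmatic hybrid route

Putting it all together: if the goal is to get both LightRAG's automation and CtxFST's determinism, the most pragmatic approach is: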

┌─────────────────────────────────────┐
│  Phase 7: Planner / Routing         │  ← build yourself
│  goal-aware, relation-aware,        │
│  explainable next action            │
├─────────────────────────────────────┤
│  Phase 6: Runtime State             │  ← build yourself
│  world state, precheck,             │
│  writeback, session isolation       │
├─────────────────────────────────────┤
│  Graph Layer (swappable)            │  ← CtxFST or LightRAG
│  entity extraction,                 │     (with LightRAG, add the option B external state store)
│  graph store, graph retrieval       │
├─────────────────────────────────────┤
│  Phase 5: Prompt Adapter            │  ← build yourself (or rewrite)
│  priority sections, token budget    │
└─────────────────────────────────────┘

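Concretely:

Ingestion: use LightRAG to extract entities / relations from .md files automatically
Conversion: use lightrag-to-ctxfst to produce a .ctxfst.md draft, then add operational semantics by hand
Retrieval: keep the CtxFST pipeline (offline, deterministic, priority-aware prompt assembly)
Runtime: build world state + planner on top of ContextPack (option B: a SQLite/Redis state store alongside)
Feedback: if fuzzy retrieval is needed, use ctxfst-to-lightrag to import the curated entities back into LightRAG

The result: automated ingestion + manual curation + deterministic retrieval + a runtime that can reason + data that is never locked in.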

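The real role of `.ctxfst.md`

Back to the most fundamental question:

.ctxfst.md is only a readable context summary, not a world-model data structure you can reason over.

Up through Phase 5, that statement is entirely correct.

The preconditions, postconditions, and state_refs fields in .ctxfst.md are currently used only for prompt rendering (the "Missing Preconditions" section in Phase 5). Nothing actually checks or updates them at runtime.

Phase 6 is about turning those fields from decorative metadata into a state machine the runtime can operate on.

Phase 7 then lets that state machine drive routing and planning.

Only at that point does .ctxfst.md truly go from "structured notes" to "the source of truth for a semantic world model".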

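Wrapping up

To sum it up briefly:

The graph layer is swappable. Hand-written CtxFST or LightRAG auto-extraction both work, as long as they can produce a ContextPack. But what actually takes the agent from "can look things up" to "can reason" is the Phase 6 runtime state and the Phase 7 planner, and both layers have to be built yourself no matter which graph layer you use.

So rather than agonizing over CtxFST vs LightRAG, the questions more worth attention are:

How should the lightrag-to-ctxfst and ctxfst-to-lightrag converters be designed? Are the relation type mapping heuristics good enough?
How should Phase 6's WorldState and precondition / postcondition mechanism be designed?
Which algorithm should the Phase 7 planner use for skill chain search?
How should runtime writeback integrate with the graph store?

These are the problems that actually need solving next. And the bidirectional converters are the first step, because with them the data never gets locked in, whichever technical route comes later.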
