arXiv:2605.23951v1 Announce Type: new
Abstract: The companion paper introduced a four-level verification lattice on agent-skill manifests (unverified, declared, tested, formal) and left the top level aspirational. This
paper closes that gap. We give a precise semantics for skill behaviour faithful to how a skill is consumed by an LLM-driven runtime (a deterministic script-side reachable
through a non-deterministic LLM-side), state the verification problem as a capability-containment property over that semantics, and present three composable methods that
together raise a skill from declared or tested to formal: (1) sound static capability-containment analysis of the script-side via abstract interpretation over a small effect
lattice; (2) a refinement type system for tool-call envelopes that mechanically rejects any call whose statically-inferred capability is not in the manifest’s declared set;
(3) SMT-bounded model checking against the parent paper’s biconditional correctness criterion, with the bound chosen so any counter-example fitting the runtime’s
transaction-buffer horizon is exhibited as a concrete trace. We prove the three layers composed soundly cover the parent paper’s threat model modulo a single residual (the
LLM’s freedom to refuse to act) that the parent paper’s runtime biconditional catches at session boundary. The methods reuse existing well-engineered tools (Z3, Semgrep,
CodeQL, refinement-type checkers, mechanised proof assistants) rather than asking operators to build new ones, and the proof-carrying artifact extends the existing SKILL.md
convention. All three methods plus the bundle producer and re-checker ship as zero-dependency JavaScript modules in the open-source enclawed framework
(https://github.com/metereconsulting/enclawed; project page https://www.enclawed.com/), with 53 unit tests and an end-to-end CLI demo on a sample skill.
Inside Interoception: The hidden sense of how you feel inside
MIT Technology Review Explains: Let our writers untangle the complex, messy world of science and technology to help you understand what’s coming next. You can read more



