Context-First Thinking

Context.Engineering

Practical writing on AI agents, context engineering, and toolchains. Full articles live on the site. Short updates and discussion happen in Telegram.

100+ daily readers, 5k+ monthly readers

Latest Posts

May 22, 2026
Tasks Are Not Goals

A task tells an agent what to change. A goal should also carry intent, boundaries, stop conditions, and evidence.

#context-engineering #agents #workflow #verification
May 8, 2026
Signum Can Now Be Installed in Codex App as a Plugin

What changes when a contract-first agent workflow becomes an installable Codex App plugin, not just a Claude Code command.

#context-engineering #codex #agents #verification #signum
Apr 27, 2026
Stop Writing CLAUDE.md From Scratch

Every agentic codebase starts with an empty CLAUDE.md and fills it with hallucinations. Here's the six-file harness Signum scaffolds for you.

#context-engineering #claude-code #agents #specs
Apr 8, 2026
AI Agents Need Permission Boundaries, Not Personalities

Most agent runtimes add more roles. punk starts from a harder premise: trust comes from boundaries, durable state, and proof.

#contextengineering #agents #architecture #verification
Mar 23, 2026
My AI Agent Said 'Done.' It Skipped an Entire Acceptance Criterion.

The hardest bug was not in the code. It was in the trust model between the engineer agent and the orchestrator.

#context-engineering #claude-code #verification #trust-boundary
Mar 20, 2026
Your AI Agent Can't Tell Which Solution Is Current

AI agents rewrite code but leave the old version behind. jj's predecessor chains make ghost solutions detectable; git can't.

#context-engineering #jj #code-entropy #agents
Mar 20, 2026
Your AI Spec Is Already Stale

When agents read project.intent.md as ground truth, stale specs become execution bugs. Here's how I caught real drift in two projects.

#context-engineering #claude-code #agents #specs
Mar 17, 2026
Switching AI CLIs Without Losing 32 Skills: Why I Built nex

A Rust CLI that makes your AI agent skills portable across Claude Code, Codex, and Gemini. One command to install, one command to switch.

#aiagents #claudecode #rust #cli #opensource #devtools
Mar 17, 2026
What a Formal Verification Agent Taught Me About Code Audit

Studying Mistral's Leanstral -- an agent for Lean 4 theorem proving -- led to concrete improvements in Signum, a multi-model code audit pipeline.

#context-engineering #claude-code #signum #verification #agents
Mar 15, 2026
One Pass Isn't Enough: How Signum Learned to Fix Its Own Code

AI code verification as a loop, not a gate. Iterative audit, contract self-critique, and shared context across tasks in Signum v4.6.

#context-engineering #claude-code #verification #iterative-audit
Mar 11, 2026
Environment is context: security auditing for AI agent workstations

We carefully design prompts and tools but rarely audit the environment where the agent actually runs. Sentinel makes that measurable.

#contextengineering #claudecode #security #agents #devtools
Mar 11, 2026
Skillpulse: Your AI Skills Are Flying Blind Without Telemetry

A PostToolUse hook that logs every skill activation to local JSONL. No existing tool tracks whether the model actually follows a skill's instructions.

#context-engineering #claude-code #agents #telemetry #local-first
Mar 10, 2026
Research Agents Lie. The Fix Is Adversarial Verification.

Most AI research tools optimize for coherent synthesis, not factual accuracy. Delve adds a claim-level adversarial verification stage that changes the trust model entirely.

#context-engineering #claude-code #agents #deep-research #verification
Mar 10, 2026
From Plugin to Product: How Herald Became Sift and Why the Data Model Changed Everything

A local news plugin worked until it didn't. The fix was a different data model, language, and delivery surface.

#context-engineering #agents #golang #architecture #saas
Mar 6, 2026
Spec-Gated Delivery: Why PR Review Is the Wrong Trust Checkpoint for AI Code

AI made code cheap. It didn't make trust cheap. The fix isn't better reviewers; it's moving the gate from the PR diff to approved intent.

#context-engineering #verification #agents #software-delivery
Mar 5, 2026
AI Writes Code. Where Is the Proof?

Proofpack chains contract, implementation, and audit into a single verifiable record. Why proof artifacts are the missing primitive of AI code generation.

#context-engineering #claude-code #verification #proofpack #supply-chain
Mar 4, 2026
Herald v2: Local-First News Intelligence for AI Agents

How I built a 4-stage news pipeline that clusters articles into stories using title similarity, all in stdlib Python with SQLite.

#context-engineering #claude-code #agents #local-first #python
Mar 3, 2026
The Contract Is the Context: How Signum Makes AI Code Verification Principled

Why running AI-generated code through more AI reviewers doesn't solve the reliability problem — and what a contract-first pipeline changes about it.

#context-engineering #claude-code #agents #verification #contract-first
Mar 2, 2026
11 Plugins, One Marketplace: Building an AI Agent Toolkit from Scratch

How I built a plugin ecosystem for Claude Code — from scattered scripts to a full lifecycle with scaffolding, quality gates, multi-AI review, and one-command install.

#context-engineering #claude-code #plugins #agents #open-source
Feb 23, 2026
First Agent Skills Benchmark: What Works, What Doesn't, and Why Context Matters

Analyzing SkillsBench — the first systematic benchmark for Agent Skills. 7,308 trajectories, critical review, and why skills are context engineering for agents.

#context-engineering #agents #skills #benchmark
Feb 19, 2026
Git is not for agents

Why git breaks AI agents, and how jj solves each of those problems.

#context-engineering #ai-agents #tools
Feb 18, 2026
Agent-friendly web: context engineering at internet scale

How content format determines whether an AI agent can see your site. Research data, real standards, and what to do right now.

#context-engineering #agents #web #standards
Feb 18, 2026
Which model for which agent: metrics over intuition

Research on Claude model selection for multi-agent teams. Why Opus can be cheaper than Sonnet, and why Haiku is dangerous for agentic tasks.

#context-engineering #agents #models #claude-code
Jan 24, 2026
Gas Town: Multi-Agent Orchestrator Cheatsheet

Reference guide for Gas Town — a system for parallel management of 20-30 Claude Code agents. Commands, concepts, workflows.

#context-engineering #agents #tooling #claude-code
Jan 23, 2026
Evoidea: A Memetic Algorithm for Ideas

Applying evolutionary algorithms to startup idea generation with AI agents

#context-engineering #ai-agents #ideation
Jan 22, 2026
Loadout: Dependency Management for AI Skills

How to solve skill drift in AI agents. Manifest + lock + symlinks — a pattern from package managers applied to context management.

#context-engineering #agents #skills #tooling
Dec 27, 2025
Claude Code Architecture: Why a Simple Loop Beat Complex Graphs

Breaking down the Claude Code architecture, based on a talk by PromptLayer's founder. Why a while-loop, Bash, and context management matter more than complex workflows.

#context-engineering #claude-code #agents #architecture
Dec 25, 2025
Semantic Catalog: How Enterprise Teams Engineer Context

Why Text-to-SQL and direct REST API mapping fail, and how a semantic graph of business entities solves the context delivery problem in enterprise settings.

#context-engineering #mcp #enterprise #semantic-catalog
Dec 20, 2025
Context Engineering: First Steps

Introduction to context engineering — an engineering approach to working with LLMs. Why prompts stop working and what to do about it.

#context-engineering #llm #intro

Join @ctxtdev for short updates, in-between ideas, and discussion around new posts.

Support keeps the writing going. If you are building AI agents and need a second set of eyes, the consulting path is open too.
