ai-assisted.dev

References for AI-native software development https://ai-assisted.dev/ 2026-04-28T10:00:00.000Z

A field guide to writing your own eval harness https://ai-assisted.dev/p/a-001 2026-04-28T10:00:00.000Z

Why "vibes-based" testing collapses past 50 prompts, and the smallest harness that scales without becoming a second product.

Building effective agents https://www.anthropic.com/engineering/building-effective-agents 2026-04-26T10:00:00.000Z

The taxonomy I keep coming back to. Workflows vs. agents, with worked patterns.

How we're shipping with autonomous agents at scale https://medium.com/ 2026-04-25T10:00:00.000Z

A pragmatic field report. The section on guardrail budgets is worth the read alone.

Benchmarking 7 coding agents on a real refactor https://ai-assisted.dev/p/a-002 2026-04-22T10:00:00.000Z

Same 12k-line TypeScript codebase, same task: extract a domain layer. I ran every agent twice and graded the diffs.

SWE-bench Verified — leaderboard notes https://www.swebench.com/ 2026-04-20T10:00:00.000Z

Worth reading the methodology behind the Verified subset before quoting any number from the headline board.

The new contract between developers and AI https://thenewstack.io/ 2026-04-18T10:00:00.000Z

Skim the intro, read the middle. Their framing of "negotiated autonomy" is sticky.

Things I have learned about LLMs in 2025 https://simonwillison.net/ 2026-04-16T10:00:00.000Z

Annual roundup. Section on tool-calling reliability is gold.

The shape of a good MCP server https://ai-assisted.dev/p/a-003 2026-04-12T10:00:00.000Z

Most MCP servers I see are CRUD wrappers. Here is what changes when you design tools as if a model were a junior engineer with no memory.

Subagents and the verifier pattern https://docs.anthropic.com/ 2026-04-05T10:00:00.000Z

Forking a verifier you never await is the cheapest CI you will ever ship.

Latency budgets for agentic UIs https://ai-assisted.dev/p/a-004 2026-03-30T10:00:00.000Z

A 90th-percentile breakdown of where the seconds actually go in a tool-using chat. Spoiler: it is not the model.

Aider — repo-map heuristics, annotated https://aider.chat/ 2026-03-24T10:00:00.000Z

The PageRank-on-symbols trick is more interesting than the headline feature.

Designing for the model in the loop https://ai-assisted.dev/p/a-005 2026-03-19T10:00:00.000Z

When the user is not the only one reading your UI. A short manifesto on machine-legible interfaces.