There is so much going on in the realm of Ai Driven Development, that I have a need to write down these observations in the moment. Other people can observe what I find interesting. Human curated.
Maybe some of these insights will contribute towards a post. Maybe not.
Every day will be published as an rss entry.
Structure of a day in /moments:
{date}
moment X
~ ~ ~
moment Y
---
I can imagine this stream will grow quickly.
cmd/ctrl + F is your friend.
2026-03-15
When treating the LLM as a generic runtime, you need to battle thinking ambiguity.
~ ~ ~
This is quite the story: how a non biologist analyzed a cancer medicine for his dying dog. Article.
~ ~ ~
Gonna try this prompt (via X)
"What's the single smartest and most radically innovative and accretive and useful and compelling addition you could make to the plan at this point?"
2026-03-14
For 80% of the automation needs, the pattern can be: LLM + spec + datasource. Done.
~ ~ ~
chrome://inspect/#remote-debugging
2026-03-13
Anthropic Sonnet & Opus: 1M.
2026-03-12
Dawn of a new type of UI. New type of apps. X.
~ ~ ~
There is value and fun in pair spec-ing.
~ ~ ~
Steve Yegge having a conversation on The Pragmatic Engineer podcast. Need to listen to it in full.
2026-03-11
This article: How System Prompts Define Agent Behavior is worth the full read. Thanks to JG for the tip.
In the article they reference the cchistory tracker made by Mario.
2026-03-10
AMI by Yann LeCun. X
2026-03-09
Ghostty 1.3 is out. X
~ ~ ~
Agentic QA: Claude Code Review. X.
2026-03-08
LLM = Generic runtime
~ ~ ~
AOS = Agent Operating System
~ ~ ~
Copilot going the Pi route? Agent adaptable extensions incoming. X
2026-03-06
The radar chart by Anthropic as bar chart. The radar chart has a lot of traction. On X.
~ ~ ~
3 new entries in aishepherd.nl/tldr. The Anthropic report is of extra interest.
2026-03-05
I used OpenSpec to build a skill (guidance + logic). This was overkill.
~ ~ ~
Yes yes already, Google has a cli. Yay.
~ ~ ~
I don't think people are aware of how much time and effort is needed for personal optimal harness engineering. It's a lot.
2026-03-04
BullshitBench v2 by Peter Gostev. Anthropic models detect nonsense quite ok.
~ ~ ~
cmux interesting!
~ ~ ~
Together with JG we enjoy touring our harnesses, and while observing things, immediately experiment and perform harness hygiene. It's a very dynamic way of pairing.
~ ~ ~
Modern Go guidelines skill by Jetbrains GitHub
2026-03-03
Kepano (Obsidian CEO) has created a new project: defuddle.md.
curl defuddle.md/aishepherd.nl/moments/
It has CLI ❤️
~ ~ ~
tldr of Anthropic's Improving skill-creator: Test, measure, and refine Agent Skills. So me creating my own skills evals framework late last December was ok.
~ ~ ~
Anthropic on X:
We've seen unprecedented growth in Claude and Claude Code traffic this week that was genuinely hard to forecast.
I wonder if there is some kind of correlation with OpenAI getting the DoW contract...
~ ~ ~
GPT 5.4 incoming. X.
~ ~ ~
A thorough writeup by Robert Hafner: Beyond the Vibes: A Rigorous Guide to AI Coding Assistants and Agents. Link. Found through LinkedIn.
~ ~ ~
Roundup overview by Cat, on the impact Claude Code is making at different orgs. Via X.
~ ~ ~
Yesterday I needed to update certain aspects of my agentic harness. I’ll call this Harness Hygiene.
~ ~ ~
Claude Code is getting voice mode. Via X
2026-03-02
tldr of Theo's Software engineering is dead now. I've found the smaller teams pattern especially interesting.
2026-02-28
Anthropic says no to the DoW. A day later OpenAI says yes? What’s going on?
2026-02-27
Non Ai agent related directly, but Obsidian sync headless is in beta. Docs. Can become big. From Kepano, why you could want to use it:
- Automate remote backups
- Automate publishing a website
- Give agentic tools access to a vault without access to your full computer
- Sync a shared team vault to a server that feeds other tools
- Run scheduled automations e.g. aggregate daily notes into weekly summaries, auto-tag, etc
...all while having the speed, privacy, customizability, end-to-end encryption of Obsidian Sync
~ ~ ~
Debbie O’Brien first got let go by Microsoft late last year, and now by Block. She does an incredible amount of great visible developer relations work. She doesn’t deserve this.
~ ~ ~
Jack Dorsey made the decision to downsize the people part of Block by 40%. These are approx. 4000 people. Reason being:
we're already seeing that the intelligence tools we’re creating and using, paired with smaller and flatter teams, are enabling a new way of working which fundamentally changes what it means to build and run a company. and that's accelerating rapidly.
Wondering what role their Goose agent is playing here. Message on X
2026-02-26
Laura Tacho, in her talk Data vs hype: how orgs actually win with AI
There is no typical experience with AI
~ ~ ~
Codex 5.3 is the first model to make Mitchell move away from Opus 4.6. Via X.
~ ~ ~
Tailscale is jumping on the Ai bandwagon. Second initiative in a short time. Co-op with LM Studio. It’s called LM Link:
Connect to remote instances of LM Studio, load your models, and use them as if they were local.
Could be very interesting. Wondering what the difference is between this and running Ollama on your own server with Tailscale?
~ ~ ~
Martin Gratzer: From Ore to Iron: Build Your Own Coding Agent. The agent inception part is humorous. Again no mysteries here. Link via Gramps
2026-02-25
“Pi X Hugging Face.” Via X
2026-02-24
Yes yes already, Claude Code /remote-control.
~ ~ ~
OpenSpec has official Pi support in version 1.2.0
~ ~ ~
Readable article giving praise to the Pi. This also stood out:
The more I built, the more I wanted to build. That feedback loop is its own subject.
~ ~ ~
Important research: Agents of Chaos
More easily consumable results: link
2026-02-23
Exciting work by Simon Willison. He’s guiding us through Agentic Engineering Patterns. Post.
~ ~ ~
Distillation attacks, USA vs China. Anthropic announcement.
~ ~ ~
Need to read this research paper: Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?
…Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%
~ ~ ~
The robot is always invited by gramps on X
…If you need assistance, first, ask your clanker…
~ ~ ~
Gramps working with his clanker:
2026-02-21
OpenAI: Harness engineering: leveraging Codex in an agent-first world. Source. Via JG and Thorsten Ball.
~ ~ ~
Benedict Evans: How will OpenAI compete. Source.
~ ~ ~
markitdown by Microsoft:
Python tool for converting files and office documents to Markdown.
~ ~ ~
Ryan Singer’s shaping skills might be worth a look. Source.
~ ~ ~
ollama launch pi X. Getting closer to independence.
~ ~ ~
Claude Code prompt caching insights:
- Prompt auto-caching with Claude, X
- Lessons from Building Claude Code: Prompt Caching Is Everything, X
~ ~ ~
The speed at which Anthropic is releasing stuff is quite insane.
~ ~ ~
So is “research preview” the new beta?
~ ~ ~
Well done Anthropic on this security initiative: Making frontier cybersecurity capabilities available to defenders.
2026-02-19
Voicebox, Open Source alternative to ElevenLabs.
~ ~ ~
Research by Anthropic: Measuring AI agent autonomy in practice.
~ ~ ~
Chris Lattner wrote an article on Anthropic’s Claude C Compiler. Tldr:
“He argues it's a genuine milestone showing AI can now participate in large-scale systems engineering, not just write snippets. However, CCC reproduced established compiler design rather than inventing anything novel, revealing that AI excels at automating implementation of known patterns while true innovation remains a human skill.”
~ ~ ~
Yet again the Amp guys. Post in their chronicle: The Coding Agent is Dead. Actually found this time on LinkedIn.
~ ~ ~
Interesting take by JG: MCP to restrict agents
~ ~ ~
Atlassian has different OpenAPI's available. For instance:
- Jira
- Confluence
Really you don't need an MCP in order to agentically interact with Atlassian things. You didn't get this from me.
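To make the point concrete, here is a minimal sketch of talking to Jira Cloud over plain HTTP. It assumes the v3 REST endpoint for a single issue and basic auth with an email plus API token; the site name, issue key and credentials are placeholders. The function only builds the request, so nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import base64
import urllib.request


def jira_issue_request(site: str, issue_key: str, email: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one Jira Cloud issue."""
    # Basic auth: base64("email:api_token"). All values here are placeholders.
    credentials = base64.b64encode(f"{email}:{api_token}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{site}.atlassian.net/rest/api/3/issue/{issue_key}",
        headers={"Authorization": f"Basic {credentials}", "Accept": "application/json"},
        method="GET",
    )
```

An agent with shell or code access could run something like this directly, which is the reason an MCP server is optional here.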
~ ~ ~
Rodney by Simon Willison is a CLI tool to interact with a Chrome browser instance. The announcement article. It again contains Simon’s wisdom: the need for testing.
2026-02-18
Very important health talk between Steve Yegge and Scott Hanselman on YouTube:
AI is making developers dramatically more productive...so why is everyone so exhausted?
~ ~ ~
Nice interview with Boris Cherny by YC on YouTube.
~ ~ ~
Fireship: How AI is breaking the SaaS business model... Of course take with a grain of salt.
~ ~ ~
Tailscale Aperture:
…It uses tsnet so that it appears as a node on your Tailscale network (tailnet). Aperture implicitly knows the identity of whatever is connecting to it, so you can stuff all your API keys inside and tell everyone to point their coding agents at Aperture to get access to LLMs.
~ ~ ~
Jarred is on fire:
In the next version of Bun
bun build --compile --target=browser ./index.html
Bundles & inlines all CSS and JS into a single standalone .html file - no external <script> or <link> tags. One file, zero dependencies.
~ ~ ~
This Skills .md file is really good. This is a great base indeed.
2026-02-17
electrobun. I don’t make this stuff up.
~ ~ ~
It sucks, but it’s true. Why Peter chose the US over EU.
~ ~ ~
“Agents optimize for outcomes, not attention.” Via X
~ ~ ~
Need to verify the Uncle Bob stories from this X article. Specs are the way to go.
~ ~ ~
Anthropic PTC: Programmatic Tool Calling. Via X:
Before:
User prompt -> Claude -> uses tool -> Claude
After:
User prompt -> Claude -> writes code and logic -> that code uses a tool -> code logic can parse or format results, add conditional logic and use tool multiple times -> Claude
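A rough sketch of the "after" flow, to show why it saves context. Everything here is hypothetical: `fake_fetch_orders` stands in for a real tool, and `model_written_script` is the kind of code a model might generate. The point is that the raw tool results stay inside the script and only the compact summary would travel back to the model.

```python
from typing import Callable

# Hypothetical tool signature: takes a region, returns raw order rows.
Tool = Callable[[str], list[dict]]


def fake_fetch_orders(region: str) -> list[dict]:
    """Stand-in tool with canned data, for illustration only."""
    data = {
        "eu": [{"id": 1, "total": 120.0}, {"id": 2, "total": 15.0}],
        "us": [{"id": 3, "total": 300.0}],
    }
    return data.get(region, [])


def model_written_script(tool: Tool) -> dict:
    """Loop over regions, filter the rows, and return only an aggregate."""
    summary = {}
    for region in ("eu", "us"):
        rows = tool(region)                                # tool is used several times
        large = [r for r in rows if r["total"] >= 100.0]   # conditional logic in code
        summary[region] = sum(r["total"] for r in large)   # compact result
    return summary
```

In the "before" flow each raw `rows` list would land in the context window; here only the returned dict does.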
Also from Anthropic: Increase web search accuracy and efficiency with dynamic filtering:
Claude can now natively write and execute code during web searches to filter results before they reach the context window, improving its accuracy and token efficiency.
Yes, but what about using the markdown headers first. Thank you Pi for extensibility.
~ ~ ~
Need to read: Harness Engineering, via Martin Fowler.
~ ~ ~
Sonnet 4.6 is here. link
~ ~ ~
WebMCP might be very big.
2026-02-16
Via Mario Zechner, Nader Dabit on X: The Self-Healing PR
~ ~ ~
Large PR's are not beneficial to the agent's context window. Not even when they are delegated in parts to sub-agents. Just as repo's might need restructuring for better agent understanding, the same goes for PR's.
I've seen multiple sub-agents needing to use their memory garbage collection. This is sub-optimal for the end result.
~ ~ ~
Via Martin Fowler fragments, Margaret-Anne Storey: How Generative and Agentic AI Shift Concern from Technical Debt to Cognitive Debt, summary by Claude:
As AI and agentic tools accelerate software development, the bigger risk shifts from technical debt to cognitive debt — the erosion of developers' shared understanding of what their software does, how, and why. Velocity without understanding is unsustainable, and protecting the "shared theory" of a system matters more than any speed metric.
~ ~ ~
Context engineering pattern, compliance gate pattern:
- ask the model to do something
GATE CHECK - Before ..., verify:
- [ ] Verify you've done something
I know, but it works. Sometimes.
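A minimal sketch of how the gate could be templated, assuming you assemble prompts in code. The function name and parameters are made up for illustration; the layout simply mirrors the pattern above.

```python
def with_gate_check(task: str, before: str, checks: list[str]) -> str:
    """Append a GATE CHECK checklist to a task prompt (hypothetical helper)."""
    if not checks:
        return task  # nothing to verify, leave the prompt untouched
    lines = [task, "", f"GATE CHECK - Before {before}, verify:"]
    lines += [f"- [ ] {check}" for check in checks]
    return "\n".join(lines)
```

Calling `with_gate_check("Refactor the parser.", "committing", ["All tests pass"])` yields the task followed by a one-item checklist. Whether the model honours it is still a matter of luck, as noted.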
~ ~ ~
Cloudflare has an llms.txt for their docs.
2026-02-15
Jarred Sumner, asking for bun GH issues on X. He will pass them on to Claude.
~ ~ ~
David Crawshaw on X:
…The age of malleable software is close.
~ ~ ~
Google WebMCP
~ ~ ~
Peter Steinberger is joining OpenAI. @steipete post and Sam Altman on X.
~ ~ ~
Clanker: is a slur for robots and artificial intelligence (AI) software.
~ ~ ~
Pi optimised itself for Cloudflare markdown-for-agents
This blog post you’re reading takes 16,180 tokens in HTML and 3,150 tokens when converted to markdown. That’s a 80% reduction in token usage.
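The arithmetic behind that figure, as a quick check using only the two token counts from the quote:

```python
html_tokens = 16_180
markdown_tokens = 3_150

saved = html_tokens - markdown_tokens   # tokens no longer sent to the model
reduction = saved / html_tokens         # fraction of the original HTML cost

print(f"{saved} tokens saved, {reduction:.1%} reduction")
# prints "13030 tokens saved, 80.5% reduction"
```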
~ ~ ~
Again so subtle:
pi install git:github.com/jeroendee/{some-extension-collection} => install my own pi extensions
pi update => update them
Pi ❤️
~ ~ ~
Via Martin Fowler fragments, he attributes Laura Tacho:
The Venn Diagram of Developer Experience and Agent Experience is a circle
Yes this. Exactly this. This past Friday I had a nice meeting discussing an upcoming workshop. I shared my view that in this day and age of agentic dev, the devex characteristics of a technology are very important. Think LLM corpus, build times, test times (caching), cli based toolchain, optimised lsp, etc.
~ ~ ~
From Martin Fowler’s fragments:
…senior developers were very resistant to using LLMs, when those senior developers were involved in an exercise that forced them to do some hands-on work with LLMs, a third of them were instantly converted to being very pro-LLM. That suggests that practical experience is important to give senior folks credible information to judge the value, particularly since there’s been striking improvements to models in just the last couple of months. As was quipped, some negative opinions of LLM capabilities “are so January”.
~ ~ ~
Interesting X article: SaaS Isn't Dead. It's Worse Than That.
…Total cost of ownership is real.
~ ~ ~
PICNIM: Problem In Chair, Not In Model
2026-02-14
There is going to be a tsunami of software.
~ ~ ~
They removed the running of claude from claude. Why?
~ ~ ~
Dax on X:
v1.2.0 of opencode includes our migration to sqlite...
...
this paves the way to a lot of cool features!
~ ~ ~
Steve Yegge wrote another article: The AI Vampire. This is a must read. There is so much interesting stuff being dropped.
I’m convinced that 3 to 4 hours is going to be the sweet spot for the new workday.
~ ~ ~
Indeed, capturing and storing model output that led to a commit.
~ ~ ~
What a time. Bot vs OS maintainer on GitHub: link
~ ~ ~
Dax (OpenCode) on X:
a lot of our recent hires care deeply about code
~ ~ ~
Jarred Sumner on X: Bun.cron() in the next version.
~ ~ ~
The vibe coding trap by Max Musing on X:
…It confuses the cost of building software with the cost of owning software. These are very different things.
2026-02-13
2025 DORA AI Capabilities Model report link. Note, you do need to leave your info, before you can download it.
~ ~ ~
Scott Hanselman: Tiny Tool Town:
A place for stupid-delightful tools made with love.
Free, fun & open source. Made for an audience of one.
~ ~ ~
Tesla Full Self-Driving (Supervised) Ride-Along Experience in The Netherlands. Link.
~ ~ ~
Jarred Sumner on X:
LLM session transcripts provide far richer context about changes than a commit message. Put them with the code.
~ ~ ~
I want the “hashing lines” in Pi as well. So I asked Mario Zechner for this. His answer:
This can be easily done via extension. Ask your pi clanker, point it at oh-my-pi and say "do this hash thing via custom tools overriding built-in tools as an extension".
Duh, me and my old thinking.
~ ~ ~
Anthropic raises $30 billion in Series G funding at $380 billion post-money valuation. Source
~ ~ ~
diffs has quite some traction.
… open source diff and code rendering library.
2026-02-12
IBM Technology on their YouTube channel has a lot of useful videos. Like this one: What is Prompt Caching? Optimize LLM Latency with AI Transformers tldw:
Prompt caching stores the precomputed key-value (KV) pairs from a transformer's prefill phase so that repeated static prompt prefixes don't need to be reprocessed. Unlike output caching (which skips the LLM entirely for identical queries), prompt caching only caches the input processing work, letting the model jump straight to generating a response for the new, dynamic portion of the prompt.
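A toy model of the idea, not how an inference engine does it: real caches hold attention key-value tensors, while this sketch stores a plain list and counts how often the "expensive" step runs. The class and method names are invented for illustration.

```python
import hashlib


class ToyPromptCache:
    """Toy prompt cache: the 'prefill' of a static prefix is computed once and reused."""

    def __init__(self) -> None:
        self._store: dict[str, list[int]] = {}
        self.prefill_runs = 0  # how many times the expensive step actually ran

    def _prefill(self, text: str) -> list[int]:
        # Stand-in for computing key-value pairs; here just word lengths.
        self.prefill_runs += 1
        return [len(word) for word in text.split()]

    def process(self, static_prefix: str, dynamic_suffix: str) -> list[int]:
        key = hashlib.sha256(static_prefix.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._prefill(static_prefix)
        # Only the new, dynamic part is processed on every call.
        return self._store[key] + [len(word) for word in dynamic_suffix.split()]
```

Two calls with the same system prompt and different questions run the prefill once; changing the prefix forces a new prefill. That is also why stable prompt prefixes matter for cache hits.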
~ ~ ~
Sam Altman on X:
GPT-5.3-Codex-Spark is launching today as a research preview for Pro.
More than 1000 tokens per second!
~ ~ ~
I've added an /events section to the site. There is quite some interest in Ai Driven Development.
~ ~ ~
Can Bölük, from his article 👇:
The model is the moat. The harness is the bridge. Burning bridges just means fewer people bother to cross. Treating harnesses as solved, or even inconsequential, is very short-sighted.
~ ~ ~
So this is very cool: I Improved 15 LLMs at Coding in One Afternoon. Only the Harness Changed. Via Mario Zechner.
...What if, when the model reads a file, or greps for something, every line comes back tagged with a 2-3 character content hash:
1:a3|function hello() {
2:f1| return "world";
3:0e|}
When the model edits, it references those tags: “replace line 2:f1, replace range 1:a3 through 3:0e, insert after 3:0e.” If the file changed since the last read, the hashes (optimistically) won’t match and the edit is rejected before anything gets corrupted.
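A minimal sketch of the mechanism described in the quote. The hash function (first two hex characters of SHA-1) and the helper names are my assumptions, not the article's implementation; the sketch covers only the single-line replace, not ranges or inserts.

```python
import hashlib


def tag(line: str) -> str:
    """Return a 2-character content hash for one line (assumed scheme)."""
    return hashlib.sha1(line.encode("utf-8")).hexdigest()[:2]


def read_tagged(lines: list[str]) -> list[str]:
    """Render lines the way the model would see them: '<n>:<hash>|<text>'."""
    return [f"{i}:{tag(text)}|{text}" for i, text in enumerate(lines, start=1)]


def replace_line(lines: list[str], ref: str, new_text: str) -> list[str]:
    """Replace the line named by ref ('<n>:<hash>'), rejecting stale references."""
    number, expected = ref.split(":")
    index = int(number) - 1
    if not 0 <= index < len(lines):
        raise ValueError(f"line {number} does not exist")
    if tag(lines[index]) != expected:
        # The content under that line number is not what the model last read.
        raise ValueError(f"stale reference {ref}: file changed since last read")
    return lines[:index] + [new_text] + lines[index + 1:]
```

With only two hash characters a stale line can occasionally collide with the old tag, which is presumably what "optimistically" refers to in the quote.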
~ ~ ~
JG built something really nice. An agent dependency manager. For your convenience, I'll post the parsed repo in deepwiki. You can find it on GitHub: agentdeps
~ ~ ~
Agent review on pre-push hook. Add git delta to reduce agent's field of view. Might be costly.
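A sketch of what the hook could feed the agent, reading "git delta" as the diff of the commits about to be pushed. It assumes the branch has an upstream configured; the prompt wording and the character limit are arbitrary choices to keep cost bounded. How the prompt reaches your agent is left out on purpose.

```python
import subprocess


def changed_since_upstream() -> str:
    """Return the diff of what is about to be pushed (needs an upstream branch)."""
    result = subprocess.run(
        ["git", "diff", "@{upstream}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def build_review_prompt(diff: str, max_chars: int = 20_000) -> str:
    """Wrap the delta in a review prompt, truncating to keep cost bounded."""
    truncated = len(diff) > max_chars
    body = diff[:max_chars]
    note = "\n[diff truncated]" if truncated else ""
    return f"Review only the following changes:\n\n{body}{note}"
```

A pre-push hook would call `build_review_prompt(changed_since_upstream())` and pass the result to the agent of your choice.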
~ ~ ~
Thank you Ethan Mollick on X:
This viral essay is worth reading. I agree that AI is a very big deal & that most people don't know how good it has gotten, fast.
~ ~ ~
Lex Fridman had a conversation with Peter Steinberger. A conversation spanning 4 hours. Thankfully there are Lex clips on the YouTubes.
2026-02-11
exe.dev has an llms.txt.
~ ~ ~
Need to read A Visual Guide to LLM Agents by Maarten Grootendorst.
~ ~ ~
Agentic primitive: text.
~ ~ ~
What's going on with all the xAI people leaving...?
~ ~ ~
SSH client Echo for iOS and iPadOS, is a client powered by libghostty. Endorsed by Mitchell Hashimoto himself.
~ ~ ~
This MCP APPS might be an important architectural part of how a whole new generation of services will be built. Official repo of the spec.
~ ~ ~
Oh this might be interesting: Docker Sandboxes via Scott Hanselman.
~ ~ ~
During a hackathon last week, in which we had an advisory role, JG mentioned the profession of lamplighters:
Public street lighting was developed in the 16th century. During this time, lamplighters toured public streets at dusk, lighting outdoor fixtures by means of a wick on a long pole. At dawn, the lamplighter would return to put them out using a small hook on the same pole.
...
...gas lights steadily overtook candles and oil lamps as the dominant form of street lighting. Early gaslights required lamplighters, but by the late 19th century, systems were developed which allowed the lights to operate automatically. The advent of incandescent lighting diminished the necessity of hiring lamplighters.
Just an example of human profession => technical innovation => poof
~ ~ ~
Not directly Ai related, but I have ❤️ for Go. Go 1.26 got released. It contains the new Green Tea garbage collector.
~ ~ ~
LLM = knowledge work CPU
~ ~ ~
Signal: A friend of mine, who gets coding contract jobs through a supply portal, is seeing the cycle time diminishing. This directly affects competition, and thus revenue.
~ ~ ~
This post by Matt Shumer might be one of the most important ones yet on my /moments. I urge you to read his post. Some excerpts:
…Making AI great at coding was the strategy that unlocks everything else.
…
The people who are ahead in their industries (the ones actually experimenting seriously) are not dismissing this. They're blown away by what it can already do. And they're positioning themselves accordingly.
…
Start using AI seriously, not just as a search engine
~ ~ ~
The former CEO of GitHub, Thomas Dohmke, made a thing/new company: Entire. Their first new consumer product, a cli. It has a feature called Checkpoints:
…When you commit code generated by an agent, Checkpoints capture the full session alongside the commit: the transcript, prompts, files touched, token usage, tool calls and more.
~ ~ ~
So this might be useful: Obsidian CLI. Preview in version 1.12. Post by kepano on X:
...the CLI commands are complementary to existing terminal commands.
2026-02-10
Funny because in some sense, it's true.
~ ~ ~
Nice conversation between Scott Hanselman and Gergely Orosz. Gergely was also describing how he replaced a SaaS offering with his own.
~ ~ ~
The value of SaaS is shifting to hosting and data instead of the software. When anyone can replicate software, what used to be difficult, where does the value lie?
~ ~ ~
Using a man-in-the-middle proxy is essential to understanding how an agent communicates with the model. JG and I are using mitmproxy.
2026-02-09
The bun.dev docs have an llms.txt.
~ ~ ~
Normally I would prefer Go for programming. But I find Bun/TS very elegant for the needed behaviour in agent skills. It fits nicely in my skills distribution model.
~ ~ ~
In meta-engineering, being able to quickly experiment and iterate on the agent's memory is crucial for the outcome of the needed agent process.
~ ~ ~
Armin Ronacher has a new article: A Language For Agents
What a nice clear article by Matthias Kainer (via Martin Fowler's Fragments): So whats the next word, then?
Via Simon Willison: AI Doesn’t Reduce Work—It Intensifies It:
...I've had conversations with people recently who are losing sleep because they're finding building yet another feature with "just one more prompt" irresistible.
I can very much relate to this. I've contacted Alexander Klöpping, because I think this is an interesting subject for the Ai Report podcast.
JG being the good engineer that he is, pointed me towards this article by Simon Willison: The lethal trifecta for AI agents: private data, untrusted content, and external communication
Beautifully written resignation letter.
Dave Maasland on WNL: Doodzonde dat er geen minister van Digitale Zaken komt ("A crying shame that there will be no Minister of Digital Affairs"). If we want to cut the umbilical cord with the US big tech and prepare society for the impact of Ai, I reckon The Netherlands needs something like: Bundesministerium für Digitales und Staatsmodernisierung
Nice interview by YC with Peter Steinberger:
"You can just do things..."
2026-02-08
Debbie O’Brien is also leaving QA and embracing the Ai adventure.
I think Bun as runtime for skill behaviors in Pi, is going to be great.
I’ve just extended Pi from my phone while doing groceries.
Peter Steinberger describing his "AHA" moment.
Wolters Kluwer is down in the market. This is a direct response to Anthropic releasing Opus 4.6 and Cowork. Via Beursnerd
I liked this post very much by David Crawshaw (exe.dev founder): Eight more months of agents. He is describing all kinds of interesting experiences I can relate to based on my experiences:
...it is all about the model
...
In 2026, I don't use an IDE any more. ... I am back on Vi ... Vi is turning 50 this year.
...
...if you try some penny-saving cheap model like Sonnet, or a second rate local model, you do worse than waste your time, you learn the wrong lessons. I want local models to succeed more than anyone. I found LLMs entirely uninteresting until the day mixtral came out and I was able to get it kinda-sorta working locally on a very expensive machine. The moment I held one of these I finally appreciated it. And I know local models will win. At some point frontier models will face diminishing returns, local models will catch up, and we will be done being beholden to frontier models. That will be a wonderful day, but until then, you will not know what models will be capable of unless you use the best. Pay through the nose for Opus or GPT-7.9-xhigh-with-cheese. Don't worry, it's only for a few years.
...
...you have to provide your own sandbox.
...
I am extremely out of touch with anti-LLM arguments
...
By far the worst product I had to use every day in this new world were clouds, so that's what I'm building...
Anthropic 2026 Agentic Coding Trends Report.
Vouch by Mitchell Hashimoto, amongst others inspired by how Mario Zechner triages contributors:
A contributor trust management system based on explicit vouches to participate
2026-02-07
Working in the agent and simply calling the agent.
Claude Opus 4.6 /fast very 💸💸💸
Pi is really amazing. Great example of bespoke software. It feels like a thin wrapper for the agent sdk’s. No fluff, pure. Pi enables you to create your own way of working, instead of adhering to imposed ways. Very liberating. Pi acknowledges that the models are great.
Programming the magical “box”. That’s all it is.
Meta-engineering & harness engineering.
T800 incoming.
Nice interview with Armin Ronacher and Mario Zechner.
2026-02-06
Via JG, Alexander Klöpping: Oh Shit.
...In The Hague they are currently forming a cabinet without a Minister of Digital Affairs, let alone a Minister of AI.
They haven't had their Oh Shit moment.
If, in recent weeks, someone had made Jetten, Yesilgöz and Bontenbal sit behind a laptop for a mandatory thirty minutes to build an application with AI, the penny might have dropped. Then they would have seen it.
We have no plan.
Via Johannes Elmarasy, he pointed me towards this X post by Greg Brockman: Software development is undergoing a renaissance in front of our eyes.
...This post shares how OpenAI is currently approaching retooling our teams towards agentic software development.
...
Structure codebases to be agent-first
Need to read this post by Mario Zechner creator of Pi.
We will be killing our editor extension, the Amp VS Code extension. We're going to be killing it. And we're going to be killing it because we think it's no longer the future. We think the sidebar is dead.
JG and myself have been exploring Pi today. Pi is great. Less is more. ❤️
Hot take: Ai Driven Development is more deterministic than pure human based development.
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 claude
JG made me aware of the terminal reset command. This will fix the weird character situation you might encounter while using Ghostty. I asked Pi to explain reset to me:
The reset command is a Unix/Linux terminal utility that reinitializes your terminal to a sane state.
Ethan Mollick on X has pointed out some interesting information in the system card. My eye caught this:
...model welfare...
2026-02-05
X article by Lance Martin from Anthropic on getting the most out of Opus 4.6.
This MartinFowler.com article by Birgitta Böckeler on context engineering for coding agents, presents a good overview of the variables at play.
Mitchell wrote a lovely post on his journey from Ai sceptic to his adoption of Ai. I liked the section in which he described: harness engineering
…It is the idea that anytime you find an agent makes a mistake, you take the time to engineer a solution such that the agent never makes that mistake again.
Nice visualized comparison of sub-agents vs agent teams. Posted by Lydia from Anthropic
CC Agent Teams. Orchestration has arrived. Docs:
Teammates message each other directly
Opus 4.6 is here. Announcement post.
Really Perplexity? Via X. Feeding the rumours:
"claude46opusthinking": {
"label": "Claude Opus 4.6 Thinking",
"description": "Anthropic's Opus reasoning model with thinking",
"mode": "search"
},
Claude Code /debug
...Great for chatting through issues like "/debug why didn't my hook trigger?" or "/debug why did my tool call fail?"
Just this Tuesday we built our very first RAG pipeline at work. But there are signals from the industry that there might be better strategies emerging. PageIndex seems very interesting. Entry point is a ToC.
PageIndex is a vectorless, reasoning-based RAG engine that mirrors how humans read, delivering traceable, explainable, and context-aware retrieval, without vector databases or chunking.
People from Amp wrote this: Liberating Code Review:
...we've fully decoupled the review agent completely from any UI, making it a composable and extensible subroutine that can be invoked from many different places where it is useful
...
This composability also means you can more easily close the loop...
They have introduced the concept of checks. This is a nice idea, but again those extra needed directories. Don't know how I feel about them.
Gergely Orosz on the pattern of sleep disruption, in this new era of Ai. Directly after seeing this message, I got a newsletter email from Geoff Huntley addressing exactly this.
exe.dev first impressions: super low barrier to entry. Minimalistic way of quickly creating VMs. Builtin web based shell. Enables mobile interaction on the go. Shift to mobile is important. Last night I closed my laptop while working in OpenCode, first thought was, it's a good thing I have an SSH client on my phone...
Deno (founded by Ryan Dahl) is joining the sandboxes race as well. Stumbled upon this, 5 mins, after trying exe.dev...
Pi has a new home pi.dev. Thanks to the fine people at exe.dev. Need to try Pi and exe.dev.
Sam Altman’s response to yesterday’s Anthropic ads. From what I can remember Amp was the front runner in using ads in their product Amp Free.
Claude Code /insights
…give suggestions on how to improve your workflow.
2026-02-04
sub-agents are for context management.
The open format Agent Skills, is a very nice initiative. If only the context aware activation of those skills would be more dependable.
.agents/skills is becoming a thing. Cursor and opencode
Nice tip from Ado @ Anthropic: tab towards your next prompt.
GPT-5.2 and GPT-5.2-Codex are now 40% faster. But how did they do this?
Model season has begun? Alibaba's QwenTeam has released: qwen3-coder-next just days after Kimi K2.5 has dropped.
Boris on agentic search vs. RAG in CC.
People are raving about this: Apple’s Xcode now supports the Claude Agent SDK. It appears that OpenAI's Codex is also available in Xcode 26.3 RC. Competition, anyone?
2026-02-03
Today we had a hackathon at work. RAG was the subject. I was really impressed by the results presented after only a few hours. A year ago, this would have been very different.
xAI has joined SpaceX
Ethan Mollick on X, in a thread that touches upon engineering an agentic engineering process:
A lesson of the past three years is that figuring out the best way to use AI is an exploratory process that is open to many!.
...
This is especially true about the best way to use Claude Code (or any other similar tool) in your particular industry or organization or context. Literally nobody knows the answer to that, but you can discover it yourself if you experiment.
2026-02-02
The new OpenAI Codex app. A lot of UI. The new Codex? What happened to the old Codex?
OpenAI product link
I'm just going to wait for George Hotz, to finalize his Gas Town exploration.
Rumour has it, CC swarms is only one feature flag away... 💸 Maybe this coincides with the Sonnet 5 rumours.
Nice article by Thorsten Ball (who is building Amp). He describes a good place to start:
...try to take a simple file, have the agent write tests for it, have the agent run them, don’t look at the code, have the agent modify the code & run the tests, increase the scope, see where that leads you.
There is noise on the internets about an imminent arrival of Sonnet 5... Could be just that, noise. Or not.
Interesting, Jarred Sumner is gauging people's stance if LSP support were to be removed from CC. Quinn Slack has some interesting responses to this. Quinn is Sourcegraph founder, and knows quite a lot about LSP's. He's currently working on Amp. They have not included LSP support for a reason.
2026-02-01
From the Results from the 2025 Go Developer Survey, it seems cloud environments outside of the US are becoming increasingly popular. Hetzner:
...We found that the “Other” category increased to 11% this year, and this was primarily driven by Hetzner (20% of Other responses); we plan to include Hetzner as a response choice in next year’s survey.
Armin’s agent stuff
Need to have a look at the Pi coding agent.
Wut? OpenClaw is built on top of Pi. Via Armin Ronacher. Direct link to the article.
…both OpenClaw and Pi follow the same idea: LLMs are really good at writing and running code, so embrace this.
…
Pi’s entire idea is that if you want the agent to do something that it doesn’t do yet, you don’t go and download an extension or a skill or something like this. You ask the agent to extend itself.
Claude Cowork plugins for {your business domain here} has the potential to replace a lot of current business “solutions”.
This article is nice for people who feel intimidated by the terminal. Might be useful for Product Owners.
CC --from-pr
Important Insights by Boris (original CC creator), in how CC is used by the CC team at Anthropic.
2026-01-31
OpenClaw has extensive docs. Need to read this first, especially all things related to security.
😳 OpenClaw star history
Within 10 mins I'm made aware of Temporal twice. Github
Gas Town Hall Includes GT docs.
Via Steve's post: hallucination squatting, which led me to: slopsquatting. An actor acts on (potential) LLM hallucinations for future functional benefit.
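A minimal sketch of one way to guard against this, assuming you keep an allowlist of vetted package names: anything the agent proposes that is not on the list gets flagged for a human look before it is installed. The function name and the example package names are hypothetical.

```python
def unvetted_packages(proposed, vetted):
    """Return the proposed package names that are not on the vetted allowlist."""
    # Normalise so that "NumPy " and "numpy" count as the same name.
    vetted_normalized = {name.strip().lower() for name in vetted}
    flagged = []
    for name in proposed:
        if name.strip().lower() not in vetted_normalized:
            flagged.append(name)
    return flagged


# Hypothetical example: the second name is the kind an LLM might make up.
vetted = {"requests", "numpy"}
proposed = ["requests", "requests-toolkit-pro"]
print(unvetted_packages(proposed, vetted))
```

This only flags unknown names. It says nothing about whether a flagged package is actually malicious, that part stays with the human.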
Steve Yegge has a new post. Reading is WIP. But wow.
How AI Impacts Skill Formation. Interesting research paper (by Anthropic). Found via the alignment post by Anthropic. From the research abstract:
...Our findings suggest that AI-enhanced productivity is not a shortcut to competence and AI assistance should be carefully adopted into workflows to preserve skill formation...
2026-01-30
New CC plugin Playground
We've published a new Claude Code plugin called playground that helps Claude generate HTML playgrounds. These are standalone HTML files that let you visualize a problem with Claude, interact with it and give you an output prompt to paste back into Claude Code.
The OpenAI article referenced in the Vercel article is also very interesting. It describes setting up an eval framework for SKILL usage.
Mighty article by Vercel. They were also not trusting the SKILL activation, so they did quite an investigation. The results are very interesting. It's about trust in system memory and the skill system. Just like Anthropic is using index files (for the robot) for their documentation, Vercel has found that an optimised index in system memory yields a much greater trust in capability activation.
2026-01-29
Nice interview with Peter Steinberger on The Pragmatic Engineer. It's all about "closing the loop".
This beautiful-mermaid by Craft might be interesting. Via Gergely Orosz on X (who is apparently the brother of the creator). For the lazy ones, a direct link to github.
Important library: whenwords. It's A Software Library with No Code. It's also a nice playground to experiment with your fav. agent. Drew also wrote about when you want to have a library with code:
- When Performance Matters
- When Testing is Complicated
- When You Need to Provide Support & Bug Fixes
- When Updates Matter
- When Community & Interoperability Matter
Via Quinn Slack on X, apparently Amp is also moving away from custom (slash) commands (remember that Anthropic is also moving in this direction).
Via Anthropic: apparently you can get active PR info in CC now. You need the gh GitHub CLI for this to work.
Again an open source project that needs to create a strict policy on Ai usage. This time it’s Jellyfin. Via HN. Earlier this week it was Mitchell Hashimoto for Ghostty.
2026-01-28
I've created a receipts skill. I point the agent to a directory with scanned receipt PDFs and let it process the filenames based on the contents of the PDFs. Works great, will save me a lot of time. Next step is to import & match these automatically against my accounting system.
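A minimal sketch of the naming step, assuming the agent has already pulled the date, vendor and total out of the PDF. The function name, the field names and the filename scheme are my own assumptions for illustration, not what the skill literally does.

```python
import re
from datetime import date


def receipt_filename(receipt_date: date, vendor: str, total_cents: int) -> str:
    """Build a sortable filename from fields extracted from a receipt.

    Assumes a non-negative total expressed in cents.
    """
    # Lowercase the vendor and collapse every non-alphanumeric run into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", vendor.lower()).strip("-")
    units, cents = divmod(total_cents, 100)
    return f"{receipt_date.isoformat()}_{slug}_{units}.{cents:02d}.pdf"


# Hypothetical receipt values.
print(receipt_filename(date(2026, 1, 28), "Albert Heijn B.V.", 1250))
# prints 2026-01-28_albert-heijn-b-v_12.50.pdf
```

Putting the ISO date first keeps the directory sorted chronologically, which should also help when matching against the accounting system later.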
Post by the zed.dev team announcing the Agent Client Protocol Registry. Zed and JetBrains have been collaborating on ACP.
❤️
⏺ Let me verify diagnostics are clean.
⏺ gopls - go_diagnostics (MCP)(files: [])
⎿ No diagnostics.
Nice article by Peter Steinberger, creator of moltbot (previously clawdbot)
...Building software is like walking up a mountain. You don’t go straight up, you circle around it and take turns, sometimes you get off path and have to walk a bit back, and it’s imperfect, but eventually you get to where you need to be.
2026-01-27
Dynamic AGENTS.md resolution in OpenCode, also learn.md. The latter looks a lot like my optimization instruction.
Mistral releases Mistral Vibe 2.0. Let's go EU.
Anthropic Boris: next version of CC will let you customize the spinner verbs :-)
I've observed the claude-code-guide subagent web fetching a documentation index. This combined with the llms.txt is just elegant. Need to try this pattern out for efficient code repo searching.
Working with the new opsx workflow in OpenSpec. Very seamless. The opsx:verify surprised me. It's an OS provided gap analysis.
I'm going to see if an OS change spec can be combined with the new CC Tasks primitive.
ollama launch is a new command which sets up and runs your favorite coding tools like Claude Code, OpenCode, and Codex with local or cloud models. No environment variables or config files needed.
Andrej Karpathy posted a few random notes on X
...LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related...
Thorsten Ball explains how we should strive to work towards not needing to review the code anymore.
2026-01-26
MCP-apps has business potential.
*…Consider a tool that queries your database. It returns rows of data, maybe hundreds of them. The model can summarize this data, but users often want to explore: sort by a column, filter to a date range, or click into a specific record. With text responses, every interaction requires another prompt. “Show me just the ones from last week.” “Sort by revenue.” “What’s the detail on row 47?” It works, but it’s slow.
MCP Apps closes this gap…*
OpenSpec has reached version 1.0.0 today, and with it comes a new workflow called OPSX.
To avoid getting confused between an agent's native plan mode and OpenSpec, and after talking to JG this morning, this sequence of steps will work:
- in [plan] mode brainstorm on the plan (I like /superpowers:brainstorm);
- use the plan as the basis for an OpenSpec proposal (/openspec:proposal);
- OpenSpec apply the proposal (/openspec:apply)
This way you don't end up with a plan for writing the OpenSpec proposal ;-)
playwright-cli & v1.58.0 Playwright, nice example of a project advocating CLI usage over MCP because of token efficiency:
Modern coding agents increasingly favor CLI–based workflows exposed as SKILLs over MCP because CLI invocations are more token-efficient: they avoid loading large tool schemas and verbose accessibility trees into the model context, allowing agents to act through concise, purpose-built commands. This makes CLI + SKILLs better suited for high-throughput coding agents that must balance browser automation with large codebases, tests, and reasoning within limited context windows.
2026-01-25
Angie Jones on AI-Assisted Development at Block
…we deliberately structure repos so AI agents can understand, navigate, and contribute at scale
Anthropic Boris mentioned they shipped a new rendering engine in CC last week. Maybe that explains the new look and feel.
Jeffrey Emanuel uses a standard prompt to let CC "clean up its own messes". Prompt:
Great, now I want you to carefully read over all of the new code you just wrote and other existing code you just modified with "fresh eyes" looking super carefully for any obvious bugs, errors, problems, issues, confusion, etc. Carefully fix anything you uncover.
Anthropic async hooks in CC
The way Anthropic set up their own documentation for the CC documentation agent is nice. They have an index file, llms.txt, which the agent can use to retrieve the applicable documentation. I reckon this pattern can be used for other use cases as well. Perhaps repos?
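A minimal sketch of the index pattern, assuming the index follows the llms.txt convention of markdown list items shaped like `- [Title](url): description`. The sample index, its URLs and the function names are hypothetical, and the keyword match stands in for whatever selection the agent really does.

```python
import re

# One index entry per markdown list item: - [Title](url): optional description
LINK_LINE = re.compile(
    r"^\s*-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_index(text):
    """Turn an llms.txt style index into a list of {title, url, desc} entries."""
    entries = []
    for line in text.splitlines():
        match = LINK_LINE.match(line)
        if match:
            entries.append({
                "title": match.group("title"),
                "url": match.group("url"),
                "desc": match.group("desc") or "",
            })
    return entries


def find_docs(entries, keyword):
    """Return the URLs whose title or description mentions the keyword."""
    needle = keyword.lower()
    return [
        e["url"] for e in entries
        if needle in e["title"].lower() or needle in e["desc"].lower()
    ]


# Hypothetical index content; the URLs are placeholders.
SAMPLE = """# Example docs

## Docs

- [Hooks](https://example.com/docs/hooks.md): Run commands on agent events
- [Skills](https://example.com/docs/skills.md): Package procedural knowledge
"""

print(find_docs(parse_index(SAMPLE), "hooks"))
```

The point is that the agent reads one small index first and only fetches the pages it needs. For a repo the same idea could be an index of directories or packages with a one-line description each.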
Angie Jones commented on LinkedIn:
…we have our agents doing 15-21 days of engineering work per sprint! we assign them tickets, they implement the work, run tests, fix the failures, and put up a PR!
2026-01-24
Need to watch Boris giving a walkthrough of Claude Cowork & Code
Unified way to install skills by Vercel
Need to discuss with JG if clawd is interesting for our Slack setup.
Bill Kennedy on 2026
I need everyone to start focusing on their engineering skills…
Anthropic: Building multi-agent systems: when and how to use them
This whole article is gold
Anthropic: Building agents with Skills: Equipping agents for specialized work
…the emerging agent architecture looks like a combination of:
- Agent loop: The core reasoning system that decides what to do next
- Agent runtime: Execution environment (code, filesystem)
- MCP servers: Connections to external tools and data sources
- Skills library: Domain expertise and procedural knowledge
Anthropic SKILLS support sub-folders?
Anthropic SKILL updates.
Going forward, when thinking of making a slash command we suggest making a skill instead.
…you can choose whether you want it to be invocable, model-invocable, or both (the default).
Skills naturally pair with subagents. Subagents allow you to execute the skill while protecting your context window, you can also choose which subagent is activated and if you want to fork the context.
2026-01-23
I feel comfortable in the Go ecosystem. Just as years ago I felt comfortable in the .NET ecosystem. Might be interesting to also dabble a bit in bun. It has matured quite a bit. Anthropic buying bun hopefully supports sustainability.
Build in Opus, review in GPT
Ralph loop in Goose
SKILLS are what Neo (The Matrix) gets loaded into his brain. He now knows Kung Fu.
What would I do without Tailscale?
All those uncomfortable “enterprise” VPN solutions.
Mitchell guarding the Ghostty project against bad AI drivers
commit with more background info.
I reckon the true Ai awareness & adoption in the “upper” levels of an organization will follow the same pattern as with agile, from way back.
Agentation is cool
Anthropic Task in CC
…Tasks are a new primitive that help Claude Code track and complete more complicated projects and collaborate on them across multiple sessions or subagents.
They took inspiration from Steve Yegge’s beads. It’s an upgrade of Todo. This could be big.
From sprites
"Stateful sandbox environments with checkpoint & restore"
This pretty much describes what I have set up in Proxmox.
Delta Based Development. It's about getting the SPECS and code in sync. It's not, "...and now implement feature x", it's "what's the current delta between the specs and code, and sync". Need to experiment with SPEC removal.
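A minimal sketch of the delta idea, assuming both the specs and the code can be reduced to a set of feature identifiers. How you extract those identifiers is the hard part and is not shown; the function name and the feature names are hypothetical.

```python
def spec_code_delta(spec_items, code_items):
    """Compare what the specs describe with what the code implements."""
    spec, code = set(spec_items), set(code_items)
    return {
        # Specified but not built yet: work for the agent.
        "to_implement": sorted(spec - code),
        # Built but no longer (or never) specified: remove it or write the spec.
        "to_remove_or_spec": sorted(code - spec),
    }


# Hypothetical feature identifiers.
spec = {"login", "export-csv"}
code = {"login", "legacy-report"}
print(spec_code_delta(spec, code))
# prints {'to_implement': ['export-csv'], 'to_remove_or_spec': ['legacy-report']}
```

SPEC removal falls out of the second bucket: delete a spec and the matching code shows up as something to remove on the next sync.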
JG mentioned "gap analysis" a couple of times this week.
2026-01-22
David Crawshaw (co-founder & ex-CTO Tailscale) is now CEO of exe.dev. They provide VM sandboxes. Seems somewhat similar to fly.io Sprites.
Need to find a way to combine openspec with beads. I like my beads setup.
People who are using the robots as Stack Overflow 2.0 have not fully experienced the power of the loop. It's all about the loop(s).
CC or OC as the interface to all things computer. Me: "Hey, change this svg to a png", robot: "sure I'll just use ImageMagick and run this otherworldly command". Also, managing a linux server with one of these agents is bliss.
Prompting the robot to perform cli based commands for workflows is an interesting pattern. Observed this in beads and openspec.
Ado CC lets you stash your prompt.
How should we position the role of QA in this new era of agents?
How cool would it be to just dump your idea in the agent orchestrator, let it do its thing and observe the output.
Appears that beads is migrating away from SQLite to dolt
People are liking OpenSpec. I need to gain experience with this way of working.
Seeing linear being mentioned quite a lot. Interested to work with this instead of the Atlassian and Microsoft offerings.
Would be awesome to work in a highly enthusiastic small team and going from reverse Ralphing to forward Ralphing essential business solutions.
Found out about Jeffrey Emanuel's bespoke version of beads: beads_rust. Example of You don't need all of it
Anthropic Claude’s constitution
Anthropic CC best practices
2026-01-21
Agentic first impression scale: Oh fuck, oh wow, ahhhhh
Nice article by Angie Jones on skills vs MCP
Use the AskUserQuestion tool for the interview with the user. It's really nice.
JG has (again) mentioned LangChain Github
Via SW became aware of the article Electricity use of AI coding agents
Via JG Thoughtworks reverse engineering
AI/works™ uses AI-enabled reverse engineering to interpret legacy applications and convert them into structured specifications enriched with regulatory, security and industry context. These specifications guide agentic workflows that generate production-grade code, automated tests and deployment pipelines.
2026-01-20
…the era of humans writing code is over.
2026 and forward is engineering². Engineering solutions in a looping engineering process.
2026-01-19
JG mentioned reverse RW gap analysis SPEC <=> Code. Very interesting, eventual product consent through looping.
LSP tool bootstrapping analysis. Specific LSP tools will be loaded based on repo conditionals. In the case of Go prob. looking for a go.mod file.
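A minimal sketch of what such a repo conditional could look like, assuming detection is nothing more than checking for marker files in the repo root. Only the go.mod case comes from my note above. The other marker and server pairings are illustrative assumptions, and this is not how CC actually does it.

```python
import tempfile
from pathlib import Path

# Marker file -> language server to load.
# go.mod -> gopls is the case from the note; the rest are assumed pairings.
LSP_BY_MARKER = {
    "go.mod": "gopls",
    "Cargo.toml": "rust-analyzer",
    "tsconfig.json": "typescript-language-server",
}


def detect_lsp_servers(repo_root):
    """Return the language servers whose marker file exists in the repo root."""
    root = Path(repo_root)
    return [
        server
        for marker, server in LSP_BY_MARKER.items()
        if (root / marker).is_file()
    ]


# Demo on a throwaway directory that looks like a Go repo.
with tempfile.TemporaryDirectory() as repo:
    (Path(repo) / "go.mod").write_text("module example.com/demo\n")
    print(detect_lsp_servers(repo))  # prints ['gopls']
```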
Keep the original plan as a verification source after the beads have been implemented.
Every week we discuss some surprising new phenomenon or principle we’ve discovered while coding with 20+ agents.
2026-01-18
- The Website Obesity crisis from way back.
- This is a motherfucking website via: Motherfucking Blog
Jeffrey Emanuel has launched his port of Steve Yegge's beads: beads_rust
I still want to create my own with just plain old SQLite.
I like how Simon Willison quotes people.