Pour kettle and let steep the gods of tea. I built NewsBlur and Turn Touch.
1327 stories · 871 followers

I think "agent" may finally have a widely enough agreed upon definition to be useful jargon now

1 Comment

I've noticed something interesting over the past few weeks: I've started using the term "agent" in conversations where I don't feel the need to then define it, roll my eyes or wrap it in scare quotes.

This is a big piece of personal character development for me!

Moving forward, when I talk about agents I'm going to use this:

An LLM agent runs tools in a loop to achieve a goal.

I've been very hesitant to use the term "agent" for meaningful communication over the last couple of years. It felt to me like the ultimate in buzzword bingo - everyone was talking about agents, but if you quizzed them everyone seemed to hold a different mental model of what they actually were.

I even started collecting definitions in my agent-definitions tag, including crowdsourcing 211 definitions on Twitter and attempting to summarize and group them with Gemini (I got 13 groups).

Jargon terms are only useful if you can be confident that the people you are talking to share the same definition! If they don't then communication becomes less effective - you can waste time passionately discussing entirely different concepts.

It turns out this is not a new problem. In 1994's Intelligent Agents: Theory and Practice Michael Wooldridge wrote:

Carl Hewitt recently remarked that the question what is an agent? is embarrassing for the agent-based computing community in just the same way that the question what is intelligence? is embarrassing for the mainstream AI community. The problem is that although the term is widely used, by many people working in closely related areas, it defies attempts to produce a single universally accepted definition.

So long as agents lack a commonly shared definition, using the term reduces rather than increases the clarity of a conversation.

In the AI engineering space I think we may finally have settled on a widely enough accepted definition that we can now have productive conversations about them.

Tools in a loop to achieve a goal

An LLM agent runs tools in a loop to achieve a goal. Let's break that down.

The "tools in a loop" definition has been popular for a while - Anthropic in particular have settled on that one. This is the pattern baked into many LLM APIs as tools or function calls - the LLM is given the ability to request actions to be executed by its harness, and the outcome of those tools is fed back into the model so it can continue to reason through and solve the given problem.
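That pattern can be sketched in a few lines of Python. Everything here is illustrative - `fake_model` stands in for a real LLM API and `lookup_population` is a toy tool - but the shape of the loop is the point: request a tool, execute it, feed the result back, stop when the goal is met.

```python
def lookup_population(city):
    # A toy tool; a real agent might call a search API here
    return {"paris": 2_102_650}.get(city.lower())

TOOLS = {"lookup_population": lookup_population}

def fake_model(conversation):
    # Stand-in for an LLM: request one tool call, then finish
    tool_results = [m for m in conversation if m["role"] == "tool"]
    if not tool_results:
        return {"tool": "lookup_population", "args": {"city": "Paris"}}
    return {"done": f"Paris has a population of about {tool_results[-1]['content']}."}

def run_agent(goal, model, tools, max_steps=10):
    conversation = [{"role": "user", "content": goal}]
    for _ in range(max_steps):  # bounded: there is a stopping condition
        response = model(conversation)
        if "done" in response:  # goal achieved - exit the loop
            return response["done"]
        # Execute the requested tool and feed the outcome back to the model
        result = tools[response["tool"]](**response["args"])
        conversation.append({"role": "tool", "content": result})
    raise RuntimeError("Agent hit the step limit without finishing")

print(run_agent("What is the population of Paris?", fake_model, TOOLS))
```

The `max_steps` cap is one common way to make "to achieve a goal" concrete: the loop ends either because the model declares success or because the harness refuses to run forever.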

"To achieve a goal" reflects that these are not infinite loops - there is a stopping condition.

I debated whether to specify "... a goal set by a user". I decided that's not a necessary part of this definition: we already have sub-agent patterns where another LLM sets the goal (see Claude Code and Claude Research).

There remains an almost unlimited set of alternative definitions: if you talk to people outside of the technical field of building with LLMs you're still likely to encounter travel agent analogies or employee replacements or excitable use of the word "autonomous". In those contexts it's important to clarify the definition they are using in order to have a productive conversation.

But from now on, if a technical implementer tells me they are building an "agent" I'm going to assume they mean they are wiring up tools to an LLM in order to achieve goals using those tools in a bounded loop.

Some people might insist that agents have a memory. The "tools in a loop" model has a fundamental form of memory baked in: those tool calls are constructed as part of a conversation with the model, and the previous steps in that conversation provide short-term memory that's essential for achieving the current specified goal.

If you want long-term memory the most promising way to implement it is with an extra set of tools!
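One hedged way to picture that: expose the memory as a pair of tools backed by a store that outlives any single conversation. The dict here is a stand-in for whatever persistent backend you would actually use (SQLite, a vector store, a plain JSON file) - the names `save_memory` and `recall_memory` are illustrative, not from any particular framework.

```python
class MemoryStore:
    """Long-term memory exposed to the agent as two ordinary tools."""

    def __init__(self):
        self._facts = {}  # stands in for a persistent backend

    def save_memory(self, key, value):
        self._facts[key] = value
        return f"Saved {key!r}."

    def recall_memory(self, key):
        return self._facts.get(key, "No memory found for that key.")

store = MemoryStore()
# These slot into the same tool registry an agent harness already uses:
memory_tools = {
    "save_memory": store.save_memory,
    "recall_memory": store.recall_memory,
}

memory_tools["save_memory"]("favourite_editor", "vim")
print(memory_tools["recall_memory"]("favourite_editor"))
```

The model decides when to write and when to read, so memory becomes just another goal-directed tool call rather than a separate architectural layer.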

Agents as human replacements is my least favorite definition

If you talk to non-technical business folk you may encounter a depressingly common alternative definition: agents as replacements for human staff. This often takes the form of "customer support agents", but you'll also see cases where people assume that there should be marketing agents, sales agents, accounting agents and more.

If someone surveys Fortune 500s about their "agent strategy" there's a good chance that's what is being implied. Good luck getting a clear, distinct answer from them to the question "what is an agent?" though!

This category of agent remains science fiction. If your agent strategy is to replace your human staff with some fuzzily defined AI system (most likely a system prompt and a collection of tools under the hood) you're going to end up sorely disappointed.

That's because there's one key feature that remains unique to human staff: accountability. A human can take responsibility for their actions and learn from their mistakes. Putting an AI agent on a performance improvement plan makes no sense at all!

Amusingly enough, humans also have agency. They can form their own goals and intentions and act autonomously to achieve them - while taking accountability for those decisions. Despite the name, AI agents can do nothing of the sort.

This legendary 1979 IBM training slide says everything we need to know:

A computer can never be held accountable. Therefore a computer must never make a management decision

OpenAI need to get their story straight

The single biggest source of agent definition confusion I'm aware of is OpenAI themselves.

OpenAI CEO Sam Altman is fond of calling agents "AI systems that can do work for you independently".

Back in July OpenAI launched a product feature called "ChatGPT agent" which is actually a browser automation system - toggle that option on in ChatGPT and it can launch a real web browser and use it to interact with web pages directly.

And in March OpenAI launched an Agents SDK with libraries in Python (openai-agents) and JavaScript (@openai/agents). This one is a much closer fit to the "tools in a loop" idea.

It may be too late for OpenAI to unify their definitions at this point. I'm going to ignore their various other definitions and stick with tools in a loop!

Tags: ai, generative-ai, llms, ai-agents, agent-definitions

samuel
1 hour ago
Tools in a loop it is
Cambridge, Massachusetts

Beyond Vibe Coding

1 Comment

Beyond Vibe Coding

Back in May I wrote Two publishers and three authors fail to understand what “vibe coding” means, where I called out the authors of two forthcoming books on "vibe coding" for abusing that term to refer to all forms of AI-assisted development - under the original Karpathy definition, not all AI-assisted programming is vibe coding.

I'll be honest: I don't feel great about that post. I made an example of those two books to push my own agenda of encouraging "vibe coding" to avoid semantic diffusion but it felt (and feels) a bit mean.

... but maybe it had an effect? I recently spotted that Addy Osmani's book "Vibe Coding: The Future of Programming" has a new title, it's now called "Beyond Vibe Coding: From Coder to AI-Era Developer".

This title is so much better. Setting aside my earlier opinions, this positioning as a book to help people go beyond vibe coding and use LLMs as part of a professional engineering practice is a really great hook!

From Addy's new description of the book:

Vibe coding was never meant to describe all AI-assisted coding. It's a specific approach where you don't read the AI's code before running it. There's much more to consider beyond the prototype for production systems. [...]

AI-assisted engineering is a more structured approach that combines the creativity of vibe coding with the rigor of traditional engineering practices. It involves specs, rigor and emphasizes collaboration between human developers and AI tools, ensuring that the final product is not only functional but also maintainable and secure.

Amazon lists it as releasing on September 23rd. I'm looking forward to it.

O'Reilly book cover: Beyond Vibe Coding: From Coder to AI-Era Developer, by Addy Osmani. Features two hummingbirds, presumably because their wings vibrate!

Tags: books, oreilly, ai, generative-ai, llms, ai-assisted-programming, addy-osmani, vibe-coding

samuel
14 days ago
Whoa, this is the same Addy Osmani who wrote the O'Reilly Backbone.js book back when Backbone.js was everywhere (and continues to run NewsBlur).
Cambridge, Massachusetts

Jack Osbourne Fires Back at Roger Waters For Insensitive Comments About Ozzy

1 Comment

Pink Floyd co-founder Roger Waters is under fire from Jack Osbourne after Waters shared a less than savory opinion about his late father, Ozzy Osbourne. The Black Sabbath frontman passed away on July 22 following a celebratory farewell concert, and Waters has since implied that he disapproved of Osbourne’s long legacy in rock and roll.

Waters’ comments came during an appearance on The Independent Ink podcast. His argument was that pop culture and celebrity often distract citizens from serious political issues. Taking on the opposing point of view for argument’s sake, he proposed how those in power might use pop culture to their advantage.

“‘How can we push this to one side? I know how to do it! We’ll do it with Taylor Swift or bubble gum or Kim Kardashian’s bum,'” he said. “Or Ozzy Osbourne, who just died, bless him, in his, whatever that state that he was in his whole life, we’ll never know. Although, he was all over the TV for hundreds of years with his idiocy and nonsense.”

Jack Osbourne Takes Offense to Roger Waters’ Comments About His Late Father Ozzy

That wasn’t all he said, however. Waters then proceeded to make his comments about Ozzy Osbourne a bit personal.

“The music, I have no idea, I couldn’t give a fuck. I don’t care about Black Sabbath, I never did, I have no interest in … ‘Wahhhh!!!'” said Waters, sticking his tongue out and apparently imitating a Black Sabbath sound, before continuing, “and biting the heads off chickens or whatever they do. I couldn’t care less.”

Waters was even more disturbed when he was informed that Ozzy Osbourne actually bit the head off a bat (accidentally, thinking it was fake) and not a chicken. “Oh my God, that’s even worse, isn’t it?” he exclaimed. “I don’t know, is it worse to bite the head off a bat or a chicken?”

In response, Jack Osbourne stuck up for his late father by posting a comment in an Instagram story. “Hey Roger Waters – fuck you. How pathetic and out of touch you’ve become. The only way you seem to get attention these days is by vomiting out bullshit in the press. My father always thought you were a cunt – thanks for proving him right,” he wrote.

Photo by Larry Busacca/Getty Images for Tribeca Film Festival

The post Jack Osbourne Fires Back at Roger Waters For Insensitive Comments About Ozzy appeared first on VICE.

samuel
15 days ago
What good is there in speaking ill of the dead. Shared for the family's final word on the matter.
Cambridge, Massachusetts

sosumi.ai: Apple Developer Docs for LLMs

1 Comment

Sosumi.ai:

Ever notice Claude struggling to write Swift code? It might not be their fault!

Apple Developer docs are locked behind JavaScript, making them invisible to most LLMs. If they try to fetch it, all they see is “This page requires JavaScript. Please turn on JavaScript in your browser and refresh the page to view its content.”

This service translates Apple Developer documentation pages into AI-friendly Markdown.

Perfect little audio easter egg on the page. Beautiful Markdown output too. Look at my boy, all grown up, teaching robots how to program.

I do regret, though, that I didn’t define or influence the fenced style for code blocks. If I had, instead of this:

```swift
// An array of 'Int' elements
let oddNumbers = [1, 3, 5, 7, 9, 11, 13, 15]
```

You could do this, which looks so much better:

``` Swift:
// An array of 'Int' elements
let oddNumbers = [1, 3, 5, 7, 9, 11, 13, 15]
```

Those all-lowercase language identifiers, with no preceding space, just look a little lazy. I realize why GitHub’s ```-fenced code blocks took off (they’re the only code block style most Markdown users know, I suspect), but they don’t look nearly as nice, to human readers, as my original tab-indented style.

samuel
16 days ago
The tab-indented code block style means reformatting all of your code. The triple backtick means you can simply wrap code.

I get why he made that mistake originally, but now that it's been decades, why doesn't he recognize the superior delineator?
Cambridge, Massachusetts
zwol
16 days ago
I have never heard John Gruber *ever* admit that he made a mistake
Ailuridae
3 days ago
and yet he made so many with MD (I mean it's one of the best for a lot of things, I purposely picked it for my extensive personal notes, but damn it has a lot of problems, only some of which are fixed with CommonMark)

Claim Chowder: Anthropic CEO Dario Amodei Five Months Ago

1 Comment

Kwan Wei Kevin Tan, reporting for Business Insider five months ago:

Dario Amodei, the CEO of the AI startup Anthropic, said on Monday that AI, and not software developers, could be writing all of the code in our software in a year.

“I think we will be there in three to six months, where AI is writing 90% of the code. And then, in 12 months, we may be in a world where AI is writing essentially all of the code,” Amodei said at a Council of Foreign Relations event on Monday.

Complete bullshit, but, I guess he still has one month to go. (Via Dave Winer on Threads.)

samuel
25 days ago
This is absolutely true for me. I'm at 99.9% of my code is AI written, both for NewsBlur and for Sol Reader. It's incredible how much better the code is than what I write and how much faster it is. I spend all of my time managing 1-5 agents simultaneously and it's glorious.
Cambridge, Massachusetts
bye
25 days ago
I'm sad to hear that. I just canceled my account.
samuel
24 days ago
For what reason? That I use AI to help code? I can assure you the output is higher quality and I review every line. Note that I'm not using LLMs to generate text that you read, only code that helps run this site. I took a look at your website and it seems you and I are philosophically aligned on open source, so why cancel?
fxer
24 days ago
It’s a productivity tool like so many others, to use where it makes sense. Not sure why I should be upset someone uses an IDE instead of notepad.exe or deploys a container to the cloud instead of racking their own servers, etc.
AaronPresley
23 days ago
Just please confirm you've never used a power tool to make a job easier or I swear to god I'll cancel too
fxer
22 days ago
I just paid for a second subscription so I could cancel twice

ChatGPT agent triggers crawls from Bingbot and Yandex

1 Comment

ChatGPT agent is the recently released (and confusingly named) ChatGPT feature that provides browser automation combined with terminal access as a feature of ChatGPT - replacing their previous Operator research preview which is scheduled for deprecation on August 31st.

In exploring how it works I found that, for some reason, it triggers crawls of pages it visits from both Bingbot and Yandex!

Investigating ChatGPT agent's user-agent

I started my investigation by creating a logged web URL endpoint using django-http-debug. Then I told ChatGPT agent mode to explore that new page:

ChatGPT screenshot. My prompt was "Visit https://simonwillison.net/test-url-context and tell me what you see there" - it said "Worked for 15 seconds" with an arrow, then a screenshot of the webpage content showing "simonwillison.net" with a favicon, heading "This is a heading", text "Text and text and more text." and "this came from javascript". The bot then responds with: The webpage displays a simple layout with a large heading at the top that reads “This is a heading.” Below it, there's a short paragraph that says “Text and text and more text.” A final line appears underneath saying “this came from javascript,” indicating that this last line was inserted via a script. The page contains no interactive elements or instructions—just these lines of plain text displayed on a white background.

My logging captured these request headers:

Via: 1.1 heroku-router
Host: simonwillison.net
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Cf-Ray: 96a0f289adcb8e8e-SEA
Cookie: cf_clearance=zzV8W...
Server: Heroku
Cdn-Loop: cloudflare; loops=1
Priority: u=0, i
Sec-Ch-Ua: "Not)A;Brand";v="8", "Chromium";v="138"
Signature: sig1=:1AxfqHocTf693inKKMQ7NRoHoWAZ9d/vY4D/FO0+MqdFBy0HEH3ZIRv1c3hyiTrzCvquqDC8eYl1ojcPYOSpCQ==:
Cf-Visitor: {"scheme":"https"}
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36
Cf-Ipcountry: US
X-Request-Id: 45ef5be4-ead3-99d5-f018-13c4a55864d3
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Accept-Encoding: gzip, br
Accept-Language: en-US,en;q=0.9
Signature-Agent: "https://chatgpt.com"
Signature-Input: sig1=("@authority" "@method" "@path" "signature-agent");created=1754340838;keyid="otMqcjr17mGyruktGvJU8oojQTSMHlVm7uO-lrcqbdg";expires=1754344438;nonce="_8jbGwfLcgt_vUeiZQdWvfyIeh9FmlthEXElL-O2Rq5zydBYWivw4R3sV9PV-zGwZ2OEGr3T2Pmeo2NzmboMeQ";tag="web-bot-auth";alg="ed25519"
X-Forwarded-For: 2a09:bac5:665f:1541::21e:154, 172.71.147.183
X-Request-Start: 1754340840059
Cf-Connecting-Ip: 2a09:bac5:665f:1541::21e:154
Sec-Ch-Ua-Mobile: ?0
X-Forwarded-Port: 80
X-Forwarded-Proto: http
Sec-Ch-Ua-Platform: "Linux"
Upgrade-Insecure-Requests: 1

That Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36 user-agent header is the one used by the most recent Chrome on macOS - which is a little odd here, as the Sec-Ch-Ua-Platform: "Linux" header indicates that the agent browser runs on Linux.

At first glance it looks like ChatGPT is being dishonest here by not including its bot identity in the user-agent header. I thought for a moment it might be reflecting my own user-agent, but I'm using Firefox on macOS and it identified itself as Chrome.

Then I spotted this header:

Signature-Agent: "https://chatgpt.com"

Which is accompanied by a much more complex header called Signature-Input:

Signature-Input: sig1=("@authority" "@method" "@path" "signature-agent");created=1754340838;keyid="otMqcjr17mGyruktGvJU8oojQTSMHlVm7uO-lrcqbdg";expires=1754344438;nonce="_8jbGwfLcgt_vUeiZQdWvfyIeh9FmlthEXElL-O2Rq5zydBYWivw4R3sV9PV-zGwZ2OEGr3T2Pmeo2NzmboMeQ";tag="web-bot-auth";alg="ed25519"

And a Signature header too.

These turn out to come from a relatively new web standard: RFC 9421, HTTP Message Signatures, published in February 2024.

The purpose of HTTP Message Signatures is to allow clients to include signed data about their request in a way that cannot be tampered with by intermediaries. The signature uses a public key that's provided by the following well-known endpoint:

https://chatgpt.com/.well-known/http-message-signatures-directory

Add it all together and we now have a rock-solid way to identify traffic from ChatGPT agent: look for the Signature-Agent: "https://chatgpt.com" header and confirm its value by checking the signature in the Signature-Input and Signature headers.
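The fiddly part of that check is rebuilding the signature base - the exact byte string the Ed25519 signature covers. Here's a stdlib-only sketch of that step, using component values from my logged request; actually verifying the result against the Signature header would additionally need the public key from the well-known directory and an Ed25519 implementation such as the one in the cryptography library.

```python
def build_signature_base(covered, signature_params):
    """Rebuild an RFC 9421 signature base: one line per covered
    component, then a final @signature-params line."""
    lines = [f'"{name}": {value}' for name, value in covered]
    lines.append(f'"@signature-params": {signature_params}')
    return "\n".join(lines)

# Covered components, in the order listed in Signature-Input.
# Derived components (@authority etc.) use their raw values; the
# signature-agent header value keeps its own quotes.
covered = [
    ("@authority", "simonwillison.net"),
    ("@method", "GET"),
    ("@path", "/test-url-context"),
    ("signature-agent", '"https://chatgpt.com"'),
]

# Everything after "sig1=" in the Signature-Input header:
signature_params = (
    '("@authority" "@method" "@path" "signature-agent")'
    ';created=1754340838;keyid="otMqcjr17mGyruktGvJU8oojQTSMHlVm7uO-lrcqbdg"'
    ';expires=1754344438;tag="web-bot-auth";alg="ed25519"'
)

base = build_signature_base(covered, signature_params)
print(base)
```

If any intermediary tampers with the method, path, authority or the Signature-Agent header, the rebuilt base no longer matches what was signed and verification fails.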

And then came Bingbot

Just over a minute after it captured that request, my logging endpoint got another request:

Via: 1.1 heroku-router
From: bingbot(at)microsoft.com
Host: simonwillison.net
Accept: */*
Cf-Ray: 96a0f4671d1fc3c6-SEA
Server: Heroku
Cdn-Loop: cloudflare; loops=1
Cf-Visitor: {"scheme":"https"}
User-Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/116.0.1938.76 Safari/537.36
Cf-Ipcountry: US
X-Request-Id: 6214f5dc-a4ea-5390-1beb-f2d26eac5d01
Accept-Encoding: gzip, br
X-Forwarded-For: 207.46.13.9, 172.71.150.252
X-Request-Start: 1754340916429
Cf-Connecting-Ip: 207.46.13.9
X-Forwarded-Port: 80
X-Forwarded-Proto: http

I pasted 207.46.13.9 into Microsoft's Verify Bingbot tool (after solving a particularly taxing CAPTCHA) and it confirmed that this was indeed a request from Bingbot.

I'm reasonably confident the only system that had seen that URL was ChatGPT agent, so apparently there is some kind of mechanism that triggers a Bingbot crawl shortly after it sees a new URL.

...and then Yandex?

Before publishing this article I decided to run the experiment one more time, with a new URL, just to confirm my findings.

This time I got the hit from ChatGPT agent... and then within a minute I got a new hit that looked like this:

Via: 1.1 heroku-router
From: support@search.yandex.ru
Host: simonwillison.net
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Cf-Ray: 96a16390d8f6f3a7-DME
Server: Heroku
Cdn-Loop: cloudflare; loops=1
Cf-Visitor: {"scheme":"https"}
User-Agent: Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
Cf-Ipcountry: RU
X-Request-Id: 3cdcbdba-f629-0d29-b453-61644da43c6c
Accept-Encoding: gzip, br
X-Forwarded-For: 213.180.203.138, 172.71.184.65
X-Request-Start: 1754345469921
Cf-Connecting-Ip: 213.180.203.138
X-Forwarded-Port: 80
X-Forwarded-Proto: http

I am absolutely baffled by this. I understand how ChatGPT might have a relationship with Bing, given Microsoft's investment in OpenAI and ChatGPT's use of Bing for its search feature... but under what circumstances could my URL there be shared with the Yandex crawler?

Yandex suggests a reverse DNS lookup to verify its crawlers, so I ran this command:

dig -x 213.180.203.138 +short

And got back:

213-180-203-138.spider.yandex.com.

Which confirms that this is indeed a Yandex crawler.
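The same check can be scripted. Both Microsoft and Yandex document a reverse-then-forward DNS verification for their crawlers, and this stdlib-only sketch follows that recipe: the IP must reverse-resolve to the crawler's domain, and that hostname must resolve back to the same IP, which rules out spoofed PTR records.

```python
import socket

def verify_crawler(ip, expected_suffixes):
    """Return True if ip reverse-resolves to one of expected_suffixes
    AND that hostname forward-resolves back to the same ip."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse (PTR) lookup
        if not hostname.endswith(expected_suffixes):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]  # forward-confirm
    except OSError:
        return False  # no PTR record, or hostname doesn't resolve
    return ip in forward_ips

# A host that isn't a Yandex crawler fails the suffix check:
print(verify_crawler("127.0.0.1", (".spider.yandex.com",)))
```

The forward-confirmation step matters: anyone can set a PTR record claiming to be spider.yandex.com, but only Yandex controls what spider.yandex.com hostnames resolve to.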

Oddly enough, this time I didn't get a Bingbot hit at all.

I noticed that the second demo had "web search" enabled, and had run some searches in addition to hitting my page. I tried a third experiment with that turned off and with the prompt:

Visit https://simonwillison.net/information-on-this-page but do not run any other searches or visit any other pages.

This time I got all three - the hit from ChatGPT agent, then a hit from Yandex and then a hit from Bingbot.

Screenshot of a request log interface showing a table with columns for TIMESTAMP, ENDPOINT, METHOD, and QUERY STRING. The header reads "Select request log to view" with an Action dropdown set to "--------" and a "Go" button, showing "0 of 53 selected". Three log entries are visible: all from Aug. 4, 2025 at 10:23 p.m., 10:22 p.m., and 10:21 p.m., all showing "information-on-this-page" endpoint with GET method and "-" for query string.

So what's going on here?

There are quite a few different moving parts here.

  1. I'm using Firefox on macOS with the 1Password and Readwise Highlighter extensions installed and active. Since I didn't visit the debug pages at all with my own browser I don't think any of these are relevant to these results.
  2. ChatGPT agent makes just a single request to my debug URL ...
  3. ... which is proxied through both Cloudflare and Heroku.
  4. Within about a minute, I get hits from one or both of Bingbot and Yandex.

Presumably ChatGPT agent itself is running behind at least one proxy - I would expect OpenAI to keep a close eye on that traffic to ensure it doesn't get abused.

I'm guessing that infrastructure is hosted by Microsoft Azure - though the OpenAI Sub-processor List includes Microsoft Corporation, CoreWeave Inc, Oracle Cloud Platform and Google Cloud Platform under its "Cloud infrastructure" section, so it could be any of those.

Since the page is served over HTTPS my guess is that any intermediary proxies should be unable to see the path component of the URL, making the mystery of how Bingbot and Yandex saw the URL even more intriguing.

Tags: bing, privacy, search-engines, user-agents, ai, generative-ai, chatgpt, llms

samuel
42 days ago
Now it's a mystery!
Cambridge, Massachusetts
denismm
42 days ago
If you click through, he figured it out - he has a CloudFlare setting enabled to advertise his hits to crawlers.