
I ported JustHTML from Python to JavaScript with Codex CLI and GPT-5.2 in 4.5 hours


I wrote about JustHTML yesterday - Emil Stenström's project to build a new standards-compliant HTML5 parser in pure Python code using coding agents running against the comprehensive html5lib-tests testing library. Last night, purely out of curiosity, I decided to try porting JustHTML from Python to JavaScript with the least amount of effort possible, using Codex CLI and GPT-5.2. It worked beyond my expectations.

TL;DR

I built simonw/justjshtml, a dependency-free HTML5 parsing library in JavaScript which passes 9,200 tests from the html5lib-tests suite and imitates the API design of Emil's JustHTML library.

It took two initial prompts and a few tiny follow-ups. GPT-5.2 running in Codex CLI ran uninterrupted for several hours, burned through 1,464,295 input tokens, 97,122,176 cached input tokens and 625,563 output tokens and ended up producing 9,000 lines of fully tested JavaScript across 43 commits.

Time elapsed from project idea to finished library: about 4 hours, during which I also bought and decorated a Christmas tree with family and watched the latest Knives Out movie.

Some background

One of the most important contributions of the HTML5 specification ten years ago was the way it precisely specified how invalid HTML should be parsed. The world is full of invalid documents and having a specification that covers those means browsers can treat them in the same way - there's no more "undefined behavior" to worry about when building parsing software.

Unsurprisingly, those invalid parsing rules are pretty complex! The free online book Idiosyncrasies of the HTML parser by Simon Pieters is an excellent deep dive into this topic, in particular Chapter 3. The HTML parser.

The Python html5lib project started the html5lib-tests repository with a set of implementation-independent tests. These have since become the gold standard for interoperability testing of HTML5 parsers, and are used by projects such as Servo, which used them to help build html5ever, a "high-performance browser-grade HTML5 parser" written in Rust.
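To make that concrete, here is a sketch of what a tokenizer-style entry in that suite looks like. The JSON shape - a `tests` array pairing an `input` string with the exact token stream a conforming tokenizer must emit - follows the repo's documented format; the specific example entry is mine, not copied from the suite:

```javascript
// Sketch of the html5lib-tests tokenizer format. Each .test file is JSON
// with a "tests" array; every entry pairs an input string with the token
// stream a conforming tokenizer must produce for it.
const sample = {
  tests: [
    {
      description: "Simple start tag with one attribute",
      input: '<a href="/x">hi</a>',
      output: [
        ["StartTag", "a", { href: "/x" }],
        ["Character", "hi"],
        ["EndTag", "a"],
      ],
    },
  ],
};

// A harness only needs to run its tokenizer on `input` and deep-compare
// the result against `output` - which is what makes the suite such a
// good fit for an unsupervised coding agent.
for (const test of sample.tests) {
  console.log(`${test.description}: ${test.output.length} expected tokens`);
}
```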

Emil Stenström's JustHTML project is a pure-Python implementation of an HTML5 parser that passes the full html5lib-tests suite. Emil spent a couple of months working on this as a side project, deliberately picking a problem with a comprehensive existing test suite to see how far he could get with coding agents.

At one point he had the agents rewrite it based on a close inspection of the Rust html5ever library. I don't know how much of this was direct translation versus inspiration - his project has 1,215 commits total so it appears to have included a huge amount of iteration, not just a straight port.

My project is a straight port. I instructed Codex CLI to build a JavaScript version of Emil's Python code.

The process in detail

I started with a bit of mise en place. I checked out two repos and created an empty third directory:

cd ~/dev
git clone https://github.com/EmilStenstrom/justhtml
git clone https://github.com/html5lib/html5lib-tests
mkdir justjshtml
cd justjshtml

Then I started Codex CLI for GPT-5.2 like this:

codex --yolo -m gpt-5.2

That --yolo flag is a shortcut for --dangerously-bypass-approvals-and-sandbox, which is every bit as dangerous as it sounds.

My first prompt told Codex to inspect the existing code and use it to build a specification for the new JavaScript library:

We are going to create a JavaScript port of ~/dev/justhtml - an HTML parsing library that passes the full ~/dev/html5lib-tests test suite. It is going to have a similar API to the Python library but in JavaScript. It will have no dependencies other than raw JavaScript, hence it will work great in the browser and node.js and other environments. Start by reading ~/dev/justhtml and designing the user-facing API for the new library - create a spec.md containing your plan.

I reviewed the spec, which included a set of proposed milestones, and told it to add another:

Add an early step to the roadmap that involves an initial version that parses a simple example document that is valid and returns the right results. Then add and commit the spec.md file.

Here's the resulting spec.md file. My request for that initial version became "Milestone 0.5" which looked like this:

Milestone 0.5 — End-to-end smoke parse (single valid document)

  • Implement the smallest end-to-end slice so the public API is real early:
    • new JustHTML("<html><head></head><body><p>Hello</p></body></html>") returns a tree with the expected tag structure and text nodes.
    • doc.toText() returns "Hello" and doc.errors is empty for this valid input.
  • Add scripts/smoke.js (no deps) that runs the example and asserts the expected structure/output.
  • Gate: node scripts/smoke.js passes.

Then I told it:

Implement Milestone 0.5

And off it went. The resulting code appeared to work so I said:

Configure GitHub Actions test.yml to run that on every commit, then commit and push

I ran git init and created a private GitHub repository for this project at this point, and set up the local directory to push to that remote.

Here's that initial push. Then I told it:

OK do the rest, commit and push often

And that was almost it!

I set my laptop to not fall asleep and left it to its own devices while we went off to buy a Christmas tree. The "commit and push often" instruction meant I could monitor its progress on my phone by refreshing the commit log on GitHub.

I was running this against my $20/month ChatGPT Plus account, which has a five hour token allowance window for Codex CLI. That ran out at 6:35pm and Codex paused, so I waited until the reset point at 7:14pm and typed:

Keep going

At 9:30pm it declared itself done with the following summary message:

• What’s Done

  - html5lib-tests now run in CI and pass: tokenizer 6810/6810, tree 1770/1782
    (12 skipped: #script-on), serializer 230/230, encoding 82/83 (1 skipped:
    scripted) (.github/workflows/test.yml:1).
  - Public API expanded: stream() (src/stream.js:1), CSS selectors query()/
    matches() (src/selector.js:1), and toMarkdown() (src/markdown.js:1), all
    exported via src/index.js:1.
  - Added no-deps unit scripts for selectors/stream/markdown (scripts/run-
    selector-tests.js:1, scripts/run-stream-tests.js:1, scripts/run-markdown-
    tests.js:1) and wired them into CI.
  - Fixed TreeBuilder so <br> no longer closes <p> (src/treebuilder.js:647).

As a finishing touch, I had it add a playground interface so I could try out the new library in my browser. I prompted:

Add a playground.html in the top level folder which loads the necessary ES modules from ./src/... and implements the exact same functionality as seen on https://tools.simonwillison.net/justhtml but using the JavaScript library instead of Pyodide

It fetched my existing playground page using curl and built a new playground.html file that loaded the new JavaScript code instead. This worked perfectly.

I enabled GitHub Pages for my still-private repo which meant I could access the new playground at this URL:

https://simonw.github.io/justjshtml/playground.html

Screenshot of the JustJSHTML Playground: an HTML input area with "Paste HTML" and "Fetch from URL" options, mode buttons for CSS Selector Query, Pretty Print HTML, Tree Structure, Stream Events, Extract Text and To Markdown, an optional CSS selector field, and an output panel showing the example document converted to Markdown.

All it needed now was some documentation:

Add a comprehensive README with full usage instructions including attribution plus how this was built plus how to use it in HTML plus how to use it in Node.js

You can read the result here.

We are now at eight prompts total, running for just over four hours, and I've decorated for Christmas and watched Wake Up Dead Man on Netflix.

According to Codex CLI:

Token usage: total=2,089,858 input=1,464,295 (+ 97,122,176 cached) output=625,563 (reasoning 437,010)

My llm-prices.com calculator estimates that at $29.41 if I was paying for those tokens at API prices, but they were included in my $20/month ChatGPT Plus subscription so the actual extra cost to me was zero.
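For anyone who wants to reproduce that kind of estimate, the arithmetic is just a weighted sum of the three token counts. In this sketch the per-million-token rates are illustrative placeholders, not OpenAI's actual GPT-5.2 pricing, so the resulting figure differs from the one above:

```javascript
// Estimate an API cost from the token counts Codex CLI reported.
// The rates are illustrative placeholders (dollars per million tokens),
// not OpenAI's actual GPT-5.2 prices.
const rates = { input: 1.25, cachedInput: 0.125, output: 10.0 };

function estimateCost(usage, rates) {
  return (
    (usage.input * rates.input +
      usage.cachedInput * rates.cachedInput +
      usage.output * rates.output) /
    1_000_000
  );
}

// Token counts from the Codex CLI usage line quoted above.
const usage = { input: 1_464_295, cachedInput: 97_122_176, output: 625_563 };
console.log(`$${estimateCost(usage, rates).toFixed(2)}`);
```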

What can we learn from this?

I'm sharing this project because I think it demonstrates a bunch of interesting things about the state of LLMs in December 2025.

  • Frontier LLMs really can perform complex, multi-hour tasks with hundreds of tool calls and minimal supervision. I used GPT-5.2 for this but I have no reason to believe that Claude Opus 4.5 or Gemini 3 Pro would not be able to achieve the same thing - the only reason I haven't tried is that I don't want to burn another 4 hours of time and several million tokens on more runs.
  • If you can reduce a problem to a robust test suite you can set a coding agent loop loose on it with a high degree of confidence that it will eventually succeed. I called this designing the agentic loop a few months ago. I think it's the key skill to unlocking the potential of LLMs for complex tasks.
  • Porting entire open source libraries from one language to another via a coding agent works extremely well.
  • Code is so cheap it's practically free. Code that works continues to carry a cost, but that cost has plummeted now that coding agents can check their work as they go.
  • We haven't even begun to unpick the etiquette and ethics around this style of development. Is it responsible and appropriate to churn out a direct port of a library like this in a few hours while watching a movie? What would it take for code built like this to be trusted in production?
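The second point - reduce the problem to a robust test suite, then let the agent loop - can be sketched in a few lines. Both `runTests` and `askModelForPatch` here are stand-ins for a real test harness and a real model call, not anything from Codex CLI:

```javascript
// Minimal, self-contained sketch of the "agentic loop" pattern:
// run the tests, hand the failures back to the model, repeat until green.

function runTests(code) {
  // Stand-in for running html5lib-tests: "passes" once the code
  // contains a fix marker.
  return code.includes("fix") ? [] : ["tokenizer test 42 failed"];
}

function askModelForPatch(code, failures) {
  // Stand-in for an LLM call that patches the code given the failures.
  return code + " fix";
}

let code = "initial port";
let failures = runTests(code);
let iterations = 0;
while (failures.length > 0 && iterations < 10) {
  code = askModelForPatch(code, failures);
  failures = runTests(code);
  iterations++;
}
console.log(`done after ${iterations} iteration(s), ${failures.length} failing`);
```

The bounded loop plus an objective pass/fail signal is the whole trick: the agent can check its own work after every patch.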

I'll end with some open questions:

  • Does this library represent a legal violation of copyright of either the Rust library or the Python one?
  • Even if this is legal, is it ethical to build a library in this way?
  • Does this format of development hurt the open source ecosystem?
  • Is it responsible to publish software libraries built in this way?
  • How much better would this library be if an expert team hand crafted it over the course of several months?

Tags: html, javascript, python, ai, generative-ai, llms, ai-assisted-programming, gpt-5, codex-cli

samuel
9 hours ago
Well this is blowing my mind. Love the waypoint of the minimal test case.
denismm
7 hours ago
Interesting that he runs the entire experiment and then says "is this legal? Is this ethical?"

'Tis the season for bright lights and big energy bills. Here are some ways to keep it budget-friendly.

A photo of Shawn Keeler's over-the-top holiday display at his Putnam Valley home.

Shawn Keeler has been going all out on Christmas decorations for decades.

Holiday decorations can zap a lot of energy, but there are ways to decorate and save on power bills.

samuel
2 days ago
Not sure I buy the 10% parasitic leakage from plugged in but turned off lights/inflatables. It’s way lower than that and that paragraph encourages paranoid behavior.

SCOTUS' denial of Texas case fuels First Amendment fears


The Supreme Court has declined to review a Llano County, Texas, fight over local officials' removal of a group of titles from public libraries that could reverberate far beyond local bookshelves.

The big picture: The clash that pits First Amendment rights against government and parental controls is playing out across the country, with access to books concerning race, gender and identity often caught in the crosshairs.


  • "Our government inserted itself into the personal reading choices of the citizens," lead plaintiff Leila Green Little told Axios in an interview on Tuesday. "If that's not something important to speak up for, I don't know what is."

Driving the news: The Supreme Court on Monday declined to take up the challenge against the removal of more than a dozen books from a central Texas county's libraries, letting stand an appeals court ruling that rejected the plaintiff's argument that the right to receive information extends to public libraries.

  • Writing for the 5th Circuit earlier this year, Judge Stuart Kyle Duncan, a Trump appointee, stated "[n]o one is banning" books, adding that if a "disappointed patron can't find a book in the library, he can order it online, buy it from a bookstore, or borrow it from a friend."

Threat level: Retired librarian Carolyn Foote, a co-founder of the Texas FReadom Fighters, told Axios the Supreme Court's decision not to hear the case will "embolden people who are trying to restrict books from the public, from students."

  • She continued, "I do believe library users have First Amendment rights, and I do believe that local communities should be able to fight for the kinds of stories they want to see on their local shelves."
  • Little said she's already seen the appeals court decision in the case she and her fellow plaintiffs brought cited elsewhere in what she said are attempts at "further censorship at public libraries and in public school libraries."
  • She told Axios, "This is only the beginning."

Flashback: Little and six fellow plaintiffs sued a group of local officials in April 2022 for the removal of books such as "Caste: The Origins of Our Discontents" and "They Called Themselves the K.K.K.: The Birth of an American Terrorist Group."

  • In 2021, a Texas lawmaker turned heads with his list of hundreds of books he deemed objectionable.
  • After that, some Llano County residents alerted local commissioners of what they described as "Pornographic Filth" in libraries, Axios' Asher Price reported based on emails, prompting the county judge (the chief elected official in the county) to tell the library director to pull books.
  • Suzette Baker, who served as the head librarian of one of the county's libraries, alleged she was fired for insubordination after she refused to remove books from the shelves that dealt with gender and race.

Catch up quick: A federal judge in 2023 ordered Llano County officials to return the books to the shelves. But earlier this year, a majority of the full 5th U.S. Circuit Court of Appeals reversed that order.

  • The appeals court decision sets precedent in Texas, Louisiana and Mississippi.
  • In their request to the Supreme Court, the Llano County residents described the removals as "censorship." But the officials argued the library system was routinely weeding books and that the decisions were not related to the content within them.
  • County Judge Ron Cunningham did not immediately respond to Axios' request for comment.

Yes, but: "When you remove books as a librarian, it's through a weeding process ... you review sections of the library, like a whole section of the library at a time," Foote said.

  • She added, "weeding is not when someone sends you a list of books they're concerned about, and you go pluck them off the shelf."

What they're saying: Sam Helmick, the president of the American Library Association, said in a statement emailed to Axios that the Supreme Court's denial left "millions of library users" with "a diminished right to read and explore information free from government interference."

  • Helmick added, "The ruling threatens to transform government libraries into centers for indoctrination instead of protecting them as hubs of open inquiry."

Go deeper: Book wars: Texas county poised to close its libraries



samuel
6 days ago
This court ruling is a loss of access for people who can't afford a book. I'm exposed to a more diverse set of books at the library than I am on amazon because I'll check out books that I would never buy, but I still get exposed. That's why libraries are targeted.
acdha
6 days ago
This is also for people who don't have perfect financial privacy: if you're, say, a gay kid, you're not buying anything on Amazon where it's permanently in your order history, whereas the local librarians have VERY STRONG OPINIONS about patron privacy.

Apple Says Original iPhone SE is Now 'Obsolete'

Apple today added the first-generation iPhone SE to its obsolete products list, meaning the device is no longer eligible for repairs, battery replacements, or any other service at Apple Stores and Apple Authorized Service Providers worldwide.


Apple considers a product to be obsolete once seven years have passed since the company stopped distributing it for sale. The original iPhone SE was discontinued in September 2018, so the device recently crossed that seven-year mark.

The original iPhone SE was released in March 2016. The device's design is largely based on the iPhone 5s, with key specs including a 4-inch display, a Touch ID home button, and an aluminum and glass frame with chamfered edges. However, the original iPhone SE is powered by a newer A9 chip from the iPhone 6s and iPhone 6s Plus.

"Everyone who wants a smaller phone is going to love iPhone SE," said Apple's former marketing chief Phil Schiller, in a press release announcing the device.

Apple went on to release second-generation and third-generation iPhone SE models in April 2020 and March 2022, respectively, with both of those devices having a similar design to the iPhone 8. In February 2025, the iPhone SE was entirely discontinued for the foreseeable future, after it was effectively replaced by the iPhone 16e.

This article, "Apple Says Original iPhone SE is Now 'Obsolete'" first appeared on MacRumors.com


samuel
14 days ago
Still sad my iPhone SE 3 is the end of the line.

8 Foods You Should Never Cook In Cast Iron And 3 You Should

Cast iron skillets are revered by many chefs and home cooks. However, there are certain foods that just don't vibe well with cast iron and should be avoided.



samuel
23 days ago
Cast iron is both inexpensive and the most versatile for cooking. Great article here with dos and don’ts.

Code like a surgeon


A lot of people say AI will make us all “managers” or “editors”…but I think this is a dangerously incomplete view!

Personally, I’m trying to code like a surgeon.

A surgeon isn’t a manager, they do the actual work! But their skills and time are highly leveraged with a support team that handles prep, secondary tasks, admin. The surgeon focuses on the important stuff they are uniquely good at.

My current goal with AI coding tools is to spend 100% of my time doing stuff that matters. (As a UI prototyper, that mostly means tinkering with design concepts.)

It turns out there are a LOT of secondary tasks which AI agents are now good enough to help out with. Some things I’m finding useful to hand off these days:

  • Before attempting a big task, write a guide to relevant areas of the codebase
  • Spike out an attempt at a big change. Often I won’t use the result but I’ll review it as a sketch of where to go
  • Fix typescript errors or bugs which have a clear specification
  • Write documentation about what I’m building

I often find it useful to run these secondary tasks async in the background – while I’m eating lunch, or even literally overnight!

When I sit down for a work session, I want to feel like a surgeon walking into a prepped operating room. Everything is ready for me to do what I’m good at.

Mind the autonomy slider

Notably, there is a huge difference between how I use AI for primary vs secondary tasks.

For the core design prototyping work, I still do a lot of coding by hand, and when I do use AI, I’m more careful and in the details. I need fast feedback loops and good visibility. (eg, I like Cursor tab-complete here)

Whereas for secondary tasks, I’m much much looser with it, happy to let an agent churn for hours in the background. The ability to get the job done eventually is the most important thing; speed and visibility matter less. Claude Code has been my go-to for long unsupervised sessions but Codex CLI is becoming a strong contender there too, possibly my new favorite.

These are very different work patterns! Reminds me of Andrej Karpathy’s “autonomy slider” concept. It’s dangerous to conflate different parts of the autonomy spectrum – the tools and mindset that are needed vary quite a lot.

Your agent doesn’t need a career trajectory

The “software surgeon” concept is a very old idea – Fred Brooks attributes it to Harlan Mills in his 1975 classic “The Mythical Man-Month”. He talks about a “chief programmer” who is supported by various staff including a “copilot” and various administrators. Of course, at the time, the idea was to have humans be in these support roles.

OK, so there is a super obvious angle here, that “AI has now made this approach economically viable where it wasn’t before”, yes yes… but I am also noticing a more subtle thing at play, something to do with status hierarchies.

A lot of the “secondary” tasks are “grunt work”, not the most intellectually fulfilling or creative part of the work. I have a strong preference for teams where everyone shares the grunt work; I hate the idea of giving all the grunt work to some lower-status members of the team. Yes, junior members will often have more grunt work, but they should also be given many interesting tasks to help them grow.

With AI this concern completely disappears! Now I can happily delegate pure grunt work. And the 24/7 availability is a big deal. I would never call a human intern at 11pm and tell them to have a research report on some code ready by 7am… but here I am, commanding my agent to do just that!

Notion is for surgeons?

Finally I’ll mention a couple thoughts on how this approach to work intersects with my employer, Notion.

First, as an employee, I find it incredibly valuable right now to work at a place that is bullish on AI coding tools. Having support for heavy use of AI coding tools, and a codebase that's well set up for it, is enabling serious productivity gains for me – especially as a newcomer to a big codebase.

Secondly, as a product – in a sense I would say we are trying to bring this way of working to a broader group of knowledge workers beyond programmers. When I think about how that will play out, I like the mental model of enabling everyone to “work like a surgeon”.

The goal isn’t to delegate your core work, it’s to identify and delegate the secondary grunt work tasks, so you can focus on the main thing that matters.


