A week ago, Cursor CEO Michael Truell celebrated what looked like a remarkable milestone.
“We built a browser with GPT-5.2 in Cursor,” he said in a social media post. “It ran uninterrupted for one week.”
This browser, he said, consisted of three million lines of code across thousands of files. “The rendering engine is from-scratch in Rust with HTML parsing, CSS cascade, layout, text shaping, paint, and a custom JS VM.”
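The components Truell lists correspond to the classic stages of a browser rendering pipeline. As a purely schematic sketch (not FastRender’s actual code; every name below is invented for illustration), the stages chain together roughly like this:

```rust
// Schematic rendering pipeline: parse -> cascade -> layout -> paint.
// Each stage is a stub that just tags its input; real engines
// implement each of these in hundreds of thousands of lines.

struct Dom(String); // parsed document tree (stubbed as a string)
struct StyledTree(String); // DOM with the CSS cascade applied
struct LayoutTree(String); // boxes with computed sizes and positions

fn parse_html(src: &str) -> Dom {
    Dom(format!("dom({src})"))
}

fn cascade(dom: &Dom, css: &str) -> StyledTree {
    StyledTree(format!("styled({}, {css})", dom.0))
}

fn layout(styled: &StyledTree) -> LayoutTree {
    LayoutTree(format!("laid_out({})", styled.0))
}

fn paint(tree: &LayoutTree) -> String {
    format!("pixels({})", tree.0)
}

fn render(html: &str, css: &str) -> String {
    let dom = parse_html(html);
    let styled = cascade(&dom, css);
    let boxes = layout(&styled);
    paint(&boxes)
}

fn main() {
    println!("{}", render("<p>hi</p>", "p { color: red }"));
}
```

Even this toy shape hints at why browsers are hard: each arrow hides a specification running to thousands of pages, plus a JS VM that can mutate the tree at any point.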
An 88 percent job failure rate is indicative of a code base that doesn’t work
“It *kind of* works! It still has issues and is of course very far from WebKit/Chromium parity, but we were astonished that simple websites render quickly and mostly correctly,” he added.
Some developers managed to compile the code after some bug fixes. Others reported success after revising the build instructions.
But by and large, developers aren’t convinced Cursor has made a breakthrough.
Jason Gorman, managing director of Codemanship, a UK-based software development consultancy, argues it’s evidence that agentic AI scales to produce broken software.
Oliver Medhurst, a software engineer and former Mozillan who participates in Ecma International’s TC39 standards group, concurs. Asked whether there’s anything more here than a demonstration that AI agents can produce large projects of not very high quality code, he said that’s a good summary of the project.
“It’s impressive that it does somewhat work and that it can (poorly) manage a codebase of that size, but I would say that’s all that is impressive,” Medhurst told The Register in an email. “Cursor said it was just a demonstration and I think it’s fair to call it as such, but it’s definitely not a good browser engine, objectively. Another point is that it’s incredibly bloated. Ladybird and Servo do far more in far fewer lines of code (Ladybird and Servo repos are both ~1M).”
Not an easy task
Writing a web browser is one of the most challenging general-purpose applications a programmer can take on. Chromium, the open source foundation of Google Chrome, has more than 37 million lines of code.
Cursor didn’t quite go that far: Its browser, dubbed FastRender, consists of about three million lines of code, according to Truell.
Software developer Joshua Marinacci wrote back in 2022 about how complicated the web had become, to the point where “only a few companies can implement a browser from scratch.”
The fact that Microsoft stopped developing its own browser engine and moved Edge onto Chromium attests to the enormous engineering resources required to develop and maintain a browser and the underlying rendering technology.
Cursor software engineer Wilson Lin, who worked on the browser code, published a blog post elaborating on the goals for the project: “to understand how far we can push the frontier of agentic coding for projects that typically take human teams months to complete.”
The Register asked Cursor and Lin for comment but we haven’t heard back.
Marinacci’s post, despite its warnings about the complexity of browsers, nonetheless concluded by urging developers to try their hand at browser development. One way to do that is to make use of existing components.
Critics accused Cursor of leaning heavily on Servo, the open-source Rust-based rendering engine spun out of Mozilla.
But Lin, in an online discussion, rejected the claim that FastRender was cobbled together from libraries and frameworks. “I would push back on the idea that all the agents did was wire up dependencies — the JS VM, DOM, paint systems, chrome, text pipeline, are all being developed as part of this project, and there are real complex systems being engineered towards the goal of a browser engine, even if not there yet,” he wrote.
Codemanship’s Gorman remains unimpressed. In his blog post, he points out that the Actions performance metrics on the FastRender repo’s Insights page show the instability of the underlying code.
“An 88 percent job failure rate is very high,” he wrote. “It’s kind of indicative of a code base that doesn’t work.”
When we asked Gorman about reports of successful builds, he expressed skepticism, noting that the CI build is still failing.
AI development: ‘Same game, different dice’
While we’ve noted instances where experienced developers have reported using AI coding tools to good effect, Gorman is unmoved by such stories.
“When we look at the available non-partisan data (e.g., the latest DORA State of AI-Assisted Software Development report or the METR study that found that devs reported significant productivity gains but were found to [be] 19 percent slower on average), the trend is clear that developers greatly misjudge the impact on their own productivity, and that the majority of teams are negatively impacted on outcomes like lead times and release reliability,” he explained.
“The minority who see modest gains had already addressed the bottlenecks in their software development processes like testing, code review and integration. Most teams will never address these bottlenecks, mostly because they’re not actually looking at the outcomes.”
Gorman said that many of the more sensational claims about AI coding success come from developers working on small problems on their own, without a customer, users, or dependencies tied to other teams.
“They got the car up to 200 mph on a straight road with no other cars around and concluded that faster cars equals faster traffic,” he said. “Then they go back to the office and demand those kinds of speed-ups from their teams who are essentially driving in rush-hour traffic.”
When the measurement is output – lines of code, commits, pull requests – Gorman says there’s definitely an increase.
“But just because you attach a code-generating firehose to your plumbing, that doesn’t mean you’ll get a power shower,” he said. “A lot of teams are measuring the water pressure coming out of the firehose, not out of the shower. That’s what the evidence is showing.”
He went on to point to the absence of evidence that AI tools are leading to the creation of more software, as measured by the volume of products available in app stores, and to the lack of revenue attributable to these tools.
“Where is all this AI-generated software?” he said. “I’ve spent three years looking into this. I feel like James Randi at a spoon-bending convention sometimes.”
Gorman said AI technology is very impressive, but often wrong. “Do I think it’s of no value?” he said. “Absolutely not. I use it every day. As a coach and mentor, I feel I need to make sure I’ve got a good handle on how to use it, and on what works better.
“Do I think it’s a game-changer? No. The principles and practices that enabled high-performing dev teams before ‘AI’ are the exact same principles and practices that make them effective with it – small steps, tight feedback loops, continuous testing, code review and integration, and highly modular designs. Same game, different dice.”
He added, “If AI agents really could build a working 3 million LOC product in a week, when does the user/customer feedback happen in that design process? That’s where the real value gets discovered.” ®


