A few hours after the marketing site went live, we pointed a standard-tier NSPEC run at nspec.dev. This is the part of the post where a founder is supposed to say the tool passed with flying colours. It didn’t. It found four real bugs, and every one of them was a thing I thought I had already checked.
Here’s what came back.
The run
Parameters: `tier=standard`, six viewports (desktop 1440, laptop 1280, tablet portrait + landscape, mobile portrait + landscape), no login, accessibility category skipped. Wall-clock time: roughly twelve minutes, most of it spent in the verifier.
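For concreteness, the run parameters above can be written down as a config object. This is a hypothetical shape for illustration only; the field names and viewport heights are my assumptions, not NSPEC's actual API.

```typescript
// Hypothetical run configuration mirroring the parameters above.
// Field names are illustrative; NSPEC's real config shape may differ.
interface RunConfig {
  tier: string;
  viewports: { label: string; width: number; height: number }[];
  login: boolean;
  skipCategories: string[];
}

const run: RunConfig = {
  tier: "standard",
  viewports: [
    { label: "desktop", width: 1440, height: 900 },        // heights are assumed
    { label: "laptop", width: 1280, height: 800 },
    { label: "tablet-portrait", width: 768, height: 1024 },
    { label: "tablet-landscape", width: 1024, height: 768 },
    { label: "mobile-portrait", width: 390, height: 844 },
    { label: "mobile-landscape", width: 844, height: 390 },
  ],
  login: false,
  skipCategories: ["accessibility"],
};
```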
The bugs
- P1 · Mobile menu close button unclickable. The hamburger opened the menu fine. The X to close it sat inside the header (`z-50`), but the backdrop overlay was also `z-40` and covered the full viewport, including the header area. The click went to the backdrop, not the button. I had tested opening the menu ten times. I had never once closed it without reloading.
- P1 · “Reserve my seat” on the Team tier was dead. The popular pricing tier had a button with an `onClick` that pointed to a stale handler. It rendered. It hovered. It did nothing. This is the highest-intent CTA on the entire page.
- P2 · “Talk to us” on Enterprise was also dead. Same pattern, different tier.
- P2 · Eight footer links were `href="#"` stubs. Privacy, Terms, Blog, Changelog, and the social icons pointed nowhere. The individual verdict on each was mild; the collective signal was “unfinished project.”
What the verifier did
The thing that matters more than the bugs is what didn’t make it into the report. The first pass through the agents generated roughly forty candidates. After the verifier and the server-side quality gates, four survived. The rest were:
- Third-party console noise (font 404s from a CDN that had a bad minute, a rate-limited telemetry envelope, a browser-extension injection).
- Layout warnings that reproduced in one particular browser context state but disappeared in the fresh verifier context.
- Duplicate screenshots from two agents covering the same surface.
- One bug candidate from `ui-explorer` that `bug-verifier` could not reproduce across three attempts and marked as “low” confidence. Kicked.
Without the verifier, the report would have been forty items and the four real bugs would have been buried in the noise. With it, the ticket count equals the bug count.
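The winnowing described above — roughly forty candidates in, four tickets out — can be sketched as a filter over verifier verdicts. This is an illustrative model of the pattern, not NSPEC's internals; every field name here is an assumption.

```typescript
// Illustrative sketch of the verifier pattern described above.
// Candidate fields are assumptions, not NSPEC's real schema.
interface Candidate {
  id: string;
  reproducedInFreshContext: boolean; // survived a clean browser context?
  thirdPartyNoiseOnly: boolean;      // e.g. CDN 404s, extension injections
  duplicateOf?: string;              // same surface already covered by another agent
  confidence: "low" | "medium" | "high";
}

// A candidate becomes a ticket only if it reproduces in a fresh context,
// is not third-party noise, is not a duplicate, and is not low confidence.
function survivors(candidates: Candidate[]): Candidate[] {
  return candidates.filter(
    (c) =>
      c.reproducedInFreshContext &&
      !c.thirdPartyNoiseOnly &&
      c.duplicateOf === undefined &&
      c.confidence !== "low"
  );
}
```

Each rejection branch corresponds to one of the bullet points above: noise, context-dependent flakes, duplicates, and unreproducible low-confidence candidates.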
What filing looked like
Each survivor went into a local tracker fixture over the MCP connector path (we don’t route real tickets to our internal Linear during self-test runs). The ticket body contained the exact shape an external tracker receives at launch: title, numbered repro, acceptance criteria, severity with rationale, and the evidence bundle (highlighted.png, fullpage.png, dom.html, console.log, network.json, verifier.json).
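The ticket shape described above can be written as a type. The field names are my guesses at a plausible schema for illustration; the connector's actual payload may look different, though the listed evidence files come straight from the post.

```typescript
// Hypothetical ticket payload mirroring the fields listed above.
// The real MCP connector schema may differ; names are illustrative.
type Severity = "P1" | "P2" | "P3";

interface EvidenceBundle {
  "highlighted.png": string; // paths or URLs to the captured artifacts
  "fullpage.png": string;
  "dom.html": string;
  "console.log": string;
  "network.json": string;
  "verifier.json": string;
}

interface Ticket {
  title: string;
  repro: string[];              // numbered reproduction steps
  acceptanceCriteria: string[];
  severity: Severity;
  severityRationale: string;
  evidence: EvidenceBundle;
}
```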
The fix turnaround
All four shipped within four hours of the run. The mobile menu fix was three lines (`z-[60]` on the button container, `top-16` on the overlay so it starts below the header). The pricing CTAs became a discriminated union on tier kind, so Enterprise routes to a prefilled mailto and waitlist tiers open the inline form. The footer stubs became real routes.
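The discriminated-union fix for the CTAs looks roughly like this. It's a sketch under assumptions: the tier variants and handler names are illustrative, not the site's actual code.

```typescript
// Sketch of the discriminated-union CTA fix described above.
// Tier variants and handler names are illustrative, not the real site code.
type Tier =
  | { kind: "enterprise"; salesEmail: string }
  | { kind: "waitlist"; name: string };

type CtaAction =
  | { type: "mailto"; href: string }
  | { type: "openInlineForm"; tier: string };

// Every `kind` must be handled, or the `never` check below fails to
// compile — a stale handler becomes a type error instead of a dead button.
function ctaAction(tier: Tier): CtaAction {
  switch (tier.kind) {
    case "enterprise":
      return { type: "mailto", href: `mailto:${tier.salesEmail}` };
    case "waitlist":
      return { type: "openInlineForm", tier: tier.name };
    default: {
      const exhaustive: never = tier; // compile-time exhaustiveness check
      throw new Error(`unhandled tier: ${JSON.stringify(exhaustive)}`);
    }
  }
}
```

The design point is the `never` branch: adding a new tier kind without wiring its CTA is a compile error, which is exactly the failure mode the two dead buttons represented.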
The point
The point of dogfooding is not that the tool didn’t find bugs. The point is that a pre-launch marketing site, written by the same person who built the tool that tests it, still had two P1s and two P2s the operator hadn’t caught. The verifier pattern is the reason those four ended up in the ticket queue instead of the trash folder with the other thirty-six.
If you want to join the first cohort, the form is on the home page. If you want to argue about the architecture, the next post is for you.