How does NSPEC verify bugs?

An independent bug-verifier agent re-runs the repro in a fresh browser context, up to three times. Only bugs that reproduce, with a manual-grade confidence score, make it into the report.

Do you need access to my source code?

No. NSPEC tests the running UI. You give it a URL and, optionally, a login. It never reads your repo unless you opt in to git-diff-based risk prioritization.

Which viewports are covered?

Six viewports at launch: desktop 1440, laptop 1280, tablet portrait and landscape, mobile portrait and landscape.

Can I self-host NSPEC?

Yes, on Enterprise. Docker and Helm, with BYO LLM (OpenAI, Anthropic, or a local model). Artifacts never leave your network.

The five categories of bugs that manual QA keeps missing

After several months of beta runs across a couple dozen codebases, the same five categories of bug show up again and again. These are the ones that make it past code review, past CI, past manual QA, and land in production. That is not because anyone is bad at their job; it is because these defects are invisible to the processes that are supposed to find them.

1. Viewport drift

A component renders correctly at desktop widths, wraps oddly at the tablet breakpoint, and truncates at mobile. Manual QA catches this only if someone thinks to resize the browser, which they usually don’t, because the Figma mock showed a desktop view. Playwright runs in CI also usually target one viewport, not six.

Why NSPEC catches it: every route is audited at all six viewports by default. We diff the rendered output against layout invariants (text fits, controls are reachable, touch targets meet size rules) and flag drift as a P2 or P3.
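To make the three invariants concrete, here is a minimal sketch of a layout check in TypeScript. It is an illustration under stated assumptions, not NSPEC’s actual implementation: the `MeasuredNode` shape, the `checkLayoutInvariants` name, and the 44-pixel minimum touch target are all hypothetical, and the measurements are assumed to come from whatever browser driver you already use.

```typescript
// Hypothetical shape for a node measured in the rendered page.
interface MeasuredNode {
  selector: string;
  x: number;            // left edge in CSS pixels
  width: number;        // rendered box width
  height: number;       // rendered box height
  scrollWidth: number;  // content width; larger than width means overflow
  interactive: boolean; // buttons, links, inputs
}

interface Violation {
  selector: string;
  rule: "text-fits" | "reachable" | "touch-target";
}

// Assumed minimum touch target in CSS pixels; tune to your own size rules.
const MIN_TOUCH_TARGET = 44;

function checkLayoutInvariants(
  nodes: MeasuredNode[],
  viewportWidth: number,
  isTouchViewport: boolean,
): Violation[] {
  const violations: Violation[] = [];
  for (const node of nodes) {
    // Text fits: content must not be wider than its box.
    if (node.scrollWidth > node.width) {
      violations.push({ selector: node.selector, rule: "text-fits" });
    }
    // Reachable: interactive controls must sit inside the viewport horizontally.
    if (node.interactive && (node.x < 0 || node.x + node.width > viewportWidth)) {
      violations.push({ selector: node.selector, rule: "reachable" });
    }
    // Touch target: only enforced on touch viewports.
    if (
      isTouchViewport &&
      node.interactive &&
      (node.width < MIN_TOUCH_TARGET || node.height < MIN_TOUCH_TARGET)
    ) {
      violations.push({ selector: node.selector, rule: "touch-target" });
    }
  }
  return violations;
}
```

Running the same measured button through this check at a 1440-wide desktop viewport and again at a narrow touch viewport is what turns “looks fine on my machine” into a per-viewport list of violations.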

2. State bugs hidden behind the happy path

The login form works when you enter a valid email and password. It also needs to handle: empty submit, invalid email shape, wrong-password, server error, network error, throttle, password manager auto-fill timing, paste-in-password-reveal, and “enter” vs button click. Manual QA tests the happy path because it’s the fastest. CI tests the happy path because that’s what Playwright scripts usually cover.

Why NSPEC catches it: the ui-explorer agent doesn’t have the bias that says “of course you submit with valid input first.” It walks the state space around the form and captures every observable outcome the form produces. Server-side gates reject repeats, so you get one ticket per actual bug, not one per invalid input.
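One way to get “one ticket per actual bug” is to fingerprint each observation by where it happened and what went wrong, rather than by which input triggered it. The sketch below assumes that approach; the `Observation` and `Ticket` shapes and the `dedupeIntoTickets` function are hypothetical names for illustration, not NSPEC’s server-side gate.

```typescript
// Hypothetical record of one thing the explorer tried and what it saw.
interface Observation {
  input: string;   // label for what was tried, e.g. "empty submit"
  route: string;   // where it was tried
  symptom: string; // what the form observably did wrong
}

interface Ticket {
  fingerprint: string;
  route: string;
  symptom: string;
  triggeringInputs: string[];
}

// Fingerprint on where and what, not on which input triggered it.
function fingerprint(o: Observation): string {
  return `${o.route}::${o.symptom.trim().toLowerCase()}`;
}

function dedupeIntoTickets(observations: Observation[]): Ticket[] {
  const byFingerprint = new Map<string, Ticket>();
  for (const o of observations) {
    const key = fingerprint(o);
    const existing = byFingerprint.get(key);
    if (existing) {
      // Same bug seen again: record the extra trigger instead of opening a new ticket.
      existing.triggeringInputs.push(o.input);
    } else {
      byFingerprint.set(key, {
        fingerprint: key,
        route: o.route,
        symptom: o.symptom,
        triggeringInputs: [o.input],
      });
    }
  }
  return Array.from(byFingerprint.values());
}
```

With this shape, an empty submit and a malformed email that both leave the spinner stuck collapse into a single ticket listing two triggering inputs.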

3. Console poison

The page looks fine. Every network request is 200. The console has 60 errors per page load, most of them from third parties (GA, Sentry, ad-tech, fonts), but a few of them are genuinely yours: a broken effect dep, a missing asset, a hydration mismatch you suppressed. Because the page looks fine, nobody looks at the console. The actual bug compounds over weeks until one of its adjacent failure modes finally breaks the UI.

Why NSPEC catches it: every run captures console output per step, the third-party filter drops the GA/Sentry noise, and what’s left gets triaged by the bug-verifier. If it’s reproducible and not noise, it’s a ticket.
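A noise filter of this kind can be as simple as comparing the origin of the script that logged the message with the origin of your app. The following is a rough sketch under that assumption; `ConsoleEntry`, `isFirstParty`, and `firstPartyErrors` are hypothetical names, and a production filter would likely need more than an origin check.

```typescript
// Hypothetical shape for one captured console message.
interface ConsoleEntry {
  level: "log" | "warning" | "error";
  text: string;
  sourceUrl: string; // URL of the script that emitted the message; may be empty
}

function isFirstParty(entry: ConsoleEntry, appOrigin: string): boolean {
  // Unknown source: keep it for triage rather than silently dropping it.
  if (entry.sourceUrl === "") return true;
  try {
    return new URL(entry.sourceUrl).origin === new URL(appOrigin).origin;
  } catch {
    // Unparseable URL: also keep it.
    return true;
  }
}

// Keep only errors that come from the app's own origin.
function firstPartyErrors(entries: ConsoleEntry[], appOrigin: string): ConsoleEntry[] {
  return entries.filter((e) => e.level === "error" && isFirstParty(e, appOrigin));
}
```

The design choice here is to fail open: anything the filter cannot classify stays in the triage queue, because a dropped first-party error costs more than one extra line of noise.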

4. Data-shape drift

An API response field gets renamed, or a new field gets introduced, or an array becomes sometimes-empty. The frontend renders the new shape without crashing, but it shows blank values, falls through to an “unknown” state, or renders “undefined%” on a stat card. Code review missed it because the frontend changes looked harmless. CI missed it because the fixture data still matched the old shape.

Why NSPEC catches it: the component-auditor asserts against the rendered node, not a synthetic fixture. It knows that “RTP: undefined%” is wrong regardless of what the backend said it should be.
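As a rough illustration of asserting against rendered text, the sketch below flags nodes that are blank or that contain a value stringified without a check. The `RenderedNode` shape, the `auditRenderedText` name, and the token list are assumptions for this example; a real auditor would need context to avoid flagging legitimate copy that happens to contain a word like “null”.

```typescript
// Hypothetical shape: one rendered text node captured from the live page.
interface RenderedNode {
  selector: string;
  text: string;
}

interface Finding {
  selector: string;
  reason: "blank" | "leaked-value";
}

// Tokens that usually mean a value was stringified without being checked first.
const LEAKED_VALUE = /\b(undefined|null|NaN)\b|\[object Object\]/;

function auditRenderedText(nodes: RenderedNode[]): Finding[] {
  const findings: Finding[] = [];
  for (const node of nodes) {
    if (node.text.trim() === "") {
      // The field rendered, but nothing made it into the DOM.
      findings.push({ selector: node.selector, reason: "blank" });
    } else if (LEAKED_VALUE.test(node.text)) {
      findings.push({ selector: node.selector, reason: "leaked-value" });
    }
  }
  return findings;
}
```

Because the check runs on what the user would actually see, it does not matter whether the fixture data in CI still matches the old response shape.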

5. Regression by accretion

The most expensive category. A feature quietly stops working because a seemingly unrelated refactor landed four sprints ago. Nothing fails loudly. Engagement with the feature just gradually drops until someone notices it on a metrics dashboard and has to reconstruct the git history to find the culprit.

Why NSPEC catches it: project memory. Every run remembers the output of the last run on the same surface. When the rendered state of a feature diverges from its prior baseline (widget now returns empty, CTA is no longer clickable, form no longer submits), it becomes a ticket with a first-seen timestamp pointing at the deploy that caused it.
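The baseline comparison can be sketched as a diff between two snapshots of the same surface. This is a simplified illustration, not NSPEC’s project memory: the `Snapshot` format (check name mapped to an observed value), the `diffAgainstBaseline` function, and the check names in the comments are all hypothetical.

```typescript
// Hypothetical snapshot: check name mapped to the value observed in a run,
// e.g. { "cta.clickable": "true", "widget.items": "12" }.
type Snapshot = Record<string, string>;

interface Regression {
  check: string;
  before: string;
  after: string;
  firstSeen: string; // timestamp of the run in which the divergence appeared
}

function diffAgainstBaseline(
  baseline: Snapshot,
  current: Snapshot,
  runAt: string,
): Regression[] {
  const regressions: Regression[] = [];
  for (const check of Object.keys(baseline)) {
    const before = baseline[check];
    // A check that vanished from the current run counts as a divergence too.
    const after = current[check] ?? "missing";
    if (before !== undefined && before !== after) {
      regressions.push({ check, before, after, firstSeen: runAt });
    }
  }
  return regressions;
}
```

Stamping each divergence with the run time is what lets a ticket point at a specific deploy instead of leaving someone to bisect four sprints of history.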

The common thread

Every one of these bugs is invisible to the process that’s supposed to catch it. Code review sees a diff, not a rendered UI. CI sees a script, not the state space. Manual QA sees the happy path, not the console. The gap isn’t skill; it’s that the expected failure modes of a modern frontend outnumber the expected coverage of the people and processes guarding it.

Autonomous QA works not because the agents are clever, but because they’re indifferent. They don’t tire of resizing windows, they don’t skip the console on a page that “looks fine,” and they don’t remember the happy path more vividly than the edges. That indifference is the feature.

Want us to run a pass on your app? The waitlist is on the home page.