A new type of software agency has emerged in the last two years. They market themselves on speed. "Ship in days, not months." "10× faster with AI." They're not lying about the speed — they can move fast. What they're not telling you is what corners they cut to get there.
We call this vibe coding. The name comes from a common workflow: prompt an LLM with a rough description of what you want, accept the output if it looks plausible, move on. No deep reading of what was generated. No understanding of the abstractions the model chose. No consideration of what happens when the thing needs to change in six months.
Vibe coding can produce working software. The problem is that "working" and "correct" are not the same thing — and the gap between them tends to be invisible until it isn't.
The Invisible Tax
When you vibe code a payment flow, it might work correctly for 99% of transactions. The 1% that don't — edge cases the LLM didn't anticipate, race conditions in concurrent requests, decimal precision errors in currency calculations — won't show up in your test suite if you let the LLM write the tests too. They'll show up in production, at the worst possible time, in ways that are difficult to trace because nobody on your team actually understands the code that's running.
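To make the decimal-precision failure concrete, here is a hypothetical sketch (the function names are ours, not from any real codebase) of the kind of currency code that looks fine at a glance. The float version passes a demo; it quietly loses a cent when splitting a charge. The Decimal version conserves the total by pushing the rounding remainder onto the last share.

```python
# Illustrative only: the kind of currency bug that survives a casual review.
from decimal import Decimal, ROUND_HALF_UP

def split_evenly_float(total: float, parts: int) -> list[float]:
    # Plausible-looking generated code: divides, rounds, and silently loses cents.
    return [round(total / parts, 2) for _ in range(parts)]

def split_evenly_decimal(total: Decimal, parts: int) -> list[Decimal]:
    # Quantize each share to whole cents, then push the rounding
    # remainder onto the last share so the splits sum to the total.
    share = (total / parts).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    remainder = total - share * (parts - 1)
    return [share] * (parts - 1) + [remainder]

float_splits = split_evenly_float(100.00, 3)        # 3 × 33.33: a cent vanishes
decimal_splits = split_evenly_decimal(Decimal("100.00"), 3)
assert sum(decimal_splits) == Decimal("100.00")     # total is conserved
```

Nothing in a typical unit test — "split 100 three ways, get roughly 33.33" — distinguishes these two functions. Only the conservation property does.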
"The most dangerous line of code is the one you didn't write — the one you accepted."
This is the invisible tax of vibe coding. You pay nothing upfront and a great deal later. The payment comes in the form of bugs that are hard to reproduce, architecture that's expensive to change, and onboarding costs that are proportional to how little anyone understands the system they're maintaining.
What We Do Instead
We use AI heavily. Dismissing AI as a development tool in 2026 would be like dismissing version control in 2005 — a choice that says more about the chooser than the tool. The question is never whether to use AI. The question is how to use it without ceding the understanding that makes you a reliable engineer.
Our workflow has three stages:
1. Design first, generate second
Before we prompt an LLM for anything non-trivial, we design the solution. Not in exhaustive detail — we're not writing specs for bureaucratic reasons — but enough to know what we're asking for and why. A data model sketch. A sequence diagram for a critical flow. A list of edge cases we know we need to handle. The design work takes an hour. It makes the generation work honest.
An LLM given a well-scoped problem produces dramatically better output than one given an open-ended one. More importantly, a well-scoped prompt produces output you can evaluate — because you know what correct looks like before you see what the model produced.
2. Read everything you accept
Every function we accept from an AI tool is read by a senior engineer. Not skimmed — read. We ask: does this do what I think it does? Does it handle the edge cases I listed? Is the abstraction the model chose the right one for this context, or did it make the convenient choice rather than the correct one?
This sounds slow. It isn't, relative to the alternative. Reading and evaluating AI-generated code is significantly faster than writing the same code from scratch. The key word is "evaluating" — you need to be a capable engineer to do it well. You can't delegate understanding.
3. Own the tests, own the code
We write tests. We let AI help us write tests. But we never accept a test suite without understanding what it's testing and what it's not. The test suite is documentation. It describes what the system is supposed to do. If you don't understand your tests, you don't understand your system.
We pay particular attention to what the tests don't cover: a model readily writes the happy-path tests it found easy and omits the edge cases it didn't think of. For anything touching money, security, or compliance, we write adversarial tests ourselves — not because we don't trust AI, but because adversarial thinking is a human skill that AI is not yet reliably good at.
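A sketch of what we mean by an adversarial test for a money path. `allocate_cents` here is a hypothetical helper, not code from a client project; the point is the test style — sweep hostile inputs and assert invariants, rather than checking two or three friendly examples.

```python
def allocate_cents(total_cents: int, parts: int) -> list[int]:
    # Largest-remainder allocation: every part gets the floor share,
    # and the leftover cents go one each to the first few parts.
    base, leftover = divmod(total_cents, parts)
    return [base + (1 if i < leftover else 0) for i in range(parts)]

# Adversarial sweep over awkward totals and part counts.
# Invariants: no input may create or destroy a cent, and no
# two shares may differ by more than one cent.
for total in [0, 1, 99, 100, 101, 9_999, 10**9 + 7]:
    for parts in [1, 2, 3, 7, 100]:
        shares = allocate_cents(total, parts)
        assert sum(shares) == total, (total, parts)
        assert max(shares) - min(shares) <= 1, (total, parts)
```

An LLM asked to "write tests for allocate_cents" will usually produce a few example-based cases. The invariant sweep is the part a human has to think to demand.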
Where AI Genuinely Excels
We're not being precious about this. There are entire categories of work where AI assistance is genuinely transformative:
- Boilerplate — CRUD endpoints, migration scripts, serializers, test fixtures
- Documentation — docstrings, README files, OpenAPI specs from existing code
- Refactoring well-understood code — extracting functions, renaming for clarity, converting between formats
- Research — "what are the tradeoffs between X and Y for this use case" produces better first-pass analysis than most Stack Overflow threads
- Code review assistance — "what could go wrong with this function" surfaces issues a tired engineer might miss
The common thread: these are tasks where the definition of "correct" is clear, and where errors are easy to detect and cheap to fix. The places we stay vigilant are places where correctness is subtle, errors are expensive, and the system will be maintained by people who weren't in the room when it was built.
The Question to Ask Any Agency
If you're evaluating a software agency in 2026, ask them one question: "Walk me through how a piece of code goes from a prompt to production at your firm."
A vibe-coding shop will describe a workflow that ends at the generated output. An engineering firm will describe everything that comes after — the review, the tests, the edge cases considered, the architecture decisions made and documented. The answer to that question tells you everything you need to know about what you're buying.
We believe the engineering discipline that made software reliable before AI is needed now for the same reasons it always was. AI made us faster. It didn't make the discipline optional.