You're paying someone to build something you can't fully evaluate. You're trusting that the code is sound, the architecture makes sense, and the shortcuts being taken are the acceptable kind. That's an uncomfortable position. It's also exactly the position that gets founders burned.
The good news is that you don't need to read code to have a useful view of whether your developer is doing good work. You need to know what questions to ask, what answers should satisfy you, and what answers should concern you.
Here's how to evaluate software developer quality when you can't assess the work directly, and what it means when the answers don't add up.
- Code volume and response speed measure activity, not quality. Velocity and legibility matter more.
- A good developer makes failure modes legible, pushes back on aggressive timelines, and admits when they don't know.
- Declining velocity, recurring bug patterns, and vague answers about system health are primary red flags.
- If you lack visibility into the system's ceiling, error rates, or backup integrity, you need an independent technical audit.
The Signals Most Founders Rely On (That Don't Actually Tell You Much)
The default signals founders use are almost all wrong.
Code volume ("they shipped a lot this sprint") says nothing about quality. A developer can ship a lot of code that accrues technical debt faster than it delivers value. Features that look finished on the surface can be held together in ways that make the next feature twice as hard to build.
Confidence in technical discussions is similarly unreliable. A developer who sounds certain about every decision isn't necessarily a developer who's making good decisions. Uncertainty is honest. False confidence is a warning sign. If someone never expresses doubt about a complex technical choice, they're either working on a very simple problem or they're not being straight with you.
Response speed, meeting attendance, long working hours: none of these measure what you need to measure. They measure activity. What you need to measure is whether the system is getting better or worse over time, and whether your developer is being honest with you about why.
What Good Actually Looks Like
Good developers make their work legible. Not just to other developers but also to you.
When something goes wrong, a good developer tells you what happened, why it happened, and what's changing to prevent it from happening again. Not a wall of jargon designed to end the conversation. A clear explanation: "This database query was running without an index, so it got slow as data grew. We added the index and we're adding a check to catch this class of problem earlier in review." That's a complete answer. It covers the cause, the fix, and the process change.
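If you want to see what that explanation refers to, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, the `customer_id` column, and the index name are hypothetical and stand in for whatever your product actually stores. The sketch asks the database how it plans to run the same lookup before and after the index exists.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan for running a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); detail is the readable part.
    return " | ".join(row[3] for row in rows)

# Hypothetical table: 10,000 orders spread across 1,000 customers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

lookup = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index, the database has to read every row to find matches.
plan_before = query_plan(conn, lookup, (42,))

# With an index on the filtered column, it can jump straight to the matches.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, lookup, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan reports a full scan of the table, and the second reports a search that uses the index. A scan gets slower with every row added, which is why this kind of problem tends to appear only once the data has grown.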
Good developers tell you when they don't know something. Every senior engineer I've worked with has said "I'm not sure yet" on hard problems. It's not weakness—it's how you avoid confidently building the wrong thing.
Good developers push back. When you ask for a timeline that's too aggressive, or a feature that will create problems downstream, they tell you. They explain the trade-off and give you enough information to make the decision yourself. A developer who says yes to everything isn't serving you well. They're just deferring the problem.
Good developers get more productive over time, not less. If your velocity was higher six months ago than it is now, something has gone wrong. Either the architecture is getting harder to change safely as the codebase grows—which is technical debt accumulating—or there's a coordination problem in the team, or scope has expanded without commensurate resourcing. Any of these is a real conversation to have. None of them improve by being ignored.
The Red Flags That Actually Mean Something
Your system breaks in ways that were foreseeable. Traffic spikes during a promo, a payment flow fails in an edge case, a new feature introduces a bug in an existing one. The question isn't whether things break; they always do. It's whether the failure modes were anticipated and built for. An experienced developer thinks through the likely failure cases before the code ships. A less experienced one finds out about them from users.
The same types of bugs keep appearing. One-off bugs are normal in any production system. Recurring patterns of the same type (authentication edge cases, data consistency problems, third-party integration failures) suggest the root cause was never addressed. Patches that treat symptoms without fixing the underlying issue compound. The codebase accumulates fragility.
"It's complicated" gets used to deflect, not explain. Complexity is real. Some problems are genuinely hard. But when "it's complicated" becomes the response to specific, reasonable questions (why did this take three weeks? what caused this outage? why is this feature harder than expected?) and no actual explanation follows, that's usually covering for architecture that nobody fully understands anymore, including the people who built it.
Estimates are consistently wrong in the same direction. Missing a sprint estimate occasionally is expected. Missing by a consistent margin, sprint after sprint, in the same direction, suggests either a planning dysfunction or a codebase that's harder to work in than the team admits. Both are worth surfacing.
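One way to check this without reading any code is to compare estimates against actuals over several sprints. The sketch below is illustrative only: the function name and the sprint figures are made up, and the two measures it reports (average overrun ratio and share of sprints that ran over) are one reasonable choice among several.

```python
from statistics import mean

def estimate_bias(sprints):
    """Summarise how far actual effort drifts from estimates.

    sprints: list of (estimated_days, actual_days) pairs.
    """
    ratios = [actual / estimated for estimated, actual in sprints]
    overruns = sum(1 for ratio in ratios if ratio > 1.0)
    return {
        # 1.0 means estimates are accurate on average; 1.5 means work
        # takes about 50% longer than estimated.
        "mean_ratio": mean(ratios),
        # Share of sprints where actual effort exceeded the estimate.
        "overrun_share": overruns / len(ratios),
    }

# Hypothetical history: (estimated days, actual days) for five sprints.
history = [(10, 15), (8, 12), (10, 14), (5, 8), (10, 15)]
summary = estimate_bias(history)
print(summary)
```

In this made-up history every sprint ran over, by roughly 50% on average. Random misses would land on both sides of the estimate, so a result like this points to something systematic in the planning or the codebase.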
You can't get a straight answer about the state of the system. How many concurrent users can it handle before it degrades? What's the current error rate? When did someone last test restoring from a database backup? These are basic questions. If the answers are vague, your developer may not know—and "we don't know the state of the thing we built" is itself a meaningful signal about what kind of care the system has been receiving.
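Two of those questions reduce to simple arithmetic once the raw numbers exist. The sketch below assumes your developer can export a list of HTTP response codes for a period and knows the date of the last restore test. The function names, the sample data, and the dates are all hypothetical.

```python
from datetime import date

def error_rate(status_codes):
    """Share of requests that ended in a server error (HTTP 5xx)."""
    if not status_codes:
        return 0.0
    failures = sum(1 for code in status_codes if 500 <= code <= 599)
    return failures / len(status_codes)

def days_since(last_tested, today):
    """Whole days between the last backup restore test and today."""
    return (today - last_tested).days

# Hypothetical sample of 1,000 requests: 10 ended in server errors.
sample = [200] * 970 + [404] * 20 + [500] * 7 + [503] * 3
rate = error_rate(sample)

# Hypothetical dates for the last restore test and for today.
age = days_since(date(2024, 1, 15), date(2024, 7, 15))

print(f"error rate: {rate:.1%}")
print(f"days since last restore test: {age}")
```

The exact thresholds matter less than whether anyone can produce the inputs. If nobody can say where the response codes are logged or when a restore was last tried, that is the finding.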
The Mistake That Costs the Most
The most expensive mistake is waiting until something has gone badly wrong before asking any of these questions. By then, you're not evaluating—you're doing a post-mortem while the business is under pressure and your options are narrower than they were six months ago.
The second mistake is treating a bad signal as a personnel problem before checking whether it's an architecture problem. A developer who was effective a year ago and seems slower now may not be performing worse. They may be working in a codebase that's become significantly harder to change safely. That's a different diagnosis with a different solution.
When something feels off, the right move is to get a second opinion on the technical side before drawing any conclusions. Not to build a case against someone—to understand what you're actually looking at. This is what a technical audit is for: an independent assessment of the codebase, architecture, and engineering practices, translated into terms that help you make decisions about what to do next.
It's not an adversarial process. The founders who get the most value from a technical audit are the ones who bring their developer into it, because a good developer has nothing to fear from it and often wants the external validation.
What This Looks Like When It Goes Wrong
A Series A founder in Jakarta hired a development agency to build the core product. Eighteen months in, they had a working system and a growing user base. They also had an engineering team spending 60% of its time on bug fixes rather than features, a system that degraded under traffic spikes that were entirely predictable from their growth trajectory, and no clear picture of what it would cost to fix any of it.
The agency had done what was asked. The product was built, features were shipped. But there was no caching layer, no error monitoring, no structured logging, and a database schema that hadn't been designed for the queries the application was actually running at scale. The code worked. The architecture didn't.
An independent technical audit took two days. It produced a prioritised list of what needed fixing immediately versus what could wait, and a realistic cost estimate for each. The founder had information they'd been making decisions without for over a year. That's all the audit was for: not blame, just information.
If you want to understand what a technical debt audit actually covers and what it costs to run one, read "How to Run a Technical Debt Audit". If you're also evaluating whether your current development partner is the right long-term fit, "How to Choose a Software Development Company in Indonesia" covers what to look for before you sign.
FAQ
Q: How can a non-technical founder evaluate developer quality?
A: Focus on outcomes, predictability, and communication rather than activity. Is the system getting more reliable over time? Are estimates roughly accurate? Does the developer explain decisions clearly and flag trade-offs proactively? These signals don't require reading code to assess.
Q: What questions should I ask my developer about system health right now?
A: Ask three things: What's our ceiling—at what load does the system start to degrade? What's our current error rate? When did we last test restoring from a backup, and what happened? If none of these get a confident, specific answer, you have less visibility into your own product than you should.
Q: What does technical debt look like from a non-technical founder's perspective?
A: Slowing velocity, where features that used to take a week now take three. Recurring bugs of the same type. Estimates that are consistently optimistic. Developers who describe everything as complicated without explaining why. These are the surface symptoms of a codebase that's becoming more expensive to work in every month.
Q: When should I get an independent technical audit?
A: When something feels off and you don't have the technical background to diagnose it yourself. Also as a standard practice before a significant new investment in headcount, infrastructure, or a new development partner. A technical audit gives you the information to make those decisions based on reality rather than reassurance.
Q: What's the difference between a developer who's doing good work and one who just looks productive?
A: A developer doing good work makes the system more reliable and easier to change over time. A developer who looks productive ships code that doesn't address root causes, builds complexity that only they understand, and produces a codebase that gets harder to work in as it grows. The test is trajectory: is the system in better shape than it was six months ago? Is the team faster or slower than they were then?
The honest question isn't "is this specific developer good?" It's "is the system getting better, and do I have the information I need to make decisions about it?" If the answer to either is no, that's worth understanding before the next contract renewal, the next hire, or the next funding round. If you want an independent read on where your system actually stands, that's the kind of architecture review we run at SpectreDev, with the founder in the room, not just the engineering team.