I used a billion tokens to build the wrong product
The more you plan, the better AI builds.
This is the most common narrative in the AI world right now.
Sure. Spend 80% of your time planning. You write detailed feature breakdowns, design language, dos and don'ts, your taste injected into every spec, and AI will execute it all beautifully. No argument there.
But what if you don't know what you want? What if the problem isn't clear in your head? I can see the goalpost. I can feel that something needs to exist. But I cannot, for the life of me, articulate what exactly that thing is. How do you plan for something you can't even put into words? You write a spec for what? A vague feeling?
So I build the wrong thing. On purpose. I take a half-formed idea, rant to my AI in a voice note, and have a working version before my coffee gets cold. And every single time, it builds something that works but also feels wrong.
That's the point. I'm not trying to get it right. I'm trying to look at something concrete and say "Wow, aren't you a useless piece of crap."
I was hiring engineers. I had back-to-back calls. 15-minute slots, three to four hours straight. That's roughly 90 calls a week.
I take all these calls with a physical notebook. Non-negotiable. I don't want some app as my note-taker. When I'm talking to someone, I want to be looking at them.
These are 15-minute intro calls. I don't have time for deep technical dives. So I'd come up with a five-question format. It started with a simple gotcha question that immediately filtered out people who didn't think through what was being asked and just started answering like a record. Based on what happened in that answer, I'd branch out in a couple of different directions, trying to get more signals on their agency, their technical capability. All in fifteen minutes, scribbling in my notebook without looking at it, eyes on the person.
Problem is, two or three hours later, after back-to-back conversations, I'd go back to my notes for the first person and have no idea what the hell I was thinking when I wrote those things. The signal was gone. The context that made that sentence meaningful had evaporated somewhere between person four and person five. I was losing the why behind my observations.
I wanted something that could hold onto those signals with their context intact. I've been thinking about products for a long time, so naturally my brain went: okay, I need a talent pipeline manager. Track people, track signals, track the context around those signals. And obviously I wanted AI somewhere in this. I just wanted to dump my messy handwritten notes in, get a coherent summary out, something I could open days later and instantly know what I was looking at.
I also knew what I'd find in those 90 weekly calls. Most of them would be what I call non-technical engineers. If it wasn't code, if it was plumbing or standing at a petrol pump, they'd do it with the same attitude. I wanted to stay the hell away from them. Then there'd be a smaller group with actual potential. Technically decent, maybe can't articulate their thinking well yet, but I'd pick up on something like hunger, high agency, a desire to prove themselves if someone just gave them a shot. And then the rare ones who think in systems and explain things clearly.
I wanted a simple three-colour [ 🔴 🟡 🟢 ] system to bucket them. I knew the shape of the problem. Had no clue how to solve it. So I told the AI what I roughly wanted and let it build something.
And it did. I looked at it with pride. Three colour-coded buckets. Sorting logic, tagging, the whole thing. I could add a person, assign a colour, see the list.
I sat there staring at it. Okay. Now what? I had three pretty buckets full of people and absolutely no idea what to do with any of them. The app did exactly what I asked for. It was technically correct but completely useless.
You might say, "Congrats, you invented Prototyping!" And I will wholeheartedly agree with you on that.
I still didn't know what the right thing was. But sure as hell the sorting wasn't the problem. Putting them in buckets was a piece of the solution, not the solution itself. The actual problem was somewhere else. I kept using it after every single call. Using the wrong thing, struggling to achieve my simple goals.
The first day using the app was brutal. I'd finish a marathon of calls, click pictures of my notebook pages with my phone, send them to ChatGPT to parse my terrible handwriting, then manually create entries in the app and paste the parsed notes in. The signals it generated were absolute garbage. They gave zero context on why I'd written something. The summaries read like a robot filling out a form: "Discussed career goals and current role." Wow, thanks! That tells me nothing.
I could look at that two days later and have less information than if I'd just stared at my original notes.
I'd done exactly the thing I'd watched a hundred enterprise companies do and made fun of them for. I slapped an LLM API on a normal piece of software and called it an AI app. I knew it doesn't work. I'd seen it fail everywhere. But I did it anyway. You can't skip the wrong versions. You have to use them, let the friction guide you in the right direction.
Now came the painful steps of iteration.
- The prompt was bad, so I kept rewriting it, again and again and again. I told the model exactly how to parse my notes, what to extract, what to ignore, how to structure it.
- Told it what I actually cared about: articulation. Whether someone paused and thought about the question before answering, instead of immediately reciting whatever they had memorized. Whether they'd shown agency in some way.
- My notes alone weren't enough context for any useful signal, so I added resume uploads and let AI parse the resume and my notes together into one summary. It was getting better.
- Finding people in the call list was a nightmare, so I redid the navigation.
Each fix came from the same place. I used the wrong thing and pushed it to the limits of usability until it utterly failed to do its job.
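To make the iteration concrete: the prompt-rewriting step eventually amounted to one function that folds my notes, the resume, and the things I actually cared about into a single prompt. This is a minimal sketch, not the app's real code; every name here (`Candidate`, `build_parse_prompt`) is mine, invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    bucket: str        # "red" | "yellow" | "green"
    raw_notes: str     # my transcribed notebook scribbles
    resume_text: str   # parsed resume, added in a later iteration

def build_parse_prompt(c: Candidate) -> str:
    """Combine notes and resume into one prompt, so the model
    keeps the *why* behind each signal instead of form-filler summaries."""
    return "\n".join([
        "You are summarising a 15-minute intro call for a hiring pipeline.",
        "Extract concrete signals, never generic summaries like",
        "'discussed career goals and current role'.",
        "For each signal, keep the context that makes it meaningful:",
        "- Did they pause and think before answering, or recite from memory?",
        "- Any evidence of agency (unprompted fixes, ownership, side projects)?",
        "- Can they articulate their reasoning clearly?",
        f"Current bucket: {c.bucket}",
        f"Interviewer notes:\n{c.raw_notes}",
        f"Resume:\n{c.resume_text}",
    ])
```

The output of this function would then be sent to whatever model you use; the point is that each bullet in the prompt came from one failed day of real usage, not from a spec written up front.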
One day a hiring manager asked me about a candidate and I just forwarded the resume. Their resume was the usual bullshit: "Spearheaded cross-functional initiatives to drive operational excellence." Come on!
Immediately after, I realized I'd spent time with this person getting to know them. What I actually wanted to share was the signal I'd collected: the bucket they belonged in, my notes, the patterns I'd spotted, the stuff between the lines. So I built a profile snapshot. Big green (or yellow) header at the top. The signals. The details. Something that basically said "I already did the work of understanding this person."
Not a single one of these features was planned.
I think you're solving the wrong thing
Yeah, your AI will never say that. It'll say "You're a genius." "You are absolutely right." "You've uncovered something no one truly understands." "You have quietly come up with something amazing."
It's a sycophant. You say "build this" and it goes "Great idea! Here's your app."
That's why the deliberate wrong-building matters. Because if AI won't challenge your thinking, you need something that will. And the fastest thing I've found is a concrete wrong version staring back at you. Your own reaction to the tasteless slop you've built.
Will I do it again? Hell yes!
I'm building it again now. From scratch. With the clarity that a billion wrong tokens paid for.