This post is part of our ongoing series of essays, postmortems, and working notes from the Brainscend team. The full article content is being migrated to the new platform. In the meantime, the excerpt below captures the core argument.

For most production tasks, a 7B fine-tuned model beats a frontier model on cost, latency, and reliability. Here's when to choose which.

The context

Every project we take on surfaces something unexpected: a constraint we didn't see coming, a model behaviour we had to rethink, a process failure that turned into a better process. Writing about it is how we make sure the learning sticks, both for us and for teams facing similar problems.

This piece came out of real delivery work. The details have been reviewed with the client where relevant. Names and identifying information have been changed or omitted where confidentiality applies.

What we found

The short version is in the excerpt above. The longer version involves more nuance, more failure, and more iteration than any summary can capture. That's the point of writing it down.

If you're working through something similar, the “Takeaways” section below is the most actionable part. Everything else is context that makes the takeaway believable.

Takeaways

The principles that held up across this engagement weren't novel. They were things we already believed, tested under conditions that made it easy to forget them. Writing them down here is as much a reminder for ourselves as anything else.

If you're dealing with something similar and want to talk through the specifics, get in touch. We'd rather have a useful conversation than publish a complete essay.