Apr 13, 2025

Critical Company Size

“Noisy gradients are ok, actually.”

I’ve noticed something interesting about the nature of product development - it’s a lot like machine learning.

In machine learning, training progress comes down to two things:

Finding the right direction (gradient):
You take a batch of examples, calculate the gradient to improve, and then move in that direction.
Taking frequent steps:
You don’t calculate gradients once; you repeat this many thousands of times, constantly improving the model.

The size of your batch matters. Small batches give noisy directions but calculate quickly, letting you step forward often. Large batches give precise directions, but calculating them takes time, meaning fewer, slower steps.

Researchers call this the Critical Batch Size, the point where the trade-off is optimal. Below it, increasing your batch size (more precision) genuinely improves training speed. But above it, each increase just wastes compute on precision you don’t really need, making each step slower and your overall progress worse.
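To make the trade-off concrete, here is a minimal toy sketch, not a real training setup. It assumes a one-dimensional loss of 0.5 * x², per-example gradients corrupted by Gaussian noise, and a fixed budget of 1,000 examples, so a bigger batch leaves room for fewer steps. The function names and all the constants (`budget`, `lr`, `noise`, the starting point) are made up for illustration.

```python
import random


def train(batch_size, budget=1000, lr=0.1, noise=5.0, seed=0):
    """Minimize 0.5 * x**2 with noisy gradients under a fixed example budget."""
    rng = random.Random(seed)
    x = 10.0  # arbitrary starting point, far from the optimum at 0
    steps = budget // batch_size  # bigger batches leave room for fewer steps
    for _ in range(steps):
        # Each example's gradient is the true gradient (x) plus noise;
        # averaging over the batch shrinks that noise.
        grad = sum(x + rng.gauss(0.0, noise) for _ in range(batch_size)) / batch_size
        x -= lr * grad
    return 0.5 * x * x  # final loss


def mean_loss(batch_size, trials=200):
    """Average the final loss over several seeds to smooth out luck."""
    return sum(train(batch_size, seed=s) for s in range(trials)) / trials


if __name__ == "__main__":
    for b in (1, 32, 1000):
        print(f"batch={b:4d}  steps={1000 // b:4d}  mean loss={mean_loss(b):.3f}")
```

Under these assumptions, the middle batch size should end up with the lowest loss: a batch of 1 takes many steps but keeps jittering around the optimum, while a batch of 1,000 gets a very precise gradient and only one step to use it. Real training adds parallel hardware and learning-rate tuning, so treat this as intuition rather than a measurement.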

Teams and Companies have a “Critical Size”, Too

This intuition also fits companies and teams working on innovative products.

In these situations, you’re doing one of two things:

Deciding what to do (gradient):
Gathering feedback, user research, experiments, or intuition.
Actually doing (execution):
Building, shipping, releasing, iterating - just seeing what happens.

The team size directly affects this trade-off:

Too small:
Your gradient is too noisy, shaped by limited feedback. Decisions come quickly, but they swing wildly based on a single conversation or user request.
Too large:
You spend far too long setting direction (endless planning). The result: you take fewer actual steps forward.

Even if you have the perfect direction, there is a curse not found in model training. With product development the environment can change around you, and your stable (expensive) gradient can become directionally wrong.

A smaller, faster-moving competitor with noisier but more frequent steps will win.

We’re seeing this time and time again: taking a lot of steps with a “good enough” gradient wins. See Cursor AI and WhatsApp as insane ~20 person success stories.

John Carmack summed this up perfectly:

Feedback beats planning

How to Innovate

I don’t want to prescribe some “ideal number” or pretend there’s one size fits all. But I’ve found this intuition useful when thinking about scaling project teams or companies:

Progress comes from balancing direction (gradients) and execution speed (step frequency).

In the most effective innovation-focused teams I’ve seen, this balance tends to favor uncomfortably noisy direction and uncomfortably rapid steps.

And this is the lesson.

It’s ok to have a noisy gradient, just act on it quickly and repeatedly.

Stay small enough to stay fast. Optimize for steps, not perfection.