Charles F. Stevens and I Didn’t Know What We Were Doing 19 Years Ago

By Brianne Lee (published as Sunhwa Lee in PNAS, 2007)

In the summer of 2007, my advisor Charles Stevens and I published a paper in PNAS with a quiet title: “General design principle for scalable neural circuits in a vertebrate retina.” We had been working with goldfish and zebrafish, measuring the dendritic arbors of retinal ganglion cells across eyes of dramatically different sizes. The question Chuck had been chewing on for years was simple to state and hard to answer: when a fish’s eye grows, how does the retina scale its circuitry?
The answer turned out to be a specific power law. Arbor area scaled with retinal area raised to roughly the 1/2 power, not linearly, not constantly, but at a precise sub-linear exponent. And that exponent wasn’t arbitrary. It emerged from the optimal balance between two competing objectives: spatial resolution and signal-to-noise accuracy. Biology, it seemed, had selected for neither maximum resolution nor maximum accuracy, but for the specific mathematical ratio that kept both in optimal tension as the system grew.
At the time, it felt like a specialized contribution, elegant within its niche, important to people who cared about retinal architecture, but unlikely to ripple outward. I was a young researcher. Chuck was Chuck, which meant the work met his standards, which meant it was rigorous. But it lived inside the neuroscience literature, cited modestly, and the world mostly moved on.
Then I moved on too. I left active research for reasons that had nothing to do with the work, life reasons, the ordinary kind. A decade passed. Then more.
Why I’m Writing Again
Chuck died in October 2022. He was 88 and still working. In his last years at the Salk Institute, he kept returning to what he called the scalable architecture of the brain, the idea that biological neural systems had to be built in a way that let them grow without being redesigned. In a 2012 PNAS interview he said it plainly:
“In order for evolution to work, neural circuits have to have what the computer scientists call a scalable architecture. That means that you have to be able to make the computer more powerful just by making it bigger and you don’t have the luxury of redesigning it.”
Chuck was using computer science language a decade before the current AI scaling debate became the defining question of the field. He wasn’t speculating about LLMs. He was describing the evolutionary problem biology has been solving for hundreds of millions of years. But his framing, that scaling is a design constraint before it is an engineering achievement, now reads like a prophecy.
I’ve been watching the AI scaling conversation from the sidelines for a few years. The empirical laws that Kaplan and colleagues, the Chinchilla authors, and others have mapped out (loss as a power law in compute, parameters, and data) are beautiful, but they’re empirical. The field has the curves. It doesn’t yet have a first-principles theory for why those specific exponents appear.
Biology has an answer to that question, or at least a framework for one. And that framework is what Chuck spent his career building.
So I’m coming back to the work.
What the 2007 Paper Actually Showed
The fish retina presents an unusual opportunity. Unlike mammals, fish continue to grow throughout their lives. Their eyes enlarge. New retinal ganglion cells (RGCs) are added. Existing RGCs expand their dendritic arbors. The same visual world gets imaged onto a larger retina with more cells, and the fish’s visual acuity improves with age.
This gives you something rare in biology: a natural experiment in scaling. The same circuit, same cell types, same organism, across an order of magnitude in size.
Chuck and I asked a specific question. When RGC arbors grow, how does their size relate to the size of the retina they sit in? There were three candidate answers, each corresponding to a different design principle:
Maximum resolution. Keep arbor size constant as the eye grows. More pixels, finer spatial detail. Arbor area independent of retina area (exponent 0).
Maximum accuracy. Grow arbor size in proportion to retina area. Each RGC averages more light, better signal-to-noise. Arbor area proportional to retina area (exponent 1).
Balanced. Grow arbor size as the square root of retina area. Both resolution and accuracy scale together, preserving their ratio (exponent 1/2).
We measured 70 DiI-stained arbors from 26 fish, across retinas ranging from 4 to 44 square millimeters. When we plotted arbor area against retina area on a log-log scale, we got a slope of 0.62 ± 0.11. Statistically indistinguishable from 1/2 (or 5/8, depending on which noise source dominates). Not flat. Not linear. Balanced.
Evolution had selected the compromise.
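If you want to see the shape of that analysis, here is a minimal sketch in Python. The data are synthetic, generated around a true exponent of 1/2 to stand in for our (retina area, arbor area) pairs; they are not the 2007 measurements. The method, an ordinary least-squares line in log-log space, is the standard way an exponent like 0.62 ± 0.11 gets estimated.

```python
# Estimating a power-law exponent with a straight-line fit in log-log space.
# The data are synthetic (generated around a true exponent of 0.5) and stand
# in for (retina area, arbor area) pairs; they are not the 2007 dataset.
import numpy as np

rng = np.random.default_rng(0)

n = 70                                               # one point per measured arbor
retina_area = rng.uniform(4.0, 44.0, n)              # mm^2, the measured range
arbor_area = 0.02 * retina_area**0.5 * rng.lognormal(0.0, 0.3, n)

# If arbor = c * retina**k, then log(arbor) = log(c) + k * log(retina),
# so the exponent k is just the slope of an ordinary least-squares line.
log_x, log_y = np.log(retina_area), np.log(arbor_area)
slope, intercept = np.polyfit(log_x, log_y, 1)

# Standard error of the slope, for an uncertainty like the paper's 0.11.
resid = log_y - (slope * log_x + intercept)
se = np.sqrt(resid @ resid / (n - 2) / np.sum((log_x - log_x.mean()) ** 2))

print(f"fitted exponent: {slope:.2f} +/- {se:.2f}")  # ~0.5, give or take
```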
The deeper finding was this: the power law wasn’t a coincidence of retinal biology. It was the mathematical signature of an optimization between competing constraints. Once you accept that biological neural circuits must balance multiple objectives simultaneously, not maximize any single one, the specific exponent falls out of the math.
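For the balanced case, the back-of-the-envelope version goes like this, writing a for arbor area and A for retina area, and proxying accuracy by the collecting area of each arbor (the simplest choice; different noise assumptions shift the prediction toward the 5/8 mentioned above):

```latex
% Resolution: how many arbor-sized pixels tile the retina.
\mathrm{resolution} \propto \frac{A}{a}
% Accuracy (simplest proxy): how much signal each arbor collects.
\mathrm{accuracy} \propto a
% Demanding that both improve at the same rate as the eye grows:
\frac{A}{a} \propto a \quad\Longrightarrow\quad a \propto A^{1/2}
```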
Why It Matters Now
Current AI scaling research has discovered something structurally similar. Model performance improves as a power law in compute, in parameters, in data. The exponents are small, often in the range of 0.05 to 0.10, but they’re consistent across model scales and training regimes. Nobody fully understands why those specific exponents appear.
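To make that concrete, here is a minimal fitting sketch with invented numbers, not anyone’s published runs. The functional form, a power law decaying toward an irreducible loss floor, is the shape these papers fit; the floor is why a straight line in log-log space isn’t quite the whole story.

```python
# Fitting a saturating power law  L(C) = L_inf + a * C**(-alpha)  to
# synthetic (compute, loss) points. All numbers here are invented for
# illustration; the point is the fitting procedure, not any real model.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(compute, loss_floor, coeff, alpha):
    return loss_floor + coeff * compute ** (-alpha)

rng = np.random.default_rng(1)
compute = np.logspace(0, 6, 12)          # compute budget, in units of 1e18 FLOPs
true_params = (1.7, 6.0, 0.07)           # loss floor, coefficient, small exponent
loss = scaling_law(compute, *true_params) * rng.normal(1.0, 0.005, compute.size)

# The irreducible-loss offset is why a plain log-log line isn't quite
# enough here; a nonlinear fit recovers all three parameters at once.
(loss_floor, coeff, alpha), _ = curve_fit(
    scaling_law, compute, loss, p0=(1.0, 1.0, 0.1), maxfev=20000
)
print(f"alpha = {alpha:.3f} (true 0.07), irreducible loss = {loss_floor:.2f}")
```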
The 2007 framework suggests a possibility worth taking seriously: power-law scaling emerges when a system optimizes a ratio between competing objectives, not when it maximizes any single one. If that’s true for biological neural circuits, it might be true for artificial ones too. And if it is, then the exponents we observe in AI scaling aren’t arbitrary, they reflect implicit trade-offs between quantities we haven’t fully identified yet.
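The cleanest toy version of that claim fits in a few lines, sketched below with cost terms that are illustrative stand-ins rather than a model of any real circuit or network: give a system of size A a single knob x, charge it one cost that falls as x grows and one that rises, and the optimal x traces a power law in A whose exponent is fixed entirely by the two competing powers.

```python
# A toy of "power laws from trade-off optimization": for each system size A,
# pick the knob x that minimizes two competing costs, then look at how the
# optimal x scales with A. The cost terms are illustrative stand-ins only.
import numpy as np
from scipy.optimize import minimize_scalar

p, q = 1.0, 1.0  # exponents of the two competing cost terms

def cost(x, A):
    # One term worsens as x shrinks (think: resolution), one as x grows
    # (think: noise, wiring, energy). The system minimizes their sum
    # rather than driving either term to its individual optimum.
    return A * x ** (-p) + x ** q

sizes = np.logspace(0, 6, 20)
optima = [minimize_scalar(cost, args=(A,), bounds=(1e-6, 1e6),
                          method="bounded").x
          for A in sizes]

slope, _ = np.polyfit(np.log(sizes), np.log(optima), 1)
print(f"fitted exponent: {slope:.3f}, predicted: {1 / (p + q):.3f}")  # both ~0.5
```

With p = q = 1 the optimum is exactly x = √A, the retina’s exponent. Change the competing powers and the exponent moves to 1/(p + q); that is the sense in which a measured exponent can be read as a fingerprint of an underlying trade-off.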
Candidates for those competing quantities in AI include:
Accuracy vs. sample efficiency: how well does the model perform, relative to how much data it needs?
Capacity vs. generalization: how much can it memorize vs. how well does it extrapolate?
Depth of reasoning vs. breadth of knowledge: especially relevant as test-time compute becomes a dominant lever
Performance vs. energy: the brain runs on 20 watts, and nobody has a mathematical theory for why that efficiency frontier looks the way it does
I’m not claiming our 2007 paper solves any of these. I’m claiming that the framework it represents, the idea that scaling laws emerge from trade-off optimization rather than from pure growth, is a lens worth bringing back into the conversation.
What I’m Going to Try to Do
I stepped away from research. I’m stepping back in, but I’m not pretending the intervening years didn’t happen. I’m not a computational ML researcher. I can’t contribute to this conversation through architectural innovations or training runs. What I can contribute is a specific biological framework that happened to be published just before this conversation existed, authored with one of the most respected neuroscientists of his generation, and now largely invisible to the AI research community.
Over the coming months, I’m going to write a series of posts exploring this territory. A side-by-side comparison of biological and AI scaling curves. A closer look at what test-time compute and dynamic resource allocation look like in both systems. A piece on energy efficiency and what biology’s solution to that problem might suggest. And eventually, a longer reflection on Chuck Stevens’ larger body of work on scalable architecture, which deserves to be read by people who are currently trying to build scalable architectures themselves.
I don’t know if any of this will move the field. That’s not really the point. The point is that there’s a body of thinking about neural scaling that sits adjacent to the current AI conversation, and one of its primary architects is no longer alive to carry it forward. I happened to write a small piece of it with him. That’s a responsibility, and it’s a privilege, and it’s what I’m going to do with the time I have now.
Chuck used to say, in the way he said many things (offhandedly, with a chuckle), that the interesting work was usually hiding in plain sight. I think some of his own work is hiding in plain sight right now. I’d like to help find it.