Unreasonable Audio Innovation

Keynote at the AES 160th Convention Reception, City Hall of Copenhagen, May 29th, 2026

1. OPENING

Good afternoon. I am told this room has some acoustic challenges. Consider that a live demonstration of one of my central arguments.

A few years ago we received a review of one of our products. The reviewer wrote that it sounded — and I quote — “surprisingly good for something that measures this well.” I still don’t know whether to laugh or cry. But it told me everything about the state of our field.

Tonight I want to talk about how a field gets stuck — and what it takes to get unstuck. Not with a revolution. Just with the willingness to ask questions we have stopped asking.

Look around you. This building was completed in 1905. The architect Martin Nyrop built something that looks entirely handcrafted and organic — and it is.

But hidden behind these walls he installed every modern technology available: central heating, electric light, elevators, a steel and glass roof structure. Engineering completely embraced, completely invisible. What you experience is the beauty it makes possible.

That was the skønvirke philosophy Nyrop championed: science and technology provide the foundations, human craft provides the soul. As one writer put it — the purpose was to ensure that the engineer did not suffocate the artist.

That is also our mission at PURIFI Audio. A straight wire to the soul of music. Engineering that serves beauty rather than replacing it.

The Irish playwright George Bernard Shaw wrote: “The reasonable man adapts himself to the world. The unreasonable one persists in trying to adapt the world to himself. Therefore, all progress depends on the unreasonable man.”

In practice it always starts with three questions: Why is this limitation here? What assumption created it? What happens if we remove the assumption?

I have had to ask those questions more than once. In the 1990s I built an eighth-order sigma-delta modulator when the textbooks said anything above second order was unstable. I was not certain I was right — I had a hypothesis and I tested it.

Bruno Putzeys, my co-founder, and I both worked on class D amplification when the consensus said it could never sound as good as a real amplifier.

In both cases a field had settled on a belief, the belief was wrong, and the cost was years of slower progress.

2. TWO TRIBES

Our HiFi community is divided into two tribes — and I use tribal deliberately, because tribes define themselves by disagreement. Leaving the tribe is a social act, not just an intellectual one. I have said things I later regretted from both sides of that divide. I suspect most of you have too.

The subjectivist tribe: audio is art, measurements are irrelevant, no explanation is too absurd provided it avoids actual data. You already know the reviewer quote.

The objectivist tribe: standard metrics are sufficient, double-blind testing is the only admissible evidence, and almost nothing is audible. And yet, paradoxically, they spend enormous effort optimising the same short list of metrics: THD, IMD, noise floor. Decade after decade. Regardless of whether those metrics capture what the ear is actually sensitive to.

So we have a field that simultaneously argues nothing is audible and yet spends enormous resources trying to make things sound better. At some point we should ask whether those two activities are consistent. Does this sound reasonable? Shaw would know what to call it.

What bothers me about both tribes is not that they disagree. It’s that neither is curious. Both have found a way to stop the inquiry and call it a conclusion.

3. THE EAR, TIME, AND THE MISSING DIMENSION

Our tools for characterising the auditory system are much more limited than the way they get used would suggest.

Take masking curves — a cornerstone of the objectivist arsenal. A masking study establishes at what level one tone obscures another. The objectivist cites the result, notes that a distortion product falls below the threshold, and concludes: inaudible. Case closed.

But those curves were derived from simple stimuli — two tones, controlled laboratory conditions. How does that extrapolate to a complex music signal, with dozens of simultaneous frequency components, temporal variation, and a listener engaged in active musical perception? The honest answer: we don’t fully know. The extrapolation is assumed, not demonstrated.

Now consider what the ear can actually do. The auditory system can localise sound direction at frequencies as low as 30 Hz — a wavelength of eleven metres — using inter-aural time differences measured in microseconds. Ten to twenty microseconds. Our hearing has been shaped by evolution to detect things our signal-processing models don’t account for. The ear is not a spectrum analyser. It is a pattern-recognition system with extraordinary temporal resolution, and we understand it poorly.
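To put those figures side by side, here is a rough back-of-envelope sketch. It assumes a speed of sound of 343 m/s (a typical room-temperature value, not stated in the talk) and simply expresses a 10 to 20 microsecond inter-aural time difference as a phase angle at 30 Hz.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def wavelength_m(freq_hz):
    """Wavelength in metres of a tone at freq_hz."""
    return SPEED_OF_SOUND / freq_hz

def itd_as_phase_deg(itd_s, freq_hz):
    """Phase difference in degrees that a time difference represents at freq_hz."""
    return 360.0 * itd_s * freq_hz

wl = wavelength_m(30.0)
phase_10us = itd_as_phase_deg(10e-6, 30.0)
phase_20us = itd_as_phase_deg(20e-6, 30.0)

print(f"Wavelength at 30 Hz: {wl:.1f} m")
print(f"10 us at 30 Hz: {phase_10us:.3f} degrees of phase")
print(f"20 us at 30 Hz: {phase_20us:.3f} degrees of phase")
```

On these assumptions the time differences in question correspond to a fraction of a degree of phase, which gives a sense of how fine the temporal resolution is.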

Here is the specific failure. The ear’s temporal resolution is not a footnote — it is the foundation of spatial hearing. Psychoacoustics did not forget this. It built entire theories of binaural perception around it. But when it came to distortion audibility, the field defaulted almost entirely to frequency-domain tools. Masking curves ask: at what amplitude does one tone hide another?

They do not ask: at what temporal displacement does a distortion product become audible? That question was never built into the framework — not because it was answered, but because the tractable experiments did not require it. A field chose its subspace based on what it could measure. The dropped coordinate was time.

The proof is sitting in our own toolbox — and it is almost funny. The exponential sine sweep, the standard method we use to generate THD plots, works by spreading harmonics out across time. The harmonics of the sweep arrive at the listener’s ear at different times.

Temporally separated like that, people routinely detect distortion products even at minus 80 decibels. By ear and without effort. And before the analytical mind has time to form an opinion. Then the objectivist reads the graph — and the graph says: inaudible. The instrument that generates that verdict is powered by exactly the temporal sensitivity it ignores. That is not a footnote. That is the point.
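The time separation can be made concrete with a minimal sketch. It assumes the usual exponential sweep formulation, in which frequency rises exponentially from a start to a stop frequency over a fixed duration, so that harmonic number k is offset from the fundamental at the same frequency by T·ln(k)/ln(f2/f1). The 10 second sweep from 20 Hz to 20 kHz is a hypothetical example, not a figure from the talk.

```python
import math

def harmonic_lead_s(order, sweep_s, f_start_hz, f_stop_hz):
    """Time offset between harmonic `order` and the fundamental at the
    same frequency, for an exponential sine sweep."""
    return sweep_s * math.log(order) / math.log(f_stop_hz / f_start_hz)

# Hypothetical sweep: 10 s, 20 Hz to 20 kHz
leads = {k: harmonic_lead_s(k, 10.0, 20.0, 20000.0) for k in (2, 3, 4, 5)}

for k, lead in leads.items():
    print(f"Harmonic {k}: offset {lead:.2f} s")
```

With these numbers the second harmonic is separated from the fundamental by roughly a second, so a listener hears it with nothing at that frequency to cover it at that moment.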

Group delay from crossovers is another candidate. The classic experiments used static stimuli and concluded inaudible below certain thresholds. But group delay from a crossover acts on transients — it temporally smears the attack of any event that straddles the crossover frequency. That is not a static phenomenon. It would not surprise me if the threshold derived from steady-state experiments simply does not apply to dynamic musical material. The pattern would be familiar.
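For a sense of scale, here is a sketch under a stated assumption: a fourth-order Linkwitz-Riley crossover, which the talk does not name. The summed low-pass and high-pass outputs of that topology form a second-order all-pass with Q = 1/√2, whose group delay has a closed form. The 80 Hz crossover frequency is a hypothetical example.

```python
import math

def lr4_sum_group_delay_s(freq_hz, crossover_hz):
    """Group delay of the summed outputs of a 4th-order Linkwitz-Riley
    crossover, modelled as a 2nd-order all-pass with Q = 1/sqrt(2)."""
    q = 1.0 / math.sqrt(2.0)
    w = 2.0 * math.pi * freq_hz
    w0 = 2.0 * math.pi * crossover_hz
    numerator = 2.0 * (w0 / q) * (w0**2 + w**2)
    denominator = (w0**2 - w**2) ** 2 + (w * w0 / q) ** 2
    return numerator / denominator

fc = 80.0  # hypothetical crossover frequency in Hz
delay_at_fc = lr4_sum_group_delay_s(fc, fc)
delay_at_8k = lr4_sum_group_delay_s(8000.0, fc)

print(f"Group delay at {fc:.0f} Hz: {delay_at_fc * 1e3:.2f} ms")
print(f"Group delay at 8 kHz: {delay_at_8k * 1e6:.2f} us")
```

Under these assumptions the low end of a transient is delayed by several milliseconds relative to its high-frequency content, which is the smearing of the attack described above.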

When you properly control bass room modes by eliminating the temporal ringing, the crossover time smearing should become more audible, not less. The room was providing its own temporal contamination. If that is right, some of the ‘inaudible’ verdicts on crossover group delay were earned in rooms that were doing part of the masking work. To my knowledge that experiment has not been cleanly run. It would be worth doing.

And underneath all of this: when a study reports a result is not statistically significant, that does not mean the effect is absent. It means the study failed to reject the null hypothesis — which could mean the effect isn’t real, or it could simply mean the study was not powerful enough to detect it.

“Not proven audible in this study” is routinely translated into “not audible.” That translation is wrong. Absence of evidence is not evidence of absence. We say this but do not act like it.
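The point about statistical power can be illustrated with a small sketch. It assumes a forced-choice listening test scored with a one-sided binomial test at the 5% level, and a hypothetical listener who really does hear the difference but only gets 60% of trials right. None of these numbers come from a specific study.

```python
from math import comb

def binom_tail(n, k_min, p):
    """Probability of at least k_min successes in n trials with success rate p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

def critical_count(n, alpha=0.05):
    """Smallest number of correct answers that is significant under pure guessing."""
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    return n + 1  # no achievable score is significant with this few trials

def power(n, p_true, alpha=0.05):
    """Chance that a listener with true hit rate p_true reaches significance."""
    return binom_tail(n, critical_count(n, alpha), p_true)

needed = critical_count(16)
chance_to_detect = power(16, 0.6)

print(f"16 trials: need at least {needed} correct")
print(f"Power for a 60% listener: {chance_to_detect:.2f}")
```

In this sketch a real but modest effect goes undetected in most 16-trial runs, which is why a null result from a small test says little about audibility.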

4. ENGINEERING, AND THE HYSTERESIS STORY

Audio is not art and it is not science. It is engineering. That distinction matters because it changes how you decide what to do.

In engineering, you design your measurements for the specific failure modes of the specific device. You look hard for bad news. You take results seriously even when — especially when — they don’t fit the consensus.

At PURIFI, we started from a simple assumption: reduce distortions everywhere, including mechanisms the conventional wisdom classifies as benign. Not because we had proof they mattered. Because we could not see any good reason to leave them there — and because we were not confident the existing tools were looking in the right place.

One mechanism we focused on was magnetic hysteresis distortion in loudspeaker motors. You may have heard this in a system without knowing what it was. A granular texture in the sound.

A blanket of fuzz that stays just this side of audible — taunting and infuriating, like an itch you cannot scratch. That is hysteresis.

Measure it with a standard single-sine test and it looks unremarkable — modest harmonic distortion figures, nothing alarming.

But hysteresis has memory. The distortion it produces depends on the history of the signal, not just its current state. Every time the signal reverses direction, a trace is left. Every time the signal passes a previous turning point, that trace is erased — and a voltage step is produced.

The step happens not because of what the signal is doing now, but because of what it did a while ago. Cause and effect are decoupled in time.

This is precisely why standard masking curves miss it. Masking assumes temporal coincidence — that a distortion product is present at the same moment as the masking signal. Hysteresis violates that assumption entirely. The ear already told us it is sensitive to exactly this. The measurement was just not designed to look.
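The history dependence can be shown with a toy model. The sketch below uses a single play (backlash) operator, a standard building block in textbook hysteresis models. It is an illustration of memory only, not a model of a loudspeaker motor and not the method used at PURIFI. The half-width of 0.2 and the input sequences are hypothetical.

```python
def play_operator(samples, half_width, state=0.0):
    """Pass samples through a play (backlash) element.
    The output only moves when the input pushes against one side of a
    dead band of width 2 * half_width, so it remembers past reversals."""
    out = []
    for x in samples:
        state = max(x - half_width, min(x + half_width, state))
        out.append(state)
    return out

# Same final input value (0.5), reached by two different histories
via_peak = play_operator([0.0, 1.0, 0.5], half_width=0.2)
direct = play_operator([0.0, 0.5], half_width=0.2)

print(f"Output at 0.5 after a peak of 1.0: {via_peak[-1]:.2f}")
print(f"Output at 0.5 approached directly: {direct[-1]:.2f}")
```

The input ends at the same value in both runs, yet the outputs differ, because the element still carries the trace of the earlier turning point. A single-sine test repeats the same history every cycle and so exercises only one path through such a system.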

Many people report hearing a difference when hysteresis is substantially reduced, even at levels where the conventional measurement would suggest nothing of concern. The gap between what the measurement captures and what listeners report seems worth taking seriously.

That is the engineering approach: build a better test for a specific problem, look hard, report honestly what you find, and resist the temptation to dismiss what you cannot yet explain.

5. FLOYD TOOLE — A GIANT, AND WHAT HAPPENED TO HIS CAVEATS

I want to discuss a specific body of research carefully — because the researcher is someone I know and whose work I genuinely respect. I should also say clearly: I am not a psychoacoustician. I am an engineer and, in this area, essentially a curious observer. But I think observers are allowed to ask questions. That is, in fact, the call to action of this talk.

Floyd Toole’s contribution to loudspeaker research is enormous. Decades of rigorous empirical work at Harman: controlled listening experiments, preference ratings, statistical models. He was careful about limitations.

He coined the term circle of confusion to describe the interlocking dependencies between recording practice, room acoustics, and loudspeaker that make it hard to isolate what is a property of the speaker from what is an artefact of context. He knew what his data could and could not support.

The problem is not what Toole claimed. The problem is what accumulated around his work over time — independently of him. I recently read a post on a prominent audio forum that said, paraphrasing only slightly: “the science is now settled, so why aren’t all speakers designed to perfection?”

That sentence contains multiple serious errors of reasoning. The most revealing: the science is not settled, and Toole never claimed it was. The canonical preference targets were derived from experiments conducted with the loudspeakers that existed at the time.

Several questions follow that have not been cleanly answered. Were the test speakers representative of what is now achievable? Is listener preference even unimodal — or are we averaging over genuinely different clusters whose mean satisfies nobody? Does preference shift with culture and time? The methodology was designed to find a single stable target. Whether that target actually exists is a different question.
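The unimodality question can be illustrated with invented numbers. The sketch below assumes two hypothetical listener clusters, one preferring a spectral tilt around -3 dB and one around +3 dB, and a tolerance of 1 dB. The data are made up purely to show the arithmetic.

```python
from statistics import mean

# Hypothetical preferred tilt per listener, in dB (two clusters)
preferred_tilt_db = [-3.2, -3.0, -2.8, -3.1, 2.9, 3.0, 3.2, 3.1]

average_target = mean(preferred_tilt_db)

# Listeners whose own preference lies within 1 dB of the averaged target
satisfied = [p for p in preferred_tilt_db if abs(p - average_target) <= 1.0]

print(f"Averaged target: {average_target:+.2f} dB")
print(f"Listeners within 1 dB of it: {len(satisfied)} of {len(preferred_tilt_db)}")
```

If preference really were clustered like this, the averaged target would sit where no individual listener's preference lies.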

Every preference study was conducted in a room. A room with uncontrolled modes below two or three hundred hertz. The listeners were hearing speaker-plus-room, not speaker alone. Their preferences were for the least-bad version of what an uncontrolled room could deliver.

That circle of confusion has a temporal dimension that was never fully unpacked. Room modes are not merely a frequency-domain problem — they are fundamentally a timing problem. A low-frequency mode does not just colour the spectrum. It makes energy arrive late, smears transient attack, corrupts the temporal signal envelope.
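How late the energy arrives can be estimated with a short sketch. It assumes a room mode behaves as a simple second-order resonance, whose amplitude envelope decays as exp(-π·f·t/Q). The 40 Hz frequency and Q of 20 are hypothetical values for an untreated room.

```python
import math

def mode_decay_time_s(freq_hz, q, decay_db=60.0):
    """Time for a second-order resonance to decay by decay_db,
    using the amplitude envelope exp(-pi * freq_hz * t / q)."""
    return (decay_db / 20.0) * math.log(10.0) * q / (math.pi * freq_hz)

freq_hz = 40.0   # hypothetical mode frequency
q_factor = 20.0  # hypothetical mode Q

t60 = mode_decay_time_s(freq_hz, q_factor)
cycles = t60 * freq_hz

print(f"60 dB decay time: {t60:.2f} s ({cycles:.0f} cycles)")
```

With these values the mode keeps ringing for about a second after the note has stopped, which is a timing error far larger than anything the loudspeaker itself contributes.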

The listeners were preferring the speaker that most gracefully managed temporal contamination. That is a different experiment than we think we ran — and it connects directly back to everything we discussed about the ear and time.

The research is valid — but it is answering a different question than we think. The room was the elephant in the room. Everyone knew it was there. Nobody could move it. So everyone worked around it.

The Archimedes project at DTU in the 1990s — a collaboration between B&O, KEF, and DTU, just down the road from where I was doing my PhD — asked exactly the right questions about how room acoustics affect timbre.

The infrastructure was extraordinary: a spherical loudspeaker array in an anechoic chamber, simulating controlled room conditions with great precision.

Serious work, seriously done. But the specific question of listener preference when room modes are controlled has still, to my knowledge, not been cleanly answered. Correct me if I am wrong.

6. SURROGATE MARKERS, ANCEL KEYS, AND THE REAL ENDPOINT

From other fields we know the concepts of surrogate markers and endpoints. The real endpoint is what actually matters, but it can be hard to measure, or you only find out too late. So you measure something else that you hope is related. A job interview is a surrogate marker for job performance: useful, but not the same thing. It is very easy, and very human, to forget that distinction and start treating the marker as if it were the endpoint itself.

In medicine, surrogate markers have a long and humbling history. Ancel Keys built fifty years of dietary guidance on LDL cholesterol as a proxy for cardiovascular risk. LDL just turned out to be a mediocre surrogate marker.

In audio, THD is our LDL. A useful proxy for sound quality — measurable, optimisable — but capturing only a fraction of what the ear actually responds to.

And here is the sharpest difference between our field and medicine: in medicine the hardest endpoint is death, which is final, binary, impossible to fake. Our endpoint is the experience of beauty. If anything, that should make us more humble about our surrogates, not less. As doctors say: are we treating the numbers or the patient?

7. WHAT IS AT STAKE

Here is what is actually at stake. Canon we need — it is the best foundation we have, hard-won and worth building on. Dogma is what a canon becomes when the caveats get forgotten, the questions stop, and the findings get treated as eternal truth rather than the best available answer.

When that happens, the tribes form to defend it. And when tribes form, the people finding results that don’t fit have no legitimate channel to report them. The anomaly doesn’t update the model. It disappears.

As the Danish designer and critic Poul Henningsen — another unreasonable man, a generation after Nyrop — once put it: the future comes by itself. Progress does not.

8. CALL TO ACTION — KLIPPEL, MILLIKAN, AND WHAT GOOD LOOKS LIKE

I want to end with a concrete picture of what intellectual honesty in engineering actually looks like.

In 2016, colleagues at PURIFI and DTU published an AES paper on force factor modulation in loudspeaker motors — a distortion mechanism we believed was being substantially underestimated. I presented the work. Wolfgang Klippel, one of the most respected names in loudspeaker measurement, disagreed. Strongly.

We exchanged a long series of emails. He challenged our methods rigorously; we defended our evidence. At the end of it, he accepted the new data, thanked us, and immediately updated his training slide deck. The next time he taught the subject, the corrected understanding was in his material.

Now, I mentioned 1905, and this building. That same year, in a patent office in Bern, a 26-year-old clerk published four papers that nobody took seriously but that quietly overturned the establishment he was working outside of. His name was Albert Einstein.

Robert Millikan, a great experimental physicist, found Einstein’s photoelectric equation not just wrong but, in his words, “reckless” and “wholly untenable.” He spent ten years designing meticulous experiments specifically intended to prove Einstein wrong.

His data confirmed Einstein’s predictions with a precision he had not expected and could not deny. He published the results honestly. He told Einstein he was right. He won the Nobel Prize in 1923, partly for that work, which he had begun in order to bury the very equation it ended up confirming.

That is one part of the model — the willingness to follow the evidence even when it leads somewhere you did not want to go.

But there is a second part. Einstein and Bohr disagreed about the meaning of quantum mechanics for thirty years — a debate conducted right here in Copenhagen. They did not dismiss each other.

They pushed each other to greater precision, through thought experiments and real experiments, through papers and arguments that sharpened both men’s thinking.

That sustained, respectful, evidence-driven disagreement is how the physics got done.

That is the full model. Not just updating when the evidence demands it — but engaging seriously, for as long as it takes, with those who see it differently.

NOT TRIBES. COLLEAGUES. CURIOUS.

We are engineers. We have a job.

The future comes by itself. Progress does not.

Thank you.
