What enhancing can do is make audio sound “clearer”, in the sense of “less noisy”. Making it “clearer” in the sense of “more intelligible” requires a transcript.
A segment from the 2016 film The Case of: JonBenét Ramsey shows how a transcript and enhancing work together. The film revisits the unsolved 1996 murder of a six-year-old beauty queen in the USA. The audio you listened to is one of several pieces of evidence purporting to show that the child’s family was implicated in her murder.
The video below begins after 12 minutes, and the enhancing segment ends at 14 minutes and 37 seconds.
Judging from public reaction, many viewers accept that the four phrases were “revealed” by the “enhancing” – but is that really what happened?
At Step 1 of the experiment, the audio was played “cold” – with no contextual information – to 78 participants. Half listened to the film’s original and half to its enhanced version. No one in either group heard anything remotely like any of the phrases. Most didn’t even hear human speech (did you?).
The effect of priming listeners with a transcript is demonstrated by Step 2 of the experiment, where participants were given a transcript. After failing to hear any of the four phrases while listening cold, nearly half now agreed they could hear at least one of them.
Here’s what’s important
Participants who were primed by the transcript while listening to the enhanced audio were more likely (63% vs 24%) than those listening to the original to accept more of the phrases, and to do so more confidently.
That would show a good effect of enhancing if the transcript were a reliable account of what was actually said. But is it? To answer that, consider where the phrases came from.
The movie portrays the investigators spontaneously hearing the phrases as the audio is enhanced. But that is disingenuous.
There is good evidence the phrases originate from police in the 1990s listening to noises at the end of a cassette copy of the 911 call in which the child’s disappearance was reported.
So what are those noises?
If you listen to the whole call (start at 6 minutes 34 seconds in the movie above), it seems likely they are the sound of the agent typing up information provided by the caller. Interestingly, some commentators provide evidence (not tested in court) suggesting that, when the audio was transferred to the cassette during the investigation, it was processed in ways that made the typing sound more like speech.
Be that as it may, Step 1 of the experiment makes clear that the movie’s “enhancing” has no effect whatsoever in revealing the phrases. That effect is entirely the work of priming by their (misleading) transcript.
The same thing happens in real trials
The movie’s flashy visuals and sensational tone seem far removed from a courtroom. Yet the way the movie presents the audio is very similar to how audio is presented in a trial.
In trials, as in the movie, listeners hear an enhanced version of indistinct audio with the “assistance” of a police transcript.
The problem with this can be explained via an analogy from forensic image enhancement. Consider the very indistinct number plate below, and an enhancement that looks “clearer”. Does it help you see DUN 150J?
Knowing “ground truth” – the absolute, undisputed truth – about the real number plate makes it easy to see that, while the enhancing may have made the indistinct image look “clearer”, it has not thereby made it closer to reality.
The problem, of course, is that in a trial, ground truth is not known. The court has only an indistinct original and a “clearer” enhancement.
With no access to ground truth, it is impossible for the jury to discover that the apparently clearer enhancement is no closer to reality than the blurry original.
And all this is exactly true of audio
Does that mean enhancing is never effective?
Audio enhancing can sometimes be useful. It can also be ineffective – or even misleading. In the present case, for example, it misleadingly made typing sound like speech, at least to some listeners.
The point is that, in the absence of ground truth, the effectiveness of enhancing cannot be reliably determined simply by asking listeners whether the audio sounds clearer.
Yet that is the sole criterion used in our courts.
According to our legal system, evaluating the effectiveness of enhancing is a matter for the jury, who are invited to listen to the enhancement and use it if it sounds clearer to them.
But the experiment shows that making audio “clearer” can have the opposite effect to the one intended. That’s because less noisy audio makes an unreliable transcript seem more believable than the original audio does.