Next-Generation Audio Processors Will Mimic the Human Brain
Have you tried using Siri, Google Voice Search or Dragon NaturallySpeaking in a noisy environment? Even with the latest noise-cancelling technology, a smartphone struggles to hear you, because analyzing a sound spectrum that mixes signal with noise demands serious computing power. Yet the human brain does it with ease, as we all know from listening to a friend in a crowded restaurant or hearing a spouse call out above the din in a stadium.
To a computer, audio input is a forest of frequencies, constantly changing over time. Identifying the target signal is like looking for a needle in a haystack. Typically, the computer checks every straw in the haystack -- a wasteful way to search -- which is why a smartphone has to send audio signals to a powerful central server for processing.
Curious how the brain succeeds at this challenge, engineers at A*STAR Institute for Infocomm Research in Singapore set out to understand the brain's audio processing algorithm in order to imitate it. A news release titled "Following the Brain's Lead" describes how a team led by Jonathan Dennis figured out a brain-mimicking approach:
"The method proposed in our study may not only contribute to a better understanding of the mechanisms by which the biological acoustic systems operate, but also enhance both the effectiveness and efficiency of audio processing," comments Huajin Tang, an electrical engineer from the research team.
The brain, he says, doesn't analyze what it doesn't have to. If it decides to follow a baritone voice, for example, it doesn't process the high frequencies. The first-pass filter is a set of neurons that are "feature-sensitive" to frequency over time. Downstream "decision-making" neurons learn what the target sounds like, allowing the brain to selectively filter out the rest, saving on computation.
Tang's team tried their hand at this method. They analyzed a noisy sound spectrum for pre-determined targets, like the characteristic frequencies of a speaker's voice, or a periodic bell tone. Then, they analyzed only the frequencies surrounding the signal, instead of the whole audio spectrum. In this way, they could extract the target signals even when noise was present.
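To see why this saves work, consider a rough sketch (my own illustration, not the team's code): instead of examining the whole spectrum, measure energy only in a narrow band around a known target frequency. The 440 Hz tone, sample rate and bandwidths below are arbitrary choices for the example:

```python
import numpy as np

def band_energy(signal, sample_rate, target_hz, half_width_hz):
    """Measure energy only in a narrow band around a known target frequency."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mask = np.abs(freqs - target_hz) <= half_width_hz
    return float(np.sum(spectrum[mask] ** 2))

# A 440 Hz tone buried in broadband noise
rate = 16000
t = np.arange(rate) / rate
rng = np.random.default_rng(0)
noisy = np.sin(2 * np.pi * 440 * t) + 0.5 * rng.standard_normal(rate)

# The band around the target carries far more energy than an
# equally wide band elsewhere, even with noise present
on_target = band_energy(noisy, rate, 440, 20)
off_target = band_energy(noisy, rate, 3000, 20)
print(on_target > off_target)  # True
```

Checking 41 frequency bins instead of thousands is the computational saving the researchers are after; the hard part, which their method addresses, is knowing which bands to watch.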
To improve the detection over time, the researchers fed matching frequency patterns into a neurological algorithm that mimics the way the brain learns through the repetition of known patterns.
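A minimal sketch of that learn-by-repetition idea, using a simple running-average template and normalized correlation rather than the paper's actual neural model; the spectral signatures here are made up:

```python
import numpy as np

def update_template(template, example, rate=0.1):
    """Learning by repetition: nudge the stored template toward each new example."""
    return (1 - rate) * template + rate * example

def matches(template, candidate, threshold=0.8):
    """Detect the target via normalized correlation with the learned template."""
    a = template / (np.linalg.norm(template) + 1e-12)
    b = candidate / (np.linalg.norm(candidate) + 1e-12)
    return float(a @ b) >= threshold

rng = np.random.default_rng(1)
true_pattern = np.array([0.0, 1.0, 0.2, 0.9, 0.1])   # hypothetical spectral signature
unrelated = np.array([1.0, 0.0, 0.8, 0.1, 0.9])      # a different signature

# Each noisy repetition of the target sharpens the stored template
template = np.zeros_like(true_pattern)
for _ in range(50):
    template = update_template(template, true_pattern + 0.1 * rng.standard_normal(5))

print(matches(template, true_pattern))  # True: the repeated pattern is recognized
print(matches(template, unrelated))     # False: an unrelated pattern is rejected
```

The running average plays the role the repetition of known patterns plays in the brain-inspired algorithm: familiar inputs reinforce the template, so detection improves over time.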
The algorithm was successful. It could robustly identify the target signal, even in the presence of noise, with less computation. The team published its results in the proceedings of the 2013 International Conference on Acoustics, Speech, and Signal Processing.
Widespread use of this biomimetic algorithm will help your future smartphone become a better listener, even in that noisy restaurant or stadium. "...beyond that, it could also include touch, vision and other senses," Tang predicts.
The Half Has Not Been Told
The brain is even better than that, however. Those "decision-making" neurons, for instance, must have a feedback mechanism to tell the upstream "feature-sensitive" neurons to quench the input that is not of interest.
The brain can also switch its targets almost instantaneously. A good listener can pick out and follow the individual notes of an oboe, violin or tuba from the wall of sound of a full orchestra playing fortissimo. And every mother knows her child's voice, switching attention to it in a crowd at Disneyland even when in the midst of conversation with a friend. This implies the mind's ability to tell the brain's decision-making neurons what to focus on.
The better the incoming audio spectrum, the better the brain's algorithm works. An article on Medical Xpress describes the difficulty those with cochlear implants have in picking out speech from background noise. Remarkably, the visual part of the brain helps compensate. Patients who practiced lip reading while hearing through a cochlear implant improved their speech comprehension:
Vision thus supplies additional information that is crucial to understanding language, particularly in noisy environments where patients with cochlear implants sometimes find it difficult to distinguish words. Sight and hearing act together and in total synergy, thus helping to gradually improve patients' ability, as they recover, to decipher the words coded by the implant.
Fortunately, for those with undamaged ears, human hearing approaches auditory perfection. The human ear can detect intensities over 12 orders of magnitude (a trillion to one), responding to air vibrations as faint as half the diameter of a hydrogen atom. And that's just the receiver. The amps, mixers and processors will keep biomimetics engineers busy for a long time.
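For reference, that intensity range converts to decibels by the standard formula (the conversion is mine, not from the article):

```python
import math

# The ear's "12 orders of magnitude (a trillion to one)" intensity range,
# expressed in decibels: dB = 10 * log10(I / I0)
dynamic_range_db = 10 * math.log10(1e12)
print(dynamic_range_db)  # 120.0 -- roughly the span from hearing threshold to pain
```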
Here we see a remarkable capability in the brain involving many different component parts. The outer ear, middle ear, and cochlea, with all their complex parts, provide audio input to the auditory nerve. That nerve bundle feeds into "feature-sensitive" neurons able to dissect the spectrum of sound into its components. Several layers of processing neurons further analyze the features. Then, the "decision-making" neurons decide the target signal to focus on. All this works in "total synergy" with the visual system.
Could such a complex, interrelated computational mechanism arise by mutations and unguided selection? If so, the engineers might as well toss components onto a vibrating table and wait for something interesting to emerge. No; they studied "biological acoustic systems" because they know a well-designed system when they hear one. Imitation is the sincerest form of flattery.
Through biomimetics, these researchers are advancing science, both by contributing to the understanding of the brain's operation, and by improving audio processing technology. It's one more example of how science progresses through design-based approaches.
Image: Cochlea of the inner ear; Wellcome Images/Flickr.