Opinion: the role of a DSP in conferencing
23 November 2016
Tim Root, CTO and EVP business development at Revolabs, discusses how DSP capabilities can enhance audio for a successful collaboration experience.
Audio is the bedrock of all meetings: without quality audio, the entire collaboration event can quickly begin to unravel and become frustrating and unproductive. The digital signal processor (DSP) plays a crucial role in the conference room and can transform the audio experience.
A DSP manages the audio inputs and outputs in a conferencing system, processing all the audio signals to deliver clear, intelligible speech to meeting participants. The backbone of conference audio, a DSP takes many forms. Conference phones typically have integrated DSP chips that provide the necessary processing power and other technologies to ensure a quality experience. Larger collaboration environments employ a greater number of standalone microphones and loudspeakers, and require a dedicated DSP appliance. This discrete unit has the processing horsepower needed for the numerous audio inputs and outputs in the room and across the UC network.
When meeting participants connect via audio endpoints, a DSP’s first job is typically to perform acoustic echo cancellation (AEC). When you hear your own voice echoing from your local loudspeaker, the communication device at the other end has failed to echo-cancel your signal. Your voice is played through the loudspeaker on the far end (the meeting space you’re connected to), picked up by the far-end microphone(s) and sent back to your speaker. AEC hardware and software compare the far-end microphone’s signal with your incoming voice as it is played out there, and subtract that echo from the microphone’s signal before passing it on.
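The compare-and-subtract step can be sketched with a normalised least-mean-squares (NLMS) adaptive filter, one common textbook approach to AEC. This is a minimal illustration, not a description of any particular product: the function name, tap count, step size and the simulated room echo path are all assumptions made for the example.

```python
import random

def nlms_echo_cancel(loudspeaker, mic, taps=16, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic signal."""
    weights = [0.0] * taps   # running estimate of the room's echo path
    history = [0.0] * taps   # most recent loudspeaker samples, newest first
    out = []
    for x, d in zip(loudspeaker, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate            # what remains after echo removal
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        out.append(e)
    return out

def power(signal):
    return sum(s * s for s in signal) / len(signal)

# Hypothetical test signal: noise stands in for the incoming far-end voice.
random.seed(0)
loudspeaker = [random.uniform(-1.0, 1.0) for _ in range(4000)]
room = [0.0, 0.6, 0.0, 0.3, 0.1]         # assumed echo path, for illustration only
mic = [sum(room[k] * loudspeaker[n - k] for k in range(len(room)) if n - k >= 0)
       for n in range(len(loudspeaker))]

cleaned = nlms_echo_cancel(loudspeaker, mic)
print("echo power before:", power(mic[-500:]))
print("echo power after: ", power(cleaned[-500:]))
```

In this simulation nobody is speaking in the room, so the microphone hears only echo and the filter should drive the output towards silence once it has adapted. A real canceller also has to cope with people talking at both ends at once, which this sketch does not attempt.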
Once echo is removed, a DSP provides automatic gain control (AGC). Here, the DSP should amplify each incoming microphone signal to approximately the same level. Without AGC, users on the far end either get blasted by a loud person or strain to hear a softly spoken person. AGC adjusts the volume of voices throughout the conversation to a comfortable level.
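As a rough sketch of the idea, an AGC can measure the level of each short frame of audio and steer a gain towards whatever value brings that frame to a target level. The target level, frame size, gain ceiling and smoothing factor below are illustrative assumptions rather than values from any real system.

```python
import math

def agc(samples, target_rms=0.1, frame=160, max_gain=20.0, smoothing=0.9):
    """Scale a signal frame by frame so its level drifts towards target_rms."""
    out = []
    gain = 1.0
    for start in range(0, len(samples), frame):
        block = samples[start:start + frame]
        rms = math.sqrt(sum(s * s for s in block) / len(block))
        if rms > 1e-4:                   # leave the gain alone during near-silence
            desired = min(target_rms / rms, max_gain)
            # Move gradually so the volume does not jump between frames.
            gain = smoothing * gain + (1.0 - smoothing) * desired
        out.extend(s * gain for s in block)
    return out

def tone(amplitude, n=16000, freq=400.0, rate=8000.0):
    """Stand-in for a talker: a steady tone at the given amplitude."""
    return [amplitude * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

def rms(signal):
    return math.sqrt(sum(s * s for s in signal) / len(signal))

quiet_talker = agc(tone(0.02))
loud_talker = agc(tone(0.8))
print(rms(quiet_talker[-1600:]), rms(loud_talker[-1600:]))
```

Both stand-in talkers should end up close to the same level, even though one started forty times louder than the other. The gain ceiling keeps the sketch from amplifying a very faint signal without limit.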
Mixing and gating
A DSP also provides mixing and gating technology. If all the microphones have to feed to one output (such as a telephone line or a videoconference appliance), each microphone’s input has to be mixed into the output signal. The outgoing signal(s) may be routed through multiple outputs simultaneously or independently, such as the internet, a phone line or a recording system. ‘Gating’ means the system can selectively deactivate certain microphones. For example, some advanced DSP algorithms can distinguish between human speech and noises such as typing, paper crinkling, and finger tapping, and ignore signals from a noisy mic until speech is detected on it again.
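A much-simplified version of gating and mixing can be sketched as follows. It uses a plain loudness threshold to decide which microphones are open, which is far cruder than the speech-versus-noise detection described above. The function names and the threshold value are assumptions for illustration.

```python
import math

def frame_rms(frame):
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def gated_mix(mic_frames, threshold=0.02):
    """Mix one frame per microphone, leaving out mics below the gate threshold."""
    open_mics = [f for f in mic_frames if frame_rms(f) >= threshold]
    if not open_mics:
        return [0.0] * len(mic_frames[0])    # every gate closed: send silence
    # Average the open microphones sample by sample.
    return [sum(samples) / len(open_mics) for samples in zip(*open_mics)]

# Hypothetical frames: one mic with a talker, one picking up faint hum.
talker = [0.3 * math.sin(2 * math.pi * i / 20) for i in range(160)]
hum = [0.005] * 160

mixed = gated_mix([talker, hum])
print(mixed == talker)
```

Because the hum falls below the threshold, only the talker's microphone reaches the output. A threshold gate like this would still open for loud typing or paper noise, which is why more capable systems look at the character of the sound and not just its level.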
Output signals also have to be defined and encoded by the DSP. Depending on the destination, the DSP sends analogue signals out to speakers or an amplifier, a digital stream to a VoIP phone system, or a digital stream to USB or Ethernet ports (in AVB or Dante formats, for example). Whether the microphone signals are mixed or unmixed, the DSP must encode and decode them (using a codec) in the industry-standard format required by each channel and protocol. In digital conferencing systems, the DSP and codec convert analogue speech signals to digital and back again. Digital transmission of data increases the potential volume of traffic over a network. The DSP and codec can provide security by encrypting the data stream while also potentially reducing the required transmission bandwidth.
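To give a feel for how a codec can trade bits for bandwidth, here is a small sketch of mu-law companding, the idea behind the G.711 telephony codec. It is chosen purely as a familiar example, since the article does not name a specific codec. The code applies the continuous mu-law curve and rounds to 8 bits; it does not reproduce the exact G.711 bit layout, and the function names are made up for the illustration.

```python
import math

MU = 255  # companding constant used by G.711 mu-law

def mu_law_encode(x, mu=MU):
    """Map a sample in [-1, 1] to an 8-bit code (0..255)."""
    x = max(-1.0, min(1.0, x))
    y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return int(round((y + 1.0) / 2.0 * mu))

def mu_law_decode(code, mu=MU):
    """Map an 8-bit code back to an approximate sample in [-1, 1]."""
    y = 2.0 * code / mu - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

for sample in (0.01, 0.5, -1.0):
    code = mu_law_encode(sample)
    print(sample, code, mu_law_decode(code))
```

Each sample travels as 8 bits, where linear PCM audio commonly uses 16, so the stream needs half the bandwidth. The logarithmic curve spends more of those 8 bits on quiet sounds, which is where speech most needs the resolution.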
Lastly, a DSP has other functions to improve the experience. Noise reduction capability reduces ambient noise in the microphone pickup signal, preventing it from being heard on the far end. On the other hand, noise fill injects low-level noise so listeners don’t think they’ve been disconnected when the line goes quiet because no one is speaking. Master mute and volume control add further control for larger, complex installations.
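Noise fill is simple enough to sketch in a few lines: when a frame is effectively silent, substitute a whisper of random noise so that the line never sounds dead. The threshold and noise level here are illustrative assumptions, and a real system would more likely shape the noise to match the room.

```python
import random

def noise_fill(frame, silence_threshold=0.001, noise_level=0.002, rng=random):
    """Pass speech through unchanged; replace near-silent frames with faint noise."""
    peak = max(abs(s) for s in frame)
    if peak >= silence_threshold:
        return list(frame)
    return [rng.uniform(-noise_level, noise_level) for _ in frame]

rng = random.Random(1)                   # fixed seed so the example is repeatable
speech = [0.2, -0.1, 0.05, 0.0]
silence = [0.0] * 4

print(noise_fill(speech, rng=rng))
print(noise_fill(silence, rng=rng))
```

The speech frame comes back untouched, while the silent frame is replaced with samples too small to be distracting but large enough to be heard as a live line.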
Whether the conferencing application is small or large, a DSP plays a vital role by fine-tuning and optimising the audio signals flowing in and out of the room. The result is clear, intelligible speech that everyone can hear for successful, productive collaboration.