Slides for the presentation I gave at ClueCon 21 on the experimental RED support in WebRTC, and how we've started tinkering with it in Janus. The presentation also gives a more general overview of audio features in WebRTC.
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
WebRTC, RED and Janus @ ClueCon21
1. Audio redundancy in WebRTC and Janus via RED
Lorenzo Miniero
ClueCon – Chicago, IL, USA (kinda!)
October 27th 2021
2. Who am I?
Lorenzo Miniero
• Ph.D @ UniNA
• Chairman @ Meetecho
• Main author of Janus®
Contacts and info
• lorenzo@meetecho.com
• https://twitter.com/elminiero
• https://www.slideshare.net/LorenzoMiniero
• https://soundcloud.com/lminiero
• https://lminiero.bandcamp.com
3. Just a few words on Meetecho
• Co-founded in 2009 as an academic spin-off
• University research efforts brought to the market
• Completely independent from the University
• Focus on real-time multimedia applications
• Strong perspective on standardization and open source
• Several activities
• Consulting services
• Commercial support and Janus licenses
• Streaming of live events (IETF, ACM, etc.)
• Proudly brewed in sunny Napoli, Italy
5. Remember Janus?
Janus
General purpose, open source WebRTC server
• https://github.com/meetecho/janus-gateway
• Demos and documentation: https://janus.conf.meetecho.com
• Community: https://groups.google.com/forum/#!forum/meetecho-janus
8. A ton of scenarios done today with Janus!
• SIP and RTSP gatewaying
• WebRTC-based call/contact centers
• Conferencing & collaboration
• E-learning & webinars
• Cloud platforms
• Media production
• Broadcasting & gaming
• Identity verification
• Internet of Things
• Augmented/Virtual Reality
• ...and more!
9. It’s not just about video!
• Video obviously takes the lion’s share, most of the time
• Pretty much ubiquitous
• Most use cases assume video, one way or another
• It’s not the only thing that matters, though
• We still need to communicate, somehow
• Audio (and data) can be just as important, if not more
• Some applications even focus JUST on audio!
• ... and not only call/contact centers, PBX, or legacy infrastructures
15. “Can WebRTC help musicians?”
https://fosdem.org/2021/schedule/event/webrtc_musicians/
16. WebRTC and audio
• A couple of mandatory-to-implement codecs
• Opus + G.711
• G.711 just there as a fallback (and for legacy interoperability)
• Opus FTW!
• High quality audio codec designed for the Internet
• Very flexible in sampling rates, bitrates, etc.
• Support for stereo, and different “profiles” for voice/music
• Surround available too, on an experimental basis (multiopus)
• A few interesting “tools” (negotiated via SDP, as sketched below)
• Audio levels RTP extension (VAD)
• Opus inband Forward Error Correction (FEC)
• Opus Discontinuous Transmission (DTX)
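As a hedged illustration of how those tools are negotiated, this is roughly what the relevant SDP lines look like (payload type 111 and extension ID 1 are just the values Chrome typically picks, not fixed):

    m=audio 9 UDP/TLS/RTP/SAVPF 111
    a=rtpmap:111 opus/48000/2
    a=fmtp:111 minptime=10;useinbandfec=1;usedtx=1
    a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level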
19. Audio-only: SFU or MCU?
• SFUs ideal to just relay media
• No mixing/transcoding to worry about → less CPU on server, less delay
• More streams to distribute → more bandwidth needed (see the quick math after this list)
• Different streams → more control on the UI
• MCUs ideal to just mix media
• Mixing/transcoding taking place → more CPU on server, more delay
• Just one stream to distribute → constant bandwidth
• Single output stream → constrained UI rendering
• Sometimes it makes sense to use them both!
• Use SFU where applicable (e.g., video, plenty of bandwidth)
• Use MCU to complement (e.g., audio, lower power devices)
• Besides, an MCU can mix SFU streams to broadcast to a CDN!
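To make the SFU/MCU bandwidth trade-off concrete, here's a tiny back-of-the-envelope sketch; the 40 kbps per-stream figure and the function name are just assumptions for illustration:

    # Downlink needed per client in an N-party audio call
    def downlink_kbps(n_participants, audio_kbps=40):
        sfu = (n_participants - 1) * audio_kbps  # SFU: one stream per remote peer
        mcu = audio_kbps                         # MCU: a single mixed stream
        return sfu, mcu

    print(downlink_kbps(10))  # (360, 40): with 10 users, SFU needs 9x the MCU downlink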
22. We do use both in our Virtual Event Platform!
https://commcon.xyz/session/turning-live-events-to-virtual-with-janus
23. Many efforts focused on audio in Janus, recently
• Modular nature of Janus encourages new functionality
• Not necessarily in new plugins
• VideoRoom, AudioBridge, Streaming plugins can all benefit
• Several activities done, started or planned to enhance audio experience
• Mostly in AudioBridge... (due to the nature of the plugin)
• ... but some features actually available to all plugins!
• Many coming from requirements for our Virtual Event Platform
• But we like to experiment as well!
• Main topic of a recent talk @ Open Source World
• https://www.slideshare.net/LorenzoMiniero/janus-audio-open-source-world
27. Audio redundancy via RED
• Old RTP payload format for Redundant Audio Data (RED)
• https://datatracker.ietf.org/doc/html/rfc2198
• Recently added to Chrome on an experimental basis
• https://bugs.chromium.org/p/webrtc/issues/detail?id=11640
• https://webrtchacks.com/red-improving-audio-quality-with-redundancy/
• https://webrtchacks.com/implementing-redundant-audio-on-an-sfu/
• Basically a simple way to group multiple audio frames in a single RTP packet
• Current audio frame + one or more previously sent frames
• Allows the recipient to easily recover lost packets, at the cost of more bandwidth (a packing sketch follows below)
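A minimal sketch of the RFC 2198 layout (the helper name is mine, not from Janus or the RFC): every redundant block is preceded by a 4-byte header carrying F=1, the block payload type, a 14-bit timestamp offset and a 10-bit block length, while the primary block only gets a final 1-byte header with F=0:

    import struct

    def pack_red(block_pt, redundant, primary):
        """redundant: list of (ts_offset, payload), oldest first; primary: bytes."""
        out = b""
        for ts_offset, payload in redundant:
            # F(1)=1 | block PT(7) | timestamp offset(14) | block length(10)
            word = (1 << 31) | (block_pt << 24) | ((ts_offset & 0x3FFF) << 10) | (len(payload) & 0x3FF)
            out += struct.pack("!I", word)
        out += struct.pack("!B", block_pt)  # final header: F=0 | primary block PT
        return out + b"".join(p for _, p in redundant) + primary

With 20 ms Opus frames at 48 kHz, a frame sent one packet earlier would use ts_offset = 960.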
36. Support for audio redundancy via RED in Janus
• Support in Janus needed work in both core and plugins
• Core needed to negotiate RED (an SDP sketch follows below), and be able to unpack/pack RED
• Plugins needed to be able to do something with the data
• Important to support both endpoints that can do RED and those that can’t
• RED-to-RED and non-RED-to-non-RED are easy
• In other cases, Janus may have to pack/unpack RED accordingly
• First integration basically done in most plugins
• Still missing in AudioBridge, though
If you want to learn more... (PR in testing phase)
https://www.meetecho.com/blog/opus-red/
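For reference, negotiating RED for Opus in the browser looks roughly like this in SDP (payload types 63 and 111 are just what Chrome typically uses; the fmtp line says the RED blocks carry payload type 111, i.e. Opus):

    m=audio 9 UDP/TLS/RTP/SAVPF 111 63
    a=rtpmap:111 opus/48000/2
    a=rtpmap:63 red/48000/2
    a=fmtp:63 111/111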
45. A few challenges to address
• RED to non-RED doesn’t currently take into account redundant info
• If RED packet N-1 is lost, we don’t use packet N to recover the lost one
• RED packetization code in Janus assumes in-order packets
• May not always be the case (e.g., Streaming plugin and external RTP source)
• RED packetization is shared when doing one-to-many
• New subscribers get redundant info on pre-join audio packets
• Switching subscription in-session briefly mixes redundant info of different sessions
• (N-1 of new stream) != (N-1 of previous stream)
• Discontinuous Transmission (DTX) can cause issues
• Big timestamp jumps can overflow the 14-bit timestamp offset in RED (see the sketch below)
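A quick sketch of that last constraint (the function name is an illustration, not Janus code): RFC 2198 stores each redundant block's timestamp as an unsigned 14-bit offset from the primary, so at 48 kHz anything older than ~341 ms simply cannot be represented, and a long DTX silence gap easily exceeds that:

    MAX_RED_TS_OFFSET = (1 << 14) - 1  # 16383 ticks, ~341 ms at 48 kHz

    def fits_as_redundant(primary_ts, redundant_ts):
        # RTP timestamps are 32-bit and wrap around, hence the mask
        offset = (primary_ts - redundant_ts) & 0xFFFFFFFF
        return offset <= MAX_RED_TS_OFFSET

    # After a 400 ms DTX gap (19200 ticks at 48 kHz), the previous frame
    # no longer fits as redundant data:
    print(fits_as_redundant(1000000, 1000000 - 19200))  # False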
50. What about the impact on bandwidth?
• RED does help with redundancy...
• ... but uses much more bandwidth than usual for audio!
• Initial integration in Chrome used a distance of two “generations”
• Each audio packet contains the payload of the previous two packets
• Bitrate of the audio stream increases considerably! (quick math below)
• Less bandwidth available for other streams, e.g., video
• Latest version defaults to a single redundant packet instead
• Overridable with WebRTC-Audio-Red-For-Opus/Enabled-[1-9]/
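A rough back-of-the-envelope estimate of the payload bitrate, assuming Opus at 32 kbps with 20 ms frames (50 packets/s); the real impact varies with the Opus bitrate and other overhead:

    opus_kbps = 32
    packets_per_s = 50            # 20 ms Opus frames
    red_header_bytes = 4          # 4-byte RED header per redundant block

    for generations in (0, 1, 2):  # 0 = plain Opus, 2 = Chrome's initial default
        payload = opus_kbps * (1 + generations)
        headers = generations * red_header_bytes * 8 * packets_per_s / 1000
        print(f"{generations} redundant frame(s): ~{payload + headers:.1f} kbps")
    # prints roughly 32, 66 and 99 kbps: payload grows with (1 + generations)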
54. Thanks! Questions? Comments?
Get in touch!
• https://twitter.com/elminiero
• https://twitter.com/meetecho
• https://www.meetecho.com