Video Encoding Basics: Live Video Streaming Protocols for Broadcast Contribution and Distribution
From RTMP to SRT, HLS, MPEG-DASH and CMAF, the list of video transport protocols currently available is extensive and can be confusing. Our Video Encoding Basics series attempts to demystify some of the fundamentals of video encoding. In this post, we’ll take you through what a video transport protocol does and where it’s used in the video delivery chain. We’ll also explore the different protocols currently available as well as their relative merits and shortcomings.
What Is a Transport Protocol?
Let’s start with the basics. A transport protocol is a communication protocol responsible for establishing a connection and delivering data across a network. More specifically, transport protocols occupy layer 4 of the Open Systems Interconnection (OSI) model, which was published by the International Organization for Standardization (ISO) and serves as a standard template for describing a network protocol stack. Protocols at this layer can provide connection-oriented sessions and reliable data delivery, guarantees that are essential for file transfers and mission-critical applications. Importantly, different transport protocols support different optional capabilities, including error recovery, flow control, and retransmission.
When Are Video Transport Protocols Used?
In video workflows, transport protocols are used for contribution, production, distribution and delivery. The diagram below illustrates these four distinct phases in the live video delivery chain. Contribution is sometimes referred to as “the first mile” and delivery as “the last mile.” The last mile is where video streams are delivered directly to end viewers on their chosen device or TV screen. For delivery, you may have heard of HLS, MPEG-DASH and CMAF. These are HTTP-based and are not suitable for first mile video streaming, as they introduce too much latency for live broadcast production facilities.
In addition, there are proprietary protocols, which are owned by one company and usually require a license to be used by customers or third-party vendors. The challenges of using a closed proprietary protocol are that it can be costly (requiring ongoing monthly payments), it promotes vendor lock-in (preventing interoperability with other manufacturers’ devices), and there’s always a risk of orphaned products when vendors discontinue a product line or go out of business.
Transport Layer Protocols
TCP – Transmission Control Protocol
TCP is the most commonly used transport protocol on the internet. It numbers packets so that the recipient receives them in order, and the recipient sends acknowledgments back to the sender for the data it has received. If the sender does not get a correct acknowledgment, it resends the packets to ensure the recipient received them. Because of these acknowledgments, TCP is considered to be very reliable. Packets are also checked for errors. TCP is optimized for data integrity rather than timely delivery, which means that packets sent with TCP are tracked in such a way that no data is lost or corrupted in transit. This can result in relatively long delays while waiting for out-of-order messages or retransmissions of lost messages. Though TCP works well for sending email and files, these delays make it poorly suited to low-latency live video streaming.
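To make the cost of in-order delivery concrete, here is a minimal Python sketch of a receiver that only releases data once every earlier sequence number has arrived. It is a simplified model for illustration, not TCP itself: the function name, the per-packet sequence numbers starting at 0, and the sample arrivals are all assumptions.

```python
def deliver_in_order(arrivals):
    """Simulate a receiver that releases payloads strictly in sequence order.

    arrivals: list of (sequence_number, payload) in the order they arrive.
    Returns a list of (arrival_index, payload), where arrival_index is the
    position in `arrivals` at which the payload could finally be handed to
    the application.
    """
    buffered = {}    # out-of-order payloads waiting for a gap to be filled
    next_seq = 0     # the next sequence number the application expects
    delivered = []
    for arrival_index, (seq, payload) in enumerate(arrivals):
        buffered[seq] = payload
        # Release everything that is now contiguous with what was delivered.
        while next_seq in buffered:
            delivered.append((arrival_index, buffered.pop(next_seq)))
            next_seq += 1
    return delivered


# Hypothetical arrivals: packet 1 is delayed and shows up last.
print(deliver_in_order([(0, "a"), (2, "c"), (3, "d"), (1, "b")]))
```

In this example, packets 2 and 3 arrive early but are held back until packet 1 shows up, so three payloads are released together at the final arrival. That waiting is the kind of delay that matters for live video.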
UDP – User Datagram Protocol
UDP carries data between the same kinds of endpoints as TCP, but without the acknowledgments, ordering, and retransmission that slow things down. With UDP, “datagrams,” which are essentially packets, are sent to the recipient, and the sender does not wait to make sure each one arrived; it simply continues sending the next packet. UDP is sometimes referred to as a “best effort” service: if you are the recipient and you miss some UDP packets, too bad. There is no guarantee that you are receiving all the packets, and there is no way within UDP itself to ask for a packet again if you miss it. The upside of shedding this processing overhead is that UDP is fast, so it is frequently used for time-sensitive applications such as online gaming or live broadcasts where perceived latency is more critical than packet loss.
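The fire-and-forget behavior can be seen with Python’s standard socket module. The sketch below sends a few datagrams over the local loopback interface and collects whatever arrives. The function name, the one-second timeout and the sample payloads are assumptions made for illustration.

```python
import socket


def udp_loopback_demo(messages):
    """Send datagrams to a local receiver and return whatever arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
    receiver.settimeout(1.0)          # stop waiting if a datagram never shows up
    address = receiver.getsockname()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    received = []
    try:
        for message in messages:
            # No handshake and no acknowledgment: the sender just moves on.
            sender.sendto(message, address)
        for _ in messages:
            try:
                data, _source = receiver.recvfrom(2048)
            except socket.timeout:
                break                 # a missing datagram is simply gone
            received.append(data)
    finally:
        sender.close()
        receiver.close()
    return received


print(udp_loopback_demo([b"frame-1", b"frame-2", b"frame-3"]))
```

On a loopback interface the datagrams will normally all arrive, but nothing in the code depends on that. On a real network some may be lost or reordered, and the application has to cope.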
Application Layer Protocols for Video Streaming
RTMP – Real-Time Messaging Protocol
Initially a proprietary protocol, RTMP was developed by Macromedia (now Adobe) for real-time streaming of video, audio, and data between a server and the Flash player. Although Adobe has ended support for Flash, RTMP remains a commonly used protocol for live streaming within production workflows. Based on TCP, RTMP is a continuous streaming technology and relies on acknowledgments reported by the receiver. However, to keep the return traffic low, these acknowledgments (ACKs) are not reported immediately to the sender. Only after a sequence of packets has been received will a report of ACKs or NACKs (negative acknowledgments) be sent back. If there are lost packets within that sequence, the complete sequence of packets (going back to the last ACK) will be retransmitted. This batching can dramatically increase end-to-end latency. In addition, RTMP as originally specified does not support HEVC-encoded streams, and bandwidth limitations make it a poor fit for the high bitrates that advanced resolutions require. It’s worth noting that there are several variations of RTMP, including RTMPS, which works over a TLS/SSL connection.
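As a rough illustration of why resending a whole sequence is expensive, the following Python sketch counts transmissions under two simplified models: one where any loss forces the entire sequence since the last ACK to be resent, and one where only the lost packets are resent. The function name, the fixed sequence length and the assumption that every retransmission succeeds are illustrative simplifications, not a description of the actual wire behavior.

```python
def transmissions_needed(total_packets, window_size, lost_packets):
    """Return (whole_sequence_count, selective_count) of packets sent.

    total_packets: number of packets in the stream segment being modeled.
    window_size:   packets sent between acknowledgment reports (assumed fixed).
    lost_packets:  indices of packets lost on their first transmission.
    Retransmissions are assumed to succeed on the first retry.
    """
    lost = set(lost_packets)
    whole_sequence = total_packets   # every packet is sent at least once
    selective = total_packets
    for start in range(0, total_packets, window_size):
        window = range(start, min(start + window_size, total_packets))
        losses = sum(1 for packet in window if packet in lost)
        if losses:
            whole_sequence += len(window)   # resend everything since the last ACK
            selective += losses             # resend only what went missing
    return whole_sequence, selective


# Hypothetical numbers: 100 packets, reports every 10 packets, 2 losses.
print(transmissions_needed(100, 10, [3, 57]))   # (120, 102)
```

With these made-up numbers, two lost packets cost 20 extra transmissions in the whole-sequence model against 2 in the selective one.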
RTP – Real-Time Transport Protocol
RTSP – Real-Time Streaming Protocol
RTSP is an application layer control protocol that communicates directly with a video streaming server. RTSP allows viewers to remotely pause, play, and stop video streams via the Internet without the need for local downloads. RTSP was most notably used by RealNetworks RealPlayer and is still being applied for various uses including for remote camera streams, online education and internet radio. RTSP requires a dedicated server for streaming and does not support content encryption or the retransmission of lost packets as it relies on the RTP protocol in conjunction with RTCP for media stream delivery.
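RTSP control messages are plain text, much like HTTP requests. The Python sketch below builds a request string with a request line, a CSeq sequence header and optional extra headers. The helper name, the example URL and the session identifier are hypothetical, and the sketch only formats a message; it does not open a connection.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Format an RTSP/1.0 request as text.

    method:  an RTSP method such as "DESCRIBE", "PLAY", "PAUSE" or "TEARDOWN".
    url:     the stream URL (hypothetical in the example below).
    cseq:    sequence number that pairs a request with its response.
    headers: optional dict of additional header names and values.
    """
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Lines end with CRLF; an empty line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical example: resume playback of an existing session.
request = build_rtsp_request(
    "PLAY", "rtsp://example.com/stream", 4, {"Session": "12345678"}
)
print(request)
```

Pausing or stopping the stream would use the same shape with the PAUSE or TEARDOWN method. The media itself travels separately over RTP.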
SRT – Secure Reliable Transport
Pioneered by Haivision, SRT was developed to deliver the low latency performance of UDP over lossy networks with a high-performance sender/receiver module that maximizes the available bandwidth. Codec agnostic, open source and royalty-free, SRT guarantees that the packet cadence (compressed video signal) that enters the network is identical to the one that is received at the decoder, dramatically simplifying the decoding process. SRT offers additional features including native AES encryption, so stream security is managed at the link level. It also allows users to easily traverse firewalls throughout the workflow by supporting both sender and receiver modes (in contrast to both RTMP and HTTP, which only support a single mode).
In addition, SRT can bring together multiple video, audio, and data streams within a single SRT stream to support highly complex workflows without burdening the network administrators. Within the sender/receiver module, we incorporated the ability to detect the network performance with regard to latency, packet loss, jitter, and the available bandwidth. Advanced SRT integrations can use this information to guide stream initiation, or even to adapt the endpoints to changing network conditions.
SST – Safe Streams Transport
SST is an award-winning network protocol originally developed by Aviwest, now Haivision, for transporting video over cellular networks. With its IP bonding technology, SST ensures reliable and broadcast-grade video transmission over 3G, 4G, and 5G cellular networks, LAN, Wi-Fi, satellite, and the public internet. SST aggregates multiple network connections in real time for an extra layer of reliability and increased bandwidth for high quality video. SST video bitrates can also be adjusted in real time to handle any bandwidth fluctuations.
SST includes content encryption and the retransmission of lost video data. Alongside video, it also carries audio, metadata, and remote control of devices such as video transmitters and PTZ cameras through DataBridge technology.
WebRTC – Web Real-Time Communication
WebRTC is an open source technology created by Google and standardized in January 2021. It is often used for video conferencing. WebRTC is made up of three HTML5 APIs (“getUserMedia,” “RTCPeerConnection,” and “RTCDataChannel”) that deliver low latency voice and video streaming between browsers, quickly enough to mimic in-person communication. It also allows both video and audio capture and playback without the need to download and install any additional plugins. However, one of the drawbacks of WebRTC is its lack of scalability: large-scale projects with a large number of connected users will require an additional server, solution, or cloud-based service to lighten the stress on the browser.
WebRTC is supported by every major browser including Microsoft Edge, Google Chrome, and Mozilla Firefox and is used to power popular video chat applications such as Microsoft Teams, Facebook Messenger, and Google Hangouts.
HLS – HTTP Live Streaming
HLS is an adaptive bitrate streaming protocol from Apple, released in 2009, that is primarily used for the last mile of video delivery. HLS content is delivered from a web server (or origin server) and often through a CDN before it reaches a video player. HLS video content is broken down into separate chunks, traditionally around 10 seconds long (Apple’s current guidance recommends shorter, 6-second segments), that are duplicated and encoded at varying bitrates and resolutions (or profiles) in parallel. Because HLS is an adaptive bitrate (ABR) protocol, the video player monitors bandwidth conditions and, if they fluctuate, can seamlessly switch to the ABR profile best suited to that moment. HLS supports video that is encoded with the H.264 or H.265 (HEVC) codecs.
With Adobe’s Flash technology now end-of-life, online video delivered by HLS and played via HTML5 players has become the main method of video delivery via the internet, with support in major web browsers, mobile devices, media players, servers and even some consumer set-top boxes. As an Apple technology, HLS is the main delivery protocol for iOS devices.
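The profile-switching decision can be sketched in a few lines of Python. This is a minimal illustration of one possible selection rule, not the logic of any particular player: the function name, the bitrate ladder and the 0.8 safety factor are all assumptions.

```python
def choose_profile(profiles, measured_kbps, safety_factor=0.8):
    """Pick the highest-bitrate profile that fits the measured bandwidth.

    profiles:       list of dicts with "name" and "bitrate_kbps" keys.
    measured_kbps:  the player's current bandwidth estimate.
    safety_factor:  fraction of the estimate to use, leaving headroom
                    for fluctuations (an assumed value).
    """
    usable_kbps = measured_kbps * safety_factor
    affordable = [p for p in profiles if p["bitrate_kbps"] <= usable_kbps]
    if affordable:
        return max(affordable, key=lambda p: p["bitrate_kbps"])
    # Nothing fits: fall back to the lowest profile rather than stalling.
    return min(profiles, key=lambda p: p["bitrate_kbps"])


# Hypothetical encoding ladder.
ladder = [
    {"name": "1080p", "bitrate_kbps": 6000},
    {"name": "720p", "bitrate_kbps": 3000},
    {"name": "480p", "bitrate_kbps": 1200},
    {"name": "360p", "bitrate_kbps": 600},
]

print(choose_profile(ladder, 4500)["name"])   # 720p
print(choose_profile(ladder, 500)["name"])    # 360p
```

A real player would re-run a decision like this as each chunk is downloaded, which is what allows it to move up or down the ladder mid-stream.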
MPEG-DASH – Dynamic Adaptive Streaming over HTTP
MPEG-DASH is an open standard developed by the Moving Picture Experts Group (MPEG) that became an International Standard in November 2011. MPEG-DASH is mostly used to deliver live and on-demand video content over the internet to viewers’ set-top boxes, smartphones, tablets, and other devices. MPEG-DASH takes the same approach to player delivery as HLS: it also breaks content down into parts or chunks with a cascade of different encodes, making it well suited to last mile over-the-top (OTT) video delivery.
MPEG-DASH is codec agnostic. This means that MPEG-DASH is not limited to using H.264 or HEVC codecs but can also support others such as VP8 or VP9 which could be advantageous for higher quality broadcasts with lower bitrates. As an alternative ABR protocol to HLS, MPEG-DASH is widely used on Android devices.