SDES

SDES (Session Description Protocol Security Descriptions for Media Streams) is a way to negotiate the key for the Secure Real-time Transport Protocol (SRTP). It was published by the IETF as RFC 4568 in July 2006.

How it works

The keys are transported in the SDP attachment of a SIP message. The SIP transport layer must therefore ensure that no one else can see the attachment. This can be done by using TLS as the transport, or by other methods such as S/MIME. Using TLS assumes that the next hop in the SIP proxy chain can be trusted and that it will take care of the security requirements of the request.
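The requirement above can be expressed as a simple check on an outgoing request. This is only a minimal sketch: the function names are hypothetical, it looks at a single Via header value, and it assumes that TLS is the only transport considered secure. It says nothing about later hops in the proxy chain or about S/MIME protection of the body.

```python
import re

# Assumption for this sketch: only TLS counts as a protected signaling transport.
SECURE_TRANSPORTS = {"TLS"}

def via_transport(via_value):
    """Extract the transport token from a Via header value such as 'SIP/2.0/TLS host:port;...'."""
    match = re.match(r"\s*SIP\s*/\s*2\.0\s*/\s*([A-Za-z0-9-]+)", via_value)
    if match is None:
        raise ValueError("unrecognized Via header value")
    return match.group(1).upper()

def key_sent_unprotected(via_value, sdp_body):
    """Return True if the SDP carries an a=crypto line but the hop is not TLS."""
    carries_key = any(line.startswith("a=crypto:") for line in sdp_body.splitlines())
    return carries_key and via_transport(via_value) not in SECURE_TRANSPORTS
```

A sender could call `key_sent_unprotected` before transmitting and refuse to send the request when it returns `True`.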

The main advantage of this method is that it is extremely simple. Several vendors have already adopted this key exchange method, although some of them do not use a secure mechanism to transport the key. This adoption helps build the critical mass of implementations needed to make the method a de facto standard.

To illustrate the principle, consider a phone that sends a call to a proxy. By using the sips scheme, it indicates that the call must be made secure. The key is base64-encoded in the a=crypto line of the SDP attachment.

    INVITE sips:*97@ietf.org;user=phone SIP/2.0
    Via: SIP/2.0/TLS 172.20.25.100:2049;branch=z9hG4bK-s5kcqq8jqjv3;rport
    From: "123" <sips:123@ietf.org>;tag=mogkxsrhm4
    To: <sips:*97@ietf.org;user=phone>
    Call-ID: 3c269247a122-f0ee6wcrvkcq@snom360-000413230A07
    CSeq: 1 INVITE
    Max-Forwards: 70
    Contact: <sip:123@172.20.25.100:2049;transport=tls;line=gyhiepdm>;reg-id=1
    User-Agent: snom360/6.2.2
    Accept: application/sdp
    Allow: INVITE, ACK, CANCEL, BYE, REFER, OPTIONS, NOTIFY, SUBSCRIBE, PRACK, MESSAGE, INFO
    Allow-Events: talk, hold, refer
    Supported: timer, 100rel, replaces, callerid
    Session-Expires: 3600;refresher=uas
    Min-SE: 90
    Content-Type: application/sdp
    Content-Length: 477

    v=0
    o=root 2071608643 2071608643 IN IP4 172.20.25.100
    s=call
    c=IN IP4 172.20.25.100
    t=0 0
    m=audio 57676 RTP/SAVP 0 8 9 2 3 18 4 101
    a=crypto:1 AES_CM_128_HMAC_SHA1_32 inline:WbTBosdVUZqEb6Htqhn+m3z7wUh4RJVR8nE15GbN
    a=rtpmap:0 pcmu/8000
    a=rtpmap:8 pcma/8000
    a=rtpmap:9 g722/8000
    a=rtpmap:2 g726-32/8000
    a=rtpmap:3 gsm/8000
    a=rtpmap:18 g729/8000
    a=rtpmap:4 g723/8000
    a=rtpmap:101 telephone-event/8000
    a=fmtp:101 0-16
    a=ptime:20
    a=encryption:optional
    a=sendrecv
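The a=crypto line in this offer can be unpacked as in the following sketch. It assumes a single inline key without relying on the optional lifetime or MKI fields, and it handles only the two AES_CM_128 suites, whose key material is a 16-byte master key followed by a 14-byte master salt. The function name and the returned dictionary layout are illustrative, not part of any particular library.

```python
import base64

# (master key length, master salt length) in bytes for the suites handled here.
SUITE_LENGTHS = {
    "AES_CM_128_HMAC_SHA1_80": (16, 14),
    "AES_CM_128_HMAC_SHA1_32": (16, 14),
}

def parse_crypto_attribute(line):
    """Split an SDP 'a=crypto:' line into tag, suite, master key and master salt."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        raise ValueError("not a crypto attribute")
    fields = line[len(prefix):].split()
    if len(fields) < 3:
        raise ValueError("expected tag, crypto suite and key parameters")
    tag, suite, key_params = int(fields[0]), fields[1], fields[2]
    if suite not in SUITE_LENGTHS:
        raise ValueError("unsupported crypto suite: " + suite)
    method, _, key_info = key_params.partition(":")
    if method != "inline":
        raise ValueError("only the inline key method is handled here")
    # Optional lifetime and MKI fields follow the key, separated by '|'.
    key_salt_b64 = key_info.split("|")[0]
    key_len, salt_len = SUITE_LENGTHS[suite]
    raw = base64.b64decode(key_salt_b64, validate=True)
    if len(raw) != key_len + salt_len:
        raise ValueError("unexpected length of concatenated key and salt")
    return {
        "tag": tag,
        "suite": suite,
        "master_key": raw[:key_len],
        "master_salt": raw[key_len:],
    }
```

Applied to the line from the INVITE, the 40 base64 characters decode to 30 bytes, which the function splits into the master key and the master salt.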

The phone receives the answer from the proxy, and a two-way secure call can now take place:

    SIP/2.0 200 Ok
    Via: SIP/2.0/TLS 172.20.25.100:2049;branch=z9hG4bK-s5kcqq8jqjv3;rport=62401;received=66.31.106.96
    From: "123" <sips:123@ietf.org>;tag=mogkxsrhm4
    To: <sips:*97@ietf.org;user=phone>;tag=237592673
    Call-ID: 3c269247a122-f0ee6wcrvkcq@snom360-000413230A07
    CSeq: 1 INVITE
    Contact: <sip:*97@203.43.12.32:5061;transport=tls>
    Supported: 100rel, replaces
    Allow-Events: refer
    Allow: INVITE, ACK, CANCEL, BYE, REFER, OPTIONS, PRACK, INFO
    Accept: application/sdp
    User-Agent: pbxnsip-PBX/1.5.1
    Content-Type: application/sdp
    Content-Length: 298

    v=0
    o=- 1996782469 1996782469 IN IP4 203.43.12.32
    s=-
    c=IN IP4 203.43.12.32
    t=0 0
    m=audio 57076 RTP/SAVP 0 101
    a=rtpmap:0 pcmu/8000
    a=rtpmap:101 telephone-event/8000
    a=fmtp:101 0-11
    a=crypto:1 AES_CM_128_HMAC_SHA1_32 inline:bmt4MzIzMmYxdnFyaWM3d282dGR5Z3g0c2k5M3Yx
    a=ptime:20
    a=sendrecv
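In this exchange the answer repeats the tag and crypto suite of the offer but carries a different key, because each side announces the key it will use for the media it sends. The sketch below shows how an answering side might build such a line. The function names and the list of supported suites are assumptions made for illustration, and the code simply picks the first offered suite it supports.

```python
import base64
import secrets

def make_crypto_attribute(tag, suite="AES_CM_128_HMAC_SHA1_32"):
    """Return (attribute line, raw key material) with freshly generated key material."""
    # 16-byte master key followed by a 14-byte master salt for the AES_CM_128 suites.
    key_and_salt = secrets.token_bytes(30)
    encoded = base64.b64encode(key_and_salt).decode("ascii")
    return f"a=crypto:{tag} {suite} inline:{encoded}", key_and_salt

def answer_crypto(offer_sdp, supported=("AES_CM_128_HMAC_SHA1_80", "AES_CM_128_HMAC_SHA1_32")):
    """Answer the first offered crypto line whose suite is supported, or return None."""
    prefix = "a=crypto:"
    for line in offer_sdp.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split()
        if len(fields) >= 3 and fields[1] in supported:
            # Same tag and suite as the chosen offer line, but our own key.
            return make_crypto_attribute(int(fields[0]), fields[1])
    return None
```

Returning `None` leaves it to the caller to decide whether to reject the call or to continue without SRTP.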

Discussion: Call Initiation and missing End-to-End Encryption

A common problem with secure media is that the key exchange might not be finished when the first media packet arrives. To avoid initial clicks, those packets must be dropped. This period is usually short (below 100 ms), so it is not a major problem.
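The dropping behaviour described above can be sketched as a small gate in front of the media decoder. The class and method names are hypothetical, and the sketch performs no decryption; it only shows the decision to discard packets that arrive before the remote key is known.

```python
class EarlyMediaGate:
    """Discards inbound packets until the remote key material has been installed."""

    def __init__(self):
        self.remote_key = None
        self.dropped = 0

    def install_key(self, key_and_salt):
        """Record the remote key material once the SDP answer has been processed."""
        self.remote_key = key_and_salt

    def accept(self, packet):
        """Return True if the packet may be passed on for decryption and playback."""
        if self.remote_key is None:
            # Playing undecryptable audio would produce clicks, so drop it instead.
            self.dropped += 1
            return False
        return True
```

The `dropped` counter gives a rough idea of how much early media was lost before the key exchange completed.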

The SDES method does not address "end-to-end" media encryption. For example, if user A is talking to user B via a proxy P, SDES allows negotiation of keys between A and P or between B and P, but not between A and B. For end-to-end media security, you must first establish a trust relationship with the other side. If you use a trusted intermediary for this, the call setup delay increases significantly, which makes applications like push-to-talk difficult. If you do this peer-to-peer, it might be difficult to identify the other side. For example, your operator might implement a B2BUA architecture and play the role of the other side, so that you still do not have end-to-end security. Newer protocols, such as ZRTP, offer end-to-end encryption for SIP/RTP calls.
