Securing VoIP in the Presence of Pervasive Monitoring

Many have been wondering about government spying on Internet communication, and of course everyone is puzzled about what to do about it. More specifically, who should do what for which application (since application behavior differs quite a bit)?

I wrote down my thoughts in a presentation given to data protection authorities, and I want to provide a bit more context for understanding the slide deck. I also want to illustrate that there are various challenges (even when thinking about a single application, namely VoIP communication).

The Background

The International Working Group on Data Protection in Telecommunications (IWGDPT), which consists mostly of data protection authorities (DPAs) and similar organizations (e.g., the FTC) from all over the world, published a Working Paper on Privacy and Security in Internet Telephony back in September 2006 to reach a common understanding of the data protection challenges that would arise with the introduction of VoIP technology. DPAs enforce privacy laws but they are not the only agencies able to do that. The recommendations are largely addressed to VoIP service providers and also to manufacturers of software and hardware. The recommendations are sound but were written at a time when most VoIP deployments were still at an early stage.

In the meantime VoIP deployment has increased substantially, to such an extent that some predict the end of the Plain Old Telephone Service (POTS) in various regions, as discussed at the technical plenary of the Internet Architecture Board in March 2013. The plenary material can be found here. In practice it turns out that many over-the-top VoIP providers (and providers of other real-time communication services like instant messaging, real-time text, and video) fall under a different regulatory regime than classical telecommunication companies (such as AT&T, Verizon, BT, Vodafone). Consequently, they operate under different rules than the classical telcos with respect to telecommunication secrecy, emergency services, accessibility requirements, etc. Many over-the-top VoIP providers, such as Skype, don’t want to become electronic communications operators. Of course, simply changing the law so that telecommunication rules also apply to over-the-top providers does not work since the environment is very different.

A few years have passed since the IWGDPT published their recommendations and a few things have changed in the Internet ecosystem. The community certainly has a better understanding of the degree of government surveillance. Also, the underlying technology has evolved.

To advance the understanding of what could be done to improve VoIP security and privacy, I compiled a slide deck to start the discussion about how the existing working paper could be updated, i.e., what recommendations data protection enforcement agencies could give, primarily to VoIP service providers and equipment manufacturers.

The slide deck is a first draft and I hope to receive some feedback from the Internet community as well as from data protection authorities.

Here is a summary of the story (since the slides might not be self-explanatory).

Problem Description

First, there are problems besides government surveillance, but those are currently forgotten in light of all the NSA discussions. Here are a few examples:

  1. Earlier this year security researchers found out that links exchanged via programs like Skype had been retrieved by Microsoft, as explained in this press article. While one can easily argue that this serves a security purpose, the problem here was in fact one of transparency. Most users were under the impression that Skype is a peer-to-peer system and uses end-to-end security so that no party other than the two endpoints is able to read the traffic. Obviously, that does not seem to be true, and the proprietary nature of Skype makes it very difficult to find out what is happening.
  2. It is quite common to re-invent the wheel when it comes to Internet services. Of course the existing standards (like SIP, XMPP, and WebRTC) look very complicated for a newbie. So, why not develop something much simpler and better? The guys behind WhatsApp and also Cryptocat thought that this is the way to go. Well. Of course designing security into a system requires a lot of expertise and you can get it wrong very easily. Here are some blog posts that report problems with WhatsApp and Cryptocat.
  3. Of course you may also have VoIP providers who have rather relaxed security practices or who operate with a business model that does not provide a lot of incentives for privacy protection.

Note that I mix providers offering voice services with those offering other real-time communication services. I do this since there is often very little difference in terms of the protocols actually involved (although deployment-wise they may differ).


Today’s VoIP Landscape

Once the problems are understood, the question is what the various VoIP services look like today, and there have been changes here as well.

First, more and more application providers on the Internet establish silos in their communication architecture. The problem is that you cannot take random client software (like you can do today with an email client) and get it to work with a random VoIP provider. The RTCWeb model is a classical example, but even smartphone applications follow a similar scheme. In the RTCWeb example the server provides the code (in the form of JavaScript) for the browser to execute. Of course, from an innovation point of view this gives service providers much better possibilities to adapt the software to new needs. The IAB covered this trend at one technical plenary under the title “Post Standardization” (see plenary content here and an accompanying document here).

From a security and privacy point of view the challenge, however, is that users never know what software they are actually running. Everything may change at any time. Unless the user places a lot of trust in the service provider, everything is lost.

A secondary effect is that many of the previously standardized protocols are not necessarily used in the way expected, and custom designs dominate. Security features, however, are hard to get right, as the last 20 years clearly demonstrate. For privacy protection the situation is even worse: many companies have incentives that conflict with those of their users, since the entire industry is looking to cloud computing and big data analytics to discover “something new”. These industry trends do not line up nicely with basic privacy principles like data minimization and purpose limitation.

Data Routing and Interconnection

Two other aspects need to be considered, namely

  1. The involvement of interconnection providers, and
  2. The route VoIP data takes

In the early days of VoIP standardization in the IETF the idea was that interconnection would work similarly to email (which lets one email provider talk directly to another email provider without any other entities in the middle). That was a good idea but the business models of operators were different. Consequently, intermediaries were introduced and, for end users, these intermediaries are invisible to a large extent. Another way to understand the role of these intermediaries is to glance at the credit card industry: credit card companies use, in certain circumstances, the Society for Worldwide Interbank Financial Telecommunication (SWIFT) to process financial transactions. End users have not been aware of the presence of SWIFT in the transactions and even less aware of the fact that SWIFT makes personal data accessible to US authorities, as this write-up of the Article 29 working party describes.

Is it too far-fetched to wonder whether governments work together with interconnection providers to obtain transaction data of VoIP calls or, worse, the actual content of the communication?

It has to be pointed out that XMPP today is deployed without intermediaries, very much in the same way as email systems are deployed. It also provides a standardized way to build federated communication systems. XMPP is, however, mostly used for instant messaging, and Google, which uses XMPP for Google Hangouts, dropped support for server-to-server federation in May 2013, turning its services into a silo similar to the Facebook silo (Facebook also uses XMPP). Consequently, a user who is on Facebook using XMPP cannot communicate with a user who uses Google’s XMPP services.

The route the data takes may, of course, differ from the route the signaling messages take when two or more endpoints communicate. There is even a standardized protocol to allow the best possible route to be determined, namely Interactive Connectivity Establishment (ICE). One could also provide VPN tunnels, independent TURN servers, or even Tor as input to ICE so that the routing of the VoIP data packets remains separate.
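
To make the role of ICE a little more concrete, here is a minimal Python sketch of how candidate addresses are ranked. The two priority formulas and the recommended type preference values are taken from the ICE specification (RFC 5245). Everything else is an assumption made for illustration: the candidate list, the addresses from documentation ranges, and the `relay_only` helper, which mimics an endpoint that offers only relayed candidates so that media flows through a TURN server of its own choosing.

```python
# Recommended type preferences from the ICE specification (RFC 5245).
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_pref=65535, component_id=1):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_pref << 8) + (256 - component_id)


def pair_priority(controlling, controlled):
    """pair priority = 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)."""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def rank(candidates):
    """Order candidates from most to least preferred."""
    return sorted(candidates, key=lambda c: candidate_priority(c["type"]), reverse=True)


def relay_only(candidates):
    """Hypothetical policy: offer only relayed (TURN) candidates to the peer."""
    return [c for c in candidates if c["type"] == "relay"]


# Illustrative candidates; the addresses are placeholders from documentation ranges.
CANDIDATES = [
    {"type": "relay", "addr": "198.51.100.7"},
    {"type": "host", "addr": "192.0.2.10"},
    {"type": "srflx", "addr": "203.0.113.5"},
]

if __name__ == "__main__":
    for cand in rank(CANDIDATES):
        print(cand["type"], cand["addr"], candidate_priority(cand["type"]))
```

By default the direct (host) path wins, which is good for latency but reveals the endpoint's addresses to the peer. Restricting the input to relayed candidates changes the media route without touching the signaling route.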

However, in most cases the service provider ultimately has full control over the routing of the packets and will therefore be able to see the content. As such, playing with the packet routing to obtain some level of security is insufficient by itself. Only cryptographic security mechanisms can help.

Securing Communication

There are at least two important areas in securing real-time communication protocols, namely

  1. Securing the signaling messages, and
  2. Securing the actual communication payload (voice packets, for example)

The security mechanisms look somewhat different, and a discussion of the design decisions with an impact on privacy in the case of SIP-based presence services can be found in RFC 6973.

For signaling traffic it is not possible to protect the entire communication end-to-end since various parts of the messages need to be understood by the different servers to ensure routing of the messages. Typically, one differentiates between client-to-server and server-to-server communication. For server-to-server communication it is assumed (on paper) that those exchanges are protected using TLS (or IPsec), but of course many deployments instead fall back on an illusion of physical security. For the client-to-server communication the story is more complicated. TLS with server-side authentication is assumed (although often not provided), but the client/user authentication typically happens via username and password layered on top of TLS.

If this type of security is indeed provided then it protects against eavesdroppers who are not part of the legitimate protocol exchange. It does not, of course, protect against the collection of information about the user’s communication behavior by VoIP providers and intermediaries (or even by the VoIP client software running on the end device).
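
As a rough illustration of the client-to-server case, the following Python sketch sets up TLS with server-side authentication using only the standard library. The function names and the choice of TLS 1.2 as the minimum version are my own assumptions for this sketch; port 5061 is the registered port for SIP over TLS. The username and password exchange would then run inside the protected channel.

```python
import socket
import ssl

SIPS_PORT = 5061  # registered port for SIP over TLS


def make_client_context():
    """TLS client context that authenticates the server.

    create_default_context() loads the system trust anchors, requires a
    valid server certificate and enables hostname checking.
    """
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption: refuse older versions
    return ctx


def open_signaling_channel(host, port=SIPS_PORT):
    """Connect to a (hypothetical) SIP server and return the TLS-wrapped socket."""
    raw = socket.create_connection((host, port), timeout=5)
    # server_hostname drives both SNI and the certificate hostname check.
    return make_client_context().wrap_socket(raw, server_hostname=host)
```

Note that the handshake fails if the server certificate does not validate. Quietly disabling that check is exactly the kind of shortcut that turns "assumed" security into no security.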

An important task in end-to-end communication is to either present the calling party identity to the callee or to hide it. In the context of VoIP this identity information typically comes in the form of the calling party identifier (phone number or URI) plus contact information. On one hand conveying these identifiers is a privacy violation, but on the other hand it provides privacy protection for the callee since he can decide based on the calling party whether he wants to get interrupted or not. Typically, many systems today use some form of whitelist (in the form of a buddy list) to filter incoming communication attempts. New users have to go through an “introduction” phase first.

Two types of approaches have been developed to communicate identity information within the signaling messages, namely:

  1. A chain of trust/non-cryptographic approach (for example, P-Asserted-Identity)
  2. A cryptographic approach (for example, SIP Identity)

Various problems surfaced with the chain of trust approach (which is also used in today’s telephone system), such as caller-ID spoofing and telephony denial of service attacks. The cryptographic approach is, however, also not without problems from a deployment point of view due to the nature of the intermediaries, who invalidate the cryptographic protection.
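
The deployment problem with the cryptographic approach can be sketched in a few lines of Python. This is a simplified illustration and not the SIP Identity algorithm itself: SIP Identity uses a public-key signature made by the caller's domain, whereas the sketch uses an HMAC with a shared key as a stand-in so that it runs with the standard library alone. The field names, the example message, and the intermediary that rewrites the session description are all assumptions.

```python
import hashlib
import hmac

# Fields covered by the signature (loosely modeled on SIP Identity).
SIGNED_FIELDS = ("from", "to", "call_id", "date", "contact", "body")


def canonical_string(msg):
    """Join the protected fields into one string that gets signed."""
    return "|".join(msg[f] for f in SIGNED_FIELDS).encode()


def sign(msg, key):
    """Stand-in for the signature of the caller's domain (HMAC, not a real signature)."""
    return hmac.new(key, canonical_string(msg), hashlib.sha256).hexdigest()


def verify(msg, signature, key):
    """Recompute the value over the received message and compare in constant time."""
    return hmac.compare_digest(sign(msg, key), signature)


KEY = b"demo-key-for-illustration-only"

INVITE = {
    "from": "sip:alice@example.com",
    "to": "sip:bob@example.org",
    "call_id": "a84b4c76e66710",
    "date": "Thu, 01 Aug 2013 10:00:00 GMT",
    "contact": "sip:alice@192.0.2.10",
    "body": "c=IN IP4 192.0.2.10",
}

SIGNATURE = sign(INVITE, KEY)

# Hypothetical intermediary that rewrites the media address in the body.
REWRITTEN = dict(INVITE, body="c=IN IP4 198.51.100.7")

if __name__ == "__main__":
    print("untouched message verifies:", verify(INVITE, SIGNATURE, KEY))
    print("rewritten message verifies:", verify(REWRITTEN, SIGNATURE, KEY))
```

The callee cannot tell a well-meaning intermediary from an attacker: any change to a protected field makes verification fail.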

A more detailed discussion of the topic can be found here and with IETF#87 a new working group, called “Secure Telephone Identity Revised” (STIR), has been started to tackle some of the problems.

When it comes to protecting the content of the communication the story is even more complicated, since the goal is to securely authenticate the parties on both ends of the communication. It turns out that the solutions very much depend on whom to trust and what the anticipated capabilities of the adversary are. On top of that, there are also challenges in the interworking with existing technologies (e.g., PSTN interworking).

As an example of what is meant by the assumptions consider ZRTP and DTLS-SRTP:

  • ZRTP assumes that callee/caller recognize each other’s voice (due to the voice fingerprint authentication procedure). Of course, without voice there is no voice fingerprint and there are also communication interactions where the voice of the other communication partner is not known.
  • DTLS-SRTP on the other hand assumes that SIP Identity protects the fingerprints of public keys exchanged between the two parties. While SIP Identity can be added by the endpoints themselves, it is more realistic that it is added by the voice service provider. Consequently, users have to select a voice service provider they trust.
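
To show what is being protected in the DTLS-SRTP case, here is a small Python sketch of the fingerprint check. Each side announces a hash of its certificate in the signaling (the `a=fingerprint` attribute in the session description) and later compares it with the certificate actually presented in the DTLS handshake. The function names and the placeholder certificate bytes are assumptions; a real endpoint would hash the DER encoding of its own certificate.

```python
import hashlib
import hmac


def sdp_fingerprint(cert_der):
    """Format the SHA-256 hash of a DER-encoded certificate as an SDP attribute."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    pairs = ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
    return "a=fingerprint:sha-256 " + pairs


def handshake_matches_signaling(signaled_line, presented_cert_der):
    """True if the certificate seen in the DTLS handshake matches the signaled fingerprint."""
    return hmac.compare_digest(signaled_line, sdp_fingerprint(presented_cert_der))


# Placeholder bytes standing in for real DER-encoded certificates.
ALICE_CERT = b"placeholder for Alice's certificate"
ATTACKER_CERT = b"placeholder for an attacker's certificate"

if __name__ == "__main__":
    line = sdp_fingerprint(ALICE_CERT)
    print(line)
    print(handshake_matches_signaling(line, ALICE_CERT))
    print(handshake_matches_signaling(line, ATTACKER_CERT))
```

The check is only as strong as the integrity of the signaling: whoever can rewrite the fingerprint line on its way to the callee can substitute their own certificate, which is why the signature over the signaling message matters.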

Of course, this is not a new area of work and a document with requirements and use cases can be found in RFC 5479.

My Recommendations

After all this background, what are the possible recommendations that can be given to VoIP providers and their equipment manufacturers?

  1. Requirement for Transparency
    1. In particular, it is important to know what information is collected, by whom, and for what purpose. What is the retention period?
    2. What security technology is available? How is information protected?
  2. User Participation
    1. What controls do users have over sharing (e.g., of identity information) with other users and with intermediaries?
    2. Are whitelists (buddy lists) available? How can the delivery of identity information to the callee be suppressed?
    3. In the Web context, with many possible communication partners, is there an ability to view granted permissions (e.g., access to camera, microphone) and to revoke them?
  3. Security
    1. Mandatory security for signaling traffic (client-to-server & server-to-server)
    2. Mandatory E2E security for data traffic: is it possible to mandate SRTP for all voice communication? Can something be recommended about the actual key management technique, particularly since various techniques are available, some of them with questionable security benefits (like SDES)? There will obviously be an inherent conflict between lawful intercept and the desire of end users to keep the communication content confidential.
    3. For the case where the exchanged data is not voice but rather instant messaging or pure data, the security solutions are obviously very different. What can be recommended in that case? For XMPP the e2e security mechanism (using the IETF JOSE work) is still work in progress, and the S/MIME-based solution in SIP might not be deployable unless it is paired with another technology, such as SIP Cert, which provides a way to distribute the certificates. RFC 3923 provides the e2e security version using S/MIME for XMPP, which of course suffers from the same limitations as the SIP-based counterpart.
    4. Allow key exchange mechanisms to be replaced by third parties (pluggable models). This is a common concept used in other authentication frameworks, like EAP, SASL or the GSS-API. It gives the communication partners the ability to adjust the capabilities to their needs.
    5. Should perfect forward secrecy be mandated to avoid future compromise of the communication content?
    6. Protection against unauthorized access of stored data.
    7. Software has to come with an update mechanism to react quickly to discovered vulnerabilities.
  4. Privacy-friendly defaults
  5. Re-use open standards that have enjoyed wide community review to ensure the quality of the used technology. (-> to increase transparency)
  6. Use open source code. This should also increase the quality and transparency.
  7. Allow users to choose their identity provider (-> to increase competition and choice)
  8. Offer federated use (-> to increase competition and choice)
  9. Enable data portability to make it easier for users to switch their provider (for example by extracting the buddy list) (-> to increase competition and choice)
  10. Preference for cryptographic identity conveyance (-> to lower misattribution & intrusion)
  11. Offer capabilities for direct exchange of data (-> to provide data minimization)

For users there is only one recommendation, namely to select the application providers operating in jurisdictions they feel comfortable with. This selection will also impact the applicable data protection laws.

Final remarks

To solicit feedback: are these recommendations detailed enough? What else could be recommended? What can be said about the real-time web case, where the security and privacy challenges are even bigger?

Please drop me a message to hannes.tschofenig AT, respond on this mailing list, or comment on the blog post.
