Frequently Asked Questions About ZRTP
ZRTP's principal designer Phil Zimmermann answers a few questions on ZRTP.
Q: What is ZRTP?
A: ZRTP is new secure VoIP phone software which lets you make secure encrypted phone calls over the Internet. The ZRTP protocol will soon be integrated into many standalone secure VoIP clients, but today we have a software product that lets you turn your existing VoIP client into a secure phone. The current ZRTP software runs in the Internet protocol stack on any Windows XP, Mac OS X, or Linux PC, and intercepts and filters all the VoIP packets as they go in and out of the machine, and secures the call on the fly. You can use a variety of different software VoIP clients to make a VoIP call. The ZRTP software detects when the call starts, and initiates a cryptographic key agreement between the two parties, and then proceeds to encrypt and decrypt the voice packets. It has its own little separate GUI, telling the user if the call is secure. It's as if ZRTP were a "bump on the cord", sitting between the VoIP client and the Internet. Think of it as a bump in the protocol stack.
Q: Why do we need ZRTP? For that matter, why do we even need secure VoIP at all?
A: As VoIP grows into a replacement for the PSTN, we will absolutely need to protect it, or organized crime will be attacking it as intensively as they attack the rest of the Internet today. VoIP is far more vulnerable to interception than the PSTN. A PC on your office network can unknowingly host spyware that can intercept your corporate VoIP calls and store and organize them on a hard disk for convenient browsing by criminals half a world away, giving them trade secrets and insider trading opportunities. To see an example of an actual implementation of this kind of spyware, take a look at Peter Cox's SIPtap demo.
The Internet is not a safe medium to carry our phone calls. But ZRTP solves these problems. This technology has social benefits. It has the power to change our lives, enabling us to have a private conversation any time we want with anyone, anywhere - without buying a plane ticket.
Q: What's different about ZRTP's approach to VoIP encryption?
A: The ZRTP protocol has some nice cryptographic features lacking in many other approaches to VoIP encryption. Although it uses a public key algorithm, it does not rely on a public key infrastructure (PKI). In fact, it does not use persistent public keys at all. It uses ephemeral Diffie-Hellman with hash commitment, and allows the detection of man-in-the-middle (MiTM) attacks by displaying a short authentication string for the users to read and verbally compare over the phone. It has perfect forward secrecy, meaning the keys are destroyed at the end of the call, which precludes retroactively compromising the call by future disclosures of key material. But even if the users are too lazy to bother with short authentication strings, we still get fairly decent authentication against a MiTM attack, based on a form of key continuity. It does this by caching some key material to use in the next call, to be mixed in with the next call's DH shared secret, giving it key continuity properties analogous to SSH. All this is done without reliance on a PKI, key certification, trust models, certificate authorities, or key management complexity that bedevils the email encryption world. It also does not rely on SIP signaling for the key management, and in fact does not rely on any servers at all. It performs its key agreements and key management in a purely peer-to-peer manner over the RTP packet stream. And it supports opportunistic encryption by auto-sensing if the other VoIP client supports ZRTP.
There are good reasons why ZRTP does not rely on a PKI approach. There are major problems and complexities with building, maintaining, and relying on PKI. That's why in the 1990s, a number of companies died trying to build and market PKI technology. See Ellison and Schneier's paper Ten Risks of PKI: What You're Not Being Told About Public Key Infrastructure and Ellison's paper Improvements on Conventional PKI Wisdom. Nonetheless, despite all these good reasons why it's not a good idea to become dependent on a PKI, ZRTP can use a PKI if you already have one up and running. Follow this link for how this is done.
In July 2005 at the Black Hat briefings, ZRTP's protocol was introduced and it brought some important innovations. Although they have been used in other forms in the past for other environments, such as PSTN encryption or remote terminal logins, they had never before been applied specifically to negotiate session keys for Secure RTP media streams:
Q: Will ZRTP's protocol become an IETF standard?
A: Alan Johnston, Jon Callas, and Phil Zimmermann have submitted an IETF Internet Draft for the ZRTP protocol, which is used by ZRTP to set up the cryptographic key agreement. Alan co-authored RFC 3261 which defines the SIP standard, and Jon is CTO at PGP Corp.
You can view the current state of the ZRTP Internet Draft here.
Q: Can ZRTP use a PKI (public key infrastructure)?
A: Yes. Despite all the good reasons not to depend on a PKI, the ZRTP protocol does have the optional capability to use a PKI if you already have a PKI up and running. The ZRTP Internet Draft describes how ZRTP can use a PKI-backed digital signature key to sign the short authentication string in the ZRTP CONFIRM packet, to reduce reliance on users verbally comparing them during the call. Organizations that feel comfortable with PKIs can still get what they want. Thus, ZRTP offers all of the advantages of a protocol that can use a PKI, without actually becoming dependent on a PKI for security.
Q: Is ZRTP CALEA compliant?
A: ZRTP's architecture likely renders that question irrelevant. We're not lawyers, but our understanding is that the Communications Assistance for Law Enforcement Act (CALEA) applies in the US to the PSTN phone companies and VoIP service providers, such as Vonage. CALEA imposes requirements on VoIP service providers to give law enforcement access to whatever they have at the service provider, which would be only encrypted voice packets. ZRTP does all its key management in a peer-to-peer manner, so the service provider does not have access to any of the keys. Only ZRTP's end users are involved in the key negotiation, and CALEA does not apply to end users. If the VoIP service providers are smart, they will welcome ZRTP as a solution to being caught in the middle between the end users and the government. In the early 1990s, the government tried to control the end user's use of crypto by introducing the Clipper chip. That didn't go over too well politically, and had to be abandoned. The government will find it difficult to try again to stop end users from encrypting their traffic, regardless of whether that traffic is email, e-commerce web transactions, or VoIP calls.
Q: Will the government attempt to stop VoIP encryption?
A: It's a fair question to ask in a post-9/11 world. Just how likely would it be for the government to restrict the end user's use of secure VoIP? The question of whether strong cryptography should be restricted by the government was debated all through the 1990s. This debate had the participation of the White House, the NSA, the FBI, the courts, the Congress, the computer industry, civilian academia, and the press. This debate fully took into account the question of terrorists using strong crypto, and in fact, that was one of the core issues of the debate. Nonetheless, society's collective decision (over the FBI's objections) was that on the whole, we would be better off with strong crypto, unencumbered with government back doors. The export controls were lifted and no domestic controls were imposed. This was a good decision, because we took the time and had such broad expert participation. The 9/11 attacks did not change the wisdom of that collective decision, and although civil liberties on the whole have eroded since then, we haven't lost our right to use strong crypto.
The law enforcement community will be understandably concerned about the effects encrypted VoIP will have on their ability to perform lawful intercepts. But what will be the overall effects on the criminal justice system if we fail to encrypt VoIP? Historically, law enforcement has benefited from a strong asymmetry in the feasibility of government or criminals wiretapping the PSTN. As we migrate to VoIP, that asymmetry collapses. VoIP interception is so easy, organized crime will be able to wiretap prosecutors and judges, revealing details of ongoing investigations, names of witnesses and informants, and conversations with their wives about what time to pick up their kids at school. The law enforcement community will come to recognize that VoIP encryption actually serves their vital interests.
Q: Exactly how does ZRTP protect against a man-in-the-middle (MiTM) attack?
A: The Diffie-Hellman key exchange by itself does not provide protection against man-in-the-middle (MiTM) attacks. To authenticate the key exchange, ZRTP uses a Short Authentication String (SAS), which is essentially a cryptographic hash of the two Diffie-Hellman values. The SAS value is rendered to both ZRTP endpoints. To carry out authentication, this SAS value is read aloud to the communication partner over the voice connection. If the values on both ends do not match, it indicates the presence of a man-in-the-middle attack. If they do match, there is a high probability that no man-in-the-middle is present. The use of hash commitment in the DH exchange constrains the attacker to only one guess to generate the correct SAS in his attack, which means the SAS can be quite short. A 16-bit SAS, for example, provides the attacker only one chance out of 65536 of not being detected.
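The following Python sketch illustrates the idea described above. It is a simplified illustration, not the message flow from the ZRTP Internet Draft: the group parameters `P` and `G` are toy values chosen only to keep the example short, the SAS is taken as the first 16 bits of a SHA-256 hash over the two public values, and all function names are hypothetical. The point it shows is that one party commits to its public value before seeing the other's, so an attacker cannot try many values in search of a matching SAS; with a 16-bit SAS his single guess succeeds with probability 1/2^16 = 1/65536.

```python
import hashlib
import secrets

# Toy group for illustration only (an assumption); far too small for real use.
P = 2**127 - 1  # a Mersenne prime
G = 3


def dh_keypair():
    """Return a fresh (private, public) ephemeral Diffie-Hellman pair."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    return private, pow(G, private, P)


def encode(value):
    """Fixed-width big-endian encoding of a group element."""
    return value.to_bytes(16, "big")


def commitment(public):
    """Hash commitment to a public value that has not been revealed yet."""
    return hashlib.sha256(encode(public)).digest()


def opens_commitment(commit, revealed_public):
    """Check that a revealed public value matches the earlier commitment."""
    return commitment(revealed_public) == commit


def sas16(initiator_public, responder_public):
    """16-bit SAS: first two bytes of a hash over both DH public values."""
    digest = hashlib.sha256(
        encode(initiator_public) + encode(responder_public)
    ).digest()
    return int.from_bytes(digest[:2], "big")


def displayed_sas(alice_sent, bob_received, bob_sent, alice_received):
    """SAS shown to each user, given what was sent and what actually arrived."""
    alice_sas = sas16(alice_sent, alice_received)
    bob_sas = sas16(bob_received, bob_sent)
    return alice_sas, bob_sas


if __name__ == "__main__":
    _, alice_public = dh_keypair()
    _, bob_public = dh_keypair()

    # Alice sends only the commitment first, and reveals her value later.
    alice_commit = commitment(alice_public)
    assert opens_commitment(alice_commit, alice_public)

    # No attacker: each side receives exactly what the other sent.
    print(displayed_sas(alice_public, alice_public, bob_public, bob_public))

    # Attacker substitutes his own values in each direction. The two SAS
    # values now differ unless his one guess happens to collide.
    _, mallory_to_bob = dh_keypair()
    _, mallory_to_alice = dh_keypair()
    print(displayed_sas(alice_public, mallory_to_bob, bob_public, mallory_to_alice))
```

In the honest run both users see the same number. In the attacked run the two numbers almost always differ, which is what the users detect when they read the SAS aloud.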
ZRTP provides a second layer of authentication against a MiTM attack, based on a form of key continuity. It does this by caching some hashed key material to use in the next call, to be mixed in with the next call's DH shared secret, giving it key continuity properties analogous to SSH. If the MiTM is not present in the first call, he is locked out of subsequent calls. Thus, even if the SAS is never used, most MiTM attacks are stopped, because they weren't present in the first call.
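Key continuity can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the key derivation from the ZRTP Internet Draft: the use of HMAC-SHA-256, the label strings, the all-zero placeholder for an empty cache, and the function names are all hypothetical. It shows why an attacker who missed the first call cannot derive the second call's key even though he completes his own DH exchange with each party.

```python
import hashlib
import hmac

# Placeholder used before any call has been cached (an assumption).
EMPTY_CACHE = bytes(32)


def session_key(dh_secret, cached_secret=EMPTY_CACHE):
    """Mix this call's DH shared secret with the secret cached from the last call."""
    return hmac.new(cached_secret, b"session key" + dh_secret, hashlib.sha256).digest()


def secret_to_cache(key):
    """Derive the hashed key material to keep for the next call."""
    return hmac.new(key, b"retained secret", hashlib.sha256).digest()


# Call 1: no attacker. Both sides share one DH secret and cache the same value.
call1_key = session_key(b"dh secret of call 1")
alice_cache = bob_cache = secret_to_cache(call1_key)

# Call 2: an attacker runs a separate DH exchange with Alice, so he knows
# that DH secret, but he does not know the secret cached during call 1.
dh_alice_attacker = b"dh secret between alice and attacker"
alice_key = session_key(dh_alice_attacker, alice_cache)
attacker_key = session_key(dh_alice_attacker)  # best he can do without the cache

# The keys differ, so the attacker cannot decrypt or re-encrypt the media.
print(alice_key == attacker_key)
```

The same reasoning explains the limitation stated above: if the attacker had been present during call 1, he would hold the cached secret too, which is why the SAS comparison remains the primary check.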
ZRTP provides yet a third layer of authentication by recommending that the SAS be provided to the signaling layer, where it can be sent via a SIP re-invite after the call is underway. The SAS can then be automatically compared by the VoIP clients. There are some environments where there is good reason to trust the signaling layer. The signaling layer must provide its own separate authenticated channel (out-of-band from the media layer) through which to send the SAS. Perhaps this may seem to contradict the design goal of not trusting the signaling layer with protecting the media encryption, but it does no harm to add this as yet another layer of checking the SAS, provided we don't depend on it. It does not replace the need to have the human users check the SAS rendered via a user interface.
Q: Is ZRTP open source?
A: Phil Zimmermann is a firm believer in publishing the source code for cryptographic software for peer review, to build public confidence that it contains no back doors, a tradition he started in 1991 with PGP. PGP is a proprietary product, even though the source code is available for peer review. Publishing the source code for peer review is not the same as making it available under an open source license. A number of people suggested making at least part of ZRTP available under a dual-licensing approach, with GPL licensing for inclusion in open source projects, and commercial licensing for proprietary products. Some software, such as MySQL, has taken this path. So Phil decided to dual-license the ZRTP SDK. This transition to a dual-licensed GPL approach will go into effect at the completion of the beta test phase.
ZRTP has several major components, and not all of them are published under the same licensing terms. The entire body of source code for the complete ZRTP client software is published for peer review. In addition, the ZRTP libZRTP SDK library is published under the GPL version 2. The libZRTP SDK may be included in any GPL project (and if you need it for other non-GPL open source applications, contact us). However, the rest of the ZRTP application as a whole (as opposed to just the libZRTP SDK library) remains proprietary, and is published for public peer review, but not under an open source license. For a full discussion of this, see the ZRTP Licensing Policy page.
Q: Does ZRTP work with Skype?
A: No. Skype uses a closed proprietary protocol, which they do not publish. That makes it hard to make ZRTP work with it. Skype does not interoperate with the rest of the VoIP industry, which is built on open standards. ZRTP follows the industry standards.
Q: Does ZRTP work with plain old telephone service (POTS) phones?
A: Nope. Sorry. It only works with VoIP protocols, not PSTN, or POTS, phones. VoIP is the wave of the future, and doesn't work with the old public switched telephone network. A famous hockey player said "I try to skate to where I think the puck will be."
Q: My VoIP service provider (such as Vonage or AT&T) gave me an ATA (Analog Telephony Adapter), or VoIP router, that allows me to connect my old-fashioned telephone to my broadband connection. Will ZRTP work with that?
A: Well, not with that exact setup, no. Your ATA or VoIP router is a hardware device that lets you connect your old analog telephone to a VoIP network. To make a secure call with that kind of setup, you would have to have an ATA with the ZRTP protocol integrated inside, which will happen someday. (Tell us at Ripcord if you want this.) In the meantime, if you really want to run ZRTP now, you need to run a software VoIP client (such as X-Lite, Gizmo, SJphone, or perhaps a software VoIP client supplied by your VoIP service provider) on your PC or Macintosh computer. You can use the software VoIP client to connect to your VoIP service provider from your computer, not from a normal telephone. Then you can install ZRTP on the same computer, and have it convert your VoIP call to the ZRTP protocol. And, of course, the other party you are calling must also be running a VoIP client that supports the ZRTP protocol on the other end. This will become simpler when ATA or VoIP router vendors integrate the ZRTP protocol inside their hardware.
Q: Why do we need ZRTP if we already have SRTP? Isn't SRTP good enough?
A: This is the wrong question to ask. Despite the similarity in the two names, it is not a choice between SRTP and ZRTP. One does not replace the other. SRTP is the protocol we use to encrypt the low level voice packets. But SRTP alone is not the whole solution. You cannot use SRTP until both parties have agreed on what key to use for the SRTP encryption. That's where ZRTP comes in. ZRTP is the protocol that the two parties use to negotiate the SRTP session key. The call is encrypted with SRTP, but ZRTP runs first to negotiate the SRTP session key.
Q: Can ZRTP be used with H.323 or other signaling protocols?
A: Yes, ZRTP can be used with any signaling protocol, including SIP, H.323, MGCP, Jingle, and Peer-to-Peer SIP. ZRTP is independent of the signaling layer, because it does all its key negotiations in the media stream.
Q: Isn't it a protocol layer violation to do the key management in the media instead of in the signaling?
A: Some proponents of other VoIP encryption schemes say that it offends their sensibilities to see ZRTP negotiate the cryptographic keys in the media stream, instead of in the signaling layer, as other VoIP encryption schemes do. They call it a "layer violation". But to many protocol designers it seems clear that the signaling should take care of its own key negotiation for signaling authentication, and the media layer should negotiate its own keys for media encryption. The two layers should each take care of their own cryptographic needs. If anything, doing the media encryption key negotiation in the signaling layer is the real layer violation.
In the same vein, the VoIP service providers can't always be trusted to act with your interests in mind, so ZRTP doesn't involve their SIP servers in ZRTP encryption key negotiations. If you want to speak Navajo with your business partner on the phone, you shouldn't have to clear it first with the phone company. It's just none of their business. And that's part of what makes ZRTP so broadly appealing.
It's also worth noting that traditional secure telephones in the PSTN world, such as the AT&T TSD 3600 or the STU-III, did all their key negotiations in the media stream. They used a modem to establish a digital channel on a normal voice grade phone line, negotiated their keys, and sent an encrypted voice stream all on the same channel. No one called it a layer violation. This is the way secure phones always worked before VoIP came along.
Q: Many VoIP clients include some form of built-in text chat or instant messaging. Does ZRTP encrypt those text messages?
A: No. Not yet, anyway. ZRTP does very well by limiting its mission to just managing the keys and encrypting RTP media streams for VoIP. Also, these instant text messaging protocols come in a number of different variants, such as AIM or Jabber, with different VoIP clients supporting different instant messaging protocols. Each of them will require a different method of encryption, and that remains to be worked out. Some methods already exist for encrypting some forms of text chatting, such as the one offered by PGP Corp. We're looking at the most appropriate methods to add this capability to ZRTP.
Q: Does ZRTP work with IAX?
A: No. IAX is the Inter-Asterisk eXchange protocol used by Asterisk, an open source PBX server from Digium. However, ZRTP has been successfully integrated into Asterisk PBX servers in support of SIP/RTP calls, and Zfone is working with Digium to make it available in Asterisk products. But ZRTP and IAX are not well suited for each other in their present forms. Perhaps Phil will have a look at IAX more closely to see what can be done to improve its security.
Q: Does ZRTP have any "back doors"?
A: Anyone who knows anything about Phil Zimmermann knows the answer is No. In fact, he has a whole page on that subject regarding PGP software, and it applies equally to ZRTP. Now, having said that, we remind you that ZRTP is still beta software, and has a few bugs. Until Zfone does a real release, they make no claims about it being secure. The internal code reviews aren't finished, and someone may yet discover bugs that affect security. That's also why there is a public beta. We need you to help us test the code.
It's easy to keep back doors out of your own product, as is done with ZRTP. It's much harder to keep back doors out of other vendors' implementations of the ZRTP protocol. Take a look at these ideas for ZRTP's back-door-resistant features.
Q: Is the Short Authentication String (SAS) vulnerable to an attacker with voice impersonation capabilities?
A: In practical terms, no. It is a mistake to think this is simply an exercise in voice impersonation (perhaps this could be called the "Rich Little" attack). Although there are digital signal processing techniques for changing a person's voice, that does not mean a man-in-the-middle attacker can safely break into a phone conversation and inject his own short authentication string (SAS) at just the right moment. He doesn't know exactly when or in what manner the users will choose to read aloud the SAS, or in what context they will bring it up or say it, or even which of the two speakers will say it, or if indeed they both will say it. In addition, some methods of rendering the SAS involve using a list of words such as the PGP word list, in a manner analogous to how pilots use the NATO phonetic alphabet to convey information. This can make it even more complicated for the attacker, because these words can be worked into the conversation in unpredictable ways. Remember that the attacker places a very high value on not being detected, and if he makes a mistake, he doesn't get to do it over.
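To make the idea of rendering the SAS as words concrete, here is a small Python sketch. It does not use the PGP word list. As a stand-in, it maps each 4-bit nibble of a 16-bit SAS to one of the first sixteen NATO phonetic alphabet words; this mapping and the function name `sas_to_words` are assumptions made purely for illustration.

```python
# First sixteen NATO phonetic alphabet words, one per 4-bit value.
# This is a hypothetical stand-in for a real word list such as the PGP word list.
WORDS = [
    "alfa", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel",
    "india", "juliett", "kilo", "lima", "mike", "november", "oscar", "papa",
]


def sas_to_words(sas):
    """Render a 16-bit SAS as four words, most significant nibble first."""
    if not 0 <= sas <= 0xFFFF:
        raise ValueError("SAS must fit in 16 bits")
    return [WORDS[(sas >> shift) & 0xF] for shift in (12, 8, 4, 0)]


print(" ".join(sas_to_words(0x1A2F)))  # prints "bravo kilo charlie papa"
```

Words are easier to say and to hear correctly than digits, and the speakers can drop them into the conversation in whatever way they like, which is the unpredictability the answer above relies on.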
Some people have raised the question that even if the attacker lacks voice impersonation capabilities, it may be unsafe for people who don't know each other's voices to depend on the SAS procedure. This is not as much of a problem as it seems, because it isn't necessary that they recognize each other by their voice, it's only necessary that they detect that the voice used for the SAS procedure matches the voice in the rest of the phone conversation.
Q: Is ZRTP covered by any patents?
A: Phil thinks software patents stifle innovation and have done a great deal of harm to the software industry, especially in the crypto world, and he agrees with the League for Programming Freedom on this matter. The RSA patent holders wielded their patent to do all they could to destroy PGP in the 1990s. Rather than experience this problem again, Phil applied for a patent relating to some aspects of ZRTP and plans to use it defensively against other companies who might make patent claims against ZRTP in the future. Zfone has filed an Intellectual Property declaration with the IETF regarding any patent rights it may have in the future, and has offered a royalty-free license under many, if not most, conditions, for people who don't sue it. Having the patent also allows Zfone to better serve the user community's interests by providing leverage to get other ZRTP implementers to abide by ZRTP's back-door-resistant features. For more details, see the IPR statement filed with the IETF. Keep in mind that the IPR statement is not the real license, which would contain the definitive details.
Q: Does ZRTP encrypt Touch-Tone keypad DTMF tones?
A: Yes. ZRTP encrypts all RTP traffic, including Touch-Tone keypad DTMF tones. DTMF tones are carried in the RTP media stream using methods defined by RFC 2833, embedded as special RTP payload types. ZRTP encrypts these along with the rest of the RTP media stream, which is important because people use DTMF tones to enter their credit card numbers when they call their bank, for example.
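To show what is at stake, here is a short Python sketch that decodes the 4-byte telephone-event payload defined by RFC 2833 (an 8-bit event code, an end bit, a reserved bit, a 6-bit volume, and a 16-bit duration). The function name and the returned dictionary are hypothetical. The sketch only parses the payload; it makes the point that anyone who captures an unencrypted RTP stream can read the pressed keys directly.

```python
import struct

# RFC 2833 event codes 0-15 correspond to these DTMF keys.
EVENT_KEYS = "0123456789*#ABCD"


def parse_telephone_event(payload):
    """Decode the first 4 bytes of an RFC 2833 telephone-event RTP payload."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return {
        # Codes above 15 are non-DTMF events; report them as None here.
        "key": EVENT_KEYS[event] if event < len(EVENT_KEYS) else None,
        "end": bool(flags & 0x80),   # E bit: this packet ends the event
        "volume": flags & 0x3F,      # power level, expressed in -dBm0
        "duration": duration,        # in RTP timestamp units
    }


# Key "5", end bit set, volume 10, duration 800 timestamp units.
print(parse_telephone_event(bytes([5, 0x8A, 0x03, 0x20])))
```

Once the media stream is encrypted, these four bytes are ciphertext like the rest of the RTP payload, so the digits of a credit card number are no longer readable on the wire.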
There was a problem with an old version of the ZRTP beta software regarding the encryption of DTMF tones. Forbes magazine had an article on 2 August 2007 that reported a problem which they attributed to ZRTP's handling of DTMF tone encryption. In fact this was not due to any deficiencies in the ZRTP protocol, but was due to a software bug in the ZRTP beta software that existed in April 2007. That bug was fixed before the article appeared. The bug only happened when ZRTP was used in conjunction with SJLabs' SJphone, triggered by a subtle interaction with a bug in SJphone that was improperly generating DTMF packet sequences. Current versions of ZRTP always encrypt DTMF tones from all VoIP clients, including SJphone.
There is a potential but unlikely problem with DTMF handling that has never been reported in ZRTP. In unusual cases it is possible to send DTMF over the SIP channel. Some very old, non-standard SIP clients send it using a SIP INFO method - there is no RFC for this and it is discouraged strongly by the SIP standards community. There is also a new SIP extension (RFC 4730) known as KPML (Keypress Markup Language) which can be used to send DTMF over a SIP NOTIFY - but very few VoIP clients implement this yet, and the ones that do don't seem to be using it, and can be easily configured to not use it. If you ever find a VoIP client that uses KPML, we recommend that you simply disable this feature and allow DTMF to be carried the traditional way in the RTP media stream. Note that all RTP media encryption protocols, not just ZRTP, would be equally affected by this problem if SIP is used to carry DTMF.
Q: Why is it called ZRTP?
A: It was Alan Johnston who suggested ZRTP, because it negotiates the session keys for SRTP, Secure Real-time Transport Protocol (RFC 3711). An alternative is the unwieldy acronym for Media Path Key Agreement for Secure RTP. The regrettably less descriptive mutation of that name, incorporating Phil's last initial, won out. It's worth noting that in the crypto community it's very much the norm for crypto protocols to be named after their inventors. Examples include RSA, Diffie-Hellman, ElGamal, CAST, RC2 and RC4 (Ron's Code), Blakley-Shamir, and many others.
Q: Why do I have to register in order to download ZRTP?
A: Although the US has ended most of its export controls for crypto software, there are still some reasonable residual export controls in place, namely, to prevent the software from being exported to a few embargoed nations, such as North Korea, Iran, Libya, Syria, and Sudan. And for commercial encryption software that you actually pay for (which does not include this free public beta), there are now requirements to check customers against government watch lists as well, which is something that companies such as PGP comply with these days. PGP Corp volunteered to host the public beta software on their server, with all the appropriate checks in place.
The ZRTP registration page checks your IP address against the list of embargoed countries, then emails you a link that you must click on to start your download, and checks your IP address again when you follow that link, which presumably means you did not receive your email in an embargoed country, and that the download itself did not go to an embargoed country. The U.S. Government deems this as adequate evidence that we made our best efforts to comply with U.S. export laws. Staying out of that kind of trouble is important to everyone.
Q: Will I have to worry about U.S. export controls on ZRTP in Ripcord Canopy?
A: Well, yes, but it's pretty easy to deal with these days. If you plan to export your product from the U.S., you will have to file some papers with the U.S. Commerce Department.
Q: Why can't I just use IPsec to encrypt my VoIP calls?
A: Well, you can, but it would be a bad idea in most cases. IPsec encryption is done down in the IP layer of the Internet protocol suite's protocol stack, which is too low a layer to let the user know if it is running. Some routers support IPsec, and some don't. You don't know if the other party supports IPsec, so some connections will not be encrypted, and you would never know it. If you don't know for sure whether the call is encrypted, what good is it? It's better to do the encryption at the application layer, so that the user can be told if the call is encrypted.
Q: Does ZRTP protect against "social network analysis" and other forms of analysis based on traffic patterns?
A: No, not at all. ZRTP just encrypts the contents of the call. The only way to protect against traffic analysis is to go through multiple intermediaries, which is a technique that has been used to protect email and web browsing (see the Tor project for an example of this). But this adds latency to communications, which may be unnoticeable for email, and at least tolerable for web browsing, but would be unacceptable for phone calls. Further, these countermeasures may be ineffective against a clever and resourceful opponent, because it's hard to hide the timing and length of the messages, especially if there are real-time communication requirements.
Q: How does ZRTP verify the identity of who you call?
A: It doesn't. It doesn't even try. It's not necessary to verify the identity of the other party to establish a secure call. In a normal PSTN phone call, what happens if you call someone's number, and his wife answers the phone? Do you sound a klaxon horn and blow a fuse? No. You use your brain to figure it out. That's just how phones work, and it's no big deal. It's certainly no reason to fail to make a secure connection. The most important wiretapping vulnerability is a Man-in-the-Middle (MiTM) attack, which ZRTP guards against by using either a short authentication string, or key continuity, or both.
Of course, it helps if you know the identity of the caller before you answer the phone, like the Caller ID in the PSTN. The SIP protocol attempts to address that problem in the signaling. It is a different problem, and certainly worthy of attention, but it is not the job of ZRTP. ZRTP does not begin until after the user answers the phone and the call is underway. ZRTP merely establishes a secure wiretap-resistant connection to another ZRTP endpoint, and does it very well by narrowing the scope of its mission.
Many people get hung up on this question of making an "authenticated phone call". It's a hard problem, and not worth the effort, in our opinion. Most phones, both at home and at work, are used by more than one person. And many people use other people's phones on a regular basis. There is not a one-to-one relationship between people and phones. Then there is the problem of establishing a digital identity. We could have a complex bureaucracy create a public key infrastructure that issues a certificate that we can attach to your phone, which can be displayed by the caller's phone. Not only is that of questionable value in our opinion, but it's also hard. A number of clever people, including Carl Ellison, have written about the complexity of creating unambiguous unique names and attaching them to people.
It's a mistake to view the world through a radar screen-- you must also use your eyes. And your common sense. The ZRTP protocol cannot tell you the name on the birth certificate of the person you are talking to. Or that the person you are talking to is telling the truth. And it cannot tell you if the other "endpoint" is then forwarding the call to another device. But neither can anything else.
Used with Permission of Zfone Project, Inc.