Use an existing protocol?
If you opt to use an existing protocol like SSH-TRANS, TLS, or IPsec, you’re actually being pretty sensible. There are plenty of good reasons why you’d want to roll your own, though:
- Simplicity. Too much choice of ciphers is bad for you. Do something simple that uses some good primitives and have done with it.
- Understandable. Generic transports can be too generalised, with dangerous features you don’t know about. Libraries offering implementations of TLS are very hard to use, for example. I have more confidence that I can write correct code invoking some AES routines than I have in my ability to invoke TLS routines correctly. There are just too many options and quirky behaviours to understand: renegotiations, stored sessions, complex certificate models, close behaviour, and more.
- Minimal. Generic transports might have stuff that just doesn’t make sense for you. Transport-level compression? Padding modes? You might need something simple that you can implement everywhere with minimal dependencies.
Use ephemeral encryption for forward security
This is the #1 thing we got wrong in the 90s. Many protocols didn’t use ephemeral encryption (in fact, did any?). The idea was perhaps that hardware was too slow to support it. In any case, we can do it now, so any new connection layer should do it.
Without ephemeral keys, the attack goes like this: someone records your session, waits for a year until you throw away your phone, recovers your private key from it, and can now decrypt the recorded session. He doesn’t need to go after the key that’s safe in your datacentre: either end’s private key is enough. This is very unfortunate. It’s not an attack that needs crypto cleverness; you just steal the key and use off-the-shelf shrinkwrapped software from your vendor (eg stock RealVNC or OpenSSH) to decrypt the saved session.
When a session ends, both parties should scrub the key used for encryption, and the knowledge of it passes out of the world. Later key compromise won’t let someone decrypt connections made prior to the key’s theft.
Never use long-lived keys (which identify a party) to do encryption as well.
How do we do this? There’s basically only one system these days worth using: Diffie-Hellman. Traditional DH is a bit slow, but forward security is trending now because of ECDH, which is really rather fast. There’s now no excuse. Use ECDH for every connection you ever make.
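As a rough illustration of the per-connection exchange, here is a minimal sketch in Go using the standard library's crypto/ecdh package. The choice of X25519, the function names, and the use of a bare SHA-256 hash as the key derivation step are all my assumptions for the example, not part of any particular protocol; a real design would use a proper KDF and bind both public keys into it.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"crypto/sha256"
)

// ephemeralKey generates a fresh X25519 key pair intended for one connection only.
func ephemeralKey() (*ecdh.PrivateKey, error) {
	return ecdh.X25519().GenerateKey(rand.Reader)
}

// sessionKey combines our ephemeral private key with the peer's ephemeral
// public key and hashes the shared secret down to a 32-byte session key.
// Plain SHA-256 here is a simplification (an assumption of this sketch).
func sessionKey(ours *ecdh.PrivateKey, theirs *ecdh.PublicKey) ([32]byte, error) {
	shared, err := ours.ECDH(theirs)
	if err != nil {
		return [32]byte{}, err
	}
	key := sha256.Sum256(shared)
	wipe(shared) // the raw shared secret is no longer needed
	return key, nil
}

// wipe overwrites a secret in place. This is best effort only: a
// garbage-collected runtime may still hold copies elsewhere in memory.
func wipe(b []byte) {
	for i := range b {
		b[i] = 0
	}
}
```

Each side calls ephemeralKey once per connection, sends its public half to the peer, and calls sessionKey with the peer's public half; both ends arrive at the same key. When the connection closes, the session key should be wiped and the ephemeral private key dropped.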
How do I identify someone?
Use RSA signatures. See “What signature algorithm do I use?”.
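One way to tie identity to the ephemeral exchange is for each side to sign its ephemeral public key with its long-lived RSA key, so the long-lived key is only ever used for signing. The sketch below assumes Go's crypto/rsa package, RSA-PSS padding with SHA-256, and a 2048-bit key; those choices and the function names are mine, picked for illustration.

```go
package main

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
)

// newIdentity creates a long-lived RSA key pair that identifies a party.
// The 2048-bit size is an assumption of this sketch.
func newIdentity() (*rsa.PrivateKey, error) {
	return rsa.GenerateKey(rand.Reader, 2048)
}

// signEphemeral signs the bytes of an ephemeral public key with the
// long-lived identity key. The identity key never encrypts anything.
func signEphemeral(identity *rsa.PrivateKey, ephemeralPub []byte) ([]byte, error) {
	digest := sha256.Sum256(ephemeralPub)
	return rsa.SignPSS(rand.Reader, identity, crypto.SHA256, digest[:], nil)
}

// verifyEphemeral reports whether sig is a valid signature by peer over
// the given ephemeral public key bytes.
func verifyEphemeral(peer *rsa.PublicKey, ephemeralPub, sig []byte) bool {
	digest := sha256.Sum256(ephemeralPub)
	return rsa.VerifyPSS(peer, crypto.SHA256, digest[:], sig, nil) == nil
}
```

A real handshake would sign more than the bare public key (for instance both sides' ephemeral keys plus some context) to prevent replay; that is left out here to keep the example short.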
How do I then encrypt data?
The obvious answer is Rijndael (AES). AES isn’t ideal, but it’s so widely used you need a good reason to strike out and use something else.
There are some good reasons, in fact. AES is notoriously hard to implement in software in constant time. If you can run on the same processor where some encryption is being done with a key you want to steal, just watching the timeslices leaks enough information to get it eventually. Getting a slice on Rackspace where your enemy runs his infrastructure on some VMs could, in theory, be enough.
Timing attacks are not easy, however. I’m not concerned (maybe I should be), but any crypto is good if that’s the biggest worry.
Further, AES can be done with dedicated instructions on most Intel processors since 2010, and on ARMv8. These are constant-time.
Finally, AES is approved for sale into military or government markets. The competitors aren’t (at high security levels—3DES is approved up to 80 bits of security).
For an overview of alternatives to AES, see Matthew Green, “So you want to use an alternative cipher…”. The general wisdom is that Salsa20 is one of the best replacements for AES, and certainly my favourite, but at 3 cycles/byte on x86 it’s not even quicker than hardware-accelerated (AES-NI) AES, although it beats i586 AES (15 cycles/byte). Secondly, there are no widely-approved ways of adding message authentication to a stream cipher. Check back in a decade.
Finally, how do we authenticate data?
Or, what mode of operation do I use for AES? There are a lot of different opinions here. (A few representative ones: Colin Percival says “only use CTR-HMAC”; plenty of people don’t like GCM because it’s too hard to understand; some people don’t like EAX because it was rejected by NIST…)
Some absolutes: CBC is too hard to get right. Don’t do it. HMAC-CTR (MAC first, then encrypt) is a mistake: if you’re going to MAC, do encrypt-then-MAC rather than MAC-then-encrypt, because practical breaches have been shown against protocols that did the wrong thing, like SSH.
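To make the encrypt-then-MAC ordering concrete, here is a minimal sketch of CTR-HMAC in Go using only the standard library. The message layout (IV, then ciphertext, then tag), the choice of HMAC-SHA256, the random IV, and the names seal and open are assumptions of this example. Note that the MAC covers the IV as well as the ciphertext, and that the tag is checked before anything is decrypted.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
)

const tagSize = sha256.Size // 32-byte HMAC-SHA256 tag

// seal encrypts plaintext with AES-CTR, then MACs IV||ciphertext.
// Output layout (an assumption of this sketch): IV || ciphertext || tag.
func seal(encKey, macKey, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(encKey)
	if err != nil {
		return nil, err
	}
	n := aes.BlockSize + len(plaintext)
	out := make([]byte, n, n+tagSize)
	iv := out[:aes.BlockSize]
	if _, err := rand.Read(iv); err != nil {
		return nil, err
	}
	cipher.NewCTR(block, iv).XORKeyStream(out[aes.BlockSize:], plaintext)
	mac := hmac.New(sha256.New, macKey)
	mac.Write(out) // authenticate the IV together with the ciphertext
	return mac.Sum(out), nil
}

// open verifies the tag in constant time and only then decrypts.
func open(encKey, macKey, msg []byte) ([]byte, error) {
	if len(msg) < aes.BlockSize+tagSize {
		return nil, errors.New("message too short")
	}
	body, tag := msg[:len(msg)-tagSize], msg[len(msg)-tagSize:]
	mac := hmac.New(sha256.New, macKey)
	mac.Write(body)
	if !hmac.Equal(mac.Sum(nil), tag) {
		return nil, errors.New("authentication failed")
	}
	block, err := aes.NewCipher(encKey)
	if err != nil {
		return nil, err
	}
	plaintext := make([]byte, len(body)-aes.BlockSize)
	cipher.NewCTR(block, body[:aes.BlockSize]).XORKeyStream(plaintext, body[aes.BlockSize:])
	return plaintext, nil
}
```

The encryption key and the MAC key should be independent; deriving both from the session key is the usual approach, but how that is done is outside this sketch.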
The choices are then CTR-HMAC, OCB, EAX, or GCM. I’m assuming we know what those are and the arguments.
My own opinion is to use GCM. I have some idiosyncratic reasons for that preference, as well as standard ones:
- It’s NIST approved. Marketing/sales potential. This wins over EAX.
- It’s fast. The only mode that will get hardware acceleration in phones. Mobile is important for the future of your product, even if you don’t think so. Fast AES followed by slow authentication is no-one’s idea of a good time.
- It’s unencumbered (beating OCB).
- It’s easy to use, if not to implement (beating CTR-HMAC, which actually has some nasty gotchas regarding the IV)
- We do now have some roughly constant-time software implementations if that’s a concern.
- My favourite: it’s got browser support. If you want to port your application to run in a browser, and you should, then you’ll want to use the Web Cryptography API. WebCrypto offers access to fast native primitives from JavaScript. In traditional software, if your platform doesn’t ship with certain routines you can supply them yourself and not lose out (except on automatic updates). It’s OK for a desktop product to pick a maverick cipher and ship the code for it. A browser-based product needs that browser support to be fast, though. Using a cipher that might be dropped by browsers, or risks not having 100% browser support in the future, is a risk for your protocol! GCM is the only authenticated AES mode I can see being supported by all browsers for ever, because it’s enshrined in TLS over and above CTR-HMAC and EAX. Mozilla and IE ship it already.
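For comparison with the hand-assembled modes, this is roughly what using GCM looks like through a library, sketched in Go with crypto/cipher. The record framing, the function names, and the nonce construction (a per-direction sequence number in the last 8 bytes of the 12-byte nonce) are assumptions of this example. The one thing GCM will not forgive is a repeated nonce under the same key, which is why a counter is used here and why each direction of a connection should get its own key.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"encoding/binary"
)

// newGCM wraps a 16-, 24-, or 32-byte session key in an AES-GCM AEAD.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// nonceFor builds a nonce from a record sequence number. The caller must
// never reuse a sequence number with the same key.
func nonceFor(aead cipher.AEAD, seq uint64) []byte {
	nonce := make([]byte, aead.NonceSize()) // 12 bytes for standard GCM
	binary.BigEndian.PutUint64(nonce[len(nonce)-8:], seq)
	return nonce
}

// sealRecord encrypts plaintext and authenticates it together with the
// unencrypted header. The result is ciphertext followed by a 16-byte tag.
func sealRecord(aead cipher.AEAD, seq uint64, header, plaintext []byte) []byte {
	return aead.Seal(nil, nonceFor(aead, seq), plaintext, header)
}

// openRecord verifies and decrypts a record. It fails if the ciphertext,
// header, or sequence number differ from what the sender used.
func openRecord(aead cipher.AEAD, seq uint64, header, sealed []byte) ([]byte, error) {
	return aead.Open(nil, nonceFor(aead, seq), sealed, header)
}
```

Because the receiver supplies the expected sequence number itself, replayed or reordered records fail authentication without any extra bookkeeping. WebCrypto exposes the same mode under the algorithm name AES-GCM, so the same record format should be portable to a browser client.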