Why I don’t Use 2048 or 4096 RSA Key Sizes

I have used non-standard RSA key sizes for maybe 15 years. For example, my old OpenPGP key created in 2002. By non-standard key sizes, I mean RSA key sizes that are not 2048 or 4096. I do this when I generate OpenPGP/SSH keys (using GnuPG with a smartcard like this) and PKIX certificates (using GnuTLS or OpenSSL, e.g. for XMPP or for HTTPS). People sometimes ask me why. I haven’t seen anyone talk about this, or provide a writeup, that is consistent with my views. So I wanted to write about my motivation, so that it is easy for me to refer to, and hopefully to inspire others to think similarly. Or to provoke discussion and disagreement — that’s fine, and hopefully I will learn something.

Before proceeding, here is some context: when building new things, it is usually better to use the elliptic-curve algorithm Ed25519 instead of RSA. There is also ECDSA — which has had a comparatively slow uptake, for a number of reasons — that is widely available and is a reasonable choice when Ed25519 is not available. There are also post-quantum algorithms, but they are newer and adopting them today requires a careful cost-benefit analysis.

First some background. RSA is an asymmetric public-key scheme, and relies on generating private keys which are the product of distinct prime numbers (typically two). The size of the resulting product, called the modulus n, is usually expressed in bit length and forms the key size. Historically, RSA key sizes were a couple of hundred bits; then 512 bits settled as a commonly used size. With better understanding of RSA security levels, the common key size evolved into 768, 1024, and later 2048. Today’s recommendations (see keylength.com) suggest that 2048 is on the weak side for long-term keys (5+ years), so there has been a trend to jump to 4096. The performance of RSA private-key operations starts to suffer at 4096, and the bandwidth requirements are causing issues in some protocols. Today 2048 and 4096 are the most common choices.

My preference for non-2048/4096 RSA key sizes is based on the simple and naïve observation that if I were to build an RSA key cracker, there is some likelihood that I would need to optimize the implementation for a particular key size in order to get good performance. Since 2048 and 4096 are dominant today, and 1024 was dominant some years ago, it may be feasible to build optimized versions for these three key sizes.

My observation is a conservative decision based on speculation, and speculation on several levels. First I assume that there is an attack on RSA that we don’t know about. Then I assume that this attack is not as efficient for some key sizes as for others, either at the theoretical level, at the implementation level (optimized libraries for certain characteristics), or at the economic/human level (the decision to focus on common key sizes). Then I assume that by avoiding the efficiently attacked key sizes I can increase the difficulty to a sufficient level.

Before analyzing whether those assumptions may even remotely make sense, it is useful to understand what is lost by selecting uncommon key sizes. This is to understand the cost side of the trade-off.

A significant burden would be if implementations didn’t allow selecting unusual key sizes. In my experience, enough common applications support uncommon key sizes, for example GnuPG, OpenSSL, OpenSSH, Firefox, and Chrome. Some applications limit the permitted choices; this appears to be rare, but I have encountered it once. Some environments also restrict the permitted choices: for example, Let’s Encrypt has introduced a requirement that RSA key sizes be a multiple of 8. I noticed this because I chose an RSA key size of 3925 for my blog and received a certificate from Let’s Encrypt in December 2015, but during renewal in 2016 it led to an error message about the RSA key size. Some commercial CAs that I have used before restrict the RSA key size to one of 1024, 2048 or 4096 only. Some smart-cards also restrict the key sizes; sadly, the YubiKey has this limitation. So it is not always possible, but possible often enough for me to be worthwhile.
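
Most tooling simply takes the modulus size as a number, so trying an uncommon size is easy. A minimal sketch with OpenSSL (the 3744-bit size and the file name are just examples):

openssl genrsa -out odd-rsa-key.pem 3744
openssl rsa -in odd-rsa-key.pem -noout -text | head -1   # the first line shows the modulus size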

Another cost is that RSA signature operations are slowed down. This is because the modular arithmetic underlying RSA tends to be fastest when the key size has a bit pattern of a 1 followed by several 0’s, i.e. when it is a power of two such as 2048 or 4096, and implementations are tuned for those sizes. I have not done benchmarks, but I have not experienced this as a practical problem for me. I don’t notice RSA operations in the flurry of all the other operations (network, I/O) that are usually involved in my daily life. Deploying this on a large scale may have effects, of course, so benchmarks would be interesting.
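
If you want to run such a benchmark yourself, a rough and unscientific timing sketch like the following (assuming bash and OpenSSL; the sizes and iteration count are arbitrary) could serve as a starting point:

for bits in 2048 3744 4096; do
  openssl genrsa -out bench-$bits.pem $bits 2>/dev/null
  echo "== $bits-bit signing, 50 iterations =="
  time for i in $(seq 50); do
    openssl dgst -sha256 -sign bench-$bits.pem -out /dev/null /etc/hostname
  done
done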

Back to the speculation that leads me to this choice. The first assumption is that there is an attack on RSA that we don’t know about. In my mind, until there is a proof that the currently known attacks (GNFS-based attacks) are the best that can be found, or at least some heuristic argument that we can’t do better than the current attacks, the probability of an unknown RSA attack is therefore, as strange as it may sound, 100%.

The second assumption is that the unknown attack(s) are not as efficient for some key sizes as for others. That statement can also be expressed like this: the cost to mount the attack is higher for some key sizes compared to others.

At the implementation level, it seems reasonable to assume that implementing an RSA cracker for arbitrary key sizes could be more difficult and costlier than focusing on particular key sizes. Focusing on some key sizes allows optimization and less complex code.

At the mathematical level, the assumption that the attack would be costlier for certain RSA key sizes appears dubious. It depends on what kind of algorithm the unknown attack is. For something similar to GNFS attacks, I believe the same algorithm applies equally to RSA key sizes of 2048, 2730 and 4096, and that the running time depends mostly on the key size. Other algorithms that could crack RSA, such as some approximation algorithms, do not seem likely to be thwarted by using non-standard RSA key sizes either. I am not a mathematician, though.

At the economic or human level, it seems reasonable to say that if you can crack 95% of all keys out there (sizes 1024, 2048, 4096) then that is good enough, and cracking the last 5% is just diminishing returns on the investment. Here I am making up the 95% number, but I would guess that more than 95% of all RSA keys on the Internet are currently 1024, 2048 or 4096 bits. So this aspect holds as long as people keep behaving as they have done.

The final assumption is that by using non-standard key sizes I raise the bar sufficiently high to make an attack impossible. To be honest, this scenario appears unlikely. However, it might increase the cost somewhat, by a factor of two or five. Which might make someone target a lower-hanging fruit instead.

Putting my argument together, I have 1) identified some downsides of using non-standard RSA key sizes and discussed their costs and implications, and 2) mentioned some speculative upsides of using non-standard key sizes. I am not aware of any argument that the odds of my speculation being true are 0%. It appears there is some remote chance, higher than 0%, that my speculation is true. Therefore, my personal conservative approach is to hedge against this unlikely, but still possible, attack scenario by paying the moderate cost of using non-standard RSA key sizes. Of course, the QA engineer in me also likes to break things by not doing what everyone else does, so I end this with an ObXKCD.

Let’s Encrypt Clients

As many others, I have been following the launch of Let’s Encrypt. Let’s Encrypt is a new zero-cost X.509 Certificate Authority that supports the Automated Certificate Management Environment (ACME) protocol. ACME allows you to automate creation and retrieval of HTTPS server certificates. As anyone who has maintained a number of HTTPS servers can attest, this process has unfortunately been manual, error-prone and different for each CA.

On some of my personal domains, such as this blog.josefsson.org, I have been using the CACert authority to sign the HTTPS server certificate. The problem with CACert is that the CACert trust anchors aren’t shipped with sufficiently many operating systems and web browsers. The user experience is similar to reaching a self-signed server certificate. For organization-internal servers that you don’t want to trust external parties for, I continue to believe that running your own CA and distributing it to your users is better than using a public CA (compare my XMPP server certificate setup). But for public servers, availability without prior configuration is more important. Therefore I decided that my public HTTPS servers should use a CA/Browser Forum-approved CA with support for ACME, and as long as Let’s Encrypt is trustworthy and zero-cost, they are a good choice.

I was in need of a free software ACME client, and set out to research what’s out there. Unfortunately, I did not find any web pages that listed the available options and compared them. The Let’s Encrypt CA points to the “official” Let’s Encrypt client, written by Jakub Warmuz, James Kasten, Peter Eckersley and several others. The manual contains pointers to two other clients in a seemingly unrelated section. Those clients are letsencrypt-nosudo by Daniel Roesler et al, and simp_le by (again!) Jakub Warmuz. From the letsencrypt.org client-dev mailing list I also found letsencrypt.sh by Gerhard Heift and LetsEncryptShell by Jan Mojžíš. Is anyone aware of other ACME clients?

By comparing these clients, I learned what I did not like in them. I wanted something small, so that I can audit it. I want something that doesn’t require root access. Preferably, it should be able to run on my laptop, since I wasn’t ready to run something on the servers. Generally, it has to be secure, which implies something about how it approaches private key handling. The official letsencrypt client can do everything, and has plugins for various server software to automate the ACME negotiation. All the cryptographic operations appear to be hidden inside the client, which usually means it is not flexible. I really did not like how it was designed; it looks like your typical monolithic proof-of-concept design. The simp_le client looked much cleaner, and gave me a good feeling. The letsencrypt.sh client is simple and written in /bin/sh shell script, but it appeared a bit too simplistic. LetsEncryptShell looked decent, but I wanted something more automated.

What all of these clients lacked, and what the letsencrypt-nosudo client had, was the ability to let me do the private-key operations myself. All the operations are done interactively on the command line using OpenSSL. This would allow me to put the ACME user private key, and the HTTPS private key, on a YubiKey, using its PIV applet and techniques similar to what I used to create my SSH host CA. While the HTTPS private key has to be available on the HTTPS server (it is used to set up TLS connections), I wouldn’t want the ACME user private key to be available there. Similarly, I wouldn’t want to have the ACME or the HTTPS private key on my laptop. The letsencrypt-nosudo tool is otherwise rougher around the edges than the cleaner simp_le client. However, the private key handling aspect was the deciding matter for me.
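
To give a flavor of what doing the private-key operations yourself amounts to, the manual workflow builds on plain OpenSSL invocations along these lines (the file names and key sizes are my own examples, not taken from letsencrypt-nosudo):

openssl genrsa -out acme-user-key.pem 3744        # ACME account key, kept off the web server
openssl genrsa -out https-server-key.pem 3744     # HTTPS server key
openssl req -new -sha256 -key https-server-key.pem -subj "/CN=blog.josefsson.org" -out https-server.csr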

After fixing some hard-coded limitations on RSA key sizes, getting the cert was as simple as following the letsencrypt-nosudo instructions. I’ll follow up with a later post describing how to put the ACME user private key and the HTTPS server certificate private key on a YubiKey and how to use that with letsencrypt-nosudo.

So you can now enjoy browsing my blog over HTTPS! Thank you Let’s Encrypt!

Cosmos – A Simple Configuration Management System

Back in early 2012 I had been helping with system administration of a number of Debian/Ubuntu-based machines, and the odd Solaris machine, for a couple of years at $DAYJOB. We had a combination of hand-written scripts, documentation notes that we cut’n’pasted from during installation, and some locally maintained Debian packages for pulling in dependencies and providing some configuration files. As the number of people and machines involved grew, I realized that I wasn’t happy with how these machines were being administered. If one of these machines were to disappear in flames, it would take time (and more importantly, non-trivial manual labor) to get its services up and running again. I wanted a system that could automate the complete configuration of any Unix-like machine. It should require minimal human interaction. I wanted the configuration files to be version controlled. I wanted good security properties. I did not want to rely on a centralized server that would be a single point of failure. It had to be portable and easy to get working on new (and very old) platforms. It should be easy to modify a configuration file and get it deployed. I wanted it to be easy to start using on an existing server. I wanted it to allow for incremental adoption. Surely this must exist, I thought.

During January 2012 I evaluated the existing configuration management systems, like CFEngine, Chef, and Puppet. I don’t recall my reasons for rejecting each individual project, but needless to say I did not find what I was looking for. The reasons for rejecting the projects I looked at ranged from centralization concerns (single-point-of-failure central servers) and bad security (no OpenPGP signing integration) to the feeling that the projects were too complex and hence fragile. I’m sure there were other reasons too.

In February I started going back to my original needs and tried to see if I could abstract something from the knowledge that was in all these notes, script snippets and local dpkg packages. I realized that the essence of what I wanted was one shell script per machine, OpenPGP signed, in a Git repository. I could check out that Git repository on every new machine that I wanted to configure, verify the OpenPGP signature of the shell script, and invoke the script. The script would do everything needed to get the machine back into an operational state, including package installation and configuration file changes. Since I would usually want to modify configuration files on a system even after its initial installation (hey, not everyone is perfect), it was natural to extend this idea to a cron job that did ‘git pull’, verified the OpenPGP signature, and ran the script. The script would then have to be a bit more clever and not redo everything every time.

Since we had many machines, it was obvious that there would be huge code duplication between scripts. It felt natural to think of splitting up the shell script into a directory with many smaller shell scripts, and invoking each shell script in turn. Think of the /etc/init.d/ hierarchy and how it worked with System V init. This would allow re-use of useful snippets across several machines. The next realization was that large parts of the shell script would be devoted to creating configuration files, such as /etc/network/interfaces. It would be easier to modify the content of those files if they were stored as files in a separate directory, an “overlay” stored in a sub-directory overlay/, and copied into the file system’s hierarchy with rsync. The final realization was that it made some sense to run one set of scripts before rsync’ing in the configuration files (to be able to install packages or set things up for the configuration files to make sense), and one set of scripts after the rsync (to perform tasks that require some package to be installed and configured). These sets of scripts are called the “pre-tasks” and “post-tasks” respectively, and stored in sub-directories called pre-tasks.d/ and post-tasks.d/.
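
A minimal sketch of that update cycle might look like the following. This is my own illustration of the idea rather than Cosmos itself, and the repository path, layout and verification step are assumptions:

# run from cron: fetch, verify and apply this machine's model
cd /var/cache/cosmos/repo || exit 1
git pull -q
git verify-commit HEAD || exit 1    # or verify an OpenPGP-signed tag, depending on policy
host=$(hostname -f)
for t in "$host"/pre-tasks.d/*; do [ -x "$t" ] && "$t"; done
rsync -a "$host"/overlay/ /
for t in "$host"/post-tasks.d/*; do [ -x "$t" ] && "$t"; done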

I started putting what would become Cosmos together during February 2012. Incidentally, I had been using etckeeper on our machines, and reading its source code greatly inspired the internal design of Cosmos. The git history shows well how the ideas evolved — even that Cosmos was initially called Eve, but in retrospect I didn’t like the religious connotations — and there were a couple of rewrites along the way, but on the 28th of February I pushed out version 1.0. It was 778 lines of code in total, with at least 200 of those lines being the license boilerplate at the top of each file. Version 1.0 had a debian/ directory, and I built the dpkg file and started to deploy it on some machines. There were a couple of small fixes in the next few days, but development stopped on March 5th 2012. We started to use Cosmos, and converted more and more machines to it, and I quickly also converted all of my home servers to use it. And even my laptops. It took until September 2014 to discover the first bug (the fix is a one-liner). Since then there haven’t been any real changes to the source code. It is in daily use today.

The README that comes with Cosmos gives a more hands-on introduction to using it, which I hope will serve as a starting point if the above introduction sparked some interest. I hope to cover more about how to use Cosmos in a later blog post. Since Cosmos does so little on its own, to make sense of how to use it, you want to see a Git repository with machine models. If you want to see how the Git repository for my own machines looks, have a look at the sjd-cosmos repository. Don’t miss its README at the bottom. In particular, its global/ sub-directory contains some of the foundation, such as OpenPGP key trust handling.

SSH Host Certificates with YubiKey NEO

If you manage a bunch of server machines, you will undoubtedly have run into the following OpenSSH question:

The authenticity of host 'host.example.org (1.2.3.4)' can't be established.
RSA key fingerprint is 1b:9b:b8:5e:74:b1:31:19:35:48:48:ba:7d:d0:01:f5.
Are you sure you want to continue connecting (yes/no)?

If the server is a single-user machine, where you are the only person expected to log in to it, answering “yes” once and then using the ~/.ssh/known_hosts file to record the key fingerprint will (sort of) work and protect you against future man-in-the-middle attacks. I say sort of, since if you want to access the server from multiple machines, you will need to sync the known_hosts file somehow. And once your organization grows larger, and you aren’t the only person that needs to log in, having a policy that everyone just answers “yes” on first connection on all their machines is bad. The risk that someone is able to successfully MITM attack you grows every time someone types “yes” to these prompts.

Setting up one (or more) SSH Certificate Authority (CA) to create SSH Host Certificates, and having your users trust this CA, will allow you and your users to automatically trust the fingerprint of the host through the indirection of the SSH Host CA. I was surprised (but probably shouldn’t have been) to find that deploying this is straightforward. Even setting this up with hardware-backed keys, stored on a YubiKey NEO, is easy. Below I will explain how to set this up for a hypothetical organization where two persons (sysadmins) are responsible for installing and configuring machines.

I’m going to assume that you already have a couple of hosts up and running, that they run the OpenSSH daemon and so have a /etc/ssh/ssh_host_rsa_key* public/private keypair, and that you have one YubiKey NEO with the PIV applet and that the NEO is in CCID mode. I don’t believe it matters, but I’m running a combination of Debian and Ubuntu machines. The Yubico PIV tool is used to configure the YubiKey NEO, and I will be using OpenSC’s PKCS#11 library to connect OpenSSH with the YubiKey NEO. Let’s install some tools:

apt-get install yubikey-personalization yubico-piv-tool opensc-pkcs11 pcscd

Every person responsible for signing SSH Host Certificates in your organization needs a YubiKey NEO. For my example, there will only be two persons, but the number could be larger. Each one of them will have to go through the following process.

The first step is to prepare the NEO. First mode switch it to CCID using some device configuration tool, like yubikey-personalization.

ykpersonalize -m1

Then prepare the PIV applet in the YubiKey NEO. This is covered by the YubiKey NEO PIV Introduction, but I’ll reproduce the commands below. Do this on a disconnected machine, saving all generated files on one or more secure media that you store in a safe.

user=simon
key=`dd if=/dev/random bs=1 count=24 2>/dev/null | hexdump -v -e '/1 "%02X"'`
echo $key > ssh-$user-key.txt
pin=`dd if=/dev/random bs=1 count=6 2>/dev/null | hexdump -v -e '/1 "%u"'|cut -c1-6`
echo $pin > ssh-$user-pin.txt
puk=`dd if=/dev/random bs=1 count=8 2>/dev/null | hexdump -v -e '/1 "%u"'|cut -c1-8`
echo $puk > ssh-$user-puk.txt

yubico-piv-tool -a set-mgm-key -n $key
yubico-piv-tool -k $key -a change-pin -P 123456 -N $pin
yubico-piv-tool -k $key -a change-puk -P 12345678 -N $puk

Then generate an RSA private key for the SSH Host CA, and generate a dummy X.509 certificate for that key. The only use for the X.509 certificate is to make PIV/PKCS#11 happy — they want to be able to extract the public key from the smartcard, and do that through the X.509 certificate.

openssl genrsa -out ssh-$user-ca-key.pem 2048
openssl req -new -x509 -batch -key ssh-$user-ca-key.pem -out ssh-$user-ca-crt.pem

You import the key and certificate to the PIV applet as follows:

yubico-piv-tool -k $key -a import-key -s 9c < ssh-$user-ca-key.pem
yubico-piv-tool -k $key -a import-certificate -s 9c < ssh-$user-ca-crt.pem

You now have an SSH Host CA ready to go! The first thing you want to do is to extract the public key of the CA, and you use OpenSSH's ssh-keygen for this, specifying OpenSC's PKCS#11 module.

ssh-keygen -D /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -e > ssh-$user-ca-key.pub

If you happen to use YubiKey NEO with OpenPGP using gpg-agent/scdaemon, you may get the following error message:

no slots
cannot read public key from pkcs11

The reason is that scdaemon exclusively locks the smartcard, so no other application can access it. You need to kill scdaemon, which can be done as follows:

gpg-connect-agent "SCD KILLSCD" "SCD BYE" /bye

The output from ssh-keygen may look like this:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCp+gbwBHova/OnWMj99A6HbeMAGE7eP3S9lKm4/fk86Qd9bzzNNz2TKHM7V1IMEj0GxeiagDC9FMVIcbg5OaSDkuT0wGzLAJWgY2Fn3AksgA6cjA3fYQCKw0Kq4/ySFX+Zb+A8zhJgCkMWT0ZB0ZEWi4zFbG4D/q6IvCAZBtdRKkj8nJtT5l3D3TGPXCWa2A2pptGVDgs+0FYbHX0ynD0KfB4PmtR4fVQyGJjJ0MbF7fXFzQVcWiBtui8WR/Np9tvYLUJHkAXY/FjLOZf9ye0jLgP1yE10+ihe7BCxkM79GU9BsyRgRt3oArawUuU6tLgkaMN8kZPKAdq0wxNauFtH

Now all the users in your organization need to add a line to their ~/.ssh/known_hosts as follows:

@cert-authority *.example.com ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCp+gbwBHova/OnWMj99A6HbeMAGE7eP3S9lKm4/fk86Qd9bzzNNz2TKHM7V1IMEj0GxeiagDC9FMVIcbg5OaSDkuT0wGzLAJWgY2Fn3AksgA6cjA3fYQCKw0Kq4/ySFX+Zb+A8zhJgCkMWT0ZB0ZEWi4zFbG4D/q6IvCAZBtdRKkj8nJtT5l3D3TGPXCWa2A2pptGVDgs+0FYbHX0ynD0KfB4PmtR4fVQyGJjJ0MbF7fXFzQVcWiBtui8WR/Np9tvYLUJHkAXY/FjLOZf9ye0jLgP1yE10+ihe7BCxkM79GU9BsyRgRt3oArawUuU6tLgkaMN8kZPKAdq0wxNauFtH

Each sysadmin needs to go through this process, and each user needs to add one line for each sysadmin. While you could put the same key/certificate on multiple YubiKey NEOs, to allow users to only have to put one line into their file, dealing with revocation becomes a bit more complicated if you do that. If you have multiple CA keys in use at the same time, you can roll over to new CA keys without disturbing production. Users may also have different policies for different machines, so that not all sysadmins have the power to create host keys for all machines in your organization.
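
For example, the known_hosts patterns can scope each CA to a subset of hosts. A hypothetical illustration, with placeholder domains and truncated keys:

# trust sysadmin 1's CA for all hosts, sysadmin 2's CA only for lab machines
@cert-authority *.example.com ssh-rsa AAAA...first-CA-public-key...
@cert-authority *.lab.example.com ssh-rsa AAAA...second-CA-public-key...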

The CA setup is now complete; however, it isn't doing anything on its own. We need to sign some host keys using the CA, and configure the hosts' sshd to use them. What you could do is something like this, for every host host.example.com that you want to create a host certificate for:

h=host.example.com
scp root@$h:/etc/ssh/ssh_host_rsa_key.pub .
gpg-connect-agent "SCD KILLSCD" "SCD BYE" /bye
ssh-keygen -D /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -s ssh-$user-ca-key.pub -I $h -h -n $h -V +52w ssh_host_rsa_key.pub
scp ssh_host_rsa_key-cert.pub root@$h:/etc/ssh/

The ssh-keygen command will use OpenSC's PKCS#11 library to talk to the PIV applet on the NEO, and it will prompt you for the PIN. Enter the PIN that you set above. The output of the command would be something like this:

Enter PIN for 'PIV_II (PIV Card Holder pin)': 
Signed host key ssh_host_rsa_key-cert.pub: id "host.example.com" serial 0 for host.example.com valid from 2015-06-16T13:39:00 to 2016-06-14T13:40:58

The host now has a SSH Host Certificate installed. To use it, you must make sure that /etc/ssh/sshd_config has the following line:

HostCertificate /etc/ssh/ssh_host_rsa_key-cert.pub

You need to restart sshd to apply the configuration change. If you now try to connect to the host, you will likely still use the known_hosts fingerprint approach. So remove the fingerprint from your machine:

ssh-keygen -R $h

Now if you attempt to ssh to the host, using the -v parameter to ssh, you will see something like the following:

debug1: Server host key: RSA-CERT 1b:9b:b8:5e:74:b1:31:19:35:48:48:ba:7d:d0:01:f5
debug1: Host 'host.example.com' is known and matches the RSA-CERT host certificate.

Success!

One aspect that may warrant further discussion is the host keys. Here I only created a host certificate for the hosts' RSA key. You could create host certificates for the DSA, ECDSA and Ed25519 keys as well. The reason I did not do that is that in this organization, we all used GnuPG's gpg-agent/scdaemon with the YubiKey NEO's OpenPGP Card Applet with RSA keys for user authentication. So only the host RSA key is relevant.

Revocation of a YubiKey NEO key is implemented by asking users to drop the corresponding line for one of the sysadmins, and by regenerating the host certificates for the hosts that this sysadmin had created host certificates for. This is one reason users should trust at least two CAs in your organization for signing host certificates, so that they can migrate away from one of them to the other without interrupting operations.

Scrypt in IETF

Colin Percival and I have worked on an internet-draft on scrypt for some time. I realize now that the -00 draft was published over two years ago, turning this effort today somewhat into archeology rather than rocket science. Still, having a published RFC that is easy to refer to from other Internet protocols will hopefully help to establish the point that PBKDF2 alone no longer provides state-of-the-art protection for password hashing.

I have written about password hashing before, where I give a quick introduction to the basic concepts in the context of the well-known PBKDF2 algorithm. The novelty in scrypt is that it is designed to combat brute-force and hardware-accelerated attacks on hashed password databases. Briefly, scrypt expands the password and salt (using PBKDF2 as a component) and then uses that to create a large array (typically tens or hundreds of megabytes) using the Salsa20 core hash function, and then dereferences that large array in a random, sequential pattern. There are three parameters to the scrypt function: a CPU/memory cost parameter N (varies; typical values are 16384 or 1048576), a blocksize parameter r (typically 8), and a parallelization parameter p (typically a low number like 1 or 16). The process is described in the draft, and there are further discussions in Colin’s original scrypt paper.
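
To get a feel for the parameters, recent OpenSSL (3.0 or later, assuming the kdf subcommand is available on your system) can compute scrypt from the command line; this invocation corresponds to one of the test vectors in the draft:

openssl kdf -keylen 64 -kdfopt pass:password -kdfopt salt:NaCl \
  -kdfopt n:1024 -kdfopt r:8 -kdfopt p:16 SCRYPT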

The document has been stable for some time, and we are now asking for it to be published. Thus now is a good time to provide us with feedback on the document. The live document on gitlab is available if you want to send us a patch.

Certificates for XMPP/Jabber

I am revamping my XMPP server and I’ve written down notes on how to set up certificates to enable TLS.

I will run Debian Jessie with JabberD 2.x, using the recent jabberd2 jessie-backport. The choice of server software is not significant for the rest of this post.

Running XMPP over TLS is a good idea. So I need an X.509 PKI for this purpose. I don’t want to use a third-party Certificate Authority, since that gives them the ability to man-in-the-middle my XMPP connection. Therefore I want to create my own CA. I prefer tightly scoped (per-purpose or per-application) CAs, so I will set up a CA purely to issue certificates for my XMPP server.

The current XMPP specification, RFC 6120, includes a long section 13.7 that discusses requirements on certificates.

One complication is the requirement to include an AIA for OCSP/CRLs — fortunately, it is not a strict “MUST” requirement but a weaker “SHOULD”. I note that checking revocation using OCSP and CRL is a “MUST” requirement for certificate validation — some specification-language impedance mismatch at work there.

The specification demands that the CA certificate MUST have a keyUsage extension with the digitalSignature bit set. This feels odd to me, and I’m wondering if keyCertSign was intended instead. Nothing in the XMPP document, nor in any PKIX document as far as I am aware, will verify that the digitalSignature bit is asserted in a CA certificate. Below I will assert both bits, since a CA needs the keyCertSign bit, and the digitalSignature bit seems unnecessary but mostly harmless.

My XMPP/Jabber server will be “chat.sjd.se” and my JID will be “simon@josefsson.org”. This means the server certificate needs to include references to both these domains. The relevant DNS records for the “josefsson.org” zone are as follows; see section 3.2.1 of RFC 6120 for more background.

_xmpp-client._tcp.josefsson.org.	IN	SRV 5 0 5222 chat.sjd.se.
_xmpp-server._tcp.josefsson.org.	IN	SRV 5 0 5269 chat.sjd.se.

The DNS records for the “sjd.se” zone are as follows:

chat.sjd.se.	IN	A	...
chat.sjd.se.	IN	AAAA	...

The following commands will generate the private key and certificate for the CA. In a production environment, you would keep the CA private key in a protected offline environment. I’m asserting an expiration date ~30 years in the future. While I dislike arbitrary limits, I believe this will be many times longer than the anticipated lifetime of this setup.

openssl genrsa -out josefsson-org-xmpp-ca-key.pem 3744
cat > josefsson-org-xmpp-ca-crt.conf << EOF
[ req ]
x509_extensions = v3_ca
distinguished_name = req_distinguished_name
prompt = no
[ req_distinguished_name ]
CN=XMPP CA for josefsson.org
[ v3_ca ]
subjectKeyIdentifier=hash
basicConstraints = CA:true
keyUsage=critical, digitalSignature, keyCertSign
EOF
openssl req -x509 -set_serial 1 -new -days 11147 -sha256 -config josefsson-org-xmpp-ca-crt.conf -key josefsson-org-xmpp-ca-key.pem -out josefsson-org-xmpp-ca-crt.pem

Let’s generate the private key and server certificate for the XMPP server. The wiki page on XMPP certificates is outdated wrt PKIX extensions. I will embed a SRV-ID field, as discussed in RFC 6120 section 13.7.1.2.1 and RFC 4985. I chose to skip the XmppAddr identifier type, even though the specification is somewhat unclear about it: section 13.7.1.2.1 says that it “is no longer encouraged in certificates issued by certification authorities” while section 13.7.1.4 says “Use of the ‘id-on-xmppAddr’ format is RECOMMENDED in the generation of certificates”. The latter quote should probably have been qualified to say “client certificates” rather than “certificates”, since the latter can refer to both client and server certificates.

Note the use of a default expiration time of one month: I believe in frequent renewal of entity certificates, rather than use of revocation mechanisms.

openssl genrsa -out josefsson-org-xmpp-server-key.pem 3744
cat > josefsson-org-xmpp-server-csr.conf << EOF
[ req ]
distinguished_name = req_distinguished_name
prompt = no
[ req_distinguished_name ]
CN=XMPP server for josefsson.org
EOF
openssl req -sha256 -new -config josefsson-org-xmpp-server-csr.conf -key josefsson-org-xmpp-server-key.pem -nodes -out josefsson-org-xmpp-server-csr.pem
cat > josefsson-org-xmpp-server-crt.conf << EOF
subjectAltName=@san
[san]
DNS=chat.sjd.se
otherName.0=1.3.6.1.5.5.7.8.7;UTF8:_xmpp-server.josefsson.org
otherName.1=1.3.6.1.5.5.7.8.7;UTF8:_xmpp-client.josefsson.org
EOF
openssl x509 -sha256 -CA josefsson-org-xmpp-ca-crt.pem -CAkey josefsson-org-xmpp-ca-key.pem -set_serial 2 -req -in josefsson-org-xmpp-server-csr.pem -out josefsson-org-xmpp-server-crt.pem -extfile josefsson-org-xmpp-server-crt.conf

With this setup, my XMPP server can be tested by the XMPP IM Observatory. You can see the c2s test results and the s2s test results. Of course, there are warnings regarding the trust anchor issue. It complains about a self-signed certificate in the chain. This is permitted but not recommended — however when the trust anchor is not widely known, I find it useful to include it. This allows people to have a mechanism of fetching the trust anchor certificate should they want to. Some weaker cipher suites trigger warnings, which is more of a jabberd2 configuration issue and/or a concern with jabberd2 defaults.
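
For a quick local sanity check of the STARTTLS setup and the certificate chain, something like the following should work (the -xmpphost option requires a reasonably recent OpenSSL):

openssl s_client -connect chat.sjd.se:5222 -starttls xmpp -xmpphost josefsson.org \
  -CAfile josefsson-org-xmpp-ca-crt.pem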

My jabberd2 configuration is simple — in c2s.xml I add an <id> entity with the “require-starttls”, “cachain”, and “pemfile” fields. In s2s.xml, I have the <pemfile>, <resolve-ipv6>, and <require-tls> entities.

Some final words are in order. While this setup will result in use of TLS for XMPP connections (c2s and s2s), other servers are unlikely to find my CA trust anchor, let alone be able to trust it for verifying my server certificate. I’m happy to read about Peter Saint-Andre’s recent SSL/TLS work, and in particular I will follow the POSH effort.

EdDSA and Ed25519 goes to IETF

After meeting Niels Möller at FOSDEM and learning about his Ed25519 implementation in GNU Nettle, I started working on a simple-to-implement description of Ed25519. The goal is to help implementers of various IETF (and non-IETF) protocols add support for Ed25519. As many are aware, OpenSSH and GnuPG have support for Ed25519 in recent versions, and OpenBSD releases since v5.5 (May 2014) are signed with Ed25519. The paper describing EdDSA and Ed25519 is not aimed at implementers, and does not include test vectors. I felt there was room for improvement to get wider and more accepted adoption.
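
For example, with a recent OpenSSH, generating an Ed25519 key is a one-liner (the file name and comment are arbitrary):

ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -C "ed25519 key"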

Our work is published in the IETF as draft-josefsson-eddsa-ed25519 and we are soliciting feedback from implementers and others. Please help us iron out the mistakes in the document, and point out what is missing. For example, what could be done to help implementers avoid side-channel leakage? I don’t think the draft is the place for optimized and side-channel free implementations, and it is also not the place for a comprehensive tutorial on side-channel free programming. But maybe there is a middle ground where we can say something more than what we can do today. Ideas welcome!

OpenPGP Smartcards and GNOME

The combination of GnuPG and an OpenPGP smartcard (such as the YubiKey NEO) has been implemented and working well for around a decade. I recall starting to use it when I received an FSFE Fellowship card a long time ago. Sadly there have been some regressions when using them under GNOME recently. I reinstalled my laptop with Debian Jessie (beta2) recently, and now took the time to work through the issue and write down a workaround.

To work with GnuPG and smartcards you install the GnuPG agent, scdaemon, pcscd and pcsc-tools. On Debian you can do it like this:

apt-get install gnupg-agent scdaemon pcscd pcsc-tools

Use the pcsc_scan command-line tool to make sure pcscd recognizes the smartcard before continuing; if it doesn’t recognize the smartcard, nothing beyond this point will work. The next step is to make sure you have the following line in ~/.gnupg/gpg.conf:

use-agent

Logging out of and back into GNOME should start gpg-agent for you, through the /etc/X11/Xsession.d/90gpg-agent script. In theory, this should be all that is required. However, when you start a terminal and attempt to use the smartcard through GnuPG you will get an error like this:

jas@latte:~$ gpg --card-status
gpg: selecting openpgp failed: unknown command
gpg: OpenPGP card not available: general error
jas@latte:~$

The reason is that GNOME Keyring hijacks the GnuPG agent’s environment variables and effectively replaces gpg-agent with gnome-keyring-daemon, which does not support smartcard commands (Debian bug #773304). GnuPG uses the environment variable GPG_AGENT_INFO to find the location of the agent socket, and when GNOME Keyring is active it will typically look like this:

jas@latte:~$ echo $GPG_AGENT_INFO 
/run/user/1000/keyring/gpg:0:1
jas@latte:~$ 

If you use GnuPG with a smartcard, I recommend disabling GNOME Keyring’s GnuPG and SSH agent emulation code. This used to be easy to achieve in older GNOME releases (e.g., the one included in Debian Wheezy), through the gnome-session-properties GUI. Sadly there is no longer any GUI for disabling this functionality (Debian bug #760102). The GNOME Keyring GnuPG/SSH agent replacement functionality is invoked through the XDG autostart mechanism, and the documented way to disable such system-wide services for a normal user account is to invoke the following commands.

jas@latte:~$ mkdir ~/.config/autostart
jas@latte:~$ cp /etc/xdg/autostart/gnome-keyring-gpg.desktop ~/.config/autostart/
jas@latte:~$ echo 'Hidden=true' >> ~/.config/autostart/gnome-keyring-gpg.desktop 
jas@latte:~$ cp /etc/xdg/autostart/gnome-keyring-ssh.desktop ~/.config/autostart/
jas@latte:~$ echo 'Hidden=true' >> ~/.config/autostart/gnome-keyring-ssh.desktop 
jas@latte:~$ 

You now need to log out and log in again. When you start a terminal, look at the GPG_AGENT_INFO environment variable again; everything should now be working.

jas@latte:~$ echo $GPG_AGENT_INFO 
/tmp/gpg-dqR4L7/S.gpg-agent:1890:1
jas@latte:~$ echo $SSH_AUTH_SOCK 
/tmp/gpg-54VfLs/S.gpg-agent.ssh
jas@latte:~$ gpg --card-status
Application ID ...: D2760001240102000060000000420000
...
jas@latte:~$ ssh-add -L
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDFP+UOTZJ+OXydpmbKmdGOVoJJz8se7lMs139T+TNLryk3EEWF+GqbB4VgzxzrGjwAMSjeQkAMb7Sbn+VpbJf1JDPFBHoYJQmg6CX4kFRaGZT6DHbYjgia59WkdkEYTtB7KPkbFWleo/RZT2u3f8eTedrP7dhSX0azN0lDuu/wBrwedzSV+AiPr10rQaCTp1V8sKbhz5ryOXHQW0Gcps6JraRzMW+ooKFX3lPq0pZa7qL9F6sE4sDFvtOdbRJoZS1b88aZrENGx8KSrcMzARq9UBn1plsEG4/3BRv/BgHHaF+d97by52R0VVyIXpLlkdp1Uk4D9cQptgaH4UAyI1vr cardno:006000000042
jas@latte:~$ 

That’s it. Resolving this properly involves 1) adding smartcard code to the GNOME Keyring, 2) disabling the GnuPG/SSH replacement code in GNOME Keyring completely, 3) reordering the startup so that gpg-agent supersedes gnome-keyring-daemon instead of vice versa, so that people who installed gpg-agent really get it instead of the GNOME default, or 4) something else. I don’t have a strong opinion on how to solve this, but 3) sounds like a simple way forward.

Dice Random Numbers

Generating data with entropy, or random number generation (RNG), is a well-known difficult problem. Many crypto algorithms and protocols assume that random data is available. There are many implementations out there, including /dev/random in the BSD and Linux kernels and API calls in crypto libraries such as GnuTLS or OpenSSL. How they work can be understood by reading the source code. The quality of the data depends on the actual hardware and what entropy sources were available — the RNG implementation itself is deterministic; it merely converts data with supposed entropy from a set of data sources and then generates an output stream.

In some situations, like in virtualized environments or on small embedded systems, it is hard to find entropy sources of sufficient quantity. Rarely are there any lower-bound estimates on how much entropy there is in the data you get. You can improve the situation by using a separate hardware RNG, but there is deployment complexity in that, and from a theoretical point of view, the problem of trusting that you get good random data has merely moved from one system to another. (There is more to say about hardware RNGs; I’ll save that for another day.)

For some purposes, the available solutions do not inspire enough confidence in me, because of their high complexity. Complexity is often the enemy of security. In crypto discussions I have said, only half-jokingly, that about the only RNG process I would trust is one that I can explain in simple words and implement myself with the help of pen and paper. Normally I use the example of rolling a normal six-sided die (a D6) several times. I have been thinking about this process in more detail lately, and felt it was time to write it down, regardless of how silly it may seem.

A die with six sides produces a random number between 1 and 6. It is relatively straightforward to intuitively convince yourself that it is not clearly biased: inspect that it looks symmetric and do some trial rolls. By repeatedly rolling the die, you can generate as much data as you need, time permitting.

I do not understand enough thermodynamics to know how to estimate the amount of entropy of a physical process, so I need to resort to intuitive arguments. It would be easy to just assume that a die produces 2.5 bits of entropy, because log_2(6)~=2.585. At least I find it easy to convince myself intuitively that 2.5 bits is an upper bound; there appears to me to be no way to get more entropy than that out of looking at a die roll outcome. I suspect that most dice have some form of defect, though, which leads to a very small bias that could be found with a large number of rolls. Thus I would propose that the amount of entropy of most D6’s is slightly below 2.5 bits on average. Further, to establish a lower bound, it seems intuitively easy to believe that if the entropy of a particular D6 were closer to 2 bits than to 2.5 bits, this would be noticeable fairly quickly by trial rolls. That assumes the die does not have complex logic and machinery in it that would hide the patterns. With the tinfoil hat on, consider a die with a power source and mechanics in it that allow it to decide which number it lands on: it could generate a seemingly random pattern that still contained 0 bits of entropy. For example, suppose a D6 is built to produce the pattern 4, 1, 4, 2, 1, 3, 5, 6, 2, 3, 1, 3, 6, 3, 5, 6, 4, … this would mean it produces 0 bits of entropy (compare the numbers with the decimals of sqrt(2)). Other factors may also influence the amount of entropy in the output; consider, for instance, rolling the die by just dropping it straight down from 1 cm/1 inch above the table. There could also be other reasons why the amount of entropy in a die roll is more limited: intuitive arguments are sometimes completely wrong! With this discussion as background, and for simplicity, going forward I will assume that my D6 produces 2.5 bits of entropy on every roll.

We need to figure out how many times we need to roll it. I usually find myself needing a 128-bit random number (16 bytes). Crypto algorithms and protocols typically use power-of-2 data sizes. 64 bits of entropy results in brute-force attacks requiring about 2^64 tests, and for many operations, this is feasible with today’s computing power. Performing 2^128 operations does not seem possible with today’s technology. To produce 128 bits of entropy using a D6 that produces 2.5 bits of entropy per roll, you need to perform ceil(128/2.5)=52 rolls.

We also need to design an algorithm to convert the D6 output into the resulting 128-bit random number. While it would be nice from a theoretical point of view to let each and every bit of the D6 output influence each and every bit of the 128-bit random number, this becomes difficult to do with pen and paper. Update:This blog post used to include an algorithm here, however it was clearly wrong (written too late in the evening…) so I’ve removed it — I need to come back and think more about this.

So what’s the next step? It depends on what you want to use the random data for. For some purposes, such as generating a high-quality 128-bit AES key, I would be done: the key is right there. To generate a high-quality ECC private key, you need to generate somewhat more randomness (matching the ECC curve size) and do a couple of EC operations. To generate a high-quality RSA private key, unfortunately you will need much more randomness, to the point where it makes more sense to implement a PRNG seeded with a strong 128-bit seed generated using this process. The latter approach is the general solution: generate 128 bits of data using the dice approach, and then seed a CSPRNG of your choice to produce large amounts of data quickly. These steps are somewhat technical, and you lose the pen-and-paper properties, but the code to implement them is easier to verify than verifying that you get good-quality entropy out of your RNG implementation.
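
As an illustration of that last step, here is one way it could look on the command line. This is my own rough sketch, not a vetted construction: the recorded die outcomes are hashed into a seed, which is then expanded with a simple hash-counter scheme (the roll string is a made-up placeholder):

rolls="41526314253642..."   # placeholder: write down your 52 die outcomes here
seed=$(printf '%s' "$rolls" | openssl dgst -sha256 | awk '{print $NF}')
# expand the seed into as many 256-bit blocks as needed
for i in 1 2 3 4; do
  printf '%s-%d' "$seed" "$i" | openssl dgst -sha256 | awk '{print $NF}'
done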

The Case for Short OpenPGP Key Validity Periods

After I moved to a new OpenPGP key (see my key transition statement) I have received comments about the short lifetime of my new key. When I created the key (see my GnuPG setup) I set it to expire after 100 days. Some people assumed that I would have to create a new key then, and therefore wondered what value there is in signing a key that will expire in two months. It doesn’t work like that, and below I will explain how OpenPGP key expiration works and how to extend the expiration time of your key, and argue why having a relatively short validity period can be a good thing.
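
As a preview of the mechanics, extending the expiration time amounts to updating the key’s self-signature and republishing the key. With GnuPG it looks roughly like this: run gpg --edit-key on your key, use the expire command to pick a new expiration time (repeat for subkeys if needed), save, and redistribute the updated key (the key ID below is a placeholder):

gpg --edit-key 0xDEADBEEF
gpg> expire
gpg> save
gpg --send-keys 0xDEADBEEF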