Classic McEliece goes to IETF and OpenSSH

My earlier work on Streamlined NTRU Prime has been progressing along. The IETF document on sntrup761 in SSH has passed several process points. GnuPG’s libgcrypt has added support for sntrup761. The libssh support for sntrup761 is working, but the merge request is stuck mostly due to lack of time to debug why the regression test suite sporadically errors out in non-sntrup761 related parts with the patch.

The foundation for lattice-based post-quantum algorithms has some uncertainty around it, and I have felt that there is more to the post-quantum story than adding sntrup761 to implementations. Classic McEliece has been mentioned to me a couple of times, and I took some time to learn it and did a cut’n’paste job of the proposed ISO standard and published draft-josefsson-mceliece in the IETF to make the algorithm easily available to the IETF community. A high-quality implementation of Classic McEliece has been published as libmceliece and I’ve been supporting the work of Jan Mojžíš to package libmceliece for Debian, alas it has been stuck in the ftp-master NEW queue for manual review for over two months. The pre-dependencies librandombytes and libcpucycles are available in Debian already.

All that text writing and packaging work set the scene to write some code. When I added support for sntrup761 in libssh, I became familiar with the OpenSSH code base, so it was natural to return to OpenSSH to experiment with a new SSH KEX for Classic McEliece. DJB suggested to pick mceliece6688128 and combine it with the existing X25519+sntrup761 or with plain X25519. While a three-algorithm hybrid between X25519, sntrup761 and mceliece6688128 would be a simple drop-in for those that don’t want to lose the benefits offered by sntrup761, I decided to start the journey on a pure combination of X25519 with mceliece6688128. The key combiner in sntrup761x25519 is a simple SHA512 call and the only good I can say about that is that it is simple to describe and implement, and doesn’t raise too many questions since it is already deployed.

After procrastinating coding for months, once I sat down to work it only took a couple of hours until I had a successful Classic McEliece SSH connection. I suppose my brain had sorted everything in background before I started. To reproduce it, please try the following in a Debian testing environment (I use podman to get a clean environment).

# podman run -it --rm debian:testing-slim
apt update
apt dist-upgrade -y
apt install -y wget python3 librandombytes-dev libcpucycles-dev gcc make git autoconf libz-dev libssl-dev

cd ~
wget -q -O- https://lib.mceliece.org/libmceliece-20230612.tar.gz | tar xfz -
cd libmceliece-20230612/
./configure
make install
ldconfig
cd ..

git clone https://gitlab.com/jas/openssh-portable
cd openssh-portable
git checkout jas/mceliece
autoreconf
./configure # verify 'libmceliece support: yes'
make # CC="cc -DDEBUG_KEX=1 -DDEBUG_KEXDH=1 -DDEBUG_KEXECDH=1"

You should now have a working SSH client and server that supports Classic McEliece! Verify support by running ./ssh -Q kex and it should mention mceliece6688128x25519-sha512@openssh.com.

To have it print plenty of debug outputs, you may remove the # character on the final line, but don’t use such a build in production.

You can test it as follows:

./ssh-keygen -A # writes to /usr/local/etc/ssh_host_...

# setup public-key based login by running the following:
./ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ""
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

adduser --system sshd
mkdir /var/empty

while true; do $PWD/sshd -p 2222 -f /dev/null; done &

./ssh -v -p 2222 localhost -oKexAlgorithms=mceliece6688128x25519-sha512@openssh.com date

On the client you should see output like this:

OpenSSH_9.5p1, OpenSSL 3.0.11 19 Sep 2023
...
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: mceliece6688128x25519-sha512@openssh.com
debug1: kex: host key algorithm: ssh-ed25519
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: SSH2_MSG_KEX_ECDH_REPLY received
debug1: Server host key: ssh-ed25519 SHA256:YognhWY7+399J+/V8eAQWmM3UFDLT0dkmoj3pIJ0zXs
...

debug1: Host '[localhost]:2222' is known and matches the ED25519 host key.
debug1: Found key in /root/.ssh/known_hosts:1
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey in after 134217728 blocks
...
debug1: Sending command: date
debug1: pledge: fork
debug1: permanently_set_uid: 0/0
Environment:
  USER=root
  LOGNAME=root
  HOME=/root
  PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
  MAIL=/var/mail/root
  SHELL=/bin/bash
  SSH_CLIENT=::1 46894 2222
  SSH_CONNECTION=::1 46894 ::1 2222
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
Sat Dec  9 22:22:40 UTC 2023
debug1: channel 0: free: client-session, nchannels 1
Transferred: sent 1048044, received 3500 bytes, in 0.0 seconds
Bytes per second: sent 23388935.4, received 78108.6
debug1: Exit status 0

Notice the kex: algorithm: mceliece6688128x25519-sha512@openssh.com output.

How about network bandwidth usage? Below is a comparison of a complete SSH client connection such as the one above that log in and print date and logs out. Plain X25519 is around 7kb, X25519 with sntrup761 is around 9kb, and mceliece6688128 with X25519 is around 1MB. Yes, Classic McEliece has large keys, but for many environments, 1MB of data for the session establishment will barely be noticeable.

./ssh -v -p 2222 localhost -oKexAlgorithms=curve25519-sha256 date 2>&1 | grep ^Transferred
Transferred: sent 3028, received 3612 bytes, in 0.0 seconds

./ssh -v -p 2222 localhost -oKexAlgorithms=sntrup761x25519-sha512@openssh.com date 2>&1 | grep ^Transferred
Transferred: sent 4212, received 4596 bytes, in 0.0 seconds

./ssh -v -p 2222 localhost -oKexAlgorithms=mceliece6688128x25519-sha512@openssh.com date 2>&1 | grep ^Transferred
Transferred: sent 1048044, received 3764 bytes, in 0.0 seconds

So how about session establishment time?

date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=curve25519-sha256 date > /dev/null 2>&1; i=`expr $i + 1`; done; date
Sat Dec  9 22:39:19 UTC 2023
Sat Dec  9 22:39:25 UTC 2023
# 6 seconds

date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=sntrup761x25519-sha512@openssh.com date > /dev/null 2>&1; i=`expr $i + 1`; done; date
Sat Dec  9 22:39:29 UTC 2023
Sat Dec  9 22:39:38 UTC 2023
# 9 seconds

date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=mceliece6688128x25519-sha512@openssh.com date > /dev/null 2>&1; i=`expr $i + 1`; done; date
Sat Dec  9 22:39:55 UTC 2023
Sat Dec  9 22:40:07 UTC 2023
# 12 seconds

I never noticed adding sntrup761, so I’m pretty sure I wouldn’t notice this increase either. This is all running on my laptop that runs Trisquel so take it with a grain of salt but at least the magnitude is clear.

Future work items include:

  • Use a better hybrid mode combiner than SHA512? See for example KEM Combiners.
  • Write IETF document describing the Classic McEliece SSH protocol
  • Submit my patch to the OpenSSH community for discussion and feedback, please review meanwhile!
  • Implement a mceliece6688128sntrup761x25519 multi-hybrid mode?
  • Write a shell script a’la sntrup761.sh to import a stripped-down mceliece6688128 implementation to avoid having OpenSSH depend on libmceliece?
  • My kexmceliece6688128x25519.c is really just kexsntrup761x25519.c with the three calls to sntrup761 replaced with mceliece6688128. This should be parametrized to avoid cut’n’paste of code, if the OpenSSH community is interested.
  • Consider if the behaviour of librandombytes related to closing of file descriptors is relevant to OpenSSH.

Happy post-quantum SSH’ing!

Update: Changing the mceliece6688128_keypair call to mceliece6688128f_keypair (i.e., using the fully compatible f-variant) results in McEliece being just as fast as sntrup761 on my machine.

Update 2023-12-26: An initial IETF document draft-josefsson-ssh-mceliece-00 published.

Streamlined NTRU Prime sntrup761 goes to IETF

The OpenSSH project added support for a hybrid Streamlined NTRU Prime post-quantum key encapsulation method sntrup761 to strengthen their X25519-based default in their version 8.5 released on 2021-03-03. While there has been a lot of talk about post-quantum crypto generally, my impression has been that there has been a slowdown in implementing and deploying them in the past two years. Why is that? Regardless of the answer, we can try to collaboratively change things, and one effort that appears strangely missing are IETF documents for these algorithms.

Building on some earlier work that added X25519/X448 to SSH, writing a similar document was relatively straight-forward once I had spent a day reading OpenSSH and TinySSH source code to understand how it worked. While I am not perfectly happy with how the final key is derived from the sntrup761/X25519 secrets – it is a SHA512 call on the concatenated secrets – I think the construct deserves to be better documented, to pave the road for increased confidence or better designs. Also, reusing the RFC5656§4 structs makes for a worse specification (one unnecessary normative reference), but probably a simpler implementation. I have published draft-josefsson-ntruprime-ssh-00 here. Credit here goes to Jan Mojžíš of TinySSH that designed the earlier sntrup4591761x25519-sha512@tinyssh.org in 2018, Markus Friedl who added it to OpenSSH in 2019, and Damien Miller that changed it to sntrup761 in 2020. Does anyone have more to add to the history of this work?

Once I had sharpened my xml2rfc skills, preparing a document describing the hybrid construct between the sntrup761 key-encapsulation mechanism and the X25519 key agreement method in a non-SSH fashion was easy. I do not know if this work is useful, but it may serve as a reference for further study. I published draft-josefsson-ntruprime-hybrid-00 here.

Finally, how about a IETF document on the base Streamlined NTRU Prime? Explaining all the details, and especially the math behind it would be a significant effort. I started doing that, but realized it is a subjective call when to stop explaining things. If we can’t assume that the reader knows about lattice math, is a document like this the best place to teach it? I settled for the most minimal approach instead, merely giving an introduction to the algorithm, included SageMath and C reference implementations together with test vectors. The IETF audience rarely understands math, so I think it is better to focus on the bits on the wire and the algorithm interfaces. Everything here was created by the Streamlined NTRU Prime team, I merely modified it a bit hoping I didn’t break too much. I have now published draft-josefsson-ntruprime-streamlined-00 here.

I maintain the IETF documents on my ietf-ntruprime GitLab page, feel free to open merge requests or raise issues to help improve them.

To have confidence in the code was working properly, I ended up preparing a branch with sntrup761 for the GNU-project Nettle and have submitted it upstream for review. I had the misfortune of having to understand and implement NIST’s DRBG-CTR to compute the sntrup761 known-answer tests, and what a mess it is. Why does a deterministic random generator support re-seeding? Why does it support non-full entropy derivation? What’s with the key size vs block size confusion? What’s with the optional parameters? What’s with having multiple algorithm descriptions? Luckily I was able to extract a minimal but working implementation that is easy to read. I can’t locate DRBG-CTR test vectors, anyone? Does anyone have sntrup761 test vectors that doesn’t use DRBG-CTR? One final reflection on publishing known-answer tests for an algorithm that uses random data: are the test vectors stable over different ways to implement the algorithm? Just consider of some optimization moved one randomness-extraction call before another, then wouldn’t the output be different? Are there other ways to verify correctness of implementations?

As always, happy hacking!

OpenPGP smartcard under GNOME on Debian 10 Buster

Debian buster is almost released, and today I celebrate midsummer by installing (a pre-release) of it on my Lenovo X201 laptop. Everything went smooth, except for the usual issues with smartcards under GNOME. I use a FST-01G running Gnuk, but the same issue apply to all OpenPGP cards including YubiKeys. I wrote about this problem for earlier releases, read Smartcards on Debian 9 Stretch and Smartcards on Debian 8 Jessie. Some things have changed – now GnuPG‘s internal ccid support works, and dirmngr is installed by default when you install Debian with GNOME. I thought I’d write a new post for the new release.

After installing Debian and logging into GNOME, I start a terminal and attempt to use the smartcard as follows.

jas@latte:~$ gpg --card-status
gpg: error getting version from 'scdaemon': No SmartCard daemon
gpg: OpenPGP card not available: No SmartCard daemon
jas@latte:~$ 

The reason is that the scdaemon package is not installed. Install it as follows.

jas@latte:~$ sudo apt-get install scdaemon

After this, gpg --card-status works. It is now using GnuPG’s internal CCID library, which appears to be working. The pcscd package is not required to get things working any more — however installing it also works, and you might need pcscd if you use other applications that talks to the smartcard.

jas@latte:~$ gpg --card-status
Reader ...........: Free Software Initiative of Japan Gnuk (FSIJ-1.2.14-67252015) 00 00
Application ID ...: D276000124010200FFFE672520150000
Version ..........: 2.0
Manufacturer .....: unmanaged S/N range
Serial number ....: 67252015
Name of cardholder: Simon Josefsson
Language prefs ...: sv
Sex ..............: man
URL of public key : https://josefsson.org/key-20190320.txt
Login data .......: jas
Signature PIN ....: inte tvingad
Key attributes ...: ed25519 cv25519 ed25519
Max. PIN lengths .: 127 127 127
PIN retry counter : 3 3 3
Signature counter : 710
KDF setting ......: off
Signature key ....: A3CC 9C87 0B9D 310A BAD4  CF2F 5172 2B08 FE47 45A2
      created ....: 2019-03-20 23:40:49
Encryption key....: A9EC 8F4D 7F1E 50ED 3DEF  49A9 0292 3D7E E76E BD60
      created ....: 2019-03-20 23:40:26
Authentication key: CA7E 3716 4342 DF31 33DF  3497 8026 0EE8 A9B9 2B2B
      created ....: 2019-03-20 23:40:37
General key info..: [none]
jas@latte:~$ 

As before, using the key does not work right away:

jas@latte:~$ echo foo|gpg -a --sign
gpg: no default secret key: No public key
gpg: signing failed: No public key
jas@latte:~$ 

This is because GnuPG does not have the public key that correspond to the private key inside the smartcard.

jas@latte:~$ gpg --list-keys
jas@latte:~$ gpg --list-secret-keys
jas@latte:~$ 

You may retrieve your public key from the clouds as follows. With Debian Buster, the dirmngr package is installed by default so there is no need to install it. Alternatively, if you configured your smartcard with a public key URL that works, you may type “retrieve” into the gpg --card-edit interactive interface. This could be considered slightly more reliable (at least from a self-hosting point of view), because it uses your configured URL for retrieving the public key rather than trusting clouds.

jas@latte:~$ gpg --recv-keys "A3CC 9C87 0B9D 310A BAD4  CF2F 5172 2B08 FE47 45A2"
gpg: key D73CF638C53C06BE: public key "Simon Josefsson <simon@josefsson.org>" imported
gpg: marginals needed: 3  completes needed: 1  trust model: pgp
gpg: depth: 0  valid:   2  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 2u
gpg: next trustdb check due at 2019-10-22
gpg: Total number processed: 1
gpg:               imported: 1
jas@latte:~$ 

Now signing with the smart card works! Yay! Btw: compare the output size with the output size in the previous post to understand the size advantage with Ed25519 over RSA.

jas@latte:~$ echo foo|gpg -a --sign
-----BEGIN PGP MESSAGE-----

owGbwMvMwCEWWKTN8c/ddRHjaa4khlieP//S8vO5OkpZGMQ4GGTFFFkWn5nTzj3X
kGvXlfP6MLWsTCCFDFycAjARscUM/5MnXTF9aSG4ScVa3sDiB2//nPSVz13Mkpbo
nlzSezowRZrhn+Ky7/O6M7XljzzJvtJhfPvOyS+rpyqJlD+buumL+/eOPywA
=+WN7
-----END PGP MESSAGE-----

As before, encrypting to myself does not work smoothly because of the trust setting on the public key. Witness the problem here:

jas@latte:~$ echo foo|gpg -a --encrypt -r simon@josefsson.org
gpg: 02923D7EE76EBD60: There is no assurance this key belongs to the named user

sub  cv25519/02923D7EE76EBD60 2019-03-20 Simon Josefsson <simon@josefsson.org>
 Primary key fingerprint: B1D2 BD13 75BE CB78 4CF4  F8C4 D73C F638 C53C 06BE
      Subkey fingerprint: A9EC 8F4D 7F1E 50ED 3DEF  49A9 0292 3D7E E76E BD60

It is NOT certain that the key belongs to the person named
in the user ID.  If you *really* know what you are doing,
you may answer the next question with yes.

Use this key anyway? (y/N) 
gpg: signal Interrupt caught ... exiting

jas@latte:~$

You update the trust setting with the gpg --edit-key command. Take note that this is not the general way of getting rid of the “There is no assurance this key belongs to the named user” warning — using a ultimate trust setting is normally only relevant for your own keys, which is the case here.

jas@latte:~$ gpg --edit-key simon@josefsson.org
gpg (GnuPG) 2.2.12; Copyright (C) 2018 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret subkeys are available.

pub  ed25519/D73CF638C53C06BE
     created: 2019-03-20  expires: 2019-10-22  usage: SC  
     trust: unknown       validity: unknown
ssb  cv25519/02923D7EE76EBD60
     created: 2019-03-20  expires: 2019-10-22  usage: E   
     card-no: FFFE 67252015
ssb  ed25519/80260EE8A9B92B2B
     created: 2019-03-20  expires: 2019-10-22  usage: A   
     card-no: FFFE 67252015
ssb  ed25519/51722B08FE4745A2
     created: 2019-03-20  expires: 2019-10-22  usage: S   
     card-no: FFFE 67252015
[ unknown] (1). Simon Josefsson <simon@josefsson.org>

gpg> trust
pub  ed25519/D73CF638C53C06BE
     created: 2019-03-20  expires: 2019-10-22  usage: SC  
     trust: unknown       validity: unknown
ssb  cv25519/02923D7EE76EBD60
     created: 2019-03-20  expires: 2019-10-22  usage: E   
     card-no: FFFE 67252015
ssb  ed25519/80260EE8A9B92B2B
     created: 2019-03-20  expires: 2019-10-22  usage: A   
     card-no: FFFE 67252015
ssb  ed25519/51722B08FE4745A2
     created: 2019-03-20  expires: 2019-10-22  usage: S   
     card-no: FFFE 67252015
[ unknown] (1). Simon Josefsson <simon@josefsson.org>

Please decide how far you trust this user to correctly verify other users' keys
(by looking at passports, checking fingerprints from different sources, etc.)

  1 = I don't know or won't say
  2 = I do NOT trust
  3 = I trust marginally
  4 = I trust fully
  5 = I trust ultimately
  m = back to the main menu

Your decision? 5
Do you really want to set this key to ultimate trust? (y/N) y

pub  ed25519/D73CF638C53C06BE
     created: 2019-03-20  expires: 2019-10-22  usage: SC  
     trust: ultimate      validity: unknown
ssb  cv25519/02923D7EE76EBD60
     created: 2019-03-20  expires: 2019-10-22  usage: E   
     card-no: FFFE 67252015
ssb  ed25519/80260EE8A9B92B2B
     created: 2019-03-20  expires: 2019-10-22  usage: A   
     card-no: FFFE 67252015
ssb  ed25519/51722B08FE4745A2
     created: 2019-03-20  expires: 2019-10-22  usage: S   
     card-no: FFFE 67252015
[ unknown] (1). Simon Josefsson <simon@josefsson.org>
Please note that the shown key validity is not necessarily correct
unless you restart the program.

gpg> quit
jas@latte:~$

Confirm gpg --list-keys indicate that the key is now trusted, and encrypting to yourself should work.

jas@latte:~$ gpg --list-keys
/home/jas/.gnupg/pubring.kbx
----------------------------
pub   ed25519 2019-03-20 [SC] [expires: 2019-10-22]
      B1D2BD1375BECB784CF4F8C4D73CF638C53C06BE
uid           [ultimate] Simon Josefsson <simon@josefsson.org>
sub   ed25519 2019-03-20 [A] [expires: 2019-10-22]
sub   ed25519 2019-03-20 [S] [expires: 2019-10-22]
sub   cv25519 2019-03-20 [E] [expires: 2019-10-22]

jas@latte:~$ gpg --list-secret-keys
/home/jas/.gnupg/pubring.kbx
----------------------------
sec#  ed25519 2019-03-20 [SC] [expires: 2019-10-22]
      B1D2BD1375BECB784CF4F8C4D73CF638C53C06BE
uid           [ultimate] Simon Josefsson <simon@josefsson.org>
ssb>  ed25519 2019-03-20 [A] [expires: 2019-10-22]
ssb>  ed25519 2019-03-20 [S] [expires: 2019-10-22]
ssb>  cv25519 2019-03-20 [E] [expires: 2019-10-22]

jas@latte:~$ echo foo|gpg -a --encrypt -r simon@josefsson.org
gpg: checking the trustdb
gpg: marginals needed: 3  completes needed: 1  trust model: pgp
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: next trustdb check due at 2019-10-22
-----BEGIN PGP MESSAGE-----

hF4DApI9fuduvWASAQdA4FIwM27EFqNK1I5eZERaZVDAXJDmYLZQHjZD8TexT3gw
7SDaeTLm7s0QSyKtsRugRpex6eSVhfA3WG8fUOyzbNv4o7AC/TQdhZ2TDtXZGFtY
0j8BRYIjVDbYOIp1NM3kHnMGHWEJRsTbtLCitMWmLdp4C98DE/uVkwjw98xEJauR
/9ZNmmvzuWpaHuEJNiFjORA=
=tAXh
-----END PGP MESSAGE-----
jas@latte:~$ 

The issue with OpenSSH and GNOME Keyring still exists as in previous releases.

jas@latte:~$ ssh-add -L
The agent has no identities.
jas@latte:~$ echo $SSH_AUTH_SOCK 
/run/user/1000/keyring/ssh
jas@latte:~$ 

The trick we used last time still works, and as far as I can tell, it is still the only recommended method to disable the gnome-keyring ssh component. Notice how we also configure GnuPG’s gpg-agent to enable SSH daemon support.

jas@latte:~$ mkdir ~/.config/autostart
jas@latte:~$ cp /etc/xdg/autostart/gnome-keyring-ssh.desktop ~/.config/autostart/
jas@latte:~$ echo 'Hidden=true' >> ~/.config/autostart/gnome-keyring-ssh.desktop 
jas@latte:~$ echo enable-ssh-support >> ~/.gnupg/gpg-agent.conf 

Log out of GNOME and log in again. Now the environment variable points to gpg-agent’s socket, and SSH authentication using the smartcard works.

jas@latte:~$ echo $SSH_AUTH_SOCK 
/run/user/1000/gnupg/S.gpg-agent.ssh
jas@latte:~$ ssh-add -L
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILzCFcHHrKzVSPDDarZPYqn89H5TPaxwcORgRg+4DagE cardno:FFFE67252015
jas@latte:~$ 

Topics for further discussion and research this time around includes:

  1. Should scdaemon (and possibly pcscd) be pre-installed on Debian desktop systems?
  2. Could gpg --card-status attempt to import the public key and secret key stub automatically? Alternatively, some new command that automate the bootstrapping of a new smartcard.
  3. Should GNOME keyring support smartcards?
  4. Why is GNOME keyring used by default for SSH rather than gpg-agent?
  5. Should gpg-agent default to enable the SSH daemon?
  6. What could be done to automatically infer the trust setting for a smartcard based private key?

Thanks for reading and happy smartcarding!

SSH Host Certificates with YubiKey NEO

If you manage a bunch of server machines, you will undoubtedly have run into the following OpenSSH question:

The authenticity of host 'host.example.org (1.2.3.4)' can't be established.
RSA key fingerprint is 1b:9b:b8:5e:74:b1:31:19:35:48:48:ba:7d:d0:01:f5.
Are you sure you want to continue connecting (yes/no)?

If the server is a single-user machine, where you are the only person expected to login on it, answering “yes” once and then using the ~/.ssh/known_hosts file to record the key fingerprint will (sort-of) work and protect you against future man-in-the-middle attacks. I say sort-of, since if you want to access the server from multiple machines, you will need to sync the known_hosts file somehow. And once your organization grows larger, and you aren’t the only person that needs to login, having a policy that everyone just answers “yes” on first connection on all their machines is bad. The risk that someone is able to successfully MITM attack you grows every time someone types “yes” to these prompts.

Setting up one (or more) SSH Certificate Authority (CA) to create SSH Host Certificates, and have your users trust this CA, will allow you and your users to automatically trust the fingerprint of the host through the indirection of the SSH Host CA. I was surprised (but probably shouldn’t have been) to find that deploying this is straightforward. Even setting this up with hardware-backed keys, stored on a YubiKey NEO, is easy. Below I will explain how to set this up for a hypothethical organization where two persons (sysadmins) are responsible for installing and configuring machines.

I’m going to assume that you already have a couple of hosts up and running and that they run the OpenSSH daemon, so they have a /etc/ssh/ssh_host_rsa_key* public/private keypair, and that you have one YubiKey NEO with the PIV applet and that the NEO is in CCID mode. I don’t believe it matters, but I’m running a combination of Debian and Ubuntu machines. The Yubico PIV tool is used to configure the YubiKey NEO, and I will be using OpenSC‘s PKCS#11 library to connect OpenSSH with the YubiKey NEO. Let’s install some tools:

apt-get install yubikey-personalization yubico-piv-tool opensc-pkcs11 pcscd

Every person responsible for signing SSH Host Certificates in your organization needs a YubiKey NEO. For my example, there will only be two persons, but the number could be larger. Each one of them will have to go through the following process.

The first step is to prepare the NEO. First mode switch it to CCID using some device configuration tool, like yubikey-personalization.

ykpersonalize -m1

Then prepare the PIV applet in the YubiKey NEO. This is covered by the YubiKey NEO PIV Introduction but I’ll reproduce the commands below. Do this on a disconnected machine, saving all files generated on one or more secure media and store that in a safe.

user=simon
key=`dd if=/dev/random bs=1 count=24 2>/dev/null | hexdump -v -e '/1 "%02X"'`
echo $key > ssh-$user-key.txt
pin=`dd if=/dev/random bs=1 count=6 2>/dev/null | hexdump -v -e '/1 "%u"'|cut -c1-6`
echo $pin > ssh-$user-pin.txt
puk=`dd if=/dev/random bs=1 count=6 2>/dev/null | hexdump -v -e '/1 "%u"'|cut -c1-8`
echo $puk > ssh-$user-puk.txt

yubico-piv-tool -a set-mgm-key -n $key
yubico-piv-tool -k $key -a change-pin -P 123456 -N $pin
yubico-piv-tool -k $key -a change-puk -P 12345678 -N $puk

Then generate a RSA private key for the SSH Host CA, and generate a dummy X.509 certificate for that key. The only use for the X.509 certificate is to make PIV/PKCS#11 happy — they want to be able to extract the public-key from the smartcard, and do that through the X.509 certificate.

openssl genrsa -out ssh-$user-ca-key.pem 2048
openssl req -new -x509 -batch -key ssh-$user-ca-key.pem -out ssh-$user-ca-crt.pem

You import the key and certificate to the PIV applet as follows:

yubico-piv-tool -k $key -a import-key -s 9c < ssh-$user-ca-key.pem
yubico-piv-tool -k $key -a import-certificate -s 9c < ssh-$user-ca-crt.pem

You now have a SSH Host CA ready to go! The first thing you want to do is to extract the public-key for the CA, and you use OpenSSH's ssh-keygen for this, specifying OpenSC's PKCS#11 module.

ssh-keygen -D /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -e > ssh-$user-ca-key.pub

If you happen to use YubiKey NEO with OpenPGP using gpg-agent/scdaemon, you may get the following error message:

no slots
cannot read public key from pkcs11

The reason is that scdaemon exclusively locks the smartcard, so no other application can access it. You need to kill scdaemon, which can be done as follows:

gpg-connect-agent SCD KILLSCD SCD BYE /bye

The output from ssh-keygen may look like this:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCp+gbwBHova/OnWMj99A6HbeMAGE7eP3S9lKm4/fk86Qd9bzzNNz2TKHM7V1IMEj0GxeiagDC9FMVIcbg5OaSDkuT0wGzLAJWgY2Fn3AksgA6cjA3fYQCKw0Kq4/ySFX+Zb+A8zhJgCkMWT0ZB0ZEWi4zFbG4D/q6IvCAZBtdRKkj8nJtT5l3D3TGPXCWa2A2pptGVDgs+0FYbHX0ynD0KfB4PmtR4fVQyGJjJ0MbF7fXFzQVcWiBtui8WR/Np9tvYLUJHkAXY/FjLOZf9ye0jLgP1yE10+ihe7BCxkM79GU9BsyRgRt3oArawUuU6tLgkaMN8kZPKAdq0wxNauFtH

Now all your users in your organization needs to add a line to their ~/.ssh/known_hosts as follows:

@cert-authority *.example.com ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCp+gbwBHova/OnWMj99A6HbeMAGE7eP3S9lKm4/fk86Qd9bzzNNz2TKHM7V1IMEj0GxeiagDC9FMVIcbg5OaSDkuT0wGzLAJWgY2Fn3AksgA6cjA3fYQCKw0Kq4/ySFX+Zb+A8zhJgCkMWT0ZB0ZEWi4zFbG4D/q6IvCAZBtdRKkj8nJtT5l3D3TGPXCWa2A2pptGVDgs+0FYbHX0ynD0KfB4PmtR4fVQyGJjJ0MbF7fXFzQVcWiBtui8WR/Np9tvYLUJHkAXY/FjLOZf9ye0jLgP1yE10+ihe7BCxkM79GU9BsyRgRt3oArawUuU6tLgkaMN8kZPKAdq0wxNauFtH

Each sysadmin needs to go through this process, and each user needs to add one line for each sysadmin. While you could put the same key/certificate on multiple YubiKey NEOs, to allow users to only have to put one line into their file, dealing with revocation becomes a bit more complicated if you do that. If you have multiple CA keys in use at the same time, you can roll over to new CA keys without disturbing production. Users may also have different policies for different machines, so that not all sysadmins have the power to create host keys for all machines in your organization.

The CA setup is now complete, however it isn't doing anything on its own. We need to sign some host keys using the CA, and to configure the hosts' sshd to use them. What you could do is something like this, for every host host.example.com that you want to create keys for:

h=host.example.com
scp root@$h:/etc/ssh/ssh_host_rsa_key.pub .
gpg-connect-agent "SCD KILLSCD" "SCD BYE" /bye
ssh-keygen -D /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -s ssh-$user-ca-key.pub -I $h -h -n $h -V +52w ssh_host_rsa_key.pub
scp ssh_host_rsa_key-cert.pub root@$h:/etc/ssh/

The ssh-keygen command will use OpenSC's PKCS#11 library to talk to the PIV applet on the NEO, and it will prompt you for the PIN. Enter the PIN that you set above. The output of the command would be something like this:

Enter PIN for 'PIV_II (PIV Card Holder pin)': 
Signed host key ssh_host_rsa_key-cert.pub: id "host.example.com" serial 0 for host.example.com valid from 2015-06-16T13:39:00 to 2016-06-14T13:40:58

The host now has a SSH Host Certificate installed. To use it, you must make sure that /etc/ssh/sshd_config has the following line:

HostCertificate /etc/ssh/ssh_host_rsa_key-cert.pub

You need to restart sshd to apply the configuration change. If you now try to connect to the host, you will likely still use the known_hosts fingerprint approach. So remove the fingerprint from your machine:

ssh-keygen -R $h

Now if you attempt to ssh to the host, and using the -v parameter to ssh, you will see the following:

debug1: Server host key: RSA-CERT 1b:9b:b8:5e:74:b1:31:19:35:48:48:ba:7d:d0:01:f5
debug1: Host 'host.example.com' is known and matches the RSA-CERT host certificate.

Success!

One aspect that may warrant further discussion is the host keys. Here I only created host certificates for the hosts' RSA key. You could create host certificate for the DSA, ECDSA and Ed25519 keys as well. The reason I did not do that was that in this organization, we all used GnuPG's gpg-agent/scdaemon with YubiKey NEO's OpenPGP Card Applet with RSA keys for user authentication. So only the host RSA key is relevant.

Revocation of a YubiKey NEO key is implemented by asking users to drop the corresponding line for one of the sysadmins, and regenerate the host certificate for the hosts that the sysadmin had created host certificates for. This is one reason users should have at least two CAs for your organization that they trust for signing host certificates, so they can migrate away from one of them to the other without interrupting operations.