Classic McEliece goes to IETF and OpenSSH

My earlier work on Streamlined NTRU Prime has been progressing along. The IETF document on sntrup761 in SSH has passed several process points. GnuPG’s libgcrypt has added support for sntrup761. The libssh support for sntrup761 is working, but the merge request is stuck mostly due to lack of time to debug why the regression test suite sporadically errors out in non-sntrup761 related parts with the patch.

The foundation for lattice-based post-quantum algorithms has some uncertainty around it, and I have felt that there is more to the post-quantum story than adding sntrup761 to implementations. Classic McEliece has been mentioned to me a couple of times, and I took some time to learn it and did a cut’n’paste job of the proposed ISO standard and published draft-josefsson-mceliece in the IETF to make the algorithm easily available to the IETF community. A high-quality implementation of Classic McEliece has been published as libmceliece and I’ve been supporting the work of Jan Mojžíš to package libmceliece for Debian, alas it has been stuck in the ftp-master NEW queue for manual review for over two months. The pre-dependencies librandombytes and libcpucycles are available in Debian already.

All that text writing and packaging work set the scene to write some code. When I added support for sntrup761 in libssh, I became familiar with the OpenSSH code base, so it was natural to return to OpenSSH to experiment with a new SSH KEX for Classic McEliece. DJB suggested to pick mceliece6688128 and combine it with the existing X25519+sntrup761 or with plain X25519. While a three-algorithm hybrid between X25519, sntrup761 and mceliece6688128 would be a simple drop-in for those that don’t want to lose the benefits offered by sntrup761, I decided to start the journey on a pure combination of X25519 with mceliece6688128. The key combiner in sntrup761x25519 is a simple SHA512 call and the only good I can say about that is that it is simple to describe and implement, and doesn’t raise too many questions since it is already deployed.

After procrastinating coding for months, once I sat down to work it only took a couple of hours until I had a successful Classic McEliece SSH connection. I suppose my brain had sorted everything in background before I started. To reproduce it, please try the following in a Debian testing environment (I use podman to get a clean environment).

# podman run -it --rm debian:testing-slim
apt update
apt dist-upgrade -y
apt install -y wget python3 librandombytes-dev libcpucycles-dev gcc make git autoconf libz-dev libssl-dev

cd ~
wget -q -O- | tar xfz -
cd libmceliece-20230612/
make install
cd ..

git clone
cd openssh-portable
git checkout jas/mceliece
./configure # verify 'libmceliece support: yes'

You should now have a working SSH client and server that supports Classic McEliece! Verify support by running ./ssh -Q kex and it should mention

To have it print plenty of debug outputs, you may remove the # character on the final line, but don’t use such a build in production.

You can test it as follows:

./ssh-keygen -A # writes to /usr/local/etc/ssh_host_...

# setup public-key based login by running the following:
./ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ""
cat ~/.ssh/ > ~/.ssh/authorized_keys

adduser --system sshd
mkdir /var/empty

while true; do $PWD/sshd -p 2222 -f /dev/null; done &

./ssh -v -p 2222 localhost date

On the client you should see output like this:

OpenSSH_9.5p1, OpenSSL 3.0.11 19 Sep 2023
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm:
debug1: kex: host key algorithm: ssh-ed25519
debug1: kex: server->client cipher: MAC: <implicit> compression: none
debug1: kex: client->server cipher: MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: SSH2_MSG_KEX_ECDH_REPLY received
debug1: Server host key: ssh-ed25519 SHA256:YognhWY7+399J+/V8eAQWmM3UFDLT0dkmoj3pIJ0zXs

debug1: Host '[localhost]:2222' is known and matches the ED25519 host key.
debug1: Found key in /root/.ssh/known_hosts:1
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey in after 134217728 blocks
debug1: Sending command: date
debug1: pledge: fork
debug1: permanently_set_uid: 0/0
  SSH_CLIENT=::1 46894 2222
  SSH_CONNECTION=::1 46894 ::1 2222
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype reply 0
Sat Dec  9 22:22:40 UTC 2023
debug1: channel 0: free: client-session, nchannels 1
Transferred: sent 1048044, received 3500 bytes, in 0.0 seconds
Bytes per second: sent 23388935.4, received 78108.6
debug1: Exit status 0

Notice the kex: algorithm: output.

How about network bandwidth usage? Below is a comparison of a complete SSH client connection such as the one above that log in and print date and logs out. Plain X25519 is around 7kb, X25519 with sntrup761 is around 9kb, and mceliece6688128 with X25519 is around 1MB. Yes, Classic McEliece has large keys, but for many environments, 1MB of data for the session establishment will barely be noticeable.

./ssh -v -p 2222 localhost -oKexAlgorithms=curve25519-sha256 date 2>&1 | grep ^Transferred
Transferred: sent 3028, received 3612 bytes, in 0.0 seconds

./ssh -v -p 2222 localhost date 2>&1 | grep ^Transferred
Transferred: sent 4212, received 4596 bytes, in 0.0 seconds

./ssh -v -p 2222 localhost date 2>&1 | grep ^Transferred
Transferred: sent 1048044, received 3764 bytes, in 0.0 seconds

So how about session establishment time?

date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost -oKexAlgorithms=curve25519-sha256 date > /dev/null 2>&1; i=`expr $i + 1`; done; date
Sat Dec  9 22:39:19 UTC 2023
Sat Dec  9 22:39:25 UTC 2023
# 6 seconds

date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost date > /dev/null 2>&1; i=`expr $i + 1`; done; date
Sat Dec  9 22:39:29 UTC 2023
Sat Dec  9 22:39:38 UTC 2023
# 9 seconds

date; i=0; while test $i -le 100; do ./ssh -v -p 2222 localhost date > /dev/null 2>&1; i=`expr $i + 1`; done; date
Sat Dec  9 22:39:55 UTC 2023
Sat Dec  9 22:40:07 UTC 2023
# 12 seconds

I never noticed adding sntrup761, so I’m pretty sure I wouldn’t notice this increase either. This is all running on my laptop that runs Trisquel so take it with a grain of salt but at least the magnitude is clear.

Future work items include:

  • Use a better hybrid mode combiner than SHA512? See for example KEM Combiners.
  • Write IETF document describing the Classic McEliece SSH protocol
  • Submit my patch to the OpenSSH community for discussion and feedback, please review meanwhile!
  • Implement a mceliece6688128sntrup761x25519 multi-hybrid mode?
  • Write a shell script a’la to import a stripped-down mceliece6688128 implementation to avoid having OpenSSH depend on libmceliece?
  • My kexmceliece6688128x25519.c is really just kexsntrup761x25519.c with the three calls to sntrup761 replaced with mceliece6688128. This should be parametrized to avoid cut’n’paste of code, if the OpenSSH community is interested.
  • Consider if the behaviour of librandombytes related to closing of file descriptors is relevant to OpenSSH.

Happy post-quantum SSH’ing!

Update: Changing the mceliece6688128_keypair call to mceliece6688128f_keypair (i.e., using the fully compatible f-variant) results in McEliece being just as fast as sntrup761 on my machine.

Update 2023-12-26: An initial IETF document draft-josefsson-ssh-mceliece-00 published.

Streamlined NTRU Prime sntrup761 goes to IETF

The OpenSSH project added support for a hybrid Streamlined NTRU Prime post-quantum key encapsulation method sntrup761 to strengthen their X25519-based default in their version 8.5 released on 2021-03-03. While there has been a lot of talk about post-quantum crypto generally, my impression has been that there has been a slowdown in implementing and deploying them in the past two years. Why is that? Regardless of the answer, we can try to collaboratively change things, and one effort that appears strangely missing are IETF documents for these algorithms.

Building on some earlier work that added X25519/X448 to SSH, writing a similar document was relatively straight-forward once I had spent a day reading OpenSSH and TinySSH source code to understand how it worked. While I am not perfectly happy with how the final key is derived from the sntrup761/X25519 secrets – it is a SHA512 call on the concatenated secrets – I think the construct deserves to be better documented, to pave the road for increased confidence or better designs. Also, reusing the RFC5656§4 structs makes for a worse specification (one unnecessary normative reference), but probably a simpler implementation. I have published draft-josefsson-ntruprime-ssh-00 here. Credit here goes to Jan Mojžíš of TinySSH that designed the earlier in 2018, Markus Friedl who added it to OpenSSH in 2019, and Damien Miller that changed it to sntrup761 in 2020. Does anyone have more to add to the history of this work?

Once I had sharpened my xml2rfc skills, preparing a document describing the hybrid construct between the sntrup761 key-encapsulation mechanism and the X25519 key agreement method in a non-SSH fashion was easy. I do not know if this work is useful, but it may serve as a reference for further study. I published draft-josefsson-ntruprime-hybrid-00 here.

Finally, how about a IETF document on the base Streamlined NTRU Prime? Explaining all the details, and especially the math behind it would be a significant effort. I started doing that, but realized it is a subjective call when to stop explaining things. If we can’t assume that the reader knows about lattice math, is a document like this the best place to teach it? I settled for the most minimal approach instead, merely giving an introduction to the algorithm, included SageMath and C reference implementations together with test vectors. The IETF audience rarely understands math, so I think it is better to focus on the bits on the wire and the algorithm interfaces. Everything here was created by the Streamlined NTRU Prime team, I merely modified it a bit hoping I didn’t break too much. I have now published draft-josefsson-ntruprime-streamlined-00 here.

I maintain the IETF documents on my ietf-ntruprime GitLab page, feel free to open merge requests or raise issues to help improve them.

To have confidence in the code was working properly, I ended up preparing a branch with sntrup761 for the GNU-project Nettle and have submitted it upstream for review. I had the misfortune of having to understand and implement NIST’s DRBG-CTR to compute the sntrup761 known-answer tests, and what a mess it is. Why does a deterministic random generator support re-seeding? Why does it support non-full entropy derivation? What’s with the key size vs block size confusion? What’s with the optional parameters? What’s with having multiple algorithm descriptions? Luckily I was able to extract a minimal but working implementation that is easy to read. I can’t locate DRBG-CTR test vectors, anyone? Does anyone have sntrup761 test vectors that doesn’t use DRBG-CTR? One final reflection on publishing known-answer tests for an algorithm that uses random data: are the test vectors stable over different ways to implement the algorithm? Just consider of some optimization moved one randomness-extraction call before another, then wouldn’t the output be different? Are there other ways to verify correctness of implementations?

As always, happy hacking!

Towards pluggable GSS-API modules

GSS-API is a standardized framework that is used by applications to, primarily, support Kerberos V5 authentication. GSS-API is standardized by IETF and supported by protocols like SSH, SMTP, IMAP and HTTP, and implemented by software projects such as OpenSSH, Exim, Dovecot and Apache httpd (via mod_auth_gssapi). The implementations of Kerberos V5 and GSS-API that are packaged for common GNU/Linux distributions, such as Debian, include MIT Kerberos, Heimdal and (less popular) GNU Shishi/GSS.

When an application or library is packaged for a GNU/Linux distribution, a choice is made which GSS-API library to link with. I believe this leads to two problematic consequences: 1) it is difficult for end-users to chose between Kerberos implementation, and 2) dependency bloat for non-Kerberos users. Let’s discuss these separately.

  1. No system admin or end-user choice over the GSS-API/Kerberos implementation used

    There are differences in the bug/feature set of MIT Kerberos and that of Heimdal’s, and definitely that of GNU Shishi. This can lead to a situation where an application (say, Curl) is linked to MIT Kerberos, and someone discovers a Kerberos related problem that would have been working if Heimdal was used, or vice versa. Sometimes it is possible to locally rebuild a package using another set of dependencies. However doing so has a high maintenance cost to track security fixes in future releases. It is an unsatisfying solution for the distribution to flip flop between which library to link to, depending on which users complain the most. To resolve this, a package could be built in two variants: one for MIT Kerberos and one for Heimdal. Both can be shipped. This can help solve the problem, but the question of which variant to install by default leads to similar concerns, and will also eventually leads to dependency conflicts. Consider an application linked to libraries (possible in several steps) where one library only supports MIT Kerberos and one library only supports Heimdal.

    The fact remains that there will continue to be multiple Kerberos implementations. Distributions will continue to support them, and will be faced with the dilemma of which one to link to by default. Distributions and the people who package software will have little guidance on which implementation to chose from their upstream, since most upstream support both implementations. The result is that system administrators and end-users are not given a simple way to have flexibility about which implementation to use.
  2. Dependency bloat for non-Kerberos use-cases.

    Compared to the number of users of GNU/Linux systems out there, the number of Kerberos users on GNU/Linux systems is smaller. Here distributions face another dilemma. Should they enable GSS-API for all applications, to satisfy the Kerberos community, or should they be conservative with adding dependencies to reduce attacker surface for the non-Kerberos users? This is a dilemma with no clear answer, and one approach has been to ship two versions of a package: one with Kerberos support and one without. Another option here is for upstream to support loadable modules, for example Dovecot implement this and Debian ship with a separate ‘dovecot-gssapi’ package that extend the core Dovecot seamlessly. Few except some larger projects appear to be willing to carry that maintenance cost upstream, so most only support build-time linking of the GSS-API library.

    There are a number of real-world situations to consider, but perhaps the easiest one to understand for most GNU/Linux users is OpenSSH. The SSH protocol supports Kerberos via GSS-API, and OpenSSH implement this feature, and most GNU/Linux distributions ship a SSH client and SSH server linked to a GSS-API library. Someone made the choice of linking it to a GSS-API library, for the arguable smaller set of people interested in it, and also the choice which library to link to. Rebuilding OpenSSH locally without Kerberos support comes with a high maintenance cost. Many people will not need or use the Kerberos features of the SSH client or SSH server, and having it enabled by default comes with a security cost. Having a vulnerability in OpenSSH is critical for many systems, and therefor its dependencies are a reasonable concern. Wouldn’t it be nice if OpenSSH was built in a way that didn’t force you to install MIT Kerberos or Heimdal? While still making it easy for Kerberos users to use it, of course.

Hopefully I have made the problem statement clear above, and that I managed to convince you that the state of affairs is in need of improving. I learned of the problems from my personal experience with maintaining GNU SASL in Debian, and for many years I ignored this problem.

Let me introduce Libgssglue!

Matryoshka Dolls
Matryoshka Dolls – photo CC-4.0-BY-NC by PngAll

Libgssglue is a library written by Kevin W. Coffman based on historical GSS-API code, the initial release was in 2004 (using the name libgssapi) and the last release was in 2012. Libgssglue provides a minimal GSS-API library and header file, so that any application can link to it instead of directly to MIT Kerberos or Heimdal (or GNU GSS). The administrator or end-user can select during run-time which GSS-API library to use, through a global /etc/gssapi_mech.conf file or even a local GSSAPI_MECH_CONF environment variable. Libgssglue is written in C, has no external dependencies, and is BSD-style licensed. It was developed for the CITI NFSv4 project but libgssglue ended up not being used.

I have added support to build GNU SASL with libgssglue — the changes required were only ./ since GSS-API is a standardized framework. I have written a fairly involved CI/CD check that builds GNU SASL with MIT Kerberos, Heimdal, libgssglue and GNU GSS, sets ups a local Kerberos KDC and verify successful GSS-API and GS2-KRB5 authentications. The ‘gsasl’ command line tool connects to a local example SMTP server, also based on GNU SASL (linked to all variants of GSS-API libraries), and to a system-installed Dovecot IMAP server that use the MIT Kerberos GSS-API library. This is on Debian but I expect it to be easily adaptable to other GNU/Linux distributions. The check triggered some (expected) Shishi/GSS-related missing features, and triggered one problem related to authorization identities that may be a bug in GNU SASL. However, testing shows that it is possible to link GNU SASL with libgssglue and have it be operational with any choice of GSS-API library that is shipped with Debian. See GitLab CI/CD code and its CI/CD output.

This experiment worked so well that I contacted Kevin to learn that he didn’t have any future plans for the project. I have adopted libgssglue and put up a Libgssglue GitLab project page, and pushed out a libgssglue 0.5 release fixing only some minor build-related issues. There are still some missing newly introduced GSS-API interfaces that could be added, but I haven’t been able to find any critical issues with it. Amazing that an untouched 10 year old project works so well!

My current next steps are:

  • Release GNU SASL with support for Libgssglue and encourage its use in documentation.
  • Make GNU SASL link to Libgssglue in Debian, to avoid a hard dependency on MIT Kerberos, but still allowing a default out-of-the-box Kerberos experience with GNU SASL.
  • Maintain libgssglue upstream and implement self-checks, CI/CD testing, new GSS-API interfaces that have been defined, and generally fix bugs and improve the project. Help appreciated!
  • Maintain the libgssglue package in Debian.
  • Look into if there are applications in Debian that link to a GSS-API library that could instead be linked to libgssglue to allow flexibility for the end-user and reduce dependency bloat.

What do you think? Happy Hacking!

What’s wrong with SCRAM?

Simple Authentication and Security Layer (SASL, RFC4422) is the framework that was abstracted from the IMAP and POP protocols. Among the most popular mechanisms are PLAIN (clear-text passwords, usually under TLS), CRAM-MD5 (RFC2195), and GSSAPI (for Kerberos V5). The DIGEST-MD5 mechanism was an attempt to improve upon the CRAM-MD5 mechanism, but ended up introducing a lot of complexity and insufficient desirable features and deployment was a mess — read RFC6331 for background on why it has been deprecated.


The effort to develop SCRAM (RFC5802) came, as far as I can tell, from the experiences with DIGEST-MD5 and the desire to offer something better than CRAM-MD5. In protocol design discussions, SCRAM is often still considered as “new” even though the specification was published in 2011 and even that had been in the making for several years. Developers that implement IMAP and SMTP still usually start out with supporting PLAIN and CRAM-MD5. The focus of this blog post is to delve into why this is and inspire the next step in this area. My opinion around this topic has existed for a couple of years already, formed while implementing SCRAM in GNU SASL, and my main triggers to write something about them now are 1) Martin Lambers‘ two-post blog series that first were negative about SCRAM and then became positive, and 2) my desire to work on or support new efforts in this area.

Let’s take a step back and spend some time analyzing PLAIN and CRAM-MD5. What are the perceived advantages and disadvantages?

Advantages: PLAIN and CRAM-MD5 solves the use-case of password-based user authentication, and are easy to implement.

Main disadvantages with PLAIN and CRAM-MD5:

  • PLAIN transfers passwords in clear text to the server (sometimes this is considered an advantage, but from a security point of view, it isn’t).
  • CRAM-MD5 requires that the server stores the password in plaintext (impossible to use a hashed or encrypted format).
  • Non-ASCII support was not there from the start.

A number of (debatable) inconveniences with PLAIN and CRAM-MD5 exists:

  • CRAM-MD5 does not support the notion of authorization identities.
  • The authentication is not bound to a particular secure channel, opening up for tunneling attacks.
  • CRAM-MD5 is based on HMAC-MD5 that is cryptographically “old” (but has withhold well) – the main problem today is that usually MD5 is not something you want to implement since there is diminishing other uses for it.
  • Servers can impersonate the client against other servers since they know the password.
  • Neither offer to authenticate the server to the client.

If you are familiar with SCRAM, you know that it solves these issues. So why hasn’t everyone jumped on it and CRAM-MD5 is now a thing of the past? In the first few years, my answer was that things take time and we’ll see improvements. Today we are ten years later; there are many SCRAM implementations out there, and the Internet has generally migrated away from protocols that have much larger legacy issues (e.g., SSL), but we are still doing CRAM-MD5. I think it is time to become critical of the effort and try to learn from the past. Here is my attempt at summarizing the concerns I’ve seen come up:

  • The mechanism family concept add complexity, in several ways:
    • The specification is harder to understand.
    • New instances of the mechanism family (SCRAM-SHA-256) introduce even more complexity since they tweak some of the poor choices made in the base specification.
    • Introducing new hashes to the family (like the suggested SHA3 variants) adds deployment costs since databases needs new type:value pairs to hold more than one “SCRAM” hashed password.
    • How to negotiate which variant to use is not well-defined. Consider if the server only has access to a SCRAM-SHA-1 hashed password for user X and a SCRAM-SHA-256 hashed password for user Y. What mechanisms should it offer to an unknown client? Offering both is likely to cause authentication failures, and the fall-back behaviour of SASL is poor.
  • The optional support for channel bindings and the way they are negotiated adds complexity.
  • The original default ‘tls-unique’ channel binding turned out to be insecure, and it cannot be supported in TLS 1.3.
  • Support for channel bindings requires interaction between TLS and SASL layers in an application.
  • The feature that servers cannot impersonate a client is dubious: the server only needs to participate in one authentication exchange with the client to gain this ability.
  • SCRAM does not offer any of the cryptographic properties of a Password-authenticated key agreement.

What other concerns are there? I’m likely forgetting some. Some of these are debatable and were intentional design choices.

Can we save SCRAM? I’m happy to see the effort to introduce a new channel binding and update the SCRAM specifications to use it for TLS 1.3+. I brought up a similar approach back in the days when some people were still insisting on ‘tls-unique’. A new channel binding solves some of the issues above.

It is hard to tell what the main reason for not implementing SCRAM more often is. A sense of urgency appears to be lacking. My gut feeling is that to an implementer SCRAM looks awfully similar to DIGEST-MD5. Most of the problems with DIGEST-MD5 could be fixed, but the fixes add more complexity.

How to proceed from here? I see a couple of options:

  • Let time go by to see increased adoption. Improving the channel binding situation will help.
  • Learn from the mistakes and introduce a new simple SCRAM, which could have the following properties:
    • No mechanism family, just one mechanism instance.
    • Hash is hard-coded, just like CRAM-MD5.
    • TLS and a channel binding is required and always used.
  • Review one of the PAKE alternatives and specify a SASL mechanism for it. Preferably without repeating the mistakes of CRAM-MD5, DIGEST-MD5 and SCRAM.
  • Give up on having “complex” authentication mechanisms inside SASL, and help some PAKE variant become implemented through a TLS library, and SASL applications should just use EXTERNAL to use TLS user authentication.


I feel the following XKCD is appropriate here.

Scrypt in IETF

Colin Percival and I have worked on an internet-draft on scrypt for some time. I realize now that the -00 draft was published over two years ago, turning this effort today somewhat into archeology rather than rocket science. Still, having a published RFC that is easy to refer to from other Internet protocols will hopefully help to establish the point that PBKDF2 alone no longer provides state-of-the-art protection for password hashing.

I have written about password hashing before where I give a quick introduction to the basic concepts in the context of the well-known PBKDF2 algorithm. The novelty in scrypt is that it is designed to combat brute force and hardware accelerated attacks on hashed password databases. Briefly, scrypt expands the password and salt (using PBKDF2 as a component) and then uses that to create a large array (typically tens or hundreds of megabytes) using the Salsa20 core hash function and then de-references that large array in a random and sequential pattern. There are three parameters to the scrypt function: a CPU/Memory cost parameter N (varies, typical values are 16384 or 1048576), a blocksize parameter r (typically 8), and a parallelization parameter p (typically a low number like 1 or 16). The process is described in the draft, and there are further discussions in Colin’s original scrypt paper.

The document has been stable for some time, and we are now asking for it to be published. Thus now is good time to provide us with feedback on the document. The live document on gitlab is available if you want to send us a patch.