GPS on Replicant 6

I use Replicant on my main Samsung S3 mobile phone. Replicant is a fully free Android distribution. One consequence of being fully free is that some functionality does not work, because the hardware requires non-free software. I am in the process of upgrading my main phone to the latest beta builds of Replicant 6. Getting GPS to work on Replicant/S3 is not that difficult, and I have decided that I am willing to compromise on freedom a bit for my Geocaching hobby. I have written before how to get GPS to work on Replicant 4.0 and GPS on Replicant 4.2.

When I upgraded to Wolfgang’s Replicant 6 build back in September 2016, it took some time to figure out how to get GPS to work. I prepared notes on non-free firmware on Replicant 6 which included a section on getting GPS to work. Unfortunately, that method requires that you build your own image and have access to the build tree, which is not for everyone. This writeup explains how to get GPS to work on Replicant 6 without building your own image. Wolfgang already explained how to add all other non-free firmware to Replicant 6, but that did not cover GPS. The reason is that GPS requires non-free software to run on your main CPU. You should understand the consequences of this before proceeding!

The first step is to download a Replicant 6.0 image; currently they are available from the Replicant 6.0 forum thread. Download the replicant-6.0-i9300.zip file and flash it to your phone as usual. After loading the other non-free firmware you may want (Wi-Fi, Bluetooth etc.) using "./firmwares.sh i9300 all", make sure everything (except GPS of course) works. You can install the Geocaching client c:geo via F-Droid by adding fdroid.cgeo.org as a separate repository. Start the app and verify that GPS does not work. Keep the replicant-6.0-i9300.zip file around; you will need it later.

The tricky part about GPS is that the daemon is started through the init system of Android, specified by the file /init.target.rc. Replicant ships with the GPS part commented out. To modify this file, we need to bring out our little toolbox. Modifying the file on the device itself will not work: the root filesystem is extracted from a ramdisk file on every boot, so any changes made to the file will not be persistent. The file /init.target.rc is stored in the boot.img ramdisk, and that is the file we need to modify to make the change persistent.
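
You can see this for yourself on a running device (a quick check, assuming working adb; the grep runs on the laptop side):

adb shell cat /init.target.rc | grep -i gps
# the GPS service lines should show up commented out;
# you may need to run 'adb root' first to read the file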

First we need the unpackbootimg and mkbootimg tools. If you are lucky, you might find them pre-built for your operating system; I am using Debian and couldn’t find them easily. Building them from scratch is, however, not that difficult. Assuming you have a normal build environment (i.e., apt-get install build-essential), try the following to build the tools. I was inspired by a post on unpacking and editing boot.img for some of the following instructions.

git clone https://github.com/CyanogenMod/android_system_core.git
cd android_system_core/
git checkout cm-13.0 
cd mkbootimg/
gcc -o ./mkbootimg -I ../include ../libmincrypt/*.c ./mkbootimg.c
gcc -o ./unpackbootimg -I ../include ../libmincrypt/*.c ./unpackbootimg.c
sudo cp mkbootimg unpackbootimg /usr/local/bin/

You are now ready to unpack the boot.img file. You will need the replicant ZIP file in your home directory. Also download the small patch I made for the init.target.rc file: https://gitlab.com/snippets/1639447. Save the patch as replicant-6-gps-fix.diff in your home directory.

mkdir t
cd t
unzip ~/replicant-6.0-i9300.zip 
unpackbootimg -i ./boot.img
mkdir ./ramdisk
cd ./ramdisk/
gzip -dc ../boot.img-ramdisk.gz | cpio -imd
patch < ~/replicant-6-gps-fix.diff 

Assuming the patch applied correctly (you should see output like "patching file init.target.rc" at the end) you will now need to put the ramdisk back together.

find . ! -name . | LC_ALL=C sort | cpio -o -H newc -R root:root | gzip > ../new-boot.img-ramdisk.gz
cd ..
mkbootimg --kernel ./boot.img-zImage \
--ramdisk ./new-boot.img-ramdisk.gz \
--second ./boot.img-second \
--cmdline "$(cat ./boot.img-cmdline)" \
--base "$(cat ./boot.img-base)" \
--pagesize "$(cat ./boot.img-pagesize)" \
--dt ./boot.img-dt \
--ramdisk_offset "$(cat ./boot.img-ramdisk_offset)" \
--second_offset "$(cat ./boot.img-second_offset)" \
--tags_offset "$(cat ./boot.img-tags_offset)" \
--output ./new-boot.img

Reboot your phone to the bootloader:

adb reboot bootloader

Then flash the new boot image back to your phone:

heimdall flash --BOOT new-boot.img

The phone will restart. To finalize things, you need the non-free GPS software components glgps, gps.exynos4.so and gps.cer. Previously I used a complicated method involving sdat2img.py to extract these files from a CyanogenMod 13.x archive. Fortunately, Lineage OS now offers downloads containing the relevant files too. You will need to download the archive, extract it, and load the files onto your phone.

wget https://mirrorbits.lineageos.org/full/i9300/20170125/lineage-14.1-20170125-experimental-i9300-signed.zip
mkdir lineage
cd lineage
unzip ../lineage-14.1-20170125-experimental-i9300-signed.zip
adb root
adb wait-for-device
adb remount
adb push system/bin/glgps /system/bin/
adb push system/lib/hw/gps.exynos4.vendor.so /system/lib/hw/gps.exynos4.so
adb push system/bin/gps.cer /system/bin/

Now reboot your phone and start c:geo, and it should find some satellites. Congratulations!
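
If it does not, a quick way to check whether the non-free daemon actually started (again assuming working adb; the grep runs on the laptop side):

adb shell ps | grep glgps
# no output means the glgps daemon did not start;
# double-check the patch and the pushed files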

Let’s Encrypt Clients

Like many others, I have been following the launch of Let’s Encrypt. Let’s Encrypt is a new zero-cost X.509 Certificate Authority that supports the Automated Certificate Management Environment (ACME) protocol. ACME allows you to automate creation and retrieval of HTTPS server certificates. As anyone who has maintained a number of HTTPS servers can attest, this process has unfortunately been manual, error-prone, and different between CAs.

On some of my personal domains, such as this blog (blog.josefsson.org), I have been using the CACert authority to sign the HTTPS server certificate. The problem with CACert is that the CACert trust anchors aren’t shipped with sufficiently many operating systems and web browsers. The user experience is similar to reaching a self-signed server certificate. For organization-internal servers that you don’t want to trust external parties for, I continue to believe that running your own CA and distributing it to your users is better than using a public CA (compare my XMPP server certificate setup). But for public servers, availability without prior configuration is more important. Therefore I decided that my public HTTPS servers should use a CA/Browser Forum-approved CA with support for ACME, and as long as Let’s Encrypt is trustworthy and zero-cost, it is a good choice.

I was in need of a free software ACME client, and set out to research what’s out there. Unfortunately, I did not find any web pages that listed the available options and compared them. The Let’s Encrypt CA points to the “official” Let’s Encrypt client, written by Jakub Warmuz, James Kasten, Peter Eckersley and several others. The manual contains pointers to two other clients in a seemingly unrelated section. Those clients are letsencrypt-nosudo by Daniel Roesler et al, and simp_le by (again!) Jakub Warmuz. From the letsencrypt.org client-dev mailing list I also found letsencrypt.sh by Gerhard Heift and LetsEncryptShell by Jan Mojžíš. Is anyone aware of other ACME clients?

By comparing these clients, I learned what I did not like in them. I wanted something small so that I could audit it. I wanted something that doesn’t require root access. Preferably, it should be able to run on my laptop, since I wasn’t ready to run something on the servers. Generally, it has to be secure, which implies something about how it approaches private-key handling. The official letsencrypt client can do everything, and has plugins for various server software to automate the ACME negotiation. All the cryptographic operations appear to be hidden inside the client, which usually means it is not flexible. I really did not like how it was designed; it looks like your typical monolithic proof-of-concept design. The simp_le client looked much cleaner, and gave me a good feeling. The letsencrypt.sh client is simple and written in /bin/sh shell script, but it appeared a bit too simplistic. LetsEncryptShell looked decent, but I wanted something more automated.

What none of these clients had, and what the letsencrypt-nosudo client did have, was the ability to let me do the private-key operations myself. All the operations are done interactively on the command line using OpenSSL. This would allow me to put the ACME user private key, and the HTTPS private key, on a YubiKey, using its PIV applet and techniques similar to what I used to create my SSH host CA. While the HTTPS private key has to be available on the HTTPS server (it is used to set up TLS connections), I wouldn’t want the ACME user private key to be available there. Similarly, I wouldn’t want to have the ACME or the HTTPS private key on my laptop. The letsencrypt-nosudo tool is otherwise rougher around the edges than the cleaner simp_le client, but the private-key handling was the deciding factor for me.
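
To give a flavor of the letsencrypt-nosudo workflow (paraphrased from its README at the time; file names are illustrative), you create the account and domain keys yourself with OpenSSL, and the helper script then prints each signing command for you to run manually:

openssl genrsa 4096 > user.key
openssl rsa -in user.key -pubout > user.pub
openssl genrsa 4096 > domain.key
openssl req -new -sha256 -key domain.key -subj "/CN=example.com" > domain.csr
python sign_csr.py --public-key user.pub domain.csr > signed.crt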

After fixing some hard-coded limitations on RSA key sizes, getting the cert was as simple as following the letsencrypt-nosudo instructions. I’ll follow up with a later post describing how to put the ACME user private key and the HTTPS server certificate private key on a YubiKey and how to use that with letsencrypt-nosudo.

So you can now enjoy browsing my blog over HTTPS! Thank you Let’s Encrypt!

Automatic Replicant Backup over USB using rsync

I have been using Replicant on the Samsung SIII I9300 for over two years. I have written before on taking a backup of the phone using rsync but recently I automated my setup as described below. This work was prompted by a screen accident with my phone that caused it to die, and I noticed that I hadn’t taken regular backups. I did not lose any data this time, since typically all content I create on the device is immediately synchronized to my clouds. Photos are uploaded by the ownCloud app, SMS Backup+ saves SMS and call logs to my IMAP server, and I use DAVDroid for synchronizing contacts, calendar and task lists with my instance of ownCloud. Still, I strongly believe in regular backups of everything, so it was time to automate this.

For my use-case, taking backups of the phone whenever I connect it to one of my laptops is sufficient. I typically connect it to my laptops for charging at least every other day. My laptops are all running Debian, but this should be applicable to most modern GNU/Linux systems. This is not Replicant-specific, although you need a rooted phone. I thought that automating this would be simple, but I got to learn the ins and outs of systemd and udev in the process, and this ended up taking the better part of an evening.

I started out adding a udev rule and a small script, thinking I could invoke the backup process from the udev rule. However, rsync would magically die after running a few seconds. After an embarrassingly long debugging session, I finally found someone with a similar problem, which led me to a nice writeup on the topic of running long-running services on udev events. I created a file /etc/udev/rules.d/99-android-backup.rules with the following content:

ACTION=="add", SUBSYSTEMS=="usb", ENV{ID_SERIAL_SHORT}=="323048a5ae82918b", TAG+="systemd", ENV{SYSTEMD_WANTS}+="android-backup@$env{ID_SERIAL_SHORT}.service"
ACTION=="add", SUBSYSTEMS=="usb", ENV{ID_SERIAL_SHORT}=="4df9e09c25e75f63", TAG+="systemd", ENV{SYSTEMD_WANTS}+="android-backup@$env{ID_SERIAL_SHORT}.service"

The serial numbers correspond to the device serial numbers of the two devices I wish to back up. The adb devices command will print them for you, and you need to replace my values with the values from your phones. Next I created a systemd service to describe a oneshot service. The file /etc/systemd/system/android-backup@.service has the following content:

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/android-backup %I

The at-sign (“@”) in the service filename signals that this is a service that takes a parameter. I’m not enough of a udev/systemd person to explain these two files using the proper terminology, but at least you can pattern-match and follow the basic idea of them: the udev rule matches the devices that I’m interested in (I don’t want this to happen to all random Android devices I attach, hence matching against known serial numbers), and it causes a systemd service with a parameter to be started. The systemd service file describes the script to run, and passes on the parameter.
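
To try the setup without rebooting, reload the configuration on the laptop and watch syslog while plugging in the phone:

sudo udevadm control --reload-rules
sudo systemctl daemon-reload
tail -F /var/log/syslog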

Now for the juicy part, the script. I have /usr/local/sbin/android-backup with the following content.

#!/bin/bash

DIRBASE=/var/backups/android
export ANDROID_SERIAL="$1"

# send all script output to syslog
exec > >(logger) 2>&1

# safety catch: the backup directory must already exist
if ! test -d "$DIRBASE-$ANDROID_SERIAL"; then
    echo "could not find directory: $DIRBASE-$ANDROID_SERIAL"
    exit 1
fi

set -x

adb wait-for-device
adb root
adb wait-for-device
# create a minimal read-only rsyncd configuration on the phone
adb shell printf "address 127.0.0.1\nuid = root\ngid = root\n[root]\n\tpath = /\n" \> /mnt/secure/rsyncd.conf
adb shell rsync --daemon --no-detach --config=/mnt/secure/rsyncd.conf &
# forward laptop port 6010 to the phone's rsyncd port 873
adb forward tcp:6010 tcp:873
sleep 2
rsync -av --delete --exclude /dev --exclude /acct --exclude /sys --exclude /proc rsync://localhost:6010/root/ $DIRBASE-$ANDROID_SERIAL/
# log the rsync exit code to syslog
: rc $?
adb forward --remove tcp:6010
adb shell rm -f /mnt/secure/rsyncd.conf
This script warrants more detailed explanation. Backups are placed under, e.g., /var/backups/android-323048a5ae82918b/ for later off-site backup (you do backup your laptop, right?). You have to create this directory manually, as a safety catch to not wildly rsync data into non-existing directories. The script logs everything using syslog, so run a tail -F /var/log/syslog& when setting this up. You may want to reduce the verbosity of rsync if you prefer (replace rsync -av with rsync -a).

The script runs adb wait-for-device, which as you rightly guessed will wait for the device to settle. Next adb root is invoked to get root on the device (reading all files from the system naturally requires root). It takes some time to switch, so another wait-for-device call is needed. Next a small rsyncd configuration file is created in /mnt/secure/rsyncd.conf on the phone. The file tells rsync to listen on localhost, run as root, and use / as the path. By default, rsyncd is read-only, so the host will not be able to upload any data over rsync, just read data out. Next rsync is started on the phone. The adb forward command forwards port 6010 on the laptop to port 873 on the phone (873 is the default rsyncd port). Unfortunately, setting up the TCP forward appears to take some time, and adb wait-for-device will not wait for that to complete, hence an ugly sleep 2 at this point.

Next is the rsync invocation itself, which just pulls in everything from the phone to the laptop, excluding some usual suspects. The somewhat cryptic : rc $? merely logs the exit code of the rsync process into syslog. Finally we clean up the TCP forward and remove the rsyncd.conf file that was temporarily created.

This setup appears stable to me. I can plug in a phone and a backup will be taken. I can even plug in both my devices at the same time, and both backups will run concurrently. If I unplug a device, the script or rsync will error out and systemd cleans up.

If anyone has ideas on how to avoid the ugly temporary rsyncd.conf file or the ugly sleep 2, I’m interested. It would also be nice to not have to do the ‘adb root’ dance, and instead have the phone start the rsync daemon when connecting to my laptop somehow. TCP forwarding might be troublesome on a multi-user system, but my laptops aren’t. Killing rsync on the phone is probably a good idea too. If you have ideas on how to fix any of this, other feedback, or questions, please let me know!

Combining Dnsmasq and Unbound

For my home office network I have been using Dnsmasq for some time. Dnsmasq provides me with DNS, DHCP, DHCPv6, and IPv6 Router Advertisement. I run dnsmasq on a Debian Jessie server, but it works similarly with OpenWRT if you want to use a smaller device. My entire /etc/dnsmasq.d/local configuration used to look like this:

dhcp-authoritative
interface=eth1
read-ethers
dhcp-range=192.168.1.100,192.168.1.150,12h
dhcp-range=2001:9b0:104:42::100,2001:9b0:104:42::1500
dhcp-option=option6:dns-server,[::]
enable-ra

Here dhcp-authoritative enables authoritative DHCP mode. interface=eth1 says to listen on eth1 only, which is my internal (IPv4 NAT) network. I try to keep track of the MAC addresses of all my devices in a /etc/ethers file, so I use read-ethers to have dnsmasq give stable IP addresses to them. The dhcp-range lines enable DHCP and DHCPv6 on my internal network. The dhcp-option=option6:dns-server,[::] statement is needed to inform the DHCP clients of the DNS resolver’s IPv6 address, otherwise they would only get the IPv4 DNS server address. The enable-ra parameter enables IPv6 router advertisement on the internal network, thereby removing the need to run radvd too — useful since I prefer to use copyleft software.

Recently I had a desire to use DNSSEC, and enabled it in Dnsmasq using the following statements:

dnssec
trust-anchor=.,19036,8,2,49AAC11D7B6F6446702E54A1607371607A1A41855200FD2CE1CDDE32F24E8FB5
dnssec-check-unsigned

The dnssec keyword enables DNSSEC validation in dnsmasq, using the indicated trust-anchor (get the root anchors from IANA). The dnssec-check-unsigned option deserves some more discussion. The dnsmasq manpage describes it as follows:

As a default, dnsmasq does not check that unsigned DNS replies are legitimate: they are assumed to be valid and passed on (without the “authentic data” bit set, of course). This does not protect against an attacker forging unsigned replies for signed DNS zones, but it is fast. If this flag is set, dnsmasq will check the zones of unsigned replies, to ensure that unsigned replies are allowed in those zones. The cost of this is more upstream queries and slower performance.

For example, this means that dnsmasq’s DNSSEC functionality is not secure against active man-in-the-middle attacks between dnsmasq and the DNS server it is using. Even if example.org used DNSSEC properly, an attacker could fake that it was unsigned to dnsmasq, and I would get potentially incorrect values in return. We all know that the Internet is not a secure place, and your threat model should include active attackers. I believe this mode should be the default in dnsmasq, and users should have to configure dnsmasq to not be in that mode if they really want to (with the obvious security warning).

Running with this enabled for a couple of days resulted in frustration about not being able to reach a couple of domains. The behaviour was that my clients would hang indefinitely or get a SERVFAIL, both resulting in lack of service. You can enable query logging in dnsmasq with log-queries, and doing so I noticed three distinct error patterns:

jow13gw dnsmasq 460 - -  forwarded www.fritidsresor.se to 213.80.101.3
jow13gw dnsmasq 460 - -  validation result is BOGUS
jow13gw dnsmasq 547 - -  reply cloudflare-dnssec.net is BOGUS DNSKEY
jow13gw dnsmasq 547 - -  validation result is BOGUS
jow13gw dnsmasq 547 - -  reply linux.conf.au is BOGUS DS
jow13gw dnsmasq 547 - -  validation result is BOGUS

The first only happened intermittently, the second did not cause any noticeable problem, and the final one was reproducible. To be fair, I only found the last example after starting to search for problem reports (see post confirming bug).

At this point, I had a confirmed bug in dnsmasq that affects my use-case. I want to use official packages from Debian on this machine, so installing newer versions manually is not an option. So I started to look into alternatives for DNS resolving, and quickly found Unbound. Installing it was easy:

apt-get install unbound
unbound-control-setup 

I created a local configuration file in /etc/unbound/unbound.conf.d/local.conf as follows:

server:
	interface: 127.0.0.1
	interface: ::1
	interface: 192.168.1.2
	interface: 2001:9b0:104:42::2
	access-control: 127.0.0.1 allow
	access-control: ::1 allow
	access-control: 192.168.1.2/24 allow
	access-control: 2001:9b0:104:42::2/64 allow
	outgoing-interface: 155.4.17.2
	outgoing-interface: 2001:9b0:1:1a04::2
#	log-queries: yes
#	verbosity: 2

The interface keyword determines which IP addresses to listen on; here I used the loopback interface and the local addresses of the physical network interface for my internal network. The access-control statements allow recursive DNS resolving from those networks, and outgoing-interface specifies my external Internet-connected interface. log-queries and/or verbosity are useful for debugging.
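
Before restarting anything, unbound-checkconf is a cheap way to catch mistakes in the new configuration:

unbound-checkconf
# on success it reports that there are no errors in /etc/unbound/unbound.conf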

To make things work, dnsmasq has to stop providing DNS services. This can be achieved with the port=0 keyword; however, that will also disable informing DHCP clients about the DNS server to use, so that has to be added back manually. I ended up adding the two following lines to /etc/dnsmasq.d/local:

port=0
dhcp-option=option:dns-server,192.168.1.2

Restarting unbound and dnsmasq now leads to working (and secure) internal DNSSEC-aware name resolution over both IPv4 and IPv6. I can verify that resolution works, and that Unbound verifies signatures and rejects bad domains properly, with the host command as below, or use an online DNSSEC resolver test page, although I’m not sure how confident you can be in the result from that page.

$ host linux.conf.au
linux.conf.au has address 192.55.98.190
linux.conf.au mail is handled by 1 linux.org.au.
$ host sigfail.verteiltesysteme.net
;; connection timed out; no servers could be reached
$ 
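
You can also query the resolver directly with dig (from the dnsutils package) and look for the ad (authentic data) flag, which a validating resolver sets:

dig +dnssec linux.conf.au @192.168.1.2
# look for "ad" in the ";; flags:" line of the response;
# the same query for sigfail.verteiltesysteme.net should end in SERVFAIL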

I use Munin to monitor my services, and I was happy to find a nice Unbound Munin plugin. I installed the file in /usr/share/munin/plugins/ and created a Munin plugin configuration file /etc/munin/plugin-conf.d/unbound as follows:

[unbound*]
user root
env.statefile /var/lib/munin-node/plugin-state/root/unbound.state
env.unbound_conf /etc/unbound/unbound.conf
env.unbound_control /usr/sbin/unbound-control
env.spoof_warn 1000
env.spoof_crit 100000

I ran munin-node-configure --shell | sh to enable it. For this to work, unbound has to be configured as well, so I created /etc/unbound/unbound.conf.d/munin.conf as follows.

server:
	extended-statistics: yes
	statistics-cumulative: no
	statistics-interval: 0
remote-control:
	control-enable: yes

The graphs may be viewed at my munin instance.

Cosmos – A Simple Configuration Management System

Back in early 2012 I had been helping with system administration of a number of Debian/Ubuntu-based machines, and the odd Solaris machine, for a couple of years at $DAYJOB. We had a combination of hand-written scripts, documentation notes that we cut’n’paste’d from during installation, and some locally maintained Debian packages for pulling in dependencies and providing some configuration files. As the number of people and machines involved grew, I realized that I wasn’t happy with how these machines were being administered. If one of these machines were to disappear in flames, it would take time (and more importantly, non-trivial manual labor) to get its services up and running again. I wanted a system that could automate the complete configuration of any Unix-like machine. It should require minimal human interaction. I wanted the configuration files to be version controlled. I wanted good security properties. I did not want to rely on a centralized server that would be a single point of failure. It had to be portable and easy to get to work on new (and very old) platforms. It should be easy to modify a configuration file and get it deployed. I wanted it to be easy to start using on an existing server, and it should allow for incremental adoption. Surely this must exist, I thought.

During January 2012 I evaluated the existing configuration management systems around, like CFEngine, Chef, and Puppet. I don’t recall my reasons for rejecting each individual project, but needless to say I did not find what I was looking for. The reasons for rejecting the projects I looked at ranged from centralization concerns (single-point-of-failure central servers) and bad security (no OpenPGP signing integration) to the feeling that the projects were too complex and hence fragile. I’m sure there were other reasons too.

In February I started going back to my original needs and tried to see if I could abstract something from the knowledge that was in all these notes, script snippets and local dpkg packages. I realized that the essence of what I wanted was one shell script per machine, OpenPGP signed, in a Git repository. I could check out that Git repository on every new machine that I wanted to configure, verify the OpenPGP signature of the shell script, and invoke the script. The script would do everything needed to get the machine up into an operational state again, including package installation and configuration file changes. Since I would usually want to modify configuration files on a system even after its initial installation (hey, not everyone is perfect), it was natural to extend this idea to a cron job that did ‘git pull’, verified the OpenPGP signature, and ran the script. The script would then have to be a bit more clever and not redo everything every time.
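
A minimal sketch of that idea, assuming a per-machine script with a detached OpenPGP signature in the repository (the paths and file names here are made up for illustration, not what Cosmos ended up using):

#!/bin/sh
# hypothetical cron job: update the repository, verify the signature, run the script
set -e
cd /var/cache/conf
git pull -q
gpg --verify "$(hostname).sh.sig" "$(hostname).sh"
sh "./$(hostname).sh"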

Since we had many machines, it was obvious that there would be huge code duplication between scripts. It felt natural to think of splitting up the shell script into a directory with many smaller shell scripts, and to invoke each shell script in turn. Think of the /etc/init.d/ hierarchy and how it worked with System V init. This would allow re-use of useful snippets across several machines. The next realization was that large parts of the shell script would be there to create configuration files, such as /etc/network/interfaces. It would be easier to modify the content of those files if they were stored as files in a separate directory, an “overlay” stored in a sub-directory overlay/, and copied into the file system’s hierarchy with rsync. The final realization was that it made sense to run one set of scripts before rsync’ing in the configuration files (to be able to install packages or set things up for the configuration files to make sense), and one set of scripts after the rsync (to perform tasks that require some package to be installed and configured). These sets of scripts were called the “pre-tasks” and “post-tasks” respectively, and stored in sub-directories called pre-tasks.d/ and post-tasks.d/.
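
Putting these pieces together, a machine model in the Git repository might look something like this (an illustrative layout; the script and file names are made up):

host.example.org/
  pre-tasks.d/
    10-install-packages      # runs before the overlay is copied in
  overlay/
    etc/network/interfaces   # rsync'ed onto the file system hierarchy
  post-tasks.d/
    20-restart-services      # runs after the overlay is in place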

I started putting what would become Cosmos together during February 2012. Incidentally, I had been using etckeeper on our machines, and I had been reading its source code, and it greatly inspired the internal design of Cosmos. The git history shows well how the ideas evolved — even that Cosmos was initially called Eve but in retrospect I didn’t like the religious connotations — and there were a couple of rewrites on the way, but on the 28th of February I pushed out version 1.0. It was in total 778 lines of code, with at least 200 of those lines being the license boilerplate at the top of each file. Version 1.0 had a debian/ directory and I built the dpkg file and started to deploy it on some machines. There were a couple of small fixes in the next few days, but development stopped on March 5th 2012. We started to use Cosmos, and converted more and more machines to it, and I quickly converted all of my home servers to use it too. And even my laptops. It took until September 2014 to discover the first bug (the fix is a one-liner). Since then there haven’t been any real changes to the source code. It is in daily use today.

The README that comes with Cosmos gives a more hands-on approach to using it, which I hope will serve as a starting point if the above introduction sparked some interest. I hope to cover more about how to use Cosmos in a later blog post. Since Cosmos does so little on its own, you want to see a Git repository with machine models to make sense of how to use it. If you want to see how the Git repository for my own machines looks, you can see the sjd-cosmos repository. Don’t miss its README at the bottom. In particular, its global/ sub-directory contains some of the foundation, such as OpenPGP key trust handling.

SSH Host Certificates with YubiKey NEO

If you manage a bunch of server machines, you will undoubtedly have run into the following OpenSSH question:

The authenticity of host 'host.example.org (1.2.3.4)' can't be established.
RSA key fingerprint is 1b:9b:b8:5e:74:b1:31:19:35:48:48:ba:7d:d0:01:f5.
Are you sure you want to continue connecting (yes/no)?

If the server is a single-user machine, where you are the only person expected to login on it, answering “yes” once and then using the ~/.ssh/known_hosts file to record the key fingerprint will (sort-of) work and protect you against future man-in-the-middle attacks. I say sort-of, since if you want to access the server from multiple machines, you will need to sync the known_hosts file somehow. And once your organization grows larger, and you aren’t the only person that needs to login, having a policy that everyone just answers “yes” on first connection on all their machines is bad. The risk that someone is able to successfully MITM attack you grows every time someone types “yes” to these prompts.

Setting up one (or more) SSH Certificate Authority (CA) to create SSH Host Certificates, and having your users trust this CA, will allow you and your users to automatically trust the fingerprint of the host through the indirection of the SSH Host CA. I was surprised (but probably shouldn’t have been) to find that deploying this is straightforward. Even setting it up with hardware-backed keys, stored on a YubiKey NEO, is easy. Below I will explain how to set this up for a hypothetical organization where two persons (sysadmins) are responsible for installing and configuring machines.

I’m going to assume that you already have a couple of hosts up and running and that they run the OpenSSH daemon, so they have a /etc/ssh/ssh_host_rsa_key* public/private keypair, and that you have one YubiKey NEO with the PIV applet and that the NEO is in CCID mode. I don’t believe it matters, but I’m running a combination of Debian and Ubuntu machines. The Yubico PIV tool is used to configure the YubiKey NEO, and I will be using OpenSC‘s PKCS#11 library to connect OpenSSH with the YubiKey NEO. Let’s install some tools:

apt-get install yubikey-personalization yubico-piv-tool opensc-pkcs11 pcscd

Every person responsible for signing SSH Host Certificates in your organization needs a YubiKey NEO. For my example, there will only be two persons, but the number could be larger. Each one of them will have to go through the following process.

The first step is to prepare the NEO. First mode switch it to CCID using some device configuration tool, like yubikey-personalization.

ykpersonalize -m1

Then prepare the PIV applet in the YubiKey NEO. This is covered by the YubiKey NEO PIV Introduction but I’ll reproduce the commands below. Do this on a disconnected machine, saving all files generated on one or more secure media and store that in a safe.

user=simon
key=`dd if=/dev/random bs=1 count=24 2>/dev/null | hexdump -v -e '/1 "%02X"'`
echo $key > ssh-$user-key.txt
pin=`dd if=/dev/random bs=1 count=6 2>/dev/null | hexdump -v -e '/1 "%u"'|cut -c1-6`
echo $pin > ssh-$user-pin.txt
puk=`dd if=/dev/random bs=1 count=6 2>/dev/null | hexdump -v -e '/1 "%u"'|cut -c1-8`
echo $puk > ssh-$user-puk.txt

yubico-piv-tool -a set-mgm-key -n $key
yubico-piv-tool -k $key -a change-pin -P 123456 -N $pin
yubico-piv-tool -k $key -a change-puk -P 12345678 -N $puk

Then generate an RSA private key for the SSH Host CA, and generate a dummy X.509 certificate for that key. The only use for the X.509 certificate is to make PIV/PKCS#11 happy — they want to be able to extract the public-key from the smartcard, and do that through the X.509 certificate.

openssl genrsa -out ssh-$user-ca-key.pem 2048
openssl req -new -x509 -batch -key ssh-$user-ca-key.pem -out ssh-$user-ca-crt.pem

You import the key and certificate to the PIV applet as follows:

yubico-piv-tool -k $key -a import-key -s 9c < ssh-$user-ca-key.pem
yubico-piv-tool -k $key -a import-certificate -s 9c < ssh-$user-ca-crt.pem

You now have a SSH Host CA ready to go! The first thing you want to do is to extract the public-key for the CA, and you use OpenSSH's ssh-keygen for this, specifying OpenSC's PKCS#11 module.

ssh-keygen -D /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -e > ssh-$user-ca-key.pub

If you happen to use YubiKey NEO with OpenPGP using gpg-agent/scdaemon, you may get the following error message:

no slots
cannot read public key from pkcs11

The reason is that scdaemon exclusively locks the smartcard, so no other application can access it. You need to kill scdaemon, which can be done as follows:

gpg-connect-agent "SCD KILLSCD" "SCD BYE" /bye

The output from ssh-keygen may look like this:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCp+gbwBHova/OnWMj99A6HbeMAGE7eP3S9lKm4/fk86Qd9bzzNNz2TKHM7V1IMEj0GxeiagDC9FMVIcbg5OaSDkuT0wGzLAJWgY2Fn3AksgA6cjA3fYQCKw0Kq4/ySFX+Zb+A8zhJgCkMWT0ZB0ZEWi4zFbG4D/q6IvCAZBtdRKkj8nJtT5l3D3TGPXCWa2A2pptGVDgs+0FYbHX0ynD0KfB4PmtR4fVQyGJjJ0MbF7fXFzQVcWiBtui8WR/Np9tvYLUJHkAXY/FjLOZf9ye0jLgP1yE10+ihe7BCxkM79GU9BsyRgRt3oArawUuU6tLgkaMN8kZPKAdq0wxNauFtH

Now all users in your organization need to add a line to their ~/.ssh/known_hosts as follows:

@cert-authority *.example.com ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCp+gbwBHova/OnWMj99A6HbeMAGE7eP3S9lKm4/fk86Qd9bzzNNz2TKHM7V1IMEj0GxeiagDC9FMVIcbg5OaSDkuT0wGzLAJWgY2Fn3AksgA6cjA3fYQCKw0Kq4/ySFX+Zb+A8zhJgCkMWT0ZB0ZEWi4zFbG4D/q6IvCAZBtdRKkj8nJtT5l3D3TGPXCWa2A2pptGVDgs+0FYbHX0ynD0KfB4PmtR4fVQyGJjJ0MbF7fXFzQVcWiBtui8WR/Np9tvYLUJHkAXY/FjLOZf9ye0jLgP1yE10+ihe7BCxkM79GU9BsyRgRt3oArawUuU6tLgkaMN8kZPKAdq0wxNauFtH

Each sysadmin needs to go through this process, and each user needs to add one line for each sysadmin. While you could put the same key/certificate on multiple YubiKey NEOs, to allow users to only have to put one line into their file, dealing with revocation becomes a bit more complicated if you do that. If you have multiple CA keys in use at the same time, you can roll over to new CA keys without disturbing production. Users may also have different policies for different machines, so that not all sysadmins have the power to create host keys for all machines in your organization.

The CA setup is now complete; however, it isn't doing anything on its own. We need to sign some host keys using the CA, and configure the hosts' sshd to use them. What you could do is something like this, for every host host.example.com that you want to create keys for:

h=host.example.com
scp root@$h:/etc/ssh/ssh_host_rsa_key.pub .
gpg-connect-agent "SCD KILLSCD" "SCD BYE" /bye
ssh-keygen -D /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -s ssh-$user-ca-key.pub -I $h -h -n $h -V +52w ssh_host_rsa_key.pub
scp ssh_host_rsa_key-cert.pub root@$h:/etc/ssh/

The ssh-keygen command will use OpenSC's PKCS#11 library to talk to the PIV applet on the NEO, and it will prompt you for the PIN. Enter the PIN that you set above. The output of the command would be something like this:

Enter PIN for 'PIV_II (PIV Card Holder pin)': 
Signed host key ssh_host_rsa_key-cert.pub: id "host.example.com" serial 0 for host.example.com valid from 2015-06-16T13:39:00 to 2016-06-14T13:40:58

The host now has a SSH Host Certificate installed. To use it, you must make sure that /etc/ssh/sshd_config has the following line:

HostCertificate /etc/ssh/ssh_host_rsa_key-cert.pub

You need to restart sshd to apply the configuration change. If you now try to connect to the host, you will likely still use the known_hosts fingerprint approach. So remove the fingerprint from your machine:

ssh-keygen -R $h

Now if you attempt to ssh to the host using the -v parameter, you will see the following:

debug1: Server host key: RSA-CERT 1b:9b:b8:5e:74:b1:31:19:35:48:48:ba:7d:d0:01:f5
debug1: Host 'host.example.com' is known and matches the RSA-CERT host certificate.

Success!

One aspect that may warrant further discussion is the host keys. Here I only created host certificates for the hosts' RSA keys. You could create host certificates for the DSA, ECDSA and Ed25519 keys as well. The reason I did not do that was that in this organization, we all used GnuPG's gpg-agent/scdaemon with the YubiKey NEO's OpenPGP Card Applet with RSA keys for user authentication. So only the host RSA key is relevant.
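
Had I wanted certificates for the other key types, the same ssh-keygen invocation from above works against the corresponding public key files, for example:

ssh-keygen -D /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -s ssh-$user-ca-key.pub -I $h -h -n $h -V +52w ssh_host_ed25519_key.pub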

Revocation of a YubiKey NEO key is implemented by asking users to drop the corresponding line for one of the sysadmins, and by regenerating the host certificates for the hosts that sysadmin had signed. This is one reason your organization should have at least two CAs that users trust for signing host certificates, so that you can migrate away from one of them to the other without interrupting operations.

Laptop decision fatigue

I admit defeat. I have put some effort into researching recent laptop models (see first and second post). Last week I asked myself what the biggest problem with my current 4+ year old X201 is. I couldn’t articulate any significant concern. So I have bought another second-hand X201 for semi-permanent use at my second office. At ~225 USD/EUR, including another docking station, it is an amazing value. I considered the X220-X240, but they have a different docking station and were roughly twice the price; the price difference allowed for a Samsung 850 PRO SSD purchase instead. Thanks everyone for your advice, anyway!

OpenPGP Smartcards and GNOME

The combination of GnuPG and an OpenPGP smartcard (such as the YubiKey NEO) has been implemented and working well for around a decade. I recall starting to use it when I received an FSFE Fellowship card a long time ago. Sadly there have been some regressions when using them under GNOME recently. I reinstalled my laptop with Debian Jessie (beta2) recently, and now took the time to work through the issue and write down a workaround.

To work with GnuPG and smartcards you install gpg-agent, scdaemon, pcscd and pcsc-tools. On Debian you can do it like this:

apt-get install gnupg-agent scdaemon pcscd pcsc-tools

Use the pcsc_scan command line tool to make sure pcscd recognizes the smartcard before continuing; if it doesn’t recognize the smartcard, nothing beyond this point will work.
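
For example (the exact output and reader name will vary):

pcsc_scan
# the reader (e.g. "Yubico Yubikey NEO OTP+CCID") and the inserted card
# should be listed; press Ctrl-C to stop scanning

The next step is to make sure you have the following line in ~/.gnupg/gpg.conf: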

use-agent

Logging out and into GNOME should start gpg-agent for you, through the /etc/X11/Xsession.d/90gpg-agent script. In theory, this should be all that is required. However, when you start a terminal and attempt to use the smartcard through GnuPG you would get an error like this:

jas@latte:~$ gpg --card-status
gpg: selecting openpgp failed: unknown command
gpg: OpenPGP card not available: general error
jas@latte:~$

The reason is that the GNOME Keyring hijacks the GnuPG agent’s environment variables and effectively replaces gpg-agent with gnome-keyring-daemon which does not support smartcard commands (Debian bug #773304). GnuPG uses the environment variable GPG_AGENT_INFO to find the location of the agent socket, and when the GNOME Keyring is active it will typically look like this:

jas@latte:~$ echo $GPG_AGENT_INFO 
/run/user/1000/keyring/gpg:0:1
jas@latte:~$ 

If you use GnuPG with a smartcard, I recommend disabling GNOME Keyring’s GnuPG and SSH agent emulation code. This used to be easy to achieve in older GNOME releases (e.g., the one included in Debian Wheezy), through the gnome-session-properties GUI. Sadly there is no longer any GUI for disabling this functionality (Debian bug #760102). The GNOME Keyring GnuPG/SSH agent replacement functionality is invoked through the XDG autostart mechanism, and the documented way to disable such system-wide services for a normal user account is to invoke the following commands.

jas@latte:~$ mkdir ~/.config/autostart
jas@latte:~$ cp /etc/xdg/autostart/gnome-keyring-gpg.desktop ~/.config/autostart/
jas@latte:~$ echo 'Hidden=true' >> ~/.config/autostart/gnome-keyring-gpg.desktop 
jas@latte:~$ cp /etc/xdg/autostart/gnome-keyring-ssh.desktop ~/.config/autostart/
jas@latte:~$ echo 'Hidden=true' >> ~/.config/autostart/gnome-keyring-ssh.desktop 
jas@latte:~$ 

You now need to log out and log in again. When you start a terminal, you can look at the GPG_AGENT_INFO environment variable again, and everything should be working.

jas@latte:~$ echo $GPG_AGENT_INFO 
/tmp/gpg-dqR4L7/S.gpg-agent:1890:1
jas@latte:~$ echo $SSH_AUTH_SOCK 
/tmp/gpg-54VfLs/S.gpg-agent.ssh
jas@latte:~$ gpg --card-status
Application ID ...: D2760001240102000060000000420000
...
jas@latte:~$ ssh-add -L
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDFP+UOTZJ+OXydpmbKmdGOVoJJz8se7lMs139T+TNLryk3EEWF+GqbB4VgzxzrGjwAMSjeQkAMb7Sbn+VpbJf1JDPFBHoYJQmg6CX4kFRaGZT6DHbYjgia59WkdkEYTtB7KPkbFWleo/RZT2u3f8eTedrP7dhSX0azN0lDuu/wBrwedzSV+AiPr10rQaCTp1V8sKbhz5ryOXHQW0Gcps6JraRzMW+ooKFX3lPq0pZa7qL9F6sE4sDFvtOdbRJoZS1b88aZrENGx8KSrcMzARq9UBn1plsEG4/3BRv/BgHHaF+d97by52R0VVyIXpLlkdp1Uk4D9cQptgaH4UAyI1vr cardno:006000000042
jas@latte:~$ 

That’s it. Resolving this properly involves 1) adding smartcard code to the GNOME Keyring, 2) disabling the GnuPG/SSH replacement code in GNOME Keyring completely, 3) reordering the startup so that gpg-agent supersedes gnome-keyring-daemon instead of vice versa, so that people who installed gpg-agent really get it instead of the GNOME default, or 4) something else. I don’t have a strong opinion on how to solve this, but 3) sounds like a simple way forward.

Offline GnuPG Master Key and Subkeys on YubiKey NEO Smartcard

I have moved to a new OpenPGP key. There are many tutorials and blog posts on GnuPG key generation around, but none of them matched exactly the setup I wanted to have. So I wrote down the steps I took, to remember them if I need to in the future. Briefly my requirements were as follows:

  • The new master GnuPG key is on a USB stick.
  • The USB stick is only ever used on an offline computer.
  • There are subkeys stored on a YubiKey NEO smartcard for daily use.
  • I want to generate the subkeys using GnuPG so I have a backup.
  • Some non-default hash/cipher preferences are encoded into the public key.

OpenPGP Key Transition Statement

I have created a new OpenPGP key 54265e8c and will be transitioning away from my old key. If you have signed my old key, I would appreciate signatures on my new key as well. I have created a transition statement that can be downloaded from https://josefsson.org/key-transition-2014-06-22.txt.

Below is the signed statement.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

OpenPGP Key Transition Statement for Simon Josefsson

I have created a new OpenPGP key and will be transitioning away from
my old key.  The old key has not been compromised and will continue to
be valid for some time, but I prefer all future correspondence to be
encrypted to the new key, and will be making signatures with the new
key going forward.

I would like this new key to be re-integrated into the web of trust.
This message is signed by both keys to certify the transition.  My new
and old keys are signed by each other.  If you have signed my old key,
I would appreciate signatures on my new key as well, provided that
your signing policy permits that without re-authenticating me.

The old key, which I am transitioning away from, is:

pub   1280R/B565716F 2002-05-05
      Key fingerprint = 0424 D4EE 81A0 E3D1 19C6  F835 EDA2 1E94 B565 716F

The new key, to which I am transitioning, is:

pub   3744R/54265E8C 2014-06-22
      Key fingerprint = 9AA9 BDB1 1BB1 B99A 2128  5A33 0664 A769 5426 5E8C

The entire key may be downloaded from: https://josefsson.org/54265e8c.txt

To fetch the full new key from a public key server using GnuPG, run:

  gpg --keyserver keys.gnupg.net --recv-key 54265e8c

If you already know my old key, you can now verify that the new key is
signed by the old one:

  gpg --check-sigs 54265e8c

If you are satisfied that you've got the right key, and the User IDs
match what you expect, I would appreciate it if you would sign my key:

  gpg --sign-key 54265e8c

You can upload your signatures to a public keyserver directly:

  gpg --keyserver keys.gnupg.net --send-key 54265e8c

Or email simon@josefsson.org (possibly encrypted) the output from:

  gpg --armor --export 54265e8c

If you'd like any further verification or have any questions about the
transition please contact me directly.

To verify the integrity of this statement:

  wget -q -O- https://josefsson.org/key-transition-2014-06-22.txt|gpg --verify

/Simon
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iLwEAQEKAAYFAlOnV+AACgkQ7aIelLVlcW89XgUAljJgYfReyR9/bU+Om6UHUttt
CAOgSRqdcQSQ2hT69vzuhb/bc8CslIQcBtGqTgxDFsxEFhbm5zKn+tSzy5MHNHqt
MsqHcZjlYuYVhMXDhka+cfyhtd9zIxjVE5vk8v+GqEGoh8DGYq0vPy3VfvcSz5Z3
MSUpSj8gN00jlU1z4nad3maEq0ApvsLr8EsLZmtxF5TNFvzJ8mmwY+gHBGHjVYkB
8AQBAQoABgUCU6dX4AAKCRAGZKdpVCZejD1eDp46XGL2puMp0le2OF75WIUW8xqf
TMiZeB99ruk3P/jvuLnGPP2J5o7SIKE50FkMEss0yvxi6jBlHk+cJeKWGXVjBpxU
0QHq063NU+kjbMYwDfi5ZxXqaKeYODJm8Xmfh3d7lRaWF5rUOosR8nC/OROSrhg4
TjlAbvbxpQsls/JPbbporK2gbAtMlzJPD8zC8z/dT+t0qjlce8fADugblVW3bACC
Kl53X4XpojzNd/U19tSXkIBdNY/GVJqci+iruiJ1WGARF9ocnIXVuNXsfyt7UGq4
UiM/AeDVzI76v1QnE8WpsmSXzi2zXe3VahUPhOU2nPDoL53ggiVsTY3TwilvQLfX
Av/74PIaEtCi1g23YeojQlpdYzcWfnE+tUyTSNwPIBzyzHvFAHNg1Pg0KKUALsD9
P7EjrMuz63z2276EBKX8++9GnQQNCNfdHSuX4WGrBx2YgmOOqRdllMKz6pVMZdJO
V+gXbCMx0D5G7v50oB58Mb5NOgIoOnh3IQhJ7LkLwmcdG39yCdpU+92XbAW73elV
kmM8i0wsj5kDUU2ys32Gj2HnsVpbnh3Fvm9fjFJRbbQL/FxNAjzNcHe4cF3g8hTb
YVJJlzhmHGvd7HvXysJJaa0=
=ZaqY
-----END PGP SIGNATURE-----