IDNA flaws with regard to U+2024

In a bug report against libidn, Erik van der Poel gives an example of an internationalized domain name that is handled differently by different implementations. Another example of such a string is:

‘räksmörgås’ U+2024 ‘com’

If your browser supports Unicode, the string is: räksmörgås․com. Cut’n’paste the string into your browser and see what it tries to look up (please let me know what you notice!).

The problem with this string is that it is of the form “[non-ASCII][dot-like code point]com”. Here ‘räksmörgås’ stands for the non-ASCII part, which can be any non-ASCII string. U+2024 is one code point that looks like a dot; there are other dot-like code points as well.

The IDNA algorithm (RFC 3490, section 3.1) implies that applications should treat the string as one label. The U+2024 character is not one of the dot-like characters that need to be treated as label separators. The ASCII string that is output after applying the IDNA algorithm is:

xn--rksmrgs.com-l8as9u
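
To see where that string comes from, here is a minimal Python sketch of the per-label processing. It is not libidn and not a complete IDNA implementation: NFKC plus lowercasing stands in for Nameprep (an assumption that is only adequate for strings like this one), and the helper names split_labels and to_ascii_label are made up for illustration.

```python
import unicodedata

# The only code points RFC 3490 section 3.1 treats as label separators.
LABEL_SEPARATORS = "\u002e\u3002\uff0e\uff61"


def split_labels(name):
    """Split a domain name on the RFC 3490 separators only."""
    labels = [""]
    for ch in name:
        if ch in LABEL_SEPARATORS:
            labels.append("")
        else:
            labels[-1] += ch
    return labels


def to_ascii_label(label):
    """Simplified ToASCII for one label (NFKC approximates Nameprep)."""
    prepped = unicodedata.normalize("NFKC", label).lower()
    if prepped.isascii():
        return prepped
    return "xn--" + prepped.encode("punycode").decode("ascii")


name = "r\u00e4ksm\u00f6rg\u00e5s\u2024com"  # räksmörgås, U+2024, com
labels = split_labels(name)
print(len(labels))                          # U+2024 does not split the name
print([to_ascii_label(l) for l in labels])
```

Because U+2024 is not in the separator list, the name stays one label. Normalization then turns U+2024 into U+002E inside that label, which is how an ASCII dot ends up in the middle of the encoded output.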

Note that the string contains an ASCII dot ‘.’ (0x2E). If applications are not careful about how they resolve the name in the DNS, they will request information in a non-existent top-level domain ‘com-l8as9u’. This is because the DNS does not use ‘.’ to separate labels, but instead uses a length-value pair for each label. Thus the wrong string to look up would be:

(11)xn--rksmrgs(10)com-l8as9u

Whereas the right string to lookup would be:

(22)xn--rksmrgs.com-l8as9u

Using DNS master file syntax, where a dot inside a label is escaped with a backslash, the name to look up is xn--rksmrgs\.com-l8as9u.
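
The difference between the two lookups can be sketched in Python by building the length-prefixed form by hand. The function name to_wire is hypothetical, and the sketch ignores compression and the 255-octet limit on the whole name. Unlike the shorthand above, it also appends the zero-length root label that ends every name on the wire.

```python
def to_wire(labels):
    """Encode a list of ASCII labels as length-prefixed DNS labels."""
    out = bytearray()
    for label in labels:
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("label must be 1 to 63 octets long")
        out.append(len(raw))  # one length octet per label
        out += raw            # the label itself, dots included
    out.append(0)             # zero-length root label terminates the name
    return bytes(out)


ace = "xn--rksmrgs.com-l8as9u"

# Wrong: splitting the ASCII output on '.' yields two labels.
wrong = to_wire(ace.split("."))

# Right: the whole string is a single 22-octet label.
right = to_wire([ace])

print(wrong[0], wrong[12])  # length octets of the two labels
print(right[0])             # length octet of the single label
```

A resolver API that takes a dotted string and splits it internally will produce the first form, so the dot has to be escaped or the label passed in a way that bypasses the splitting.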

What’s interesting here is that some implementations, such as Microsoft Internet Explorer and Firefox, do not implement IDNA according to the standard. Instead, they compute the following string:

xn--rksmrgs-5wao1o.com

Arguably, this is a better approach than what is specified by RFC 3490. MSIE/Firefox recognize that U+2024 is a “dot-like” character by using NFKC. What is debatable is whether U+2024 will actually occur in practice: Unicode expert Kenneth Whistler says U+2024 will not be entered accidentally.
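
The browser result can be reproduced with a small Python sketch that normalizes the whole name before splitting it into labels. The order of operations is an assumption inferred from the output shown above, not something taken from the MSIE or Firefox source, and browser_style_to_ascii is a made-up name. As a simplification, NFKC plus lowercasing again stands in for Nameprep.

```python
import unicodedata


def browser_style_to_ascii(name):
    """Normalize first, then split on '.', then encode each label."""
    # NFKC maps U+2024 (and other compatibility dots) to U+002E
    # before any label boundaries are decided.
    normalized = unicodedata.normalize("NFKC", name).lower()
    encoded = []
    for label in normalized.split("."):
        if label.isascii():
            encoded.append(label)
        else:
            encoded.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(encoded)


name = "r\u00e4ksm\u00f6rg\u00e5s\u2024com"  # räksmörgås, U+2024, com
print(browser_style_to_ascii(name))
```

The only difference from the RFC 3490 processing is when normalization happens: done before label splitting, the dot produced from U+2024 becomes a real separator, and ‘com’ ends up as its own label.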

As the maintainer of GNU Libidn, I’m not yet sure what to do about the situation. The conservative approach is to do nothing until the RFCs are updated. I have come up with a patch to add a new IDNA flag that treats U+2024 as a dot-like character early on. This would at least make it possible to produce the same (RFC non-conforming) output that MSIE/Firefox compute.

On TLS-AUTHZ

The TLS-AUTHZ document (protocol spec here) describes a mechanism to add support for authorization in the TLS protocol. The idea is part of a patent application, see the patent notification to the IETF. The protocol has a complicated history in the IETF. Right now a third last call is open to request feedback from the community. I’ve written about TLS-AUTHZ before.

RedPhoneSecurity is now trying to circumvent the IETF standardization process by trying to get the document published as an ‘experimental standard’. The document earlier failed to get consensus for publication on the standards track.

The responsible IETF Area Director, Tim Polk, argues that because there exist independent implementations, the community benefits from having the document published. The argument is silly because the only independent implementation is mine, and I’m opposed to publication of the standard. Further, the document will remain accessible to anyone in the community with access to the Internet, since it has been published as an Internet Draft. To clarify that we have no interest in a standard with patent claims, we have decided to remove the tls-authz implementation from GnuTLS. Together with the FSF we came up with the following statement, which is part of the GnuTLS 2.0.2 release announcement:

** TLS authorization support removed.
This technique may be patented in the future, and it is not of crucial importance for the Internet community. After deliberation we have concluded that the best thing we can do in this situation is to encourage society not to adopt this technique. We have decided to lead the way with our own actions.

If you are concerned about having patented standards adopted by the IETF, now is a very good time to make your voice heard! The last call ends on October 23rd. Please read about the issue, familiarize yourself with the IETF process (RFC 2026, with updates related to patents in RFC 3979), and send your feedback to ietf@ietf.org.

Free-ietf-review

I have created a mailing list whose purpose is to discuss everything related to free software and the IETF, in particular themes related to copyright and patents. The idea is also to CC this list on discussions in various IETF areas that are relevant to the topic, so that everyone on the list becomes aware of what is going on. Examples of useful things to CC are reviews (from a free software perspective) of documents in last call, and discussions in working groups related to patent/copyright decisions.

You may subscribe to the list.

TLS-AUTHZ Patent Concerns

I’ve implemented tls-authz in GnuTLS, but there has been a long discussion of the patent situation for that technology on the IETF list. A few days ago there was a new IPR Disclosure with a patent license for this technology:

https://datatracker.ietf.org/public/ipr_detail_show.cgi?&ipr_id=833

I evaluated this license from a free software perspective; here is my writeup:

http://article.gmane.org/gmane.ietf.general/24690