Sun, 28 Oct 2007
Non-ASCII Basic Authentication
We use HTTP basic authentication for authenticating users to IS MU. It is the most widely supported method in clients, and it does not have some of the security problems that cookies have. Recently, while implementing a better method for storing passwords in IS MU, one that would allow passwords of arbitrary length, I discovered the following problem:
I wanted to use this change to allow not only passwords of arbitrary length, but also non-ASCII characters in passwords. However, reading the HTTP specification, I could not find which charset the Authorization: HTTP header is supposed to be sent in.
Experiments showed that, for example, MSIE sends the password in the client's native encoding (windows-1250 in this part of the world), regardless of the charset of the previous 401 page.
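To make the ambiguity concrete, here is a minimal Python sketch of how the same user name and password end up as different Authorization: Basic tokens depending on which charset the client picks. The helper name basic_credentials and the list of charsets are my own illustration, not something taken from any browser.

```python
import base64


def basic_credentials(user: str, password: str, charset: str) -> str:
    """Build the base64 token that follows "Authorization: Basic ".

    Hypothetical helper: the charset argument stands in for whatever
    encoding a particular client happens to use.
    """
    raw = f"{user}:{password}".encode(charset)
    return base64.b64encode(raw).decode("ascii")


user, password = "kas", "asděščřžýáíé"
for charset in ("windows-1250", "iso-8859-2", "utf-8"):
    # Each charset yields a different byte string, hence a different token.
    print(f"{charset:>12}: {basic_credentials(user, password, charset)}")
```

A server that compares the received bytes against a stored hash has no way of telling which of these the client meant, which is why a pure-ASCII password (identical in all three charsets) is the only safe case.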
What is worse, my Galeon (and so probably all Gecko-based browsers) mangles the password into something totally unreadable. For example, when I tried to send the user name "kas" and the password "asděščřžýáíé", the Authorization: header from Galeon contained the base64-encoded string "a2FzOmFzZBthDVl+/eHt6Q==" (decoded to bytes, it is 6b 61 73 3a 61 73 64 1b 61 0d 59 7e fd e1 ed e9). Total crap, even with a carriage return (0d) in it! It looks like each of these bytes is the lowest byte of the Unicode codepoint number of the corresponding character in the password. No UTF-8 or anything like that, just codepoints truncated to 8 bits.
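The truncation hypothesis is easy to check. The following Python sketch keeps only the low 8 bits of each codepoint and base64-encodes the result; the function name truncate_to_low_bytes is mine, and this is only a reproduction of the observed bytes, not Galeon's actual code.

```python
import base64


def truncate_to_low_bytes(text: str) -> bytes:
    """Keep only the lowest 8 bits of each character's Unicode codepoint."""
    return bytes(ord(ch) & 0xFF for ch in text)


raw = truncate_to_low_bytes("kas:asděščřžýáíé")
token = base64.b64encode(raw).decode("ascii")

print(raw.hex(" "))  # 6b 61 73 3a 61 73 64 1b 61 0d 59 7e fd e1 ed e9
print(token)         # a2FzOmFzZBthDVl+/eHt6Q==
```

Both lines match what Galeon sent: for instance ě is U+011B, which becomes 1b, and č is U+010D, which becomes 0d. Only ý, á, í and é survive intact, because their codepoints are below U+0100.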
So we will stay with ASCII-only passwords, as we do not want to affect the accessibility of IS from various devices. By the way, did you know that Apache has its own proprietary algorithm for MD5-based password hashing? Talk about reinventing the wheel.