Thu, 30 Mar 2006
Old bug hunting
Today I found and fixed a bug in our POP-3 server which had been there almost since Day 1 (5 years or so). Back then I decided to implement even some of the optional commands of the POP-3 protocol. One of the commands, UIDL, was pretty easy to implement - just one function which returns a unique message identification for each message (easy, as we have the message number as a primary key in the table of messages), and a wrapper function which does the protocol part. And here was the problem:
When implementing the UIDL command, I simply copied the implementation of the LIST command, which was almost the same; it just returns the message size instead of the unique identification. The bug was that in the copy I changed only the part where the UIDL msg-number command is handled, and forgot to modify the part where UIDL with no arguments (i.e. a listing of the whole folder) is handled. So for one message it worked correctly, but for the whole folder it returned the message sizes instead of unique IDs.
It seems that today's mail clients can get pretty confused when they receive duplicate message IDs in the UIDL listing (any two messages of the same size got the same "ID")...
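For illustration only, here is a minimal Python sketch of one way to make this kind of copy-and-paste bug harder to write. It is not the actual server code: the `Message` record, the `scan_listing()` helper and the use of the database key as the unique ID string are all assumptions made for the example. The single-message and whole-folder branches live in one shared function, and LIST and UIDL differ only in the field they pass in.

```python
from typing import Callable, List, NamedTuple, Optional


class Message(NamedTuple):
    number: int  # POP-3 message number within the session
    size: int    # message size in octets
    db_id: int   # primary key in a hypothetical messages table


def scan_listing(messages: List[Message], arg: Optional[int],
                 field: Callable[[Message], object]) -> List[str]:
    """Build reply lines for a LIST-like command (one message or all)."""
    if arg is not None:
        # Single-message form, e.g. "UIDL 2".
        for msg in messages:
            if msg.number == arg:
                return [f"+OK {msg.number} {field(msg)}"]
        return ["-ERR no such message"]
    # Whole-folder form: a multi-line reply terminated by a lone dot.
    lines = ["+OK"]
    lines += [f"{msg.number} {field(msg)}" for msg in messages]
    lines.append(".")
    return lines


def handle_list(messages: List[Message], arg: Optional[int] = None) -> List[str]:
    return scan_listing(messages, arg, lambda msg: msg.size)


def handle_uidl(messages: List[Message], arg: Optional[int] = None) -> List[str]:
    return scan_listing(messages, arg, lambda msg: msg.db_id)
```

With two messages of the same size, `handle_list()` repeats the size while `handle_uidl()` still returns distinct IDs in both forms, because there is only one place where the field is chosen.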
Mon, 27 Mar 2006
Give me back my webserver!
I found this link in Ted Ts'o's blog. Just make sure you read it in a place where you can laugh out loud.
Fri, 24 Mar 2006
HDD utilization
I am a statistics freak. On our FTP server, I have a statistics page with various measurements. These have proven useful many times. The most interesting data can be collected when there is a high load on the server (which is otherwise very hard to simulate). On our FTP server, the high load occurs mainly during Fedora Core releases, as we run an official Fedora mirror. Yesterday I found something interesting in the stats:
The first image shows the utilization of /dev/sdg, the second one shows the utilization of /dev/sdh. By "utilization" I mean the percentage of time when there was at least one pending request to the drive. I measure this in order to see whether the HDDs are the bottleneck of the server. The server has eight identical drives, WD2500JB (although of different ages and firmware revisions, because faulty drives have been replaced over time).
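A rough sketch of how such a utilization figure can be computed on Linux, assuming the data comes from /proc/diskstats (this is an assumption; the post does not say how the stats page collects it). The tenth statistics field of each line is the number of milliseconds the device spent with I/O in progress, so the difference between two samples divided by the sampling interval gives the busy percentage. The function names are made up for this example.

```python
def io_ticks_ms(diskstats_text: str, device: str) -> int:
    """Return the 'time spent doing I/O' counter (ms) for one device.

    Each /proc/diskstats line is: major, minor, name, then the counters;
    the busy-time counter is the 13th whitespace-separated column.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 13 and fields[2] == device:
            return int(fields[12])
    raise KeyError(device)


def utilization_percent(ticks_before_ms: int, ticks_after_ms: int,
                        interval_s: float) -> float:
    """Percentage of the interval during which the drive was busy."""
    busy_ms = ticks_after_ms - ticks_before_ms
    return min(100.0, 100.0 * busy_ms / (interval_s * 1000.0))
```

In practice one would read /proc/diskstats, sleep for the interval (say 5 seconds), read it again, and pass both counter values for the same device to `utilization_percent()`.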
As you can see, /dev/sdg has approached 100% utilization several times, while the other drive was nowhere near this utilization (the graphs for the other six drives are similar to /dev/sdh).
But the load on both drives is similar - both have the same number of sectors per second read and written (as can be seen on the HDD stats page - I will not copy the images here). Both drives are members of the same SW RAID volumes. I have watched the iostat -x 5 output, and it seems that /dev/sdg has not only a higher utilization, but also longer request wait and request service times, while the number of sectors read and written is about the same. The sdg drive is a year newer than sdh.
So, any idea about what might be the cause of the higher utilization of one drive out of eight almost identical ones?
Thu, 23 Mar 2006
Car insurance
I got an invoice from the insurance company which provides damage liability for my car - they want me to pay the damage liability insurance for the next year. The invoice was for approx. CZK 5700. I am not sure how much it was last year, but I think it was less. So I visited the web pages of the insurance company; they have an on-line calculator there.
When I entered the data about my car, age, and location there, the computed price was about the same as the one on the invoice (CZK 1 less, in fact :-). However, they have a 7% discount for contracts arranged on-line. So it seems it is around CZK 400 cheaper for me to withdraw from the previous contract and arrange a new one.
And what is even more interesting, the difference is even bigger for the car insurance than for the damage liability - the price of the car insurance is computed from the "common price" of the car at the time the contract was arranged. Even with bonuses and discounts it is still more expensive than a new contract, because the "common price" falls faster than the bonus grows. And if you damage the car, they will not refund more than the current "common price" anyway. It seems that they are betting that the customer will not bother to re-check the prices, and will simply renew the old contract.
I think I will withdraw from the car insurance contract anyway - it is by definition more expensive than paying for the repairs yourself. It is good when you are very short of money (such as immediately after buying a new car :-), but nothing more.
Tue, 21 Mar 2006
Long term memory
I have found that children probably have a better long-term memory than I expected: last week I went with Iva to my parents', and I put the blue plastic case with snow chains into the car trunk. After some time, Iva said "Grandpa has snow chains, yellow case.".
Indeed, my father-in-law has his snow chains in a case similar to mine, but it is yellow instead of blue. I think Iva has seen this case (and the snow chains themselves) only once, maybe half a year ago. After all this time she was able to remember not only the term "snow chains", but also the fact that the other Grandpa has snow chains too, in a differently colored case.
Mon, 20 Mar 2006
FC5 bug reports
Report nothing, expect nothing. The Red Hat Bugzilla now has the Fedora Core 5 category open, so I have reported the regressions I have found so far: #185943 (wrong keyboard mapping with the evdev input driver), #185944 (dual-head crash on ATI Radeon), and #185945 (gnome-terminal screen corruption).
I hope that I did not miss some obvious solution to these problems, or I will (again :-) look like a complete idiot...
The ACPI problems I wrote about on Friday are indeed related to the kernel.org bug #6111. I have added a note to the RH Bugzilla as #185947.
Fri, 17 Mar 2006
Fedora Core 5
On Monday afternoon there is a scheduled release of Fedora Core 5 (the link will be working after the release). I have mirrored it from the upstream site (thanks to Jakub Jelinek), and I have tried to upgrade some of my boxes to FC5.
So far it looks mostly OK, with the following glitches:
- On my primary workstation the dual-head setup caused a system lockup. I played with the X configuration yesterday, and today it works OK. I am not sure which change fixed it (I even tried recompiling X.org from FC4, so I think I am using some drivers from 6.8.2 on the 7.0 X server; I need to investigate this further).
- Because of the new udev I had to move to a newer kernel on my laptop. This means I have hit the kernel/ACPI bug #6111 (either that or the older bug #4150 is back).
- On both my primary workstations, at home and at work, I was not able to log in - my gnome-session dies somewhere in the initial stage. I have fixed this at work (although I am not sure what the cause was), so I will try to fix it at home as well.
- MPlayer and Liferea are missing a few shared libraries. I think this will get fixed as soon as FC5 is released and repositories like Extras, Livna, and FreshRPMs manage to catch up. For now, I have simply copied the libraries from the FC4 system.
Apart from that, I think FC5 is a decent system - I like the new graphics (and the ClearLooks theme). I have yet to explore the new technologies like Avahi and others. I have tried both i386 and AMD64 systems, and Vlasta even installed FC5/ppc on his iBook.
Tue, 14 Mar 2006
A Perl bug?
I have seen a strange behaviour of Perl code - the minimal code which triggers this is along the following lines:
    #!/usr/bin/perl
    use DBIx::ShowCaller;
    # use DBI;

    my $dbh = DBIx::ShowCaller->connect('dbi:Oracle:', 'dbuser', 'dbpass')
    # my $dbh = DBI->connect('dbi:Oracle:', 'dbuser', 'dbpass')
        or die;
    my $temp_file = "/tmp/input.txt";
    open(TEMP, $temp_file) or die "Can't open $temp_file: $!\n";
    my $text = join('',<TEMP>);
    close TEMP;
    for my $radek (split (/\n/, $text)) {
        my $rv;
        ($rv) = $dbh->selectrow_array('SELECT 1 FROM DUAL');
        # ($rv) = $dbh->selectrow_array('SELECT 1 FROM DUAL');
        if ($rv) {
            print "is ok\n";
        } else {
            print "NOT OK!\n";
        }
        last;
    }
Do not even try to find out why TF it is written this way - it was created by pruning a larger script and re-testing it. It returns "NOT OK" from the SELECT. However, any of the following changes makes it work:
- Re-running the SELECT (i.e. uncommenting the second SELECT).
- Doing anything with $dbh prior to the SELECT, be it another prepare() or a mere $dbh->ping.
- Replacing the selectrow_array() with prepare(), execute(), and fetchrow_array().
- Removing the split() - replacing it with "my @lines=<TEMP>; for my $radek (@lines) {".
- Replacing DBIx::ShowCaller with DBI (uncommenting the first two commented-out lines).
- Using a /tmp/input.txt file with 1004 or fewer lines (1005+ lines make the problem appear again). This does not depend on the file size (the file can be almost all empty lines, or extremely wide lines with 100+ chars per line).
Strange, isn't it? I have tested it on both Linux/i386 and Linux/ia64 (with different versions of the Oracle client, although DBI and DBD::Oracle have the same version number on both systems). Also, selectrow_arrayref() shows the same behaviour. It is Perl 5.8.6. What's going on there?
Sat, 11 Mar 2006
Japanese lessons
Oozy told me about Japanese lessons at the Faculty of Arts, and I have decided to enroll in the course with him. The course is taught by Mr. Koushi Hirayama, a native Japanese.
So far we are learning Hiragana, which I already know, but the lessons are nevertheless useful for me, because Hirayama-sensei also tells us what parts of the letters are more important than the others (something which cannot be learned just by looking at the Hiragana computer fonts).
The trickiest thing is the order of strokes. I thought it could be characterized as "left to right and top to bottom, with horizontal strokes before the vertical strokes which cross them". But alas, there are letters for which this rule does not work - for example も (MO) is written with the long stroke first, and や (YA) has the shortest stroke written before the other vertical stroke. Does anybody know of a better-working (yet still simple) rule for determining the order of strokes in Hiragana?
Fri, 03 Mar 2006
IP telephony
Playing with IP telephony continues. I have reported a few bugs in Ekiga, and it seems to make big progress almost daily. I use the latest CVS version (from the ekiga snapshots site, which also includes snapshots of the libraries).
After a few problems I now use Siproxd on my home NAT gateway, and I am able to call from the outside network to the laptop behind this NAT and vice versa. I have also obtained an account on the CESNET H.323 gateway, which gave me a public phone number for incoming calls. Unfortunately, the H.323 NAT/conntrack helper is still in the early stages of development and not very stable, so I cannot use it from behind the NAT host. I hope I will manage to file a bug report soon.
I have also discovered a few problems with CESNET VoIP - my H.323 number is not reachable from the Vodafone network, and some Cisco gear at CESNET does not like that my SIP From: header contains a non-numeric local part of the address (my login name instead of the phone number). The CESNET people are trying to fix both of these problems, and I must say, they were extremely helpful when I was trying to set up the VoIP service.
Yesterday a friend of my wife also tried a SIP telephone (Twinkle), and we were able to call her through the Freevoice.CZ gateway. Twinkle has some interesting features (such as two virtual voice lines), but it supports neither H.323 nor the SPEEX codec (the latter is in their TODO, though).
I wonder how long the pricing of legacy voice services can remain at the current level, because the price of a broadband Net connection is about the same as the price of an analog voice line (especially in bigger cities on networks like Netbox), and the bandwidth is an order of magnitude (or two) higher. The monthly fee is about the same, but the legacy phone service is, in addition to that, charged for every minute of the call, while the Net connection is always on. The analog phone uses 64 kbit/s for a single call, while the SPEEX wideband codec needs about 32 kbit/s, so a 1 Mbit/s line can hold roughly 30 calls (and I am not counting the fact that for traffic inside Netbox's own network I have 10/10 Mbit/s of bandwidth). And as far as I can tell, the quality of the IP calls is not so bad, especially with the SPEEX codec.
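The capacity estimate above is simple division; a small Python sketch makes the assumptions explicit. It counts codec payload only and ignores the per-packet RTP/UDP/IP overhead, so real numbers would be lower. The function name is made up for the example. With 1 Mbit/s taken as 1000 kbit/s the result is 31 calls; taking it as 1024 kbit/s gives 32.

```python
def max_calls(line_kbit: int, codec_kbit: int) -> int:
    """Number of simultaneous one-way streams that fit, payload only."""
    return line_kbit // codec_kbit


SPEEX_WIDEBAND_KBIT = 32  # approximate figure from the text
ANALOG_LINE_KBIT = 64     # bandwidth of a single legacy voice call

print(max_calls(1000, SPEEX_WIDEBAND_KBIT))  # 1 Mbit/s as 1000 kbit/s
print(max_calls(1000, ANALOG_LINE_KBIT))
```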
So, is it time for huge price cuts, or time for yet another monopoly abuse like disabling the VoIP traffic on the telco's own network?
Thu, 02 Mar 2006
Feb29 problem
Yesterday all our chip card access systems ceased to work. When I tried to find out what the problem was, there were no daemon processes running on the servers which communicate with the card readers. When I tried to start one such process by hand, it failed with an "invalid date" message.
It turned out that the problem was in a new feature I had added to the service software a few months ago: I decided that when a card is used to unlock the door, the log message should be inserted into the database with a timestamp provided by the card reader, not with the database sysdate() function (which can be later, when the message is not read from the card reader immediately).
This is of course an improvement, but unfortunately it seems that the card readers suffer from the "Feb29" problem, because they all thought that yesterday was February 29, 2006. This is an invalid date, so the service daemon failed to convert it to a UNIX timestamp and crashed. The software of course does have an automatic time synchronization each day after midnight, but this did not work either, because (in order to see the offset of the card reader's clock from the system time) I read and log the old time first, and only then set the new time. But reading the old time crashed the daemon, so the new time never got set. Oh well.
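A minimal Python sketch of the two defensive measures this suggests; it is not the actual service code. The reader interface (`read_clock`, `set_clock`) and all function names are hypothetical. An invalid reader date is treated as "unknown" instead of crashing the daemon, and the clock is set even when the old time cannot be read.

```python
from datetime import datetime
from typing import Callable, Optional, Tuple

ReaderTime = Tuple[int, int, int, int, int, int]  # y, m, d, h, min, s


def parse_reader_time(fields: ReaderTime) -> Optional[datetime]:
    """Return the reader's time, or None if it is not a valid date."""
    try:
        return datetime(*fields)
    except ValueError:  # e.g. February 29 in a non-leap year
        return None


def event_timestamp(fields: ReaderTime, now: datetime) -> datetime:
    """Prefer the reader's timestamp, fall back to the system time."""
    parsed = parse_reader_time(fields)
    return parsed if parsed is not None else now


def sync_reader_clock(read_clock: Callable[[], ReaderTime],
                      set_clock: Callable[[datetime], None],
                      now: datetime,
                      log: Callable[[str], None]) -> None:
    """Log the old offset if it is readable, but always set the clock."""
    old = parse_reader_time(read_clock())
    if old is None:
        log("reader clock invalid, offset unknown")
    else:
        log(f"reader clock offset: {(now - old).total_seconds():.0f} s")
    set_clock(now)
```

The design choice is that the log entry is a diagnostic and setting the clock is the actual repair, so a failure in the former should not prevent the latter.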