Fri, 29 Sep 2006
Museum of Transportation
On Thursday I went with Iva and Filip for a walk. I wanted to show Iva the old abandoned railway in Líšeň. I was very surprised to find that the old railway station was open, including the tram depository owned by the Brno Technical Museum.
They had trams from all eras since 1900. There was even more to see there, for example MiG-21 and L-29 airplanes, some military vehicles, two or three fire brigade trucks, a model railway, etc. The most interesting thing was the 1:13 scale model of a Škoda 22Tr trolleybus. Overall, I think I can recommend the museum, but it is just a depository, not a regular museum, so it does not have regular opening days.
Wed, 27 Sep 2006
Cisco woes
I got a message from Dan saying that Odysseus was strangely slow: copying files to Odysseus simply hung after a few kilobytes. I tried running tcpdump, and it seemed that Odysseus did not reply to some TCP frames. After figuring out that copying data did not work even from my workstation to Odysseus, I started to suspect the Cisco switch, whose firmware I upgraded yesterday.
Yesterday I upgraded the firmware in some of our Cisco switches (in order to finally get SSH working on them). Looking at the MRTG graphs for Odysseus, I saw something like this (it was before 2pm today):
So for some reason, the link speed negotiation between Odysseus and the Cisco switch ended up at 100 Mbit/s FD instead of 1 Gbit/s FD. After restarting the negotiation using ethtool, it works OK.
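A minimal sketch of how such a silent fallback could be detected automatically, assuming the usual `Speed:` / `Duplex:` / `Auto-negotiation:` lines printed by `ethtool <iface>`. The interface name `eth0`, the sample text, and the helper names `parse_link_settings` and `link_is_degraded` are my own illustrations, not output captured from Odysseus.

```python
import re

def parse_link_settings(ethtool_output: str) -> dict:
    """Extract speed (Mbit/s), duplex and autoneg state from `ethtool <iface>` text."""
    settings = {}
    for line in ethtool_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        value = value.strip()
        if key == "Speed":
            # ethtool prints e.g. "100Mb/s", or "Unknown!" when the link is down
            match = re.match(r"(\d+)Mb/s", value)
            settings["speed_mbit"] = int(match.group(1)) if match else None
        elif key == "Duplex":
            settings["duplex"] = value
        elif key == "Auto-negotiation":
            settings["autoneg"] = (value == "on")
    return settings

def link_is_degraded(settings: dict, expected_mbit: int = 1000) -> bool:
    """True if the link runs below the expected speed or is not full duplex."""
    speed = settings.get("speed_mbit")
    return speed is None or speed < expected_mbit or settings.get("duplex") != "Full"

# Illustrative sample only (hypothetical, shortened ethtool output).
SAMPLE = """Settings for eth0:
\tSpeed: 100Mb/s
\tDuplex: Full
\tAuto-negotiation: on
"""

if __name__ == "__main__":
    link = parse_link_settings(SAMPLE)
    print(link, "degraded" if link_is_degraded(link) else "ok")
```

On a real machine the text would come from running `ethtool eth0`, and a degraded link could then be renegotiated with `ethtool -r eth0` (as root), which restarts auto-negotiation on that interface.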
I hereby declare the Cisco 3750 to be total crap. For example, while other switches can be rebooted in a minute or so, booting the 3750 takes much longer, and even then it takes another half a minute for the Ethernet interfaces to become active (and it is this way even for newly plugged-in cables). Stay away from Cisco switches (at least for L2 switching; I am not familiar with their L3 gear). HP is much more open, better supported, and generally better.
Sorry for the long delay between my previous post and this one. I have been off-line for 10 days, and I have been busy catching up with my mail queue since then.
Sun, 03 Sep 2006
Computing Server
We had problems with some users who occasionally ran huge computing tasks on a general-purpose staff server, slowing down work for everybody else (including the mail server, home directory file serving, etc.). Because of this, and partly to test the technology as well, we decided to order a separate server dedicated to such huge computing tasks (as well as to our experiments with big NUMA iron).
Yesterday the server finally arrived. It is a Tyan Transport VX50, an eight-CPU server with dual-core CPUs, 16 cores total. Apart from other things, we want to test Oracle on it. When we bought the present IS MU database server, the biggest Opteron-based configurations had four CPUs, so we went for an SGI Altix box (with 2 to 16 Itanium2 CPUs) instead. If the VX50 is stable and fast enough, our next DB server can probably be Opteron-based - who knows whether SGI will still exist as a server manufacturer next year?
So far our VX50 has survived my stress tests, including many parallel kernel compiles, a user-space variant of memtest86, filesystem load on ext3 and XFS, etc. The hardware works, including the SATA controller, hardware sensors, etc.
77160 BogoMips in a single-image system is pretty impressive, isn't it?
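For the curious, 77160 BogoMips over 16 cores works out to 77160 / 16 = 4822.5 per core. A minimal sketch of how the total can be summed on any Linux box, assuming the x86-style `bogomips` field in `/proc/cpuinfo`; the helper name `total_bogomips` and the two-CPU sample text are hypothetical, not copied from the VX50.

```python
def total_bogomips(cpuinfo_text: str) -> float:
    """Sum the per-CPU 'bogomips' values found in /proc/cpuinfo-style text."""
    total = 0.0
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Field name casing differs between architectures, so compare lowercased
        if sep and key.strip().lower() == "bogomips":
            total += float(value)
    return total

# Illustrative two-CPU excerpt (hypothetical values, most fields omitted).
SAMPLE_CPUINFO = """processor\t: 0
bogomips\t: 4822.50

processor\t: 1
bogomips\t: 4822.50
"""

if __name__ == "__main__":
    print(total_bogomips(SAMPLE_CPUINFO))
```

To get the figure for a real machine, pass in the contents of `/proc/cpuinfo` instead of the sample text.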
Probably the most interesting part of this server design is its NUMA topology (see above). They have managed to use all three HyperTransport channels of the AMD 8xx CPUs, even though the resulting configuration is not strictly symmetric. I wonder how the routing of requests over the HT mesh is done. Does it use static routing, or some kind of load balancing? Any pointers, my dear lazyweb?
Fri, 01 Sep 2006
The Future of NetBSD
There was an interesting thread on the *BSD mailing lists, started by one of the founders of NetBSD. The thread is named "The future of NetBSD", and the author states that NetBSD is losing its momentum because of flaws in its development model (go read it now, I will wait :-).
A bit of personal history: my first UN*X on my own PC was 386BSD, version 0.1, then NetBSD 0.8, and for a short period also 0.9. I moved to Linux then, and never came back, although I try to follow the news from all the *BSD projects.
In my opinion, the development model is the reason for the success of Linux. Not an enlightened leader (in the sense of a superior software designer and coder), not the support of big companies. Linus was always open to people who produced working code, even though their code was not well suited for all purposes, provided that their work did not try to rewrite everything from scratch, including a few nearby subsystems (Reiser4 :-). This is why the current SATA/libata (ab)uses lots of SCSI infrastructure, and this is why Linux has some things in /proc and other things in /sys. But Linux has a working driver model (including hotplug, device classes, etc.), nicely coupled to the desktop environment via HAL; Linux has probably the best threads implementation among the free UN*Xes; Linux had a NUMA-aware memory allocator before NetBSD even started to implement large-grained SMP; etc.
In this sense, I can say that the Linux development model follows the Extreme Programming methodology: the solution does not need to be perfect (nobody can design a perfect interface suitable even for hardware which is yet to be developed, for example), but it should work and should not look obviously wrong. Using these lightweight and not over-designed interfaces, Linux can avoid many pitfalls, such as superfluous mid-layers (see the libata or syncppp parts of the Linux kernel for nice examples of how to avoid mid-layers, and the NetBSD DMA interface as an example of how to create them).
The openness matters as well: the Linux kernel has patches flowing in at a rate of about 1000 changesets per week, and even though the interfaces probably get rewritten more often than in NetBSD, they match the actual needs of their users, not some hypothetical aesthetic measures of their designers. I think the same loss of momentum can also happen to OpenSolaris. They have a rigorous patch review process, which slows down the development. So far it seems that most of the work is being done by Sun employees.
I am not happy that NetBSD might be slowly dying, but I think the reasons for it are obvious: it is good to write clean code, but it is definitely not good to let the emphasis on the design itself slow your development process down. Do not design for the future. The future will always be different, and the interfaces will have to be rewritten anyway. The closed and elitist attitude of the core developers does not help either.