Mon, 21 Jan 2008
What Is Luminance?
I have recently enhanced my scanned form recognition software in IS MU to handle not only greyscale scans, but also color ones. It is quite simple - the software converts the image to greyscale in the input phase. During tests, we found a problematic input file - it was recognized differently when scanned in greyscale than when scanned in full color. Before reading on, look at the image below and try to decide what number the software should see there:
I think the correct output should be the digit "3", with the previous digit being recognized as rubbed out. Now look at the color version. There are three variants there: the first one is the output of the scanner in color mode, the second one is the color scan converted to greyscale (using GIMP, but convert(1) and ppmtopgm(1) give similar results), and the last variant is the scanner output in greyscale mode (as seen in the image on this page).
Apparently the conversion to greyscale is not as simple as, for example, averaging the three color channels: (R+G+B)/3.
In the ppmtopgm(1) source code they refer to the ITU-R BT.601.5 standard, which states that the luminance value (essentially the grey level of the greyscale version of the image) should be computed using the following formula:
0.2989*R + 0.5866*G + 0.1145*B
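To see how far the two approaches can drift apart, here is a minimal Python sketch that compares the plain channel average with the weighted formula above. The function names and the sample colors are made up for illustration; they are not taken from ppmtopgm(1) or from the actual scan, and what the scanner really does internally is unknown.

```python
def average_grey(r, g, b):
    """Naive greyscale: plain mean of the three channels (0-255 each)."""
    return (r + g + b) / 3


def luminance_grey(r, g, b):
    """Weighted greyscale using the coefficients quoted above."""
    return 0.2989 * r + 0.5866 * g + 0.1145 * b


# Hypothetical sample colors, chosen only to show the difference.
samples = {
    "pure blue": (0, 0, 255),
    "pure green": (0, 255, 0),
    "white": (255, 255, 255),
}

for name, (r, g, b) in samples.items():
    avg = average_grey(r, g, b)
    lum = luminance_grey(r, g, b)
    print(f"{name:10s}  average={avg:6.1f}  luminance={lum:6.1f}")
```

With these inputs, pure blue and pure green both average to 85, yet their luminance is roughly 29 and 150 respectively. So a blue pen stroke and a green one that look identical under averaging end up at very different grey levels under the weighted formula, which is the kind of difference that can make a stroke visible in one conversion and nearly invisible in another.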
I guess the scanner does something simpler than using this formula, which leads to suboptimal results. When the requirement for the recognition software is "it should recognize what a human would see in the scanned image", I think it correctly recognizes the greyscale scan as "3", and the color scan as "23".