Thu, 03 Apr 2008
EXIF Comment
For an internal project, we need to store comments inside JPEG images.
I think the EXIF tag UserComment is suitable for our purpose (we also need
text in Czech, and the alternative tag, ImageDescription, is strictly
US-ASCII only).
Nevertheless, the character set remains a problem.
The EXIF standard (PDF warning; see the 34th page of the file, numbered
"page 28" near the bottom) defines the UserComment data so that the first
8 bytes identify the character set (the string "ASCII", "JIS", or "UNICODE",
padded to 8 bytes with null bytes), followed by the comment data itself.
The problem is what "UNICODE" means: is it UTF-8, UTF-16, or something else?
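To make the layout concrete, here is a minimal sketch in core Perl that splits a raw UserComment value into the 8-byte character set identifier and the comment bytes. The helper name split_user_comment is my own invention for illustration and not part of any library, and the sketch deliberately does not try to decide how the comment bytes should be decoded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a raw UserComment value into the character
# set identifier (first 8 bytes, null padding removed) and the payload.
sub split_user_comment {
    my ($raw) = @_;
    return if length($raw) < 8;      # too short to hold the 8-byte header
    my $code    = substr($raw, 0, 8);
    my $payload = substr($raw, 8);
    $code =~ s/\0+\z//;              # strip the trailing null padding
    return ($code, $payload);
}

my ($code, $payload) = split_user_comment("UNICODE\0hello");
print "charset=$code payload=$payload\n";   # prints "charset=UNICODE payload=hello"
```

The payload is returned as raw bytes; what they mean for "UNICODE" is exactly the open question.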
I have tried to set the comment using the Exiv2 utility, and to read it
back with the Image::ExifTool Perl library.
The following code prints the raw UserComment value (i.e. the string
"UNICODE\0my_own_comment_as_utf8_bytes"):
#!/usr/bin/perl -w
use Image::ExifTool;
# PrintConv => 0 asks for the raw tag value, without print conversion
my $info = Image::ExifTool::ImageInfo("exif_comment.jpg",
    { Charset => "UTF8", PrintConv => 0 });
print $info->{UserComment}, "\n";
However, with PrintConv=>1 it prints garbage, so the UNICODE charset
in EXIF probably means something different than UTF-8.
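One plausible explanation for the garbage, and it is only an assumption on my part, is that the reader treats the bytes after the header as 16-bit units rather than as UTF-8. The sketch below uses the core Encode module to show what happens to the UTF-8 bytes of "kůň." when they are decoded as UTF-16BE; the byte order is picked arbitrarily for the demonstration, and code_points is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: render a string as a list of hex code points,
# so the output does not depend on the terminal encoding.
sub code_points {
    my ($str) = @_;
    return join ' ', map { sprintf('U+%04X', ord($_)) } split(//, $str);
}

# "kůň." written with escapes so the script itself stays pure ASCII.
my $text = "k\x{16f}\x{148}.";

# What this experiment stored after the "UNICODE\0" header: UTF-8 bytes.
my $payload = encode('UTF-8', $text);

# Assumption: a reader that interprets the payload as big-endian UTF-16.
my $misread = decode('UTF-16BE', $payload);

print code_points($text), "\n";      # U+006B U+016F U+0148 U+002E
print code_points($misread), "\n";   # U+6BC5 U+AFC5 U+882E
```

Four characters turn into three unrelated ones, which is the kind of output I would call garbage. This does not prove what any particular program does; it only shows that the two interpretations are incompatible.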
So, what does your favourite image handling program display as the
EXIF UserComment for the above image? It should read:
"Příliš žluťoučký kůň. こんにちは。".