Thu, 03 Apr 2008
EXIF Comment
For an internal project, we need to store comments inside JPEG images.
I think the EXIF tag UserComment is suitable for our purpose (we also need
texts in the Czech language, and the alternative tag, ImageDescription,
is strictly US-ASCII only).
Nevertheless, there is still a problem with character sets.
The EXIF standard (PDF warning; see the 34th page, numbered "page 28" near
the bottom) defines the UserComment data so that the first 8 bytes contain
the charset info (the string "ASCII", "JIS", or "UNICODE", padded to 8 bytes
with null bytes), followed by the comment data. The problem is what
"UNICODE" means. Is it UTF-8, UTF-16, or something else?
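The layout described above (8-byte charset field, then the comment bytes) can be taken apart with a few lines of Perl. This is only a minimal sketch: the helper name split_user_comment is made up for illustration, and the sample value assumes a UTF-8 payload, which is exactly the open question here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a raw UserComment value into the charset
# name (first 8 bytes, null padding removed) and the remaining payload.
sub split_user_comment {
    my ($raw) = @_;
    return unless defined $raw && length($raw) >= 8;
    my $code    = substr($raw, 0, 8);
    my $payload = substr($raw, 8);
    $code =~ s/\0+\z//;    # strip the null padding
    return ($code, $payload);
}

# Sample value: "UNICODE\0" followed by the UTF-8 bytes of "Příliš"
# (the payload encoding is an assumption made for this example).
my $raw = "UNICODE\0" . "P\xc5\x99\xc3\xadli\xc5\xa1";
my ($code, $data) = split_user_comment($raw);
printf "charset: %s, payload: %d bytes\n", $code, length $data;
```

The helper only separates the two parts; it deliberately does not guess how the payload is encoded.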
I have tried to set the comment using the Exiv2 utility, and to read it
with the Image::ExifTool Perl library.
The following code prints the raw UserComment value (i.e. the string
"UNICODE\0my_own_comment_as_utf8_bytes"):
#!/usr/bin/perl -w
use Image::ExifTool;

# PrintConv => 0 turns off print conversion, so UserComment comes back raw.
my $info = Image::ExifTool::ImageInfo("exif_comment.jpg",
    { Charset => "UTF8", PrintConv => 0 });
print $info->{UserComment}, "\n";
However, with PrintConv=>1 it prints garbage, so the UNICODE charset
in EXIF probably means something other than UTF-8.

So, what does your favourite image handling program display as the
EXIF UserComment for the above image? It should read:
"Příliš žluťoučký kůň. こんにちは。".
4 replies for this story:
misch wrote:
Firefox Exif Viewer 1.40 says: User Comment (Hex) = 0x55,0x4e,0x49,0x43,0x4f,0x44,0x45,0x00,0x50,0xc5,0x99,0xc3,0xad,0x6c,0x69,0xc5,0xa1,0x20,0xc5,0xbe,0x6c,0x75,0xc5,0xa5,0x6f,0x75,0xc4,0x8d,0x6b,0xc3,0xbd,0x20,0x6b,0xc5,0xaf,0xc5,0x88,0x2e,0x20,0xe3,0x81,0x93,0xe3,0x82,0x93,0xe3,0x81,0xab,0xe3,0x81,0xa1,0xe3,0x81,0xaf,0xe3,0x80,0x82 User Comment Character Code = Unicode So it recognizes unicode text, but displays it as raw data :-(
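The dump above can be checked mechanically. The following sketch uses only the core Encode module; it drops the 8-byte charset field and decodes the rest as strict UTF-8, which succeeds only if the payload really is valid UTF-8. The variable names are arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hex dump as reported by Firefox Exif Viewer 1.40.
my $dump = '0x55,0x4e,0x49,0x43,0x4f,0x44,0x45,0x00,0x50,0xc5,0x99,0xc3,'
         . '0xad,0x6c,0x69,0xc5,0xa1,0x20,0xc5,0xbe,0x6c,0x75,0xc5,0xa5,'
         . '0x6f,0x75,0xc4,0x8d,0x6b,0xc3,0xbd,0x20,0x6b,0xc5,0xaf,0xc5,'
         . '0x88,0x2e,0x20,0xe3,0x81,0x93,0xe3,0x82,0x93,0xe3,0x81,0xab,'
         . '0xe3,0x81,0xa1,0xe3,0x81,0xaf,0xe3,0x80,0x82';

# Turn the comma-separated hex values back into a byte string.
my $raw = pack 'C*', map { hex($_) } split /,/, $dump;

# Skip the 8-byte charset field ("UNICODE\0") and decode the payload.
# FB_CROAK makes decode() die on anything that is not valid UTF-8.
my $payload = substr($raw, 8);
my $text    = decode('UTF-8', $payload, Encode::FB_CROAK);

binmode STDOUT, ':encoding(UTF-8)';
print "$text\n";
```

The decoding succeeds and yields the test sentence from the post, so this version of the image carries UTF-8 bytes after the "UNICODE" header.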
Věroš wrote:
We have used EXIF heavily at Cestovatel for more than two years. Most of the images are commented with Zoner Photo Studio, and their UNICODE EXIF is usually saved as UTF-16. BTW: Try XMP ( http://www.adobe.com/products/xmp/ ). It's an XML-based solution, so you don't have to bother with encoding.
Milan Zamazal wrote:
exiv2 displays it correctly; showfoto/digikam displays empty rectangles in place of all characters. Other programs I've tried either don't display user comments at all, or they display them as generic unknown tags (in hex).
Yenya wrote: OK, next try
OK, next try - this time in UTF-16. Exiv2 does not display it correctly, Image::ExifTool does. Please reload the above image and retry with it. Thanks!
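For anyone who wants to produce such a value by hand: a minimal sketch of building a UTF-16 UserComment value with the core Encode module. The helper name build_user_comment is hypothetical, and the byte order is an assumption that has to be chosen explicitly, so the sketch prints both variants.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: 8-byte charset field followed by the encoded text.
# $encoding is 'UTF-16BE' or 'UTF-16LE'; which one a given reader
# expects is an assumption and not settled by this sketch.
sub build_user_comment {
    my ($text, $encoding) = @_;
    return "UNICODE\0" . encode($encoding, $text);
}

my $text = "P\x{159}\x{ed}li\x{161}";    # "Příliš"

for my $enc ('UTF-16BE', 'UTF-16LE') {
    my $raw     = build_user_comment($text, $enc);
    my $payload = substr($raw, 8);

    # Show the payload bytes so the two byte orders can be compared.
    printf "%-8s %s\n", $enc,
        join ' ', map { sprintf '%02x', $_ } unpack 'C*', $payload;

    # Round trip: decoding with the same encoding gives the text back.
    die "round trip failed for $enc" unless decode($enc, $payload) eq $text;
}
```

The two lines of output differ only in the order of the bytes within each pair, which is one more way two programs can disagree about the same comment.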