Tue, 27 May 2014
MPEG Transport Stream
Today I investigated why some files with the .MTS extension do not have
their MIME type detected. One such file starts with the following bytes:
$ od -tx1 file.mts | head -n 1
0000000 00 00 00 00 47 40 00 10 00 00 b0 11 00 00 c1 00
Compared with the current /usr/share/magic from Fedora 20, this is quite
similar to the following entry:
0 belong&0xFF5FFF10 0x47400010
>188 byte 0x47 MPEG transport stream data
Also, the shared-mime-info package contains something similar:
<match type="big32" value="0x47400010" mask="0xff4000df" offset="0"/>
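As a rough illustration of what the first-level test of the magic entry does, here is a small Python sketch. The helper name matches_at is mine, and the second-level ">188 byte 0x47" check is left out because only the first 16 bytes of the file are shown above. It reads a big-endian 32-bit word at a given offset, applies the mask, and compares the result with the expected value:

```python
import struct

MASK = 0xFF5FFF10   # mask from the magic entry
VALUE = 0x47400010  # expected value after masking

def matches_at(data: bytes, offset: int) -> bool:
    """Apply the masked big-endian 32-bit comparison at the given offset."""
    if len(data) < offset + 4:
        return False
    (word,) = struct.unpack_from(">I", data, offset)
    return (word & MASK) == VALUE

# The 16 bytes printed by od(1) above.
header = bytes.fromhex("00000000474000100000b0110000c100")

print(matches_at(header, 0))  # False
print(matches_at(header, 4))  # True
```

The same test that fails at offset 0 succeeds at offset 4.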
Note that both files expect the 0x47 byte to be at the beginning of the
file, not after four NUL bytes as in my example. Yet mplayer(1) can play
these files, and ffprobe(1) detects them as "mpegts" with an audio and a
video stream. Looking into the ffmpeg source, I discovered that it does
horrible things in order to detect a file format. For example, for mpegts,
it scans the file for a 0x47 byte at an offset divisible by four, and then
evaluates some other conditions. Each format's probe function returns a
score, and the file format with the greatest score wins. Ugly as hell, but
probably needed for handling real-world data files.
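To make the idea of score-based probing concrete, here is a heavily simplified Python sketch. It is not ffmpeg's actual algorithm: the function names, the 188-byte default packet size (taken from the ">188" in the magic entry), and the scoring rule (count consecutive sync bytes) are all my own assumptions for illustration.

```python
def sync_score(data: bytes, start: int, packet_size: int = 188) -> int:
    """Count consecutive 0x47 sync bytes found every packet_size bytes."""
    score = 0
    pos = start
    while pos < len(data) and data[pos] == 0x47:
        score += 1
        pos += packet_size
    return score

def best_sync_offset(data: bytes, packet_size: int = 188, step: int = 4):
    """Try each candidate start offset; return (offset, score) with the top score."""
    candidates = range(0, min(len(data), packet_size), step)
    return max(
        ((off, sync_score(data, off, packet_size)) for off in candidates),
        key=lambda pair: pair[1],
        default=(0, 0),
    )

# Synthetic data: four NUL bytes, then five 188-byte packets starting with 0x47.
data = b"\x00" * 4 + (b"\x47" + b"\x00" * 187) * 5
print(best_sync_offset(data))  # (4, 5)
```

A fixed-offset rule cannot express "find the offset with the most plausible packet train", which is the kind of test this sketch performs.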
So, what should I do next? Should I submit a patch to file(1) and
shared-mime-info to also accept the magic number at offset 4?
Are we getting to the point where the already-complicated language
of the /usr/share/magic file is not powerful enough?