I can’t believe it was May 2013 when I first started writing up some of the challenges with these Samsung Files….
And then in October that year I discussed another take on this format, this time using the Xvid Codec….
Recently, I had to revisit these files in order to investigate a piece of evidential video, and found that a lot of the hard work has now been alleviated by Amped FIVE. I wrote about how FIVE can make life a lot easier in an Amped blog post.
As a follow up, I thought it worthwhile to dig a little deeper in an attempt to establish exactly what is going on with these files.
If you have not read the other posts and the Amped Blog first, it may be an idea to do so now…
To recap, we have a DivX AVI file with an associated SMI file containing the timestamps. Although the frame rate of 29.812 FPS is set by the container, the video stream itself holds only a quarter of that number of unique frames. Each frame is duplicated three times to pad out to the container rate, so the true frame rate is roughly 7.45 FPS (29.812 ÷ 4). The .smi file is linked to the frame rate in the container rather than to individual image frames.
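The arithmetic above can be sanity-checked in a few lines. This is a minimal sketch; the only figures taken from the file described here are the container rate and the duplication count:

```python
# Each unique frame appears four times (the original plus three
# duplicates), so the true rate is the container rate divided by four.
container_fps = 29.812
copies_per_frame = 4          # 1 original + 3 duplicates
true_fps = container_fps / copies_per_frame
frame_duration = 1 / true_fps  # time each unique frame is on screen

print(f"true frame rate: {true_fps:.3f} FPS")      # ~7.453 FPS
print(f"frame duration:  {frame_duration:.5f} s")  # ~0.13417 s
```

Note that this duration matches the constant per-frame interval discussed later in the post.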
In order to take a closer look at the frame structure, I used FFprobe to output the frame data as XML, which can then be imported into a spreadsheet. Part of that is seen below…
I have added another column for DURATION. This is calculated by an Excel macro as the length of time each frame should be displayed before the next one is presented.
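The same DURATION calculation can be sketched outside of Excel: each frame's display duration is simply the gap between its presentation timestamp and the next. The `pts_time` values below are illustrative, not taken from the real file:

```python
# Presentation timestamps (seconds), as reported per frame by FFprobe.
# These sample values are invented for illustration.
pts_times = [0.0, 0.13417, 0.26833, 0.40250, 0.53667]

# Duration of each frame = next timestamp minus current timestamp.
durations = [round(b - a, 5) for a, b in zip(pts_times, pts_times[1:])]
print(durations)
```

With a constant frame interval every duration comes out at roughly 0.13417 seconds; in the subtitle timing, as we will see, the gaps are anything but constant.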
Next up we have the raw output from the .smi subtitle file. Although these can be viewed in a text editor such as Notepad, I use Subtitle Workshop.
We now have two timing references: one for the frames of video and one for the subtitle.
I have highlighted the 2nd second.
How can this end before the 3rd second begins? This pattern continues throughout the video, with some seconds being shorter than others.
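These variable-length "seconds" can be exposed programmatically by reading the Start values (in milliseconds) from the SAMI SYNC tags and measuring the gap each timestamp covers. The sample below is invented for illustration, not taken from the evidential file:

```python
import re

# A fabricated fragment in SAMI (.smi) form: each SYNC Start value is a
# millisecond offset, and each subtitle line carries a DVR timestamp.
smi_sample = """
<SYNC Start=0><P Class=ENCC>10:15:01</P>
<SYNC Start=940><P Class=ENCC>10:15:02</P>
<SYNC Start=1830><P Class=ENCC>10:15:03</P>
<SYNC Start=2950><P Class=ENCC>10:15:04</P>
"""

starts = [int(m) for m in re.findall(r"<SYNC Start=(\d+)>", smi_sample)]
gaps = [b - a for a, b in zip(starts, starts[1:])]
print(gaps)  # each displayed "second" lasts a different length of time
```

Here the displayed seconds span 940 ms, 890 ms and 1120 ms of playback, which is exactly the kind of unevenness seen in the real subtitle file.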
The video timing, by contrast, is static, with a constant 0.13417 seconds per frame.
It is clear that the creation of the subtitle involves a fair amount of guesswork. I have no doubt that the start and end times are correct, as set by the DVR, but the times in between are estimates, and they may vary from the actual time by half a second or more.
The only way to establish exactly what is going on would be to make a test recording of a scene containing a digital clock that displays milliseconds.
Macroblock analysis revealed a high compression ratio with very little newly encoded data. Within a P-frame, you can see a single newly encoded macroblock sitting in a sea of predicted (green) blocks; only large changes between frames produced newly encoded macroblocks.
The analysis of the raw data reveals what we can and cannot rely on. Can we rely on the timing? If not, why not? Can we rely on the position of small, fast-moving objects?
I hope that by going back over some of the issues already discussed in the previous posts, I have highlighted the importance of initial file analysis. Relying on information without an in-depth understanding of the underlying data could be highly detrimental.