Page MenuHome

Fix VSE video strip duration calculation
ClosedPublic

Authored by Sebastian Parborg (zeddb) on Jul 14 2021, 6:58 PM.

Details

Summary

The video duration was not read correctly from the video file.

It would use the global duration of the file which does in some cases not line up with the actual duration of the video stream.

Now we take the video stream duration and start time into account when calculating the strip duration.

Diff Detail

Event Timeline

Sebastian Parborg (zeddb) requested review of this revision.Jul 14 2021, 6:58 PM
Sebastian Parborg (zeddb) created this revision.

I don't quite understand the part with audio stream. stream_dur = (double)pFormatCtx->duration / AV_TIME_BASE - (video_start - audio_start); would make sense to me without the audio part. Can you provide some sample file that should be fixed by this patch?

source/blender/imbuf/intern/anim_movie.c
619

I guess this could be clarified a bit, because it's not taken strictly from container?

I don't quite understand the part with audio stream. stream_dur = (double)pFormatCtx->duration / AV_TIME_BASE - (video_start - audio_start); would make sense to me without the audio part. Can you provide some sample file that should be fixed by this patch?

The file in https://developer.blender.org/T87967 is fixed with that.

Basically the pFormatCtx->duration is the absolute duration of all streams. However the start times do not have to start at zero.
So all streams could have a start time of 200, but the pFormatCtx->duration could be 200 as well.
If we simply remove the video_start time from the duration, we would have a total duration of 0 in this example.

To correctly calculate the true video duration, we need to remove the start time difference between the audio and the video stream:
200 - (200-200) = 200

In the video file in T87967 the start time values are huge, so if we don't do this difference, the duration would become negative and incorrect.

source/blender/imbuf/intern/anim_movie.c
619

Right, here we are reading the duration from the container, but using the stream start times to deduce the actual duration of the video stream.

I don't quite understand the part with audio stream. stream_dur = (double)pFormatCtx->duration / AV_TIME_BASE - (video_start - audio_start); would make sense to me without the audio part. Can you provide some sample file that should be fixed by this patch?

The file in https://developer.blender.org/T87967 is fixed with that.

That file seemed to have correct duration in master (1130 frames). Also video_stream->duration != AV_NOPTS_VALUE evaluates to true and so it doesn't go through branch implemented in this patch. This may be due to platform difference perhaps?

Basically the pFormatCtx->duration is the absolute duration of all streams. However the start times do not have to start at zero.
So all streams could have a start time of 200, but the pFormatCtx->duration could be 200 as well.
If we simply remove the video_start time from the duration, we would have a total duration of 0 in this example.

To correctly calculate the true video duration, we need to remove the start time difference between the audio and the video stream:
200 - (200-200) = 200

I don't get the point though.

If streams don't start at the same time, then formula video_duration = stream_duration - (video_start - audio_start) makes it depend on audio stream start. Consider file where video stream starts earlier than audio
like video_duration = 10s - (0s - 1s) that would result in video_duration = 11s. Now if you don't change length of these streams only swap their start time, it would result in video_duration = 10s - (1s - 0s) so video_duration = 9s.

Do you want to check if there is any dead space before video stream, so it can be substracted from total duration?
In such case there can be more audio, video streams and perhaps subtitle streams too. So I think you should look for mimimum start time in all streams. Then you could do video_duration = stream_duration - (video_start - min_stream_start).
I don't think, that all streams must end at the same time though, that would make this calculation flawed. This would be reasonable expectation for "supported" file format though. I think, that we are already making very fine adjustments.

That file seemed to have correct duration in master (1130 frames). Also video_stream->duration != AV_NOPTS_VALUE evaluates to true and so it doesn't go through branch implemented in this patch. This may be due to platform difference perhaps?

My bad. I used that file to check that the code gave the same numbers as the actual saved durations in that file to confirm that the calculations was correct.
So it is not a platform difference. It is just that I forgot that the file had duration already.

I think it might have been the mts file I shared with you eariler that got fixed.

I don't get the point though.

If streams don't start at the same time, then formula video_duration = stream_duration - (video_start - audio_start) makes it depend on audio stream start. Consider file where video stream starts earlier than audio
like video_duration = 10s - (0s - 1s) that would result in video_duration = 11s. Now if you don't change length of these streams only swap their start time, it would result in video_duration = 10s - (1s - 0s) so video_duration = 9s.

Do you want to check if there is any dead space before video stream, so it can be substracted from total duration?
In such case there can be more audio, video streams and perhaps subtitle streams too. So I think you should look for mimimum start time in all streams. Then you could do video_duration = stream_duration - (video_start - min_stream_start).
I don't think, that all streams must end at the same time though, that would make this calculation flawed. This would be reasonable expectation for "supported" file format though. I think, that we are already making very fine adjustments.

From what I can tell, audio always starts before the video stream. (This is the assumption I made in the code, I guess I should add a comment here about it?)
This can of course be messed up with files with multiple audio and video streams.
However we don't support files with multiple streams like that currently either way.
We just take the first audio and video stream and import them.

That file seemed to have correct duration in master (1130 frames). Also video_stream->duration != AV_NOPTS_VALUE evaluates to true and so it doesn't go through branch implemented in this patch. This may be due to platform difference perhaps?

My bad. I used that file to check that the code gave the same numbers as the actual saved durations in that file to confirm that the calculations was correct.
So it is not a platform difference. It is just that I forgot that the file had duration already.

I think it might have been the mts file I shared with you eariler that got fixed.

If you mean newmobcal1920_12mbps.ts that seems to also use video_stream->duration value.

From what I can tell, audio always starts before the video stream. (This is the assumption I made in the code, I guess I should add a comment here about it?)

If this is assumption, it should be mentioned. Also mention why is this needed - Actual video stream may be shorter than pFormatCtx->duration.
Should we use BLI_assert here perhaps? This way if things don't work well in release, we would immediately know what is going on. Though it may be undesirable too if you just want to debug something different in file which violates this assumption.

Checked some files, and it seems that audio usually starts before video, which is probably to be expected, because audio packets are presented for shorter time period. So this is probably reasonable expectation.
Going through all streams may not add much complexity, but there may be some unexpected side effects, so I can't really recommend that approach.

This can of course be messed up with files with multiple audio and video streams.
However we don't support files with multiple streams like that currently either way.

Actually, we do support files with multiple video streams. It's argument for IMB_open_anim function, and exposed as strip property in VSE.

This revision is now accepted and ready to land.Jul 22 2021, 3:01 AM
Sebastian Parborg (zeddb) marked an inline comment as done.

Added an assert and clarified with comments! :)