Audio imported from a video is different between DAWs - can anyone explain why?

OMF/AAF place audio according to embedded timecode, so these are not off-sync.
When an OMF or AAF is off-sync, then you have other troubles than video encoding.

Embedded audio (and especially highly compressed video/audio) is never completely accurate.
AAC and MP3 audio are always off-sync, and that can be compensated for at video encoding and/or at video import.
I have no idea if an application can know if it is compensated for upon encoding or not.
I also don’t have a clue what the usual tolerances are. (But these should be less than 2 frames, that is true.)
And I also don’t know which encoders actually compensate for that.
Except for ER Media ToolKit from Audio Spot; they have an option for that in their encoder.


Bottom line is, AAF/OMF are (pretty much always) correct.
Embedded audio in video rarely is.


Fredo

Aaahh okay, that confirms my guess that I’ve been doing it correctly all along, though there are still many unanswered questions as to why this phenomenon exists. Thanks Fredo.

Pretty much what I was saying.

This is a side-effect of encoding/decoding from WAV to AAC/MP3 (and it is even worse from AAC/MP3 to AAC/MP3).
But as said, I don’t have a clue what and who is correct and/or incorrect.
It’s just a thing that happens and should be compensated for at encoding and/or decoding.

Bottom line, you can’t possibly know.
And stay away from it.

Fredo

Got it. Thanks a lot Fredo!

Introduction and background: Because video decoding takes a long time, some highly compressed, complex codecs are designed to offset their included audio streams in order to ‘force’ video/audio playback synchronization when played in their native form through various media players on various devices. However, converting those codecs into uncompressed form may produce the opposite result!
If this is not treated in the conversion process, ALL H26x codec variations will introduce delayed audio of some sort, no question about it.
Therefore, I designed a unique smart algorithm that detects those potential offset problems in both the input and the target audio/video and, only if necessary, intelligently adjusts the audio stream offset to keep it ‘in sync’ with the target video. The result is less than 1 ms sync accuracy between the video and audio streams with most popular mp4 files, without ever risking negative values that would cut off the beginning of the audio material (this method is unique to ER Media ToolKit, in both video and audio conversion and in ‘Extract Audio to Wave’).
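
The "never go negative" rule can be illustrated with a minimal sketch. This is NOT the ER Media ToolKit algorithm, which is not described in this thread; the function `safe_trim` and its parameters are hypothetical. It only shows one way a correction could be clamped so that it removes leading padding but never cuts into audible material.

```python
def safe_trim(samples, estimated_delay, silence_threshold=0.0):
    """Remove up to `estimated_delay` leading samples, but never audible ones.

    Hypothetical helper for illustration only.
    Returns the trimmed list and the number of samples actually removed.
    """
    # Count the run of leading samples at or below the silence threshold.
    leading_silence = 0
    for s in samples:
        if abs(s) > silence_threshold:
            break
        leading_silence += 1
    # Clamp: never trim into audible material, never trim a negative amount.
    trim = max(0, min(estimated_delay, leading_silence))
    return samples[trim:], trim


audio = [0.0] * 5 + [1.0, 0.5]
print(safe_trim(audio, 8))   # the estimate overshoots, only the 5 silent samples go
print(safe_trim(audio, 3))   # the estimate is smaller, 2 silent samples remain
```

With a clamp like this, an over-estimated delay leaves the audio slightly late at worst instead of deleting its beginning.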


The Test:
Original audio positioning: 120 ms from start [5760 samples at 48 kHz], i.e. 3 frames at 25 fps
Picture: one-frame white burst at the same spot [fps=25]
Rendered with 5 different H26x codec variations (there are many more variations, but I was short of time; the current test took 5 hours)
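
For anyone re-checking the numbers below, the unit conversions used in this test can be sketched as follows. The sample rate and frame rate come from the test description; the function names are just illustrative.

```python
SAMPLE_RATE = 48_000  # Hz, as stated in the test
FPS = 25              # video frames per second, as stated in the test


def ms_to_samples(ms, rate=SAMPLE_RATE):
    """Convert milliseconds to a whole number of audio samples."""
    return round(ms * rate / 1000)


def samples_to_ms(samples, rate=SAMPLE_RATE):
    """Convert a sample count (or sample offset) to milliseconds."""
    return samples * 1000 / rate


def ms_to_frames(ms, fps=FPS):
    """Convert milliseconds to video frames."""
    return ms * fps / 1000


print(ms_to_samples(120))          # 5760
print(ms_to_frames(120))           # 3.0
print(samples_to_ms(5245 - 5760))  # about -10.73 ms (a 515-sample cut)
```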

Opened in several professional A/V applications:
PT (ProTools 12.5 PC)

QT h264 aac 120ms [5760] -on spot-
QT h263 aac 120ms [5760] -on spot-
Premiere2018 mp4 h264 aac 109ms [5245] / -515 samples = -10.7291666 ms deleted from the beginning of the audio material !!!
Vegas(16 Magix codec) mp4 h264 aac 120ms [5760] -on spot-
FFmpeg mp4 h264 aac 75.8125 ms [3639] / -2121 samples = -44.1875 ms deleted from the beginning of the audio material !!!
Vegas(15 Sony codec) mp4 h264 aac 79.9791666 ms [3839] / -1921 samples = -40.020833 ms deleted from the beginning of the audio material !!!

SF (SoundForge Pro 12.1 Magix)

QT h264 aac 117ms [5616] / -144 samples = -3 ms deleted from the beginning of the audio material!!!
QT h263 aac 117ms [5616] / -144 samples = -3 ms deleted from the beginning of the audio material!!!
Premiere2018 mp4 h264 aac 109ms [5232] / -528 samples = -11 ms deleted from the beginning of the audio material !!!
Vegas(16 Magix codec) mp4 h264 aac 120ms [5760] -on spot-
FFmpeg mp4 h264 aac 75ms [3600] / -2160 samples = -45 ms deleted from the beginning of the audio material !!!
Vegas(15 Sony codec) mp4 h264 aac 120ms [5760] -on spot-

SN (Steinberg Nuendo 8.3.10)

QT h264 aac 163ms [7870] / +2110 samples = +43.958333 ms added to the beginning of the audio material
QT h263 aac 163ms [7869] / +2111 samples = +43.979166 ms added to the beginning of the audio material
Premiere2018 mp4 h264 aac 153ms [7357] / +1597 samples = +33.2708333 ms added to the beginning of the audio material
Vegas(16 Magix codec) mp4 h264 aac 163ms [7869] / +2109 samples = +43.9375 ms added to the beginning of the audio material
FFmpeg mp4 h264 aac 141.333 ms [6784] / +1024 samples = +21.3333 ms added to the beginning of the audio material
Vegas(15 Sony codec) mp4 163.979166 ms [7871] / +2111 samples = +43.979166 ms added to the beginning of the audio material

VP (Vegas Pro 15 Magix)

QT h264 aac 98.666 ms [4736] / -1024 samples = -21.3333 ms deleted from the beginning of the audio material !!!
QT h263 aac 120ms [5760] -on spot-
Premiere2018 mp4 h264 aac 88ms [4224] / -1536 samples = -32 ms deleted from the beginning of the audio material !!!
Vegas(16 Magix codec) mp4 h264 aac 98.666 ms [4736] / -1024 samples = -21.3333 ms deleted from the beginning of the audio material !!!
FFmpeg mp4 h264 aac 54.667 ms [2624] / -3136 samples = -65.3333 ms deleted from the beginning of the audio material !!!
Vegas(15 Sony codec) mp4 h264 aac 98.666 ms [4736] / -1024 samples = -21.3333 ms deleted from the beginning of the audio material !!!
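
Several of the offsets above are suspiciously round when expressed in AAC frames. The sketch below is only a sanity check on the numbers, not a proven explanation. It rests on two assumptions that are not from this thread: that the files use AAC-LC, which packs 1024 samples per frame, and that the commonly cited AAC encoder delay ("priming") is 2112 samples.

```python
AAC_FRAME = 1024      # samples per frame (assumption: AAC-LC)
PRIMING = 2112        # commonly cited AAC encoder delay (assumption)
SAMPLE_RATE = 48_000  # Hz, from the test description


def describe(offset):
    """Express a sample offset in milliseconds and in AAC frames."""
    ms = offset * 1000 / SAMPLE_RATE
    frames = offset / AAC_FRAME
    return f"{offset:+d} samples = {ms:+.3f} ms = {frames:+.2f} AAC frames"


# Offsets taken from the Nuendo and Vegas Pro results above.
for offset in (-1024, -1536, -3136, +1024, +2110):
    print(describe(offset))

# The largest cut equals one priming block plus one frame,
# and the Nuendo offsets sit within 3 samples of the priming figure.
print(PRIMING + AAC_FRAME)   # 3136
print(PRIMING - 2110)        # 2
```

If those assumptions hold, the pattern would suggest that each application trims or keeps a different number of whole AAC frames at import.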


Converted to PCM Wave using QT 7.7.9, then imported into PT, SF, SN and VP (all show the same results, as they ought to!)

QT h264 aac 120ms [5761] -on spot-
QT h263 aac 120ms [5761] -on spot-
Premiere2018 mp4 h264 aac 109ms [5232] / -528 samples = -11 ms deleted from the beginning of the audio material !!!
Vegas(16 Magix codec) mp4 aac 120ms [5761] -on spot-
FFmpeg mp4 h264 aac 76ms [3649] / -2111 samples = -43.9791666 ms deleted from the beginning of the audio material !!!
Vegas(15 Sony codec) mp4 h264 aac 120ms [5761] -on spot-


Converted to PCM Wave using ER Media ToolKit (1.4.38), then imported into PT, SF, SN and VP (all show the same results, as they ought to!)

QT h264 aac 126ms [6048] / +288 samples = +6 ms added to the beginning of the audio material
QT h263 aac 120ms [5760] -on spot-
Premiere2018 mp4 h264 aac 120ms [5760] -on spot-
Vegas(16 Magix codec) mp4 h264 aac 127ms [6096] / +336 samples = +7 ms added to the beginning of the audio material
FFmpeg mp4 h264 aac 120ms [5760] -on spot-
Vegas(15 Sony codec) mp4 h264 aac 121ms [5808] / +48 samples = +1 ms added to the beginning of the audio material


Conclusion:
To be honest I was surprised. I ran this test 18 months ago, and back then not a single application tried to deal with the issue. Now they all try (except for Steinberg), and they suck at it (excuse me). A fix must NEVER push the offset into negative values, NEVER EVER!
Cutting off the beginning of the audio material is a big no-no, and doing it by more than 15 ms is a total disaster.
I will always prefer software that does nothing at all over software that does irreparable damage.
Even if we leave aside the odd duck (H263) and analyze only the pure H264 results, it still looks very bad.
It doesn’t look good on Avid’s part: they do try to fix the issue, but very poorly, focusing on a single calculation and blind to everything else, as if the world revolves around Apple QT. The same goes for Magix.
It looks like both are using the same flawed, assumption-based algorithm, which works for some files and totally destroys others.
Steinberg, for the time being, is sitting on the bench. They should get in the race, but be smart about it and not follow the other vendors’ horrific mistakes.
As far as I can tell, the ER Media ToolKit algorithm is the smartest and the safest (at most +336 samples, never negative values): it gets very close without risking audio cut-off.
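
The criteria used in this conclusion (padding is tolerable, any cut-off is bad, a cut-off of more than 15 ms is a disaster) can be sketched as a small classifier. The function name and the 1-sample tolerance are assumptions for illustration; the tolerance is there only because 5761 is listed as "on spot" in the QT conversion results.

```python
SAMPLE_RATE = 48_000   # Hz, from the test description
REFERENCE = 5760       # expected burst position in samples (120 ms)
SEVERE_CUT_MS = 15     # "more than 15 ms" threshold from the conclusion


def classify(position, tolerance=1):
    """Label a measured burst position relative to the reference."""
    offset = position - REFERENCE
    if abs(offset) <= tolerance:
        return "on spot"
    if offset > 0:
        return "padded"   # audio added at the start, nothing lost
    cut_ms = -offset * 1000 / SAMPLE_RATE
    return "severe cut-off" if cut_ms > SEVERE_CUT_MS else "cut-off"


# Positions reported for the ER Media ToolKit conversion.
er_positions = [6048, 5760, 5760, 6096, 5760, 5808]
er_offsets = [p - REFERENCE for p in er_positions]

print(min(er_offsets), max(er_offsets))   # 0 336
print(classify(5245))                     # cut-off (about 10.7 ms)
print(classify(3639))                     # severe cut-off (about 44.2 ms)
```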

All my test files are available for download and inspection: https://goo.gl/oikgAT
It would be nice if someone tested these in other A/V software and shared the results.

All the best,
Sagi Gal