A/V sync drift in long capture



Hello,

I've built a DirectShow-based live capture application (based on AMCap
in the SDK) and need to use it for very long captures (several days or
longer), but after several hours the audio and video are noticeably out
of sync (about 100ms every 10hrs). I've verified that there are no
disk I/O problems, that the CPU isn't maxed out, and so on. Also, I'm
not using the AVI file format so it's perfectly acceptable for the
video frame rate to fluctuate slightly, as long as the audio and video
remain more or less in sync over time. I build a normal capture graph
but instead of writing samples to an AVI file, I have a custom filter
that has an audio and video input pin.

Some background:

I've learned that the audio sample start/end times are sometimes bogus
(and I *think* I understand the 'why' behind this - the audio gets
buffered and delivered in chunks). To compensate, I first tried simply
writing out the audio samples end-to-end, using the data length to
calculate their true sizes. But eventually I realized that even though
the source said it was generating samples at 44100Hz, from the
perspective of my application they are actually arriving at a rate of
about 44090Hz.
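
To put a number on that mismatch, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the elapsed time is assumed to come from some monotonic clock. With the figures above (44100Hz nominal, 44090Hz observed), audio written end-to-end with no correction would fall behind real time by roughly 0.8 seconds per hour.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: average arrival rate seen by the application,
// from a running sample count and the elapsed time in microseconds.
double effectiveRateHz(std::int64_t samplesReceived, std::int64_t elapsedMicros) {
    assert(elapsedMicros > 0);
    return static_cast<double>(samplesReceived) * 1e6 /
           static_cast<double>(elapsedMicros);
}

// Hypothetical helper: how far (in milliseconds) the audio falls behind
// real time after `seconds` if samples are written out uncorrected.
double driftMillis(double nominalHz, double measuredHz, double seconds) {
    assert(nominalHz > 0.0);
    return (nominalHz - measuredHz) / nominalHz * seconds * 1000.0;
}
```

For example, driftMillis(44100.0, 44090.0, 3600.0) comes out at about 816 ms.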

To compensate, I "stretch" the audio slightly (by duplicating a sample
here or there) so that to my app the audio is in fact arriving at
precisely the correct rate on average (it tracks the exact number of
samples it should have so that at any point in time, the number of
samples generated is within ~50 samples of the expected total number of
audio samples).
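
A rough sketch of that stretching idea, for illustration only. The class name AudioStretcher, the mono 16-bit sample format, and the policy of making up the whole shortfall once it exceeds the tolerance are all assumptions, not a description of any particular implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pads incoming chunks by duplicating samples so the
// running output count tracks the count expected from elapsed real time.
class AudioStretcher {
public:
    AudioStretcher(std::int64_t nominalRateHz, std::int64_t toleranceSamples)
        : rate_(nominalRateHz), tolerance_(toleranceSamples) {
        assert(nominalRateHz > 0 && toleranceSamples >= 0);
    }

    // elapsedMicros: real time since capture start (any monotonic clock).
    // Returns the chunk, with duplicated samples spread evenly across it
    // whenever the shortfall exceeds the tolerance.
    std::vector<std::int16_t> process(const std::vector<std::int16_t>& in,
                                      std::int64_t elapsedMicros) {
        const std::int64_t n = static_cast<std::int64_t>(in.size());
        const std::int64_t expected = elapsedMicros * rate_ / 1000000;
        std::int64_t deficit = expected - (written_ + n);
        if (n == 0 || deficit <= tolerance_) deficit = 0;  // close enough

        std::vector<std::int16_t> out;
        out.reserve(static_cast<std::size_t>(n + deficit));
        for (std::int64_t i = 0; i < n; ++i) {
            const std::int16_t s = in[static_cast<std::size_t>(i)];
            out.push_back(s);
            // Extra copies of sample i; these sum to `deficit` over the chunk.
            const std::int64_t extra = (i + 1) * deficit / n - i * deficit / n;
            for (std::int64_t k = 0; k < extra; ++k) out.push_back(s);
        }
        written_ += static_cast<std::int64_t>(out.size());
        return out;
    }

    std::int64_t totalWritten() const { return written_; }

private:
    std::int64_t rate_;
    std::int64_t tolerance_;
    std::int64_t written_ = 0;
};
```

With a 44100Hz nominal rate, a tolerance of 50 samples, and one-second chunks of 44090 samples, the shortfall grows by 10 samples per chunk, so the first five chunks pass through unchanged and the sixth is padded by 60 samples.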

So my first question is: is this the "right" thing to do? My
application needs to keep pace with real time, and by that I mean it
needs to output 1 second of audio for every 1 second of real time that
has elapsed (on average - if there's a slight delay here or there it
doesn't matter as long as overall it keeps up). This approach does seem
to accomplish that goal, so I *think* I'm on the right track.

Either way, that leaves me with the problem of the video. The capture
filter claims that it'll be generating video frames at 29.97fps, but
when I measure the rate it's off slightly. Should I just accept the
rate "as is" or do I need to correct the video sample times too?

One thing I tried was to ignore the sample times altogether and take
the video hardware at its word: if it said frames would arrive every
33.36666ms, then I'd count the number of frames that had arrived and
compute what the sample time "should" have been (33.36666 * frameCount
+ some initial offset). I tried this but even when no frames were
dropped the audio and video still drifted apart after a few hours.
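
For reference, the frame-count calculation can be sketched as below. It assumes the nominal rate is exactly 30000/1001 fps (which matches the 33.36666ms figure) and uses 100ns units, the unit DirectShow uses for REFERENCE_TIME. The function name is hypothetical. Computing each time from the frame index, rather than adding a rounded duration repeatedly, keeps rounding error from accumulating. Even so, the schedule follows the nominal rate rather than the rate the hardware actually delivers, which would be consistent with the drift described above.

```cpp
#include <cassert>
#include <cstdint>

// 100 ns units per second (the REFERENCE_TIME unit).
constexpr std::int64_t kUnitsPerSecond = 10000000;

// Hypothetical helper: nominal start time of frame `frameIndex` for a
// stream running at fpsNum/fpsDen frames per second, e.g. 30000/1001.
// The intermediate product stays within 64 bits for captures far longer
// than several days at this rate.
std::int64_t nominalFrameTime(std::int64_t frameIndex,
                              std::int64_t fpsNum,
                              std::int64_t fpsDen,
                              std::int64_t initialOffset) {
    assert(frameIndex >= 0 && fpsNum > 0 && fpsDen > 0);
    return initialOffset + frameIndex * fpsDen * kUnitsPerSecond / fpsNum;
}
```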

Given that I'm stretching the audio to make it match what my
application expects, and that I don't need the video frame rate to be
constant, my next idea to try is to simply write out the video samples
without trying to space them evenly, but I'm not sure what timestamps
to use so that they'll still be in sync with the stretched audio.

I guess I could query the high performance counter when the first video
sample arrives, so that I could accurately compute timestamps in
application time relative to that first frame, but how do I map that
first sample's time to the application's (or audio stream's) notion of
time? In other words, how can I accurately map video frame times to my
application's perceived timeline?
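
One possible shape for that mapping, sketched under a big assumption: because the stretched audio tracks real time by construction, the audio timeline is treated as starting at the counter reading taken when the first audio sample arrived. The names TimelineMapper and ticksToUnits are hypothetical, and the tick values stand in for whatever the high performance counter returns.

```cpp
#include <cassert>
#include <cstdint>

// 100 ns units per second (the REFERENCE_TIME unit).
constexpr std::int64_t kUnitsPerSecond = 10000000;

// Converts a counter delta to 100 ns units. Whole seconds and the
// remainder are converted separately to avoid 64-bit overflow.
std::int64_t ticksToUnits(std::int64_t ticks, std::int64_t ticksPerSecond) {
    assert(ticks >= 0 && ticksPerSecond > 0);
    const std::int64_t whole = ticks / ticksPerSecond;
    const std::int64_t rest = ticks % ticksPerSecond;
    return whole * kUnitsPerSecond + rest * kUnitsPerSecond / ticksPerSecond;
}

// Hypothetical helper: stamps video frames on the audio timeline by
// measuring their arrival against the counter value recorded when the
// first audio sample arrived.
class TimelineMapper {
public:
    TimelineMapper(std::int64_t ticksPerSecond, std::int64_t audioStartTicks)
        : ticksPerSecond_(ticksPerSecond), audioStartTicks_(audioStartTicks) {
        assert(ticksPerSecond > 0);
    }

    // arrivalTicks: counter value read when the video frame arrived.
    // Frames that arrive before the audio start are clamped to time 0.
    std::int64_t videoTime(std::int64_t arrivalTicks) const {
        const std::int64_t delta =
            arrivalTicks > audioStartTicks_ ? arrivalTicks - audioStartTicks_ : 0;
        return ticksToUnits(delta, ticksPerSecond_);
    }

private:
    std::int64_t ticksPerSecond_;
    std::int64_t audioStartTicks_;
};
```

This does not account for capture latency in either device, so a fixed offset between the two streams may still need to be measured separately.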

Also, a point of confusion for me: what exactly does setting a
different reference clock in a capture graph actually *do*? I ask
because I thought (perhaps incorrectly) that the sample times for audio
and the sample times for video were supposed to correspond to the same
timeline anyway, so does setting a different reference clock affect
synchronization at all? What is the effect of SetSyncSource(NULL) in a
capture graph?

I think I've read every thread on these topics but I'm still a little
unsure as to what the 'right' thing to do is. If anyone can give me any
advice or pointers I'd *really* appreciate it.

Many thanks in advance,
-Dave
