I'm trying to implement a media player on Android using the MediaCodec API.
I've created three threads:
Thread 1 : Dequeues both codecs' input buffers to get free indices, then queues the audio and video frames into the respective codec's input buffer (a rough sketch of this loop is shown after the list)
Thread 2 : Dequeues the audio codec's output buffers and renders them using the AudioTrack class's write method
Thread 3 : Dequeues the video codec's output buffers and renders them using the releaseOutputBuffer method
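Thread 1 looks roughly like this. It's a simplified sketch: extractor is a MediaExtractor set up elsewhere, isAudioTrack() is a hypothetical helper that compares the sample's track index with the audio track's index, and audioCodec/videoCodec stand in for the underlying MediaCodec instances:

while (!inputEos) {
    // Pick the codec that owns the track the next sample belongs to.
    MediaCodec codec = isAudioTrack(extractor.getSampleTrackIndex())
            ? audioCodec : videoCodec;
    int inIndex = codec.dequeueInputBuffer(10000 /* timeout in us */);
    if (inIndex >= 0) {
        ByteBuffer buffer = codec.getInputBuffers()[inIndex];
        int size = extractor.readSampleData(buffer, 0);
        if (size < 0) {
            // End of stream: signal EOS to the codec.
            codec.queueInputBuffer(inIndex, 0, 0, 0,
                    MediaCodec.BUFFER_FLAG_END_OF_STREAM);
            inputEos = true;
        } else {
            codec.queueInputBuffer(inIndex, 0, size,
                    extractor.getSampleTime(), 0);
            extractor.advance();
        }
    }
}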
I'm facing a lot of problems in achieving synchronization between audio and video frames. I never drop audio frames, and before rendering a video frame I check whether the decoded frame is late by more than 30 ms; if it is, I drop it, and if it is more than 10 ms early I don't render it.
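In Thread 3 the thresholds are applied roughly like this, where calculateLateByUs is the method shown below and outIndex/info come from dequeueOutputBuffer:

long lateByUs = calculateLateByUs(info.presentationTimeUs);
if (lateByUs > 30000) {
    // More than 30 ms late: drop the frame without rendering.
    videoCodec.releaseOutputBuffer(outIndex, false);
} else if (lateByUs < -10000) {
    // More than 10 ms early: don't render yet, re-check later.
} else {
    // Close enough: render to the surface.
    videoCodec.releaseOutputBuffer(outIndex, true);
}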
To find the difference between audio and video, I use the following logic:
public long calculateLateByUs(long timeUs) {
    long nowUs = 0;
    if (hasAudio && audioTrack != null) {
        synchronized (audioTrack) {
            // First call after the audio clock becomes available:
            // force startTimeUs to be recomputed from the audio clock below.
            if (first_audio_sample && startTimeUs >= 0) {
                System.out.println("First video after audio Time Us: " + timeUs);
                startTimeUs = -1;
                first_audio_sample = false;
            }
            // Audio clock: playback head position converted to microseconds.
            nowUs = (audioTrack.getPlaybackHeadPosition() * 1000000L) /
                    audioCodec.format.getInteger(MediaFormat.KEY_SAMPLE_RATE);
        }
    } else if (!hasAudio) {
        // No audio track: fall back to the system clock, with no initial gap.
        nowUs = System.currentTimeMillis() * 1000;
        startTimeUs = 0;
    } else {
        // Audio track exists but hasn't been initialized yet.
        nowUs = System.currentTimeMillis() * 1000;
    }
    // Capture the initial offset between the clock and the first frame.
    if (startTimeUs == -1) {
        startTimeUs = nowUs - timeUs;
    }
    if (syslog) {
        System.out.println("Timing Statistics:");
        System.out.println("Key Sample Rate :"
                + audioCodec.format.getInteger(MediaFormat.KEY_SAMPLE_RATE)
                + " nowUs: " + nowUs + " startTimeUs: " + startTimeUs
                + " timeUs: " + timeUs
                + " return value :" + (nowUs - (startTimeUs + timeUs)));
    }
    return (nowUs - (startTimeUs + timeUs));
}
timeUs is the presentation time, in microseconds, of the video frame. nowUs is supposed to hold the duration, in microseconds, for which audio has been playing. startTimeUs is the initial difference between the audio and video frames, which has to be maintained throughout playback. For example, with startTimeUs = 0, nowUs = 500,000 and timeUs = 466,000, the method returns 34,000 us, i.e. the frame is more than 30 ms late and gets dropped.
The first if block checks whether there is indeed an audio track and it has been initialized, and sets nowUs by deriving it from the AudioTrack's playback head position.
If there is no audio (the first else), nowUs is set to the system time and the initial gap is set to zero. startTimeUs is initialized to zero in the main function.
The if block inside the synchronized block handles the case where video frames start rendering first and the audio joins later: first_audio_sample is initially set to true, and when the first audio sample shows up, startTimeUs is recomputed from the audio clock.
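For reference, Thread 2 looks roughly like this; the AudioTrack write is what makes getPlaybackHeadPosition() advance. A simplified sketch, assuming audioTrack was created in MODE_STREAM, play() has been called, and audioCodec again stands in for the underlying MediaCodec:

MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
int outIndex = audioCodec.dequeueOutputBuffer(info, 10000 /* timeout in us */);
if (outIndex >= 0) {
    ByteBuffer buf = audioCodec.getOutputBuffers()[outIndex];
    byte[] pcm = new byte[info.size];
    buf.position(info.offset);
    buf.get(pcm);
    buf.clear();
    // Blocking write; the playback head advances as these samples play out.
    audioTrack.write(pcm, 0, pcm.length);
    audioCodec.releaseOutputBuffer(outIndex, false);
}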
Please let me know if anything is not clear.
Also, if you know of any open-source project where a media player for an A/V file has been implemented using MediaCodec, that would be great.