I'm new to using ffmpeg. I have some AVI files that I believe should have audio, but I cannot get the audio to play with VLC. I tried using ffmpeg to analyze the AVI files for audio.

import subprocess

def has_audio(filename):
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=nb_streams",
         "-of", "default=noprint_wrappers=1:nokey=1", filename],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # nb_streams counts every stream in the file, so subtract one for the video stream
    return int(result.stdout) - 1

print(has_audio('filename.avi'))

Below is the output for one of the AVI files. The file appears to have audio, but maybe I'm reading the output incorrectly. I also tried the code in this question, which tells me the video has audio.

How can I confirm whether or not this file has audio, using either ffmpeg or another application?

[avi @ 0x7fddac005140] non-interleaved AVI
Guessed Channel Layout for Input Stream #0.1 : mono
Input #0, avi, from 'filename.avi':
  Metadata:
    date            : 2010-06-29
  Duration: 00:00:06.83, start: 0.000000, bitrate: 10607 kb/s
  Stream #0:0: Video: mjpeg (Baseline) (MJPG / 0x47504A4D), yuvj420p(pc, bt470bg/unknown/unknown), 1920x1080, 9878 kb/s, 30 fps, 30 tbr, 30 tbn
  Stream #0:1: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, 1 channels, s16, 512 kb/s
Stream mapping:
  Stream #0:1 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    date            : 2010-06-29
    encoder         : Lavf60.3.100
  Stream #0:0: Audio: pcm_s16le, 32000 Hz, mono, s16, 512 kb/s
    Metadata:
      encoder         : Lavc60.3.100 pcm_s16le
size=N/A time=00:00:05.99 bitrate=N/A speed=1.52e+03x       
video:0kB audio:375kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[Parsed_volumedetect_0 @ 0x7fdd9b704080] n_samples: 192000
[Parsed_volumedetect_0 @ 0x7fdd9b704080] mean_volume: -44.3 dB
[Parsed_volumedetect_0 @ 0x7fdd9b704080] max_volume: -16.2 dB
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_16db: 3
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_17db: 0
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_18db: 9
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_19db: 9
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_20db: 15
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_21db: 16
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_22db: 12
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_23db: 25
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_24db: 19
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_25db: 25
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_26db: 22
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_27db: 13
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_28db: 17
[Parsed_volumedetect_0 @ 0x7fdd9b704080] histogram_29db: 13

Here is the output from MediaInfo:

Bit rate: 12884197, Frame rate: 30.000, Format: JPEG
Duration (raw value): 6833
Duration (other values:
['6 s 833 ms',
 '6 s 833 ms',
 '6 s 833 ms',
 '00:00:06.833',
 '00:00:06:25',
 '00:00:06.833 (00:00:06:25)']
Track data:
{'alignment': 'Aligned',
 'bit_depth': 16,
 'bit_rate': 512000,
 'bit_rate_mode': 'CBR',
 'channel_s': 1,
 'codec_id': '1',
 'codec_id_url': 'http://www.microsoft.com/windows/',
 'commercial_name': 'PCM',
 'count': '285',
 'count_of_stream_of_this_kind': '1',
 'delay': 0,
 'delay__origin': 'Stream',
 'delay_relative_to_video': 0,
 'duration': 6000,
 'format': 'PCM',
 'format_settings': 'Little / Signed',
 'format_settings__endianness': 'Little',
 'format_settings__sign': 'Signed',
 'interleave__duration': 1139,
 'kind_of_stream': 'Audio',
 'other_alignment': ['Aligned on interleaves'],
 'other_bit_depth': ['16 bits'],
 'other_bit_rate': ['512 kb/s'],
 'other_bit_rate_mode': ['Constant'],
 'other_channel_s': ['1 channel'],
 'other_delay': ['00:00:00.000', '00:00:00.000'],
 'other_delay__origin': ['Raw stream'],
 'other_delay_relative_to_video': ['00:00:00.000', '00:00:00.000'],
 'other_duration': ['6 s 0 ms',
                    '6 s 0 ms',
                    '6 s 0 ms',
                    '00:00:06.000',
                    '00:00:06.000'],
 'other_format': ['PCM'],
 'other_interleave__duration': ['1139',
                                '1139  ms (34.17 video frames)',
                                '34.17'],
 'other_kind_of_stream': ['Audio'],
 'other_sampling_rate': ['32.0 kHz'],
 'other_stream_size': ['375 KiB (4%)',
                       '375 KiB',
                       '375 KiB',
                       '375 KiB',
                       '375.0 KiB',
                       '375 KiB (4%)'],
 'other_track_id': ['1'],
 'proportion_of_this_stream': '0.04238',
 'samples_count': '192000',
 'sampling_rate': 32000,
 'stream_identifier': '0',
 'stream_size': 384000,
 'streamorder': '1',
 'track_id': 1,
 'track_type': 'Audio'}
Giacomo1968

2 Answers

Your output shows that the file has audio:

  Stream #0:1: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, 1 channels, s16, 512 kb/s
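
The same evidence can be pulled out programmatically. What follows is only a minimal sketch: it assumes ffmpeg is on the PATH, and the helper names run_volumedetect and summarize_audio are made up for illustration. It runs the volumedetect filter with the null muxer (as the log in the question appears to have done) and extracts the audio stream line and the volume figures from the log text.

```python
import re
import subprocess


def run_volumedetect(filename):
    """Decode only the audio through volumedetect and return ffmpeg's log text."""
    cmd = ["ffmpeg", "-hide_banner", "-nostdin", "-i", filename,
           "-vn", "-af", "volumedetect", "-f", "null", "-"]
    # ffmpeg writes its log, including the volumedetect lines, to stderr.
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.PIPE, text=True)
    return result.stderr


def summarize_audio(log):
    """Extract the audio evidence from an ffmpeg log like the one in the question."""
    def volume(name):
        match = re.search(name + r":\s*(-?\d+(?:\.\d+)?) dB", log)
        return float(match.group(1)) if match else None

    return {
        # True if any stream line in the log is labelled as audio.
        "has_audio_stream": re.search(r"Stream #\d+:\d+.*?: Audio:", log) is not None,
        "mean_volume_db": volume("mean_volume"),
        "max_volume_db": volume("max_volume"),
    }

# Hypothetical usage: print(summarize_audio(run_volumedetect("filename.avi")))
```

Applied to the log in the question, this reports an audio stream with a mean of -44.3 dB and a peak of -16.2 dB, so the samples are not all zero. What counts as "too quiet to hear" is a judgment call and is deliberately left out of the sketch.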
harrymc

Try this:

ffmpeg -i filename.avi -c:v copy -c:a aac file_pcm_to_aac.mp4

This keeps the video stream as is and encodes the PCM audio as AAC.

If that doesn't play:

ffmpeg -i filename.avi -an -c:v copy file_video_only.mp4 && ffmpeg -i filename.avi -vn filename_audio_only.wav

Then see if the audio will play by itself as a WAV file with aplay. If not, try opening it in Audacity. If Audacity cannot open it, try dumping it to raw, specifying the input type.

ffmpeg -f s16le -ar 32000 -ac 1 -i filename_audio_only.wav -c:a pcm_s16le output.wav

Then try importing it into Audacity as raw data, specifying signed 16-bit, little-endian, 32000 Hz, mono.
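
If Audacity is not available, a rough version of the same check can be scripted. The sketch below assumes the input is headerless signed 16-bit little-endian PCM (for example, a file written with ffmpeg's s16le output format); the function name peak_dbfs is hypothetical. It reports the loudest sample relative to full scale, or None when every sample is zero.

```python
import array
import math
import sys


def peak_dbfs(raw_bytes):
    """Peak level in dBFS of signed 16-bit little-endian PCM, or None if silent."""
    samples = array.array("h")  # signed 16-bit integers
    # Ignore a trailing odd byte so the buffer is a whole number of samples.
    samples.frombytes(raw_bytes[: len(raw_bytes) // 2 * 2])
    if sys.byteorder == "big":
        samples.byteswap()  # the data is little-endian regardless of the host
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return None
    return 20 * math.log10(peak / 32768)

# Hypothetical usage, assuming audio.raw holds headerless s16le samples:
# with open("audio.raw", "rb") as f:
#     print(peak_dbfs(f.read()))
```

A result of None means the track is digital silence; any finite value means there is signal in the data, even if it is quiet.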


There could be something to the "[avi @ 0x5617e27c4cc0] non-interleaved AVI" message.
This says, "framerate has to be divisable by the audio sample rate".

AviSynth may be able to use DirectShow to target the audio specifically.

From the above post:

Now if I "reencode" the file in another AVI:

ffmpeg -i f.avi -c copy f2.avi

I can extract the audio from f2.avi in milliseconds!

Conversion to an interleaved .avi is easiest to try first.


I've played around with it and I'm pretty sure the audio converts. The clip is only half a second of bumped microphone, so I'm not positive. It needs to be tested on a longer file.

The source video, once properly demuxed from that strange PCM track, plays properly at 305K in an mkv container. That left the audio. WAV files are large, so that makes sense. Across the variations I tried, the extracted audio ranged from 2.4 MB to 3.0 MB. Without -f s16le before the input .avi file, it would extract something around 400K, which is all wrong.

├── [ 2.4M]  audio_file.wav
├── [ 3.3M]  MOVI0000.avi
├── [ 305K]  MOVI0000.mkv

There's no difference between -c copy and -c:a pcm_s16le; either way the file comes out at 2.4 MB or more. (PCM is not compressed, so unless you are changing the channels, frequency or bit rate, re-encoding is redundant.)
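
Because PCM is uncompressed, the size of the sample data can be predicted from the stream parameters alone, which gives a quick sanity check on any extracted file. A small sketch, with a hypothetical helper name, using the figures reported for the file in the question (32000 Hz, mono, 16-bit, 6 s of audio):

```python
def pcm_size_bytes(sample_rate_hz, channels, bits_per_sample, duration_s):
    """Size of the raw sample data for uncompressed PCM (container headers excluded)."""
    bytes_per_sample = bits_per_sample // 8
    return round(sample_rate_hz * channels * bytes_per_sample * duration_s)


# Figures reported for the question's file: 32000 Hz, mono, 16-bit, 6 s of audio.
expected = pcm_size_bytes(32000, 1, 16, 6.0)
print(expected, "bytes =", expected / 1024, "KiB")  # 384000 bytes = 375.0 KiB
```

That agrees with the stream_size of 384000 (375 KiB) in the MediaInfo output and the audio:375kB line in the ffmpeg log for that file. A WAV file adds a small header on top of this figure.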

This will properly convert the video:

 ffmpeg -fflags +genpts -i MOVI0000.avi -vf "format=yuv420p" -c:v libx264 -x264opts b-pyramid=normal -g 120 -preset veryslow -b:v 500K -maxrate:v 4M -bufsize 8M -rc-lookahead 60 -refs 3 -bf 2 -b_strategy 2 -subq 11 -mixed-refs 1 -8x8dct 1 -partitions all -direct-pred auto -nal-hrd vbr -aq-mode autovariance -aq-strength 1.1 -trellis 2 -c:a aac -ac 1 -b:a 128k MOVI0000.mkv

If the audio isn't converting properly, just replace -c:a aac -ac 1 -b:a 128k with -an to eliminate it.