The purpose of this script is to read a video file, extract the audio, and print a transcript by running the audio through the IBM Watson Speech to Text API. My problem is that when I save the output of the subprocess in a variable and pass that variable to the open function, it reads as binary. What am I doing wrong? Any help would be appreciated!
import sys
import re 
import json
import requests
import subprocess 
from subprocess import Popen, PIPE
fullVideo = sys.argv[1]
title = re.findall('^([^.]*).*', fullVideo)
title = str(title[0])
output = subprocess.Popen('ffmpeg -i ' + fullVideo + ' -vn -ab 128k ' + title + '.flac', shell = True, stdin=subprocess.PIPE).communicate()[0]
sfile= open(output, "rb")
response = requests.post("https://stream.watsonplatform.net/speech-to-text/api/v1/recognize",
         auth=("USERNAME", "PASSWORD"),
         headers = {"content-type": "audio/flac"},
         data=sfile
         )
print (json.loads(response.text))
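
A likely cause, as far as I can tell from the script: `communicate()[0]` is whatever the child process wrote to stdout (or `None` when stdout is not piped, as here), not the name of the file ffmpeg created. ffmpeg writes the audio to the path given on its command line, so that path is what `open` needs. Below is a minimal sketch of that approach, not a definitive fix. The helper names (`flac_path_for`, `build_ffmpeg_command`, `extract_audio`, `transcribe`) are made up for illustration, and it uses `os.path.splitext` in place of the regex, which is an assumption that changes behavior slightly: it strips only the last extension rather than everything after the first dot.

```python
import os
import subprocess
import sys


def flac_path_for(video_path):
    """Return the video's path with its last extension swapped for .flac."""
    base, _ext = os.path.splitext(video_path)
    return base + ".flac"


def build_ffmpeg_command(video_path, audio_path):
    """Same ffmpeg flags as the script above, as an argument list (no shell)."""
    return ["ffmpeg", "-i", video_path, "-vn", "-ab", "128k", audio_path]


def extract_audio(video_path):
    """Run ffmpeg and return the path of the file it wrote."""
    audio_path = flac_path_for(video_path)
    # check=True raises CalledProcessError if ffmpeg exits with a non-zero status.
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    return audio_path


def transcribe(audio_path):
    """POST the FLAC file to the endpoint used in the script above."""
    # Third-party package; imported here so the helpers above need only the stdlib.
    import requests

    # Open the file ffmpeg wrote, not the subprocess output.
    with open(audio_path, "rb") as audio_file:
        response = requests.post(
            "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize",
            auth=("USERNAME", "PASSWORD"),  # placeholders, as in the question
            headers={"content-type": "audio/flac"},
            data=audio_file,
        )
    return response.json()


if __name__ == "__main__" and len(sys.argv) == 2:
    print(transcribe(extract_audio(sys.argv[1])))
```

Passing the command as a list avoids `shell=True`, so file names containing spaces or shell metacharacters are handed to ffmpeg unchanged.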