I have a video of an international conference that contains two spoken languages: English and Chinese sentences are mixed throughout. I would like to remove the Chinese parts from the command line.
First, I generate a subtitle file using whisper:
whisper myvideo --model large --language en
The subtitle file contains both languages, with timing:
1
00:00:00,000 --> 00:00:04,220
if you are not concerned and doing the work of the Lord.
2
00:00:09,120 --> 00:00:13,880
如果你不愿意去遵行耶稣基督的话语的话,
3
00:00:14,220 --> 00:00:18,220
就没有必要昼夜去默想神的话。
4
00:00:18,220 --> 00:00:22,200
Take more of me, give me more of you.
....
The question is: how can I use the command line and ffmpeg to remove all the Chinese parts of the video, based on the timing in the subtitle file? The video is very long, so the task needs to be done from the command line rather than manually.
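A minimal sketch of the cutting step, assuming the time ranges to keep are already known as "start end" pairs in seconds. The function names (srt_to_seconds, build_select_expr, cut_video) are made up for illustration. It relies on ffmpeg's select/aselect filters with between(t,...) expressions, which means the output is re-encoded rather than stream-copied.

```shell
#!/bin/bash
# Convert an SRT timestamp (HH:MM:SS,mmm) to seconds with millisecond precision.
srt_to_seconds() {
    awk -v ts="$1" 'BEGIN {
        split(ts, p, /[:,]/)
        printf "%.3f\n", p[1] * 3600 + p[2] * 60 + p[3] + p[4] / 1000
    }'
}

# Read "start end" pairs (seconds) on stdin, one per line, and print a single
# filter expression of the form between(t,a,b)+between(t,c,d)+...
build_select_expr() {
    awk '{ printf "%sbetween(t,%s,%s)", (NR > 1 ? "+" : ""), $1, $2 }
         END { print "" }'
}

# Keep only the frames and audio samples inside the selected ranges.
# Hypothetical helper: $1 = expression, $2 = input file, $3 = output file.
cut_video() {
    ffmpeg -i "$2" \
        -vf "select='$1',setpts=N/FRAME_RATE/TB" \
        -af "aselect='$1',asetpts=N/SR/TB" \
        "$3"
}

# Example (not run here):
#   expr=$(printf '0.000 4.220\n18.220 22.200\n' | build_select_expr)
#   cut_video "$expr" myvideo.mp4 english_only.mp4
```

The setpts/asetpts filters rewrite the timestamps so the kept segments play back to back instead of leaving gaps.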
Step 1) I need to identify the language of every line of the subtitle file:
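#!/bin/bash
# Print every line of the subtitle file ($1) with its detected language code.
# Note: this also runs on cue numbers and timing lines, not only on the text.
while IFS= read -r line
do
    echo "text: $line"
    # $(...) captures the output; "$line" is quoted so it stays one argument.
    # </dev/null keeps trans from consuming the file the loop is reading.
    lan=$(trans -no-ansi -id "$line" </dev/null | awk '/^Code/ {print $2}')
    echo "lan: $lan"
done < "$1"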
The bash script above doesn't work properly yet. How can I do this?
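One possible offline variant of step 1, as a rough sketch: instead of calling a translation service per line, treat a cue as English only if its text is pure ASCII. This is a heuristic and an assumption, not real language detection (English text with accented letters or curly quotes would be misclassified). The function names is_ascii and english_timings are made up for illustration.

```shell
#!/bin/bash
# Heuristic (assumption): text is "English" if nothing remains after
# deleting all 7-bit ASCII bytes from it.
is_ascii() {
    [ -z "$(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177')" ]
}

# Read the SRT file given as $1 and print the timing line of every cue
# whose text is pure ASCII. Works with or without blank lines between cues.
english_timings() {
    local timing="" text="" line
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line%$'\r'}                  # tolerate CRLF line endings
        case $line in
            *' --> '*)                      # a timing line starts a new cue
                if [ -n "$timing" ] && is_ascii "$text"; then echo "$timing"; fi
                timing=$line
                text="" ;;
            ''|*[!0-9]*)                    # blank line or subtitle text
                text+=$line ;;
            *) ;;                           # digits only: cue number, skip
        esac
    done < "$1"
    # Flush the last cue, which has no following timing line.
    if [ -n "$timing" ] && is_ascii "$text"; then echo "$timing"; fi
}

# Example (not run here):
#   english_timings myvideo.srt
```

With the sample subtitle above, this would print the timing lines of cues 1 and 4. The is_ascii check could be swapped for a call to trans -id if a real language detector is preferred.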