milahu/srtgen


srtgen

Generate subtitles for a video file

Using the paid Google Cloud Speech-To-Text API

This program requires a Google account and an API key. To get a key, create a project on Google Cloud: https://console.cloud.google.com/projectcreate

usage

$ ./srtgen.py 
usage
 srtgen.py --apikey path/to/keyfile.json path/to/input-video.mp4
environment variables
 GOOGLE_APPLICATION_CREDENTIALS=path/to/keyfile.json srtgen.py path/to/input-video.mp4
keyfile
 This program requires a Google account and an API key
 https://console.cloud.google.com/projectcreate

subtitle is written to stdout and output/xxxxxx-input-video.mp4/output_file.srt
where xxxxxx is the sha1 hash of the input video file

temporary files are stored in output/xxxxxx-input-video.mp4/
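The output directory naming described above can be sketched as follows. This is a minimal illustration, not srtgen's actual code: the function name `output_dir_for` and the `root` parameter are hypothetical, and it assumes the full hexadecimal SHA-1 digest is used for `xxxxxx` (the real program may shorten it).

```python
import hashlib
from pathlib import Path

def output_dir_for(video_path, root="output"):
    """Return root/<sha1>-<video file name> for the given video file.

    Hypothetical helper: assumes the full hex digest is used as the prefix.
    """
    sha1 = hashlib.sha1()
    with open(video_path, "rb") as f:
        # Hash in 1 MiB blocks so large videos need not fit in memory.
        for block in iter(lambda: f.read(1 << 20), b""):
            sha1.update(block)
    return Path(root) / f"{sha1.hexdigest()}-{Path(video_path).name}"
```

Because the directory name depends on the file content, running the tool again on the same video would land in the same directory, which is what makes reusing temporary files possible.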

features

  • works around the size limits of the Google API for directly uploaded audio
    • no need for Google Cloud Storage (the gs:// protocol)
    • duration is limited to 60 seconds per request
    • file size is limited to 10485760 bytes (10 MiB) per request
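One way to stay under both limits is to cut the audio into chunks before sending it. The sketch below only plans the cut points; it assumes fixed-length chunks and a constant byte rate (for example uncompressed PCM), which may differ from what srtgen does (it could cut on silence instead). The names `plan_chunks` and `margin` are hypothetical.

```python
MAX_SECONDS = 60       # duration limit quoted above
MAX_BYTES = 10485760   # file size limit quoted above

def plan_chunks(total_seconds, bytes_per_second, margin=0.9):
    """Return a list of (start, end) times in seconds.

    Every chunk stays below both the duration and the size limit.
    margin < 1 leaves headroom for container overhead (an assumption).
    """
    # Longest chunk the byte limit allows at this byte rate.
    by_size = MAX_BYTES / bytes_per_second
    chunk = min(MAX_SECONDS, by_size) * margin
    chunks = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk, total_seconds)
        chunks.append((start, end))
        start = end
    return chunks

# 16 kHz, 16-bit, mono PCM is 32000 bytes per second.
for start, end in plan_chunks(130, 32000):
    print(f"{start:.1f} .. {end:.1f}")
```

At low byte rates the 60 second limit decides the chunk length; at high byte rates the 10485760 byte limit does.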

dependencies

  • ffmpeg
  • python
    • pydub
    • google.cloud.speech
      • API key
      • pricing
        • speech recognition needs lots of space and time = there is no free lunch
        • https://cloud.google.com/speech-to-text/pricing#pricing_table
          • first hour is free
            • TODO one hour per month or one hour per google account?
          • Speech Recognition without Data Logging: $0.006 / 15 seconds = $0.024 / 1 minute = about $1.50 / 1 hour
          • Speech Recognition with Data Logging: $0.004 / 15 seconds = $0.016 / 1 minute = about $1.00 / 1 hour
          • Data Logging = feedback of manually corrected text to improve quality of service
            • TODO implement upload of corrected text
      • TODO Automatic punctuation

related

based on

postprocessing tools

similar tools

todo

  • use speech_recognition module, so srtgen can use multiple backend services
  • hybrid of offline and online speech recognition
    • deepspeech for offline speech recognition
    • google for online speech recognition
    • can deepspeech return confidence values?
    • run deepspeech with different models? (and manually select the best result?)
  • automatic postprocessing
    • reduce manual work
    • split long sentences
    • merge short sentences
