SubtitleX v1.0 White Paper

Product Description

  1. Subtitle Auto-Load (Chrome Extension): A Chrome browser extension automatically loads subtitles for online videos that are already collected in the library.
  2. Subtitle Library Search: A web application for searching a library of subtitles covering more than 30,000 online videos.
  3. Subtitle Generation: Generates subtitles for videos not yet in the library using the Faster-Whisper large-v3 model, then translates them into other languages with the Google Translate API.
  4. Seed Video Crawling: Collects seed videos with Python, Selenium, and BeautifulSoup by crawling target websites, extracting candidate content through an ETL step, and then generating subtitles for it with the speech-to-text model.
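
The subtitle generation step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names (format_timestamp, segments_to_srt, transcribe) and the choice of SRT as the output format are assumptions, and the translation step is left out. Only the transcribe function needs the faster-whisper package, which is imported lazily.

```python
from typing import Iterable, List, Tuple

# Hypothetical segment shape: (start seconds, end seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Render timed text segments as the body of an SRT subtitle file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


def transcribe(audio_path: str, model_size: str = "large-v3") -> List[Segment]:
    """Transcribe an audio file with Faster-Whisper (assumed to be installed)."""
    # Imported here so the formatting helpers work without the package.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="auto", compute_type="default")
    segments, _info = model.transcribe(audio_path)
    return [(s.start, s.end, s.text) for s in segments]
```

A typical call would be segments_to_srt(transcribe("video_audio.mp3")), where the file name is a placeholder; the resulting text can then be passed to a translation service one cue at a time so the timestamps stay untouched.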

Applications

  1. Chrome Extension (GitHub repository)
  2. Web Application (GitHub repository)
  3. Subscription System
  4. Crawling Tool and Download-Transcribe-Translate Tool (GitHub repository)
  5. Service API built with Go (GitHub repository)

Technology Stack

Front End

  • React
  • Next.js, Prisma, Next-Auth
  • Chrome Extension
  • TailwindCSS, Material-UI

Back End

  • Golang (Gin)

Database

  • PostgreSQL
  • TSDB
  • Redis

Utility

  • Crawling: BeautifulSoup4, Selenium
  • Streaming Download: Streamlink
  • STT: Faster-Whisper
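
As a rough idea of the link-extraction part of crawling, the sketch below pulls candidate video URLs out of a fetched page. It deliberately uses Python's standard-library html.parser as a dependency-free stand-in for BeautifulSoup4, and it assumes the page HTML has already been retrieved (for example by Selenium). The LinkCollector class, the extract_video_links function, and the "/video/" path marker are all hypothetical and would need adapting to each target site.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href="..."> tag on a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def extract_video_links(html: str, base_url: str,
                        path_marker: str = "/video/") -> List[str]:
    """Return unique links whose path contains path_marker, in page order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    seen = set()
    result = []
    for link in parser.links:
        if path_marker in urlparse(link).path and link not in seen:
            seen.add(link)
            result.append(link)
    return result
```

Deduplicating while preserving page order keeps the seed list stable between crawl runs, which makes it easier to skip videos that already have subtitles.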

Resources Used

  1. Web Application: Vercel, Supabase
  2. Server: Self-hosted (4-core x86, Red Hat)
  3. Connection: Cloudflare DNS and Zero Trust Tunnel
  4. API: Google Translate API
  5. Icons:
  6. Free GPU: Google Colab
  7. Payment: Stripe

Programming and Testing

  • Laptop: ThinkPad T14 with Ryzen 5 4650U (6 cores / 12 threads), integrated GPU, 32GB RAM
  • Desktop Workstation: Ryzen 5 5600 (6 cores / 12 threads) with RTX 2060 Super GPU, 16GB RAM

Promotion

  1. Google Search
  2. Twitter

Please Note

  1. This tool is designed to work with a variety of online streaming websites. Some content in the library was adapted for websites that host mature content, so the tool may include links to adult content websites.
  2. All functions are built with open technology and open-source programs. The tool does not account for the copyright of content hosted on third-party websites.

Source Code