- Auto language detection using OpenAI Whisper during video upload
- Editable transcript interface for reviewing/correcting transcriptions
- Updated translation pipeline to use edited transcripts when available
- Migrated from JWT to Google OAuth-only authentication for better security
- Added complete Docker containerization with docker-compose.yml
- Updated database schema with language detection and transcript editing fields
- Enhanced API documentation and workflow in README
- Added comprehensive environment variable configuration
🤖 Generated with BackendIM
Co-Authored-By: Claude <noreply@anthropic.com>
Features:
- JWT authentication with user registration and login
- Video upload to Amazon S3 with file validation (200MB limit)
- Audio transcription using OpenAI Whisper API
- Text translation using GPT-4 API
- Voice cloning and audio synthesis using ElevenLabs API
- Video processing with ffmpeg for audio replacement
- Complete SQLite database with proper models and migrations
- Background task processing for long-running operations
- Health endpoint and comprehensive API documentation
Tech stack:
- FastAPI with SQLAlchemy ORM
- SQLite database with Alembic migrations
- Amazon S3 for file storage
- OpenAI APIs for transcription and translation
- ElevenLabs API for voice cloning
- ffmpeg for video processing
- JWT authentication with bcrypt password hashing