Automated Action 025900c849 Fix database migration conflicts and improve migration safety

Changes:
- Enhanced migration 002 with column existence checks to prevent duplicate column errors
- Added comprehensive migration 003 that syncs database with current model state
- Modified application startup to avoid conflicts between create_all() and Alembic
- Added proper table/column existence checking in migrations
- Improved migration safety for production environments
- Removed automatic table creation when database already exists (relies on migrations)

This resolves the 'duplicate column name' error by ensuring migrations
check for existing columns before attempting to add them.

2025-06-24 19:46:05 +00:00

alembic

Fix database migration conflicts and improve migration safety

2025-06-24 19:46:05 +00:00

app

Add comprehensive user profile management system

2025-06-24 19:40:23 +00:00

.gitignore

Initial commit from template

2025-06-24 17:45:42 +00:00

alembic.ini

Implement complete AI video dubbing backend with FastAPI

2025-06-24 17:56:12 +00:00

main.py

Fix database migration conflicts and improve migration safety

2025-06-24 19:46:05 +00:00

README.md

Add comprehensive user profile management system

2025-06-24 19:40:23 +00:00

requirements.txt

Add email-validator dependency to fix EmailStr validation

2025-06-24 17:58:47 +00:00

README.md

AI Video Dubbing API

A FastAPI backend for an AI-powered video dubbing tool that allows content creators to upload short-form videos, transcribe audio, translate to different languages, clone voices, and generate dubbed videos with lip-sync.

Features

🔐 Authentication: JWT-based user registration and login
👤 User Profiles: Complete profile management with settings
📁 Video Upload: Upload MP4/MOV files to Amazon S3 (max 200MB)
🧠 Transcription: Audio transcription using OpenAI Whisper API
🌍 Translation: Text translation using GPT-4 API
🗣️ Voice Cloning: Voice synthesis using ElevenLabs API
🎥 Video Processing: Audio replacement and video processing with ffmpeg

Tech Stack

FastAPI - Modern, fast web framework
SQLite - Database with SQLAlchemy ORM
Amazon S3 - File storage
OpenAI Whisper - Audio transcription
GPT-4 - Text translation
ElevenLabs - Voice cloning and synthesis
ffmpeg - Video/audio processing

Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Set Environment Variables

Create a .env file in the root directory with the following variables:

# Authentication
SECRET_KEY=your-secret-key-change-this-in-production

# AWS S3 Configuration
AWS_ACCESS_KEY_ID=your-aws-access-key
AWS_SECRET_ACCESS_KEY=your-aws-secret-key
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-s3-bucket-name

# OpenAI Configuration
OPENAI_API_KEY=your-openai-api-key

# ElevenLabs Configuration
ELEVENLABS_API_KEY=your-elevenlabs-api-key

3. Run Database Migrations

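The SQLite database is created automatically the first time you start the application and is stored at /app/storage/db/db.sqlite. When the database already exists, the application does not create tables itself and relies on the Alembic migrations in alembic/versions/ to bring the schema up to date.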
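
The commit description for this change says the migrations check for an existing column before adding it, which is what avoids the 'duplicate column name' error. The following is a minimal sketch of that guard using the standard-library sqlite3 module, not the project's actual Alembic migration code (which is not shown here). The table name, column name, and helper names are hypothetical.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` already has a column named `column`."""
    # PRAGMA table_info yields one row per column; index 1 holds the name.
    # Identifiers are interpolated directly, so pass trusted names only.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, ddl_type: str
) -> bool:
    """Add the column only when it is absent; return True if it was added."""
    if column_exists(conn, table, column):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True


# Hypothetical table and column, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
first = add_column_if_missing(conn, "users", "first_name", "TEXT")
second = add_column_if_missing(conn, "users", "first_name", "TEXT")
```

Running the guarded step twice is safe: the second call finds the column and does nothing instead of raising.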

4. Start the Application

python main.py

Or with uvicorn:

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

The API will be available at:

API: http://localhost:8000
Documentation: http://localhost:8000/docs
Alternative Docs: http://localhost:8000/redoc
Health Check: http://localhost:8000/health

API Endpoints

Authentication

POST /auth/register - User registration
POST /auth/login - User login

Profile Management

GET /profile/ - Get user profile
PUT /profile/ - Update profile information
PUT /profile/password - Update password
PUT /profile/email - Update email address
DELETE /profile/ - Delete user account

Video Management

POST /videos/upload - Upload video with language settings
GET /videos/ - Get user's videos
GET /videos/{video_id} - Get specific video details
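
The Features list limits uploads to MP4/MOV files of at most 200MB. A rough sketch of such a pre-upload check is shown below. The function name, the error messages, and the reading of "200MB" as 200 * 1024 * 1024 bytes are assumptions, not taken from the project.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp4", ".mov"}
# Assumption: "200MB" is interpreted as 200 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the upload looks acceptable."""
    errors = []
    # Compare the extension case-insensitively (".MP4" is accepted).
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        errors.append("unsupported file type")
    if size_bytes > MAX_UPLOAD_BYTES:
        errors.append("file too large")
    return errors
```

Checking only the extension is a weak guard; a real service would probably also inspect the content type or the container itself.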

Processing Pipeline

POST /transcription/{video_id} - Start audio transcription
GET /transcription/{video_id} - Get transcription results
POST /translation/{video_id} - Start text translation
GET /translation/{video_id} - Get translation results
POST /voice/clone/{video_id} - Start voice cloning and audio generation
GET /voice/{video_id} - Get dubbed audio results
POST /process/{video_id} - Start final video processing
GET /process/{video_id} - Get processed video results
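
The final processing step replaces the original audio track with the dubbed audio using ffmpeg. The project's exact command is not shown in this README, so the sketch below is one plausible invocation built from standard ffmpeg options; the function name and file paths are hypothetical.

```python
def build_replace_audio_cmd(video_path: str, audio_path: str, output_path: str) -> list[str]:
    """Build an ffmpeg argument list that swaps a video's audio track."""
    return [
        "ffmpeg",
        "-y",                # overwrite the output file if it exists
        "-i", video_path,    # input 0: original video
        "-i", audio_path,    # input 1: dubbed audio
        "-map", "0:v:0",     # take the video stream from input 0
        "-map", "1:a:0",     # take the audio stream from input 1
        "-c:v", "copy",      # copy video without re-encoding
        "-shortest",         # stop at the end of the shorter stream
        output_path,
    ]


cmd = build_replace_audio_cmd("original.mp4", "dubbed.mp3", "out.mp4")
```

The list can be passed to subprocess.run(cmd, check=True). Because no audio codec is specified, ffmpeg re-encodes the audio with its default codec for the output container.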

Results

GET /process/results/{video_id} - Get complete processing results

Workflow

1. Register/Login to get a JWT token
2. Upload Video with source and target languages
3. Transcribe the audio from the video
4. Translate the transcribed text
5. Clone Voice and generate dubbed audio
6. Process Video to replace the original audio with the dubbed audio
7. Download the final dubbed video
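
The workflow maps onto the endpoints listed under API Endpoints. The sketch below only builds the ordered sequence of calls for one uploaded video; it sends nothing over the network. The helper names, the integer video_id, and the Bearer authorization scheme are assumptions, since request and response formats are not documented here.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Headers for an authenticated request (assumes the Bearer scheme)."""
    return {"Authorization": f"Bearer {token}"}


def pipeline_calls(video_id: int) -> list[tuple[str, str]]:
    """Return (method, path) pairs for the processing steps, in workflow order."""
    return [
        ("POST", f"/transcription/{video_id}"),   # transcribe the audio
        ("POST", f"/translation/{video_id}"),     # translate the transcription
        ("POST", f"/voice/clone/{video_id}"),     # clone voice, generate dubbed audio
        ("POST", f"/process/{video_id}"),         # replace audio in the video
        ("GET", f"/process/results/{video_id}"),  # fetch the complete results
    ]


calls = pipeline_calls(42)
headers = auth_headers("example-token")
```

Each POST endpoint is described as starting a step, so a client may need to poll the matching GET endpoint until that step finishes before issuing the next call.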

Environment Variables Reference

Variable	Description	Required
`SECRET_KEY`	JWT secret key for authentication	Yes
`AWS_ACCESS_KEY_ID`	AWS access key for S3	Yes
`AWS_SECRET_ACCESS_KEY`	AWS secret key for S3	Yes
`AWS_REGION`	AWS region (default: us-east-1)	No
`S3_BUCKET_NAME`	S3 bucket name for file storage	Yes
`OPENAI_API_KEY`	OpenAI API key for Whisper and GPT-4	Yes
`ELEVENLABS_API_KEY`	ElevenLabs API key for voice cloning	Yes
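
To fail fast on a missing variable, the table above can be turned into a startup check. This is a sketch under assumptions: the function name is hypothetical, it reads the process environment rather than parsing the .env file, and how the project actually loads its settings is not shown in this README.

```python
import os
from typing import Mapping

# Variables marked "Yes" in the Required column.
REQUIRED = (
    "SECRET_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_BUCKET_NAME",
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
)


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect settings, raising if any required variable is unset or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {name: env[name] for name in REQUIRED}
    # AWS_REGION is optional and defaults to us-east-1.
    settings["AWS_REGION"] = env.get("AWS_REGION") or "us-east-1"
    return settings
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary.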

File Storage Structure

Files are stored in S3 with the following structure:

/videos/{uuid}.mp4        - Original uploaded videos
/dubbed_audio/{uuid}.mp3  - Generated dubbed audio files
/processed_videos/{uuid}.mp4 - Final processed videos
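
A small helper can derive these object keys from an identifier. This sketch makes two assumptions: that the three files share a single UUID, and that keys are stored without the leading slash shown in the listing. The function name is hypothetical.

```python
import uuid


def storage_keys(video_uuid: str) -> dict[str, str]:
    """Build the S3 object keys for one video's files."""
    return {
        "video": f"videos/{video_uuid}.mp4",
        "dubbed_audio": f"dubbed_audio/{video_uuid}.mp3",
        "processed_video": f"processed_videos/{video_uuid}.mp4",
    }


# Generate a fresh random identifier for a new upload.
keys = storage_keys(str(uuid.uuid4()))
```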

Database Schema

users: User accounts with email/password
videos: Video metadata and processing status
transcriptions: Audio transcriptions
translations: Translated text
dubbed_audios: Generated audio files
dubbed_videos: Final processed videos

Status Tracking

Videos have the following status values:

uploaded - Video uploaded successfully
transcribing - Audio transcription in progress
transcribed - Transcription completed
translating - Text translation in progress
translated - Translation completed
voice_cloning - Voice cloning and audio generation in progress
voice_cloned - Dubbed audio generated
processing_video - Final video processing in progress
completed - All processing completed
*_failed - Various failure states
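
The status values above can be modeled as an enumeration. The sketch below assumes the statuses advance strictly in the order listed, which the README implies but does not state. The failure states are left out because their exact names are not given here, and the class and function names are hypothetical.

```python
from enum import Enum
from typing import Optional


class VideoStatus(str, Enum):
    UPLOADED = "uploaded"
    TRANSCRIBING = "transcribing"
    TRANSCRIBED = "transcribed"
    TRANSLATING = "translating"
    TRANSLATED = "translated"
    VOICE_CLONING = "voice_cloning"
    VOICE_CLONED = "voice_cloned"
    PROCESSING_VIDEO = "processing_video"
    COMPLETED = "completed"


# Enum members iterate in definition order, which matches the listing above.
_ORDER = list(VideoStatus)


def next_status(current: VideoStatus) -> Optional[VideoStatus]:
    """Return the status that follows `current`, or None after the last one."""
    idx = _ORDER.index(current)
    return _ORDER[idx + 1] if idx + 1 < len(_ORDER) else None
```

Subclassing str lets the members compare equal to the plain string values a database column would hold.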

Development

Code Linting

ruff check . --fix

Project Structure

├── main.py                 # FastAPI application entry point
├── requirements.txt        # Python dependencies
├── alembic.ini            # Database migration configuration
├── app/
│   ├── db/                # Database configuration
│   ├── models/            # SQLAlchemy models
│   ├── routes/            # API endpoints
│   ├── services/          # Business logic and external API integrations
│   └── utils/             # Utility functions (auth, etc.)
└── alembic/
    └── versions/          # Database migration files