Automated Action 99937b3fd7 Add comprehensive Google OAuth integration for easy login/signup
Features:
- Complete Google OAuth 2.0 integration with ID token and authorization code flows
- Enhanced User model with Google OAuth fields (google_id, is_google_user, email_verified, profile_picture)
- Google OAuth service for token verification and user info extraction
- Multiple authentication endpoints:
  - GET /auth/google/oauth-url (get OAuth URL for frontend)
  - POST /auth/google/login-with-token (direct ID token login)
  - POST /auth/google/login-with-code (authorization code exchange)
- Smart user handling: creates new users or links existing accounts
- Issues own JWT tokens after Google authentication
- Database migration 004 for Google OAuth fields
- Enhanced login logic to handle Google vs password users
- Comprehensive README with Google OAuth setup instructions
- Frontend integration examples for both OAuth flows

Google OAuth automatically:
- Creates user accounts on first login
- Links existing email accounts to Google
- Extracts profile information (name, picture, locale)
- Verifies email addresses
- Issues secure JWT tokens for API access
2025-06-25 05:49:54 +00:00

AI Video Dubbing API

A FastAPI backend for an AI-powered video dubbing tool that allows content creators to upload short-form videos, transcribe audio, translate to different languages, clone voices, and generate dubbed videos with lip-sync.

Features

  • 🔐 Authentication: JWT-based user registration and login
  • 👤 User Profiles: Complete profile management with settings
  • 📁 Video Upload: Upload MP4/MOV files to Amazon S3 (max 200MB)
  • 🧠 Transcription: Audio transcription using OpenAI Whisper API
  • 🌍 Translation: Text translation using GPT-4 API
  • 🗣️ Voice Cloning: Voice synthesis using ElevenLabs API
  • 🎥 Video Processing: Audio replacement and video processing with ffmpeg

Tech Stack

  • FastAPI - Modern, fast web framework
  • SQLite - Database with SQLAlchemy ORM
  • Amazon S3 - File storage
  • OpenAI Whisper - Audio transcription
  • GPT-4 - Text translation
  • ElevenLabs - Voice cloning and synthesis
  • ffmpeg - Video/audio processing

Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Set Environment Variables

Create a .env file in the root directory with the following variables:

# Authentication
SECRET_KEY=your-secret-key-change-this-in-production

# Google OAuth Configuration
GOOGLE_CLIENT_ID=your-google-client-id
GOOGLE_CLIENT_SECRET=your-google-client-secret
GOOGLE_REDIRECT_URI=http://localhost:3000/auth/google/callback

# AWS S3 Configuration
AWS_ACCESS_KEY_ID=your-aws-access-key
AWS_SECRET_ACCESS_KEY=your-aws-secret-key
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-s3-bucket-name

# OpenAI Configuration
OPENAI_API_KEY=your-openai-api-key

# ElevenLabs Configuration
ELEVENLABS_API_KEY=your-elevenlabs-api-key

3. Run Database Migrations

The database will be automatically created when you start the application. The SQLite database will be stored at /app/storage/db/db.sqlite.

4. Start the Application

python main.py

Or with uvicorn:

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

With the uvicorn command above, the API listens on port 8000 (http://localhost:8000 when run locally).

API Endpoints

Authentication

  • POST /auth/register - User registration with email/password
  • POST /auth/login - User login with email/password
  • GET /auth/google/oauth-url - Get Google OAuth URL for frontend
  • POST /auth/google/login-with-token - Login/signup with Google ID token
  • POST /auth/google/login-with-code - Login/signup with Google authorization code
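
All of the endpoints above end by handing the client a JWT signed with SECRET_KEY. The README does not say which library, algorithm, or claims the project uses, so the following is only a minimal standard-library sketch of HS256 signing and verification. The function names `issue_token` and `verify_token`, the `sub`/`iat`/`exp` claim layout, and the one-hour lifetime are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the padding that was stripped."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    """HMAC-SHA256 over the 'header.payload' string."""
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def issue_token(user_id: int, secret: str, ttl_seconds: int = 3600,
                now: Optional[float] = None) -> str:
    """Return a signed token whose `sub` claim is the user id (assumed layout)."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": str(user_id), "iat": issued, "exp": issued + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the payload if the signature and expiry are valid, else raise ValueError.

    This sketch always verifies with HS256 and ignores the header's `alg` field.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= int(time.time() if now is None else now):
        raise ValueError("token expired")
    return payload
```

In a real deployment a maintained JWT library is the safer choice; the sketch only shows what "issuing a JWT" involves.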

Profile Management

  • GET /profile/ - Get user profile
  • PUT /profile/ - Update profile information
  • PUT /profile/password - Update password
  • PUT /profile/email - Update email address
  • DELETE /profile/ - Delete user account

Video Management

  • POST /videos/upload - Upload video with language settings
  • GET /videos/ - Get user's videos
  • GET /videos/{video_id} - Get specific video details
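
The feature list limits uploads to MP4/MOV files of at most 200MB. A pre-upload check along those lines might look like the sketch below. The function name `validate_upload`, the error messages, and the reading of "200MB" as 200 MiB are assumptions, not the project's actual code.

```python
from pathlib import PurePosixPath

ALLOWED_EXTENSIONS = {".mp4", ".mov"}
# Assumption: "200MB" is read as 200 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> list:
    """Return a list of human-readable problems; an empty list means the file is acceptable."""
    errors = []
    extension = PurePosixPath(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        errors.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes <= 0:
        errors.append("file is empty")
    elif size_bytes > MAX_UPLOAD_BYTES:
        errors.append(f"file exceeds {MAX_UPLOAD_BYTES} bytes")
    return errors
```

Returning a list rather than raising on the first problem lets a client show every issue at once.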

Processing Pipeline

  • POST /transcription/{video_id} - Start audio transcription
  • GET /transcription/{video_id} - Get transcription results
  • POST /translation/{video_id} - Start text translation
  • GET /translation/{video_id} - Get translation results
  • POST /voice/clone/{video_id} - Start voice cloning and audio generation
  • GET /voice/{video_id} - Get dubbed audio results
  • POST /process/{video_id} - Start final video processing
  • GET /process/{video_id} - Get processed video results

Results

  • GET /process/results/{video_id} - Get complete processing results

Google OAuth Setup

1. Create Google OAuth Application

  1. Go to Google Cloud Console
  2. Create a new project or select existing one
  3. Configure the OAuth consent screen (the Google+ API has been shut down and does not need to be enabled for sign-in)
  4. Go to "Credentials" → "Create Credentials" → "OAuth client ID"
  5. Choose "Web application"
  6. Add authorized redirect URIs:
    • http://localhost:3000/auth/google/callback (for development)
    • Your production callback URL

2. Configure Environment Variables

Add these to your .env file:

GOOGLE_CLIENT_ID=your-google-oauth-client-id
GOOGLE_CLIENT_SECRET=your-google-oauth-client-secret
GOOGLE_REDIRECT_URI=http://localhost:3000/auth/google/callback

3. Frontend Integration

Option 1: Direct Token Method

// Use Google's JavaScript library to get ID token
const response = await fetch('/auth/google/login-with-token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ id_token: googleIdToken })
});

Option 2: Authorization Code Method

// Redirect user to Google OAuth URL, then exchange code
const oauthUrl = await fetch('/auth/google/oauth-url').then(r => r.json());
// Redirect to oauthUrl.oauth_url
// On callback, exchange code:
const response = await fetch('/auth/google/login-with-code', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ 
    code: authorizationCode,
    redirect_uri: 'http://localhost:3000/auth/google/callback'
  })
});
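
For the authorization code method, the value returned by GET /auth/google/oauth-url is presumably a link to Google's authorization endpoint. The sketch below shows how such a URL can be assembled on the backend. The function name `build_google_oauth_url`, the chosen scopes, and the use of a `state` parameter are assumptions; only the endpoint and the parameter names come from Google's OAuth 2.0 documentation.

```python
import secrets
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def new_state() -> str:
    """Random value the frontend should get back unchanged on the callback (CSRF guard)."""
    return secrets.token_urlsafe(16)


def build_google_oauth_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the URL the frontend redirects the user to.

    redirect_uri must match one of the URIs registered for the OAuth client.
    """
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",          # authorization code flow
        "scope": "openid email profile",  # assumed scopes: identity, email, basic profile
        "state": state,
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"
```

The same `redirect_uri` has to be sent again when the code is exchanged, which is why the login-with-code request above includes it.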

Workflow

  1. Register/Login (Email/Password or Google OAuth) to get JWT token
  2. Upload Video with source and target languages
  3. Transcribe the audio from the video
  4. Translate the transcribed text
  5. Clone Voice and generate dubbed audio
  6. Process Video to replace original audio with dubbed audio
  7. Download the final dubbed video
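
Steps 3 to 7 map onto the Processing Pipeline and Results endpoints listed earlier. As a rough sketch, the ordered requests for one video can be generated as follows. The names `pipeline_requests` and `auth_headers` are hypothetical, and sending the JWT as a Bearer token is an assumption, since the README does not state how the token is passed.

```python
# (start endpoint, result endpoint) for each stage, taken from the endpoint list.
PIPELINE_STEPS = (
    ("/transcription/{video_id}", "/transcription/{video_id}"),
    ("/translation/{video_id}", "/translation/{video_id}"),
    ("/voice/clone/{video_id}", "/voice/{video_id}"),
    ("/process/{video_id}", "/process/{video_id}"),
)


def auth_headers(token: str) -> dict:
    """Assumption: the API expects the JWT in a standard Bearer Authorization header."""
    return {"Authorization": f"Bearer {token}"}


def pipeline_requests(video_id) -> list:
    """Return the (method, path) pairs to call, in order, for one uploaded video."""
    requests = []
    for start_path, result_path in PIPELINE_STEPS:
        requests.append(("POST", start_path.format(video_id=video_id)))  # start the stage
        requests.append(("GET", result_path.format(video_id=video_id)))  # fetch its result
    requests.append(("GET", f"/process/results/{video_id}"))  # combined results
    return requests
```

A client would issue each POST, then repeat the matching GET until that stage reports completion before moving on.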

Environment Variables Reference

| Variable | Description | Required |
|---|---|---|
| SECRET_KEY | JWT secret key for authentication | Yes |
| GOOGLE_CLIENT_ID | Google OAuth client ID | No* |
| GOOGLE_CLIENT_SECRET | Google OAuth client secret | No* |
| GOOGLE_REDIRECT_URI | Google OAuth redirect URI | No* |
| AWS_ACCESS_KEY_ID | AWS access key for S3 | Yes |
| AWS_SECRET_ACCESS_KEY | AWS secret key for S3 | Yes |
| AWS_REGION | AWS region (default: us-east-1) | No |
| S3_BUCKET_NAME | S3 bucket name for file storage | Yes |
| OPENAI_API_KEY | OpenAI API key for Whisper and GPT-4 | Yes |
| ELEVENLABS_API_KEY | ElevenLabs API key for voice cloning | Yes |

*Required only if Google OAuth is enabled
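
A startup check can catch missing settings early. The sketch below encodes the Required column of the reference above. The helper names `missing_settings` and `aws_region` are hypothetical, and treating Google OAuth as "enabled" once any GOOGLE_* variable is set is an assumption, since the README does not say how the feature is switched on.

```python
from typing import Mapping

REQUIRED_VARS = (
    "SECRET_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_BUCKET_NAME",
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
)
GOOGLE_VARS = ("GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET", "GOOGLE_REDIRECT_URI")
DEFAULT_AWS_REGION = "us-east-1"


def missing_settings(env: Mapping[str, str]) -> list:
    """Return the names of variables that are required but unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    # Assumption: Google OAuth counts as enabled once any GOOGLE_* variable is set,
    # at which point all three become required.
    if any(env.get(name) for name in GOOGLE_VARS):
        missing += [name for name in GOOGLE_VARS if not env.get(name)]
    return missing


def aws_region(env: Mapping[str, str]) -> str:
    """AWS_REGION is optional and falls back to the documented default."""
    return env.get("AWS_REGION") or DEFAULT_AWS_REGION
```

Calling `missing_settings(os.environ)` at startup and refusing to run when the list is non-empty gives a clearer failure than an error deep inside an S3 or OpenAI call.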

File Storage Structure

Files are stored in S3 with the following structure:

/videos/{uuid}.mp4        - Original uploaded videos
/dubbed_audio/{uuid}.mp3  - Generated dubbed audio files
/processed_videos/{uuid}.mp4 - Final processed videos
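
The layout above can be captured in one small helper so that every service derives object keys the same way. This is a sketch under two assumptions: that a single UUID is shared by a video and its derived files, and that the leading slash shown above is not part of the stored S3 key. The names `new_file_id` and `storage_keys` are hypothetical.

```python
import uuid


def new_file_id() -> str:
    """Generate a fresh identifier for an uploaded video."""
    return str(uuid.uuid4())


def storage_keys(file_id: str) -> dict:
    """Return the S3 object keys for one video and its derived files."""
    return {
        "video": f"videos/{file_id}.mp4",                      # original upload
        "dubbed_audio": f"dubbed_audio/{file_id}.mp3",         # generated dubbed audio
        "processed_video": f"processed_videos/{file_id}.mp4",  # final output
    }
```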

Database Schema

  • users: User accounts (email/password, plus the Google OAuth fields google_id, is_google_user, email_verified, and profile_picture)
  • videos: Video metadata and processing status
  • transcriptions: Audio transcriptions
  • translations: Translated text
  • dubbed_audios: Generated audio files
  • dubbed_videos: Final processed videos
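
The commit notes at the top describe the users table gaining google_id, is_google_user, email_verified, and profile_picture, with Google logins either creating a user or linking an existing account. The sketch below models that decision with an in-memory dict instead of SQLAlchemy. The `User` dataclass, the `login_or_link` function, the `hashed_password` field name, and the exact linking rules (verified email required, is_google_user set only for accounts created through Google) are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class User:
    email: str
    hashed_password: Optional[str] = None  # None for accounts created through Google
    google_id: Optional[str] = None
    is_google_user: bool = False
    email_verified: bool = False
    profile_picture: Optional[str] = None


def login_or_link(users: Dict[str, User], google_id: str, email: str,
                  picture: Optional[str], email_verified: bool) -> User:
    """Find, create, or link the account for a verified Google identity."""
    key = email.lower()
    user = users.get(key)
    if user is None:
        # First login: create a Google-only account.
        user = User(email=key, google_id=google_id, is_google_user=True,
                    email_verified=email_verified, profile_picture=picture)
        users[key] = user
    elif user.google_id is None:
        # Existing password account: link it, but only if Google vouches for the email.
        if not email_verified:
            raise ValueError("cannot link an unverified Google email")
        user.google_id = google_id
        user.email_verified = True
        user.profile_picture = user.profile_picture or picture
    elif user.google_id != google_id:
        raise ValueError("email is already linked to a different Google account")
    return user
```

Requiring a verified email before linking prevents someone from taking over a password account by registering its address with an unverified Google identity.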

Status Tracking

Videos have the following status values:

  • uploaded - Video uploaded successfully
  • transcribing - Audio transcription in progress
  • transcribed - Transcription completed
  • translating - Text translation in progress
  • translated - Translation completed
  • voice_cloning - Voice cloning and audio generation in progress
  • voice_cloned - Dubbed audio generated
  • processing_video - Final video processing in progress
  • completed - All processing completed
  • *_failed - Various failure states
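
These values form a linear progression, which a client can use to show progress or decide whether to keep polling. The sketch below encodes the order listed above. The helper names are hypothetical, and because the README only gives the `*_failed` pattern, failure states are detected by suffix rather than by name.

```python
from typing import Optional

# Successful progression, in the order listed above.
HAPPY_PATH = (
    "uploaded",
    "transcribing",
    "transcribed",
    "translating",
    "translated",
    "voice_cloning",
    "voice_cloned",
    "processing_video",
    "completed",
)


def is_failed(status: str) -> bool:
    """Failure states are only documented as '*_failed', so match on the suffix."""
    return status.endswith("_failed")


def is_terminal(status: str) -> bool:
    """True once no further automatic progress is expected."""
    return status == "completed" or is_failed(status)


def next_status(status: str) -> Optional[str]:
    """Return the status that follows on success, or None after 'completed'."""
    if status not in HAPPY_PATH:
        raise ValueError(f"no successor defined for status: {status}")
    index = HAPPY_PATH.index(status)
    return HAPPY_PATH[index + 1] if index + 1 < len(HAPPY_PATH) else None
```

A polling loop can stop as soon as `is_terminal` returns True for the video's current status.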

Development

Code Linting

ruff check . --fix

Project Structure

├── main.py                 # FastAPI application entry point
├── requirements.txt        # Python dependencies
├── alembic.ini            # Database migration configuration
├── app/
│   ├── db/                # Database configuration
│   ├── models/            # SQLAlchemy models
│   ├── routes/            # API endpoints
│   ├── services/          # Business logic and external API integrations
│   └── utils/             # Utility functions (auth, etc.)
└── alembic/
    └── versions/          # Database migration files