Automated Action 99937b3fd7 Add comprehensive Google OAuth integration for easy login/signup

Features:
- Complete Google OAuth 2.0 integration with ID token and authorization code flows
- Enhanced User model with Google OAuth fields (google_id, is_google_user, email_verified, profile_picture)
- Google OAuth service for token verification and user info extraction
- Multiple authentication endpoints:
  - GET /auth/google/oauth-url (get OAuth URL for frontend)
  - POST /auth/google/login-with-token (direct ID token login)
  - POST /auth/google/login-with-code (authorization code exchange)
- Smart user handling: creates new users or links existing accounts
- Issues own JWT tokens after Google authentication
- Database migration 004 for Google OAuth fields
- Enhanced login logic to handle Google vs password users
- Comprehensive README with Google OAuth setup instructions
- Frontend integration examples for both OAuth flows

Google OAuth automatically:
- Creates user accounts on first login
- Links existing email accounts to Google
- Extracts profile information (name, picture, locale)
- Verifies email addresses
- Issues secure JWT tokens for API access

2025-06-25 05:49:54 +00:00

7.9 KiB

AI Video Dubbing API

A FastAPI backend for an AI-powered video dubbing tool that allows content creators to upload short-form videos, transcribe audio, translate to different languages, clone voices, and generate dubbed videos with lip-sync.

Features

🔐 Authentication: JWT-based user registration and login
👤 User Profiles: Complete profile management with settings
📁 Video Upload: Upload MP4/MOV files to Amazon S3 (max 200MB)
🧠 Transcription: Audio transcription using OpenAI Whisper API
🌍 Translation: Text translation using GPT-4 API
🗣️ Voice Cloning: Voice synthesis using ElevenLabs API
🎥 Video Processing: Audio replacement and video processing with ffmpeg
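
The upload limits listed above (MP4/MOV, max 200MB) could be checked before a file is sent to S3. The following is a minimal sketch, not code from this repository: `validate_upload` is a hypothetical helper, the check is assumed to be on the file extension, and "200MB" is assumed to mean 200 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Assumption: "200MB" is interpreted as 200 MiB.
MAX_UPLOAD_BYTES = 200 * 1024 * 1024
ALLOWED_EXTENSIONS = {".mp4", ".mov"}


def validate_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the upload is acceptable."""
    problems = []
    # Compare the extension case-insensitively so "clip.MP4" is accepted.
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported file type (expected MP4 or MOV)")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds the 200MB limit")
    return problems
```

Returning a list of problems rather than raising on the first one lets an endpoint report every issue in a single response.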

Tech Stack

FastAPI - Modern, fast web framework
SQLite - Database with SQLAlchemy ORM
Amazon S3 - File storage
OpenAI Whisper - Audio transcription
GPT-4 - Text translation
ElevenLabs - Voice cloning and synthesis
ffmpeg - Video/audio processing

Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Set Environment Variables

Create a .env file in the root directory with the following variables:

# Authentication
SECRET_KEY=your-secret-key-change-this-in-production

# Google OAuth Configuration
GOOGLE_CLIENT_ID=your-google-client-id
GOOGLE_CLIENT_SECRET=your-google-client-secret
GOOGLE_REDIRECT_URI=http://localhost:3000/auth/google/callback

# AWS S3 Configuration
AWS_ACCESS_KEY_ID=your-aws-access-key
AWS_SECRET_ACCESS_KEY=your-aws-secret-key
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-s3-bucket-name

# OpenAI Configuration
OPENAI_API_KEY=your-openai-api-key

# ElevenLabs Configuration
ELEVENLABS_API_KEY=your-elevenlabs-api-key

3. Run Database Migrations

The database will be automatically created when you start the application. The SQLite database will be stored at /app/storage/db/db.sqlite.

4. Start the Application

python main.py

Or with uvicorn:

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

The API will be available at:

API: http://localhost:8000
Documentation: http://localhost:8000/docs
Alternative Docs: http://localhost:8000/redoc
Health Check: http://localhost:8000/health

API Endpoints

Authentication

POST /auth/register - User registration with email/password
POST /auth/login - User login with email/password
GET /auth/google/oauth-url - Get Google OAuth URL for frontend
POST /auth/google/login-with-token - Login/signup with Google ID token
POST /auth/google/login-with-code - Login/signup with Google authorization code
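
All of the endpoints above end by issuing a JWT signed with `SECRET_KEY`. To illustrate what that involves, here is a standard-library sketch of HS256 signing and verification. It is an assumption that the API uses HS256 and a `sub`/`exp` payload; the real service most likely relies on a JWT library, and `issue_token` and `verify_token` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(subject: str, secret: str, ttl_seconds: int = 3600) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: str) -> dict:
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(_b64url_decode(signature), expected):
        raise ValueError("invalid signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload
```

The signature is checked before the payload is parsed, so a tampered token is rejected without its contents being trusted.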

Profile Management

GET /profile/ - Get user profile
PUT /profile/ - Update profile information
PUT /profile/password - Update password
PUT /profile/email - Update email address
DELETE /profile/ - Delete user account

Video Management

POST /videos/upload - Upload video with language settings
GET /videos/ - Get user's videos
GET /videos/{video_id} - Get specific video details

Processing Pipeline

POST /transcription/{video_id} - Start audio transcription
GET /transcription/{video_id} - Get transcription results
POST /translation/{video_id} - Start text translation
GET /translation/{video_id} - Get translation results
POST /voice/clone/{video_id} - Start voice cloning and audio generation
GET /voice/{video_id} - Get dubbed audio results
POST /process/{video_id} - Start final video processing
GET /process/{video_id} - Get processed video results

Results

GET /process/results/{video_id} - Get complete processing results

Google OAuth Setup

1. Create Google OAuth Application

Go to Google Cloud Console
Create a new project or select existing one
Configure the OAuth consent screen (the Google+ API has been shut down and is not needed for Google sign-in)
Go to "Credentials" → "Create Credentials" → "OAuth client ID"
Choose "Web application"
Add authorized redirect URIs:
- http://localhost:3000/auth/google/callback (for development)
- Your production callback URL

2. Configure Environment Variables

Add these to your .env file:

GOOGLE_CLIENT_ID=your-google-oauth-client-id
GOOGLE_CLIENT_SECRET=your-google-oauth-client-secret
GOOGLE_REDIRECT_URI=http://localhost:3000/auth/google/callback

3. Frontend Integration

Option 1: Direct Token Method

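// Use Google's JavaScript library to get the ID token
// (with Google Identity Services, it is the `credential` field of the sign-in callback response)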
const response = await fetch('/auth/google/login-with-token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ id_token: googleIdToken })
});

Option 2: Authorization Code Method

// Redirect user to Google OAuth URL, then exchange code
const oauthUrl = await fetch('/auth/google/oauth-url').then(r => r.json());
// Redirect to oauthUrl.oauth_url
// On callback, exchange code:
const response = await fetch('/auth/google/login-with-code', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ 
    code: authorizationCode,
    redirect_uri: 'http://localhost:3000/auth/google/callback'
  })
});
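
On the backend side, the URL returned by GET /auth/google/oauth-url could be assembled roughly as follows. This is a sketch under assumptions, not the repository's implementation: `build_google_oauth_url` is a hypothetical name, and the `openid email profile` scope and the random `state` value are choices made for illustration. The authorization endpoint is Google's standard OAuth 2.0 endpoint.

```python
import secrets
from typing import Optional
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_google_oauth_url(
    client_id: str, redirect_uri: str, state: Optional[str] = None
) -> str:
    params = {
        "client_id": client_id,
        # Must match one of the redirect URIs registered in Google Cloud Console.
        "redirect_uri": redirect_uri,
        "response_type": "code",
        # Assumption: these scopes cover the email and profile data the API reads.
        "scope": "openid email profile",
        # A random state value lets the callback detect forged requests.
        "state": state or secrets.token_urlsafe(16),
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"
```

The `redirect_uri` used here has to be the same value the frontend later sends to POST /auth/google/login-with-code, since Google rejects a code exchange when the two differ.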

Workflow

Register/Login (Email/Password or Google OAuth) to get JWT token
Upload Video with source and target languages
Transcribe the audio from the video
Translate the transcribed text
Clone Voice and generate dubbed audio
Process Video to replace original audio with dubbed audio
Download the final dubbed video
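
The processing part of this workflow maps onto the endpoints listed under Processing Pipeline. The sketch below only builds the ordered list of requests for one video; `pipeline_requests` is a hypothetical helper, and authentication and the HTTP client are left out. Because each POST is described as starting a step, a client would presumably poll the matching GET endpoint before moving on; that polling behaviour is an assumption.

```python
# Ordered (method, path) pairs taken from the API Endpoints section.
PIPELINE_STEPS = [
    ("POST", "/transcription/{video_id}"),
    ("POST", "/translation/{video_id}"),
    ("POST", "/voice/clone/{video_id}"),
    ("POST", "/process/{video_id}"),
    ("GET", "/process/results/{video_id}"),
]


def pipeline_requests(base_url: str, video_id: str) -> list:
    """Return the (method, url) pairs to call, in order, for one uploaded video."""
    base = base_url.rstrip("/")
    return [
        (method, base + path.format(video_id=video_id))
        for method, path in PIPELINE_STEPS
    ]
```

For example, `pipeline_requests("http://localhost:8000", "42")` starts with the transcription request and ends with the combined results request.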

Environment Variables Reference

| Variable | Description | Required |
|---|---|---|
| `SECRET_KEY` | JWT secret key for authentication | Yes |
| `GOOGLE_CLIENT_ID` | Google OAuth client ID | No* |
| `GOOGLE_CLIENT_SECRET` | Google OAuth client secret | No* |
| `GOOGLE_REDIRECT_URI` | Google OAuth redirect URI | No* |
| `AWS_ACCESS_KEY_ID` | AWS access key for S3 | Yes |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key for S3 | Yes |
| `AWS_REGION` | AWS region (default: us-east-1) | No |
| `S3_BUCKET_NAME` | S3 bucket name for file storage | Yes |
| `OPENAI_API_KEY` | OpenAI API key for Whisper and GPT-4 | Yes |
| `ELEVENLABS_API_KEY` | ElevenLabs API key for voice cloning | Yes |

*Required only if Google OAuth is enabled
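
A startup check along these lines could report missing variables early. It is a sketch: `missing_settings` is a hypothetical helper, and treating Google OAuth as enabled as soon as any of its three variables is set is an assumption about how the feature is switched on.

```python
import os
from typing import Mapping

REQUIRED = (
    "SECRET_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_BUCKET_NAME",
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
)
GOOGLE_OAUTH = ("GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET", "GOOGLE_REDIRECT_URI")


def missing_settings(env: Mapping[str, str] = os.environ) -> list:
    """Return the names of variables that are required but unset or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    # Assumption: Google OAuth counts as enabled once any of its variables is set,
    # at which point all three become required.
    if any(env.get(name) for name in GOOGLE_OAUTH):
        missing += [name for name in GOOGLE_OAUTH if not env.get(name)]
    return missing
```

`AWS_REGION` is not checked because it has a default of us-east-1.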

File Storage Structure

Files are stored in S3 with the following structure:

/videos/{uuid}.mp4        - Original uploaded videos
/dubbed_audio/{uuid}.mp3  - Generated dubbed audio files
/processed_videos/{uuid}.mp4 - Final processed videos
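
Object keys following this layout could be generated as shown below. This is a sketch with two assumptions: the leading slash in the listing is taken to be presentational (the keys here omit it), and a single UUID is reused for all three objects belonging to one video. `storage_keys` is a hypothetical helper.

```python
import uuid
from typing import Optional


def storage_keys(file_id: Optional[str] = None) -> dict:
    """Build the S3 object keys for one video, generating a UUID if none is given."""
    # Assumption: one UUID identifies the upload and both derived files.
    file_id = file_id or str(uuid.uuid4())
    return {
        "video": f"videos/{file_id}.mp4",
        "dubbed_audio": f"dubbed_audio/{file_id}.mp3",
        "processed_video": f"processed_videos/{file_id}.mp4",
    }
```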

Database Schema

users: User accounts (email/password or Google OAuth)
videos: Video metadata and processing status
transcriptions: Audio transcriptions
translations: Translated text
dubbed_audios: Generated audio files
dubbed_videos: Final processed videos

Status Tracking

Videos have the following status values:

uploaded - Video uploaded successfully
transcribing - Audio transcription in progress
transcribed - Transcription completed
translating - Text translation in progress
translated - Translation completed
voice_cloning - Voice cloning and audio generation in progress
voice_cloned - Dubbed audio generated
processing_video - Final video processing in progress
completed - All processing completed
*_failed - Various failure states
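
A client polling a video could turn these status values into a progress figure. The sketch below assumes the statuses advance in the order listed and that any value ending in `_failed` is a failure state; the exact failure names are not listed here, so `transcribing_failed` in the usage note is a made-up example. `progress_percent` is a hypothetical helper.

```python
from typing import Optional

# Status values in the order they are listed above.
STATUS_ORDER = [
    "uploaded",
    "transcribing",
    "transcribed",
    "translating",
    "translated",
    "voice_cloning",
    "voice_cloned",
    "processing_video",
    "completed",
]


def progress_percent(status: str) -> Optional[int]:
    """Return 0-100 for a known status, or None for a *_failed status."""
    if status.endswith("_failed"):
        return None
    # list.index raises ValueError for a status that is not in the list.
    position = STATUS_ORDER.index(status)
    return round(100 * position / (len(STATUS_ORDER) - 1))
```

With this mapping, `progress_percent("translated")` is 50 and `progress_percent("transcribing_failed")` is `None`, which a client can treat as a signal to stop polling.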

Development

Code Linting

ruff check . --fix

Project Structure

├── main.py                 # FastAPI application entry point
├── requirements.txt        # Python dependencies
├── alembic.ini            # Database migration configuration
├── app/
│   ├── db/                # Database configuration
│   ├── models/            # SQLAlchemy models
│   ├── routes/            # API endpoints
│   ├── services/          # Business logic and external API integrations
│   └── utils/             # Utility functions (auth, etc.)
└── alembic/
    └── versions/          # Database migration files
