Implement complete AI video dubbing backend with FastAPI

Features:
- JWT authentication with user registration and login
- Video upload to Amazon S3 with file validation (200MB limit)
- Audio transcription using OpenAI Whisper API
- Text translation using GPT-4 API
- Voice cloning and audio synthesis using ElevenLabs API
- Video processing with ffmpeg for audio replacement
- Complete SQLite database with proper models and migrations
- Background task processing for long-running operations
- Health endpoint and comprehensive API documentation

Tech stack:
- FastAPI with SQLAlchemy ORM
- SQLite database with Alembic migrations
- Amazon S3 for file storage
- OpenAI APIs for transcription and translation
- ElevenLabs API for voice cloning
- ffmpeg for video processing
- JWT authentication with bcrypt password hashing
This commit is contained in:
Automated Action 2025-06-24 17:56:12 +00:00
parent 751754cf6d
commit 92e4d992b2
33 changed files with 2067 additions and 2 deletions

174
README.md
View File

@ -1,3 +1,173 @@
# FastAPI Application # AI Video Dubbing API
This is a FastAPI application bootstrapped by BackendIM, the AI-powered backend generation platform. A FastAPI backend for an AI-powered video dubbing tool that allows content creators to upload short-form videos, transcribe audio, translate to different languages, clone voices, and generate dubbed videos with lip-sync.
## Features
🔐 **Authentication**: JWT-based user registration and login
📁 **Video Upload**: Upload MP4/MOV files to Amazon S3 (max 200MB)
🧠 **Transcription**: Audio transcription using OpenAI Whisper API
🌍 **Translation**: Text translation using GPT-4 API
🗣️ **Voice Cloning**: Voice synthesis using ElevenLabs API
🎥 **Video Processing**: Audio replacement and video processing with ffmpeg
## Tech Stack
- **FastAPI** - Modern, fast web framework
- **SQLite** - Database with SQLAlchemy ORM
- **Amazon S3** - File storage
- **OpenAI Whisper** - Audio transcription
- **GPT-4** - Text translation
- **ElevenLabs** - Voice cloning and synthesis
- **ffmpeg** - Video/audio processing
## Quick Start
### 1. Install Dependencies
```bash
pip install -r requirements.txt
```
### 2. Set Environment Variables
Create a `.env` file in the root directory with the following variables:
```env
# Authentication
SECRET_KEY=your-secret-key-change-this-in-production
# AWS S3 Configuration
AWS_ACCESS_KEY_ID=your-aws-access-key
AWS_SECRET_ACCESS_KEY=your-aws-secret-key
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-s3-bucket-name
# OpenAI Configuration
OPENAI_API_KEY=your-openai-api-key
# ElevenLabs Configuration
ELEVENLABS_API_KEY=your-elevenlabs-api-key
```
### 3. Run Database Migrations
The database will be automatically created when you start the application. The SQLite database will be stored at `/app/storage/db/db.sqlite`.
### 4. Start the Application
```bash
python main.py
```
Or with uvicorn:
```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```
The API will be available at:
- **API**: http://localhost:8000
- **Documentation**: http://localhost:8000/docs
- **Alternative Docs**: http://localhost:8000/redoc
- **Health Check**: http://localhost:8000/health
## API Endpoints
### Authentication
- `POST /auth/register` - User registration
- `POST /auth/login` - User login
### Video Management
- `POST /videos/upload` - Upload video with language settings
- `GET /videos/` - Get user's videos
- `GET /videos/{video_id}` - Get specific video details
### Processing Pipeline
- `POST /transcription/{video_id}` - Start audio transcription
- `GET /transcription/{video_id}` - Get transcription results
- `POST /translation/{video_id}` - Start text translation
- `GET /translation/{video_id}` - Get translation results
- `POST /voice/clone/{video_id}` - Start voice cloning and audio generation
- `GET /voice/{video_id}` - Get dubbed audio results
- `POST /process/{video_id}` - Start final video processing
- `GET /process/{video_id}` - Get processed video results
### Results
- `GET /process/results/{video_id}` - Get complete processing results
## Workflow
1. **Register/Login** to get JWT token
2. **Upload Video** with source and target languages
3. **Transcribe** the audio from the video
4. **Translate** the transcribed text
5. **Clone Voice** and generate dubbed audio
6. **Process Video** to replace original audio with dubbed audio
7. **Download** the final dubbed video
## Environment Variables Reference
| Variable | Description | Required |
|----------|-------------|----------|
| `SECRET_KEY` | JWT secret key for authentication | Yes |
| `AWS_ACCESS_KEY_ID` | AWS access key for S3 | Yes |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key for S3 | Yes |
| `AWS_REGION` | AWS region (default: us-east-1) | No |
| `S3_BUCKET_NAME` | S3 bucket name for file storage | Yes |
| `OPENAI_API_KEY` | OpenAI API key for Whisper and GPT-4 | Yes |
| `ELEVENLABS_API_KEY` | ElevenLabs API key for voice cloning | Yes |
## File Storage Structure
Files are stored in S3 with the following structure:
```
/videos/{uuid}.mp4 - Original uploaded videos
/dubbed_audio/{uuid}.mp3 - Generated dubbed audio files
/processed_videos/{uuid}.mp4 - Final processed videos
```
## Database Schema
- **users**: User accounts with email/password
- **videos**: Video metadata and processing status
- **transcriptions**: Audio transcriptions
- **translations**: Translated text
- **dubbed_audios**: Generated audio files
- **dubbed_videos**: Final processed videos
## Status Tracking
Videos have the following status values:
- `uploaded` - Video uploaded successfully
- `transcribing` - Audio transcription in progress
- `transcribed` - Transcription completed
- `translating` - Text translation in progress
- `translated` - Translation completed
- `voice_cloning` - Voice cloning and audio generation in progress
- `voice_cloned` - Dubbed audio generated
- `processing_video` - Final video processing in progress
- `completed` - All processing completed
- `*_failed` - Various failure states
## Development
### Code Linting
```bash
ruff check . --fix
```
### Project Structure
```
├── main.py # FastAPI application entry point
├── requirements.txt # Python dependencies
├── alembic.ini # Database migration configuration
├── app/
│ ├── db/ # Database configuration
│ ├── models/ # SQLAlchemy models
│ ├── routes/ # API endpoints
│ ├── services/ # Business logic and external API integrations
│ └── utils/ # Utility functions (auth, etc.)
└── alembic/
└── versions/ # Database migration files
```

109
alembic.ini Normal file
View File

@ -0,0 +1,109 @@
# A generic, single database configuration.
[alembic]
# path to migration scripts
script_location = alembic
# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
# Uncomment the line below if you want the files to be prepended with date and time
# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s
# sys.path path, will be prepended to sys.path if present.
# defaults to the current working directory.
prepend_sys_path = .
# timezone to use when rendering the date within the migration file
# as well as the filename.
# If specified, requires the python-dateutil library that can be
# installed by adding `alembic[tz]` to the pip requirements
# string value is passed to dateutil.tz.gettz()
# leave blank for localtime
# timezone =
# max length of characters to apply to the
# "slug" field
# truncate_slug_length = 40
# set to 'true' to run the environment during
# the 'revision' command, regardless of autogenerate
# revision_environment = false
# set to 'true' to allow .pyc and .pyo files without
# a source .py file to be detected as revisions in the
# versions/ directory
# sourceless = false
# version path separator; As mentioned above, this is the character used to split
# version_locations. The default within new alembic.ini files is "os", which uses
# os.pathsep. If this key is omitted entirely, it falls back to the legacy
# behavior of splitting on spaces and/or commas.
# Valid values for version_path_separator are:
#
# version_path_separator = :
# version_path_separator = ;
# version_path_separator = space
version_path_separator = os
# set to 'true' to search source files recursively
# in each "version_locations" directory
# new in Alembic version 1.10
# recursive_version_locations = false
# the output encoding used when revision files
# are written from script.py.mako
# output_encoding = utf-8
sqlalchemy.url = sqlite:////app/storage/db/db.sqlite
[post_write_hooks]
# post_write_hooks defines scripts or Python functions that are run
# on newly generated revision scripts. See the documentation for further
# detail and examples
# format using "black" - use the console_scripts runner, against the "black" entrypoint
# hooks = black
# black.type = console_scripts
# black.entrypoint = black
# black.options = -l 79 REVISION_SCRIPT_FILENAME
# lint with attempts to fix using "ruff" - use the exec runner, execute a binary
# hooks = ruff
# ruff.type = exec
# ruff.executable = %(here)s/.venv/bin/ruff
# ruff.options = --fix REVISION_SCRIPT_FILENAME
# Logging configuration
[loggers]
keys = root,sqlalchemy,alembic
[handlers]
keys = console
[formatters]
keys = generic
[logger_root]
level = WARN
handlers = console
qualname =
[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine
[logger_alembic]
level = INFO
handlers =
qualname = alembic
[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic
[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S

81
alembic/env.py Normal file
View File

@ -0,0 +1,81 @@
from logging.config import fileConfig
from sqlalchemy import engine_from_config
from sqlalchemy import pool
from alembic import context
import sys
import os
# Add the project root to the Python path
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from app.db.base import Base
# this is the Alembic Config object, which provides
# access to the values within the .ini file in use.
config = context.config
# Interpret the config file for Python logging.
# This line sets up loggers basically.
if config.config_file_name is not None:
fileConfig(config.config_file_name)
# add your model's MetaData object here
# for 'autogenerate' support
target_metadata = Base.metadata
# other values from the config, defined by the needs of env.py,
# can be acquired:
# my_important_option = config.get_main_option("my_important_option")
# ... etc.
def run_migrations_offline() -> None:
"""Run migrations in 'offline' mode.
This configures the context with just a URL
and not an Engine, though an Engine is acceptable
here as well. By skipping the Engine creation
we don't even need a DBAPI to be available.
Calls to context.execute() here emit the given string to the
script output.
"""
url = config.get_main_option("sqlalchemy.url")
context.configure(
url=url,
target_metadata=target_metadata,
literal_binds=True,
dialect_opts={"paramstyle": "named"},
)
with context.begin_transaction():
context.run_migrations()
def run_migrations_online() -> None:
"""Run migrations in 'online' mode.
In this scenario we need to create an Engine
and associate a connection with the context.
"""
connectable = engine_from_config(
config.get_section(config.config_ini_section, {}),
prefix="sqlalchemy.",
poolclass=pool.NullPool,
)
with connectable.connect() as connection:
context.configure(
connection=connection, target_metadata=target_metadata
)
with context.begin_transaction():
context.run_migrations()
if context.is_offline_mode():
run_migrations_offline()
else:
run_migrations_online()

26
alembic/script.py.mako Normal file
View File

@ -0,0 +1,26 @@
"""${message}
Revision ID: ${up_revision}
Revises: ${down_revision | comma,n}
Create Date: ${create_date}
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
${imports if imports else ""}
# revision identifiers, used by Alembic.
revision: str = ${repr(up_revision)}
down_revision: Union[str, None] = ${repr(down_revision)}
branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}
def upgrade() -> None:
${upgrades if upgrades else "pass"}
def downgrade() -> None:
${downgrades if downgrades else "pass"}

View File

@ -0,0 +1,104 @@
"""Initial migration
Revision ID: 001
Revises:
Create Date: 2024-01-01 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = '001'
down_revision: Union[str, None] = None
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Create users table
op.create_table('users',
sa.Column('id', sa.Integer(), nullable=False),
sa.Column('email', sa.String(), nullable=False),
sa.Column('password_hash', sa.String(), nullable=False),
sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('(CURRENT_TIMESTAMP)'), nullable=True),
sa.PrimaryKeyConstraint('id')
)
op.create_index(op.f('ix_users_email'), 'users', ['email'], unique=True)
op.create_index(op.f('ix_users_id'), 'users', ['id'], unique=False)
# Create videos table
op.create_table('videos',
sa.Column('id', sa.Integer(), nullable=False),
sa.Column('user_id', sa.Integer(), nullable=False),
sa.Column('original_s3_url', sa.String(), nullable=False),
sa.Column('language_from', sa.String(), nullable=False),
sa.Column('language_to', sa.String(), nullable=False),
sa.Column('status', sa.String(), nullable=True),
sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('(CURRENT_TIMESTAMP)'), nullable=True),
sa.ForeignKeyConstraint(['user_id'], ['users.id'], ),
sa.PrimaryKeyConstraint('id')
)
op.create_index(op.f('ix_videos_id'), 'videos', ['id'], unique=False)
# Create transcriptions table
op.create_table('transcriptions',
sa.Column('id', sa.Integer(), nullable=False),
sa.Column('video_id', sa.Integer(), nullable=False),
sa.Column('text', sa.Text(), nullable=False),
sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('(CURRENT_TIMESTAMP)'), nullable=True),
sa.ForeignKeyConstraint(['video_id'], ['videos.id'], ),
sa.PrimaryKeyConstraint('id')
)
op.create_index(op.f('ix_transcriptions_id'), 'transcriptions', ['id'], unique=False)
# Create translations table
op.create_table('translations',
sa.Column('id', sa.Integer(), nullable=False),
sa.Column('video_id', sa.Integer(), nullable=False),
sa.Column('text', sa.Text(), nullable=False),
sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('(CURRENT_TIMESTAMP)'), nullable=True),
sa.ForeignKeyConstraint(['video_id'], ['videos.id'], ),
sa.PrimaryKeyConstraint('id')
)
op.create_index(op.f('ix_translations_id'), 'translations', ['id'], unique=False)
# Create dubbed_audios table
op.create_table('dubbed_audios',
sa.Column('id', sa.Integer(), nullable=False),
sa.Column('video_id', sa.Integer(), nullable=False),
sa.Column('s3_url', sa.String(), nullable=False),
sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('(CURRENT_TIMESTAMP)'), nullable=True),
sa.ForeignKeyConstraint(['video_id'], ['videos.id'], ),
sa.PrimaryKeyConstraint('id')
)
op.create_index(op.f('ix_dubbed_audios_id'), 'dubbed_audios', ['id'], unique=False)
# Create dubbed_videos table
op.create_table('dubbed_videos',
sa.Column('id', sa.Integer(), nullable=False),
sa.Column('video_id', sa.Integer(), nullable=False),
sa.Column('s3_url', sa.String(), nullable=False),
sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('(CURRENT_TIMESTAMP)'), nullable=True),
sa.ForeignKeyConstraint(['video_id'], ['videos.id'], ),
sa.PrimaryKeyConstraint('id')
)
op.create_index(op.f('ix_dubbed_videos_id'), 'dubbed_videos', ['id'], unique=False)
def downgrade() -> None:
op.drop_index(op.f('ix_dubbed_videos_id'), table_name='dubbed_videos')
op.drop_table('dubbed_videos')
op.drop_index(op.f('ix_dubbed_audios_id'), table_name='dubbed_audios')
op.drop_table('dubbed_audios')
op.drop_index(op.f('ix_translations_id'), table_name='translations')
op.drop_table('translations')
op.drop_index(op.f('ix_transcriptions_id'), table_name='transcriptions')
op.drop_table('transcriptions')
op.drop_index(op.f('ix_videos_id'), table_name='videos')
op.drop_table('videos')
op.drop_index(op.f('ix_users_id'), table_name='users')
op.drop_index(op.f('ix_users_email'), table_name='users')
op.drop_table('users')

0
app/__init__.py Normal file
View File

0
app/db/__init__.py Normal file
View File

3
app/db/base.py Normal file
View File

@ -0,0 +1,3 @@
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()

23
app/db/session.py Normal file
View File

@ -0,0 +1,23 @@
from pathlib import Path
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
DB_DIR = Path("/app/storage/db")
DB_DIR.mkdir(parents=True, exist_ok=True)
SQLALCHEMY_DATABASE_URL = f"sqlite:///{DB_DIR}/db.sqlite"
engine = create_engine(
SQLALCHEMY_DATABASE_URL,
connect_args={"check_same_thread": False}
)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
def get_db():
db = SessionLocal()
try:
yield db
finally:
db.close()

8
app/models/__init__.py Normal file
View File

@ -0,0 +1,8 @@
from .user import User
from .video import Video
from .transcription import Transcription
from .translation import Translation
from .dubbed_audio import DubbedAudio
from .dubbed_video import DubbedVideo
__all__ = ["User", "Video", "Transcription", "Translation", "DubbedAudio", "DubbedVideo"]

View File

@ -0,0 +1,12 @@
from sqlalchemy import Column, Integer, String, DateTime, ForeignKey
from sqlalchemy.sql import func
from app.db.base import Base
class DubbedAudio(Base):
__tablename__ = "dubbed_audios"
id = Column(Integer, primary_key=True, index=True)
video_id = Column(Integer, ForeignKey("videos.id"), nullable=False)
s3_url = Column(String, nullable=False)
created_at = Column(DateTime(timezone=True), server_default=func.now())

View File

@ -0,0 +1,12 @@
from sqlalchemy import Column, Integer, String, DateTime, ForeignKey
from sqlalchemy.sql import func
from app.db.base import Base
class DubbedVideo(Base):
__tablename__ = "dubbed_videos"
id = Column(Integer, primary_key=True, index=True)
video_id = Column(Integer, ForeignKey("videos.id"), nullable=False)
s3_url = Column(String, nullable=False)
created_at = Column(DateTime(timezone=True), server_default=func.now())

View File

@ -0,0 +1,12 @@
from sqlalchemy import Column, Integer, DateTime, ForeignKey, Text
from sqlalchemy.sql import func
from app.db.base import Base
class Transcription(Base):
__tablename__ = "transcriptions"
id = Column(Integer, primary_key=True, index=True)
video_id = Column(Integer, ForeignKey("videos.id"), nullable=False)
text = Column(Text, nullable=False)
created_at = Column(DateTime(timezone=True), server_default=func.now())

12
app/models/translation.py Normal file
View File

@ -0,0 +1,12 @@
from sqlalchemy import Column, Integer, DateTime, ForeignKey, Text
from sqlalchemy.sql import func
from app.db.base import Base
class Translation(Base):
__tablename__ = "translations"
id = Column(Integer, primary_key=True, index=True)
video_id = Column(Integer, ForeignKey("videos.id"), nullable=False)
text = Column(Text, nullable=False)
created_at = Column(DateTime(timezone=True), server_default=func.now())

12
app/models/user.py Normal file
View File

@ -0,0 +1,12 @@
from sqlalchemy import Column, Integer, String, DateTime
from sqlalchemy.sql import func
from app.db.base import Base
class User(Base):
__tablename__ = "users"
id = Column(Integer, primary_key=True, index=True)
email = Column(String, unique=True, index=True, nullable=False)
password_hash = Column(String, nullable=False)
created_at = Column(DateTime(timezone=True), server_default=func.now())

15
app/models/video.py Normal file
View File

@ -0,0 +1,15 @@
from sqlalchemy import Column, Integer, String, DateTime, ForeignKey
from sqlalchemy.sql import func
from app.db.base import Base
class Video(Base):
__tablename__ = "videos"
id = Column(Integer, primary_key=True, index=True)
user_id = Column(Integer, ForeignKey("users.id"), nullable=False)
original_s3_url = Column(String, nullable=False)
language_from = Column(String, nullable=False)
language_to = Column(String, nullable=False)
status = Column(String, default="uploaded")
created_at = Column(DateTime(timezone=True), server_default=func.now())

0
app/routes/__init__.py Normal file
View File

71
app/routes/auth.py Normal file
View File

@ -0,0 +1,71 @@
from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy.orm import Session
from pydantic import BaseModel, EmailStr
from app.db.session import get_db
from app.models.user import User
from app.utils.auth import get_password_hash, verify_password, create_access_token
router = APIRouter()
class UserCreate(BaseModel):
email: EmailStr
password: str
class UserLogin(BaseModel):
email: EmailStr
password: str
class Token(BaseModel):
access_token: str
token_type: str
class UserResponse(BaseModel):
id: int
email: str
created_at: str
class Config:
orm_mode = True
@router.post("/register", response_model=UserResponse)
async def register(user: UserCreate, db: Session = Depends(get_db)):
db_user = db.query(User).filter(User.email == user.email).first()
if db_user:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Email already registered"
)
hashed_password = get_password_hash(user.password)
db_user = User(
email=user.email,
password_hash=hashed_password
)
db.add(db_user)
db.commit()
db.refresh(db_user)
return UserResponse(
id=db_user.id,
email=db_user.email,
created_at=str(db_user.created_at)
)
@router.post("/login", response_model=Token)
async def login(user: UserLogin, db: Session = Depends(get_db)):
db_user = db.query(User).filter(User.email == user.email).first()
if not db_user or not verify_password(user.password, db_user.password_hash):
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Incorrect email or password",
headers={"WWW-Authenticate": "Bearer"},
)
access_token = create_access_token(data={"sub": db_user.email})
return {"access_token": access_token, "token_type": "bearer"}

140
app/routes/transcription.py Normal file
View File

@ -0,0 +1,140 @@
from fastapi import APIRouter, Depends, HTTPException, status, BackgroundTasks
from sqlalchemy.orm import Session
from pydantic import BaseModel
from app.db.session import get_db
from app.models.user import User
from app.models.video import Video
from app.models.transcription import Transcription
from app.utils.auth import get_current_user
from app.services.transcription_service import transcribe_video_audio
router = APIRouter()
class TranscriptionResponse(BaseModel):
id: int
video_id: int
text: str
created_at: str
class Config:
orm_mode = True
class TranscriptionStartResponse(BaseModel):
message: str
video_id: int
async def background_transcribe(video_id: int, video_s3_url: str, db: Session):
try:
# Update video status
video = db.query(Video).filter(Video.id == video_id).first()
if video:
video.status = "transcribing"
db.commit()
# Transcribe the video
transcript_text = await transcribe_video_audio(video_s3_url)
if transcript_text:
# Save transcription to database
transcription = Transcription(
video_id=video_id,
text=transcript_text
)
db.add(transcription)
# Update video status
if video:
video.status = "transcribed"
db.commit()
else:
# Update video status to error
if video:
video.status = "transcription_failed"
db.commit()
except Exception:
# Update video status to error
video = db.query(Video).filter(Video.id == video_id).first()
if video:
video.status = "transcription_failed"
db.commit()
@router.post("/{video_id}", response_model=TranscriptionStartResponse)
async def start_transcription(
video_id: int,
background_tasks: BackgroundTasks,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Check if video exists and belongs to user
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
# Check if transcription already exists
existing_transcription = db.query(Transcription).filter(
Transcription.video_id == video_id
).first()
if existing_transcription:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Transcription already exists for this video"
)
# Start background transcription
background_tasks.add_task(background_transcribe, video_id, video.original_s3_url, db)
return TranscriptionStartResponse(
message="Transcription started in background",
video_id=video_id
)
@router.get("/{video_id}", response_model=TranscriptionResponse)
async def get_transcription(
video_id: int,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Check if video exists and belongs to user
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
# Get transcription
transcription = db.query(Transcription).filter(
Transcription.video_id == video_id
).first()
if not transcription:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Transcription not found"
)
return TranscriptionResponse(
id=transcription.id,
video_id=transcription.video_id,
text=transcription.text,
created_at=str(transcription.created_at)
)

159
app/routes/translation.py Normal file
View File

@ -0,0 +1,159 @@
from fastapi import APIRouter, Depends, HTTPException, status, BackgroundTasks
from sqlalchemy.orm import Session
from pydantic import BaseModel
from app.db.session import get_db
from app.models.user import User
from app.models.video import Video
from app.models.transcription import Transcription
from app.models.translation import Translation
from app.utils.auth import get_current_user
from app.services.translation_service import translate_text
router = APIRouter()
class TranslationResponse(BaseModel):
id: int
video_id: int
text: str
created_at: str
class Config:
orm_mode = True
class TranslationStartResponse(BaseModel):
message: str
video_id: int
async def background_translate(video_id: int, transcript_text: str, target_language: str, source_language: str, db: Session):
try:
# Update video status
video = db.query(Video).filter(Video.id == video_id).first()
if video:
video.status = "translating"
db.commit()
# Translate the text
translated_text = await translate_text(transcript_text, target_language, source_language)
if translated_text:
# Save translation to database
translation = Translation(
video_id=video_id,
text=translated_text
)
db.add(translation)
# Update video status
if video:
video.status = "translated"
db.commit()
else:
# Update video status to error
if video:
video.status = "translation_failed"
db.commit()
except Exception:
# Update video status to error
video = db.query(Video).filter(Video.id == video_id).first()
if video:
video.status = "translation_failed"
db.commit()
@router.post("/{video_id}", response_model=TranslationStartResponse)
async def start_translation(
video_id: int,
background_tasks: BackgroundTasks,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Check if video exists and belongs to user
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
# Check if transcription exists
transcription = db.query(Transcription).filter(
Transcription.video_id == video_id
).first()
if not transcription:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Transcription not found. Please transcribe the video first."
)
# Check if translation already exists
existing_translation = db.query(Translation).filter(
Translation.video_id == video_id
).first()
if existing_translation:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Translation already exists for this video"
)
# Start background translation
background_tasks.add_task(
background_translate,
video_id,
transcription.text,
video.language_to,
video.language_from,
db
)
return TranslationStartResponse(
message="Translation started in background",
video_id=video_id
)
@router.get("/{video_id}", response_model=TranslationResponse)
async def get_translation(
video_id: int,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Check if video exists and belongs to user
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
# Get translation
translation = db.query(Translation).filter(
Translation.video_id == video_id
).first()
if not translation:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Translation not found"
)
return TranslationResponse(
id=translation.id,
video_id=translation.video_id,
text=translation.text,
created_at=str(translation.created_at)
)

View File

@ -0,0 +1,206 @@
from fastapi import APIRouter, Depends, HTTPException, status, BackgroundTasks
from sqlalchemy.orm import Session
from pydantic import BaseModel
from app.db.session import get_db
from app.models.user import User
from app.models.video import Video
from app.models.dubbed_audio import DubbedAudio
from app.models.dubbed_video import DubbedVideo
from app.utils.auth import get_current_user
from app.services.video_processing_service import process_video_with_dubbed_audio
router = APIRouter()
class DubbedVideoResponse(BaseModel):
id: int
video_id: int
s3_url: str
created_at: str
class Config:
orm_mode = True
class VideoProcessingStartResponse(BaseModel):
message: str
video_id: int
class VideoResultsResponse(BaseModel):
video_id: int
original_video_url: str
transcript: str
translated_text: str
dubbed_audio_url: str
final_dubbed_video_url: str
processing_status: str
async def background_process_video(video_id: int, video_s3_url: str, dubbed_audio_s3_url: str, db: Session):
try:
# Update video status
video = db.query(Video).filter(Video.id == video_id).first()
if video:
video.status = "processing_video"
db.commit()
# Process video with dubbed audio
processed_video_url = await process_video_with_dubbed_audio(video_s3_url, dubbed_audio_s3_url)
if processed_video_url:
# Save processed video to database
dubbed_video = DubbedVideo(
video_id=video_id,
s3_url=processed_video_url
)
db.add(dubbed_video)
# Update video status
if video:
video.status = "completed"
db.commit()
else:
# Update video status to error
if video:
video.status = "video_processing_failed"
db.commit()
except Exception:
# Update video status to error
video = db.query(Video).filter(Video.id == video_id).first()
if video:
video.status = "video_processing_failed"
db.commit()
@router.post("/{video_id}", response_model=VideoProcessingStartResponse)
async def start_video_processing(
video_id: int,
background_tasks: BackgroundTasks,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Check if video exists and belongs to user
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
# Check if dubbed audio exists
dubbed_audio = db.query(DubbedAudio).filter(
DubbedAudio.video_id == video_id
).first()
if not dubbed_audio:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Dubbed audio not found. Please generate dubbed audio first."
)
# Check if dubbed video already exists
existing_dubbed_video = db.query(DubbedVideo).filter(
DubbedVideo.video_id == video_id
).first()
if existing_dubbed_video:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Dubbed video already exists for this video"
)
# Start background video processing
background_tasks.add_task(
background_process_video,
video_id,
video.original_s3_url,
dubbed_audio.s3_url,
db
)
return VideoProcessingStartResponse(
message="Video processing started in background",
video_id=video_id
)
@router.get("/{video_id}", response_model=DubbedVideoResponse)
async def get_processed_video(
video_id: int,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Check if video exists and belongs to user
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
# Get processed video
dubbed_video = db.query(DubbedVideo).filter(
DubbedVideo.video_id == video_id
).first()
if not dubbed_video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Processed video not found"
)
return DubbedVideoResponse(
id=dubbed_video.id,
video_id=dubbed_video.video_id,
s3_url=dubbed_video.s3_url,
created_at=str(dubbed_video.created_at)
)
@router.get("/results/{video_id}", response_model=VideoResultsResponse)
async def get_video_results(
video_id: int,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Check if video exists and belongs to user
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
# Get all related data
from app.models.transcription import Transcription
from app.models.translation import Translation
transcription = db.query(Transcription).filter(Transcription.video_id == video_id).first()
translation = db.query(Translation).filter(Translation.video_id == video_id).first()
dubbed_audio = db.query(DubbedAudio).filter(DubbedAudio.video_id == video_id).first()
dubbed_video = db.query(DubbedVideo).filter(DubbedVideo.video_id == video_id).first()
return VideoResultsResponse(
video_id=video.id,
original_video_url=video.original_s3_url,
transcript=transcription.text if transcription else "",
translated_text=translation.text if translation else "",
dubbed_audio_url=dubbed_audio.s3_url if dubbed_audio else "",
final_dubbed_video_url=dubbed_video.s3_url if dubbed_video else "",
processing_status=video.status
)

137
app/routes/videos.py Normal file
View File

@ -0,0 +1,137 @@
import uuid
from fastapi import APIRouter, Depends, HTTPException, status, UploadFile, File, Form, BackgroundTasks
from sqlalchemy.orm import Session
from pydantic import BaseModel
from typing import List
from app.db.session import get_db
from app.models.user import User
from app.models.video import Video
from app.utils.auth import get_current_user
from app.services.s3_service import upload_file_to_s3
router = APIRouter()
ALLOWED_VIDEO_TYPES = ["video/mp4", "video/quicktime"]
MAX_FILE_SIZE = 200 * 1024 * 1024 # 200MB
class VideoResponse(BaseModel):
id: int
user_id: int
original_s3_url: str
language_from: str
language_to: str
status: str
created_at: str
class Config:
orm_mode = True
class VideoUploadResponse(BaseModel):
video_id: int
message: str
s3_url: str
@router.post("/upload", response_model=VideoUploadResponse)
async def upload_video(
background_tasks: BackgroundTasks,
video: UploadFile = File(...),
language_from: str = Form(...),
language_to: str = Form(...),
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Validate file type
if video.content_type not in ALLOWED_VIDEO_TYPES:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Invalid file type. Only MP4 and MOV files are allowed."
)
# Read file content
file_content = await video.read()
# Validate file size
if len(file_content) > MAX_FILE_SIZE:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="File too large. Maximum size is 200MB."
)
# Generate unique filename
file_extension = video.filename.split('.')[-1] if video.filename else 'mp4'
unique_filename = f"videos/{uuid.uuid4()}.{file_extension}"
# Upload to S3
s3_url = await upload_file_to_s3(file_content, unique_filename, video.content_type)
if not s3_url:
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail="Failed to upload video to S3"
)
# Save video metadata to database
db_video = Video(
user_id=current_user.id,
original_s3_url=s3_url,
language_from=language_from,
language_to=language_to,
status="uploaded"
)
db.add(db_video)
db.commit()
db.refresh(db_video)
return VideoUploadResponse(
video_id=db_video.id,
message="Video uploaded successfully",
s3_url=s3_url
)
@router.get("/", response_model=List[VideoResponse])
async def get_user_videos(
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
videos = db.query(Video).filter(Video.user_id == current_user.id).all()
return [VideoResponse(
id=video.id,
user_id=video.user_id,
original_s3_url=video.original_s3_url,
language_from=video.language_from,
language_to=video.language_to,
status=video.status,
created_at=str(video.created_at)
) for video in videos]
@router.get("/{video_id}", response_model=VideoResponse)
async def get_video(
video_id: int,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
return VideoResponse(
id=video.id,
user_id=video.user_id,
original_s3_url=video.original_s3_url,
language_from=video.language_from,
language_to=video.language_to,
status=video.status,
created_at=str(video.created_at)
)

158
app/routes/voice_cloning.py Normal file
View File

@ -0,0 +1,158 @@
from fastapi import APIRouter, Depends, HTTPException, status, BackgroundTasks
from sqlalchemy.orm import Session
from pydantic import BaseModel
from app.db.session import get_db
from app.models.user import User
from app.models.video import Video
from app.models.translation import Translation
from app.models.dubbed_audio import DubbedAudio
from app.utils.auth import get_current_user
from app.services.voice_cloning_service import clone_voice_and_generate_audio
router = APIRouter()
class DubbedAudioResponse(BaseModel):
id: int
video_id: int
s3_url: str
created_at: str
class Config:
orm_mode = True
class VoiceCloningStartResponse(BaseModel):
message: str
video_id: int
async def background_voice_clone(video_id: int, video_s3_url: str, translated_text: str, db: Session):
try:
# Update video status
video = db.query(Video).filter(Video.id == video_id).first()
if video:
video.status = "voice_cloning"
db.commit()
# Generate dubbed audio with voice cloning
dubbed_audio_url = await clone_voice_and_generate_audio(video_s3_url, translated_text)
if dubbed_audio_url:
# Save dubbed audio to database
dubbed_audio = DubbedAudio(
video_id=video_id,
s3_url=dubbed_audio_url
)
db.add(dubbed_audio)
# Update video status
if video:
video.status = "voice_cloned"
db.commit()
else:
# Update video status to error
if video:
video.status = "voice_cloning_failed"
db.commit()
except Exception:
# Update video status to error
video = db.query(Video).filter(Video.id == video_id).first()
if video:
video.status = "voice_cloning_failed"
db.commit()
@router.post("/clone/{video_id}", response_model=VoiceCloningStartResponse)
async def start_voice_cloning(
video_id: int,
background_tasks: BackgroundTasks,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Check if video exists and belongs to user
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
# Check if translation exists
translation = db.query(Translation).filter(
Translation.video_id == video_id
).first()
if not translation:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Translation not found. Please translate the video first."
)
# Check if dubbed audio already exists
existing_dubbed_audio = db.query(DubbedAudio).filter(
DubbedAudio.video_id == video_id
).first()
if existing_dubbed_audio:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Dubbed audio already exists for this video"
)
# Start background voice cloning
background_tasks.add_task(
background_voice_clone,
video_id,
video.original_s3_url,
translation.text,
db
)
return VoiceCloningStartResponse(
message="Voice cloning started in background",
video_id=video_id
)
@router.get("/{video_id}", response_model=DubbedAudioResponse)
async def get_dubbed_audio(
video_id: int,
current_user: User = Depends(get_current_user),
db: Session = Depends(get_db)
):
# Check if video exists and belongs to user
video = db.query(Video).filter(
Video.id == video_id,
Video.user_id == current_user.id
).first()
if not video:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Video not found"
)
# Get dubbed audio
dubbed_audio = db.query(DubbedAudio).filter(
DubbedAudio.video_id == video_id
).first()
if not dubbed_audio:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Dubbed audio not found"
)
return DubbedAudioResponse(
id=dubbed_audio.id,
video_id=dubbed_audio.video_id,
s3_url=dubbed_audio.s3_url,
created_at=str(dubbed_audio.created_at)
)

0
app/services/__init__.py Normal file
View File

View File

@ -0,0 +1,61 @@
import os
import boto3
from botocore.exceptions import ClientError
from typing import Optional
import logging
logger = logging.getLogger(__name__)
AWS_ACCESS_KEY_ID = os.getenv("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.getenv("AWS_SECRET_ACCESS_KEY")
AWS_REGION = os.getenv("AWS_REGION", "us-east-1")
S3_BUCKET_NAME = os.getenv("S3_BUCKET_NAME")
if not all([AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, S3_BUCKET_NAME]):
logger.warning("AWS credentials or S3 bucket name not configured")
s3_client = boto3.client(
's3',
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
region_name=AWS_REGION
) if AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY else None
async def upload_file_to_s3(file_content: bytes, file_name: str, content_type: str) -> Optional[str]:
if not s3_client or not S3_BUCKET_NAME:
logger.error("S3 client not configured properly")
return None
try:
s3_client.put_object(
Bucket=S3_BUCKET_NAME,
Key=file_name,
Body=file_content,
ContentType=content_type
)
s3_url = f"https://{S3_BUCKET_NAME}.s3.{AWS_REGION}.amazonaws.com/{file_name}"
return s3_url
except ClientError as e:
logger.error(f"Error uploading file to S3: {e}")
return None
async def download_file_from_s3(file_name: str) -> Optional[bytes]:
if not s3_client or not S3_BUCKET_NAME:
logger.error("S3 client not configured properly")
return None
try:
response = s3_client.get_object(Bucket=S3_BUCKET_NAME, Key=file_name)
return response['Body'].read()
except ClientError as e:
logger.error(f"Error downloading file from S3: {e}")
return None
def get_s3_file_url(file_name: str) -> str:
return f"https://{S3_BUCKET_NAME}.s3.{AWS_REGION}.amazonaws.com/{file_name}"

View File

@ -0,0 +1,60 @@
import os
import openai
from typing import Optional
import logging
import tempfile
from app.services.s3_service import download_file_from_s3
logger = logging.getLogger(__name__)
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
if not OPENAI_API_KEY:
logger.warning("OpenAI API key not configured")
openai.api_key = OPENAI_API_KEY
async def transcribe_video_audio(video_s3_url: str) -> Optional[str]:
if not OPENAI_API_KEY:
logger.error("OpenAI API key not configured")
return None
try:
# Extract filename from S3 URL
file_name = video_s3_url.split('/')[-1]
# Download video from S3
video_content = await download_file_from_s3(file_name)
if not video_content:
logger.error("Failed to download video from S3")
return None
# Create temporary file for the video
with tempfile.NamedTemporaryFile(suffix='.mp4', delete=False) as temp_file:
temp_file.write(video_content)
temp_file_path = temp_file.name
try:
# Transcribe using OpenAI Whisper
with open(temp_file_path, 'rb') as audio_file:
transcript = openai.Audio.transcribe(
model="whisper-1",
file=audio_file,
response_format="text"
)
# Clean up temporary file
os.unlink(temp_file_path)
return transcript
except Exception as e:
# Clean up temporary file on error
if os.path.exists(temp_file_path):
os.unlink(temp_file_path)
raise e
except Exception as e:
logger.error(f"Error transcribing video: {e}")
return None

View File

@ -0,0 +1,69 @@
import os
import openai
from typing import Optional
import logging
logger = logging.getLogger(__name__)
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
if not OPENAI_API_KEY:
logger.warning("OpenAI API key not configured")
openai.api_key = OPENAI_API_KEY
async def translate_text(text: str, target_language: str, source_language: str = "auto") -> Optional[str]:
if not OPENAI_API_KEY:
logger.error("OpenAI API key not configured")
return None
try:
# Language mapping for better prompts
language_names = {
"en": "English",
"es": "Spanish",
"fr": "French",
"de": "German",
"it": "Italian",
"pt": "Portuguese",
"ru": "Russian",
"ja": "Japanese",
"ko": "Korean",
"zh": "Chinese",
"ar": "Arabic",
"hi": "Hindi"
}
target_lang_name = language_names.get(target_language, target_language)
source_lang_name = language_names.get(source_language, source_language)
# Create translation prompt
if source_language == "auto":
prompt = f"Translate the following text to {target_lang_name}. Maintain the original tone and meaning:\n\n{text}"
else:
prompt = f"Translate the following text from {source_lang_name} to {target_lang_name}. Maintain the original tone and meaning:\n\n{text}"
# Use GPT-4 for translation
response = openai.ChatCompletion.create(
model="gpt-4",
messages=[
{
"role": "system",
"content": "You are a professional translator. Provide accurate translations while maintaining the original tone and context. Return only the translated text without any additional commentary."
},
{
"role": "user",
"content": prompt
}
],
max_tokens=2000,
temperature=0.3
)
translated_text = response.choices[0].message.content.strip()
return translated_text
except Exception as e:
logger.error(f"Error translating text: {e}")
return None

View File

@ -0,0 +1,116 @@
import os
import ffmpeg
from typing import Optional
import logging
import tempfile
import uuid
from app.services.s3_service import upload_file_to_s3, download_file_from_s3
logger = logging.getLogger(__name__)
async def process_video_with_dubbed_audio(video_s3_url: str, dubbed_audio_s3_url: str) -> Optional[str]:
try:
# Extract filenames from S3 URLs
video_filename = video_s3_url.split('/')[-1]
audio_filename = dubbed_audio_s3_url.split('/')[-1]
# Download video and audio from S3
video_content = await download_file_from_s3(video_filename)
audio_content = await download_file_from_s3(audio_filename)
if not video_content or not audio_content:
logger.error("Failed to download video or audio from S3")
return None
# Create temporary files
with tempfile.NamedTemporaryFile(suffix='.mp4', delete=False) as video_temp:
video_temp.write(video_content)
video_temp_path = video_temp.name
with tempfile.NamedTemporaryFile(suffix='.mp3', delete=False) as audio_temp:
audio_temp.write(audio_content)
audio_temp_path = audio_temp.name
# Create output temporary file
output_temp_path = tempfile.mktemp(suffix='.mp4')
try:
# Use ffmpeg to replace audio in video
input_video = ffmpeg.input(video_temp_path)
input_audio = ffmpeg.input(audio_temp_path)
# Combine video and new audio
out = ffmpeg.output(
input_video['v'], # video stream
input_audio['a'], # audio stream
output_temp_path,
vcodec='copy', # copy video codec (faster)
acodec='aac', # encode audio as AAC
strict='experimental'
)
# Run ffmpeg command
ffmpeg.run(out, overwrite_output=True, quiet=True)
# Read the processed video
with open(output_temp_path, 'rb') as f:
processed_video_content = f.read()
# Generate unique filename for processed video
processed_video_filename = f"processed_videos/{uuid.uuid4()}.mp4"
# Upload processed video to S3
s3_url = await upload_file_to_s3(
processed_video_content,
processed_video_filename,
"video/mp4"
)
# Clean up temporary files
os.unlink(video_temp_path)
os.unlink(audio_temp_path)
os.unlink(output_temp_path)
return s3_url
except Exception as e:
# Clean up temporary files on error
for temp_path in [video_temp_path, audio_temp_path, output_temp_path]:
if os.path.exists(temp_path):
os.unlink(temp_path)
raise e
except Exception as e:
logger.error(f"Error processing video: {e}")
return None
def extract_audio_from_video(video_path: str, audio_output_path: str) -> bool:
try:
(
ffmpeg
.input(video_path)
.output(audio_output_path, acodec='mp3')
.overwrite_output()
.run(quiet=True)
)
return True
except Exception as e:
logger.error(f"Error extracting audio: {e}")
return False
def mute_video_audio(video_path: str, output_path: str) -> bool:
try:
(
ffmpeg
.input(video_path)
.output(output_path, vcodec='copy', an=None) # an=None removes audio
.overwrite_output()
.run(quiet=True)
)
return True
except Exception as e:
logger.error(f"Error muting video: {e}")
return False

View File

@ -0,0 +1,139 @@
import os
import requests
from typing import Optional
import logging
import tempfile
import uuid
from app.services.s3_service import upload_file_to_s3, download_file_from_s3
logger = logging.getLogger(__name__)
ELEVENLABS_API_KEY = os.getenv("ELEVENLABS_API_KEY")
ELEVENLABS_BASE_URL = "https://api.elevenlabs.io/v1"
if not ELEVENLABS_API_KEY:
logger.warning("ElevenLabs API key not configured")
async def clone_voice_and_generate_audio(video_s3_url: str, text: str, voice_id: Optional[str] = None) -> Optional[str]:
if not ELEVENLABS_API_KEY:
logger.error("ElevenLabs API key not configured")
return None
try:
headers = {
"Accept": "audio/mpeg",
"Content-Type": "application/json",
"xi-api-key": ELEVENLABS_API_KEY
}
# If no voice_id provided, use default voice or create one from video
if not voice_id:
voice_id = await create_voice_from_video(video_s3_url)
if not voice_id:
# Fallback to a default voice if voice cloning fails
voice_id = "21m00Tcm4TlvDq8ikWAM" # Default ElevenLabs voice
logger.warning("Using default voice as voice cloning failed")
# Generate speech with the voice
data = {
"text": text,
"model_id": "eleven_monolingual_v1",
"voice_settings": {
"stability": 0.5,
"similarity_boost": 0.5
}
}
response = requests.post(
f"{ELEVENLABS_BASE_URL}/text-to-speech/{voice_id}",
json=data,
headers=headers
)
if response.status_code == 200:
# Generate unique filename for the audio
audio_filename = f"dubbed_audio/{uuid.uuid4()}.mp3"
# Upload audio to S3
s3_url = await upload_file_to_s3(
response.content,
audio_filename,
"audio/mpeg"
)
return s3_url
else:
logger.error(f"ElevenLabs API error: {response.status_code} - {response.text}")
return None
except Exception as e:
logger.error(f"Error generating dubbed audio: {e}")
return None
async def create_voice_from_video(video_s3_url: str) -> Optional[str]:
if not ELEVENLABS_API_KEY:
logger.error("ElevenLabs API key not configured")
return None
try:
# Extract filename from S3 URL
file_name = video_s3_url.split('/')[-1]
# Download video from S3
video_content = await download_file_from_s3(file_name)
if not video_content:
logger.error("Failed to download video from S3 for voice cloning")
return None
# Create temporary file for the video
with tempfile.NamedTemporaryFile(suffix='.mp4', delete=False) as temp_file:
temp_file.write(video_content)
temp_file_path = temp_file.name
try:
# Headers for voice cloning
headers = {
"xi-api-key": ELEVENLABS_API_KEY
}
# Voice cloning payload
data = {
"name": f"cloned_voice_{uuid.uuid4()}",
"description": "Voice cloned from uploaded video"
}
# Upload file for voice cloning
with open(temp_file_path, 'rb') as f:
files = {
"files": f
}
response = requests.post(
f"{ELEVENLABS_BASE_URL}/voices/add",
headers=headers,
data=data,
files=files
)
# Clean up temporary file
os.unlink(temp_file_path)
if response.status_code == 200:
result = response.json()
return result.get("voice_id")
else:
logger.error(f"Voice cloning failed: {response.status_code} - {response.text}")
return None
except Exception as e:
# Clean up temporary file on error
if os.path.exists(temp_file_path):
os.unlink(temp_file_path)
raise e
except Exception as e:
logger.error(f"Error cloning voice: {e}")
return None

0
app/utils/__init__.py Normal file
View File

72
app/utils/auth.py Normal file
View File

@ -0,0 +1,72 @@
import os
from datetime import datetime, timedelta
from typing import Optional
from jose import JWTError, jwt
from passlib.context import CryptContext
from fastapi import HTTPException, status, Depends
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from sqlalchemy.orm import Session
from app.db.session import get_db
from app.models.user import User
SECRET_KEY = os.getenv("SECRET_KEY", "your-secret-key-change-this-in-production")
ALGORITHM = "HS256"
ACCESS_TOKEN_EXPIRE_MINUTES = 30
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
security = HTTPBearer()
def verify_password(plain_password: str, hashed_password: str) -> bool:
return pwd_context.verify(plain_password, hashed_password)
def get_password_hash(password: str) -> str:
return pwd_context.hash(password)
def create_access_token(data: dict, expires_delta: Optional[timedelta] = None):
to_encode = data.copy()
if expires_delta:
expire = datetime.utcnow() + expires_delta
else:
expire = datetime.utcnow() + timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
to_encode.update({"exp": expire})
encoded_jwt = jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)
return encoded_jwt
def verify_token(token: str) -> dict:
try:
payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
return payload
except JWTError:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Could not validate credentials",
headers={"WWW-Authenticate": "Bearer"},
)
async def get_current_user(
credentials: HTTPAuthorizationCredentials = Depends(security),
db: Session = Depends(get_db)
) -> User:
token = credentials.credentials
payload = verify_token(token)
email: str = payload.get("sub")
if email is None:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Could not validate credentials",
headers={"WWW-Authenticate": "Bearer"},
)
user = db.query(User).filter(User.email == email).first()
if user is None:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Could not validate credentials",
headers={"WWW-Authenticate": "Bearer"},
)
return user

64
main.py Normal file
View File

@ -0,0 +1,64 @@
from pathlib import Path
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from contextlib import asynccontextmanager
from app.db.session import engine
from app.db.base import Base
from app.routes import auth, videos, transcription, translation, voice_cloning, video_processing
@asynccontextmanager
async def lifespan(app: FastAPI):
# Create database tables
storage_dir = Path("/app/storage/db")
storage_dir.mkdir(parents=True, exist_ok=True)
Base.metadata.create_all(bind=engine)
yield
app = FastAPI(
title="AI Video Dubbing API",
description="Backend API for AI-powered video dubbing with voice cloning and translation",
version="1.0.0",
lifespan=lifespan
)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Include routers
app.include_router(auth.router, prefix="/auth", tags=["Authentication"])
app.include_router(videos.router, prefix="/videos", tags=["Videos"])
app.include_router(transcription.router, prefix="/transcription", tags=["Transcription"])
app.include_router(translation.router, prefix="/translation", tags=["Translation"])
app.include_router(voice_cloning.router, prefix="/voice", tags=["Voice Cloning"])
app.include_router(video_processing.router, prefix="/process", tags=["Video Processing"])
@app.get("/")
async def root():
return {
"title": "AI Video Dubbing API",
"documentation": "/docs",
"health": "/health"
}
@app.get("/health")
async def health_check():
return {
"status": "healthy",
"service": "AI Video Dubbing API",
"database": "connected"
}
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)

14
requirements.txt Normal file
View File

@ -0,0 +1,14 @@
fastapi==0.104.1
uvicorn[standard]==0.24.0
python-multipart==0.0.6
sqlalchemy==2.0.23
alembic==1.12.1
python-jose[cryptography]==3.3.0
passlib[bcrypt]==1.7.4
boto3==1.34.0
openai==1.3.7
python-decouple==3.8
ruff==0.1.6
requests==2.31.0
ffmpeg-python==0.2.0
python-dotenv==1.0.0