A web application that automatically transforms long-form videos into short, shareable clips using AI-powered moment detection.
This tool processes long-form videos (10-30 minutes) and automatically:
- Generates transcripts using AI (Whisper)
- Identifies key moments using AI (with estimated timestamps)
- Creates short clips from detected moments
- Exports clips in both horizontal (16:9) and vertical (9:16) formats
- Framework: Next.js 16 (App Router)
- Frontend: React 19 + TypeScript
- Backend: Next.js API Routes
- Database: PostgreSQL (using Neon for easy setup)
- Video Processing: ffmpeg
- AI: Groq (Whisper for transcription), OpenRouter (GPT-OSS-20B for moment detection)
```
Video-Atomization-Tool-Poseidon/
├── nextjs-app/ # Next.js application
│ ├── app/ # Next.js App Router
│ │ ├── api/ # API routes
│ │ ├── page.tsx # Home page
│ │ ├── upload/ # Upload page
│ │ └── videos/[id]/ # Video details page
│ ├── lib/ # Shared libraries
│ │ ├── db/ # Database config & schema
│ │ ├── services/ # Business logic (transcript, moments, clips)
│ │ └── utils/ # Utility functions
│ ├── uploads/ # Uploaded videos
│ └── clips/ # Generated clips
└── README.md
```
- Node.js (v18+)
- PostgreSQL database (or use Neon for cloud setup)
- ffmpeg installed on your system
- Groq API key (free) - for transcription
- OpenRouter API key (free) - for moment detection
- Optional: OpenAI API key (if you prefer paid models)
- Navigate to the Next.js app directory:

  ```bash
  cd nextjs-app
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Create a `.env.local` file in the `nextjs-app` directory:

  ```bash
  DATABASE_URL=postgresql://user:password@localhost:5432/video_atomization
  UPLOAD_DIR=./uploads
  CLIPS_DIR=./clips

  # AI Services (Free options)
  USE_GROQ=true
  GROQ_API_KEY=your_groq_api_key_here
  USE_OPENROUTER=true
  OPENROUTER_API_KEY=your_openrouter_api_key_here
  OPENROUTER_MODEL=openai/gpt-oss-20b:free
  APP_URL=http://localhost:4201

  # Optional: OpenAI (if you prefer paid models)
  OPENAI_API_KEY=your_openai_api_key_here
  ```

- Initialize the database:

  ```bash
  npm run db:init
  ```

- Start the development server:

  ```bash
  npm run dev
  ```

The application will be available at http://localhost:4201.
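The `USE_GROQ` / `USE_OPENROUTER` flags above arrive as plain strings in `process.env`, so they need an explicit comparison when read. A minimal sketch of how such a config might be loaded (the variable names and defaults come from the `.env.local` example above; the helper itself is illustrative, not the app's actual code):

```typescript
// Hypothetical config loader. Env vars are always strings, so boolean
// flags must be compared against the literal "true".
function loadAiConfig(env: Record<string, string | undefined>) {
  return {
    useGroq: env.USE_GROQ === "true",
    useOpenRouter: env.USE_OPENROUTER === "true",
    openRouterModel: env.OPENROUTER_MODEL ?? "openai/gpt-oss-20b:free",
    appUrl: env.APP_URL ?? "http://localhost:4201",
  };
}

console.log(loadAiConfig({ USE_GROQ: "true", USE_OPENROUTER: "false" }));
```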
The application follows a Next.js full-stack architecture:
```
┌─────────────┐          ┌─────────────┐
│  Next.js    │  ─────>  │ PostgreSQL  │
│  App Router │   SQL    │  (Neon DB)  │
│  (React UI) │          └─────────────┘
└─────────────┘
      │
      ├──> API Routes (Next.js)
      ├──> Groq API (Whisper transcription)
      ├──> OpenRouter API (Moment detection)
      └──> FFmpeg (Video Processing)
```
Framework: Next.js
- Full-stack framework with built-in API routes
- Server-side rendering and client-side interactivity
- TypeScript-first approach
- File-based routing with App Router
- Integrated development experience
Database: PostgreSQL (Neon)
- Relational data model (videos, transcripts, clips)
- Foreign key constraints with CASCADE deletes
- Indexes on frequently queried columns
- Cloud-hosted option with Neon for easy setup
Video Processing: FFmpeg
- Industry-standard video processing
- Supports multiple formats and codecs
- Efficient clip generation with scaling and padding
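As a sketch of the scaling-and-padding step, here is one way to assemble ffmpeg arguments for the two output formats. The `scale=...:force_original_aspect_ratio=decrease` followed by `pad` filter chain is a common letterbox/pillarbox pattern; the function name and exact flags are illustrative, not necessarily what this app uses:

```typescript
// Illustrative only: builds ffmpeg CLI arguments for one clip in either
// 16:9 (1920x1080) or 9:16 (1080x1920).
function buildClipArgs(
  input: string,
  output: string,
  startSec: number,
  durationSec: number,
  vertical: boolean
): string[] {
  const [w, h] = vertical ? [1080, 1920] : [1920, 1080];
  // Scale to fit inside the target box, then pad to exactly w x h,
  // centering the video in the padded frame.
  const filter =
    `scale=${w}:${h}:force_original_aspect_ratio=decrease,` +
    `pad=${w}:${h}:(ow-iw)/2:(oh-ih)/2`;
  return [
    "-ss", String(startSec),   // seek to the clip's start
    "-t", String(durationSec), // clip length
    "-i", input,
    "-vf", filter,
    "-c:a", "aac",             // re-encode audio for broad compatibility
    output,
  ];
}
```

Spawning `ffmpeg` with these arguments (e.g. via `child_process.spawn`) would produce one file per format.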
- Upload: User uploads video → stored in `uploads/` → metadata saved to DB
- Transcription: Video sent to Groq Whisper → transcript stored in DB
- Moment Detection: Transcript analyzed by OpenRouter (GPT-OSS-20B) → key moments identified → clips created in DB
- Clip Generation: FFmpeg processes video → generates 16:9 and 9:16 formats → files saved to `clips/`
- Download: User requests clip → API route serves file from `clips/` directory
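The stages above can be sketched as a simple lookup from a video's current stage to the next pipeline call. The stage names here are hypothetical (the README doesn't specify the exact status values stored in the DB); the route strings match the API section:

```typescript
// Hypothetical pipeline stages; the real status column values may differ.
type Stage = "uploaded" | "transcribed" | "moments_detected" | "clips_generated";

// Given the current stage, return the next API call in the pipeline
// (null once clips have been generated).
function nextStep(stage: Stage): string | null {
  const order: Record<Stage, string | null> = {
    uploaded: "POST /api/transcripts/[videoId]",
    transcribed: "POST /api/moments/[videoId]",
    moments_detected: "POST /api/clips/[videoId]",
    clips_generated: null,
  };
  return order[stage];
}
```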
- videos: Stores uploaded video metadata
- transcripts: One-to-one with videos, stores transcript text and status
- clips: One-to-many with videos, stores moment timestamps and file paths
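In TypeScript terms, the three tables might map to row shapes like the following. Only the relationships and fields mentioned above come from this README; every other field name is an assumption (the real definitions live in `lib/db/schema`):

```typescript
// Illustrative row shapes; actual column names and types may differ.
interface VideoRow {
  id: number;
  filename: string;
  durationSec: number | null; // null if ffprobe extraction failed
}

interface TranscriptRow {
  videoId: number; // one-to-one with videos
  text: string;
  status: "pending" | "completed" | "failed";
}

interface ClipRow {
  id: number;
  videoId: number; // one-to-many with videos
  title: string;
  startSec: number;
  endSec: number;
  horizontalPath: string | null; // set once the 16:9 file is rendered
  verticalPath: string | null;   // set once the 9:16 file is rendered
}

const sample: ClipRow = {
  id: 1, videoId: 1, title: "Intro hook", startSec: 12, endSec: 42,
  horizontalPath: null, verticalPath: null,
};
```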
All endpoints are Next.js API routes under /api:
- `POST /api/videos/upload` - Upload video
- `GET /api/videos` - List all videos
- `GET /api/videos/[id]` - Get video details
- `DELETE /api/videos/[id]` - Delete video
- `POST /api/transcripts/[videoId]` - Generate transcript
- `GET /api/transcripts/[videoId]` - Get transcript
- `POST /api/moments/[videoId]` - Detect key moments
- `GET /api/moments/[videoId]` - Get moments
- `POST /api/clips/[videoId]` - Generate all clips
- `GET /api/clips/[videoId]` - Get clips for video
- `GET /api/clips/download/[id]/[format]` - Download clip (horizontal/vertical)
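A small client-side helper can make these routes explicit. The paths mirror the endpoint list above; the helper itself is illustrative, not part of the app:

```typescript
// Illustrative route map for the endpoints listed above.
const api = {
  uploadVideo: () => "/api/videos/upload",
  listVideos: () => "/api/videos",
  video: (id: number) => `/api/videos/${id}`,
  transcript: (videoId: number) => `/api/transcripts/${videoId}`,
  moments: (videoId: number) => `/api/moments/${videoId}`,
  clips: (videoId: number) => `/api/clips/${videoId}`,
  downloadClip: (id: number, format: "horizontal" | "vertical") =>
    `/api/clips/download/${id}/${format}`,
};
```

For example, `fetch(api.transcript(3), { method: "POST" })` would kick off transcription for video 3.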
1. File Storage: Videos and clips stored locally (not in cloud storage)
   - Trade-off: Simpler setup, but not scalable for production
   - Assumption: Single-server deployment
2. Synchronous Processing: Transcript and clip generation are blocking operations
   - Trade-off: Simpler code, but user waits for completion
   - Future: Could add job queue (Bull, BullMQ) for async processing
3. No Authentication: No user auth implemented
   - Assumption: Single-user or internal tool usage
4. Error Handling: Basic error handling with try-catch
   - Trade-off: Works for MVP, but could use centralized error middleware
5. Video Duration: Extracted on upload using ffprobe
   - Duration is automatically extracted when video is uploaded
   - If extraction fails, duration is set to null (upload still succeeds)
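The "null on failure" behavior can be sketched as a small parser for ffprobe's JSON output (`ffprobe -v error -show_entries format=duration -of json <file>` prints the duration as a string inside a `format` object). The helper name is illustrative, not the app's actual code:

```typescript
// Parses the stdout of:
//   ffprobe -v error -show_entries format=duration -of json <file>
// Returns null on any failure so the upload can still succeed,
// matching the behavior described above.
function parseDuration(ffprobeJson: string): number | null {
  try {
    const d = parseFloat(JSON.parse(ffprobeJson)?.format?.duration);
    return Number.isFinite(d) ? d : null;
  } catch {
    return null; // malformed JSON, missing field, etc.
  }
}
```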
The application uses free AI services to keep costs at zero:
Groq API (for transcription):
- Service: Groq (free tier)
- Model: `whisper-large-v3-turbo`
- Input: Video file stream
- Output: Plain text transcript
- Usage: Called when user clicks "Generate Transcript"
- Why Groq: Free, fast, and uses the latest Whisper model
OpenRouter API (for moment detection):
- Service: OpenRouter (free tier)
- Model: `openai/gpt-oss-20b:free` (21B parameter model)
- Input: Transcript text with prompt
- Output: JSON array of moments with titles and timestamps
- Temperature: 0.7 (balanced creativity/consistency)
- Usage: Called when user clicks "Detect Moments"
- Why OpenRouter: Free access to powerful open-source models, OpenAI-compatible API
The code supports OpenAI as a fallback:
- Set `USE_GROQ=false` to use OpenAI Whisper
- Set `USE_OPENROUTER=false` to use OpenAI GPT-4
- Requires a valid `OPENAI_API_KEY` with credits
The moment detection prompt asks the LLM to:
- Identify 3-5 key moments
- Generate short titles (< 50 chars)
- Estimate timestamps based on transcript position
- Return structured JSON array
The system prompt sets context: "You are a video editor. Find the most interesting moments in transcripts that would work as short clips."
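Because LLMs sometimes wrap their JSON in markdown code fences, validating the response defensively helps. A hedged sketch of such a validator (the field names `title`/`start`/`end` are assumptions based on the description above, not the app's actual schema):

```typescript
interface Moment {
  title: string; // short title, < 50 chars
  start: number; // estimated start timestamp (seconds)
  end: number;   // estimated end timestamp (seconds)
}

// Hypothetical validator for the LLM's JSON output; the app's actual
// field names and checks may differ.
function parseMoments(raw: string): Moment[] {
  // Strip markdown code fences the model may wrap around the JSON.
  const cleaned = raw.replace(/`{3}(?:json)?/g, "").trim();
  const parsed = JSON.parse(cleaned);
  if (!Array.isArray(parsed)) throw new Error("expected a JSON array of moments");
  // Keep only well-formed moments with sensible timestamps.
  return parsed.filter(
    (m): m is Moment =>
      typeof m?.title === "string" &&
      m.title.length < 50 &&
      typeof m?.start === "number" &&
      typeof m?.end === "number" &&
      m.start < m.end
  );
}
```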
- API calls are made synchronously (user waits)
- Transcripts are cached in database (no re-generation)
- Moments are cached (can re-detect if needed)
- Error handling includes proper error messages for API failures
- Free tier limits: Groq has generous limits, OpenRouter has 50 free requests/day (1000/day with $10+ credits)
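Given those free-tier limits, transient failures (rate limits, timeouts) are likely under load. A generic retry-with-backoff wrapper is one common mitigation; this is an illustrative sketch, not what the app currently ships (its error handling is plain try-catch, per the design notes above):

```typescript
// Retry an async operation with exponential backoff between attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Wait base, 2x base, 4x base, ... before retrying.
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

Wrapping the Groq or OpenRouter calls in `withRetry` would smooth over occasional 429 responses without changing the synchronous flow.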
1. Video Upload
   - Upload a video file (max 500MB)
   - Verify file size validation works
   - Check video appears in dashboard
   - Verify duration is extracted
2. Transcript Generation
   - Generate transcript for uploaded video
   - Verify transcript appears in video details
   - Check transcript status updates correctly
3. Moment Detection
   - Detect moments from transcript
   - Verify 3-5 moments are detected
   - Check moment titles and timestamps
4. Clip Generation
   - Generate clips for detected moments
   - Verify both horizontal (16:9) and vertical (9:16) formats
   - Check clips appear in gallery
5. Download
   - Download horizontal clip
   - Download vertical clip
   - Verify files download correctly
6. Error Handling
   - Test with invalid file types
   - Test with files exceeding size limit
   - Verify error messages display correctly
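The error-handling checks above correspond to validation along these lines. The 500MB limit comes from this README; the allowed MIME types are an assumption, and the function name is illustrative:

```typescript
const MAX_UPLOAD_BYTES = 500 * 1024 * 1024; // 500MB limit from this README

// Hypothetical allow-list; the app's actual accepted types may differ.
const ALLOWED_TYPES = ["video/mp4", "video/quicktime", "video/webm"];

// Returns an error message for the UI, or null if the upload is acceptable.
function validateUpload(mimeType: string, sizeBytes: number): string | null {
  if (!ALLOWED_TYPES.includes(mimeType)) {
    return `Unsupported file type: ${mimeType}`;
  }
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    return "File exceeds the 500MB limit";
  }
  return null;
}
```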
✅ Completed Features:
- Video upload with validation (500MB limit)
- Video duration extraction using ffprobe
- Transcript generation with Groq Whisper (free)
- AI-powered moment detection with OpenRouter GPT-OSS-20B (free)
- Clip generation in 16:9 and 9:16 formats
- Download functionality for generated clips
- Dashboard UI with video listing
- Video details page with processing pipeline
- Environment configuration with free AI services
- Error handling and user feedback
- File size and type validation
✅ Code Quality:
- Proper error handling
- Input validation
- Clean code structure
- TypeScript throughout
- Add job queue for async processing (Bull/BullMQ)
- Implement user authentication
- Add cloud storage integration (S3, etc.)
- Add video preview/thumbnail generation
- Implement search and filtering
- Add batch processing capabilities
- Improve error logging and monitoring