Image to Splat

Create 3D Gaussian Splats from photographs using AI-powered reconstruction.

Overview

The Image-to-Splat pipeline uses:

COLMAP - Structure from Motion (SfM) for camera pose estimation
SHARP - Our optimized Gaussian Splatting training pipeline
GPU Acceleration - CUDA for fast processing (NVIDIA GPUs)

Requirements

Hardware

GPU: NVIDIA GPU with 6GB+ VRAM (recommended)
RAM: 16GB+ recommended
Storage: 10GB+ free space

Software

Python 3.10+ (3.11 recommended)
NVIDIA drivers (for GPU acceleration)

Using the Desktop App

Step 1: Prepare Images

Capture or collect images:

Quantity: 50-200 images recommended
Coverage: Capture from multiple angles
Quality: Sharp, well-lit photos
Overlap: 60-80% overlap between adjacent photos

Image Capture Tips

Walk around the subject in a circle
Take photos at multiple heights
Avoid motion blur
Keep lighting consistent across all photos

Step 2: Import Images

Open the Image→Splat tab
Drag and drop images or click to browse
Images appear in the preview grid

Step 3: Configure Settings

Setting	Options	Description
Quality Preset	Fast / Balanced / High	Processing quality vs. speed
GPU Acceleration	On / Off	Use CUDA if available
Output Format	SOG / PLY	Final output format
Iterations	1000-30000	Training iterations

Quality Presets

Preset	Iterations	Time (100 images)	Quality
Fast	3000	~5 min	Good
Balanced	10000	~15 min	Better
High	30000	~45 min	Best
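
The preset table can be written as a small lookup. This is a minimal sketch, not the tool's actual code: `resolveIterations` is a hypothetical helper, the iteration counts come from the table above, and the rule that an explicit iteration count overrides the preset is an assumption.

```typescript
// Hypothetical helper, not part of splat-tools: mirrors the preset table above.
type QualityPreset = "fast" | "balanced" | "high";

const PRESET_ITERATIONS: Record<QualityPreset, number> = {
  fast: 3000,
  balanced: 10000,
  high: 30000,
};

// Range taken from the Iterations row of the settings table.
const MIN_ITERATIONS = 1000;
const MAX_ITERATIONS = 30000;

// Assumption: an explicit iteration count takes precedence over the preset.
function resolveIterations(preset: QualityPreset, explicit?: number): number {
  const requested = explicit ?? PRESET_ITERATIONS[preset];
  // Clamp to the documented range so out-of-range values stay usable.
  return Math.min(MAX_ITERATIONS, Math.max(MIN_ITERATIONS, Math.round(requested)));
}
```

For example, `resolveIterations("balanced")` gives 10000, while `resolveIterations("balanced", 20000)` gives 20000.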

Step 4: Process

Click Process
Watch progress:
  Camera estimation (COLMAP)
  Point cloud generation
  Gaussian training
  Output generation
Preview result when complete

Step 5: Export

Review the preview
Adjust camera if needed
Click Export
Choose output location and format

Using the CLI

Basic Usage

# Process images to splat
splat-tools image2splat --input ./photos --output scene.sog

# With quality preset
splat-tools image2splat --input ./photos --output scene.sog --preset high

Advanced Options

splat-tools image2splat [options]

Options:
  --input, -i        Input directory with images
  --output, -o       Output file path
  --preset, -p       Quality preset (fast, balanced, high)
  --iterations       Number of training iterations
  --format, -f       Output format (sog, ply)
  --gpu              GPU device ID, or cpu for CPU fallback (default: auto)
  --verbose          Show detailed progress
  --keep-temp        Keep intermediate files

Examples

# High quality with specific GPU
splat-tools image2splat \
  --input ./my-photos \
  --output ./output/scene.sog \
  --preset high \
  --gpu 0

# Custom iterations
splat-tools image2splat \
  --input ./photos \
  --output scene.ply \
  --iterations 20000 \
  --keep-temp
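
When the CLI is driven from a script, it can help to build the argument list from a typed options object. The sketch below is illustrative only: `Image2SplatCliOptions` and `buildImage2SplatArgs` are hypothetical names, and only the flags listed under Advanced Options are used.

```typescript
// Hypothetical wrapper: only the flags listed in the options above are used.
interface Image2SplatCliOptions {
  input: string;
  output: string;
  preset?: "fast" | "balanced" | "high";
  iterations?: number;
  format?: "sog" | "ply";
  gpu?: string; // a device ID such as "0", or "auto"
  verbose?: boolean;
  keepTemp?: boolean;
}

// Turns the options object into an argument array for the splat-tools binary.
function buildImage2SplatArgs(opts: Image2SplatCliOptions): string[] {
  const args = ["image2splat", "--input", opts.input, "--output", opts.output];
  if (opts.preset !== undefined) args.push("--preset", opts.preset);
  if (opts.iterations !== undefined) args.push("--iterations", String(opts.iterations));
  if (opts.format !== undefined) args.push("--format", opts.format);
  if (opts.gpu !== undefined) args.push("--gpu", opts.gpu);
  if (opts.verbose) args.push("--verbose");
  if (opts.keepTemp) args.push("--keep-temp");
  return args;
}
```

The resulting array can be handed to Node's `child_process.spawn("splat-tools", args)`. Passing arguments as an array avoids shell quoting problems with paths that contain spaces.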

Programmatic Usage

import { image2splat } from '@storysplat/splat-tools';

const result = await image2splat({
  input: './photos',
  output: './scene.sog',
  preset: 'balanced',
  gpu: 'auto',
  onProgress: (stage, progress, message) => {
    console.log(`${stage}: ${(progress * 100).toFixed(0)}% - ${message}`);
  },
});

console.log(`Created: ${result.outputPath}`);
console.log(`Gaussians: ${result.stats.gaussianCount}`);
console.log(`Processing time: ${result.stats.totalTime}s`);
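
Progress callbacks can fire far more often than is useful in a log. The following sketch builds a callback with the same `(stage, progress, message)` shape as `onProgress` above and skips updates that would print the same stage and whole-number percentage twice. `makeProgressLogger` is a hypothetical helper, and treating `progress` as a fraction between 0 and 1 is an assumption drawn from the example.

```typescript
// Hypothetical helper: builds a callback matching the (stage, progress, message)
// shape used by onProgress in the example above.
type ProgressCallback = (stage: string, progress: number, message: string) => void;

function makeProgressLogger(sink: (line: string) => void): ProgressCallback {
  let lastStage = "";
  let lastPercent = -1;
  return (stage, progress, message) => {
    // Assumption: progress is a fraction in [0, 1], as the example above implies.
    const percent = Math.round(Math.min(1, Math.max(0, progress)) * 100);
    // Skip updates that would repeat the previous stage and percentage.
    if (stage === lastStage && percent === lastPercent) return;
    lastStage = stage;
    lastPercent = percent;
    sink(`${stage}: ${percent}% - ${message}`);
  };
}
```

It could be passed as `onProgress: makeProgressLogger(console.log)`, or given a sink that appends to a file or updates a UI.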

Pipeline Details

Stage 1: Feature Detection

COLMAP detects features in each image:

Detecting features: 100/100 images
Found 125,432 features average per image

Stage 2: Feature Matching

Match features between overlapping images:

Matching features: 4950/4950 pairs
Matched 89,234 feature pairs
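
The 4950 pairs in the sample log are consistent with matching every one of the 100 images against every other, which gives n(n-1)/2 pairs. Whether the pipeline always matches exhaustively is an assumption. The sketch below shows how quickly that count grows with the number of images.

```typescript
// Number of unordered image pairs when every image is matched against every other.
function exhaustivePairCount(imageCount: number): number {
  if (imageCount < 0 || Math.floor(imageCount) !== imageCount) {
    throw new RangeError("imageCount must be a non-negative integer");
  }
  return (imageCount * (imageCount - 1)) / 2;
}

console.log(exhaustivePairCount(100)); // 4950, the pair count in the log above
console.log(exhaustivePairCount(200)); // 19900
```

Doubling the image count roughly quadruples the number of pairs, which is one reason larger captures take disproportionately longer in this stage.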

Stage 3: Sparse Reconstruction

Build sparse point cloud and camera poses:

Sparse reconstruction complete
Cameras: 100
Points: 45,678
Reprojection error: 0.52px
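
Reprojection error measures, in pixels, how far a reconstructed 3D point lands from the feature it came from when projected back into the image. The sketch below computes a mean error for a simple pinhole camera. It is a simplification for illustration: points are assumed to be already in the camera frame (so no rotation or translation is applied), lens distortion is ignored, and all type and function names are hypothetical.

```typescript
// Simple pinhole model: focal lengths and principal point, all in pixels.
interface PinholeIntrinsics { fx: number; fy: number; cx: number; cy: number; }

interface Observation {
  pointCam: [number, number, number]; // 3D point in the camera frame, Z > 0
  pixel: [number, number];            // where the feature was detected
}

// Projects a camera-frame point onto the image plane.
function projectPinhole(k: PinholeIntrinsics, p: [number, number, number]): [number, number] {
  return [(k.fx * p[0]) / p[2] + k.cx, (k.fy * p[1]) / p[2] + k.cy];
}

// Mean Euclidean distance between projected and detected pixel positions.
function meanReprojectionError(k: PinholeIntrinsics, obs: Observation[]): number {
  if (obs.length === 0) throw new RangeError("need at least one observation");
  let total = 0;
  for (let i = 0; i < obs.length; i++) {
    const [u, v] = projectPinhole(k, obs[i].pointCam);
    const du = u - obs[i].pixel[0];
    const dv = v - obs[i].pixel[1];
    total += Math.sqrt(du * du + dv * dv);
  }
  return total / obs.length;
}
```

A sub-pixel mean, such as the 0.52px in the sample log, indicates that the estimated cameras and points agree closely with the detected features.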

Stage 4: Gaussian Training

Train 3D Gaussians, starting from the sparse point cloud:

Iteration 10000/10000
Loss: 0.0234
Gaussians: 523,456
PSNR: 28.5 dB
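
PSNR (peak signal-to-noise ratio) compares a rendered view with the original photo. Higher is better. The sketch below uses the standard definition, 10 * log10(MAX^2 / MSE). How the pipeline computes its reported value (color space, averaging across views) is not stated here, so treat `psnrDb` as a hypothetical reference implementation rather than the tool's own metric.

```typescript
// PSNR in dB between two images given as flat arrays of values in [0, maxValue].
function psnrDb(rendered: number[], reference: number[], maxValue: number = 1): number {
  if (rendered.length !== reference.length || rendered.length === 0) {
    throw new RangeError("images must be non-empty and the same size");
  }
  let squaredError = 0;
  for (let i = 0; i < rendered.length; i++) {
    const d = rendered[i] - reference[i];
    squaredError += d * d;
  }
  const mse = squaredError / rendered.length;
  if (mse === 0) return Infinity; // identical images
  // log10(x) written as ln(x) / ln(10).
  return (10 * Math.log((maxValue * maxValue) / mse)) / Math.LN10;
}
```

With values in [0, 1], a mean squared error of 0.01 corresponds to 20 dB, and every tenfold drop in error adds 10 dB.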

Stage 5: Export

Convert to final format:

Exporting to SOG format...
Output: scene.sog (45.2 MB)

Image Capture Guide

Indoor Scenes

Use consistent artificial lighting
Avoid reflective surfaces
Include ceiling/floor in some shots
Capture corners and edges

Outdoor Scenes

Overcast days provide even lighting
Avoid harsh shadows
Capture at a consistent time of day so the lighting does not change
Include ground and sky references

Objects

Place on neutral background
Move the camera around the object rather than rotating the object
Include top-down views
Avoid transparent/reflective materials

Problem Areas

Issue	Solution
Reflective surfaces	Use a polarizing filter, or avoid them
Transparent objects	Include background reference
Moving elements	Remove from scene or mask
Poor lighting	Add supplemental lighting

Troubleshooting

"COLMAP failed - insufficient features"

Add more images with better overlap
Ensure images are sharp
Check for motion blur
Improve lighting
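
Blurry photos can be screened out before processing. One common heuristic, shown here as a sketch and not something splat-tools is known to do, is the variance of the Laplacian: sharp images have strong edges and therefore a high variance, while blurred ones score low. `laplacianVariance` is a hypothetical helper that takes a grayscale image as a flat row-major array. A usable threshold depends on resolution and content, so compare scores within one capture set instead of relying on a fixed number.

```typescript
// Variance of the 4-neighbour Laplacian over the interior pixels of a
// grayscale image stored as a flat, row-major array.
function laplacianVariance(gray: number[], width: number, height: number): number {
  if (gray.length !== width * height || width < 3 || height < 3) {
    throw new RangeError("expected a width*height grayscale buffer, at least 3x3");
  }
  const responses: number[] = [];
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      // Kernel: left + right + up + down - 4 * centre.
      responses.push(gray[i - 1] + gray[i + 1] + gray[i - width] + gray[i + width] - 4 * gray[i]);
    }
  }
  let mean = 0;
  for (let i = 0; i < responses.length; i++) mean += responses[i];
  mean /= responses.length;
  let variance = 0;
  for (let i = 0; i < responses.length; i++) {
    const d = responses[i] - mean;
    variance += d * d;
  }
  return variance / responses.length;
}
```

Images whose score is far below the rest of the set are good candidates to retake or drop.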

"GPU out of memory"

Reduce iteration count
Process fewer images
Use --gpu cpu for CPU fallback
Close other GPU applications

"Poor reconstruction quality"

Add more images
Increase iteration count
Check image quality
Ensure proper scene coverage

"Processing stuck"

Check GPU temperature
Verify CUDA installation
Review console for errors
Try smaller image batch first

Output Files

After processing, you'll have:

output/
├── scene.sog          # Final splat file
├── cameras.json       # Camera positions (optional)
└── preview.png        # Thumbnail (optional)

Best Practices

Start small - Test with 20-30 images first
Check coverage - Ensure no gaps in capture
Consistent conditions - Same lighting throughout
Quality images - Sharp, properly exposed
Iterate - Adjust and re-process as needed
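
For the "start small" test run, an evenly spaced subset of the capture usually preserves coverage better than the first 20-30 files. The sketch below picks such a subset. `pickEvenlySpaced` is a hypothetical helper, and it assumes the input list is sorted in capture order.

```typescript
// Picks `count` items spread evenly across the list, always including the first.
// Assumption: `items` is sorted in capture order (for example, by filename).
function pickEvenlySpaced<T>(items: T[], count: number): T[] {
  if (count <= 0) return [];
  if (count >= items.length) return items.slice();
  const picked: T[] = [];
  for (let i = 0; i < count; i++) {
    // Integer index that steps through the whole sequence in equal strides.
    picked.push(items[Math.floor((i * items.length) / count)]);
  }
  return picked;
}
```

Picking 25 items from a 100-image capture selects every fourth image, which keeps the full orbit represented while cutting processing time for the trial run.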

Next Steps

Video to 4DGS - Create volumetric video
Format Conversion - Optimize output
Viewer Integration - Display your splat