Image to Splat

Create 3D Gaussian Splats from photographs using AI-powered reconstruction.

Overview

The Image-to-Splat pipeline uses:

  1. COLMAP - Structure from Motion (SfM) for camera pose estimation
  2. SHARP - Our optimized Gaussian Splatting training pipeline
  3. GPU Acceleration - CUDA for fast processing (NVIDIA GPUs)

Requirements

Hardware

  • GPU: NVIDIA GPU with 6GB+ VRAM (recommended)
  • RAM: 16GB+ recommended
  • Storage: 10GB+ free space

Software

  • Python 3.10+ (3.11 recommended)
  • NVIDIA drivers (for GPU acceleration)

Using the Desktop App

Step 1: Prepare Images

Capture or collect images:

  • Quantity: 50-200 images recommended
  • Coverage: Capture from multiple angles
  • Quality: Sharp, well-lit photos
  • Overlap: 60-80% overlap between adjacent photos
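A quick pre-flight check can catch a photo set that falls outside the recommended range before any processing starts. The sketch below is not part of splat-tools; `filterImageFiles`, `assessImageCount`, and the extension list are hypothetical names and assumptions chosen for illustration.

```typescript
// Extensions treated as images here are an assumption, not a list taken from the tool.
const IMAGE_EXTENSIONS = [".jpg", ".jpeg", ".png"];

type CountVerdict = "below-recommended" | "ok" | "above-recommended";

// Keep only file names whose extension looks like an image (case-insensitive).
function filterImageFiles(names: string[]): string[] {
  return names.filter((name) => {
    const lower = name.toLowerCase();
    return IMAGE_EXTENSIONS.some((ext) => lower.endsWith(ext));
  });
}

// Compare a photo count against the 50-200 range recommended above.
function assessImageCount(count: number): CountVerdict {
  if (count < 50) return "below-recommended";
  if (count > 200) return "above-recommended";
  return "ok";
}
```

In a Node script, the names could come from `fs.readdirSync("./photos")`, and a verdict other than `"ok"` could be printed as a warning rather than treated as an error.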

Image Capture Tips

  • Walk around the subject in a circle
  • Take photos at multiple heights
  • Avoid motion blur
  • Keep lighting consistent across all shots

Step 2: Import Images

  1. Open the Image→Splat tab
  2. Drag and drop images or click to browse
  3. Images appear in the preview grid

Step 3: Configure Settings

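| Setting          | Options                | Description                  |
|------------------|------------------------|------------------------------|
| Quality Preset   | Fast / Balanced / High | Processing quality vs. speed |
| GPU Acceleration | On / Off               | Use CUDA if available        |
| Output Format    | SOG / PLY              | Final output format          |
| Iterations       | 1000-30000             | Training iterations          |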

Quality Presets

| Preset   | Iterations | Time (100 images) | Quality |
|----------|------------|-------------------|---------|
| Fast     | 3000       | ~5 min            | Good    |
| Balanced | 10000      | ~15 min           | Better  |
| High     | 30000      | ~45 min           | Best    |
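The preset values can be expressed as a small lookup, which is handy when scripting runs. This is a sketch only: `resolveIterations` is a hypothetical helper, and the rule that an explicit iteration count takes priority over the preset is an assumption, not documented behavior. The 1000-30000 clamp mirrors the range of the Iterations setting.

```typescript
type Preset = "fast" | "balanced" | "high";

// Iteration counts copied from the Quality Presets table.
const PRESET_ITERATIONS: Record<Preset, number> = {
  fast: 3000,
  balanced: 10000,
  high: 30000,
};

// Range of the Iterations setting in the desktop app.
const MIN_ITERATIONS = 1000;
const MAX_ITERATIONS = 30000;

// Assumption: an explicit iteration count wins over the preset.
// The explicit value is rounded and clamped into the documented range.
function resolveIterations(preset: Preset, override?: number): number {
  if (override === undefined) return PRESET_ITERATIONS[preset];
  const rounded = Math.round(override);
  return Math.min(MAX_ITERATIONS, Math.max(MIN_ITERATIONS, rounded));
}
```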

Step 4: Process

  1. Click Process
  2. Watch progress:
     • Camera estimation (COLMAP)
     • Point cloud generation
     • Gaussian training
     • Output generation
  3. Preview result when complete

Step 5: Export

  1. Review the preview
  2. Adjust camera if needed
  3. Click Export
  4. Choose output location and format

Using the CLI

Basic Usage

# Process images to splat
splat-tools image2splat --input ./photos --output scene.sog

# With quality preset
splat-tools image2splat --input ./photos --output scene.sog --preset high

Advanced Options

splat-tools image2splat [options]

Options:
  --input, -i        Input directory with images
  --output, -o       Output file path
  --preset, -p       Quality preset (fast, balanced, high)
  --iterations       Number of training iterations
  --format, -f       Output format (sog, ply)
  --gpu              GPU device ID (default: auto)
  --verbose          Show detailed progress
  --keep-temp        Keep intermediate files

Examples

# High quality with specific GPU
splat-tools image2splat \
  --input ./my-photos \
  --output ./output/scene.sog \
  --preset high \
  --gpu 0

# Custom iterations
splat-tools image2splat \
  --input ./photos \
  --output scene.ply \
  --iterations 20000 \
  --keep-temp
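When the CLI is driven from a script, building the argument list in one place avoids quoting mistakes. The sketch below uses only the flags listed under Advanced Options; `Image2SplatCliOptions` and `buildImage2SplatArgs` are hypothetical names, not part of the package.

```typescript
// Hypothetical option bag mirroring the flags listed under Advanced Options.
interface Image2SplatCliOptions {
  input: string;
  output: string;
  preset?: "fast" | "balanced" | "high";
  iterations?: number;
  format?: "sog" | "ply";
  gpu?: string;
  verbose?: boolean;
  keepTemp?: boolean;
}

// Build the argument array for `splat-tools`; optional flags are added only when set.
function buildImage2SplatArgs(opts: Image2SplatCliOptions): string[] {
  const args = ["image2splat", "--input", opts.input, "--output", opts.output];
  if (opts.preset !== undefined) args.push("--preset", opts.preset);
  if (opts.iterations !== undefined) args.push("--iterations", String(opts.iterations));
  if (opts.format !== undefined) args.push("--format", opts.format);
  if (opts.gpu !== undefined) args.push("--gpu", opts.gpu);
  if (opts.verbose) args.push("--verbose");
  if (opts.keepTemp) args.push("--keep-temp");
  return args;
}
```

The resulting array can be passed to `spawn("splat-tools", args)` from Node's `child_process` module, which keeps paths containing spaces intact without shell quoting.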

Programmatic Usage

import { image2splat } from '@storysplat/splat-tools';

const result = await image2splat({
  input: './photos',
  output: './scene.sog',
  preset: 'balanced',
  gpu: 'auto',
  onProgress: (stage, progress, message) => {
    console.log(`${stage}: ${(progress * 100).toFixed(0)}% - ${message}`);
  },
});

console.log(`Created: ${result.outputPath}`);
console.log(`Gaussians: ${result.stats.gaussianCount}`);
console.log(`Processing time: ${result.stats.totalTime}s`);
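If the progress callback fires very often during training, the console can fill up quickly. One possible approach is a wrapper that logs only when the stage or the whole-number percentage changes. This sketch assumes the callback signature shown in the example above, with `progress` as a fraction between 0 and 1; `createThrottledProgress` is a hypothetical helper, not part of the package.

```typescript
type ProgressHandler = (stage: string, progress: number, message: string) => void;

// Returns a handler that forwards a line to `log` only when the stage or the
// rounded percentage differs from the previous call.
function createThrottledProgress(log: (line: string) => void): ProgressHandler {
  let lastKey = "";
  return (stage, progress, message) => {
    // Clamp to [0, 1] before converting to a whole-number percentage.
    const percent = Math.round(Math.min(1, Math.max(0, progress)) * 100);
    const key = `${stage}:${percent}`;
    if (key === lastKey) return;
    lastKey = key;
    log(`${stage}: ${percent}% - ${message}`);
  };
}
```

The returned function could be passed as `onProgress: createThrottledProgress(console.log)`.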

Pipeline Details

Stage 1: Feature Detection

COLMAP detects features in each image:

Detecting features: 100/100 images
Found 125,432 features average per image

Stage 2: Feature Matching

Match features between overlapping images:

Matching features: 4950/4950 pairs
Matched 89,234 feature pairs
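The 4950 pairs in the sample log equal the number of unordered pairs among 100 images, n(n-1)/2, which is what matching every image against every other would produce. Whether the pipeline always matches exhaustively is an assumption here; the sketch only shows how the pair count grows with the image count.

```typescript
// Number of unordered image pairs when every image is matched against every other.
function exhaustivePairCount(imageCount: number): number {
  if (!Number.isInteger(imageCount) || imageCount < 0) {
    throw new RangeError("imageCount must be a non-negative integer");
  }
  return (imageCount * (imageCount - 1)) / 2;
}
```

Because the count grows quadratically, doubling the photo set from 100 to 200 roughly quadruples the number of pairs (4950 to 19900).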

Stage 3: Sparse Reconstruction

Build sparse point cloud and camera poses:

Sparse reconstruction complete
Cameras: 100
Points: 45,678
Reprojection error: 0.52px
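Reprojection error measures how far, in pixels, a reconstructed 3D point lands from where its feature was actually observed when projected back into an image. The sketch below illustrates the idea with a simple pinhole camera and no lens distortion; this is a simplification, and the type and function names are hypothetical rather than COLMAP's API.

```typescript
interface Point3 { x: number; y: number; z: number } // point in camera coordinates
interface Pixel { u: number; v: number }
interface Pinhole { fx: number; fy: number; cx: number; cy: number }
interface Match { point: Point3; observed: Pixel }

// Project a camera-space point to pixel coordinates with an ideal pinhole model.
function project(p: Point3, cam: Pinhole): Pixel {
  return {
    u: cam.fx * (p.x / p.z) + cam.cx,
    v: cam.fy * (p.y / p.z) + cam.cy,
  };
}

// Mean Euclidean distance (in pixels) between projected and observed positions.
function meanReprojectionError(matches: Match[], cam: Pinhole): number {
  if (matches.length === 0) throw new RangeError("at least one match is required");
  let total = 0;
  for (const { point, observed } of matches) {
    const predicted = project(point, cam);
    total += Math.hypot(predicted.u - observed.u, predicted.v - observed.v);
  }
  return total / matches.length;
}
```

A sub-pixel mean, like the 0.52px in the sample output, indicates that the estimated camera poses and sparse points agree closely with the detected features.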

Stage 4: Gaussian Training

Train 3D Gaussians on the point cloud:

Iteration 10000/10000
Loss: 0.0234
Gaussians: 523,456
PSNR: 28.5 dB
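PSNR (peak signal-to-noise ratio) compares a rendered view against the original photo; higher values mean a closer match. A minimal sketch of the standard formula, 10·log10(MAX² / MSE), follows. It assumes pixel values normalized to the range 0 to 1 and is not the pipeline's own implementation.

```typescript
// Mean squared error between two equally sized arrays of pixel values.
function meanSquaredError(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new RangeError("inputs must be non-empty and of equal length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum / a.length;
}

// PSNR in decibels; maxValue is the largest possible pixel value (1 for normalized data).
function psnr(a: number[], b: number[], maxValue = 1): number {
  const mse = meanSquaredError(a, b);
  if (mse === 0) return Infinity; // identical images
  return 10 * Math.log10((maxValue * maxValue) / mse);
}
```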

Stage 5: Export

Convert to final format:

Exporting to SOG format...
Output: scene.sog (45.2 MB)

Image Capture Guide

Indoor Scenes

  • Use consistent artificial lighting
  • Avoid reflective surfaces
  • Include ceiling/floor in some shots
  • Capture corners and edges

Outdoor Scenes

  • Overcast days provide even lighting
  • Avoid harsh shadows
  • Capture all photos within a short time window so the lighting stays consistent
  • Include ground and sky references

Objects

  • Place on neutral background
  • Move the camera around the object rather than rotating the object
  • Include top-down views
  • Avoid transparent/reflective materials

Problem Areas

| Issue               | Solution                       |
|---------------------|--------------------------------|
| Reflective surfaces | Add polarizing filter or avoid |
| Transparent objects | Include background reference   |
| Moving elements     | Remove from scene or mask      |
| Poor lighting       | Add supplemental lighting      |

Troubleshooting

"COLMAP failed - insufficient features"

  • Add more images with better overlap
  • Ensure images are sharp
  • Check for motion blur
  • Improve lighting

"GPU out of memory"

  • Reduce iteration count
  • Process fewer images
  • Use --gpu cpu for CPU fallback
  • Close other GPU applications

"Poor reconstruction quality"

  • Add more images
  • Increase iteration count
  • Check image quality
  • Ensure proper scene coverage

"Processing stuck"

  • Check GPU temperature
  • Verify CUDA installation
  • Review console for errors
  • Try smaller image batch first

Output Files

After processing, you'll have:

output/
├── scene.sog          # Final splat file
├── cameras.json       # Camera positions (optional)
└── preview.png        # Thumbnail (optional)

Best Practices

  1. Start small - Test with 20-30 images first
  2. Check coverage - Ensure no gaps in capture
  3. Consistent conditions - Same lighting throughout
  4. Quality images - Sharp, properly exposed
  5. Iterate - Adjust and re-process as needed
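For the "start small" test run, one option is to pick an evenly spaced subset of the full photo set rather than the first 20-30 files, so the sample still covers the whole capture path. The sketch assumes the file list is sorted in capture order; `pickEvenSubset` is a hypothetical helper.

```typescript
// Pick `targetCount` items spread evenly across the list, preserving order.
function pickEvenSubset<T>(items: T[], targetCount: number): T[] {
  if (targetCount <= 0) return [];
  if (targetCount >= items.length) return [...items];
  // step > 1 here, so each floor(i * step) is a distinct index.
  const step = items.length / targetCount;
  const subset: T[] = [];
  for (let i = 0; i < targetCount; i++) {
    subset.push(items[Math.floor(i * step)]);
  }
  return subset;
}
```

The selected files could then be copied into a temporary directory and passed to `--input` for a quick trial before processing the full set.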

Next Steps