Batch Image Processing
This example demonstrates how to process multiple images concurrently using Umwelten's batch processing capabilities. It is the CLI counterpart of the migrated image-feature-batch.ts script.
Basic Batch Processing
Simple Batch Image Analysis
Process all images in a directory with the same prompt across multiple models:
bash
umwelten eval batch \
--prompt "Analyze this image and describe key features including: objects, colors, composition, and any notable characteristics." \
--models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
--id "image-batch-analysis" \
--directory "input/images" \
--file-pattern "*.{jpg,jpeg,png}" \
--concurrent \
--max-concurrency 3
Structured Feature Extraction (Full Migration)
This is the complete CLI equivalent of the original image-feature-batch.ts script:
bash
umwelten eval batch \
--prompt "Analyze this image and extract features including: able_to_parse (boolean), image_description (string), contain_text (boolean), color_palette (warm/cool/monochrome/earthy/pastel/vibrant/neutral/unknown), aesthetic_style (realistic/cartoon/abstract/clean/vintage/moody/minimalist/unknown), time_of_day (day/night/unknown), scene_type (indoor/outdoor/unknown), people_count (number), dress_style (fancy/casual/unknown). Return as JSON with confidence scores." \
--models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
--id "image-feature-batch" \
--directory "input/images" \
--file-pattern "*.jpeg" \
--concurrent \
--max-concurrency 5
With Schema Validation
Use structured output validation for consistent results:
bash
umwelten eval batch \
--prompt "Extract structured image features with confidence scores" \
--models "google:gemini-2.0-flash,google:gemini-1.5-flash-8b" \
--id "structured-image-batch" \
--directory "input/images" \
--file-pattern "*.{jpg,jpeg,png,webp}" \
--zod-schema "./schemas/image-feature-schema.ts" \
--concurrent \
--validate-output \
--coerce-types
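The schema file itself is not shown in this guide. As a rough illustration of the constraints it would need to encode, here is a dependency-free TypeScript sketch that mirrors the fields named in the extraction prompt. The names `FIELD_CHECKS`, `oneOf`, and `validateFeatures` are hypothetical, the 0 to 1 confidence range is an assumption, and a real `image-feature-schema.ts` would express the same rules with Zod rather than hand-written checks.

```typescript
// Hypothetical sketch: a plain TypeScript mirror of the image feature fields.
// A real --zod-schema file would declare these constraints with Zod instead.

type Check = (value: unknown) => boolean;

// Build a check that accepts only one of the listed string literals.
function oneOf(allowed: readonly string[]): Check {
  return (value) => typeof value === "string" && allowed.indexOf(value) !== -1;
}

// One entry per feature named in the extraction prompt.
const FIELD_CHECKS: Record<string, Check> = {
  able_to_parse: (v) => typeof v === "boolean",
  image_description: (v) => typeof v === "string",
  contain_text: (v) => typeof v === "boolean",
  color_palette: oneOf([
    "warm", "cool", "monochrome", "earthy",
    "pastel", "vibrant", "neutral", "unknown",
  ]),
  aesthetic_style: oneOf([
    "realistic", "cartoon", "abstract", "clean",
    "vintage", "moody", "minimalist", "unknown",
  ]),
  time_of_day: oneOf(["day", "night", "unknown"]),
  scene_type: oneOf(["indoor", "outdoor", "unknown"]),
  people_count: (v) => typeof v === "number" && Number.isInteger(v) && v >= 0,
  dress_style: oneOf(["fancy", "casual", "unknown"]),
};

// Return human-readable problems; an empty list means the object passed.
// Assumption: every feature is wrapped as { value, confidence } with
// confidence in the range 0..1.
function validateFeatures(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["response is not an object"];
  }
  const record = input as Record<string, unknown>;
  const problems: string[] = [];
  for (const fieldName of Object.keys(FIELD_CHECKS)) {
    const field = record[fieldName];
    if (typeof field !== "object" || field === null) {
      problems.push(`${fieldName}: missing`);
      continue;
    }
    const { value, confidence } = field as { value?: unknown; confidence?: unknown };
    if (!FIELD_CHECKS[fieldName](value)) {
      problems.push(`${fieldName}: invalid value`);
    }
    if (typeof confidence !== "number" || confidence < 0 || confidence > 1) {
      problems.push(`${fieldName}: confidence must be between 0 and 1`);
    }
  }
  return problems;
}

// An incomplete response: every feature except able_to_parse is reported missing.
console.log(validateFeatures({ able_to_parse: { value: true, confidence: 0.98 } }));
```

Returning a list of problems instead of throwing on the first one makes it easier to see everything a model got wrong in a single pass, which is useful when testing a schema on one image before a large run.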
Advanced Batch Processing
Different File Patterns
Target specific file types or naming patterns:
bash
# Process only high-resolution images
umwelten eval batch \
--prompt "Analyze this high-resolution image for technical quality" \
--models "google:gemini-2.0-flash" \
--id "high-res-batch" \
--directory "photos/high-res" \
--file-pattern "*_4k.jpg" \
--concurrent
# Process screenshots separately
umwelten eval batch \
--prompt "Analyze this screenshot and extract any visible text or UI elements" \
--models "google:gemini-2.0-flash" \
--id "screenshot-batch" \
--directory "screenshots" \
--file-pattern "screenshot_*.png" \
--concurrent
Recursive Directory Processing
Process images in subdirectories:
bash
umwelten eval batch \
--prompt "Categorize this image by content type and quality" \
--models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
--id "recursive-image-batch" \
--directory "media" \
--file-pattern "**/*.{jpg,png}" \
--concurrent \
--max-concurrency 4
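To reason about which files a pattern such as `**/*.{jpg,png}` selects, it can help to translate the glob into a regular expression. The sketch below handles only `*`, `**`, `?`, and `{a,b}` alternatives, and it assumes `/`-separated paths relative to `--directory`. `globToRegExp` is a hypothetical helper; the matcher the CLI actually uses may treat case, dotfiles, or nesting differently.

```typescript
// Hypothetical helper: convert a small subset of glob syntax into a RegExp.
// Supported: "*" (within one path segment), "**" (across segments),
// "?" (one character), and "{a,b}" alternatives.
function globToRegExp(glob: string): RegExp {
  let source = "";
  let braceDepth = 0;
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        if (glob.charAt(i + 2) === "/") {
          source += "(?:.*/)?"; // "**/" matches zero or more directories
          i += 2;
        } else {
          source += ".*"; // bare "**" matches across separators
          i += 1;
        }
      } else {
        source += "[^/]*"; // "*" stays inside one path segment
      }
    } else if (ch === "?") {
      source += "[^/]";
    } else if (ch === "{") {
      braceDepth++;
      source += "(?:";
    } else if (ch === "}" && braceDepth > 0) {
      braceDepth--;
      source += ")";
    } else if (ch === "," && braceDepth > 0) {
      source += "|";
    } else {
      // Escape characters that are special in regular expressions.
      source += ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${source}$`);
}

const recursivePattern = globToRegExp("**/*.{jpg,png}");
const candidates = ["cover.jpg", "2024/trip/beach.png", "2024/notes.txt"];
console.log(candidates.filter((path) => recursivePattern.test(path)));
// [ 'cover.jpg', '2024/trip/beach.png' ]
```

Under these rules a flat pattern like `*.jpg` does not descend into subdirectories, which is why the recursive example uses the `**/` prefix.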
File Limit Controls
Process a limited number of files for testing:
bash
umwelten eval batch \
--prompt "Analyze this image for content moderation" \
--models "google:gemini-2.0-flash" \
--id "moderation-test" \
--directory "user-uploads" \
--file-pattern "*.jpg" \
--file-limit 10 \
--concurrent
Interactive Batch Processing
Real-time Progress Monitoring
Watch batch processing progress in real time:
bash
umwelten eval batch \
--prompt "Extract detailed metadata from this image" \
--models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
--id "metadata-extraction" \
--directory "photo-library" \
--file-pattern "*.{jpg,jpeg}" \
--ui \
--concurrent \
--max-concurrency 3
Generate Comprehensive Reports
Markdown Report with Image Analysis
bash
# Generate detailed markdown report
umwelten eval report --id image-feature-batch --format markdown
HTML Report with Embedded Previews
bash
# Generate HTML report with rich formatting
umwelten eval report --id structured-image-batch --format html --output batch-report.html
CSV Export for Analysis
bash
# Export structured data for further analysis
umwelten eval report --id structured-image-batch --format csv --output image-data.csv
Expected Output Structure
Directory Structure After Processing
output/evaluations/image-feature-batch/
├── responses/
│   ├── image1.jpg/
│   │   ├── google_gemini-2.0-flash.json
│   │   └── ollama_qwen2.5vl_latest.json
│   ├── image2.jpg/
│   │   ├── google_gemini-2.0-flash.json
│   │   └── ollama_qwen2.5vl_latest.json
│   └── image3.jpg/
│       ├── google_gemini-2.0-flash.json
│       └── ollama_qwen2.5vl_latest.json
└── reports/
    ├── results.md
    └── results.html
Sample Response JSON
json
{
  "content": {
    "able_to_parse": {
      "value": true,
      "confidence": 0.98
    },
    "image_description": {
      "value": "A vibrant outdoor scene showing children playing in a park with swings and slides. The setting is during daytime with clear blue skies and green grass.",
      "confidence": 0.92
    },
    "contain_text": {
      "value": false,
      "confidence": 0.95
    },
    "color_palette": {
      "value": "vibrant",
      "confidence": 0.88
    },
    "aesthetic_style": {
      "value": "realistic",
      "confidence": 0.94
    },
    "time_of_day": {
      "value": "day",
      "confidence": 0.97
    },
    "scene_type": {
      "value": "outdoor",
      "confidence": 0.96
    },
    "people_count": {
      "value": 3,
      "confidence": 0.85
    },
    "dress_style": {
      "value": "casual",
      "confidence": 0.89
    }
  },
  "metadata": {
    "model": "gemini-2.0-flash",
    "provider": "google",
    "filename": "playground_scene.jpg",
    "startTime": "2025-01-27T18:30:15.123Z",
    "endTime": "2025-01-27T18:30:18.456Z",
    "tokenUsage": {
      "promptTokens": 45,
      "completionTokens": 156,
      "total": 201
    },
    "cost": {
      "promptCost": 0.00000338,
      "completionCost": 0.0000468,
      "totalCost": 0.00005018
    }
  }
}
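A response in this shape is straightforward to post-process. The following sketch, which assumes only the `content` and `metadata` fields shown in the sample, lists the features whose confidence falls below a threshold and derives the processing time from the two timestamps. `lowConfidence` and `durationMs` are hypothetical helper names, not part of the CLI.

```typescript
// Minimal types covering only the fields this sketch reads.
interface ScoredFeature {
  value: unknown;
  confidence: number;
}

interface BatchResponse {
  content: Record<string, ScoredFeature>;
  metadata: { startTime: string; endTime: string };
}

// Names of features whose confidence is below the threshold, in file order.
function lowConfidence(response: BatchResponse, threshold: number): string[] {
  const flagged: string[] = [];
  for (const featureName of Object.keys(response.content)) {
    const feature = response.content[featureName];
    if (feature !== undefined && feature.confidence < threshold) {
      flagged.push(featureName);
    }
  }
  return flagged;
}

// Elapsed time between the ISO 8601 start and end timestamps.
function durationMs(response: BatchResponse): number {
  return Date.parse(response.metadata.endTime) - Date.parse(response.metadata.startTime);
}

// Abridged copy of the sample response, kept short for readability.
const abridged: BatchResponse = {
  content: {
    able_to_parse: { value: true, confidence: 0.98 },
    color_palette: { value: "vibrant", confidence: 0.88 },
    people_count: { value: 3, confidence: 0.85 },
  },
  metadata: {
    startTime: "2025-01-27T18:30:15.123Z",
    endTime: "2025-01-27T18:30:18.456Z",
  },
};

console.log(lowConfidence(abridged, 0.9)); // [ 'color_palette', 'people_count' ]
console.log(durationMs(abridged));         // 3333
```

To run this against real output, the object could be loaded with `JSON.parse(fs.readFileSync(path, "utf8"))` from one of the files under `responses/`.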
Performance Comparison Report
Sample Batch Processing Report
markdown
# Batch Image Processing Report: image-feature-batch
**Generated:** 2025-01-27T19:15:00.000Z
**Total Images:** 25
**Total Models:** 2
**Processing Mode:** Concurrent (max 5)
## Summary Statistics
| Model | Provider | Images Processed | Avg Time/Image | Total Cost | Success Rate |
|-------|----------|------------------|----------------|------------|--------------|
| gemini-2.0-flash | google | 25 | 3.2s | $0.001254 | 100% |
| qwen2.5vl:latest | ollama | 25 | 4.8s | Free | 96% |
## Processing Performance
- **Total Processing Time:** 4m 32s
- **Sequential Time Estimate:** 15m 45s
- **Speedup with Concurrency:** 3.5x faster
- **Average Images/Second:** 0.09
- **Peak Memory Usage:** 245 MB
## Image Analysis Results
### Feature Extraction Quality
| Feature | Gemini 2.0 Avg Confidence | Qwen2.5VL Avg Confidence | Notes |
|---------|---------------------------|--------------------------|-------|
| able_to_parse | 0.97 | 0.94 | Excellent across both models |
| image_description | 0.91 | 0.87 | Gemini more detailed |
| contain_text | 0.94 | 0.89 | Strong OCR detection |
| color_palette | 0.86 | 0.83 | Good color analysis |
| people_count | 0.82 | 0.78 | Most challenging feature |
### Error Analysis
- **Processing Errors:** 1/50 total evaluations (2%)
- **Validation Errors:** 0/50 (100% schema compliance)
- **Common Issues:** People counting in crowded scenes
- **Recovery Rate:** 100% (all errors automatically retried)
### File Type Performance
| Format | Count | Success Rate | Avg Processing Time |
|--------|-------|--------------|-------------------|
| JPEG | 18 | 100% | 3.4s |
| PNG | 6 | 100% | 4.1s |
| WebP | 1 | 100% | 3.8s |
## Cost Analysis
- **Google Gemini 2.0 Flash:** $0.001254 total
- **Ollama qwen2.5vl:** Free (local processing)
- **Cost per Image:** $0.000050 (Google only)
- **Cost vs. Quality:** Google provides 15% better accuracy for minimal cost
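The derived figures in a report like this follow from a few inputs: the image count, the wall-clock and sequential times, and the total cost of the paid provider. The sketch below reproduces that arithmetic; `parseDuration`, `summarize`, and the `BatchStats` field names are hypothetical and are not taken from the report generator.

```typescript
// Parse durations written like "4m 32s", "45s", or "3m" into seconds.
function parseDuration(text: string): number {
  const match = /^(?:(\d+)m)?\s*(?:(\d+)s)?$/.exec(text.trim());
  if (!match || (match[1] === undefined && match[2] === undefined)) {
    throw new Error(`Unrecognized duration: ${text}`);
  }
  return Number(match[1] ?? 0) * 60 + Number(match[2] ?? 0);
}

// Hypothetical input shape: the raw numbers a report would start from.
interface BatchStats {
  images: number;
  wallClock: string;          // actual elapsed time with concurrency
  sequentialEstimate: string; // estimated time without concurrency
  paidCostUsd: number;        // total cost of the paid provider
}

function summarize(stats: BatchStats) {
  const wallSeconds = parseDuration(stats.wallClock);
  const sequentialSeconds = parseDuration(stats.sequentialEstimate);
  return {
    speedup: sequentialSeconds / wallSeconds,
    imagesPerSecond: stats.images / wallSeconds,
    costPerImageUsd: stats.paidCostUsd / stats.images,
  };
}

const derived = summarize({
  images: 25,
  wallClock: "4m 32s",
  sequentialEstimate: "15m 45s",
  paidCostUsd: 0.001254,
});

console.log(derived.speedup.toFixed(1));         // "3.5"
console.log(derived.costPerImageUsd.toFixed(6)); // "0.000050"
```

Recomputing these values from the raw numbers is a quick sanity check when comparing reports from runs with different concurrency settings.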
Advanced Patterns
Resume Interrupted Processing
Resume batch processing from where it left off:
bash
umwelten eval batch \
--prompt "Continue batch processing" \
--models "google:gemini-2.0-flash" \
--id "image-feature-batch" \
--directory "input/images" \
--file-pattern "*.jpg" \
--resume \
--concurrent
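How `--resume` decides what to skip is not documented here. One plausible approach, sketched below as an assumption rather than the CLI's actual logic, is to treat an evaluation as done when its response file already exists in the output layout shown earlier. `responsePath` and `findPending` are hypothetical names; the path format and the replacement of `:` with `_` are inferred from the sample directory listing.

```typescript
interface PendingItem {
  image: string;
  model: string;
}

// Build the expected response path for one image and model, following the
// sample layout: output/evaluations/<id>/responses/<image>/<model>.json
// with ":" in the model identifier replaced by "_".
function responsePath(evalId: string, image: string, model: string): string {
  const fileName = model.replace(/:/g, "_") + ".json";
  return `output/evaluations/${evalId}/responses/${image}/${fileName}`;
}

// Return every image/model pair that has no response file yet.
// `exists` is injected so the logic can be tried without touching the disk.
function findPending(
  evalId: string,
  images: readonly string[],
  models: readonly string[],
  exists: (path: string) => boolean,
): PendingItem[] {
  const pending: PendingItem[] = [];
  for (const image of images) {
    for (const model of models) {
      if (!exists(responsePath(evalId, image, model))) {
        pending.push({ image, model });
      }
    }
  }
  return pending;
}

// Simulate a run that was interrupted after one evaluation finished.
const finished = new Set<string>([
  responsePath("image-feature-batch", "image1.jpg", "google:gemini-2.0-flash"),
]);
const remaining = findPending(
  "image-feature-batch",
  ["image1.jpg"],
  ["google:gemini-2.0-flash", "ollama:qwen2.5vl:latest"],
  (path) => finished.has(path),
);
console.log(remaining); // only the ollama evaluation for image1.jpg is left
```

Against a real output directory, `exists` could simply be Node's `fs.existsSync`.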
Different Prompts for Different Models
Run separate batches so that each model gets a prompt suited to its strengths:
bash
# Detailed analysis with expensive model
umwelten eval batch \
--prompt "Provide comprehensive artistic and technical analysis" \
--models "google:gemini-2.5-pro-exp-03-25" \
--id "detailed-analysis" \
--directory "art-collection" \
--file-pattern "*.jpg" \
--file-limit 5
# Quick categorization with fast model
umwelten eval batch \
--prompt "Categorize: portrait/landscape/object/abstract" \
--models "google:gemini-2.0-flash" \
--id "quick-categorization" \
--directory "mixed-images" \
--file-pattern "*.jpg" \
--concurrent \
--max-concurrency 8
Error Handling and Validation
Robust processing with validation:
bash
umwelten eval batch \
--prompt "Extract image features with validation" \
--models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
--id "robust-batch" \
--directory "user-uploads" \
--file-pattern "*.{jpg,png}" \
--zod-schema "./schemas/image-feature-schema.ts" \
--validate-output \
--strict-validation \
--concurrent \
--timeout 30000
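For readers unfamiliar with per-request timeouts, the sketch below shows one common way to enforce them in TypeScript. It assumes the `--timeout` value is in milliseconds and applies to each evaluation separately, neither of which is confirmed by this guide; `withTimeout` is a hypothetical helper, not the CLI's implementation.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
// The underlying operation is not cancelled; only the wait is abandoned.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms} ms`));
    }, ms);
    work.then(
      (value) => {
        clearTimeout(timer); // settled in time: stop the timer
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// A stand-in for a model call that finishes immediately.
const quickCall = Promise.resolve("analysis complete");

withTimeout(quickCall, 30000, "image1.jpg")
  .then((result) => console.log(result))
  .catch((error) => console.error(error));
```

Because the timer is cleared as soon as the work settles, a fast response does not keep the process alive for the full timeout period.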
Tips for Effective Batch Processing
Optimization Strategies
Concurrency Tuning
- Start with 3-5 concurrent processes
- Monitor system resources (CPU, memory, network)
- Increase gradually based on performance
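To make the effect of a concurrency cap concrete, here is a minimal sketch of a bounded worker pool in TypeScript. It illustrates the general technique behind a setting like `--max-concurrency`, not the CLI's own scheduler; `mapWithConcurrency` is a hypothetical name.

```typescript
// Run `worker` over every item with at most `limit` calls in flight.
// Results are returned in the same order as the input items.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: R[] = new Array<R>(items.length);
  let nextIndex = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  const runners: Promise<void>[] = [];
  for (let i = 0; i < runnerCount; i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}

const imageFiles = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"];
mapWithConcurrency(imageFiles, 3, async (file) => `analyzed ${file}`)
  .then((results) => console.log(results));
```

With this design, raising the limit only adds runners, so it is easy to start at 3 and step up while watching for rate-limit errors. Note that the first worker failure rejects the whole call, so a production version would likely catch errors per item.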
File Organization
- Use descriptive directory structures
- Group similar images together
- Use consistent naming conventions
Model Selection
- Google Gemini: Best for detailed analysis and OCR
- Ollama qwen2.5vl: Best for privacy and cost-free processing
- Mix models for cost vs. quality optimization
Common Pitfalls
- Too High Concurrency: Can overwhelm API rate limits
- Large Images: May cause timeouts; consider preprocessing
- Mixed File Types: Different formats may have different processing times
- Schema Validation: Test schemas on single images first
Best Practices
- Test with small batches first (--file-limit 5)
- Use --ui flag for monitoring large batches
- Enable resume capability for long-running jobs
- Validate schemas before large batch runs
- Monitor costs with paid providers
- Use meaningful evaluation IDs for organization
Migration Benefits vs Original Script
Enhanced Performance
- ✅ 3-5x faster with concurrent processing
- ✅ Resume capability for interrupted jobs
- ✅ Better error handling with automatic retries
- ✅ Progress monitoring with interactive UI
Improved User Experience
- ✅ Consistent interface across all batch operations
- ✅ Multiple report formats (MD, HTML, JSON, CSV)
- ✅ Cost transparency with integrated pricing
- ✅ Flexible file patterns and directory scanning
Better Maintainability
- ✅ No custom code required for new use cases
- ✅ Standardized output format and structure
- ✅ Built-in validation and error reporting
- ✅ Easy extension through configuration
Next Steps
- Try structured image features for single image analysis
- Explore cost optimization for budget-conscious batches
- See migration guide for converting other batch scripts