# Examples Overview

This section provides comprehensive examples showing how to use Umwelten for various AI model evaluation tasks. Many of these examples correspond to scripts that have been migrated from the `scripts/` directory to CLI commands.
## Basic Examples

Perfect for getting started with Umwelten:
- Simple Text Generation - Basic prompt evaluation across models
- Creative Writing - Poetry and story generation with temperature control
- Analysis & Reasoning - Complex reasoning tasks and literary analysis
- Tool Integration - Using and creating tools to enhance AI capabilities
## Image Processing Examples

Working with visual content and structured data extraction:
- Basic Image Analysis - Simple image description and analysis
- Structured Image Features - Extract structured data with confidence scores
- Batch Image Processing - Process multiple images concurrently
## Document Processing

Handle various document formats:
- PDF Analysis - Test native PDF parsing capabilities
- Multi-format Documents - Work with different document types
## Advanced Workflows

Complex evaluation patterns and optimization:
- Multi-language Evaluation - Code generation across programming languages
- Complex Structured Output - Advanced schema validation with nested objects
- Cost Optimization - Compare model costs and performance
## Migration Reference

These examples show CLI equivalents for scripts that have been migrated:

| Script | Example | Status |
|---|---|---|
| `cat-poem.ts` | Creative Writing | ✅ Complete |
| `temperature.ts` | Creative Writing | ✅ Complete |
| `frankenstein.ts` | Analysis & Reasoning | ✅ Complete |
| `google-pricing.ts` | Cost Optimization | ✅ Complete |
| `image-parsing.ts` | Basic Image Analysis | ✅ Complete |
| `image-feature-extract.ts` | Structured Image Features | ✅ Complete |
| `image-feature-batch.ts` | Batch Image Processing | ✅ Complete |
| `pdf-identify.ts` | PDF Analysis | ✅ Complete |
| `pdf-parsing.ts` | PDF Analysis | ✅ Complete |
| `roadtrip.ts` | Complex Structured Output | 🔄 Partial |
| `multi-language-evaluation.ts` | Multi-language Evaluation | 🔄 Needs Pipeline |

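Each migrated script maps onto one or more `umwelten eval` calls. As an illustration, a CLI equivalent of `cat-poem.ts` might look like the sketch below; the prompt text and temperature are illustrative assumptions, not taken from the original script (see the Creative Writing example for the actual migration):

```bash
umwelten eval run \
  --prompt "Write a short poem about a cat" \
  --models "ollama:gemma3:12b,google:gemini-2.0-flash" \
  --id "cat-poem" \
  --temperature 0.8
```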
## Quick Examples

Here are some quick examples to get you started:
### Basic Evaluation

```bash
umwelten eval run \
  --prompt "Explain quantum computing in simple terms" \
  --models "ollama:gemma3:12b,google:gemini-2.0-flash" \
  --id "quantum-explanation"
```
### With Structured Output

```bash
umwelten eval run \
  --prompt "Extract person info: John is 25 and works as a developer" \
  --models "google:gemini-2.0-flash" \
  --id "person-extraction" \
  --schema "name, age int, job"
```
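The `--schema` value is a comma-separated list of field names, optionally followed by a type (as in `age int`); untyped fields appear to be treated as strings. A slightly richer sketch following the same pattern (the prompt and field names here are illustrative assumptions):

```bash
umwelten eval run \
  --prompt "Extract book info: Dune by Frank Herbert, published in 1965, 412 pages" \
  --models "google:gemini-2.0-flash" \
  --id "book-extraction" \
  --schema "title, author, year int, pages int"
```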
### Batch Processing

```bash
umwelten eval batch \
  --prompt "Analyze this image and describe key features" \
  --models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
  --id "image-batch" \
  --directory "input/images" \
  --file-pattern "*.jpg" \
  --concurrent
```
### Generate Reports

```bash
# Markdown report
umwelten eval report --id quantum-explanation --format markdown

# HTML report with export
umwelten eval report --id image-batch --format html --output report.html
```
## Common Patterns

### Temperature Testing
Compare model outputs at different creativity levels:
```bash
# High creativity
umwelten eval run --prompt "Write a creative story" --models "ollama:gemma3:12b" --temperature 1.5 --id "creative-high"

# Low creativity
umwelten eval run --prompt "Write a creative story" --models "ollama:gemma3:12b" --temperature 0.2 --id "creative-low"
```
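To compare the two runs side by side, generate a report for each of the IDs created above:

```bash
umwelten eval report --id creative-high --format markdown
umwelten eval report --id creative-low --format markdown
```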
### Cost Comparison
Evaluate cost vs. quality trade-offs:
```bash
umwelten eval run \
  --prompt "Write a detailed analysis of renewable energy trends" \
  --models "google:gemini-2.0-flash,openrouter:openai/gpt-4o-mini,openrouter:openai/gpt-4o" \
  --id "cost-comparison" \
  --concurrent
```
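Once the run finishes, a report is the easiest way to weigh output quality against spend; assuming the report surfaces per-model cost figures (as the Cost Optimization example suggests), this is where the trade-off becomes visible:

```bash
umwelten eval report --id cost-comparison --format markdown
```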
### Multi-modal Evaluation
Test vision capabilities across models:
```bash
umwelten eval run \
  --prompt "Describe this image in detail and identify any text" \
  --models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
  --id "vision-test" \
  --attach "./test-image.jpg"
```
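The same pattern extends to document inputs. A sketch assuming `--attach` also accepts PDF files, as the PDF Analysis example implies:

```bash
umwelten eval run \
  --prompt "Identify this document's type and summarize its key points" \
  --models "google:gemini-2.0-flash" \
  --id "pdf-summary" \
  --attach "./document.pdf"
```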
## Next Steps

- Browse specific examples for your use case
- Check the Migration Guide to see how scripts were converted
- Review Advanced Features for complex workflows