
Examples Overview

This section shows how to use Umwelten for common AI model evaluation tasks. Many of the examples correspond to scripts that were migrated from the scripts/ directory to CLI commands.

Basic Examples

Perfect for getting started with Umwelten:

Image Processing Examples

Working with visual content and structured data extraction:

Document Processing

Handle various document formats:

Advanced Workflows

Complex evaluation patterns and optimization:

Migration Reference

These examples show CLI equivalents for scripts that have been migrated:

| Script | Example | Status |
| --- | --- | --- |
| cat-poem.ts | Creative Writing | ✅ Complete |
| temperature.ts | Creative Writing | ✅ Complete |
| frankenstein.ts | Analysis & Reasoning | ✅ Complete |
| google-pricing.ts | Cost Optimization | ✅ Complete |
| image-parsing.ts | Basic Image Analysis | ✅ Complete |
| image-feature-extract.ts | Structured Image Features | ✅ Complete |
| image-feature-batch.ts | Batch Image Processing | ✅ Complete |
| pdf-identify.ts | PDF Analysis | ✅ Complete |
| pdf-parsing.ts | PDF Analysis | ✅ Complete |
| roadtrip.ts | Complex Structured Output | 🔄 Partial |
| multi-language-evaluation.ts | Multi-language Evaluation | 🔄 Needs Pipeline |

Quick Examples

Here are some quick examples to get you started:

Basic Evaluation

bash
umwelten eval run \
  --prompt "Explain quantum computing in simple terms" \
  --models "ollama:gemma3:12b,google:gemini-2.0-flash" \
  --id "quantum-explanation"

With Structured Output

bash
umwelten eval run \
  --prompt "Extract person info: John is 25 and works as a developer" \
  --models "google:gemini-2.0-flash" \
  --id "person-extraction" \
  --schema "name, age int, job"

Batch Processing

bash
umwelten eval batch \
  --prompt "Analyze this image and describe key features" \
  --models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
  --id "image-batch" \
  --directory "input/images" \
  --file-pattern "*.jpg" \
  --concurrent

Generate Reports

bash
# Markdown report
umwelten eval report --id quantum-explanation --format markdown

# HTML report written to a file
umwelten eval report --id image-batch --format html --output report.html
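
When several evaluations have been run, the report command can be looped over their IDs. The following is a minimal sketch that assumes the two IDs from the examples above exist. The `report_all` function, the `DRY_RUN` switch, and the `<id>.html` file naming are illustrative choices, not Umwelten features. It prints the commands by default so they can be inspected first; set `DRY_RUN=0` to execute them.

```bash
# Sketch: generate one HTML report per evaluation ID.
# report_all and DRY_RUN are hypothetical helpers, not part of the umwelten CLI.
report_all() {
  for id in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Print the command for inspection instead of running it.
      printf '%s\n' "umwelten eval report --id $id --format html --output $id.html"
    else
      umwelten eval report --id "$id" --format html --output "$id.html"
    fi
  done
}

report_all quantum-explanation image-batch
```

Only flags that already appear on this page are used, so the loop adds nothing beyond repetition and file naming.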

Common Patterns

Temperature Testing

Compare model outputs at different creativity levels by varying the sampling temperature:

bash
# High creativity
umwelten eval run \
  --prompt "Write a creative story" \
  --models "ollama:gemma3:12b" \
  --temperature 1.5 \
  --id "creative-high"

# Low creativity
umwelten eval run \
  --prompt "Write a creative story" \
  --models "ollama:gemma3:12b" \
  --temperature 0.2 \
  --id "creative-low"
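
The two commands above can be generalized into a sweep over several temperatures. This is a rough sketch rather than an Umwelten feature: it uses only the flags shown on this page, and the `temperature_sweep` function, the `DRY_RUN` switch, and the `creative-t<value>` ID scheme are assumptions made for illustration. By default it only prints the commands; set `DRY_RUN=0` to run them.

```bash
# Sketch: run one prompt at several temperatures, one evaluation ID per run.
# temperature_sweep, DRY_RUN, and the ID scheme are hypothetical, not Umwelten features.
temperature_sweep() {
  prompt=$1
  model=$2
  shift 2
  for t in "$@"; do
    # Turn 0.2 into creative-t0_2 so the ID contains no dots (an assumed naming choice).
    id="creative-t$(printf '%s' "$t" | tr '.' '_')"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Print the command for inspection instead of running it.
      printf '%s\n' "umwelten eval run --prompt \"$prompt\" --models \"$model\" --temperature $t --id \"$id\""
    else
      umwelten eval run --prompt "$prompt" --models "$model" --temperature "$t" --id "$id"
    fi
  done
}

temperature_sweep "Write a creative story" "ollama:gemma3:12b" 0.2 0.8 1.5
```

Giving each run its own ID keeps the results separate, so each temperature can be reported on individually.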

Cost Comparison

Evaluate cost vs. quality trade-offs:

bash
umwelten eval run \
  --prompt "Write a detailed analysis of renewable energy trends" \
  --models "google:gemini-2.0-flash,openrouter:openai/gpt-4o-mini,openrouter:openai/gpt-4o" \
  --id "cost-comparison" \
  --concurrent
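
To review the responses side by side, the run can be followed by the report command from the Generate Reports section. The sketch below chains the two steps so the report is only rendered if the run succeeds. The `compare_and_report` function and the `DRY_RUN` switch are hypothetical helpers, and what the report contains depends on Umwelten itself. It prints the commands by default; set `DRY_RUN=0` to execute them.

```bash
# Sketch: run the comparison, then render a report only if the run succeeded.
# compare_and_report and DRY_RUN are hypothetical helpers, not Umwelten features.
compare_and_report() {
  id=$1
  models=$2
  prompt=$3
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Print both commands for inspection instead of running them.
    printf '%s\n' "umwelten eval run --prompt \"$prompt\" --models \"$models\" --id \"$id\" --concurrent"
    printf '%s\n' "umwelten eval report --id $id --format markdown"
  else
    umwelten eval run --prompt "$prompt" --models "$models" --id "$id" --concurrent &&
      umwelten eval report --id "$id" --format markdown
  fi
}

compare_and_report "cost-comparison" \
  "google:gemini-2.0-flash,openrouter:openai/gpt-4o-mini,openrouter:openai/gpt-4o" \
  "Write a detailed analysis of renewable energy trends"
```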

Multi-modal Evaluation

Test vision capabilities across models:

bash
umwelten eval run \
  --prompt "Describe this image in detail and identify any text" \
  --models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
  --id "vision-test" \
  --attach "./test-image.jpg"

Next Steps

Released under the MIT License.