Examples Overview

This section shows how to use Umwelten for a range of AI model evaluation tasks. The examples are built on the new infrastructure-first architecture and its stimulus-centric design.

🚀 Quick Start Examples

Run these examples to get started immediately:

bash

# Simple evaluation example
pnpm tsx scripts/examples/simple-evaluation-example.ts

# Matrix evaluation (compare multiple models)
pnpm tsx scripts/examples/matrix-evaluation-example.ts

# Batch evaluation (process multiple inputs)
pnpm tsx scripts/examples/batch-evaluation-example.ts

# Complex pipeline (multi-step evaluation)
pnpm tsx scripts/examples/complex-pipeline-example.ts

# Comprehensive analysis
pnpm tsx scripts/examples/comprehensive-analysis-example.ts

📚 Infrastructure Examples

These examples use the new infrastructure-first approach, which is built from reusable components:

Simple Evaluation - Basic single-model evaluation using stimulus templates
Matrix Evaluation - Multi-model comparison and benchmarking
Batch Evaluation - Batch processing with tool integration
Complex Pipeline - Multi-step evaluations with dependencies
Comprehensive Analysis - Performance and quality analysis

Basic Examples

Perfect for getting started with Umwelten:

Simple Text Generation - Basic prompt evaluation across models
Creative Writing - Poetry and story generation with temperature control
Analysis & Reasoning - Complex reasoning tasks and literary analysis
Tool Integration - Using and creating tools to enhance AI capabilities

Image Processing Examples

Working with visual content and structured data extraction:

Basic Image Analysis - Simple image description and analysis
Structured Image Features - Extract structured data with confidence scores
Batch Image Processing - Process multiple images concurrently

Document Processing

Handle various document formats:

PDF Analysis - Test native PDF parsing capabilities
Multi-format Documents - Work with different document types

Advanced Workflows

Complex evaluation patterns and optimization:

Multi-language Evaluation - Code generation across programming languages
Complex Structured Output - Advanced schema validation with nested objects
Cost Optimization - Compare model costs and performance

Migration Reference

These examples show CLI equivalents for scripts that have been migrated:

Script	Example	Status
`cat-poem.ts`	Creative Writing	✅ Complete
`temperature.ts`	Creative Writing	✅ Complete
`frankenstein.ts`	Analysis & Reasoning	✅ Complete
`google-pricing.ts`	Cost Optimization	✅ Complete
`image-parsing.ts`	Basic Image Analysis	✅ Complete
`image-feature-extract.ts`	Structured Image Features	✅ Complete
`image-feature-batch.ts`	Batch Image Processing	✅ Complete
`pdf-identify.ts`	PDF Analysis	✅ Complete
`pdf-parsing.ts`	PDF Analysis	✅ Complete
`roadtrip.ts`	Complex Structured Output	🔄 Partial
`multi-language-evaluation.ts`	Multi-language Evaluation	🔄 Needs Pipeline

Quick Examples

🆕 New Pattern Examples

bash

# Interactive chat with tools
pnpm tsx src/cli/cli.ts chat-new -p ollama -m llama3.2:latest

# Tools demonstration
pnpm tsx scripts/tools.ts -p ollama -m llama3.2:latest --prompt "What's the weather in New York?"

# Programmatic usage
pnpm tsx scripts/new-pattern-example.ts

Traditional CLI Examples

Here are some quick examples to get you started:

Basic Evaluation

bash

umwelten eval run \
  --prompt "Explain quantum computing in simple terms" \
  --models "ollama:gemma3:12b,google:gemini-2.0-flash" \
  --id "quantum-explanation"
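
The `--models` value above is a comma-separated list of `provider:model` pairs, and a model name can itself contain a colon (`gemma3:12b`). A minimal TypeScript sketch of how such a string could be split, assuming the first colon separates the provider from the model name. `parseModelSpec` and `parseModelList` are hypothetical helpers for illustration, not part of Umwelten's API:

```typescript
// Hypothetical helpers for illustration; not Umwelten's actual parser.
interface ModelSpec {
  provider: string;
  model: string;
}

// Split on the FIRST colon only, so "ollama:gemma3:12b" keeps "gemma3:12b" intact.
function parseModelSpec(spec: string): ModelSpec {
  const idx = spec.indexOf(":");
  if (idx <= 0 || idx === spec.length - 1) {
    throw new Error(`Expected "provider:model", got "${spec}"`);
  }
  return { provider: spec.slice(0, idx), model: spec.slice(idx + 1) };
}

// Parse a comma-separated list such as the --models value.
function parseModelList(list: string): ModelSpec[] {
  return list
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map(parseModelSpec);
}

const specs = parseModelList("ollama:gemma3:12b,google:gemini-2.0-flash");
console.log(specs);
```

Splitting on the first colon rather than on every colon is the design choice that matters here, since Ollama-style tags carry their own colon.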

With Structured Output

bash

umwelten eval run \
  --prompt "Extract person info: John is 25 and works as a developer" \
  --models "google:gemini-2.0-flash" \
  --id "person-extraction" \
  --schema "name, age int, job"
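
The `--schema` value above reads as a compact field list. The sketch below shows one way to interpret it, assuming fields are comma-separated, an optional type follows the field name, and untyped fields default to `string`. These parsing rules and the `parseSimpleSchema` and `conformsTo` helpers are assumptions for illustration, not Umwelten's actual schema parser:

```typescript
// Hypothetical helpers; the parsing rules are assumptions, not Umwelten's implementation.
interface SchemaField {
  name: string;
  type: string; // e.g. "string" or "int"
}

// Parse a field list such as "name, age int, job".
function parseSimpleSchema(dsl: string): SchemaField[] {
  return dsl
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const tokens = part.split(/\s+/);
      // Assumption: a field without an explicit type is a string.
      return { name: tokens[0] ?? "", type: tokens[1] ?? "string" };
    });
}

// Check that a candidate object carries every field with a compatible value.
function conformsTo(fields: SchemaField[], value: Record<string, unknown>): boolean {
  return fields.every((field) => {
    const v = value[field.name];
    if (field.type === "int") return typeof v === "number" && Number.isInteger(v);
    if (field.type === "string") return typeof v === "string";
    return v !== undefined; // unknown types: only require presence
  });
}

const fields = parseSimpleSchema("name, age int, job");
console.log(fields);
console.log(conformsTo(fields, { name: "John", age: 25, job: "developer" }));
```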

Batch Processing

bash

umwelten eval batch \
  --prompt "Analyze this image and describe key features" \
  --models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
  --id "image-batch" \
  --directory "input/images" \
  --file-pattern "*.jpg" \
  --concurrent
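
To make the batch flags concrete: `--file-pattern` selects files inside `--directory`, and the prompt is then run for the selected files and the listed models. The sketch below assumes a simple glob in which `*` matches any run of characters, and assumes every matched file is paired with every model. `globToRegExp` and `planBatch` are hypothetical names and the file names are made up; none of this is Umwelten's implementation:

```typescript
// Hypothetical sketch of batch planning; not Umwelten's implementation.
interface BatchTask {
  file: string;
  model: string;
}

// Convert a simple glob ("*" and "?" only) into an anchored regular expression.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(`^${source}$`);
}

// Assumption: each matched file is evaluated once per model.
function planBatch(files: string[], pattern: string, models: string[]): BatchTask[] {
  const matcher = globToRegExp(pattern);
  const matched = files.filter((file) => matcher.test(file));
  return matched.flatMap((file) => models.map((model) => ({ file, model })));
}

const tasks = planBatch(
  ["cat.jpg", "dog.jpg", "notes.txt"], // illustrative directory listing
  "*.jpg",
  ["google:gemini-2.0-flash", "ollama:qwen2.5vl:latest"],
);
console.log(tasks.length); // 4 (two matching files, two models)
```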

Generate Reports

bash

# Markdown report
umwelten eval report --id quantum-explanation --format markdown

# HTML report with export
umwelten eval report --id image-batch --format html --output report.html

Common Patterns

Temperature Testing

Compare model outputs at different sampling temperatures (higher values produce more varied, less predictable output):

bash

# High creativity
umwelten eval run --prompt "Write a creative story" --models "ollama:gemma3:12b" --temperature 1.5 --id "creative-high"

# Low creativity
umwelten eval run --prompt "Write a creative story" --models "ollama:gemma3:12b" --temperature 0.2 --id "creative-low"
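
The two commands differ only in `--temperature` and `--id`, so a sweep over more values can be generated rather than typed by hand. The following sketch builds the command lines using only the flags shown above. `buildTemperatureSweep` and the `creative-t<value>` ID scheme are assumptions for illustration, and the quoting via `JSON.stringify` is only adequate for simple prompts, not a general shell-escaping routine:

```typescript
// Hypothetical helper that only builds command strings; it does not invoke Umwelten.
function buildTemperatureSweep(
  prompt: string,
  model: string,
  temperatures: number[],
): string[] {
  return temperatures.map((t) => {
    // Assumed ID scheme: 0.2 becomes "creative-t0-2".
    const id = `creative-t${String(t).replace(".", "-")}`;
    return [
      "umwelten eval run",
      `--prompt ${JSON.stringify(prompt)}`,
      `--models ${JSON.stringify(model)}`,
      `--temperature ${t}`,
      `--id ${JSON.stringify(id)}`,
    ].join(" ");
  });
}

const sweep = buildTemperatureSweep("Write a creative story", "ollama:gemma3:12b", [0.2, 0.7, 1.5]);
for (const command of sweep) {
  console.log(command);
}
```

Giving each run its own `--id` keeps the results separate, so the outputs can be compared per temperature afterwards.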

Cost Comparison

Evaluate cost vs. quality trade-offs:

bash

umwelten eval run \
  --prompt "Write a detailed analysis of renewable energy trends" \
  --models "google:gemini-2.0-flash,openrouter:openai/gpt-4o-mini,openrouter:openai/gpt-4o" \
  --id "cost-comparison" \
  --concurrent
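
Comparing cost means turning each run's token counts into a dollar figure. For providers that bill per token, a common calculation is input tokens times the input price plus output tokens times the output price. The sketch below assumes prices are quoted per million tokens. The `Usage` and `Pricing` shapes, the `estimateCost` helper, the model names, and all numbers are placeholders, not real provider prices or Umwelten output:

```typescript
// Hypothetical shapes and placeholder numbers; not real prices or Umwelten data.
interface Usage {
  promptTokens: number;
  completionTokens: number;
}

interface Pricing {
  inputPerMillion: number; // USD per 1M input tokens (placeholder)
  outputPerMillion: number; // USD per 1M output tokens (placeholder)
}

function estimateCost(usage: Usage, pricing: Pricing): number {
  return (
    (usage.promptTokens / 1_000_000) * pricing.inputPerMillion +
    (usage.completionTokens / 1_000_000) * pricing.outputPerMillion
  );
}

const placeholderRuns = [
  { model: "model-a", usage: { promptTokens: 1000, completionTokens: 500 }, pricing: { inputPerMillion: 1, outputPerMillion: 2 } },
  { model: "model-b", usage: { promptTokens: 1000, completionTokens: 800 }, pricing: { inputPerMillion: 5, outputPerMillion: 10 } },
];

// Rank runs from cheapest to most expensive.
const ranked = placeholderRuns
  .map((run) => ({ model: run.model, cost: estimateCost(run.usage, run.pricing) }))
  .sort((a, b) => a.cost - b.cost);

for (const run of ranked) {
  console.log(`${run.model}: $${run.cost.toFixed(6)}`);
}
```

Cost alone does not settle the trade-off; the ranking is meant to be read next to the generated responses for the same `--id`.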

Multi-modal Evaluation

Test vision capabilities across models:

bash

umwelten eval run \
  --prompt "Describe this image in detail and identify any text" \
  --models "google:gemini-2.0-flash,ollama:qwen2.5vl:latest" \
  --id "vision-test" \
  --attach "./test-image.jpg"

Next Steps

Browse specific examples for your use case
Check the Migration Guide to see how scripts were converted
Review Advanced Features for complex workflows
