Cognition Module
The cognition module provides the core infrastructure for AI model execution, including runners, type definitions, and streaming capabilities. It implements the semantic concept of "Cognition": the reasoning and thinking processes of AI models.
Overview
The cognition module consists of several key components:
- BaseModelRunner: Core model execution engine
- SmartModelRunner: Hookable runner with cognitive enhancements
- Type Definitions: Interfaces and schemas for model interactions
- Streaming Support: Real-time object and text streaming
Core Classes
BaseModelRunner
The primary class for executing model interactions and generating responses.
Import
import { BaseModelRunner } from '../src/cognition/runner.js';
Constructor
interface ModelRunnerConfig {
  rateLimitConfig?: RateLimitConfig;
  maxRetries?: number;
  maxTokens?: number;
}
const runner = new BaseModelRunner({
  maxRetries: 3,
  maxTokens: 4096,
  rateLimitConfig: { /* rate limiting options */ }
});
Methods
generateText(interaction: Interaction): Promise<ModelResponse>
Generate a text response from the model.
const runner = new BaseModelRunner();
const interaction = new Interaction(modelDetails, "You are a helpful assistant");
interaction.addMessage({
  role: 'user',
  content: 'Explain quantum computing'
});
const response = await runner.generateText(interaction);
console.log(response.content); // Generated text
console.log(response.metadata.tokenUsage); // Token usage stats
console.log(response.metadata.cost); // Cost information
generateObject<T>(interaction: Interaction, schema: z.ZodSchema<T>): Promise<ModelResponse>
Generate structured output validated against a Zod schema.
import { z } from 'zod';
const TaskSchema = z.object({
  title: z.string(),
  priority: z.enum(['low', 'medium', 'high']),
  due_date: z.string().optional(),
  completed: z.boolean().default(false)
});
const runner = new BaseModelRunner();
const interaction = new Interaction(modelDetails, "You are a task manager");
interaction.setOutputFormat(TaskSchema);
interaction.addMessage({
  role: 'user',
  content: 'Create a task for reviewing the quarterly report'
});
const response = await runner.generateObject(interaction, TaskSchema);
const task = JSON.parse(response.content); // Already validated against TaskSchema during generation
streamObject<T>(interaction: Interaction, schema: z.ZodSchema<T>): Promise<ModelResponse>
Generate structured output with real-time streaming, validated against a Zod schema.
import { z } from 'zod';
const RecipeSchema = z.object({
  name: z.string(),
  ingredients: z.array(z.string()),
  instructions: z.array(z.string()),
  prep_time: z.number(),
  cook_time: z.number()
});
const runner = new BaseModelRunner();
const interaction = new Interaction(modelDetails, "You are a recipe generator");
interaction.setOutputFormat(RecipeSchema);
interaction.addMessage({
  role: 'user',
  content: 'Generate a recipe for chocolate chip cookies'
});
const response = await runner.streamObject(interaction, RecipeSchema);
const recipe = JSON.parse(response.content); // Built from partial object stream
Important: This method uses partialObjectStream internally to avoid hanging issues. The implementation iterates over partial objects and merges them to build the final result.
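The pattern is roughly the following (a minimal sketch assuming the Vercel AI SDK, which the partialObjectStream name suggests; the provider setup and schema here are illustrative):
import { google } from '@ai-sdk/google';
import { streamObject } from 'ai';
import { z } from 'zod';

const schema = z.object({ name: z.string() });

const result = streamObject({
  model: google('gemini-2.0-flash'),
  schema,
  prompt: 'Generate a recipe name'
}); // in older SDK versions: await streamObject(...)

// Each partial is a progressively more complete version of the final object,
// so keeping the most recent one effectively merges the stream.
let latest: unknown;
for await (const partial of result.partialObjectStream) {
  latest = partial;
}
const content = JSON.stringify(latest);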
Performance:
- Google Gemini: ~600ms for streamObject
- Ollama (gemma3:12b): ~500ms for streamObject
- Real-time streaming: Works without hanging or timeout issues
streamText(interaction: Interaction): Promise<ModelResponse>
Stream text responses for real-time output.
const runner = new BaseModelRunner();
const interaction = new Interaction(modelDetails, "You are a storyteller");
interaction.addMessage({
  role: 'user',
  content: 'Tell me a story about a brave knight'
});
const response = await runner.streamText(interaction);
// The promise resolves once streaming completes; response.content then holds the full text
console.log(response.content);
SmartModelRunner
A hookable runner that supports before, during, and after execution hooks for cognitive enhancements.
Import
import { SmartModelRunner } from '../src/cognition/smart_runner.js';
Constructor
interface SmartModelRunnerConfig {
  beforeHooks?: RunnerHook[];
  duringHooks?: RunnerHook[];
  afterHooks?: RunnerHook[];
  baseRunner: ModelRunner;
}
const smartRunner = new SmartModelRunner({
  baseRunner: new BaseModelRunner(),
  beforeHooks: [
    async (interaction) => {
      // Pre-processing logic
      console.log('Before execution:', interaction.prompt);
    }
  ],
  afterHooks: [
    async (interaction) => {
      // Post-processing logic
      console.log('After execution completed');
    }
  ]
});
Hook Types
RunnerHook
type RunnerHook = (interaction: Interaction) => Promise<void | RunnerAbort | RunnerModification>;
A hook that resolves to void lets execution continue; returning a RunnerAbort stops the run, and returning a RunnerModification rewrites the interaction before execution proceeds.
RunnerAbort
export class RunnerAbort {
  constructor(public reason: string) {}
}
// Usage in hook
async function validationHook(interaction: Interaction) {
  if (!interaction.prompt.trim()) {
    return new RunnerAbort('Empty prompt provided');
  }
}
RunnerModification
export class RunnerModification {
  constructor(public modify: (interaction: Interaction) => Interaction) {}
}
// Usage in hook
async function enhancementHook(interaction: Interaction) {
  return new RunnerModification((interaction) => {
    // Modify the interaction
    interaction.addMessage({
      role: 'system',
      content: 'Additional context added by hook'
    });
    return interaction;
  });
}
Type Definitions
ModelDetails
Core interface for model configuration:
interface ModelDetails {
  name: string;                       // Model identifier
  provider: string;                   // Provider name (google, openrouter, ollama, etc.)
  temperature?: number;               // Sampling temperature (0-2)
  topP?: number;                      // Top-p sampling parameter
  topK?: number;                      // Top-k sampling parameter
  numCtx?: number;                    // Context window size
  contextLength?: number;             // Maximum context length
  costs?: {                           // Cost per million tokens
    promptTokens: number;
    completionTokens: number;
  };
  description?: string;               // Model description
  addedDate?: Date;                   // When the model was added
  lastUpdated?: Date;                 // Last update timestamp
  details?: Record<string, unknown>;  // Additional metadata
  originalProvider?: string;          // Original provider for routed models
}
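For example, a populated entry might look like this (the cost figures are illustrative, not real pricing):
const modelDetails: ModelDetails = {
  name: 'gemini-2.0-flash',
  provider: 'google',
  temperature: 0.7,
  contextLength: 1_000_000,
  costs: { promptTokens: 0.1, completionTokens: 0.4 }, // $ per million tokens
  description: 'Fast multimodal Gemini model'
};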
ModelOptions
Configuration options for model execution:
interface ModelOptions {
  temperature?: number; // Sampling temperature (0-2)
  maxTokens?: number;   // Maximum tokens to generate
  stop?: string[];      // Stop sequences
}
ModelResponse
Standard response format for all model interactions:
interface ModelResponse {
  content: string;            // Generated content
  metadata: ResponseMetadata; // Execution metadata
  reasoning?: string;         // Reasoning chain (if enabled)
  reasoningDetails?: Array<{  // Detailed reasoning
    type: 'text' | 'redacted';
    text?: string;
    data?: string;
    signature?: string;
  }>;
}
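When a model emits a reasoning chain, the optional fields can be inspected directly:
const response = await runner.generateText(interaction);
if (response.reasoning) {
  console.log('Reasoning:', response.reasoning);
}
for (const detail of response.reasoningDetails ?? []) {
  // Redacted entries carry data/signature instead of text
  if (detail.type === 'text') console.log(detail.text);
}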
ResponseMetadata
Metadata about the model execution:
interface ResponseMetadata {
  startTime: Date;        // Execution start time
  endTime: Date;          // Execution end time
  tokenUsage: TokenUsage; // Token consumption
  provider: string;       // Model provider
  model: string;          // Model name
  cost: CostBreakdown;    // Cost information
}
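TokenUsage and CostBreakdown are defined elsewhere in the module. A plausible shape, inferred from how they are used in this document and not confirmed against the source, is:
// Hypothetical shapes, inferred from usage in this document
interface TokenUsage {
  promptTokens: number;     // Tokens consumed by the prompt
  completionTokens: number; // Tokens generated in the response
  total: number;            // Sum of the two
}

interface CostBreakdown {
  promptCost: number;     // Cost attributed to prompt tokens
  completionCost: number; // Cost attributed to completion tokens
  totalCost: number;      // Total cost in USD
}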
Streaming Patterns
The cognition module provides multiple methods for different streaming needs:
1. For Immediate Results
// Use generateObject for immediate structured results
const result = await runner.generateObject(interaction, schema);
const data = JSON.parse(result.content);
// data is immediately available
2. For Real-Time Streaming
// Use streamObject for real-time partial updates
const result = await runner.streamObject(interaction, schema);
const data = JSON.parse(result.content);
// data is built from partial object stream
3. For Flexible JSON
// Use generateText + JSON parsing for dynamic schemas
const result = await runner.generateText(interaction);
const jsonMatch = result.content.match(/\{.*\}/s);
if (!jsonMatch) throw new Error('No JSON object found in the response');
const data = JSON.parse(jsonMatch[0]);
4. For Text Streaming
// Use streamText for output that streams as it is generated;
// the runner consumes the chunks and resolves with the complete text
const result = await runner.streamText(interaction);
console.log(result.content);
Error Handling
The cognition module provides comprehensive error handling:
Rate Limiting
try {
  const response = await runner.generateText(interaction);
} catch (error) {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('rate limit')) {
    console.error('Rate limit exceeded, retrying...');
    // Implement retry logic (see the sketch below)
  }
}
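A simple backoff wrapper can serve as that retry logic (an illustrative sketch; withRetry is a hypothetical helper, and the maxRetries config option suggests the runner already retries some failures internally):
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      // Only retry rate-limit errors, and give up on the last attempt
      if (!message.includes('rate limit') || attempt === attempts - 1) throw error;
      await new Promise((resolve) => setTimeout(resolve, 2 ** attempt * 1000)); // 1s, 2s, 4s
    }
  }
  throw new Error('unreachable');
}

const response = await withRetry(() => runner.generateText(interaction));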
Model Validation
try {
  const response = await runner.generateObject(interaction, schema);
} catch (error) {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('Invalid model details')) {
    console.error('Model configuration error:', message);
  }
}
Schema Validation
try {
  const response = await runner.generateObject(interaction, schema);
} catch (error) {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('schema validation')) {
    console.error('Output validation failed:', message);
  }
}
Best Practices
Model Selection
- Google Gemini: Excellent for structured output and streaming
- OpenRouter GPT-4: Best for complex reasoning tasks
- Ollama Models: Good for local development and testing
- Multiple Models: Use for validation and comparison
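For example, comparing providers is just a loop over ModelDetails values using the APIs above (the model names are examples):
const runner = new BaseModelRunner();
const candidates: ModelDetails[] = [
  { name: 'gemini-2.0-flash', provider: 'google' },
  { name: 'gemma3:12b', provider: 'ollama' }
];

for (const modelDetails of candidates) {
  const interaction = new Interaction(modelDetails, 'You are a helpful assistant');
  interaction.addMessage({ role: 'user', content: 'Summarize the key risks of this plan' });
  const response = await runner.generateText(interaction);
  console.log(`${modelDetails.provider}/${modelDetails.name}:`, response.metadata.cost);
}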
Performance Optimization
- Use appropriate timeouts for different model types (see the sketch after this list)
- Implement retry logic for transient failures (see the retry sketch under Error Handling)
- Monitor token usage and costs
- Use streaming for interactive applications
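For timeouts, a small wrapper around Promise.race is often enough (withTimeout is a hypothetical helper, not part of the module):
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Local models (e.g. Ollama) may need a longer budget than hosted APIs
const response = await withTimeout(runner.generateText(interaction), 30_000);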
Error Prevention
- Validate model details before execution
- Test schemas with simple examples first
- Handle rate limits gracefully
- Monitor for model-specific errors
Examples
Basic Text Generation
import { BaseModelRunner } from '../src/cognition/runner.js';
import { Interaction } from '../src/interaction/interaction.js';
const runner = new BaseModelRunner();
const modelDetails = {
  name: 'gemini-2.0-flash',
  provider: 'google',
  temperature: 0.7
};
const interaction = new Interaction(modelDetails, 'You are a helpful assistant');
interaction.addMessage({
  role: 'user',
  content: 'Explain machine learning in simple terms'
});
const response = await runner.generateText(interaction);
console.log(response.content);
Structured Output with Streaming
import { BaseModelRunner } from '../src/cognition/runner.js';
import { Interaction } from '../src/interaction/interaction.js';
import { z } from 'zod';
const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  category: z.string(),
  features: z.array(z.string())
});
const runner = new BaseModelRunner();
const modelDetails = {
  name: 'gemini-2.0-flash',
  provider: 'google',
  temperature: 0.1
};
const interaction = new Interaction(modelDetails, 'You are a product analyzer');
interaction.setOutputFormat(ProductSchema);
interaction.addMessage({
  role: 'user',
  content: 'Analyze this product: iPhone 15 Pro'
});
const response = await runner.streamObject(interaction, ProductSchema);
const product = JSON.parse(response.content);
console.log('Product analysis:', product);
Hookable Runner with Validation
import { SmartModelRunner, RunnerAbort } from '../src/cognition/smart_runner.js';
import { BaseModelRunner } from '../src/cognition/runner.js';
import { Interaction } from '../src/interaction/interaction.js';
const validationHook = async (interaction: Interaction) => {
  if (!interaction.prompt.trim()) {
    return new RunnerAbort('Empty prompt provided');
  }
  if (interaction.prompt.length > 1000) {
    return new RunnerAbort('Prompt too long');
  }
};
const smartRunner = new SmartModelRunner({
  baseRunner: new BaseModelRunner(),
  beforeHooks: [validationHook]
});
// An interaction with an empty prompt will be aborted by the validation hook
const modelDetails = { name: 'gemini-2.0-flash', provider: 'google' };
const interaction = new Interaction(modelDetails, '');
try {
  await smartRunner.generateText(interaction);
} catch (error) {
  console.error('Execution aborted:', error instanceof Error ? error.message : error);
}
Integration with Other Modules
Interaction Module
The cognition module works closely with the Interaction module for managing conversations and context.
Providers Module
Model execution relies on the providers module for model instantiation and validation.
Memory Module
The SmartModelRunner can integrate with the memory module for persistent context and knowledge retrieval.
Evaluation Module
The cognition module provides the foundation for systematic model evaluation and comparison.