Configuration Guide

ModelSEEDagent provides flexible configuration options to customize behavior, performance, and integration with external services.

Configuration Methods

1. Environment Variables (.env file)

Create a .env file in the project root:

# Core LLM Configuration
OPENAI_API_KEY=your_openai_key_here

# Argo Gateway Configuration (Recommended)
ARGO_GATEWAY_URL=https://your-argo-gateway.com
ARGO_API_KEY=your_argo_key_here

# Debug Configuration
MODELSEED_DEBUG_LEVEL=INFO
MODELSEED_DEBUG_COBRAKBASE=false
MODELSEED_DEBUG_LANGGRAPH=false
MODELSEED_DEBUG_HTTP=false
MODELSEED_DEBUG_TOOLS=true
MODELSEED_DEBUG_LLM=false
MODELSEED_LOG_LLM_INPUTS=false

# Console Output Capture (Phase 1 CLI Debug Capture)
MODELSEED_CAPTURE_CONSOLE_DEBUG=false      # Capture console debug output
MODELSEED_CAPTURE_AI_REASONING_FLOW=false  # Capture AI reasoning steps
MODELSEED_CAPTURE_FORMATTED_RESULTS=false  # Capture final formatted results

# Directory Configuration
MODELSEED_DATA_DIR=/path/to/data
MODELSEED_LOG_DIR=/path/to/logs
MODELSEED_SESSION_DIR=/path/to/sessions

# Performance Configuration
MODELSEED_CACHE_ENABLED=true
MODELSEED_PARALLEL_TOOLS=true
MODELSEED_MAX_WORKERS=4

2. Command Line Arguments

# Override LLM provider
modelseed-agent --llm argo analyze

# Set debug level
modelseed-agent --debug-level DEBUG analyze

# Custom data directory
modelseed-agent --data-dir /custom/path analyze

3. Configuration Files

Create config/config.yaml:

llm:
  default_provider: argo
  temperature: 0.1
  max_tokens: 4000
  timeout: 30

agents:
  metabolic:
    max_iterations: 10
    reasoning_depth: 3
  langgraph:
    visualization: true
    save_graphs: true

tools:
  cobra:
    default_solver: glpk
    tolerance: 1e-9
    timeout: 300
  modelseed:
    template_version: v5
    gapfill_mode: comprehensive

performance:
  cache_ttl: 3600
  max_memory_gb: 8
  parallel_execution: true

LLM Provider Configuration

OpenAI (Experimental)

# Required
OPENAI_API_KEY=your_key_here

# Optional
OPENAI_MODEL=gpt-4
OPENAI_MAX_TOKENS=4000
OPENAI_TEMPERATURE=0.1
OPENAI_ORGANIZATION=your_org_id

Argo Gateway

# Required
ARGO_GATEWAY_URL=https://gateway.argos.anl.gov
ARGO_API_KEY=your_key_here

# Optional
ARGO_MODEL=claude-3-sonnet
ARGO_TIMEOUT=60
ARGO_MAX_RETRIES=3

Local LLM

# Local model configuration
LOCAL_LLM_ENABLED=true
LOCAL_LLM_MODEL_PATH=/path/to/model
LOCAL_LLM_DEVICE=cuda  # or cpu
LOCAL_LLM_MAX_LENGTH=2048

Debug Configuration

Granular Debug Control

# Overall debug level (TRACE, DEBUG, INFO, WARNING, ERROR)
MODELSEED_DEBUG_LEVEL=INFO

# Component-specific debugging
MODELSEED_DEBUG_COBRAKBASE=false    # COBRApy/cobrakbase messages
MODELSEED_DEBUG_LANGGRAPH=false     # LangGraph workflow messages
MODELSEED_DEBUG_HTTP=false          # HTTP/SSL debug from httpx
MODELSEED_DEBUG_TOOLS=true          # Tool execution details
MODELSEED_DEBUG_LLM=false           # LLM interaction details

# Special logging
MODELSEED_LOG_LLM_INPUTS=false      # Log LLM prompts and responses

Debug Profiles

Developer Profile

MODELSEED_DEBUG_LEVEL=DEBUG
MODELSEED_DEBUG_TOOLS=true
MODELSEED_DEBUG_LLM=true
MODELSEED_LOG_LLM_INPUTS=true

Production Profile

MODELSEED_DEBUG_LEVEL=WARNING
MODELSEED_DEBUG_COBRAKBASE=false
MODELSEED_DEBUG_LANGGRAPH=false
MODELSEED_DEBUG_HTTP=false
MODELSEED_DEBUG_TOOLS=false
MODELSEED_DEBUG_LLM=false

Silent Profile

MODELSEED_DEBUG_LEVEL=ERROR
# All other debug flags default to false

Checking Debug Configuration

# View current debug settings
modelseed-agent debug

# Test debug levels
modelseed-agent --debug-level DEBUG debug

Performance Configuration

Caching

# Enable/disable caching
MODELSEED_CACHE_ENABLED=true

# Cache settings
MODELSEED_CACHE_TTL=3600           # Cache TTL in seconds
MODELSEED_CACHE_DIR=/path/to/cache # Custom cache directory
MODELSEED_CACHE_MAX_SIZE=1000      # Max cache entries

Parallel Execution

# Enable parallel tool execution
MODELSEED_PARALLEL_TOOLS=true

# Control number of workers
MODELSEED_MAX_WORKERS=4

# Tool-specific parallelization
COBRA_PARALLEL_FBA=true
COBRA_MAX_PARALLEL_JOBS=2

Memory Management

# Memory limits
MODELSEED_MAX_MEMORY_GB=8
MODELSEED_MEMORY_WARNING_THRESHOLD=0.8

# Cleanup settings
MODELSEED_AUTO_CLEANUP=true
MODELSEED_TEMP_DIR_CLEANUP=true

Directory Configuration

Default Directories

# Data directory (models, examples, databases)
MODELSEED_DATA_DIR=./data

# Log directory
MODELSEED_LOG_DIR=./logs

# Session directory
MODELSEED_SESSION_DIR=./sessions

# Cache directory
MODELSEED_CACHE_DIR=./cache

# Temporary directory
MODELSEED_TEMP_DIR=/tmp/modelseed

Custom Directories

# Example: Network storage setup
MODELSEED_DATA_DIR=/shared/modelseed/data
MODELSEED_LOG_DIR=/shared/modelseed/logs
MODELSEED_SESSION_DIR=/local/sessions
MODELSEED_CACHE_DIR=/local/cache

Tool-Specific Configuration

COBRApy Tools

# Solver configuration
COBRA_DEFAULT_SOLVER=glpk           # glpk, cplex, gurobi
COBRA_SOLVER_TIMEOUT=300            # seconds
COBRA_SOLVER_TOLERANCE=1e-9

# FBA configuration
COBRA_FBA_THREADS=1
COBRA_FBA_PRESOLVE=true

# Precision configuration
COBRA_FLUX_THRESHOLD=1e-6
COBRA_GROWTH_THRESHOLD=1e-3

ModelSEED Tools

# Template configuration
MODELSEED_TEMPLATE_VERSION=v5
MODELSEED_TEMPLATE_PATH=/path/to/templates

# Gapfilling configuration
MODELSEED_GAPFILL_MODE=comprehensive  # fast, comprehensive
MODELSEED_GAPFILL_TIMEOUT=1800        # seconds
MODELSEED_MAX_GAPFILL_REACTIONS=50

# Annotation configuration
RAST_SERVER_URL=https://rast.nmpdr.org
RAST_TIMEOUT=3600

Advanced Configuration

Custom Configuration Classes

# config/custom_settings.py
from src.config.settings import Settings

class CustomSettings(Settings):
    def __init__(self):
        super().__init__()
        self.custom_parameter = "value"

    def validate_custom(self):
        # Custom validation logic
        pass

Runtime Configuration

from src.config.settings import get_settings

# Get current settings
settings = get_settings()

# Override at runtime
settings.llm.temperature = 0.2
settings.debug.level = "DEBUG"

Configuration Validation

Validate Configuration

# Check configuration validity
modelseed-agent validate-config

# Verbose validation
modelseed-agent validate-config --verbose

# Check specific components
modelseed-agent validate-config --llm --tools

Configuration Testing

# Test configuration in Python
from src.config.settings import validate_configuration

# Validate current configuration
is_valid, errors = validate_configuration()

if not is_valid:
    for error in errors:
        print(f"Configuration error: {error}")

Security Considerations

API Key Security

# Use environment variables, not config files
export OPENAI_API_KEY="sk-..."

# Use key management services
OPENAI_API_KEY=$(aws secretsmanager get-secret-value --secret-id openai-key --query SecretString --output text)

# Rotate keys regularly
# Monitor key usage
# Use restricted permissions

Network Security

# Proxy configuration
HTTP_PROXY=http://proxy.company.com:8080
HTTPS_PROXY=https://proxy.company.com:8080
NO_PROXY=localhost,127.0.0.1,.company.com

# SSL verification
SSL_VERIFY=true
SSL_CERT_PATH=/path/to/certificates

# Timeout configuration
REQUEST_TIMEOUT=30
CONNECTION_TIMEOUT=10

Troubleshooting Configuration

Common Issues

API Key Issues

# Test API key validity
modelseed-agent test-llm-connection

# Check environment variables
env | grep -E "(OPENAI|ARGO)"

Path Issues

# Check directory permissions
ls -la $MODELSEED_DATA_DIR

# Create missing directories
mkdir -p $MODELSEED_LOG_DIR $MODELSEED_CACHE_DIR

Configuration Conflicts

# Show effective configuration
modelseed-agent show-config

# Show configuration sources
modelseed-agent show-config --sources

Next Steps

Interactive Guide: Learn how to use ModelSEEDagent
API Documentation: Explore programmatic usage
Troubleshooting: Solve common issues
Development: Contribute to the project

Connection Pooling Configuration

ModelSEEDagent automatically manages HTTP connection pooling for optimal performance.

LLM Connection Pooling

Automatic Configuration: - HTTP clients are pooled per configuration key - Connections are reused across tool executions - Timeout and connection limits are automatically managed

Benefits: - Eliminates redundant connection setup overhead - Reduces memory usage for LLM communications - Improves overall session performance

Monitoring: Connection pool statistics are logged at DEBUG level:

LLM Connection Pool initialized
Created new HTTP client for config: dev_120.0
Reusing existing LLM instance: argo|gpto1|prod|jplfaria|30.0

Configuration is automatic and requires no user intervention.

Last updated: 3003b76c - Connection pooling implementation detected

COBRA Multiprocessing Configuration

COBRA tools support both single-process and multiprocess execution modes.

Default Behavior

Single Process Mode (Default): - All COBRA tools default to processes=1 - Prevents connection pool fragmentation - Recommended for most use cases

Multiprocessing Control

Global Environment Variables:

# Disable multiprocessing for all COBRA tools
export COBRA_DISABLE_MULTIPROCESSING=1

# Set process count for all COBRA tools
export COBRA_PROCESSES=4

Tool-Specific Environment Variables:

# Flux Variability Analysis
export COBRA_FVA_PROCESSES=8

# Flux Sampling
export COBRA_SAMPLING_PROCESSES=4

# Gene Deletion Analysis
export COBRA_GENE_DELETION_PROCESSES=2

# Essentiality Analysis
export COBRA_ESSENTIALITY_PROCESSES=2

Performance Considerations

Single Process (Default): - No connection pool fragmentation - Lower memory usage - Simpler debugging - Slower for large analyses

Multiprocess: - Faster for large-scale analyses - Higher memory usage - Connection pool overhead - Complex error handling

Last updated: 3003b76c - COBRA multiprocessing changes detected