LLMProvider Trait

Overview

The LLMProvider trait defines the standard interface for integrating Large Language Model APIs into the MoFA framework. All LLM providers (OpenAI, Anthropic, Ollama, etc.) implement this trait to provide a unified API surface.

Trait Definition

#[async_trait]
pub trait LLMProvider: Send + Sync {
    fn name(&self) -> &str;
    fn default_model(&self) -> &str;
    fn supported_models(&self) -> Vec<&str>;
    fn supports_streaming(&self) -> bool;
    fn supports_tools(&self) -> bool;
    fn supports_vision(&self) -> bool;
    fn supports_embedding(&self) -> bool;
    
    async fn chat(&self, request: ChatCompletionRequest) -> LLMResult<ChatCompletionResponse>;
    async fn chat_stream(&self, request: ChatCompletionRequest) -> LLMResult<ChatStream>;
    async fn embedding(&self, request: EmbeddingRequest) -> LLMResult<EmbeddingResponse>;
    async fn health_check(&self) -> LLMResult<bool>;
    async fn get_model_info(&self, model: &str) -> LLMResult<ModelInfo>;
}

Methods

Provider Metadata

name

fn() -> &str

required

Returns the provider name identifier (e.g., “openai”, “anthropic”, “ollama”)

default_model

fn() -> &str

required

Returns the default model identifier used when no model is specified

supported_models

fn() -> Vec<&str>

required

Returns a list of model identifiers supported by this provider
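
The metadata methods are typically combined when a caller may or may not name a model. The following is a minimal sketch of that resolution step, not part of MoFA: `resolve_model` is a hypothetical helper, and its `supported` and `default_model` parameters stand in for the values returned by `supported_models()` and `default_model()`.

```rust
/// Picks the model identifier to send to the provider.
///
/// Hypothetical helper: `supported` and `default_model` stand in for the
/// values returned by `supported_models()` and `default_model()`.
fn resolve_model<'a>(
    requested: Option<&'a str>,
    supported: &[&'a str],
    default_model: &'a str,
) -> Result<&'a str, String> {
    match requested {
        // No model specified: fall back to the provider default.
        None => Ok(default_model),
        // Model specified and advertised by the provider.
        Some(m) if supported.contains(&m) => Ok(m),
        // Model specified but unknown to this provider.
        Some(m) => Err(format!("model '{}' is not in the supported list", m)),
    }
}
```

Rejecting an unknown model before the request is sent gives the caller a local error instead of a round trip to the API. A real provider would probably return `LLMError::ModelNotFound` rather than a `String`.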

Capability Detection

supports_streaming

fn() -> bool

required

Returns true if the provider supports streaming responses

supports_tools

fn() -> bool

required

Returns true if the provider supports function/tool calling

supports_vision

fn() -> bool

required

Returns true if the provider supports vision/image inputs

supports_embedding

fn() -> bool

required

Returns true if the provider supports text embeddings
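
Callers can use the capability flags to reject a request before it reaches the network. The sketch below illustrates one way to do that. The `Capabilities` trait is a simplified, synchronous stand-in for the four `supports_*` methods, and `Needs`, `missing_features`, and `TextOnly` are hypothetical names introduced only for this example.

```rust
/// Simplified stand-in for the capability methods of `LLMProvider`.
trait Capabilities {
    fn supports_streaming(&self) -> bool;
    fn supports_tools(&self) -> bool;
    fn supports_vision(&self) -> bool;
    fn supports_embedding(&self) -> bool;
}

/// Hypothetical description of what a caller is about to ask for.
#[derive(Default)]
struct Needs {
    streaming: bool,
    tools: bool,
    vision: bool,
    embedding: bool,
}

/// Returns the names of required features the provider does not report.
fn missing_features(p: &dyn Capabilities, needs: &Needs) -> Vec<&'static str> {
    let checks = [
        ("streaming", needs.streaming, p.supports_streaming()),
        ("tools", needs.tools, p.supports_tools()),
        ("vision", needs.vision, p.supports_vision()),
        ("embedding", needs.embedding, p.supports_embedding()),
    ];
    checks
        .iter()
        .filter(|(_, needed, supported)| *needed && !*supported)
        .map(|(name, _, _)| *name)
        .collect()
}

/// Reports the same flags as `MyLLMProvider` in Basic Implementation.
struct TextOnly;

impl Capabilities for TextOnly {
    fn supports_streaming(&self) -> bool { true }
    fn supports_tools(&self) -> bool { true }
    fn supports_vision(&self) -> bool { false }
    fn supports_embedding(&self) -> bool { false }
}
```

An empty result means the request can proceed. Otherwise the returned names can be placed in an error message or used to choose a different provider.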

Core Operations

chat

async fn(request: ChatCompletionRequest) -> LLMResult<ChatCompletionResponse>

required

Sends a chat completion request and returns the complete response.

Parameters:

request: Chat completion request with messages, model, and parameters

Returns:

ChatCompletionResponse: Complete response with choices and usage data

chat_stream

async fn(request: ChatCompletionRequest) -> LLMResult<ChatStream>

required

Sends a chat completion request and returns a stream of response chunks.

Parameters:

request: Chat completion request with stream enabled

Returns:

ChatStream: Stream of ChatCompletionChunk items

embedding

async fn(request: EmbeddingRequest) -> LLMResult<EmbeddingResponse>

required

Generates embeddings for the input text(s).

Parameters:

request: Embedding request with model and input text(s)

Returns:

EmbeddingResponse: Vector embeddings and usage data

health_check

async fn() -> LLMResult<bool>

required

Checks whether the provider API is accessible and responding.

Returns:

bool: true if healthy, false otherwise

get_model_info

async fn(model: &str) -> LLMResult<ModelInfo>

required

Retrieves metadata about a specific model.

Parameters:

model: Model identifier

Returns:

ModelInfo: Model capabilities, context window, and metadata

Type Definitions

ModelInfo

pub struct ModelInfo {
    pub id: String,
    pub name: String,
    pub description: Option<String>,
    pub context_window: Option<u32>,
    pub max_output_tokens: Option<u32>,
    pub training_cutoff: Option<String>,
    pub capabilities: ModelCapabilities,
}

ModelCapabilities

pub struct ModelCapabilities {
    pub streaming: bool,
    pub tools: bool,
    pub vision: bool,
    pub json_mode: bool,
    pub json_schema: bool,
}
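
The optional limits in `ModelInfo` can be used to size a request before sending it. The sketch below assumes that `context_window` counts prompt and output tokens together, which the source does not state. `ModelLimits` is a local copy of just the two fields needed, and `output_budget` is a hypothetical helper.

```rust
/// Local copy of the two numeric `ModelInfo` fields this sketch needs.
struct ModelLimits {
    context_window: Option<u32>,
    max_output_tokens: Option<u32>,
}

/// Returns how many output tokens can be requested, or `None` when the
/// prompt alone is larger than the context window.
///
/// Assumption: the context window covers prompt and output tokens together.
fn output_budget(limits: &ModelLimits, prompt_tokens: u32, requested: u32) -> Option<u32> {
    // Cap by the model's output limit when it is known.
    let mut budget = match limits.max_output_tokens {
        Some(max) => requested.min(max),
        None => requested,
    };
    // Cap by the space left in the context window when it is known.
    if let Some(window) = limits.context_window {
        let remaining = window.checked_sub(prompt_tokens)?;
        budget = budget.min(remaining);
    }
    Some(budget)
}
```

Because both limits are `Option<u32>`, an unknown limit is treated as "no cap" rather than as zero, so providers that cannot report these values still work.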

ChatStream

pub type ChatStream = Pin<Box<dyn Stream<Item = LLMResult<ChatCompletionChunk>> + Send>>;

Implementing a Custom Provider

Basic Implementation

use mofa_foundation::llm::*;
use async_trait::async_trait;

struct MyLLMProvider {
    api_key: String,
    base_url: String,
}

impl MyLLMProvider {
    pub fn new(api_key: impl Into<String>) -> Self {
        Self {
            api_key: api_key.into(),
            base_url: "https://api.example.com".to_string(),
        }
    }
}

#[async_trait]
impl LLMProvider for MyLLMProvider {
    fn name(&self) -> &str {
        "my-llm"
    }

    fn default_model(&self) -> &str {
        "my-model-v1"
    }

    fn supported_models(&self) -> Vec<&str> {
        vec!["my-model-v1", "my-model-v2"]
    }

    fn supports_streaming(&self) -> bool {
        true
    }

    fn supports_tools(&self) -> bool {
        true
    }

    fn supports_vision(&self) -> bool {
        false
    }

    fn supports_embedding(&self) -> bool {
        false
    }

    async fn chat(&self, request: ChatCompletionRequest) -> LLMResult<ChatCompletionResponse> {
        // Convert request to provider-specific format
        // Send HTTP request to your API
        // Parse and convert response
        todo!("Implement API call")
    }

    async fn chat_stream(&self, request: ChatCompletionRequest) -> LLMResult<ChatStream> {
        // Implement streaming response
        todo!("Implement streaming")
    }

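    async fn embedding(&self, _request: EmbeddingRequest) -> LLMResult<EmbeddingResponse> {
        // Consistent with supports_embedding() returning false
        Err(LLMError::Other("Embeddings not supported".to_string()))
    }

    async fn health_check(&self) -> LLMResult<bool> {
        // Placeholder: a real implementation would send a lightweight
        // request to base_url and report whether it succeeded
        Ok(true)
    }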

    async fn get_model_info(&self, model: &str) -> LLMResult<ModelInfo> {
        Ok(ModelInfo {
            id: model.to_string(),
            name: model.to_string(),
            description: Some("Custom model".to_string()),
            context_window: Some(8192),
            max_output_tokens: Some(2048),
            training_cutoff: None,
            capabilities: ModelCapabilities {
                streaming: true,
                tools: true,
                vision: false,
                json_mode: true,
                json_schema: false,
            },
        })
    }
}

Using the Custom Provider

use mofa_foundation::llm::LLMClient;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let provider = Arc::new(MyLLMProvider::new("api-key"));
    let client = LLMClient::new(provider);

    let response = client
        .chat()
        .system("You are a helpful assistant.")
        .user("What is Rust?")
        .send()
        .await?;

    // Fall back to an empty string instead of panicking when no content is returned
    println!("Response: {}", response.content().unwrap_or_default());
    Ok(())
}

Built-in Providers

MoFA includes several built-in provider implementations:

OpenAI: GPT-4, GPT-3.5, with vision and tools support
Anthropic: Claude 3 models with streaming
Ollama: Local models via OpenAI-compatible API
Google Gemini: Google’s models (when enabled)

See individual provider documentation for configuration details.

Error Handling

pub enum LLMError {
    ApiError { code: Option<String>, message: String },
    NetworkError(String),
    Timeout(String),
    RateLimited(String),
    QuotaExceeded(String),
    ModelNotFound(String),
    ContextLengthExceeded(String),
    ContentFiltered(String),
    ConfigError(String),
    SerializationError(String),
    Other(String),
}

pub type LLMResult<T> = Result<T, LLMError>;
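
For an HTTP-based provider, much of the error handling comes down to choosing a variant from the response status. The sketch below shows one possible mapping. The enum is a local copy of the definition above so that the example compiles on its own, `error_from_status` is a hypothetical helper, and the status-to-variant choices (for example, treating 401 and 403 as `ConfigError`) are assumptions rather than MoFA's documented behavior.

```rust
/// Local copy of `LLMError` so the sketch is self-contained.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum LLMError {
    ApiError { code: Option<String>, message: String },
    NetworkError(String),
    Timeout(String),
    RateLimited(String),
    QuotaExceeded(String),
    ModelNotFound(String),
    ContextLengthExceeded(String),
    ContentFiltered(String),
    ConfigError(String),
    SerializationError(String),
    Other(String),
}

/// Maps an HTTP status code and response body to an `LLMError`.
/// The mapping is illustrative; adjust it to the provider's API contract.
fn error_from_status(status: u16, body: &str) -> LLMError {
    match status {
        // Bad or missing credentials are a configuration problem.
        401 | 403 => LLMError::ConfigError(format!("authentication failed: {}", body)),
        404 => LLMError::ModelNotFound(body.to_string()),
        408 | 504 => LLMError::Timeout(body.to_string()),
        429 => LLMError::RateLimited(body.to_string()),
        // Anything else keeps the status code for later inspection.
        _ => LLMError::ApiError {
            code: Some(status.to_string()),
            message: body.to_string(),
        },
    }
}
```

Keeping the status code in the `ApiError` fallback lets callers inspect unmapped failures without the provider having to anticipate every case.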

Best Practices

Thread Safety: All providers must be Send + Sync for concurrent usage
Error Categorization: Map provider-specific errors to appropriate LLMError variants
Timeout Handling: Implement reasonable timeouts for API calls
Retry Logic: Consider implementing retry logic for transient failures
Capability Detection: Accurately report supported capabilities
Resource Cleanup: Properly handle connection pooling and cleanup
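
The retry recommendation above can be sketched as exponential backoff limited to transient errors. This example uses only the standard library and is synchronous so that it stays self-contained; an async provider would await a timer instead of calling the `sleep` callback. The trimmed `LLMError` copy, the choice of which variants count as transient, and the 200 ms base and 5 s cap are all assumptions made for illustration.

```rust
use std::time::Duration;

/// Trimmed local copy of `LLMError` with only the variants used here.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum LLMError {
    NetworkError(String),
    Timeout(String),
    RateLimited(String),
    ConfigError(String),
}

/// Assumption: only these variants are worth retrying.
fn is_transient(err: &LLMError) -> bool {
    matches!(
        err,
        LLMError::NetworkError(_) | LLMError::Timeout(_) | LLMError::RateLimited(_)
    )
}

/// Delay before the next attempt: `base * 2^attempt`, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Runs `op` up to `max_attempts` times, sleeping between transient failures.
/// `sleep` is injected so callers choose how to wait (and tests need not wait).
fn retry<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, LLMError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, LLMError> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) if is_transient(&e) && attempt + 1 < max_attempts => {
                sleep(backoff_delay(
                    attempt,
                    Duration::from_millis(200),
                    Duration::from_secs(5),
                ));
                attempt += 1;
            }
            // Non-transient error, or attempts exhausted.
            Err(e) => return Err(e),
        }
    }
}
```

In blocking code, `std::thread::sleep` can be passed as the `sleep` argument. Non-transient errors such as `ConfigError` are returned immediately, since repeating the call cannot fix them.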

Related Types

LLMClient - High-level client for using providers
ChatCompletionRequest - Request format
ChatCompletionResponse - Response format
Tool - Tool/function calling definitions
