Rate Limiter

The Rate Limiter utility provides intelligent rate limiting for API calls, helping manage costs and prevent quota exhaustion when working with LLM providers.

Overview

The rate limiter is typically used through the @RateLimiter decorator but can also be accessed directly for advanced use cases.
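
For the direct path, a hypothetical sketch is shown below. The import path, constructor, and waitForSlot method here are assumptions for illustration only; check the package's exports for the actual direct API.

// Hypothetical direct usage; the names below are illustrative, not confirmed API.
import { RateLimiter } from "agent-forge";

const limiter = new RateLimiter({ rateLimitPerSecond: 2, verbose: true });

async function search(query: string): Promise<void> {
  // Block until the limiter grants a slot for this tool, then call out.
  await limiter.waitForSlot("WebSearchTool");
  // ... perform the rate-limited API call here
}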

Features

  • Token Bucket Algorithm: Efficient rate limiting using a token bucket implementation (sketched after this list)
  • Per-Tool Limits: Different rate limits for different tools
  • Automatic Backoff: Intelligent waiting when limits are reached
  • Caching Integration: Built-in result caching to reduce API calls
  • Verbose Logging: Detailed logging for debugging and monitoring
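
The token bucket idea behind the first feature is straightforward: tokens accumulate at a fixed refill rate up to a capacity, each call consumes one, and callers wait when the bucket is empty. The following is a minimal, generic TypeScript sketch of that algorithm, not the library's internal code:

// Minimal token bucket: refills continuously at refillPerSecond,
// holds at most capacity tokens, and makes callers wait when empty.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,
    private refillPerSecond: number
  ) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  private refill(): void {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSecond);
    this.lastRefill = now;
  }

  // Resolves once a token is available, consuming it.
  async take(): Promise<void> {
    this.refill();
    while (this.tokens < 1) {
      // Wait roughly long enough for one token to accumulate.
      const waitMs = ((1 - this.tokens) / this.refillPerSecond) * 1000;
      await new Promise((resolve) => setTimeout(resolve, waitMs));
      this.refill();
    }
    this.tokens -= 1;
  }
}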

Configuration

interface RateLimiterConfig {
  rateLimitPerSecond?: number;  // cap on calls per second (applies unless overridden per tool)
  rateLimitPerMinute?: number;  // cap on calls per minute (applies unless overridden per tool)
  verbose?: boolean;            // log limiter activity for debugging and monitoring
  cacheTTL?: number;            // time-to-live for cached results
  toolSpecificLimits?: Record<string, ToolSpecificLimit>; // per-tool overrides, keyed by tool name
}
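
The ToolSpecificLimit shape is not defined above; judging from the usage example below, it carries the same per-second and per-minute fields. A sketch inferred from that example (the real type may include more fields):

interface ToolSpecificLimit {
  rateLimitPerSecond?: number;  // per-tool cap on calls per second
  rateLimitPerMinute?: number;  // per-tool cap on calls per minute
}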

Usage with Decorator

@RateLimiter({
  rateLimitPerSecond: 2,
  rateLimitPerMinute: 100,
  verbose: true,
  toolSpecificLimits: {
    "WebSearchTool": {
      rateLimitPerSecond: 0.5,
      rateLimitPerMinute: 20
    }
  }
})
@llmProvider("openai", { apiKey: process.env.OPENAI_API_KEY })
@forge()
class RateLimitedApp {
  static forge: AgentForge;
}
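
With this configuration, the application as a whole is capped at 2 calls per second and 100 per minute, while WebSearchTool is held to one call every two seconds (a fractional rate of 0.5 per second) and 20 per minute.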

Best Practices

  • Start with conservative limits and increase as needed
  • Use tool-specific limits for expensive operations
  • Enable verbose logging during development
  • Monitor rate limit usage in production
  • Implement fallback strategies for rate-limit scenarios (a retry-with-backoff sketch follows this list)
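
On the last point, a common fallback is retry with exponential backoff when a provider still rejects a call (for example, with an HTTP 429). The helper below is a generic pattern, not part of the library's API:

// Retries fn with exponential backoff: waits 1s, 2s, 4s, ... between attempts.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}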

See Also