Rate Limiter
The Rate Limiter utility provides intelligent rate limiting for API calls, helping manage costs and prevent quota exhaustion when working with LLM providers.
Overview
The rate limiter is typically used through the @RateLimiter
decorator but can also be accessed directly for advanced use cases.
Features
- Token Bucket Algorithm: Efficient rate limiting using token bucket implementation
- Per-Tool Limits: Different rate limits for different tools
- Automatic Backoff: Intelligent waiting when limits are reached
- Caching Integration: Built-in result caching to reduce API calls
- Verbose Logging: Detailed logging for debugging and monitoring
Configuration
interface RateLimiterConfig {
rateLimitPerSecond?: number;
rateLimitPerMinute?: number;
verbose?: boolean;
cacheTTL?: number;
toolSpecificLimits?: Record<string, ToolSpecificLimit>;
}
Usage with Decorator
@RateLimiter({
rateLimitPerSecond: 2,
rateLimitPerMinute: 100,
verbose: true,
toolSpecificLimits: {
"WebSearchTool": {
rateLimitPerSecond: 0.5,
rateLimitPerMinute: 20
}
}
})
@llmProvider("openai", { apiKey: process.env.OPENAI_API_KEY })
@forge()
class RateLimitedApp {
static forge: AgentForge;
}
Best Practices
- Start with conservative limits and increase as needed
- Use tool-specific limits for expensive operations
- Enable verbose logging during development
- Monitor rate limit usage in production
- Implement fallback strategies for rate limit scenarios