Rate Limits
Understanding and managing API rate limits
To ensure the stability and availability of our API for all users, Meibel AI implements rate limiting. This page explains our rate limiting system and how to handle rate limit errors.
Understanding Rate Limits
Rate limits apply per API key and vary by subscription plan. They are calculated over rolling time windows, typically per minute and per day.
Rate Limit Headers
Every API response includes headers that provide information about your current rate limit status:
Header | Description |
---|---|
X-RateLimit-Limit | The maximum number of requests allowed in the current time window |
X-RateLimit-Remaining | The number of requests remaining in the current time window |
X-RateLimit-Reset | The time at which the current rate limit window resets (UTC epoch seconds) |
Example headers:
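As a minimal Python sketch, the three headers can be read from a response's header mapping as shown here. The values are illustrative only (a limit of 60 matches the Free Tier per-minute limit; the remaining count and reset timestamp are invented), and `parse_rate_limit_headers` and `RateLimitStatus` are hypothetical names, not part of any Meibel SDK.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class RateLimitStatus:
    limit: int
    remaining: int
    reset_at: datetime


def parse_rate_limit_headers(headers):
    """Build a RateLimitStatus from a mapping of response headers.

    Assumes the mapping uses these exact header names; many HTTP clients
    expose a case-insensitive mapping, which also works here.
    """
    return RateLimitStatus(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        # The reset header is documented as UTC epoch seconds.
        reset_at=datetime.fromtimestamp(
            int(headers["X-RateLimit-Reset"]), tz=timezone.utc
        ),
    )


# Illustrative values only, not real API output.
example_headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "42",
    "X-RateLimit-Reset": "1700000000",
}
status = parse_rate_limit_headers(example_headers)
print(status.remaining, status.reset_at.isoformat())
```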
Rate Limits by Subscription Plan
Free Tier
- 60 requests per minute
- 1,000 requests per day
- Limited streaming duration
Pro Plan
- 300 requests per minute
- 10,000 requests per day
- Extended streaming duration
Enterprise
- Custom limits
- Priority during high load
- Dedicated support
For the most up-to-date information on rate limits for your subscription, check your account dashboard.
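The per-minute and per-day limits interact: a client that sends at the full per-minute rate uses up its daily allowance well before the day is over. The short calculation below uses only the Free Tier and Pro Plan figures listed above; the helper names are hypothetical, and the daily window is treated as a plain 24 hours for simplicity.

```python
MINUTES_PER_DAY = 24 * 60

plans = {
    "Free Tier": {"per_minute": 60, "per_day": 1_000},
    "Pro Plan": {"per_minute": 300, "per_day": 10_000},
}


def minutes_to_exhaust_daily(per_minute, per_day):
    """Minutes of sending at the full per-minute rate before the daily limit is hit."""
    return per_day / per_minute


def sustainable_per_minute(per_day):
    """Average requests per minute that spreads the daily limit evenly over 24 hours."""
    return per_day / MINUTES_PER_DAY


for name, plan in plans.items():
    print(
        name,
        round(minutes_to_exhaust_daily(**plan), 1),
        round(sustainable_per_minute(plan["per_day"]), 2),
    )
```

At the full per-minute rate, the Free Tier daily allowance lasts about 17 minutes and the Pro Plan allowance about 33 minutes, so long-running jobs are usually bounded by the daily limit rather than the per-minute one.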
Rate Limit Errors
When you exceed your rate limit, the API returns a 429 Too Many Requests error with the following response body:
Best Practices for Handling Rate Limits
Implement backoff and retry logic
When you receive a 429 response, wait until the time given in the X-RateLimit-Reset header before retrying your request. If retries keep failing, use an exponential backoff strategy, increasing the delay after each attempt.
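One way to combine the reset header with exponential backoff is to compute a jittered backoff delay and then wait at least until the reset time. The sketch below is an assumption about how a client might do this, not a prescribed algorithm: `retry_delay`, the 1-second base, and the 60-second cap are all illustrative choices.

```python
import random
import time


def retry_delay(attempt, reset_epoch=None, now=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based).

    `reset_epoch` is the X-RateLimit-Reset value (UTC epoch seconds), if known.
    """
    now = time.time() if now is None else now
    # Exponential backoff with full jitter: a random wait in
    # [0, min(cap, base * 2**attempt)] spreads out clients retrying together.
    backoff = random.uniform(0, min(cap, base * 2 ** attempt))
    if reset_epoch is None:
        return backoff
    # Wait at least until the window resets; if the reset time has
    # already passed, this falls back to the backoff value.
    return max(backoff, reset_epoch - now)


print(retry_delay(0, reset_epoch=130, now=100))
```

The jitter matters when many workers share one API key: without it they would all retry at the same instant after the window resets.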
Monitor your usage
Keep track of your rate limit usage through the headers in the API responses to ensure you’re staying within your limits.
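A lightweight way to do this is to inspect the headers after each response and log a warning when the remaining quota gets low. The following is a minimal sketch; `check_usage` and the 10% threshold are hypothetical choices for illustration.

```python
import logging

logger = logging.getLogger("rate_limit")


def check_usage(headers, warn_fraction=0.1):
    """Return True (and log a warning) when the remaining quota is at or
    below `warn_fraction` of the limit for the current window."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    low = remaining <= limit * warn_fraction
    if low:
        logger.warning(
            "Rate limit nearly exhausted: %d of %d requests left",
            remaining,
            limit,
        )
    return low


# Illustrative header values.
print(check_usage({"X-RateLimit-Limit": "60", "X-RateLimit-Remaining": "5"}))
```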
Batch requests when possible
Instead of making multiple small requests, batch operations together when the API supports it.
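Which endpoints accept batched input, and the maximum batch size, are not specified here, so the sketch below only shows the client-side half: grouping items so that each group can go out in a single request. `chunked` and the batch size of 3 are hypothetical.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


documents = [f"doc-{i}" for i in range(7)]
batches = list(chunked(documents, 3))
# Each batch would be sent as one request instead of one request per item.
print(len(documents), "items ->", len(batches), "requests")
```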
Cache responses
Cache API responses that don’t change frequently to reduce the number of requests you need to make.
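A small time-to-live cache is often enough for this. The sketch below is a generic in-memory example, assuming responses can safely be reused for a fixed number of seconds; `TTLCache` and `get_or_fetch` are hypothetical names, and the right TTL depends on how often the underlying data changes.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for `key`; call `fetch()` only on a miss or expiry."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch()
        self._store[key] = (now, value)
        return value


calls = []


def fake_fetch():
    # Stands in for a real API call; records how often it runs.
    calls.append(1)
    return "payload"


cache = TTLCache(ttl=300)
cache.get_or_fetch("models", fake_fetch)
cache.get_or_fetch("models", fake_fetch)
print(len(calls), "API call(s) for 2 lookups")
```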
Example Retry Implementation
Here’s an example of how to implement retry logic in Python:
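The sketch below shows one possible shape for such a retry loop. It is an illustration under stated assumptions, not an official client: `request_with_retry` and `RateLimitExceeded` are hypothetical names, and `send` stands for any zero-argument callable that performs the HTTP request with your client of choice and returns `(status_code, headers, body)`. The loop relies only on the 429 status code and the X-RateLimit-Reset header described above.

```python
import time


class RateLimitExceeded(Exception):
    """Raised when every attempt still returns 429."""


def request_with_retry(send, max_retries=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep, clock=time.time):
    """Call `send()` until it returns a non-429 response or retries run out."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_retries:
            break
        # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        reset = headers.get("X-RateLimit-Reset")
        if reset is not None:
            # Wait at least until the window resets (UTC epoch seconds).
            delay = max(delay, int(reset) - clock())
        sleep(delay)
    raise RateLimitExceeded(f"still rate limited after {max_retries} retries")


# Demo with a fake transport: two 429 responses, then success.
responses = iter([
    (429, {"X-RateLimit-Reset": "105"}, b""),
    (429, {}, b""),
    (200, {}, b"ok"),
])
waits = []
result = request_with_retry(
    lambda: next(responses), sleep=waits.append, clock=lambda: 100.0
)
print(result[0], waits)
```

Passing `sleep` and `clock` as parameters keeps the function testable without real waiting. When the daily limit is the one that was exceeded, the reset time may be hours away, so production code may prefer to fail fast instead of sleeping that long.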
Rate Limit Considerations
Streaming Endpoints
Streaming endpoints (like chat and completion streams) have different rate limits based on:
- Number of requests
- Duration of streaming connections
- Amount of data transmitted
Burst Behavior
Our rate limiting system allows for occasional bursts of traffic that exceed your normal limits, but consistent overages will trigger rate limiting.
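The algorithm behind this burst allowance is not described here. A token bucket is one common way to get this kind of behavior, and the same idea can be used on the client to smooth outgoing traffic so that bursts stay short. The sketch below is a generic client-side throttle, not a description of the server; `TokenBucket`, the rate, and the capacity are illustrative assumptions.

```python
import time


class TokenBucket:
    """Allows bursts of up to `capacity` requests while averaging `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self):
        """Consume one token if available; return False when the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(
            self.capacity, self.tokens + (now - self.updated) * self.rate
        )
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# 60 requests per minute is 1 per second; a capacity of 10 permits short bursts.
bucket = TokenBucket(rate=1.0, capacity=10)
allowed = sum(bucket.try_acquire() for _ in range(15))
print(allowed, "of 15 immediate requests allowed")
```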
IP-Based Rate Limiting
In addition to API key-based rate limiting, we also employ IP-based rate limiting as a security measure. This helps protect against unauthorized access attempts.
Increasing Your Rate Limits
If you need higher rate limits:
- Upgrade your plan: Consider upgrading to a higher-tier subscription
- Contact sales: Enterprise customers can request custom rate limits by contacting sales@meibel.ai
- Optimize your usage: Review our best practices to ensure efficient API usage
Repeatedly exceeding your rate limits may result in temporary or permanent restrictions on your API key. Always implement proper rate limit handling in production applications.