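Make full use of AI Router Platform's features to optimize the performance and cost of your AI applications.
Choose the routing strategy that best fits your business scenario: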
POST /v1/routing/config
{
  "strategy": "latency_priority",
  "fallback_strategy": "failover",
  "providers": [
    {
      "name": "openai",
      "priority": 1,
      "weight": 60
    },
    {
      "name": "anthropic",
      "priority": 2,
      "weight": 40
    }
  ]
}
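The request body above could be assembled in Python roughly as follows. This is a minimal sketch: the helper name `build_routing_config` is hypothetical, and the check that weights sum to 100 is an assumption inferred from the 60/40 split in the example, not a documented rule. The sketch only builds and serializes the body; sending it to `POST /v1/routing/config` is left to whichever HTTP client you use.

```python
import json


def build_routing_config(strategy, fallback_strategy, providers):
    """Build the JSON body for POST /v1/routing/config.

    `providers` is a list of (name, weight) pairs in priority order.
    Assumption: weights are percentages that should sum to 100.
    """
    total = sum(weight for _, weight in providers)
    if total != 100:
        raise ValueError(f"provider weights sum to {total}, expected 100")
    return {
        "strategy": strategy,
        "fallback_strategy": fallback_strategy,
        "providers": [
            {"name": name, "priority": rank, "weight": weight}
            for rank, (name, weight) in enumerate(providers, start=1)
        ],
    }


config = build_routing_config(
    "latency_priority", "failover", [("openai", 60), ("anthropic", 40)]
)
body = json.dumps(config, indent=2)  # request body, ready to send
```

Deriving `priority` from list order keeps the two fields from drifting apart when providers are reordered.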
Add multiple API keys to the Key Pool; the system load-balances across them and fails over automatically.
POST /v1/providers/openai/keys
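Rotate API keys every 90 days to reduce the risk of key leakage.
Regularly check each key's call volume, success rate, and latency so that anomalies are caught early.
Use separate API keys for development, testing, and production so the environments do not affect one another.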
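The 90-day rotation guideline can be tracked with a small script. The sketch below assumes you keep your own record of when each key was created; the key labels and dates are hypothetical, and nothing here calls the platform.

```python
from datetime import date, timedelta

ROTATION_INTERVAL = timedelta(days=90)  # the 90-day guideline from the text


def keys_due_for_rotation(created_on, today):
    """Return the labels of keys whose age is at least ROTATION_INTERVAL.

    `created_on` maps a key label (hypothetical) to its creation date.
    """
    return sorted(
        label
        for label, created in created_on.items()
        if today - created >= ROTATION_INTERVAL
    )


# Hypothetical inventory: one key per environment.
inventory = {
    "prod-key": date(2024, 1, 1),
    "staging-key": date(2024, 3, 1),
    "dev-key": date(2024, 3, 20),
}
due = keys_due_for_rotation(inventory, today=date(2024, 4, 1))
```

Passing `today` explicitly, instead of calling `date.today()` inside the function, makes the check easy to test.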
# Verbose prompt: more tokens per request
messages = [
    {"role": "system", "content": "You are a helpful assistant that provides detailed explanations..."},
    {"role": "user", "content": "What is the capital of France? Please provide a comprehensive answer."}
]

# Concise prompt: the same question in fewer tokens
messages = [
    {"role": "system", "content": "Concise assistant"},
    {"role": "user", "content": "Capital of France?"}
]
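To get a feel for the difference between the two prompts above, a crude estimate is enough. The sketch below assumes roughly 4 characters per token, a common rule of thumb for English text and not the tokenizer any provider actually uses; `estimate_tokens` is a hypothetical helper.

```python
def estimate_tokens(messages, chars_per_token=4):
    """Very rough token estimate: total characters // chars_per_token.

    Assumption: about 4 characters per token. Use the provider's own
    tokenizer when you need billing-grade numbers.
    """
    total_chars = sum(len(m["content"]) for m in messages)
    return max(1, total_chars // chars_per_token)


verbose = [
    {"role": "system", "content": "You are a helpful assistant that provides detailed explanations..."},
    {"role": "user", "content": "What is the capital of France? Please provide a comprehensive answer."},
]
concise = [
    {"role": "system", "content": "Concise assistant"},
    {"role": "user", "content": "Capital of France?"},
]

saved = estimate_tokens(verbose) - estimate_tokens(concise)  # per request
```

The saving applies to every request that reuses the system prompt, so it scales with traffic.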
POST /v1/billing/alerts
{
  "type": "budget_threshold",
  "threshold": 100.0,
  "currency": "USD",
  "period": "monthly",
  "notify_email": "admin@example.com"
}
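import time

from openai import OpenAIError, RateLimitError


def exponential_backoff_retry(func, max_retries=3):
    """Call func(), making up to max_retries attempts on OpenAI errors."""
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            # Rate limited: re-raise on the last attempt, otherwise back off
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt  # 1s, 2s, 4s, ... doubling each attempt
            time.sleep(wait_time)
        except OpenAIError as e:
            # Other API errors: log, then retry after a fixed 1s pause
            print(f"Error: {e}")
            if attempt == max_retries - 1:
                raise
            time.sleep(1)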
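Return the error directly; correct the request, then send it again.
Retry with exponential backoff.
Once a failover strategy is configured, the system automatically switches to a backup provider when the primary provider has problems; no manual handling is needed.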
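The two handling rules above do not say which errors fall into which group. A common convention, assumed here and not confirmed by this guide, is that HTTP 429 and 5xx responses are transient and worth retrying, while other 4xx responses indicate a request that must be fixed first. The sketch also adds random jitter to the backoff delay, a widely used refinement; both function names are hypothetical.

```python
import random


def is_retryable(status_code):
    """Assumed mapping: 429 and 5xx are transient; other 4xx are caller errors."""
    return status_code == 429 or 500 <= status_code <= 599


def backoff_delay(attempt, base=1.0, cap=30.0, jitter=True):
    """Delay in seconds before retry number `attempt` (0-based).

    The delay doubles each attempt and is capped at `cap`. With jitter, a
    random value in [0, delay] is used so that many clients do not retry
    in lockstep.
    """
    delay = min(cap, base * (2 ** attempt))
    return random.uniform(0, delay) if jitter else delay


# Fail fast on a bad request, retry on rate limiting or a server error.
decisions = {code: is_retryable(code) for code in (400, 401, 429, 500, 503)}
```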
For long-form text generation, enabling streaming delivers the first part of the response sooner.
response = openai.ChatCompletion.create(
    model="gpt-4-turbo",
    messages=[...],
    stream=True  # enable streaming responses
)
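To check whether streaming actually helps, you can measure the time until the first chunk arrives. The sketch below works on any iterable of text pieces; `fake_stream` is a hypothetical stand-in for a streaming response, and with a real one you would first extract the text from each chunk.

```python
import time


def consume_stream(chunks):
    """Iterate over a stream, returning (full_text, seconds_to_first_chunk).

    `chunks` is any iterable of text pieces. The second value is None when
    the stream yields nothing.
    """
    start = time.perf_counter()
    first_chunk_at = None
    parts = []
    for piece in chunks:
        if first_chunk_at is None:
            first_chunk_at = time.perf_counter() - start
        parts.append(piece)
    return "".join(parts), first_chunk_at


def fake_stream():
    """Stand-in for a streaming response (hypothetical data)."""
    for piece in ["Paris ", "is the ", "capital of France."]:
        time.sleep(0.01)  # simulate generation time per chunk
        yield piece


text, time_to_first_chunk = consume_stream(fake_stream())
```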
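The system monitors provider latency in real time and automatically selects the fastest provider.
For repeated requests, an application-layer cache can significantly reduce latency and cost.
Keep concurrency under control to avoid triggering rate limits and timeouts.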
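Application-layer caching and bounded concurrency can both be sketched with the standard library. The example below is illustrative only: `CachedCaller`, `run_bounded`, and `fake_completion` are hypothetical names, the cache is a plain in-memory dictionary with no expiry, and two identical requests that arrive at the same moment may both go upstream.

```python
import hashlib
import json
import threading
from concurrent.futures import ThreadPoolExecutor


class CachedCaller:
    """Wraps a completion function with an in-memory cache keyed on the request."""

    def __init__(self, call_fn):
        self._call_fn = call_fn  # hypothetical: (model, messages) -> str
        self._cache = {}
        self._lock = threading.Lock()
        self.upstream_calls = 0  # how many requests actually went out

    def _key(self, model, messages):
        raw = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def complete(self, model, messages):
        key = self._key(model, messages)
        with self._lock:
            if key in self._cache:
                return self._cache[key]
            self.upstream_calls += 1
        result = self._call_fn(model, messages)  # outside the lock, so calls overlap
        with self._lock:
            self._cache[key] = result
        return result


def run_bounded(fn, jobs, max_concurrency=4):
    """Run fn(job) for each job with at most max_concurrency in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(fn, jobs))


def fake_completion(model, messages):
    """Stand-in for a real API call."""
    return f"{model}: {messages[-1]['content'].upper()}"


caller = CachedCaller(fake_completion)
question = [{"role": "user", "content": "Capital of France?"}]
first = caller.complete("gpt-4-turbo", question)
second = caller.complete("gpt-4-turbo", question)  # served from the cache

answers = run_bounded(
    lambda q: caller.complete("gpt-4-turbo", [{"role": "user", "content": q}]),
    ["a", "b", "c"],
    max_concurrency=2,
)
```

Caching only makes sense for requests where a repeated answer is acceptable, for example deterministic settings or lookups that change rarely.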
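Send an alert when the error rate exceeds 5% within 5 minutes.
Send an alert when P95 latency exceeds 2 seconds.
Send an alert when daily cost exceeds 80% of the budget.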
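The three alert rules above can be expressed as a small check over raw request records. This is a sketch under assumptions: the record shape, the rule names, and the nearest-rank percentile method are illustrative choices and do not describe how the platform computes its own alerts.

```python
import math


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(len(ordered) * 95 / 100)  # 1-based rank
    return ordered[rank - 1]


def evaluate_alerts(window_requests, daily_cost, daily_budget):
    """Return the names of the alert rules that fire.

    `window_requests` is a list of (latency_seconds, succeeded) pairs
    collected over the last 5 minutes (hypothetical record shape).
    """
    fired = []
    if window_requests:
        errors = sum(1 for _, ok in window_requests if not ok)
        if errors / len(window_requests) > 0.05:
            fired.append("error_rate")
        if p95([latency for latency, _ in window_requests]) > 2.0:
            fired.append("p95_latency")
    if daily_cost > 0.8 * daily_budget:
        fired.append("daily_cost")
    return fired


# 100 requests with 6 failures (6% > 5%); every latency is well under 2s.
sample = [(0.2, True)] * 94 + [(0.3, False)] * 6
alerts = evaluate_alerts(sample, daily_cost=85.0, daily_budget=100.0)
```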
GET /v1/monitoring/metrics?period=1h
Response:
{
  "metrics": {
    "requests_total": 12500,
    "success_rate": 0.998,
    "avg_latency_ms": 180,
    "p95_latency_ms": 450,
    "error_rate": 0.002
  }
}
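A response like the one above is easier to work with once it is parsed into a typed object. The sketch below assumes the response contains exactly the five fields shown; the `Metrics` class and `parse_metrics` helper are hypothetical, and an unexpected extra field would raise a `TypeError`.

```python
import json
from dataclasses import dataclass


@dataclass
class Metrics:
    requests_total: int
    success_rate: float
    avg_latency_ms: float
    p95_latency_ms: float
    error_rate: float

    @property
    def failed_requests(self):
        """Approximate number of failed requests in the period."""
        return round(self.requests_total * self.error_rate)


def parse_metrics(payload):
    """Parse the JSON text of a metrics response into a Metrics object."""
    return Metrics(**json.loads(payload)["metrics"])


# The sample response from the text, as it would arrive over the wire.
sample = """
{"metrics": {"requests_total": 12500, "success_rate": 0.998,
             "avg_latency_ms": 180, "p95_latency_ms": 450,
             "error_rate": 0.002}}
"""
metrics = parse_metrics(sample)

# Sanity check: success and error rates should add up to 1.
rates_consistent = abs(metrics.success_rate + metrics.error_rate - 1.0) < 1e-9
```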
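Restrict each API key so that it can only be used from specific IP addresses.
Validate and sanitize user input to prevent injection attacks.
Review API call logs regularly to spot abnormal behavior early.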
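On the application side, the first two recommendations might look like the sketch below. It is illustrative only: the allowed network is a documentation range, `MAX_INPUT_CHARS` is a hypothetical limit, and stripping control characters is basic hygiene, not a complete defense against injection. Restricting a key to specific IPs on the platform itself is configured there and is not shown.

```python
import ipaddress
import unicodedata

# Hypothetical allowlist; 203.0.113.0/24 is reserved for documentation.
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
MAX_INPUT_CHARS = 4000  # hypothetical limit


def is_ip_allowed(address):
    """True if `address` falls inside one of the allowed networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)


def clean_user_input(text):
    """Drop control characters (except newline and tab), trim, and cap the length."""
    kept = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch)[0] != "C"
    )
    kept = kept.strip()
    if not kept:
        raise ValueError("input is empty")
    if len(kept) > MAX_INPUT_CHARS:
        raise ValueError("input is too long")
    return kept


allowed = is_ip_allowed("203.0.113.7")
cleaned = clean_user_input("  hi\x00 there\n")
```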
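import os

import openai

# Read the key from an environment variable instead of hard-coding it
api_key = os.getenv('AI_ROUTER_API_KEY')
if not api_key:
    raise RuntimeError("AI_ROUTER_API_KEY is not set")

# Configure the client (module-level settings, openai<1.0 style)
openai.api_key = api_key
openai.api_base = "https://api.poeti.ai/v1"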