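Streaming output (SSE)
Streaming output returns content as it is generated, so the experience is closer to the "typewriter" effect of the web interface.
Endpoint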
- Method: POST
- Path: /chat/completions
- Add to the request body: "stream": true
curl example
bash
curl -N https://ask.ling.rest/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_MODEL_ID_HERE",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Tell a short story in 5 sentences." }
    ]
  }'

The server responds in SSE format with a series of data: lines and ends the stream with data: [DONE].
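To make the wire format concrete, here is a minimal Python sketch that parses a captured SSE body offline. The sample payloads are invented for illustration and assume the chunk shape that the Python example on this page reads (choices[0].delta.content); real chunks may carry more fields. The helper names iter_sse_payloads and collect_text are hypothetical, not part of any SDK.

```python
import json

# Hypothetical sample of what the server might send; invented for illustration.
SAMPLE_STREAM = (
    'data: {"choices": [{"delta": {"role": "assistant"}}]}\n'
    "\n"
    'data: {"choices": [{"delta": {"content": "Once"}}]}\n'
    "\n"
    'data: {"choices": [{"delta": {"content": " upon a time"}}]}\n'
    "\n"
    "data: [DONE]\n"
    "\n"
)


def iter_sse_payloads(raw):
    """Yield the text after each 'data:' prefix, stopping at the [DONE] sentinel."""
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield payload


def collect_text(raw):
    """Concatenate every delta.content fragment found in the stream."""
    pieces = []
    for payload in iter_sse_payloads(raw):
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # tolerate chunks that carry no choices
        text = choices[0].get("delta", {}).get("content")
        if text:
            pieces.append(text)
    return "".join(pieces)


print(collect_text(SAMPLE_STREAM))  # prints: Once upon a time
```

The first chunk in the sample carries only a role and no content, which is why the parser checks for a non-empty content field before appending.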
Python example
python
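import json

import requests

BASE_URL = "https://ask.ling.rest/api/v1"
API_KEY = "YOUR_API_KEY_HERE"
MODEL = "YOUR_MODEL_ID_HERE"

with requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    json={
        "model": MODEL,
        "stream": True,  # ask the server for an SSE stream
        "messages": [{"role": "user", "content": "Write a piece of product copy in under 100 words."}],
    },
    stream=True,  # let requests hand over the body incrementally
    timeout=120,
) as resp:
    resp.raise_for_status()
    # If the Content-Type header has no charset, requests falls back to
    # ISO-8859-1 for text/* responses, which garbles non-ASCII text.
    resp.encoding = "utf-8"
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # tolerate chunks that carry no choices
        text = choices[0].get("delta", {}).get("content")
        if text:
            print(text, end="", flush=True)

print()

JavaScript (browser/Node.js) example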
javascript
const BASE_URL = "https://ask.ling.rest/api/v1";
const API_KEY = "YOUR_API_KEY_HERE";
const MODEL = "YOUR_MODEL_ID_HERE";

// Write without a trailing newline in Node.js; fall back to console.log in
// browsers, where `process` is not defined.
function emit(text) {
  if (typeof process !== "undefined" && process.stdout?.write) {
    process.stdout.write(text);
  } else {
    console.log(text);
  }
}

async function main() {
  const res = await fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: MODEL,
      stream: true,
      messages: [{ role: "user", content: "Give me 10 blog post titles." }],
    }),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}: ${await res.text()}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    // { stream: true } keeps multi-byte characters intact across reads.
    buffer += decoder.decode(value, { stream: true });
    // SSE events are separated by a blank line; keep the unfinished tail buffered.
    const parts = buffer.split("\n\n");
    buffer = parts.pop() ?? "";
    for (const part of parts) {
      const line = part.split("\n").find((l) => l.startsWith("data: "));
      if (!line) continue;
      const payload = line.slice("data: ".length).trim();
      if (payload === "[DONE]") return;
      const chunk = JSON.parse(payload);
      const delta = chunk.choices?.[0]?.delta ?? {};
      if (delta.content) emit(delta.content);
    }
  }
}

main().catch(console.error);

FAQ
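1) Why am I not receiving streamed content?
- Some network environments buffer responses; make sure your client supports streaming and has buffering turned off (with curl, use -N).
- A proxy or gateway may convert the stream into a single, complete response.
- A small number of models or channels may not support streaming output.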
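One way to narrow down the cause is to look at what actually came back. The sketch below assumes that a streamed reply is labeled with the standard SSE media type text/event-stream, and that a reply converted to a single response arrives as JSON with the text under choices[0].message.content; that second shape is an assumption, not something this page documents. The function name extract_reply is hypothetical. A proxy that buffers the body but keeps the SSE framing would still be reported as streamed here, so treat the result as a hint only.

```python
import json


def extract_reply(content_type, body):
    """Return (was_streamed, text) for a response body that has been fully read."""
    if content_type.lower().startswith("text/event-stream"):
        pieces = []
        for line in body.splitlines():
            if not line.startswith("data:"):
                continue  # skip blank separators and non-data SSE fields
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                break
            choices = json.loads(payload).get("choices") or []
            if choices:
                pieces.append(choices[0].get("delta", {}).get("content") or "")
        return True, "".join(pieces)

    # Assumed non-streaming shape: choices[0].message.content
    choices = json.loads(body).get("choices") or []
    if not choices:
        return False, ""
    return False, choices[0].get("message", {}).get("content") or ""


# Invented sample bodies for illustration.
sse_body = 'data: {"choices": [{"delta": {"content": "Hi"}}]}\n\ndata: [DONE]\n\n'
json_body = '{"choices": [{"message": {"content": "Hi there"}}]}'

print(extract_reply("text/event-stream; charset=utf-8", sse_body))
print(extract_reply("application/json", json_body))
```

If the second branch is the one that runs even though the request set "stream": true, something between the client and the model (or the model or channel itself) is not streaming.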
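2) What should I do about empty replies or slow responses?
Slow responses and empty replies are expected behavior for reverse-engineered models. For details, see: Model Details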
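Since empty replies are expected from these models, a client may want to tolerate them. The following is a minimal sketch of one possible approach, a bounded retry; it is not something the service prescribes. The function ask_with_retry and its parameters are hypothetical, and ask stands for any zero-argument callable that sends the request and returns the reply text.

```python
import time


def ask_with_retry(ask, max_attempts=3, delay_seconds=1.0):
    """Call ask() until it returns non-blank text or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        text = ask()
        if text and text.strip():
            return text
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before trying again
    return ""  # every attempt came back empty


# Demo with a fake backend that returns two empty replies, then real text.
fake_replies = iter(["", "   ", "Here is your answer."])
result = ask_with_retry(lambda: next(fake_replies), max_attempts=3, delay_seconds=0)
print(result)  # prints: Here is your answer.
```

Keeping max_attempts small avoids piling extra load onto a model that is already slow.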