What speed is reasonable for a large language model?


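I'm running a qwen32b model on a single Ascend 910b.

For one answer from a dify knowledge base, the LLM step took 30s. Is that normal? I don't have a powerful machine like an H100 on hand.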

2 replies • 2025-08-17 15:58:40 +08:00

guoguobaba

Aug 6, 2025

oldlamp

Aug 17, 2025

Roughly speaking, you have to look at tokens/s.
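
To make the tokens/s idea concrete, here is a minimal sketch of the arithmetic. Only the 30 s figure comes from the question; the output length of 600 tokens is a made-up number for illustration, and `tokens_per_second` is a hypothetical helper, not part of dify or any serving framework.

```python
def tokens_per_second(output_tokens: int, elapsed_seconds: float) -> float:
    """Rough throughput: generated tokens divided by wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return output_tokens / elapsed_seconds


# Assumed values: 30 s is from the post, 600 output tokens is hypothetical.
rate = tokens_per_second(output_tokens=600, elapsed_seconds=30.0)
print(f"{rate:.1f} tokens/s")
```

Note that the 30 s in the question also covers processing the prompt (including any retrieved knowledge base text), so a figure computed this way is only a rough end-to-end number, not a pure generation speed.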