What counts as a reasonable speed for a large model?

guoguobaba · Aug 6, 2025 · 1556 views

    I'm running the qwen32b model on a single Ascend 910B machine.

    For one answer from a Dify knowledge base, the LLM step took 30 s. Is that normal? I don't have a machine as powerful as an H100 on hand.

    https://i.imgur.com/N63dxld.jpg

    2 replies    2025-08-17 15:58:40 +08:00
    1 · guoguobaba (OP) · Aug 6, 2025
    2 · oldlamp · Aug 17, 2025
    Roughly speaking, it depends on tokens/s.
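
To make the tokens/s point concrete, here is a minimal sketch of turning a wall-clock time into a generation rate. Only the 30 s figure comes from the original post. The output token count and the time to first token are made-up numbers for illustration, and the helper name `decode_tokens_per_second` is hypothetical.

```python
def decode_tokens_per_second(
    output_tokens: int,
    total_seconds: float,
    time_to_first_token: float = 0.0,
) -> float:
    """Generation rate, excluding the wait before the first token appears."""
    decode_seconds = total_seconds - time_to_first_token
    if decode_seconds <= 0:
        raise ValueError("total_seconds must exceed time_to_first_token")
    return output_tokens / decode_seconds


# 30 s is the figure from the post; 600 output tokens and a 5 s
# time to first token are assumed values, not measurements.
rate = decode_tokens_per_second(
    output_tokens=600, total_seconds=30.0, time_to_first_token=5.0
)
print(f"{rate:.1f} tokens/s")  # prints "24.0 tokens/s"
```

The same 30 s can mean very different things: a long answer generated at a steady rate, or a short answer that spent most of its time processing a long retrieved context before the first token. Separating the two is what makes the number comparable across machines.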