Local LLM quick performance test
==============================
Prompts:
  write me a thousand word story
  write me one page story about...
==============================

LM Studio 0.3.5

A: $400   HP Envy x360, Ryzen 5 8640HS, 16GB 6400MT/s, iGPU 760M
B: $700   ASUS TUF A16, Ryzen 7 7735HS, 16GB, iGPU 680M + AMD RX 7700S 8GB
C: $600   Mac mini M4, 16GB, 256GB
D: $800   HP Victus 16.1", Ryzen 7 8845HS, 16GB (48GB) 5600MT/s, NVIDIA RTX 4070 8GB 105-120W
E: $1500  MacBook Pro 14, M3 Pro 11-core, 18GB, 512GB
F: $1100  ASUS ProArt P16, AMD Ryzen AI 9 HX 370 12-core 5.1GHz, 32GB 7500MT/s, iGPU 890M, 1TB, NVIDIA RTX 4060 8GB 100W
G: $600   Lenovo Yoga Slim 7i Aura, Intel Core Ultra 7 256V 8-core 4.8GHz, 16GB 8533MT/s, iGPU Arc 140V
H: $900   HP Victus 15.6", Ryzen AI 7 350 8-core 5GHz, 48GB 5600MT/s, NVIDIA RTX 5060 8GB 105+W
I: $1000  ASUS ProArt PX13, AMD Ryzen AI 9 HX 370 12-core 5.1GHz, 32GB 7500MT/s, NVIDIA RTX 4050 6GB 95W

============
Llama 3.1 8B
============

Machine (runtime)        tok/sec   tokens   time to first token
A                           5.93     1466   5.08s
B                          41.54     1238   0.28s
C                          18.68     1493   0.39s
D                          35.01     1472   0.27s
E                          24.04     1530   0.40s
E (MLX 4-bit)              28.62     1321   1.19s
F (CUDA)                   41.39        -   -
F (CPU)                    13.38        -   -
F (Vulkan, battery)        15           -   -
G (Vulkan, battery)        14           -   -
H                          41.93        -   -
I (CUDA, Windows perf)     33.94        -   -
I (CUDA, ASUS perf)        34.65        -   -

============
Llama 3.2 3B
============

Machine   tok/sec   tokens   time to first token
A            9.92     1616   0.88s
B           53.18     1508   0.46s
C           25.61     1336   0.13s
D           45.39     1482   0.41s

=====================
Mistral Nemo 2407 12B
=====================

Tested on A, B, C, D; only two results were recorded (machine not noted):

tok/sec   tokens   time to first token
  11.45      963   0.46s
  10.35      861   0.43s
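
Since each machine has a price, one rough way to read the Llama 3.1 8B numbers is speed per dollar. The following is a minimal Python sketch of that arithmetic, not part of the original test. It copies the prices and tok/sec figures from the lists above and assumes that, for machines with several runs (E, F, I), the fastest listed run is the one to compare. The names `PRICES`, `LLAMA_8B`, and `tok_per_sec_per_100_usd` are made up for illustration.

```python
# Prices in USD, copied from the machine list.
PRICES = {"A": 400, "B": 700, "C": 600, "D": 800, "E": 1500,
          "F": 1100, "G": 600, "H": 900, "I": 1000}

# Llama 3.1 8B generation speed in tok/sec, copied from the results.
# Assumption: where a machine has several runs, the fastest one is used
# (E = MLX 4-bit, F = CUDA, I = CUDA with the ASUS performance profile).
LLAMA_8B = {"A": 5.93, "B": 41.54, "C": 18.68, "D": 35.01, "E": 28.62,
            "F": 41.39, "G": 14.0, "H": 41.93, "I": 34.65}


def tok_per_sec_per_100_usd(prices, speeds):
    """Return tok/sec per $100 of purchase price for each machine."""
    return {m: round(speeds[m] / prices[m] * 100, 2) for m in speeds}


scores = tok_per_sec_per_100_usd(PRICES, LLAMA_8B)

# Sort machines from best to worst value.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for machine, score in ranked:
    print(f"{machine}: {score:.2f} tok/sec per $100")
```

This only ranks raw generation speed against price; it ignores time to first token, power draw, and whether the run was on battery.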