Model Performance Data for Intel® Gaudi® 2 AI Accelerators
These performance numbers are measured using Intel Gaudi software (SynapseAI*) release version 1.19.0, unless otherwise noted.
Note: All models, for both training and inference, use the PyTorch* 2.5.1 framework. Any other framework used for training or inference is noted for each model.
Large Language Model (LLM) Throughput with the Intel Gaudi 2 Accelerator
Max Throughput [tokens/sec; higher is better]. A worked example of the metric follows the table.

Model | # HPU | Precision | Input Length | Output Length | Batch Size | Throughput (tokens/sec) |
---|---|---|---|---|---|---|
LLaMA 2 7b | 1 | fp8 | 128 | 128 | 1230 | 13583 |
LLaMA 2 7b | 1 | fp8 | 128 | 2048 | 163 | 4802 |
LLaMA 2 7b | 1 | fp8 | 2048 | 128 | 94 | 1447 |
LLaMA 2 7b | 1 | fp8 | 2048 | 2048 | 81 | 1956 |
LLaMA 2 70b | 2 | fp8 | 128 | 128 | 1750 | 2943 |
LLaMA 2 70b | 2 | fp8 | 128 | 2048 | 327 | 3312 |
LLaMA 2 70b | 2 | fp8 | 2048 | 128 | 95 | 316 |
LLaMA 2 70b | 2 | fp8 | 2048 | 2048 | 159 | 1755 |
LLaMA 3.1 8B | 1 | fp8 | 128 | 128 | 2816 | 19875 |
LLaMA 3.1 8B | 1 | fp8 | 128 | 2048 | 512 | 14784 |
LLaMA 3.1 8B | 1 | fp8 | 2048 | 128 | 179 | 2011 |
LLaMA 3.1 8B | 1 | fp8 | 2048 | 2048 | 256 | 6083 |
LLaMA 3.1 70B | 2 | fp8 | 128 | 128 | 1792 | 2895 |
LLaMA 3.1 70B | 2 | fp8 | 128 | 2048 | 256 | 3816 |
LLaMA 3.1 70B | 2 | fp8 | 2048 | 128 | 142 | 316 |
LLaMA 3.1 70B | 2 | fp8 | 2048 | 2048 | 139 | 1648 |
LLaMA 3.1 70B | 8 | fp8 | 128 | 128 | 4000 | 10012 |
LLaMA 3.1 70B | 8 | fp8 | 128 | 2048 | 600 | 12538 |
LLaMA 3.1 70B | 8 | fp8 | 2048 | 128 | 383 | 1083 |
LLaMA 3.1 70B | 8 | fp8 | 2048 | 2048 | 476 | 6623 |
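The throughput column reports generated tokens per second aggregated over the whole batch. As a quick sanity check (assuming the common definition of throughput as new tokens generated divided by generation wall time, which may differ in detail from the exact measurement methodology), the table values imply per-batch latencies like this:

```python
# Back-of-the-envelope check of the throughput metric, assuming
# throughput = (batch size * output length) / generation wall time.
# Values are taken from the LLaMA 2 7b, 128-in / 128-out row above.
batch_size = 1230      # sequences generated concurrently
output_length = 128    # new tokens generated per sequence
reported_tps = 13583   # reported throughput, tokens/sec

total_new_tokens = batch_size * output_length        # 157,440 new tokens per batch
implied_latency_s = total_new_tokens / reported_tps  # ~11.6 s to complete the batch
print(f"{total_new_tokens} tokens / {reported_tps} tok/s ~= {implied_latency_s:.1f} s per batch")
```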
System Configuration
Intel Gaudi 2 Platform
System: HLS-Gaudi2 with eight Intel Gaudi 2 HL-225H mezzanine cards, two Intel Xeon Platinum 8380 CPUs at 2.30 GHz, and 1 TB of system memory
Common Software
- Ubuntu* v22.04
- Intel Gaudi software v1.19.0 (see full software support details)
- PyTorch: models run with PyTorch v2.5.1 (a quick environment check is sketched below)
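To confirm that a comparable environment exposes the Gaudi devices to PyTorch, a check along the following lines can help. This is a minimal sketch; it assumes the habana_frameworks PyTorch bridge that ships with the Intel Gaudi software is installed, and it is not the benchmarking harness used for the table above.

```python
import torch
import habana_frameworks.torch.hpu as hthpu  # PyTorch bridge installed with the Intel Gaudi software

print("PyTorch version :", torch.__version__)      # expected to match the bundled PyTorch build
print("HPU available   :", hthpu.is_available())   # True once the driver and bridge are set up
print("HPU device count:", hthpu.device_count())   # 8 on an HLS-Gaudi2 system like the one above
```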