Any Algorithm – Any Host Processor
Fully Programmable Companion Chip


8 Cores – High-level programming throughout
Fully programmable
8 cores
High-level programming throughout
Very high performance
AI & GP processing selected automatically, layer by layer
Near-theoretical implementation efficiency
Specifications
1,600 Tflops (fp8 Tensorcore)
400 Tflops (fp16 Tensorcore)
50 Tflops (fp8)
25 Tflops (fp16)
12 Tflops (fp32)
16 GB on-chip memory
60 W (peak power consumption)

4 Cores – High-level programming throughout
Fully programmable
4 cores
High-level programming throughout
Very high performance
AI & GP processing selected automatically, layer by layer
Near-theoretical implementation efficiency
Specifications
800 Tflops (fp8 Tensorcore)
200 Tflops (fp16 Tensorcore)
25 Tflops (fp8)
12 Tflops (fp16)
6 Tflops (fp32)
16 GB on-chip memory
30 W (peak power consumption)

2 Cores – High-level programming throughout
Fully programmable
2 cores
High-level programming throughout
Very high performance
AI & GP processing selected automatically, layer by layer
Near-theoretical implementation efficiency
Specifications
400 Tflops (fp8 Tensorcore)
100 Tflops (fp16 Tensorcore)
12 Tflops (fp8)
6 Tflops (fp16)
3 Tflops (fp32)
16 GB on-chip memory
10 W (peak power consumption)
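
To compare the three variants, the peak figures listed above can be reduced to throughput per watt and throughput per core. The following is a minimal sketch: the dictionary layout, key names, and function names are illustrative assumptions, and only the numbers are taken from the specification lists. The results are ratios of peak throughput to peak power, so they indicate an upper bound from the datasheet rather than measured efficiency on a real workload.

```python
# Peak figures copied from the specification lists above.
# Throughput values are in Tflops, power in watts.
# The key names and table layout are illustrative, not part of any vendor API.
SPECS = {
    8: {"fp8_tensor": 1600, "fp16_tensor": 400, "fp8": 50, "fp16": 25, "fp32": 12, "peak_watts": 60},
    4: {"fp8_tensor": 800, "fp16_tensor": 200, "fp8": 25, "fp16": 12, "fp32": 6, "peak_watts": 30},
    2: {"fp8_tensor": 400, "fp16_tensor": 100, "fp8": 12, "fp16": 6, "fp32": 3, "peak_watts": 10},
}


def tflops_per_watt(cores: int, precision: str) -> float:
    """Peak Tflops divided by peak power for the given variant and precision."""
    spec = SPECS[cores]
    return spec[precision] / spec["peak_watts"]


def tflops_per_core(cores: int, precision: str) -> float:
    """Peak Tflops divided by the number of cores for the given variant."""
    return SPECS[cores][precision] / cores


if __name__ == "__main__":
    # Print one summary line per variant, largest first.
    for cores in sorted(SPECS, reverse=True):
        per_watt = tflops_per_watt(cores, "fp8_tensor")
        per_core = tflops_per_core(cores, "fp8_tensor")
        print(f"{cores} cores: {per_watt:.1f} Tflops/W, {per_core:.0f} Tflops/core (fp8 Tensorcore)")
```

With the listed figures, fp8 Tensorcore throughput works out to 200 Tflops per core for all three variants, while the 2-core variant has the highest peak throughput per watt (40 Tflops/W against roughly 26.7 Tflops/W for the 4-core and 8-core variants).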