AI Hardware
AI chips, Edge AI, Benchmarks, Frameworks
AI chips
Top10 AI Chip Makers
Tesla
AI5
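Tesla's next-generation AI5 / HW5 FSD chip is claimed to deliver 2,000 to 2,500 TOPS (trillions of operations per second), up to five times the current HW4 chip (about 500 TOPS), which would let it run more complex unsupervised FSD algorithms. For comparison, Nvidia's RTX 5080 and RTX 5090 (about US$1,500 and US$3,000, respectively) are rated at 1,800 TOPS and 3,400 TOPS.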
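The ratios in this entry can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted here; the variable and function names are illustrative. TOPS ratings from different vendors are usually quoted at different precisions, so treat the comparison as rough.

```python
# Figures as quoted in this entry; all names are illustrative only.
HW4_TOPS = 500
AI5_TOPS_RANGE = (2000, 2500)

# name -> (rated TOPS, approximate price in USD)
GPUS = {
    "RTX 5080": (1800, 1500),
    "RTX 5090": (3400, 3000),
}


def speedup_range(new_range, baseline):
    """Return the (low, high) speedup of new_range over baseline."""
    low, high = new_range
    return low / baseline, high / baseline


def tops_per_dollar(tops, price_usd):
    """Rated TOPS per US dollar of purchase price."""
    return tops / price_usd


low, high = speedup_range(AI5_TOPS_RANGE, HW4_TOPS)
print(f"AI5 vs HW4: {low:.1f}x to {high:.1f}x")
for name, (tops, price) in GPUS.items():
    print(f"{name}: {tops_per_dollar(tops, price):.2f} TOPS per USD")
```

On the quoted numbers, the AI5 claim works out to 4.0x to 5.0x over HW4, and the two GeForce cards come out at about 1.20 and 1.13 TOPS per dollar.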
Tesla Hardware 4 (AI4) – Full Details and Latest News
Hardware 3 (AI3) – FSD Computer
Tesla’s Hardware & E/E Architecture
Nvidia
Nvidia Draws GPU System Roadmap Out To 2028
Analysis of NVIDIA’s Latest Hardware: B100/B200/GH200/NVL72/SuperPod
DGX-Spark
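Nvidia GB10 Grace Blackwell Superchip system-on-chip. The GB10 includes an Nvidia Blackwell GPU with fifth-generation Tensor Cores and FP4 precision support, giving the DGX Spark 128GB of memory and up to 1,000 TOPS.
The ASUS Ascent GX10 is built on the NVIDIA® GB10 Grace Blackwell Superchip, delivers up to 1 petaFLOP of AI compute, and comes with 128GB of memory, enough to support fine-tuning of models with 200 billion (200B) parameters.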
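A back-of-the-envelope check shows why FP4 support matters for the 200B-parameter figure. The sketch below counts weights only and assumes 2, 1, and 0.5 bytes per parameter for FP16, FP8, and FP4. The function and constant names are illustrative, and real fine-tuning also needs memory for gradients, optimizer state, and activations, which this ignores.

```python
# Assumed storage cost per parameter, in bytes (weights only).
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}

MEMORY_GB = 128   # memory quoted for the GB10 systems
PARAMS = 200e9    # 200B parameters


def weight_gb(num_params, precision):
    """Weights-only size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    size = weight_gb(PARAMS, precision)
    verdict = "fits" if size <= MEMORY_GB else "does not fit"
    print(f"{precision}: {size:.0f} GB of weights, {verdict} in {MEMORY_GB} GB")
```

Under these assumptions only the FP4 weights (100 GB) fit within 128 GB; FP8 (200 GB) and FP16 (400 GB) do not.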
Geforce RTX 50 series
- Giga 5090OC 32GB
- MSI 5080OC 16GB
- Giga 5070Ti 16GB
- Zotac 5060Ti 16GB
AMD Instinct™ MI355X GPUs
Built on the 4th Gen AMD CDNA™ architecture, AMD Instinct™ MI355X GPUs deliver leadership AI and HPC performance enabling high density infrastructures with 288GB HBM3E memory, 8TB/s bandwidth, and expanded FP6 and FP4 datatype support.
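The capacity and bandwidth figures can be combined into a rough memory-bound estimate. The sketch below assumes a simplified model in which generating one token reads every weight once from HBM, and it ignores batching, KV-cache traffic, and compute limits. The 100 GB model size is a hypothetical value chosen for illustration, not a figure from AMD.

```python
HBM_CAPACITY_GB = 288     # quoted HBM3E capacity
HBM_BANDWIDTH_TB_S = 8    # quoted memory bandwidth


def full_sweep_ms(capacity_gb, bandwidth_tb_s):
    """Time to read the whole memory once, in milliseconds.

    GB divided by TB/s gives thousandths of a second, i.e. milliseconds.
    """
    return capacity_gb / bandwidth_tb_s


def max_decode_tokens_per_s(weights_gb, bandwidth_tb_s):
    """Upper bound on tokens/s if each token reads all weights once."""
    return bandwidth_tb_s * 1000 / weights_gb


print(f"Full memory sweep: {full_sweep_ms(HBM_CAPACITY_GB, HBM_BANDWIDTH_TB_S):.0f} ms")

hypothetical_weights_gb = 100  # assumption for illustration only
bound = max_decode_tokens_per_s(hypothetical_weights_gb, HBM_BANDWIDTH_TB_S)
print(f"Bandwidth-bound ceiling for {hypothetical_weights_gb} GB of weights: {bound:.0f} tokens/s")
```

This is a ceiling for a single unbatched stream under the stated assumptions, not a throughput prediction.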
AMD 7 GHz Zen 6 Clock Speed Target “Confirmed” – Leaker Claims
Intel Gaudi 3
Intel® Gaudi® 3 AI Accelerators on IBM Cloud
Groq LPU
The Groq LPU™ Inference Engine
Paper: A Software-defined Tensor Streaming Multiprocessor for Large-scale Machine Learning
Google TPU v6e
Google Cloud launches a new Hypercomputer built on its sixth-generation TPU
Taiwan AI Cloud - 台智雲
Taiwan AI Cloud runs on the Taiwania 2 supercomputer, which consists of 2,016 NVIDIA Tesla V100 32GB GPUs and delivers 9 PFLOPS of performance.
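Dividing the system figure by the GPU count gives a sense of the delivered performance per GPU. This sketch uses only the two numbers quoted here; the function name is illustrative, and the result is a system-level average rather than a per-device specification.

```python
SYSTEM_PFLOPS = 9
GPU_COUNT = 2016


def tflops_per_gpu(system_pflops, gpu_count):
    """Average delivered TFLOPS per GPU (1 PFLOPS = 1000 TFLOPS)."""
    return system_pflops * 1000 / gpu_count


print(f"{tflops_per_gpu(SYSTEM_PFLOPS, GPU_COUNT):.2f} TFLOPS per GPU on average")
```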
Edge AI
Nvidia Jetson
Jetson Thor
Introducing NVIDIA Jetson Thor, the Ultimate Platform for Physical AI
NVIDIA Jetson Orin Nano Developer Kit Gets a “Super” Boost
NVIDIA® Jetson Orin Nano™ Super starter kit bundle
MediaTek
Kinara
Ara-2
The Kinara Ara-2 AI processor is a 40 TOPS edge AI accelerator aimed at the compute demands of generative AI and transformer-based models, with cost-effectiveness as its main selling point.
Kneron
KNEO300 EdgeGPT
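KL730 AI SoC
- Quad-core ARM® Cortex™-A55 CPU.
- Built-in DSP that accelerates AI model post-processing and audio processing.
- Supports Linux and RTOS; built on a TSMC 12 nm process.
- Up to 4K@60fps resolution, a seamless RGB Bayer interface to mainstream sensors, and up to 4 image input channels.
- Up to 3.6 eTOPS @ INT8 / 7.2 eTOPS @ INT4.
- Supports the Caffe, TensorFlow, TensorFlow Lite, PyTorch, Keras, and ONNX frameworks.
- Compatible with CNN, Transformer, RNN, hybrid, and other AI model types, with higher processing efficiency and accuracy.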
Realtek
AMB82-MINI
- MCU
  - Part Number: RTL8735B
  - 32-bit Arm v8M, up to 500MHz
- MEMORY
  - 768KB ROM
  - 512KB RAM
  - 16MB Flash
  - Supports MCM embedded DDR2/DDR3L memory up to 128MB
- KEY FEATURES
  - Integrated 802.11 a/b/g/n Wi-Fi, 2.4GHz/5GHz
  - Bluetooth Low Energy (BLE) 5.1
  - Integrated Intelligent Engine @ 0.4 TOPS
Nuvoton M55M1
RPi 5 + Hailo-8L
Hailo-8L = 13 TOPS
Broadcom BCM2712 2.4GHz quad-core 64-bit Arm Cortex-A76 CPU, with Cryptographic Extension, 512KB per-core L2 caches, and a 2MB shared L3 cache
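To put the edge accelerators in this section side by side, the sketch below ranks them by the TOPS ratings quoted in their entries. Function names are illustrative. The ratings are vendor figures at differing precisions (Kneron quotes "effective" eTOPS at INT8), so the ranking is only indicative.

```python
# TOPS ratings as quoted in this section.
EDGE_TOPS = {
    "Kinara Ara-2": 40.0,
    "Hailo-8L (on RPi 5)": 13.0,
    "Kneron KL730 (INT8, eTOPS)": 3.6,
    "Realtek AMB82-MINI": 0.4,
}


def rank_by_tops(ratings):
    """Return (name, tops) pairs sorted from highest to lowest rating."""
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


def relative_to(ratings, baseline):
    """Express every rating as a multiple of the baseline device's rating."""
    base = ratings[baseline]
    return {name: tops / base for name, tops in ratings.items()}


multiples = relative_to(EDGE_TOPS, "Realtek AMB82-MINI")
for name, tops in rank_by_tops(EDGE_TOPS):
    print(f"{name:30s} {tops:5.1f} TOPS  ({multiples[name]:.0f}x AMB82-MINI)")
```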
Benchmarks
MLPerf
MLPerf Inference v5.0 measures inference performance on 11 different benchmarks, including several large language models (LLMs), text-to-image generative AI, recommendation, computer vision, biomedical image segmentation, and graph neural network (GNN).
MLPerf Training v5.0 measures the time to train on seven different benchmarks: LLM pretraining, LLM fine-tuning, text-to-image, GNN, object detection, recommendation, and natural language processing.
NVIDIA Blackwell Supercharges AI Training
NVIDIA Blackwell Delivers up to 2.6x Higher Performance in MLPerf Training v5.0
NVIDIA’s MLPerf Benchmark Results
Frameworks
PyTorch
TensorFlow
Keras 3.0
MLX
MLX is an array framework for machine learning on Apple silicon, brought to you by Apple machine learning research.
MLX documentation
TinyML
TensorFlow.js
MediaPipe
This site was last updated September 17, 2025.