The SAKURA-II is an AI accelerator that combines high performance with low power consumption and is designed to run multi-billion-parameter generative AI models efficiently. It is particularly suited to real-time inference with little or no batching, which makes it a strong fit for edge deployments. With a typical power draw of 8 watts and a compact footprint, the SAKURA-II achieves more than twice the AI compute efficiency of comparable solutions.
The accelerator provides up to 4x more DRAM bandwidth than alternatives, which is crucial for complex vision tasks and large language models (LLMs). Software-enabled mixed precision delivers near-FP32 accuracy, while a sparse computing feature optimizes memory usage. The memory architecture supports up to 32GB of DRAM, providing ample capacity for intensive AI workloads.
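To make the capacity and bandwidth points concrete, the following is a rough back-of-the-envelope sketch, not a vendor tool. It estimates how much memory a model's weights occupy at common precisions and compares that against the 32GB figure quoted above. It also gives a simple upper bound on batch-1 token throughput under the assumption that every weight is read from DRAM once per generated token. The function names, the 7-billion-parameter model, and any bandwidth value passed in are illustrative assumptions; none of them are SAKURA-II specifications.

```python
# Bytes needed to store one parameter at common precisions (standard sizes).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

DRAM_CAPACITY_GB = 32.0  # maximum DRAM capacity quoted in the text


def weight_footprint_gb(num_params: float, precision: str) -> float:
    """Size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits_in_dram(num_params: float, precision: str,
                 capacity_gb: float = DRAM_CAPACITY_GB) -> bool:
    """Compare weights alone against capacity.

    Activations, KV cache, and runtime overhead are ignored, so a True
    result is a necessary condition, not a guarantee.
    """
    return weight_footprint_gb(num_params, precision) <= capacity_gb


def max_tokens_per_second(num_params: float, precision: str,
                          bandwidth_gb_s: float) -> float:
    """Upper bound on batch-1 decoding speed.

    Assumes every weight is read from DRAM once per generated token,
    so throughput cannot exceed bandwidth / weight size.
    """
    return bandwidth_gb_s / weight_footprint_gb(num_params, precision)


if __name__ == "__main__":
    params = 7e9  # hypothetical 7-billion-parameter model
    for precision in BYTES_PER_PARAM:
        size = weight_footprint_gb(params, precision)
        print(f"{precision}: {size:.1f} GB, fits: {fits_in_dram(params, precision)}")
```

Under this simplified model, halving the bytes per parameter both halves the footprint and doubles the bandwidth-bound token rate, which is why memory bandwidth and reduced precision matter so much for LLM inference at small batch sizes.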
The SAKURA-II's modular design allows it to ship in multiple form factors, serving applications such as smart cities, autonomous robotics, and smart manufacturing. Runtime-configurable data paths add further adaptability, letting the device optimize task scheduling and resource allocation dynamically. These features are powered by the Dynamic Neural Accelerator engine, which provides efficient computation and energy management.