Small precision Floating Point support in eProcessor in a nutshell
The eProcessor vector accelerator design supports the IEEE754 double (64-bit) and single (32-bit) precision formats (FP64/32), the 16-bit “brain floating-point” format (BF16), and two different 8-bit format, 1-4-3 and 1-5-2, referred to as “Hybrid 8-bit Floating Point” (HFP8). While the wider formats are appropriate for the traditional scientific applications, small precision floating point formats are gaining interest for AI computing […]