Microsoft has unveiled a new hardware system called Project Brainwave that lets developers deploy machine learning models onto programmable silicon for high-speed artificial intelligence.
This could push performance beyond the levels of current central processing units: by dispensing with batching operations, the hardware can handle requests as they arrive.
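To illustrate why skipping batching lowers per-request latency, here is a minimal sketch (not Microsoft code; all numbers are hypothetical): a batched server waits to fill a batch before running inference, so early requests pay a queueing delay, while a batch-free server serves each request the moment it arrives.

```python
BATCH_SIZE = 8          # hypothetical batch size
INFER_MS = 1.0          # hypothetical per-inference cost in ms
ARRIVAL_GAP_MS = 5.0    # hypothetical gap between request arrivals in ms

def batched_latency(request_index: int) -> float:
    """Latency (ms) for a request that must wait for its batch to fill.

    The request at position `request_index` waits for the remaining
    (BATCH_SIZE - 1 - request_index) arrivals before inference runs.
    """
    wait_for_batch = (BATCH_SIZE - 1 - request_index) * ARRIVAL_GAP_MS
    return wait_for_batch + INFER_MS

def batch_free_latency() -> float:
    """Latency (ms) when each request is served as it arrives."""
    return INFER_MS

# The first request in a batch of 8 waits for 7 more arrivals:
print(batched_latency(0))    # 36.0 ms
print(batch_free_latency())  # 1.0 ms
```

The first request in each batch pays the worst queueing delay, which is exactly the cost a batch-free design avoids.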
Billed as a “major leap forward in both performance and flexibility for cloud-based serving of deep learning models,” the real-time AI system offers ultra-low latency, which benefits cloud infrastructures that must process live data streams.
Using the field-programmable gate arrays (FPGAs) Microsoft has been deploying, there is no software in the loop: DNNs are served as hardware microservices, with a model mapped to a pool of remote FPGAs.
The hardware will run across the FPGAs Microsoft has installed in its data centres, aiming to boost how quickly they can deliver new artificial intelligence services and features.
Project Brainwave ran on early Stratix 10 silicon, yet it still served a large GRU model five times larger than ResNet-50, with no batching, and achieved record-setting performance.
Each request was handled in less than one millisecond, showing “unprecedented levels of demonstrated real-time AI performance on extremely challenging models.”