Artificial intelligence is the backbone of technologies such as Siri and Alexa, digital assistants that rely on deep learning to do their jobs. For the makers of these products – and of others that rely on artificial intelligence – training is often a time-consuming and expensive process. Despite that, Rice University scientists have found a way to train deep neural networks more cheaply and quickly with the help of CPUs.
Companies implementing deep learning usually rely on GPUs as acceleration hardware, but this is pricey – top-of-the-line GPU platforms cost around $100,000. Rice researchers have now created an algorithm called SLIDE (sub-linear deep learning engine) that can do the same job without the specialized acceleration hardware.
The team took a complex workload and used Google's TensorFlow software to feed it to both a top-of-the-line GPU and a 44-core Xeon-class CPU running SLIDE. The CPU completed the training in just one hour, compared with three and a half hours for the GPU. (No 44-core Xeon-class CPU exists, so the team was most likely referring to a 22-core, 44-thread CPU.)
SLIDE takes a fundamentally different approach to deep learning. GPU-based training leverages huge networks by studying vast amounts of data, often using millions or billions of neurons, with different neurons recognizing different types of information. But there is no need to train every neuron in every case; SLIDE picks only the neurons relevant to the learning at hand.
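SLIDE's published design reportedly selects those relevant neurons with locality-sensitive hashing: neurons whose weights are similar to the current input land in the same hash bucket, so only that bucket needs to be activated. Below is a minimal Python sketch of the idea using SimHash (random-hyperplane signatures). The sizes, the single hash table, and the `signature` helper are illustrative assumptions, not the actual SLIDE implementation, which maintains multiple tables and rebuilds them periodically during training.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n_neurons, n_bits = 16, 1000, 6

# Layer weights: one row per neuron.
W = rng.standard_normal((n_neurons, d))

# SimHash: random hyperplanes; a vector's signature is the sign pattern
# of its projections. Vectors with a large inner product tend to collide.
planes = rng.standard_normal((n_bits, d))

def signature(v):
    return tuple((planes @ v) > 0)

# Hash table mapping signature -> ids of neurons in that bucket.
table = {}
for i, w in enumerate(W):
    table.setdefault(signature(w), []).append(i)

def sparse_forward(x):
    """Compute activations only for neurons whose bucket matches x."""
    active = table.get(signature(x), [])
    return {i: float(W[i] @ x) for i in active}

x = rng.standard_normal(d)
acts = sparse_forward(x)
print(f"active neurons: {len(acts)} of {n_neurons}")
```

With 6 signature bits there are 64 buckets, so a lookup touches only a small fraction of the 1,000 neurons instead of computing every dot product.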
Anshumali Shrivastava and Artificial Intelligence
Anshumali Shrivastava is an assistant professor at Rice's Brown School of Engineering. According to him, SLIDE has the advantage of being data-parallel. By data-parallel he means that given two instances of data to train on – say, an image of a cat and one of a bus – they would most likely activate different neurons, and SLIDE can update, or train on, the two independently. For CPUs, that is a much better use of parallelism.
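A toy sketch makes the point concrete: if the cat and bus samples activate disjoint neuron sets, their updates touch disjoint rows of the weight matrix and could run on separate CPU cores without conflicting. The hand-picked active sets and the `tanh` term (a stand-in for a real gradient) below are illustrative assumptions, not SLIDE's actual update rule.

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, d, lr = 8, 4, 0.1
W = rng.standard_normal((n_neurons, d))

# Two inputs (think: a "cat" image and a "bus" image) that happen to
# activate disjoint neuron sets -- chosen by hand for illustration.
samples = [
    (rng.standard_normal(d), [0, 2, 5]),   # sample A and its active neurons
    (rng.standard_normal(d), [1, 3, 6]),   # sample B: a disjoint set
]

def sparse_update(x, active):
    """Toy gradient step that touches only the active neurons' rows."""
    for i in active:
        err = np.tanh(W[i] @ x)            # stand-in for a real gradient
        W[i] -= lr * err * x

W0 = W.copy()
for x, active in samples:
    sparse_update(x, active)
```

Because the two loops write to non-overlapping rows of `W`, running them concurrently needs no locks; neurons outside both active sets (rows 4 and 7 here) are never touched at all.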
Nevertheless, the approach had its own challenges. The flip side is that, compared with GPU training, it requires significant memory, and main memory has a cache hierarchy. Shrivastava notes that if you aren't careful with it, you can run into a problem called cache thrashing, in which you get a lot of cache misses. When the team published its initial findings, Intel got in touch to collaborate on the problem. Shrivastava said Intel offered to help make the training even faster, and it turned out to be right: with Intel's help, the results improved by around 50 percent.
For those involved in artificial intelligence, SLIDE is a promising development. It is unlikely to replace GPU-based training any time soon, because it is far easier to add multiple GPUs to one system than multiple CPUs. Still, SLIDE has the potential to make artificial intelligence training more efficient and more accessible.