Use case in production

Neural network optimization for inference deployments

Sarosh Quraishi

Machine Learning Specialist, Intel

Sarosh Quraishi is a machine learning specialist at Intel, where he works with customers on solving deployment challenges for deep learning models. He holds a Ph.D. from the Indian Institute of Technology and completed a postdoc in applied mathematics at TU Berlin, where he worked on parametric eigenvalue problems.

Session description

Deep learning models are growing larger by the day and becoming harder to deploy. We will cover the issues developers face when deploying deep learning models and how to address them with popular network compression techniques such as quantization, pruning, and knowledge distillation. These techniques shrink the model and improve performance metrics such as latency and throughput. We introduce the Intel® Neural Compressor and show how it works seamlessly with popular frameworks.
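To make the idea concrete, here is a minimal, self-contained sketch of post-training symmetric int8 quantization, the simplest of the techniques mentioned above. This illustrates only the underlying principle, not the Intel® Neural Compressor API; the function names and the single per-tensor scale are illustrative assumptions, and real tools add calibration, per-channel scales, and framework integration.

```python
# Illustrative sketch of symmetric int8 post-training quantization.
# Not the Intel Neural Compressor API; names here are hypothetical.

def quantize_int8(weights):
    """Map float weights to int8 values with one symmetric scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid scale 0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# int8 storage needs 1 byte per weight vs. 4 bytes for float32,
# roughly a 4x size reduction at a small accuracy cost.
```

The accuracy cost of such a scheme depends on the weight distribution, which is why production tools calibrate scales on representative data rather than using a single max-based scale as above.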