Project Info


Accelerating Inference of Recurrent Neural Networks on GPUs

Bo Wu | bwu@mines.edu

Recurrent neural networks (RNNs) are widely used in domains such as document understanding, machine translation, and sequence prediction. Although GPUs have shown tremendous power in model training, RNN inference cannot efficiently utilize GPUs because of its limited parallelism. This project aims to design a scheduling framework that accelerates RNN inference on GPUs by more than 10 times.
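To see why inference parallelism is limited, consider the recurrence at the core of an RNN: each hidden state depends on the previous one, so the timestep loop is inherently sequential. The sketch below is a minimal, simplified illustration (scalar state and made-up weights, not the project's actual framework):

```python
import math

def rnn_step(h, x, w_hh=0.5, w_xh=1.0):
    # One recurrent step: h_t = tanh(w_hh * h_{t-1} + w_xh * x_t).
    # Scalar state for illustration; real cells use weight matrices.
    return math.tanh(w_hh * h + w_xh * x)

def rnn_infer(xs, h0=0.0):
    # Each hidden state depends on the previous one, so this loop
    # cannot be parallelized across timesteps -- the bottleneck
    # that makes single-sequence inference underutilize a GPU.
    h = h0
    for x in xs:
        h = rnn_step(h, x)
    return h
```

Because of this chain of dependences, the parallelism within one sequence is bounded by the (small) matrix sizes of each step; a scheduling framework can instead exploit parallelism across many concurrent inference requests.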

More Information

http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/

https://www.nvidia.com/content/tegra/embedded-systems/pdf/jetson_tx1_whitepaper.pdf

Grand Engineering Challenge: Not applicable

Student Preparation


Qualifications

The student should be familiar with CUDA programming.

Time Commitment

60 hours/month

Skills/Techniques Gained

GPU programming
Deep learning
Scheduling
Locality optimization

Mentoring Plan

The student will attend our weekly group meetings, and I will also hold individual meetings with the student.