Paper Notes on DeepMon

Title: DeepMon: Mobile GPU-based Deep Learning Framework for Continuous Vision Applications

Topic: Reduce deep learning applications’ latency on mobile devices.

Idea: Use the mobile device's GPU; convert the deep learning model; store metadata in host memory and parameters in device memory; cache the outputs of the first N conv layers and refresh them based on a histogram-similarity policy; decompose each conv layer's parameters from one tensor into three smaller tensors; optimize conv operations via unfolding and half-precision floats; and provide multiple kernels to fit different mobile GPUs.
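Why splitting one conv tensor into three smaller tensors helps can be seen from a quick weight count. The sketch below is only illustrative: it assumes a Tucker-2-style layout (1 x 1 conv, then k x k conv, then 1 x 1 conv), and the function names, layer shape, and ranks (r_in, r_out) are hypothetical, not taken from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def decomposed_params(c_in, c_out, k, r_in, r_out):
    """Weights after splitting one conv tensor into three smaller ones.

    Assumed Tucker-2-style layout (an assumption, not confirmed by these notes):
      1) 1 x 1 conv: c_in  -> r_in
      2) k x k conv: r_in  -> r_out
      3) 1 x 1 conv: r_out -> c_out
    """
    return c_in * r_in + r_in * r_out * k * k + r_out * c_out


# Hypothetical layer: 256 -> 256 channels, 3 x 3 kernel, ranks of 64.
original = conv_params(256, 256, 3)
small = decomposed_params(256, 256, 3, r_in=64, r_out=64)
print(original, small, round(original / small, 2))  # 589824 69632 8.47
```

Fewer weights means fewer multiply-adds per output pixel, which is where the latency saving would come from; the ranks control how much accuracy is traded for speed.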

Contribution: DeepMon, a toolkit that runs deep learning applications on commodity mobile devices with low latency, implemented with OpenCL and Vulkan.

Intellectual merit: Proposes optimization methods that reduce latency without a significant loss in performance, without offloading computation to a more powerful server in a client-server setup, and while saving energy.

Strengths: Detailed observations and experiments at every step. For example, when choosing the number of bins for the caching step, the authors run many experiments and plot the results to see how performance changes.
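To make the bin-count trade-off concrete, here is a minimal sketch of a histogram-similarity check for reusing a cached conv output. It assumes histogram intersection as the similarity metric and a fixed threshold; the function names, the metric, the threshold, and the toy pixel blocks are all hypothetical and may differ from what the paper actually uses.

```python
def histogram(pixels, bins, max_value=256):
    """Normalized intensity histogram of a flat list of pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels) or 1  # avoid division by zero on an empty block
    return [c / total for c in counts]


def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def can_reuse_cache(prev_block, cur_block, bins, threshold):
    """True if the cached output for this block may be reused (assumed policy)."""
    score = similarity(histogram(prev_block, bins), histogram(cur_block, bins))
    return score >= threshold


# Toy blocks: the current frame is slightly brighter than the previous one.
prev_block = [10] * 64
cur_block = [20] * 64

print(can_reuse_cache(prev_block, cur_block, bins=4, threshold=0.9))   # True
print(can_reuse_cache(prev_block, cur_block, bins=32, threshold=0.9))  # False
```

With few bins the small brightness change falls in the same bin and the cache is reused; with many bins the same change looks like a different block and the layers are recomputed. That sensitivity is the kind of effect a bin-count sweep would expose.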

Weakness: Not enough guidance for deep learning developers who want to build mobile apps: the paper only covers training/testing VGG and YOLO on different datasets.