Abstract: Edge computing has been widely used to deploy and service deep learning applications. Equipped with GPUs, edge nodes can process concurrent incoming inference requests of the deep learning ...