MLink: Linking Black-Box Models for Collaborative Multi-Model Inference
Mu Yuan, Lan Zhang, Xiang-Yang Li
[AAAI-22] Main Track
Abstract:
The cost-efficiency of model inference is critical to real-world machine learning (ML) applications, especially for delay-sensitive tasks and resource-limited devices. A typical dilemma is: to provide complex intelligent services (e.g., smart city), we need inference results from multiple ML models, but the cost budget (e.g., GPU memory) is not enough to run all of them. In this work, we dig into the underlying relationships among black-box ML models and propose a novel learning task: model linking. Model linking aims to bridge the knowledge of different black-box models by learning mappings between their output spaces. Based on model links, we develop a scheduling algorithm, named MLink. Through collaborative multi-model inference enabled by model links, MLink can significantly improve the accuracy of obtained inference results under the cost budget. We conducted comprehensive evaluations on a multi-modal dataset with seven ML models and on two real-world video analytics systems with six ML models and 3,264 hours of video. The experimental results show that our proposed model links can be effectively built among various black-box models. Under a GPU memory budget, MLink saves 66.7% of inference computations while preserving 94% inference accuracy, outperforming multi-task learning, a deep reinforcement learning-based scheduler, and frame-filtering baselines.
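The core idea of a model link, learning a mapping from one black-box model's output space to another's, can be illustrated with a minimal sketch. This is not the paper's actual architecture or training procedure; it assumes a simple linear link fit by gradient descent on paired outputs collected from both models, with synthetic data standing in for the real model outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Paired outputs collected by running both black-box models on the same
# inputs (synthetic here): source model emits 5-dim vectors, target 3-dim.
src_out = rng.normal(size=(200, 5))
true_map = rng.normal(size=(5, 3))
tgt_out = src_out @ true_map + 0.01 * rng.normal(size=(200, 3))

# Fit a linear model link W by minimizing mean squared error via
# full-batch gradient descent (a stand-in for the paper's link training).
W = np.zeros((5, 3))
lr = 0.05
for _ in range(500):
    pred = src_out @ W
    grad = src_out.T @ (pred - tgt_out) / len(src_out)
    W -= lr * grad

# At inference time, running only the source model plus this cheap link
# approximates the target model's output, saving the target's compute.
mse = float(np.mean((src_out @ W - tgt_out) ** 2))
print(f"link MSE: {mse:.4f}")
```

Under a GPU-memory budget, a scheduler can then load only a subset of the models and serve the rest through such links, which is the trade-off the abstract's 66.7%-compute / 94%-accuracy numbers quantify.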
Sessions where this paper appears
- Poster Session 4: Fri, February 25, 5:00 PM - 6:45 PM (+00:00), Red 3
- Poster Session 8: Sun, February 27, 12:45 AM - 2:30 AM (+00:00), Red 3
- Oral Session 4: Fri, February 25, 6:45 PM - 8:00 PM (+00:00), Red 3