We consider a multi-user, multi-server mobile edge computing (MEC) network with time-varying fading channels and formulate an offloading decision and resource allocation problem. To solve this mixed-integer non-convex problem, we propose two hybrid approaches in which each user equipment (UE) learns its offloading strategy with either a deep Q-network (opt-DQN) or a Q-table (opt-QL), while each computational access point (CAP) allocates the communication resources with an optimization algorithm. We also propose a pure-DQN method that learns both the offloading strategy and the resource allocation via Q-learning (QL). We analyze the convergence behavior of the QL-based algorithms from a game-theoretical perspective and demonstrate the performance of the proposed hybrid approaches for different network sizes. The simulation results show that the hybrid approaches reach lower costs than the baseline algorithms and the pure-DQN approach. Moreover, the performance of the pure-DQN approach degrades severely as the network size increases, whereas opt-DQN still performs best, followed by opt-QL. These observations demonstrate that a hybrid approach combining the advantages of QL and convex optimization is a promising design for a multi-user MEC network, in which complicated offloading and resource allocation strategies must be determined in a timely and accurate fashion.
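To make the Q-table idea behind opt-QL concrete, the following is a minimal sketch of a tabular Q-learning update for a single UE choosing between local computing and offloading. It is not the formulation used in this work: the single-state setting, the fixed per-action costs, the function names (`q_update`, `choose_action`, `train`), and the hyperparameters are all assumptions made for illustration, and the CAP-side resource optimization is omitted.

```python
import random


def q_update(q, state, action, cost, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step; the reward is taken as the negative cost."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (-cost + gamma * best_next - q[state][action])
    return q[state][action]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice. Hypothetical convention: action 0 means
    'compute locally', action k >= 1 means 'offload to CAP k'."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return max(range(len(q[state])), key=lambda a: q[state][a])


def train(costs, steps=2000, epsilon=0.1, seed=0):
    """Toy single-state environment (an assumption): costs[a] is the fixed
    cost the UE observes after taking action a."""
    rng = random.Random(seed)
    q = [[0.0] * len(costs)]  # one state, one Q-value per action
    for _ in range(steps):
        a = choose_action(q, 0, epsilon, rng)
        q_update(q, 0, a, costs[a], 0)
    return q


if __name__ == "__main__":
    # Hypothetical costs: local computing, offload to CAP 1, offload to CAP 2.
    q = train([5.0, 2.0, 3.5])
    best = max(range(len(q[0])), key=lambda a: q[0][a])
    print("greedy action:", best)
```

In a hybrid scheme of the kind described above, the cost fed to `q_update` would come from the CAP after it has optimized the communication resources for the chosen offloading decisions, rather than from a fixed list as in this toy.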