Accurate short-term load forecasting greatly benefits demand-side response and power dispatching. To improve the accuracy of short-term power load prediction, the original power load signal is decomposed using Variational Mode Decomposition (VMD). The decomposed sub-signals are combined with the original signal to form a new data set, which is used to train a neural network. The sub-signals expose detailed features within the power load that are difficult for a neural network to learn from the raw signal alone. With the VMD components as additional inputs, the network can learn richer information, which is more effective than the superposition prediction method, in which each sub-signal is predicted separately and the predictions are summed. The prediction model uses an architecture based on attention and long short-term memory (Attention-LSTM). The attention mechanism enables the most important decomposed information to be fully learned. Experiments verify the effectiveness of the method.
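
The data-set construction described above can be illustrated with a minimal sketch. It assumes the sub-signals have already been produced by some VMD implementation and shows only how they might be stacked with the original load into sliding windows for one-step-ahead prediction. The function name `build_windows`, the `lookback` parameter, and the two sinusoids standing in for decomposed modes are hypothetical and are not taken from the paper.

```python
import math

def build_windows(load, modes, lookback):
    """Stack the original load with its decomposed modes into per-step
    feature vectors, then cut sliding windows for one-step-ahead prediction.
    Hypothetical helper: the paper does not specify this interface."""
    n = len(load)
    if any(len(m) != n for m in modes):
        raise ValueError("every mode must have the same length as the load series")
    if not 0 < lookback < n:
        raise ValueError("lookback must be positive and shorter than the series")
    # One feature vector per time step: [original load, mode 1, ..., mode K]
    features = [[load[t]] + [m[t] for m in modes] for t in range(n)]
    X, y = [], []
    for t in range(lookback, n):
        X.append(features[t - lookback:t])  # window of lookback steps, K + 1 features each
        y.append(load[t])                   # target: next value of the original load
    return X, y

# Toy stand-in for decomposition output: two sinusoids (NOT real VMD modes).
n = 48
slow = [math.sin(2 * math.pi * t / 24) for t in range(n)]
fast = [0.3 * math.sin(2 * math.pi * t / 6) for t in range(n)]
load = [s + f for s, f in zip(slow, fast)]

X, y = build_windows(load, [slow, fast], lookback=12)
print(len(X), len(X[0]), len(X[0][0]))  # 36 windows, 12 steps each, 3 features per step
```

Each window in `X` has one row per time step and one column per signal, the layout a sequence model such as an LSTM typically expects. Because the original load and all modes enter as joint inputs, a single model is trained, in contrast to fitting one model per sub-signal and summing the outputs.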