Indoor positioning has traditionally relied on pedestrian dead reckoning and wireless-signal methods, which are prone to error accumulation and signal interference, so positioning accuracy still needs to be improved. With the development of neural networks in recent years, many researchers have successfully applied Convolutional Neural Networks (CNNs) to the indoor positioning problem. This technique determines the position at which an image was taken mainly by matching image features. CNNs face the same challenge as other supervised learning methods: if “clean” data cannot be collected, the trained model will not achieve good positioning accuracy. For a CNN used for indoor positioning, if a person passes through the scene while the training data are being collected, that person appears at different positions in the images, and the model may mistake those images for the same location. To solve this problem, we propose a data pre-processing method that improves the accuracy of CNN-based indoor positioning. In this method, the moving objects recognized in the training and testing data are modified in different ways. We implement the pre-processing method with Mask R-CNN and YOLO, and then integrate it into PoseNet, a well-known CNN architecture for indoor positioning. Experiments on real data show that removing moving objects effectively improves indoor positioning accuracy by about 46%.
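The pre-processing idea can be illustrated with a minimal sketch. The text above does not state how the recognized moving objects are modified, so the sketch assumes the simplest option: filling the bounding box of each detected moving object with a constant value. The function `mask_moving_objects`, the set `MOVING_CLASSES`, and the `(label, box)` detection format are hypothetical names chosen for illustration; in practice the detections would come from a detector such as YOLO or Mask R-CNN, which is not invoked here.

```python
import numpy as np

# Assumption: only these detector labels count as "moving objects".
MOVING_CLASSES = {"person"}


def mask_moving_objects(image, detections, fill_value=0):
    """Return a copy of `image` with moving-object boxes filled in.

    image      : H x W x C array.
    detections : iterable of (label, (x1, y1, x2, y2)) pairs in pixel
                 coordinates, with x2 and y2 exclusive. This format is
                 an assumption, not the output of any specific library.
    """
    cleaned = image.copy()
    height, width = image.shape[:2]
    for label, (x1, y1, x2, y2) in detections:
        if label not in MOVING_CLASSES:
            continue  # static objects are kept as positioning cues
        # Clip the box to the image so out-of-range boxes are harmless.
        x1c, x2c = max(0, x1), max(0, min(width, x2))
        y1c, y2c = max(0, y1), max(0, min(height, y2))
        cleaned[y1c:y2c, x1c:x2c] = fill_value
    return cleaned


# Toy example: a white 4 x 6 RGB frame with one person and one chair.
frame = np.full((4, 6, 3), 255, dtype=np.uint8)
detections = [("person", (1, 0, 3, 2)), ("chair", (4, 2, 6, 4))]
cleaned = mask_moving_objects(frame, detections)
```

In this toy example only the region labeled `person` is blanked, while the `chair` region and the original `frame` are left unchanged. With an instance-segmentation model, the same idea could be applied per pixel using the predicted mask instead of the bounding box.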