Yifan Zuo, Yongxiang Xia
Feature dimensionality reduction has always played an important role in data mining. This paper presents a comparative study of feature dimensionality reduction techniques and proposes a new feature selection method based on an improved partial priority clustering algorithm (IPPCA). First, the method for selecting cluster centers in the partial priority clustering algorithm (PPCA) is improved, which raises the algorithm's operating efficiency and expands the range of input data it can handle. Then, the clustering results are applied to feature selection, so that the selected key feature set retains the characteristics of the original dataset to a large extent. Finally, the proposed methods are evaluated in simulations on four different datasets. The experiments show that IPPCA is not only highly efficient but also improves the clustering performance. In comparison with the principal component analysis (PCA) and independent component analysis (ICA) algorithms, the key feature set obtained by the proposed feature selection algorithm reaches accuracy and precision above 90% in data classification prediction.
Cluster; Feature selection; Partial priority clustering algorithm; Improved partial priority clustering algorithm; Big data