LIAO Bin, ZHANG Tao, YU Jiong, et al. Optimization of collaborative filtering algorithm based on DAG Spark scheduling[J]. Acta Scientiarum Naturalium Universitatis SunYatseni, 2017,56(3):46-56.
The scale of big data poses great challenges for data storage, management, and analysis, and efficient, low-cost big data processing technology has become a research hotspot in academia and industry. To improve the efficiency of collaborative filtering algorithms, the implementation of the algorithm under the MapReduce architecture is decomposed in order to analyze its defects. Because Spark is well suited to iterative and interactive tasks, this paper presents methods for improving execution efficiency by migrating the algorithm from the MapReduce platform to the Spark platform. The implementation flow of the algorithm in Spark is designed, and efficiency is further improved through parameter tuning and memory optimization. Experimental results show that, based on Spark DAG scheduling, the algorithm reduces HDFS I/O operations by more than 65%, and its execution efficiency and energy efficiency increase by nearly 200% and 50%, respectively.
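
The abstract does not give the algorithm's stages, so the following is only a minimal sketch under stated assumptions. It uses the common co-occurrence formulation of item-based collaborative filtering, which may differ from the authors' implementation, and the toy ratings and function names are hypothetical. It shows the kind of multi-stage pipeline involved: chained MapReduce jobs typically write each stage's output to HDFS before the next job reads it, whereas a DAG scheduler such as Spark's can pass intermediate results between stages in memory, as the plain Python values do here.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy ratings: (user, item, rating). Not data from the paper.
RATINGS = [
    ("u1", "A", 5.0), ("u1", "B", 3.0),
    ("u2", "A", 4.0), ("u2", "B", 2.0), ("u2", "C", 5.0),
    ("u3", "B", 4.0), ("u3", "C", 1.0),
]


def user_vectors(ratings):
    """Stage 1: group ratings by user into {user: {item: rating}}."""
    vectors = defaultdict(dict)
    for user, item, rating in ratings:
        vectors[user][item] = rating
    return dict(vectors)


def cooccurrence(vectors):
    """Stage 2: count, for each ordered item pair, the users who rated both."""
    counts = defaultdict(int)
    for items in vectors.values():
        for a, b in permutations(sorted(items), 2):
            counts[(a, b)] += 1
    return dict(counts)


def recommend(user, vectors, counts, top_n=3):
    """Stage 3: score unseen items as sum(co-occurrence count * user's rating)."""
    seen = vectors[user]
    scores = defaultdict(float)
    for (a, b), n in counts.items():
        if a in seen and b not in seen:
            scores[b] += n * seen[a]
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    vectors = user_vectors(RATINGS)   # intermediate result stays in memory
    counts = cooccurrence(vectors)    # as does this one
    print(recommend("u1", vectors, counts))
```

In this sketch the outputs of stages 1 and 2 are reused directly by stage 3; in a job-per-stage design each of those hand-offs would be a write to and a read from distributed storage.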