Handwritten digit recognition is an active research field. Recognition systems face several challenges, including accuracy, speed, and the automatic extraction of complex handwriting features. This paper proposes a Stacking ensemble learning model based on fusion-optimized CNNs that can be used effectively for handwritten digit recognition. To better extract features from complex handwritten digit images and to maximize the reliability of the model, a Bagging strategy combining six CNNs is used for feature extraction for the first time, and an SVM is used for classification. This design not only improves the accuracy and stability of the model but also effectively avoids over-fitting. In addition, a fusion optimization algorithm based on Adam and SGD is proposed to address the tendency of a CNN to become trapped in a local optimum over a large number of training iterations. During training, the resulting network, ASCNN, both accelerates convergence in the early stage and reduces oscillation in the late stage. Extensive experiments on the well-known MNIST and USPS handwritten digit datasets demonstrate the effectiveness of the proposed model.
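The abstract does not state how Adam and SGD are fused. As a minimal sketch only, the example below assumes the simplest possible scheme: Adam updates for an initial phase, followed by plain SGD updates. It is not the authors' algorithm. The function name `adam_then_sgd`, the `switch_step` parameter, the learning rates, and the toy quadratic loss (standing in for a CNN training loss) are all hypothetical choices made for illustration.

```python
import math


def loss(w):
    # Toy ill-conditioned quadratic; a stand-in for a real training loss.
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def grad(w):
    # Analytic gradient of the toy loss above.
    return [w[0], 10.0 * w[1]]


def adam_then_sgd(w, steps=200, switch_step=100, adam_lr=0.1, sgd_lr=0.05,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Assumed fusion rule: Adam for `switch_step` updates, then plain SGD."""
    w = list(w)
    m = [0.0] * len(w)  # Adam first-moment estimates
    v = [0.0] * len(w)  # Adam second-moment estimates
    history = [loss(w)]
    for t in range(1, steps + 1):
        g = grad(w)
        if t <= switch_step:
            # Early phase: standard Adam update with bias correction.
            for i in range(len(w)):
                m[i] = beta1 * m[i] + (1 - beta1) * g[i]
                v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
                m_hat = m[i] / (1 - beta1 ** t)
                v_hat = v[i] / (1 - beta2 ** t)
                w[i] -= adam_lr * m_hat / (math.sqrt(v_hat) + eps)
        else:
            # Late phase: plain SGD with a fixed learning rate.
            for i in range(len(w)):
                w[i] -= sgd_lr * g[i]
        history.append(loss(w))
    return w, history


if __name__ == "__main__":
    final_w, history = adam_then_sgd([5.0, -3.0])
    print("initial loss:", history[0])
    print("loss at switch:", history[100])
    print("final loss:", history[-1])
```

On this toy problem the SGD phase decreases the loss monotonically, which mirrors the reduced late-stage oscillation the abstract describes. Whether a fixed switch point, a gradual blend, or some other rule is appropriate for a real CNN is a design choice this sketch does not settle.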