| English abstract |
Existing approaches that prompt large pre-trained models for zero-shot text classification offer strong representation ability and scalability, but their commercial availability is relatively limited. Fine-tuning smaller models for zero-shot classification with class labels and existing datasets is comparatively straightforward, yet it may weaken the model's generalization ability. This paper introduces three methods to improve the accuracy and generalization of pre-trained models on zero-shot text classification tasks: 1) using pre-trained language models and structuring inputs into a standardized multiple-choice format; 2) constructing a text classification training dataset from Wikipedia text and fine-tuning the pre-trained model on it; and 3) proposing a zero-shot category mapping technique based on GloVe text similarity, in which Wikipedia categories replace the textual categories. Without using any labeled samples for fine-tuning, the proposed method achieves results comparable to those of the best models fine-tuned with labeled samples. |