Digital Humanities Research (数字人文研究), 2025, Vol. 5, Issue 2: 45-58.

• 攻玉以石 •

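Recognition of Warring States Chu Bamboo and Silk Script Based on Deep Ensemble Learning

Chen Chao (陈超), undergraduate student (2021 cohort), School of Information Resource Management, Renmin University of China; Li Hezi (李赫孜), undergraduate student (2022 cohort), School of Liberal Arts, Renmin University of China; Yang Zekun (杨泽坤, corresponding author), lecturer, School of Information Resource Management, Renmin University of China, and researcher, Digital Humanities Research Institute, Renmin University of China.

  • Publication date: 2025-06-28 Release date: 2025-07-23
  • Funding:
    This article is an outcome of the Renmin University of China Scientific Research Fund project (supported by the Fundamental Research Funds for the Central Universities) "Research on Explainable Artificial Intelligence Models for Chinese Calligraphy Style Recognition" (Grant No. 23XNF040), and was also supported by the National Natural Science Foundation of China Young Scientists Fund project "Image Retrieval Based on Explainable Deep Learning in the Digital Library Context" (Grant No. 72204255).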

  • Online: 2025-06-28 Published: 2025-07-23

Abstract:

The decipherment of Chu-script bamboo and silk manuscripts has long been a central topic in Chinese paleography. Current work, however, relies mainly on manual analysis of individual character forms, and there have been few attempts to apply computer vision to character recognition across large collections of manuscript images. To address the difficulty of recognizing Chu bamboo and silk characters at scale, this study takes the intrinsic features of the script into account and moves beyond the narrow view of a single deep neural network applied to single character images. It proposes an image classification method based on an ensemble learning strategy: four deep learning networks extract the shared morphological features of Chu bamboo and silk character images, and their outputs are combined by voting to produce the final classification. The result is a technical framework for automatic, efficient recognition of large volumes of Chu bamboo and silk character images. Applied to character images from part of the bamboo and silk materials unearthed to date, the framework reaches an accuracy of 96.72%, which demonstrates its feasibility and effectiveness and offers a new approach for the study of ancient Chinese scripts.
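The voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the vote is hard (per-label) or soft (averaged probabilities), how ties are resolved, or which four networks are used, so everything below is an assumption made for illustration: the function name `majority_vote`, the hard-voting scheme, the tie-break rule (the earliest model in the list wins), and the placeholder class labels `"A"`, `"B"`, `"C"`.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label predictions by hard majority voting.

    predictions[m][i] is the label that model m assigns to image i.
    Ties go to the earliest model (in list order) whose label is among
    the most-voted ones. This tie-break is an assumption of the sketch,
    not something stated in the abstract.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_images = len(predictions[0])
    if any(len(p) != n_images for p in predictions):
        raise ValueError("all models must label the same number of images")

    combined = []
    for votes in zip(*predictions):  # one label per model for this image
        counts = Counter(votes)
        top = max(counts.values())
        # First label, in model order, that reached the highest count.
        winner = next(label for label in votes if counts[label] == top)
        combined.append(winner)
    return combined


# Hypothetical outputs of four classifiers on three character images.
model_outputs = [
    ["A", "B", "C"],  # model 1
    ["A", "B", "A"],  # model 2
    ["A", "C", "C"],  # model 3
    ["B", "C", "A"],  # model 4
]
ensemble_labels = majority_vote(model_outputs)
print(ensemble_labels)
```

With an even number of voters, as in a four-network ensemble, two-against-two splits are possible, so some explicit tie-break rule is needed. Soft voting over averaged class probabilities is a common alternative that avoids most ties.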

Key words:

CLC number: