Geometrics Assisted Rubbing Generation and Semantics Enhanced Detection for Small and Dense OBI Character

Xiuan Wan; Yuchun Fang; Jiahua Wu; Shouyong Pan

doi:10.9781/ijimai.2025.10.001

Authors

Xiuan Wan Shanghai University
Yuchun Fang Shanghai University
Jiahua Wu Shanghai University
Shouyong Pan Shanghai University

DOI:

https://doi.org/10.9781/ijimai.2025.10.001

Keywords:

Data Augmentation, GAN, NMS, Object Detection, Oracle Bone Inscription

Supporting Agencies

This research was supported by the National Natural Science Foundation of China under Grant Nos. 61976132 and 61991410, and by the Shanghai Technical Service Center of Science and Engineering Computing, Shanghai University.

Abstract

Character detection is essential for subsequent Oracle Bone Inscription (OBI) research. However, the lack of labeled data and the complexity of small, dense OBI characters are the main difficulties in OBI detection. In this paper, we propose a rubbing generation framework that automatically builds large-scale rubbing samples whose scenarios closely resemble noisy, in-the-wild OBI, using geometric and morphological construction combined with style transfer. Moreover, we propose a semantics-enhanced detection model for small and dense OBI characters that fuses multi-resolution feature maps with the enriched features in the YOLOv5s backbone. We introduce higher resolution and Soft-NMS into the proposed OBI detection model to handle the overlap of small and dense OBI characters. The augmented dataset improves the performance of benchmark object detection models on the real OBI detection task when sufficient data is lacking. Furthermore, the proposed OBI detection model offers easy access to OBI detection and obtains favorable results even with a small amount of labeled data. Experiments confirm the effectiveness of the proposed OBI generation framework and the proposed OBI detection model.
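The abstract names Soft-NMS as the mechanism for handling overlapping characters but gives no implementation details. The following is a minimal sketch of the general Gaussian Soft-NMS idea (decaying the scores of overlapping boxes instead of discarding them), not the authors' implementation. The function names, the `(x1, y1, x2, y2)` box format, and the default values of `sigma` and `score_thresh` are assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (illustrative defaults, not values from the paper).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    candidates = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the remaining box with the highest current score.
        best_pos = max(range(len(candidates)), key=lambda k: candidates[k][1])
        best_idx, best_score = candidates.pop(best_pos)
        if best_score < score_thresh:
            break
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes according to their overlap
        # with the selected box; drop those that fall below the threshold.
        rescored = []
        for idx, score in candidates:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                rescored.append((idx, score))
        candidates = rescored
    return kept


# Hypothetical example: two heavily overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
```

In this toy example the second box overlaps the first with an IoU of about 0.68. A hard NMS with an IoU threshold of 0.5 would discard it, whereas the sketch keeps it with a reduced score, which is the behavior that makes Soft-NMS attractive for densely packed characters.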

Author Biographies

Xiuan Wan, Shanghai University

Xiuan Wan received the B.S. degree from the School of Computer Engineering and Science, Shanghai University, in 2022. He is currently pursuing the M.S. degree in the School of Computer Engineering and Science, Shanghai University. His research interest is object detection.

Yuchun Fang, Shanghai University

Yuchun Fang received the B.S. degree from the Central University of Nationalities in 1996, the M.S. degree from the Beijing Polytechnique University in 1999, and the Ph.D. degree from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 2003. She is currently a Full Professor with the School of Computer Engineering and Science, Shanghai University. From 2003 to 2004, she was a post-doctoral researcher at the France National Research Institute on Information and Automation (INRIA). Her current research interests include pattern recognition and image processing.

Jiahua Wu, Shanghai University

Jiahua Wu received the B.S. degree from the School of Computer Engineering and Science, Shanghai University, in 2022. He is currently pursuing the M.S. degree in the School of Computer Engineering and Science, Shanghai University. His research interest is domain adaptation.

Shouyong Pan, Shanghai University

Shouyong Pan received the B.A. degree from Jilin University (archaeology, 1989), the M.A. degree from Nankai University (museology and history, 1993), and the Ph.D. degree from the Minzu University of China (ethnology, 1999). He is currently a Distinguished Professor of anthropology and museology (Weichang Scholar) and director of the university library at Shanghai University. From 2002 to 2004, he was a Harvard-Yenching scholar at Harvard University, and from 2013 to 2014, he was a Fulbright scholar at George Washington University. His current research interests include Chinese culture, anthropology, and cultural heritage studies.
