Geometrics Assisted Rubbing Generation and Semantics Enhanced Detection for Small and Dense OBI Character
DOI: https://doi.org/10.9781/ijimai.2025.10.001
Keywords: Data Augmentation, GAN, NMS, Object Detection, Oracle Bone Inscription
Abstract
Character detection is a prerequisite for downstream Oracle Bone Inscription (OBI) research. The main difficulties in OBI detection are the scarcity of labeled data and the complexity of small, densely packed OBI characters. In this paper, we propose a rubbing generation framework that automatically builds large-scale rubbing samples resembling noisy real-world OBI rubbings, using geometric and morphological construction combined with style transfer. We also propose a semantics-enhanced detection model for small and dense OBI characters that fuses multi-resolution feature maps with enriched features in the YOLOv5s backbone. To handle overlap among small and dense OBI characters, the detection model introduces a higher resolution and Soft-NMS. The augmented dataset improves the performance of benchmark object detection models on the real OBI detection task when labeled data is insufficient. Furthermore, the proposed detection model achieves favorable results even with a small amount of labeled data. Experiments confirm the effectiveness of both the proposed OBI generation framework and the proposed OBI detection model.
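The abstract names Soft-NMS as the mechanism for handling overlapping detections of densely packed characters. As an illustration only, the following is a minimal pure-Python sketch of the Gaussian variant of Soft-NMS. The choice of the Gaussian decay, the `sigma` and `score_thresh` values, and the function names `iou` and `soft_nms` are assumptions made for this sketch; the abstract does not state which variant or settings the paper uses.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (sigma and score_thresh are illustrative values).

    Returns a list of (box_index, rescored_confidence) in selection order.
    Overlapping boxes are not discarded outright; their scores are decayed
    by exp(-IoU^2 / sigma) and they are dropped only below score_thresh.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the highest-scoring box that is still in play.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))
        # Decay the scores of the other boxes according to their overlap.
        decayed = []
        for i, s in remaining:
            overlap = iou(boxes[best_idx], boxes[i])
            s = s * math.exp(-(overlap ** 2) / sigma)
            if s >= score_thresh:
                decayed.append((i, s))
        remaining = decayed
    return kept


# Two heavily overlapping boxes (e.g. adjacent characters) and one isolated box.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
for index, score in soft_nms(boxes, scores):
    print(index, round(score, 3))
```

With a hard NMS threshold of 0.5, the second box would be removed because its IoU with the first is about 0.82. Here it survives with a reduced score, which is the behavior that makes Soft-NMS attractive when neighboring characters genuinely overlap.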