VERIFICATION OF SEMESTER GRADE TRANSCRIPT DOCUMENTS USING THE OPTICAL CHARACTER RECOGNITION METHOD

Authors

  • Zulkarnaen Hatala, Politeknik Negeri Ambon

DOI:

https://doi.org/10.23960/jitet.v11i3.3277

Abstract

At Ambon State Polytechnic, students' semester grade reports are still typed manually. This leads to frequent typographical errors that can invalidate a document, such as incorrect grades, student identification numbers, and other field values. A Java application has been implemented to detect these errors. It is intended primarily for the Head of Study Program and the Head of Department, who can use it before signing and validating a report, so that tedious manual checking is replaced by the computer. Validation uses optical character recognition (OCR) from the open-source Tesseract-OCR library. Experiments show that verification improves when OCR is applied only to specific regions of interest (ROI) located with the template matching method from OpenCV. Using the Levenshtein distance when comparing label values against the reference database also improves the algorithm's success rate. The application was tested on about 800 grade report documents, with a verification success rate above 90%.
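The Levenshtein comparison mentioned above can be illustrated with a minimal Java sketch. It is not taken from the published application: the class name `LabelMatcher`, the method `closestMatch`, the `maxDistance` threshold, and the sample values are all hypothetical, chosen only to show how a text read by OCR might be matched against reference values while tolerating a small number of misread characters.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not the author's code: fuzzy lookup of an OCR result
// in a list of reference values using the Levenshtein (edit) distance.
final class LabelMatcher {

    /** Dynamic-programming Levenshtein distance; insert, delete, substitute each cost 1. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the nearest reference value if it lies within maxDistance edits, else empty. */
    static Optional<String> closestMatch(String ocrText, List<String> references, int maxDistance) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        String cleaned = ocrText.trim();
        for (String ref : references) {
            int d = levenshtein(cleaned, ref);
            if (d < bestDistance) {
                bestDistance = d;
                best = ref;
            }
        }
        return (best != null && bestDistance <= maxDistance) ? Optional.of(best) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up identifiers: the OCR output has a letter 'O' where the reference has a digit '0'.
        List<String> references = List.of("1320301012", "1320301027");
        System.out.println(closestMatch("1320301O12", references, 1));
    }
}
```

The threshold is a design choice: a larger `maxDistance` absorbs more OCR noise but also raises the chance of accepting a value that was genuinely mistyped, so short fields may call for a stricter limit than long ones.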

Published

2023-08-01

How to Cite

Hatala, Zulkarnaen. (2023). VERIFIKASI DOKUMEN TRANSKRIP NILAI SEMESTER MENGGUNAKAN METODA OPTICAL CHARACTER RECOGNITION. Jurnal Informatika Dan Teknik Elektro Terapan, 11(3). https://doi.org/10.23960/jitet.v11i3.3277

Section

Articles