Text And Image Plagiarism Detection
Keywords:
Plagiarism Detection, Image Plagiarism, Text Similarity, Histogram Features, Clustering Algorithm, Artificial Intelligence, Document Originality, Turnitin Alternative
Abstract
Plagiarism detection has become an essential requirement for academic institutions, journals, and research organizations that need to ensure the originality of submitted documents and projects. Most existing plagiarism detection systems focus on textual similarity and are effective at identifying copied or paraphrased text. However, these systems often fail to address plagiarism involving images, design files, and graphical content. To overcome this limitation, this paper proposes an adaptive, scalable, and extensible plagiarism detection framework capable of analyzing both textual and image-based content.

The proposed model uses a text corpus of 100 reference documents to identify suspicious textual similarities. For image plagiarism detection, a histogram-based feature extraction technique creates feature representations of the images stored in the database. Suspicious images are then compared against the stored histogram models to measure similarity. In addition, clustering algorithms group the extracted image features to improve matching robustness without discarding important visual characteristics. A configurable similarity threshold of 40% is used for detection and can be adjusted according to institutional requirements.

Experimental evaluation was conducted on a dataset consisting of 10 training design samples stored in the system database and multiple test samples containing both original and forged designs. The proposed framework achieved a 100% matching rate and an overall detection accuracy of 81%. The results indicate that integrating text comparison with image analysis improves plagiarism identification beyond what traditional text-only systems achieve. The approach is suitable for educational institutions, publishing agencies, and digital repositories seeking reliable multi-format plagiarism detection.
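The histogram comparison step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which histogram type or similarity measure the framework uses, so the grayscale intensity histogram, the 16-bin setting, and the histogram-intersection score below are assumptions chosen for illustration, and the function names `gray_histogram`, `histogram_intersection`, and `best_match` are hypothetical. Only the 40% threshold comes from the text. Images are represented as nested lists of grayscale values (0 to 255) to keep the example dependency-free.

```python
from typing import Mapping, Sequence

Image = Sequence[Sequence[int]]  # rows of grayscale values in 0..255 (assumed format)


def gray_histogram(pixels: Image, bins: int = 16) -> list[float]:
    """Return a normalized intensity histogram (bin count is an assumption)."""
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            # Map 0..255 onto 0..bins-1; clamp to guard against out-of-range input.
            index = min(max(value, 0) * bins // 256, bins - 1)
            counts[index] += 1
            total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [c / total for c in counts]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def best_match(suspect: Image, database: Mapping[str, Image],
               threshold: float = 0.40) -> tuple[str, float, bool]:
    """Compare a suspect image with every stored image (database must be non-empty).

    Returns the closest stored image's name, its similarity score, and whether
    the score reaches the configurable threshold (40% by default, as in the text).
    """
    query = gray_histogram(suspect)
    scores = {name: histogram_intersection(query, gray_histogram(image))
              for name, image in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name], scores[name] >= threshold


# Tiny hypothetical example: two stored "designs" and one suspect image.
stored = {
    "dark_design": [[10, 20], [30, 40]],
    "bright_design": [[200, 210], [220, 230]],
}
suspect_image = [[12, 22], [33, 250]]
match_name, match_score, flagged = best_match(suspect_image, stored)
```

A histogram discards spatial layout, so two visually different images with similar intensity distributions can score highly. That is one plausible reason to combine histograms with additional grouping of features, as the abstract describes, although the sketch above does not attempt the clustering step.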
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.











