Commit 9d278ac

Merge pull request avinashkranjan#707 from Anupreetadas/branch_name2
Plagiarism-Checker
2 parents c902100 + 18f26f7 commit 9d278ac

7 files changed

+58
-0
lines changed


‎Plagiarism-Checker/README.md‎

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
<h3 align="center">Plagiarism Checker</h3>

## How it works <br><br>
- To compute the similarity between two text documents, the raw text is first transformed into numeric vectors (arrays of numbers).
- The similarity between two documents is then computed from their vectors using cosine similarity.

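The two steps above can be sketched without any third-party library. This is a simplified illustration, not the script's own method: it uses raw word counts rather than the TF-IDF weighting that scikit-learn applies, and the helper names `term_counts` and `cosine` are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lower-case the text and count how often each word occurs."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(counts_a, counts_b):
    """Cosine of the angle between two sparse word-count vectors."""
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(n * n for n in counts_a.values()))
    norm_b = math.sqrt(sum(n * n for n in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


doc_a = term_counts("If you are attentive enough you will see it.")
doc_b = term_counts("If you are attentive enough you will see it.")
doc_c = term_counts("Welcome to new brave world!!")

print(cosine(doc_a, doc_b))  # identical texts score 1.0 (up to rounding)
print(cosine(doc_a, doc_c))  # no shared words, so the score is 0.0
```

A score near 1 means the two documents use the same words in similar proportions; a score of 0 means they share no words at all.
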
## Dependencies<br>
- Install scikit-learn by:

$ pip install scikit-learn

## Running the app <br>
- There are four text documents in the repository.
- The script compares every `.txt` file in the current directory against each of the others and prints a similarity score for each pair.

$ python plagiarism.py

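Comparing all the `.txt` files amounts to one comparison per unordered pair of files. The sketch below only illustrates that pairing with `itertools.combinations`; the script itself reaches the same pairs with nested loops and a set. The file names are the four sample files from this commit.

```python
from itertools import combinations

# File names taken from this commit; any list of names works the same way.
files = ["file1.txt", "file2.txt", "file3.txt", "file4.txt"]

# Each unordered pair appears exactly once.
pairs = list(combinations(files, 2))
for first, second in pairs:
    print(first, "vs", second)

print(len(pairs))  # 4 files give 4 * 3 / 2 = 6 comparisons
```
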
## Screenshots
<img src="https://raw.githubusercontent.com/Anupreetadas/plagiarism_checker/main/assets/Capture1.PNG" width="100%" height="100%" align="left" >


‎Plagiarism-Checker/file1.txt‎

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
The present moment is filled with happiness and joy.
If you are attentive enough you will see it.

‎Plagiarism-Checker/file2.txt‎

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
Life is in the joy of achievement and thrill of creative effort.
If you are attentive enough you will see it.

‎Plagiarism-Checker/file3.txt‎

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
Life is in joy of achievement.Do hark work and live up to your goals.

‎Plagiarism-Checker/file4.txt‎

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
Welcome to new brave world!!

‎Plagiarism-Checker/plagiarism.py‎

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
# The os module lists the text files to load. TfidfVectorizer converts the text into
# TF-IDF vectors, and cosine_similarity measures how similar two vectors are.
import os
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Collect every .txt file in the current working directory and read its contents.
student_files = [doc for doc in os.listdir() if doc.endswith('.txt')]
student_notes = []
for file_name in student_files:
    with open(file_name) as file:  # the file is closed once it has been read
        student_notes.append(file.read())

# Two lambda functions: one converts the texts to arrays of numbers,
# the other computes the similarity between two such arrays.
vectorize = lambda Text: TfidfVectorizer().fit_transform(Text).toarray()
similarity = lambda doc1, doc2: cosine_similarity([doc1, doc2])

# Vectorize the textual data and pair each file name with its vector.
vectors = vectorize(student_notes)
s_vectors = list(zip(student_files, vectors))


# Compute the similarity between every pair of student files.
def check_plagiarism():
    plagiarism_results = set()
    for student_a, text_vector_a in s_vectors:
        for student_b, text_vector_b in s_vectors:
            if student_a == student_b:
                continue  # do not compare a file with itself
            sim_score = similarity(text_vector_a, text_vector_b)[0][1]
            # Sort the names so each pair is recorded in a consistent order.
            student_pair = sorted((student_a, student_b))
            score = (student_pair[0], student_pair[1], sim_score)
            plagiarism_results.add(score)
    return plagiarism_results


for data in check_plagiarism():
    print(data)
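Because `check_plagiarism()` returns a set, the printed order of the pairs can vary between runs. One possible follow-up, sketched here under assumptions, is to sort the results by score and flag the pairs worth a closer look. The helper name `rank_results`, the `0.5` threshold, and the scores in `example_results` are all hypothetical and not part of the commit.

```python
def rank_results(results, threshold=0.5):
    """Sort (file_a, file_b, score) tuples by score, highest first,
    and mark the pairs whose score is at or above the threshold."""
    ranked = sorted(results, key=lambda item: item[2], reverse=True)
    return [(a, b, score, score >= threshold) for a, b, score in ranked]


# Made-up scores for illustration only; real values come from check_plagiarism().
example_results = {
    ("file1.txt", "file2.txt", 0.61),
    ("file1.txt", "file4.txt", 0.0),
    ("file2.txt", "file3.txt", 0.35),
}

for file_a, file_b, score, flagged in rank_results(example_results):
    marker = "REVIEW" if flagged else "ok"
    print(f"{file_a} <-> {file_b}: {score:.2f} [{marker}]")
```

The threshold is a judgment call: a suitable value depends on how long the documents are and how much shared vocabulary is expected between honest submissions.
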
