Commit c3ad8bf


Merge pull request avinashkranjan#750 from zaverisanya/master

POS removal from hindi text

2 parents 9eb20a3 + f99719f commit c3ad8bf

File tree

6 files changed

+937

-0

lines changed

Remove_POS_hindi_text


`‎Remove_POS_hindi_text/Input.png`

108 KB


`‎Remove_POS_hindi_text/Only_Hindi.txt`

Lines changed: 1 addition & 0 deletions

Large diffs are not rendered by default.

`‎Remove_POS_hindi_text/Output.png`

181 KB


`‎Remove_POS_hindi_text/README.md`

Lines changed: 47 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,47 @@`
	`1`	`+# Remove POS tags from Hindi text`
	`2`	`+`
	`3`	`+Removes the part-of-speech (POS) tags from a tagged Hindi corpus and writes the plain Hindi text to a file.`
	`4`	`+`
	`5`	`+--> Package installed: NLTK`
	`6`	`+- NLTK stands for 'Natural Language Toolkit'. It provides common NLP algorithms such as tokenization, part-of-speech tagging, stemming, sentiment analysis, topic segmentation, and named entity recognition. NLTK helps a program analyze, preprocess, and understand written text.`
	`7`	`+`
	`8`	`+`
	`9`	`+## Setup instructions`
	`10`	`+`
	`11`	`+--> How to set up and run the script locally`
	`12`	`+- Import the NLTK package by writing 'import nltk' on the first line of your script.`
	`13`	`+- To run the script locally, save the 'Tagged_Hindi_Corpus.txt' file at a location of your choice.`
	`14`	`+- In the code, in fp=open(r"..."), give the path of the file you saved in the previous step.`
	`15`	`+- In the code, in fd=open(r"..."), give the path where the file containing only the Hindi text (with the POS tags removed) should be written.`
	`16`	`+- Note: the script has already been run for this commit, so the 'only_hindi.txt' file already exists. Delete it before running the script, then check that it is recreated afterwards.`
	`17`	`+- Run the script with "python hindi_POS_tag_removal.py" (or "python <name of your py file.py>").`
	`18`	`+- The output file will then contain only the Hindi text.`
	`19`	`+`
	`20`	`+`
	`21`	`+## Detailed explanation of script, if needed`
	`22`	`+`
	`23`	`+The script works as follows:`
	`24`	`+`
	`25`	`+- Open the tagged Hindi corpus file ('Tagged_Hindi_Corpus.txt').`
	`26`	`+- Tokenize the data.`
	`27`	`+- Create two empty lists.`
	`28`	`+- Collect all the POS categories in one list.`
	`29`	`+- Collect all the Hindi words in the other list.`
	`30`	`+- Concatenate the words.`
	`31`	`+- Write the words to the 'only_hindi.txt' file.`
	`32`	`+`
	`33`	`+## Input`
	`34`	`+`
	`35`	`+![Input](Input.png)`
	`36`	`+`
	`37`	`+## Output`
	`38`	`+![Output](Output.png)`
	`39`	`+`
	`40`	`+`
	`41`	`+## Author(s)`
	`42`	`+`
	`43`	`+- This code was written by Sanya Devansh Zaveri (https://github.com/zaverisanya).`
	`44`	`+`
	`45`	`+## Disclaimers, if any`
	`46`	`+`
	`47`	`+There are no disclaimers for this script.`
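
The steps listed under "Detailed explanation of script" can be sketched roughly as below. This is not the committed `hindi_POS_tag_removal.py`, which is not shown in this view. It is a minimal illustration that assumes each whitespace-separated token in the corpus has the form `word_TAG`; the real separator in `Tagged_Hindi_Corpus.txt` may differ. It also uses plain `str.split` in place of NLTK tokenization so that it runs without extra downloads. The names `strip_pos_tags`, `remove_pos_from_file`, and the `sep` parameter are hypothetical.

```python
from pathlib import Path


def strip_pos_tags(tagged_text, sep="_"):
    """Split tagged text into plain words and POS tags.

    Assumption: every token looks like word<sep>TAG. Tokens without
    the separator are kept unchanged and contribute no tag.
    Returns (text_without_tags, list_of_tags).
    """
    words = []  # the Hindi words, in their original order
    tags = []   # the POS categories that were removed
    for token in tagged_text.split():
        word, found, tag = token.rpartition(sep)
        if found and word:
            words.append(word)
            tags.append(tag)
        else:
            words.append(token)
    return " ".join(words), tags


def remove_pos_from_file(src, dst, sep="_"):
    """Read a tagged corpus from src and write the untagged text to dst."""
    text = Path(src).read_text(encoding="utf-8")
    hindi_only, _tags = strip_pos_tags(text, sep)
    Path(dst).write_text(hindi_only, encoding="utf-8")


# In-memory demo with an illustrative sentence (not taken from the corpus).
sample = "राम_NNP घर_NN गया_VM ।_SYM"
plain, removed = strip_pos_tags(sample)
print(plain)    # राम घर गया ।
print(removed)  # ['NNP', 'NN', 'VM', 'SYM']
```

Reading and writing with an explicit `encoding="utf-8"` matters here because the platform default encoding, particularly on Windows, may not handle Devanagari text correctly.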

0 commit comments
