|
| 1 | +# Hindi POS Tag Removal |
| 2 | + |
| 3 | +Removes the part-of-speech (POS) tags from a tagged Hindi corpus and writes the remaining Hindi text to a new file. |
| 4 | + |
| 5 | +--> Package installed: NLTK |
| 6 | +- NLTK stands for 'Natural Language Toolkit'. It provides common natural language processing algorithms such as tokenizing, part-of-speech tagging, stemming, sentiment analysis, topic segmentation, and named entity recognition. NLTK helps the computer analyze, preprocess, and understand written text. |
| 7 | + |
| 8 | + |
| 9 | +## Setup instructions |
| 10 | + |
| 11 | +--> Explanation of how to set up and run your package/script locally |
| 12 | +- Import the NLTK package by writing 'import nltk' on the first line of your script. |
| 13 | +- To run the script locally, save the 'Tagged_Hindi_Corpus.txt' file at a location of your choice. |
| 14 | +- In the code, in fp=open(r"..."), give the path of the file you saved in the previous step. |
| 15 | +- In the code, in fd=open(r"..."), give the path where you want the output file, which contains only the Hindi text after the POS tags are removed. |
| 16 | +- Note that the script has already been run once, so the 'only_hindi.txt' file already exists. Delete 'only_hindi.txt' before running the script, then check that it is recreated afterwards. |
| 17 | +- Run the script with "python hindi_POS_tag_removal.py" or "python <name of your py file.py>". |
| 18 | +- The output file now contains only the Hindi text. |
| 19 | + |
| 20 | + |
| 21 | +## Detailed explanation of script, if needed |
| 22 | + |
| 23 | +The script works as follows: |
| 24 | + |
| 25 | +- Open the 'Tagged_Hindi_Corpus.txt' file. |
| 26 | +- Tokenize the data. |
| 27 | +- Create two empty lists. |
| 28 | +- Collect all the POS categories in one list. |
| 29 | +- Collect all the Hindi words in the other list. |
| 30 | +- Concatenate the words. |
| 31 | +- Write the words to the 'only_hindi.txt' file. |
| 32 | + |
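The steps above can be sketched roughly as follows. This is a minimal illustration, not the author's script: it assumes each token in the corpus has the form `word_TAG` with tokens separated by whitespace (the real layout of 'Tagged_Hindi_Corpus.txt' is not shown here), and it uses plain Python string splitting in place of an NLTK tokenizer so that it runs without extra downloads. The function names, the `sep` parameter, and the sample sentence are all hypothetical.

```python
def split_tagged_tokens(text, sep="_"):
    """Split whitespace-separated 'word<sep>TAG' tokens into two lists.

    The 'word_TAG' layout is an assumption made for this sketch.
    """
    words, tags = [], []  # the two empty lists from the steps above
    for token in text.split():
        # Split on the last separator so words containing it stay intact.
        word, found, tag = token.rpartition(sep)
        if found:
            words.append(word)
            tags.append(tag)
        else:
            # The token carries no tag; keep it as a plain word.
            words.append(token)
    return words, tags


def strip_pos_tags(text, sep="_"):
    """Return the text with POS tags removed and words joined by spaces."""
    words, _ = split_tagged_tokens(text, sep)
    return " ".join(words)


def remove_tags_from_file(src_path, dst_path, sep="_"):
    """Read a tagged corpus file and write the untagged text to dst_path."""
    with open(src_path, encoding="utf-8") as fp:
        tagged_text = fp.read()
    with open(dst_path, "w", encoding="utf-8") as fd:
        fd.write(strip_pos_tags(tagged_text, sep))


# Hypothetical sample sentence, used only for illustration.
sample = "राम_NNP घर_NN जाता_VM है_VAUX"
words, tags = split_tagged_tokens(sample)
hindi_only = strip_pos_tags(sample)
```

To process a real file, one would call `remove_tags_from_file` with the paths chosen in the setup steps, adjusting `sep` to whatever separator the corpus actually uses. Opening both files with `encoding="utf-8"` avoids platform-dependent defaults when handling Devanagari text.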
| 33 | +## Input |
| 34 | + |
| 35 | + |
| 36 | + |
| 37 | +## Output |
| 38 | + |
| 39 | + |
| 40 | + |
| 41 | +## Author(s) |
| 42 | + |
| 43 | +- This code is written by Sanya Devansh Zaveri. [https://github.com/zaverisanya] |
| 44 | + |
| 45 | +## Disclaimers, if any |
| 46 | + |
| 47 | +There are no disclaimers for this script. |