Negation detection in sentiment analysis

Question 1

This basic function to detect negation while performing sentiment analysis. So that not good kind of words can be considered a negative.

This code is working fine. Can someone help to improve it or find if any bug present?

def negate_sequence(self,text):
 """
 Detects negations and transforms negated words into "not_" form.
 """
 negation = False
 delims = "?.,!:;"
 result = []
#Here rather then applying split, we can directly feed our extracted symptoms list
 words = text.split()
 prev = None
 pprev = None
 for word in words:
 # stripped = word.strip(delchars)
 stripped = word.strip(delims).lower()
 negated = "not_" + stripped if negation else stripped
 result.append(negated)
 if prev:
 bigram = prev + " " + negated
 result.append(bigram)
 if pprev:
 trigram = pprev + " " + bigram
 result.append(trigram)
 pprev = prev
 prev = negated
 if any(neg in word for neg in ["not", "n't", "no"]):
 negation = not negation
 if any(c in word for c in delims):
 negation = False
 return result

Question 2

Here are a few suggestions on how to improve the code:

State the intent

What is the purpose of the function? What is the input, what is the output? In human words, what does the algorithm do? A docstring answering those basic questions would be helpful. A series of concise unit tests would be great.

I ran the function to have a look at the output:

>>> text = "The weather is not good."
>>> result = negate_sequence(self=None, text=text)
>>> print(result)
['the', 'weather', 'the weather', 'is', 'weather is', 'the weather is', 'not', 'is not',
'weather is not', 'not_good', 'not not_good', 'is not not_good']

This doesn't ring a bell with me, so I stopped trying to understand the purpose.

Avoid stateful loops

Iteration i is coupled to iteration i-1 by the negation variable, this makes the logic hard to understand and error prone. If you work on bigrams/trigrams, I'd create a list of bigrams/trigrams and iterate over the tuples. This decouples the iterations.

Breakup long functions

This has almost endless benefits, as a starting point see this article. Some possibilities:

Have the text broken up into all lowercase and without punctuation by extract_words(text)
Have the list of trigrams created by make_trigrams(words)
Inspect the trigrams by process(trigrams)
If needed, have some kind of aggregate(results)

Once this is done, I guess we are much better to prepared to identify bugs and to further improve functionality.

Question 3

I can give a suggestion.

Can you just make it more useful by replacing it with real autonyms? That would be way more useful than making it negative without parsing up front.

Question 4

Welcome to Code Review. Please note that this question is more than two and a half years old. Also, currently your answer is not a good one since you do not provide a reasoning for your recommendation.

paulw paulw 913 bronze badges · Answer 1 · 2019-05-27 15:02:50Z

Here are a few suggestions on how to improve the code:

State the intent

What is the purpose of the function? What is the input, what is the output? In human words, what does the algorithm do? A docstring answering those basic questions would be helpful. A series of concise unit tests would be great.

I ran the function to have a look at the output:

>>> text = "The weather is not good."
>>> result = negate_sequence(self=None, text=text)
>>> print(result)
['the', 'weather', 'the weather', 'is', 'weather is', 'the weather is', 'not', 'is not',
'weather is not', 'not_good', 'not not_good', 'is not not_good']

This doesn't ring a bell with me, so I stopped trying to understand the purpose.

Avoid stateful loops

Iteration i is coupled to iteration i-1 by the negation variable, this makes the logic hard to understand and error prone. If you work on bigrams/trigrams, I'd create a list of bigrams/trigrams and iterate over the tuples. This decouples the iterations.

Breakup long functions

This has almost endless benefits, as a starting point see this article. Some possibilities:

Have the text broken up into all lowercase and without punctuation by extract_words(text)
Have the list of trigrams created by make_trigrams(words)
Inspect the trigrams by process(trigrams)
If needed, have some kind of aggregate(results)

Once this is done, I guess we are much better to prepared to identify bugs and to further improve functionality.

Sreedath P S Sreedath P S 111 bronze badge · Answer 2 · 2019-05-27 13:28:45Z

1

\$\begingroup\$

I can give a suggestion.

Can you just make it more useful by replacing it with real autonyms? That would be way more useful than making it negative without parsing up front.

Share

edited May 27, 2019 at 13:57

AlexV's user avatar

AlexV

7,3532 gold badges24 silver badges47 bronze badges

answered May 27, 2019 at 13:28

Sreedath P S's user avatar

Sreedath P S Sreedath P S

111 bronze badge

\$\endgroup\$

1

1

\$\begingroup\$ Welcome to Code Review. Please note that this question is more than two and a half years old. Also, currently your answer is not a good one since you do not provide a reasoning for your recommendation. \$\endgroup\$

AlexV
– AlexV

2019年05月27日 14:00:11 +00:00
Commented May 27, 2019 at 14:00

Add a comment |

Stack Exchange Network

Negation detection in sentiment analysis

2 Answers 2

State the intent

Avoid stateful loops

Breakup long functions

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Negation detection in sentiment analysis

2 Answers 2

State the intent

Avoid stateful loops

Breakup long functions

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions