2
\$\begingroup\$

This basic function to detect negation while performing sentiment analysis. So that not good kind of words can be considered a negative.

This code is working fine. Can someone help to improve it or find if any bug present?

def negate_sequence(self,text):
 """
 Detects negations and transforms negated words into "not_" form.
 """
 negation = False
 delims = "?.,!:;"
 result = []
#Here rather then applying split, we can directly feed our extracted symptoms list
 words = text.split()
 prev = None
 pprev = None
 for word in words:
 # stripped = word.strip(delchars)
 stripped = word.strip(delims).lower()
 negated = "not_" + stripped if negation else stripped
 result.append(negated)
 if prev:
 bigram = prev + " " + negated
 result.append(bigram)
 if pprev:
 trigram = pprev + " " + bigram
 result.append(trigram)
 pprev = prev
 prev = negated
 if any(neg in word for neg in ["not", "n't", "no"]):
 negation = not negation
 if any(c in word for c in delims):
 negation = False
 return result
200_success
146k22 gold badges190 silver badges479 bronze badges
asked Oct 20, 2016 at 12:55
\$\endgroup\$

2 Answers 2

3
\$\begingroup\$

Here are a few suggestions on how to improve the code:

State the intent

What is the purpose of the function? What is the input, what is the output? In human words, what does the algorithm do? A docstring answering those basic questions would be helpful. A series of concise unit tests would be great.

I ran the function to have a look at the output:

>>> text = "The weather is not good."
>>> result = negate_sequence(self=None, text=text)
>>> print(result)
['the', 'weather', 'the weather', 'is', 'weather is', 'the weather is', 'not', 'is not',
'weather is not', 'not_good', 'not not_good', 'is not not_good']

This doesn't ring a bell with me, so I stopped trying to understand the purpose.

Avoid stateful loops

Iteration i is coupled to iteration i-1 by the negation variable, this makes the logic hard to understand and error prone. If you work on bigrams/trigrams, I'd create a list of bigrams/trigrams and iterate over the tuples. This decouples the iterations.

Breakup long functions

This has almost endless benefits, as a starting point see this article. Some possibilities:

  • Have the text broken up into all lowercase and without punctuation by extract_words(text)
  • Have the list of trigrams created by make_trigrams(words)
  • Inspect the trigrams by process(trigrams)
  • If needed, have some kind of aggregate(results)

Once this is done, I guess we are much better to prepared to identify bugs and to further improve functionality.

answered May 27, 2019 at 15:02
\$\endgroup\$
1
\$\begingroup\$

I can give a suggestion.

Can you just make it more useful by replacing it with real autonyms? That would be way more useful than making it negative without parsing up front.

AlexV
7,3532 gold badges24 silver badges47 bronze badges
answered May 27, 2019 at 13:28
\$\endgroup\$
1
  • 1
    \$\begingroup\$ Welcome to Code Review. Please note that this question is more than two and a half years old. Also, currently your answer is not a good one since you do not provide a reasoning for your recommendation. \$\endgroup\$ Commented May 27, 2019 at 14:00

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.