I'm trying to build a sentiment analysis model on a CSV file using the Text Analytics API on Azure.

This is the code I used:

for j in range(0,num_of_batches): # this loop will add num_of_batches strings to input_texts
    input_texts.set_value(j,"") # initialize input_texts string j
    for i in range(j*l//num_of_batches,(j+1)*l//num_of_batches): #loop through a window of rows from the dataset
        comment = str(mydata["tweet"][i]) #grab the comment from the current row
        comment = comment.replace("\"", "'") #remove backslashes (why? I don’t remember. #honestblogger)
        #add the current comment to the end of the string we’re building in input_texts string j
        input_texts.set_value(j, input_texts[j] + '{"language":"' + "pt"',"id":"' + str(i) + '","text":"'+ comment + '"},')
    #after we’ve looped through this window of the input dataset to build this series, add the request head and tail
    input_texts.set_value(j, '{"documents":[' + input_texts[j] + ']}')
headers = {'Content-Type':'application/json', 'Ocp-Apim-Subscription-Key':account_key}
Sentiment = pd.Series()
batch_sentiment_url = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"

Up to this point everything is fine, but when I try to get the data from the API I get an error in the final part:

for j in range(0,num_of_batches):
    # Detect sentiment for each batch.
    req = urllib2.Request(batch_sentiment_url, input_texts[j], headers)
    response = urllib2.urlopen(req)
    result = response.read()
    obj = json.loads(result.decode('utf-8'))
    #loop through each result string, extracting the sentiment associated with each id
    for sentiment_analysis in obj['documents']:
        Sentiment.set_value(sentiment_analysis['id'], sentiment_analysis['score'])
#tack our new sentiment series onto our original dataframe
mydata.insert(len(mydata.columns),'Sentiment',Sentiment.values)

This is the error:

HTTPError: HTTP Error 400: Bad Request
asked Jun 21, 2017 at 7:43

2 Answers

You're getting a 400 error because your JSON is malformed (mismatched quotes around 'pt'). I don't think you're doing yourself any favors by using the pandas module for the outgoing request, or attempting to hand-craft the JSON. In particular you are vulnerable to errant quote marks or escape characters screwing things up.

Here's how you might do it instead:

input_texts = []
for j in range(0,num_of_batches): # this loop will add num_of_batches strings to input_texts
    documents = []
    for i in range(j*l//num_of_batches,(j+1)*l//num_of_batches): #loop through a window of rows from the dataset
        documents.append({
            'language':'pt',
            'id': str(i),
            'text': str(mydata["tweet"][i])})
    input_texts.append({'documents':documents})
...
req = urllib2.Request(batch_sentiment_url, json.dumps(input_texts[j]), headers)
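To see the difference without calling the API, here is a small self-contained sketch that checks both payloads locally. It assumes Python 3, and the helper names (`hand_built`, `with_json_dumps`, `is_valid_json`) and the sample tweet are made up for illustration. `hand_built` copies the concatenation from the question, including the quoting around "pt".

```python
import json

def hand_built(comment, doc_id):
    # Same concatenation as in the question: "pt" is followed directly by
    # another string literal, so no closing quote is emitted after pt,
    # and every document ends with a trailing comma.
    body = '{"language":"' + "pt"',"id":"' + str(doc_id) + '","text":"' + comment + '"},'
    return '{"documents":[' + body + ']}'

def with_json_dumps(comment, doc_id):
    # Build plain Python objects and let json.dumps handle quoting and escaping.
    return json.dumps({'documents': [
        {'language': 'pt', 'id': str(doc_id), 'text': comment}]})

def is_valid_json(payload):
    # json.loads raises a ValueError subclass when the payload is malformed.
    try:
        json.loads(payload)
        return True
    except ValueError:
        return False

# Hypothetical tweet containing a double quote and a backslash.
tweet = 'ela disse "oi" \\o/'

print(hand_built(tweet, 0))
print(is_valid_json(hand_built(tweet, 0)))
print(is_valid_json(with_json_dumps(tweet, 0)))
```

Running a check like this on one batch before sending it is a cheap way to catch a 400 caused by the request body.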
answered Jun 21, 2017 at 15:37

3 Comments

It worked very well and returned the results, but my data has 1544 records and only 1543 sentiments were returned. How can I find the missing record or drop it? Thanks a lot.
By 'length of data', do you mean the count of documents? You can correlate the input and output using the id field.
For future reference: I added a tweetid field instead of id, then created a new dataframe with the sentiment results and the tweetid, and concatenated it with the original dataframe to drop the missing record. By 'length of data' I mean the number of records. Thanks for the help, you saved my day :)
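One way to correlate input and output by id, as suggested in the comments, is sketched below in plain Python. The function names and the sample response are invented for illustration; only the `documents`, `id` and `score` keys come from the code in the question.

```python
def missing_ids(sent_ids, response_documents):
    # Ids that were sent but have no entry in the response.
    returned = {doc['id'] for doc in response_documents}
    return [doc_id for doc_id in sent_ids if doc_id not in returned]

def scores_in_input_order(sent_ids, response_documents):
    # Look scores up by id so a missing document becomes None
    # instead of shifting every later score up by one row.
    by_id = {doc['id']: doc['score'] for doc in response_documents}
    return [by_id.get(doc_id) for doc_id in sent_ids]

# Made-up example: four documents sent, the one with id '2' is not returned.
sent = ['0', '1', '2', '3']
response = {'documents': [
    {'id': '0', 'score': 0.9},
    {'id': '1', 'score': 0.1},
    {'id': '3', 'score': 0.5},
]}

print(missing_ids(sent, response['documents']))
print(scores_in_input_order(sent, response['documents']))
```

Assigning scores by id rather than by position also avoids the length mismatch when the result is attached to the original dataframe.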

Always validate API calls with curl first, then move the working request into your code. This curl line works for me:

curl -k -X POST -H "Ocp-Apim-Subscription-Key: <your ocp-apim-subscription-key>" -H "Content-Type: application/json" --data "{ 'documents': [ { 'id': '12345', 'text': 'now is the time for all good men to come to the aid of their party.' } ] }" "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"
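Once the curl call works, the same request can be built in Python. The following is only a sketch under the assumption that you are on Python 3, where `urllib2` has become `urllib.request` and the request body must be bytes. The function name `build_sentiment_request` and the placeholder key are hypothetical. The request is constructed but not sent, so the body and headers can be inspected first.

```python
import json
import urllib.request

def build_sentiment_request(url, account_key, documents):
    # Serialize with json.dumps and encode to bytes, as urllib.request expects.
    body = json.dumps({'documents': documents}).encode('utf-8')
    headers = {'Content-Type': 'application/json',
               'Ocp-Apim-Subscription-Key': account_key}
    # Supplying data makes this a POST request.
    return urllib.request.Request(url, data=body, headers=headers)

batch_sentiment_url = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"
req = build_sentiment_request(
    batch_sentiment_url,
    "<your ocp-apim-subscription-key>",  # placeholder, not a real key
    [{'language': 'pt', 'id': '0', 'text': 'bom dia'}])

# Inspect what would be sent before calling urllib.request.urlopen(req).
print(req.get_method())
print(req.data.decode('utf-8'))
```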
answered Jun 21, 2017 at 9:54
