On the advice of someone on Stack Overflow, I'm posting this here.

I am doing sentiment analysis on tweets. I have code that I developed by following an online tutorial (found here: http://www.laurentluce.com/posts/twitter-sentiment-analysis-using-python-and-nltk/) and adding in some parts myself, which looks like this:
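(The code block didn't carry over from the SO post, so here is a minimal sketch of the approach the tutorial describes and that my script follows; the tiny sample tweets and names such as `extract_features` are placeholders rather than my actual data or identifiers:)

    import nltk

    # Labelled tweets: (text, sentiment) pairs -- tiny placeholder sample
    pos_tweets = [("I love this car", "positive"), ("This view is amazing", "positive")]
    neg_tweets = [("I do not like this car", "negative"), ("This view is horrible", "negative")]

    # Keep lower-cased words of 3+ characters, as in the tutorial
    tweets = []
    for (text, sentiment) in pos_tweets + neg_tweets:
        words = [w.lower() for w in text.split() if len(w) >= 3]
        tweets.append((words, sentiment))

    # Vocabulary of every word seen in the training tweets
    all_words = [w for (words, sentiment) in tweets for w in words]
    word_features = list(nltk.FreqDist(all_words).keys())

    def extract_features(document):
        """Bag-of-words presence features for one tweet."""
        document_words = set(document)
        return {'contains(%s)' % w: (w in document_words) for w in word_features}

    # Build the feature sets and train the default NLTK Naive Bayes classifier
    training_set = nltk.classify.apply_features(extract_features, tweets)
    classifier = nltk.NaiveBayesClassifier.train(training_set)

    print(classifier.classify(extract_features("I love this view".split())))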

When I ran this on my sample dataset, it all worked perfectly, although a little inaccurately (the training set only had 50 tweets). My real training set, however, has 1.5 million tweets, and I'm finding that the default NLTK Naive Bayes trainer is far too slow at that size.

Is this too large a dataset to use with the default classifier? Does anybody have suggestions or alternatives for this operation? In all responses, please bear in mind that I could only accomplish this by following a tutorial and am totally new to Python (I'm usually a Java coder).

Original SO post: http://stackoverflow.com/questions/18154278/is-there-a-maximum-size-for-the-nltk-naive-bayes-classifer#18154932
