6
  • What are the fundamental criteria for choosing between supervised and unsupervised learning?
  • When is one better than the other?
  • Are there specific cases where you can only use one of them?

Thanks

asked Jul 4, 2017 at 13:49

3 Answers

6
  1. If you have a labeled dataset, you can use both. If you have no labels, you can only use unsupervised learning.

  2. It's not a question of "better". It's a question of what you want to achieve. E.g. clustering data is usually unsupervised: you want the algorithm to tell you how your data is structured. Categorizing is supervised, since you need to teach your algorithm what is what in order to make predictions on unseen data.

  3. See 1.
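
To make point 2 concrete, here is a minimal sketch of the supervised case, assuming a tiny made-up dataset and a nearest-centroid rule. The data, labels and function names are illustrative assumptions, not something from the question. The labels are what let the algorithm learn "what is what" before it sees new data.

```python
# Toy labeled dataset (hypothetical values): (height_cm, weight_kg) -> label
labeled = [
    ((20.0, 4.0), "cat"),
    ((25.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]

def fit_centroids(samples):
    """Supervised step: average the feature vectors of each known label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(centroids[label], features))

centroids = fit_centroids(labeled)
print(predict(centroids, (22.0, 4.5)))   # an unseen, small animal
print(predict(centroids, (58.0, 28.0)))  # an unseen, large animal
```

Without the label column, `fit_centroids` would have nothing to group by, which is exactly the situation where only unsupervised methods apply.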

On a side note: These are very broad questions. I suggest you familiarize yourself with some ML foundations.

Good podcast for example here: http://ocdevel.com/podcasts/machine-learning

Very good book / notebooks by Jake VanderPlas: http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/Index.ipynb

answered Jul 4, 2017 at 14:59

1

It depends on your needs. If you have existing data that includes the target values you wish to predict (labels), then you probably need supervised learning (e.g. is something true or false? Does this data represent a fish, a cat or a dog?). Simply put, you already have examples of the right answers and you are telling the algorithm what to predict.

You also need to decide whether you need classification or regression. Classification is when the predicted values fall into given classes (e.g. is this person likely to develop diabetes, yes or no? In other words, discrete values). Regression is when you need to predict continuous values (1, 2, 4.56, 12.99, 23, etc.). There are many supervised learning algorithms to choose from (k-nearest neighbors, naive Bayes, SVM, ridge regression, ...).
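
As a rough illustration of the classification/regression split, here is a small k-nearest-neighbors sketch in plain Python. The datasets and function names are invented for the example. The only difference between the two tasks is how the neighbours' targets are combined: a majority vote for discrete labels, an average for continuous values.

```python
from collections import Counter

def k_nearest(train, x, k):
    """Return the targets of the k training points closest to x (1-D feature)."""
    ranked = sorted(train, key=lambda pair: abs(pair[0] - x))
    return [target for _, target in ranked[:k]]

def knn_classify(train, x, k=3):
    """Classification: majority vote over discrete labels."""
    return Counter(k_nearest(train, x, k)).most_common(1)[0][0]

def knn_regress(train, x, k=3):
    """Regression: average of continuous targets."""
    neighbours = k_nearest(train, x, k)
    return sum(neighbours) / len(neighbours)

# Hypothetical data: a single measurement -> yes/no label
risk = [(80, "no"), (90, "no"), (95, "no"), (150, "yes"), (160, "yes"), (170, "yes")]
# Hypothetical data: floor area in m^2 -> price in thousands
price = [(50, 100.0), (60, 120.0), (70, 140.0), (100, 200.0), (110, 220.0)]

print(knn_classify(risk, 155))   # discrete answer
print(knn_regress(price, 62))    # continuous answer
```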

By contrast, use unsupervised learning if you don't have labels (or target values). You are simply trying to identify clusters in the data as it comes (e.g. with k-means, DBSCAN, spectral clustering, ...).
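
For comparison, a minimal k-means sketch (Lloyd's algorithm on one-dimensional points) could look like the following. The points and starting centers are made up, and a real project would more likely use a library implementation. Note that no labels appear anywhere: the algorithm only sees the raw values.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd's algorithm on 1-D points, starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its points.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]   # no labels anywhere
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

The result tells you how the data is grouped, but not what the groups mean; naming them is still up to you.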

So it depends, and there is no exact answer, but generally speaking you need to:

  1. Collect and inspect your data. You need to know your data, and only then decide which way to go and which algorithm will best suit your needs.

  2. Train your algorithm. Make sure your data is clean and of good quality. Bear in mind that with unsupervised learning there are no target values to train against, so there is no separate supervised training step: you fit the algorithm to the data and inspect the result right away.

  3. Test your algorithm. Run it and see how well it behaves. With supervised learning, you can hold out part of your labeled data and use it to evaluate how well your algorithm is doing.
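
The train and test steps for the supervised case can be sketched like this, assuming a simple hold-out split and a 1-nearest-neighbor predictor. The helper names and the data are hypothetical. The key point is that the samples used for scoring are kept out of training.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the labeled samples and hold out a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def predict_1nn(train, x):
    """Label of the single closest training point (1-D feature)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def accuracy(train, test):
    """Fraction of held-out samples whose label is predicted correctly."""
    hits = sum(predict_1nn(train, x) == label for x, label in test)
    return hits / len(test)

# Hypothetical, well-separated labeled data.
data = [(float(i), "low") for i in range(10)] + [(float(i), "high") for i in range(100, 110)]

train, test = train_test_split(data)
print(len(train), len(test))
print(accuracy(train, test))
```

On data this cleanly separated the score is perfect; on real data the held-out accuracy is what tells you whether the model generalizes.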

There are many books online about machine learning and many online lectures on the topic as well.

answered Jul 4, 2017 at 15:20


0

It depends on the data set you have. If you have a target feature available, go for supervised learning. If you don't, it is an unsupervised problem. Supervised learning is like teaching the model with examples. Unsupervised learning is mainly used to group similar data, and it plays a major role in feature engineering. Thank you.

Dharman
answered Mar 27, 2022 at 16:33
