Introduction

Imagine yourself as a first-grade student sitting in a classroom where your teacher supervises you, saying, for instance, “this is a dog and this is a cat”. Similarly, in Supervised Machine Learning the input is provided to the machine as a labeled dataset, and a model then learns from it to produce answers to the problem.

What is Supervised Machine Learning?

Supervised learning is used to find underlying patterns in data that can be applied to an analytics process. This data contains both features and labels. For instance, given a dataset of millions of animal images, each with a description of the animal, we can create a machine learning application that identifies each animal. Continuous labels call for Regression, whereas discrete labels call for Classification.

The algorithms are trained using training data (both features and labels are fed to the machine), and then their performance is measured on test data. Supervised learning models have been applied to various business problems, for instance fraud detection, recommendation systems, speech recognition, and risk analysis. Let us further explore some Supervised Machine Learning algorithms that every Data Engineer or data geek must know!
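As a minimal sketch of this train-then-test workflow (assuming Python with scikit-learn installed; the built-in Iris toy dataset stands in for real business data):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Features (X) and labels (y) from a built-in toy dataset
X, y = load_iris(return_X_y=True)

# Hold out 25% of the data as the unseen test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Train on the training split only
model = KNeighborsClassifier()
model.fit(X_train, y_train)

# Measure performance on the test data
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```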

Linear Regression

Linear Regression represents the relationship between two variables and shows how a change in one affects the other, that is, the impact of changing the independent variable on the dependent variable. The independent variables are referred to as features and the dependent variables as labels. Linear Regression is grouped into two types (both are sketched in code after the list):

  1. Simple linear regression: One feature can predict the label.
  2. Multiple linear regression: Multiple features can predict the label.
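A rough sketch of both types, assuming scikit-learn (the ad-spend and sales numbers are invented purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Simple linear regression: one feature (ad spend) predicts the label (sales)
ad_spend = np.array([[10.0], [20.0], [30.0], [40.0]])  # feature
sales = np.array([25.0, 45.0, 62.0, 85.0])             # label
simple = LinearRegression().fit(ad_spend, sales)
print("Predicted sales at spend 25:", simple.predict([[25.0]]))

# Multiple linear regression: several features predict the same label
features = np.array([
    [10.0, 1.0],  # ad spend, number of stores
    [20.0, 2.0],
    [30.0, 2.0],
    [40.0, 3.0],
])
multiple = LinearRegression().fit(features, sales)
print("Coefficient per feature:", multiple.coef_)
```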

Applications of Linear Regression

  • Business: For sales forecasting.
  • Risk Assessment: In the insurance or financial domain.
  • Predictive Analytics
  • Analyzing sales drivers

Advantages of Linear Regression

  • Linear Regression is easy to implement and understand.
  • It requires minimal tuning.
  • This algorithm runs fast compared to others.

Support Vector Machines

SVM can solve both regression and classification problems, though it is most often used for classification. It is based on finding a hyperplane that divides the dataset into classes.

Support Vectors: These are the points nearest to the hyperplane. They are the critical elements, since removing them would alter the position of the dividing hyperplane. SVMs are further classified into two categories:

  1. Linear SVM: Here the classes are separated by a hyperplane (a line that linearly separates and classifies a set of data).
  2. Non-Linear SVM: Here the training data is too complex for a hyperplane to separate it in its original form; a kernel function maps the data into a higher-dimensional space where a separating hyperplane can be found (see the sketch below).
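A minimal sketch contrasting the two, assuming scikit-learn (the two-moons toy dataset stands in for data that a straight hyperplane cannot separate):

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Toy 2-D dataset that is not linearly separable
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)

# Linear SVM: tries to separate the classes with a straight hyperplane
linear_svm = SVC(kernel="linear").fit(X, y)
print("Linear kernel accuracy:", linear_svm.score(X, y))

# Non-linear SVM: the RBF kernel implicitly maps the data into a
# higher-dimensional space where a separating hyperplane exists
rbf_svm = SVC(kernel="rbf").fit(X, y)
print("RBF kernel accuracy:", rbf_svm.score(X, y))
```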

Applications of SVM

  • In handwritten digit recognition.
  • For audience segmentation.
  • For image recognition.
  • Used in category assignment, spam detection and sentiment analysis.
  • For stock market forecasting.

Advantages of using SVM

  • Often achieves high accuracy.
  • Is less prone to overfitting.
  • Works well on smaller datasets.
  • Makes no strong assumptions about the underlying data distribution.

Naïve Bayes Classifier

It is a classification technique based on Bayes’ Theorem of probability. It assumes that the presence of one feature is unrelated to the presence of any other feature. It can be used for moderate or large training datasets, and it is especially useful when classifying lengthy text by hand is impractical, for instance a web page, a document, or an email. The Bayes’ Theorem equation is given below:

P(A|B) = P(B|A) · P(A) / P(B)

where P(A|B) is the posterior probability of class A given feature B, P(B|A) is the likelihood of B given A, P(A) is the prior probability of the class, and P(B) is the prior probability of the feature.
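An illustrative sketch, assuming scikit-learn (the tiny labeled texts are invented for demonstration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny invented training corpus: 1 = spam, 0 = not spam
texts = [
    "win money now", "free prize claim now",
    "meeting at noon", "project update attached",
]
labels = [1, 1, 0, 0]

# Word counts become the features; each word is treated as
# independent of the others (the "naive" assumption)
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

model = MultinomialNB().fit(X, labels)
print(model.predict(vectorizer.transform(["claim free money"])))  # -> [1]
```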

Applications of Naïve Bayes Classifier

  • Sentiment Analysis: For analyzing negative or positive sentiments.
  • Document Categorization: Classifying documents such as web pages, news articles, or reviews into categories; search engines rely on document classification techniques when indexing and organizing pages.
  • Email Spam Filtering: Used by email services to group emails as Spam or Not Spam.
  • Recommendation Systems: Naive Bayes, combined with collaborative filtering, is used to build recommendation systems.

Advantages of the Naïve Bayes Classifier

  • It performs well when the input variables are categorical rather than numerical.
  • It requires relatively little training data, so it can work better than more data-hungry algorithms such as Logistic Regression when data is scarce.
  • Predicting the class of test data with it is fast and simple.

Logistic Regression

Unlike actual regression, it does not predict numeric values. Instead, it predicts the probability that an input belongs to a certain class. It is called “Regression” because a linear model is fit to the feature space; the algorithm then applies a Logistic Function to that linear output to predict which class the input belongs to. Logistic regression can be classified into 3 types on the basis of the categorical response (a short sketch follows the list):

  1. Binary Logistic Regression: when the categorical response has 2 possible outcomes, i.e. either yes or no.
  2. Multinomial Logistic Regression: when the categorical response has 3 or more unordered outcomes. For instance, whether the input image is of a dog, a cat, or a rabbit.
  3. Ordinal Logistic Regression: when the categorical response has 3 or more ordered outcomes. For example, a restaurant rating (0–10).
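A minimal binary example, assuming scikit-learn (the hours-studied numbers are invented):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented binary data: hours studied -> pass (1) or fail (0)
hours = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
passed = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(hours, passed)

# predict_proba applies the logistic function to the linear score,
# turning it into a class probability between 0 and 1
print("P(pass | 3.5 hours):", model.predict_proba([[3.5]])[0, 1])
print("Predicted class:", model.predict([[3.5]])[0])
```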

Applications of Logistic Regression

  • To identify risk factors for diseases.
  • To predict whether the candidate will win a political election or not.
  • For grouping words as nouns, pronouns, verbs, etc.
  • In weather forecasting.

Advantages of using Logistic Regression

  • This is a robust algorithm.
  • It does not assume a linear relation between the features and the label itself; the linearity is only in the log-odds, so the predicted probability can vary non-linearly with the features.
  • It is easy to inspect and less complex.

Decision Tree

A Decision Tree is a graphical representation of the decisions that can be made over a dataset. It uses a branching methodology to map out the possible outcomes of a decision: each internal node represents a test on a feature, each branch represents an outcome of that test, and each leaf node represents a class label. Decision trees can also handle missing values, since a split can fall back on the data in other columns.

Types of Decision Trees

  1. Classification Trees: Default type of trees, used to separate a dataset into different classes.
  2. Regression Trees: Used when the response or target variable is continuous or numerical.
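A minimal sketch of a classification tree, assuming scikit-learn (the built-in Iris dataset is used for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# Each internal node tests a feature, each leaf assigns a class label
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned tests and leaves in readable text form
print(export_text(tree, feature_names=load_iris().feature_names))
```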

Applications of Decision Tree

  • Finance Sector: For option pricing.
  • For pattern recognition.
  • For customer sentiment analysis.
  • Banks: To classify loan applicants by the possibility of defaulting payments.
  • Gerber Products: The company used a decision tree to decide whether to continue using PVC plastic in its products.

Advantages of Using Decision Tree

  • Handle both categorical and numerical data.
  • Can be explained to anyone with ease, since they are self-explanatory.
  • Missing data won’t stop us from splitting the data when building a decision tree.
  • Outliers have little effect on a decision tree’s splits.

Supervised Machine Learning is the machine learning technique used by most systems across the world. As we have seen in this blog, we can benefit from implementing machine learning in many different areas: sales and marketing, business, management, and the list goes on.