AI / Machine Learning · 09/04/2025

What is Machine Learning? (Part 1)

MACHINE LEARNING

Machine Learning (ML) is a subfield of artificial intelligence (AI) that focuses on developing algorithms and statistical models that enable computer systems to learn from data and make predictions or decisions based on that data. In other words, it is a branch of study and practice in which computers are taught to carry out tasks without being explicitly programmed for each one.

SUPERVISED AND UNSUPERVISED LEARNING
Supervised learning and unsupervised learning are two fundamental categories of machine learning, each with its own distinct characteristics and use cases:

A. Supervised Learning

Definition: Supervised learning is a type of machine learning where the algorithm learns from labeled training data. It involves a clear relationship between input and output, where the model is trained to map input data to corresponding target labels.

Key Characteristics:

  • The training data consists of pairs of input features (attributes or variables) and their corresponding target labels.
  • The goal is to learn a mapping function to accurately predict the target label for new, unseen data based on the input features.
  • Supervised learning tasks include classification (predicting discrete labels) and regression (predicting continuous values).

Examples:

  • Classification: Spam email detection, image recognition, sentiment analysis.
  • Regression: House price prediction, stock price forecasting, temperature prediction.

Evaluation Metrics:

  • For classification: Accuracy, precision, recall, F1-score, ROC curve, etc.
  • For regression: Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), R-squared, etc.
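
As a brief illustration, here is a minimal sketch of computing a few of these metrics with scikit-learn; the label and prediction arrays below are made up purely for demonstration:

```python
# Minimal sketch: common classification and regression metrics in scikit-learn.
# The true/predicted values are hypothetical, used only to show the API.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, mean_absolute_error,
                             mean_squared_error, r2_score)

# Classification: true vs. predicted class labels
y_true_cls = [1, 0, 1, 1, 0, 1]
y_pred_cls = [1, 0, 0, 1, 0, 1]
print("Accuracy :", accuracy_score(y_true_cls, y_pred_cls))
print("Precision:", precision_score(y_true_cls, y_pred_cls))
print("Recall   :", recall_score(y_true_cls, y_pred_cls))
print("F1-score :", f1_score(y_true_cls, y_pred_cls))

# Regression: true vs. predicted continuous values
y_true_reg = [3.0, -0.5, 2.0, 7.0]
y_pred_reg = [2.5, 0.0, 2.0, 8.0]
mse = mean_squared_error(y_true_reg, y_pred_reg)
print("MAE :", mean_absolute_error(y_true_reg, y_pred_reg))
print("MSE :", mse)
print("RMSE:", np.sqrt(mse))
print("R^2 :", r2_score(y_true_reg, y_pred_reg))
```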

Workflow:
Data pre-processing, feature engineering, model selection, training, evaluation, and testing are common steps in a supervised learning pipeline.
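
Here is a minimal end-to-end sketch of such a pipeline, assuming scikit-learn and its built-in Iris dataset purely for illustration:

```python
# Minimal sketch of a supervised learning pipeline: preprocessing, model
# selection, training, and evaluation on a held-out test set. The Iris
# dataset and logistic regression are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Hold out a test set for final evaluation on unseen data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Chain pre-processing (scaling) and the model into one pipeline
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

model.fit(X_train, y_train)      # training
y_pred = model.predict(X_test)   # testing
print("Test accuracy:", accuracy_score(y_test, y_pred))
```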

B. Unsupervised Learning

Definition: Unsupervised learning is a type of machine learning where the algorithm learns from unlabeled data. It aims to discover patterns, structures, or relationships in the data without explicit guidance or target labels.

Key Characteristics:

  • The training data does not contain labeled target information.
  • The primary objective is to explore the inherent structure within the data, such as clustering similar data points or reducing dimensionality.

Examples:

  • Clustering: Grouping similar customer profiles for targeted marketing.
  • Dimensionality Reduction: Reducing the number of features while preserving essential information (e.g., Principal Component Analysis).

Evaluation Metrics: Evaluation in unsupervised learning can be more challenging since no ground truth labels exist. Metrics depend on the specific task, such as silhouette score for clustering or explained variance ratio for dimensionality reduction.
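
As a small illustrative sketch (assuming scikit-learn, with synthetic blob data standing in for real customer profiles), clustering and dimensionality reduction, together with the metrics just mentioned, might look like this:

```python
# Minimal sketch: k-means clustering and PCA on synthetic, unlabeled data,
# evaluated with the silhouette score and the explained variance ratio.
# The data and parameter choices are illustrative only.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

# Unlabeled data: 300 points in 5 dimensions (the generated labels are discarded)
X, _ = make_blobs(n_samples=300, n_features=5, centers=4, random_state=0)

# Clustering: group similar points without any target labels
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)
print("Silhouette score:", silhouette_score(X, cluster_labels))

# Dimensionality reduction: project the 5 features down to 2 components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```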

Workflow:

  • Data preprocessing, model selection, training, and evaluation are steps in an unsupervised learning pipeline.
  • Visualization and domain knowledge are often essential for interpreting results.

Semi-Supervised Learning:
In addition to supervised and unsupervised learning, there is also a category known as semi-supervised learning, which combines elements of both. In semi-supervised learning, you have a small amount of labeled data and a much larger amount of unlabeled data. The goal is to leverage both sources of information to build better models.
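
A minimal sketch of this idea, assuming scikit-learn's SelfTrainingClassifier and using -1 to mark unlabeled examples (the dataset and the roughly 20% labeled split are illustrative choices):

```python
# Minimal sketch of semi-supervised learning: a small labeled set plus a
# larger unlabeled set (marked with -1) are used together to train one model.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Pretend most labels are unknown: keep roughly 20% labeled, mark the rest -1
rng = np.random.RandomState(42)
y_semi = y.copy()
y_semi[rng.rand(len(y)) > 0.2] = -1

# The base classifier is iteratively retrained on its own confident predictions
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_semi)

print("Accuracy on the fully labeled data:",
      accuracy_score(y, model.predict(X)))
```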

In summary, supervised learning is about learning from labeled data to make predictions, unsupervised learning is about discovering patterns in unlabeled data, and semi-supervised learning combines elements of both to leverage labeled and unlabeled data for learning and prediction.

CLASSIFICATION

In machine learning, classification is a fundamental concept that entails categorizing data into preset classes or labels based on the features or characteristics of the data. Classification models are used extensively across many industries, including finance, healthcare, natural language processing, image recognition, and more. Here’s a comprehensive write-up on classification models in machine learning:

Introduction to Classification:
Classification is a supervised learning task that assigns a class label to an input data point. The classes can be binary (two classes, often referred to as positive and negative) or multi-class (more than two classes). Classification aims to build a model that can generalize from the training data to make accurate predictions on new, unseen data.
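
For example, here is a minimal binary classification sketch, assuming scikit-learn's built-in breast cancer dataset and a random forest purely for illustration:

```python
# Minimal sketch of binary classification (two classes: malignant vs. benign)
# using scikit-learn's breast cancer dataset; the dataset and the
# random-forest model are illustrative choices.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Precision, recall, and F1 for each of the two classes on unseen data
print(classification_report(y_test, clf.predict(X_test)))
```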

MULTINOMIAL LOGISTIC REGRESSION

Overview
Multinomial logistic regression is used when the dependent variable has more than two categories. It is an extension of the basic (binary) logistic regression model. For instance, if the dependent variable has three categories A, B, and C, we would estimate regression models comparing A with B and A with C, with A serving as the reference category; it is important to note that these models are estimated simultaneously. The model predicts the probabilities of a categorical dependent variable with multiple outcome classes. The dependent variable is nominal: the target classes have no inherent order, meaning they cannot be meaningfully ranked (e.g., 1, 2, 3, 4, etc.). Typical problems suited to multinomial logistic regression include:

Favorite ice cream flavor
  • Dependent Variable: Ice cream flavor – Chocolate, Vanilla, Coffee, Strawberry
  • Independent Variables: Gender, Age, Ethnicity

Current employment status
  • Dependent Variable: Employment status – Full-time employment, Part-time employment, Training, Unemployed
  • Independent Variables: Gender, Age, Qualifications, Area a person lives in, Number of children

Sentencing outcome status
  • Dependent Variable: Sentencing status – Prison, Community Penalty, Fine, Discharge
  • Independent Variables: Gender, Age, Previous Convictions
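
Since no real ice cream or sentencing dataset is included here, the following minimal sketch uses scikit-learn's Iris dataset (three unordered classes) as a stand-in for a multi-category outcome:

```python
# Minimal sketch of multinomial logistic regression with scikit-learn.
# The Iris dataset (three unordered classes) stands in for a real
# multi-category outcome such as favorite ice cream flavor.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)

# With the lbfgs solver and more than two classes, scikit-learn fits the
# multinomial (softmax) model: all class probabilities are estimated jointly.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver="lbfgs", max_iter=1000),
)
model.fit(X, y)

# Predicted probabilities across the three categories for one observation
print(model.predict_proba(X[:1]))
```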

Assumptions:

Multinomial logistic regression assumes that:

  1. Each independent variable has a single value for each case.
  2. The dependent variable cannot be perfectly predicted from the independent variables for each case.
  3. There is no multicollinearity between the independent variables. Multicollinearity occurs when two or more independent variables are highly correlated, which makes it difficult to determine how much each independent variable contributes to the category of the dependent variable.
  4. There are no extreme outliers in the data.
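
The multicollinearity assumption in particular can be checked with variance inflation factors (VIF). Here is a minimal sketch assuming statsmodels and pandas, with synthetic data in which one variable is deliberately a near-copy of another:

```python
# Minimal sketch: checking the no-multicollinearity assumption with
# variance inflation factors (VIF). The synthetic data makes x3 a
# near-copy of x1, which inflates its VIF.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

rng = np.random.RandomState(0)
df = pd.DataFrame({
    "x1": rng.normal(size=200),
    "x2": rng.normal(size=200),
})
df["x3"] = df["x1"] + rng.normal(scale=0.05, size=200)  # highly correlated with x1

X = add_constant(df)  # VIF is computed on a design matrix with an intercept
for i, col in enumerate(X.columns):
    if col == "const":
        continue
    print(col, "VIF:", variance_inflation_factor(X.values, i))
# A common rule of thumb flags VIF values above roughly 5-10 as problematic.
```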