Brief Introduction to Machine Learning: Definition & Steps Involved

Jaishri Rai
3 min read · May 18, 2023

The idea of this article is to keep a repository of learnings gained from books, articles, and discussions.

Source: Internet, Research Gate

What is machine learning? Machine learning is about giving machines the ability to learn from data and act on what they learn. For example, if I run a retail shop, I can use my own past sales data or sales data from other organisations, and correlate it with different types of customers and their products, wishes, and tastes. Based on this, one can plan inventory, decide which items to procure and in what quantities, set suitable pricing for a product, and from there work out a suitable marketing strategy and a tentative budget plan. Basically, we feed in data to solve a particular problem statement. Through some algorithm, simple or complex, we make the machine capable of identifying solutions from the underlying data.
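
To make the retail example concrete, here is a minimal sketch of the "past data in, decision out" pattern. The sales figures, the function names `forecast_next` and `reorder_quantity`, and the stock levels are all invented for illustration. A moving average is about the simplest possible stand-in for a learned model, so treat this as a sketch of the idea rather than a recommended forecasting method.

```python
from statistics import mean

# Hypothetical units sold per month for one product (invented numbers).
past_sales = [120, 135, 128, 150, 142, 160]


def forecast_next(sales, window=3):
    """Forecast next period's demand as the mean of the last `window` periods."""
    if window <= 0 or window > len(sales):
        raise ValueError("window must be between 1 and len(sales)")
    return mean(sales[-window:])


def reorder_quantity(forecast, on_hand, safety_stock=0):
    """Units to procure so that stock covers the forecast plus a safety buffer."""
    return max(0, round(forecast + safety_stock - on_hand))


forecast = forecast_next(past_sales)
to_order = reorder_quantity(forecast, on_hand=40, safety_stock=20)
print(f"forecast: {forecast:.1f} units, order: {to_order} units")
```

A real project would replace the moving average with a model chosen and validated on the shop's own data, but the flow from historical sales to a procurement decision stays the same.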

Identifying the problem statement, gathering suitable data, preparing or cleaning the data, deciding on the important features, defining the relations among those features, and finally identifying suitable logic for the required algorithm are some of the steps involved in the process. While finalising the algorithm or model, the model must be evaluated on separate training and test datasets.

Key aspects: With these three aspects in hand, one can make a good start on developing ML models.

  1. Understanding the problem statement,
  2. Understanding the data,
  3. Knowledge of the logic and ideas behind classical ML algorithms

Now, to summarise theoretically, some of the basic steps involved in ML projects are:

Step 1: Problem or Opportunity Identification

  • A good ML project starts with the ability of the organization to define the problem clearly. Domain knowledge is very important at this stage of the project.
  • Problem definition or opportunity identification will be a major challenge for many companies that do not have the capability to ask the right questions.

Step 2: Feature Extraction − Collection of Relevant Data

  • Once the problem is defined clearly, the project team should identify and collect the relevant data. This is an iterative process since ‘relevant data’ may not be known in advance in many analytics projects. The existence of ERP systems will be very useful at this stage. In addition to the data available within the organization, the team may have to collect data from external sources. The data needs to be integrated to create a data lake. Poor data quality is a major impediment to successful ML model development.

Step 3: Data Pre-processing

  • Anecdotal evidence suggests that data preparation and data processing account for a significant proportion of the effort in any analytics project. This includes data cleaning, data imputation, and the creation of additional variables (feature engineering) such as interaction variables and dummy variables.
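
The three pre-processing ideas named above (imputation, dummy variables, and interaction variables) can be sketched in a few lines. The toy records, the column names, and the `preprocess` helper below are assumptions made up for illustration. Mean imputation is only one of several possible strategies, chosen here because it is the easiest to read.

```python
from statistics import mean

# Hypothetical toy sales records; None marks a missing value.
rows = [
    {"price": 10.0, "units": 3, "segment": "retail"},
    {"price": None, "units": 5, "segment": "wholesale"},
    {"price": 14.0, "units": 2, "segment": "retail"},
]


def preprocess(rows):
    """Return new rows with an imputed price, dummy variables, and an interaction term."""
    # Data imputation: fill a missing price with the mean of the observed prices.
    fill = mean(r["price"] for r in rows if r["price"] is not None)
    segments = sorted({r["segment"] for r in rows})
    out = []
    for r in rows:
        price = fill if r["price"] is None else r["price"]
        new = {"price": price, "units": r["units"]}
        # Dummy variables: one 0/1 column per segment, with the first
        # segment (alphabetically) dropped to serve as the baseline.
        for s in segments[1:]:
            new[f"segment_{s}"] = 1 if r["segment"] == s else 0
        # Interaction variable: the product of two existing features.
        new["price_x_units"] = price * r["units"]
        out.append(new)
    return out


cleaned = preprocess(rows)
for row in cleaned:
    print(row)
```

In practice a library such as pandas or scikit-learn would usually handle these steps; the plain-Python version is only meant to show what each step does to the data.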

Step 4: Model Building

  • ML model building is an iterative process that aims to find the best model. Several analytical tools and solution procedures will be used to find the best ML model.
  • To avoid overfitting, it is important to create several training and validation datasets, so that each candidate model is judged on data it was not trained on.
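
One common way to create several training and validation datasets is k-fold cross-validation. The sketch below uses noise-free toy data and two deliberately simple candidate models, a constant mean predictor and a least-squares line. The data and the helper names (`k_fold_splits`, `fit_mean`, `fit_line`, `cv_error`) are all assumptions for illustration, not something the source prescribes.

```python
import random
from statistics import mean


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def fit_mean(xs, ys):
    """Candidate 1: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2: one-variable least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cv_error(fit, xs, ys, k=5):
    """Average validation mean-squared error of `fit` across k folds."""
    errors = []
    for train, val in k_fold_splits(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors.append(mean((model(xs[i]) - ys[i]) ** 2 for i in val))
    return mean(errors)


# Hypothetical toy data that follows y = 2x + 1 exactly.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]

scores = {"mean": cv_error(fit_mean, xs, ys), "line": cv_error(fit_line, xs, ys)}
best = min(scores, key=scores.get)
print(scores, "-> best:", best)
```

Because every model is scored only on the fold it did not see during fitting, a model that merely memorises its training data gains no advantage in this comparison.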

Step 5: Communication and Deployment of the Data Analysis

  • The primary objective of machine learning is to come up with actionable items that can be deployed.
  • Communicating the output of the ML algorithm to top management and clients plays a crucial role. Innovative data visualization techniques may be used at this stage.
  • Deployment of the model may involve developing software solutions and products, such as recommender engines.
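
As a rough idea of what a very small recommender engine might look like, the sketch below suggests items that were most often bought together with a given item. The baskets and the function names `build_co_counts` and `recommend` are invented for illustration. Production recommenders are far more involved, so this is only a sketch under those assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets (invented for illustration).
baskets = [
    {"tea", "sugar", "biscuits"},
    {"tea", "sugar"},
    {"coffee", "sugar"},
    {"tea", "biscuits"},
]


def build_co_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    co_counts = {}
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            co_counts.setdefault(a, Counter())[b] += 1
            co_counts.setdefault(b, Counter())[a] += 1
    return co_counts


def recommend(item, co_counts, top_n=2):
    """Return up to `top_n` items most often bought with `item`.

    Ties are broken alphabetically so the result is deterministic.
    """
    neighbours = co_counts.get(item, Counter())
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


co_counts = build_co_counts(baskets)
suggestions = recommend("tea", co_counts)
print(suggestions)
```

An item that never appears in any basket simply gets an empty list, which a deployed system would typically replace with a fallback such as the overall best sellers.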

Sources:

Machine Learning Using Python, Manaranjan Pradhan; Open Internet

