Machine learning (ML) is revolutionizing industries by enabling systems to learn from data, identify patterns, and make decisions with minimal human intervention. In this blog post, we’ll delve into the fascinating world of machine learning, its fundamentals, and the step-by-step process of building algorithms from scratch. Whether you’re a beginner or looking to enhance your skills, this guide provides a practical understanding of the basics of ML and hands-on implementation.
At its core, machine learning is a subset of artificial intelligence (AI) that focuses on creating systems that can improve from experience. Unlike traditional programming, where explicit instructions are given, ML models learn patterns from data to make predictions or decisions.
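To make that distinction concrete, here is a minimal sketch contrasting the two approaches; the temperature-conversion task and its toy data are my own illustrative choices, not drawn from any particular library:
Python code
import numpy as np

# Traditional programming: the rule is written by hand (illustrative example)
def fahrenheit_rule(celsius):
    return 1.8 * celsius + 32

# Machine learning: the rule is estimated from example data
celsius = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
fahrenheit = np.array([32.0, 50.0, 68.0, 86.0, 104.0])
slope, intercept = np.polyfit(celsius, fahrenheit, deg=1)  # fit a line to the examples

print(fahrenheit_rule(25))     # 77.0, from the explicit rule
print(slope * 25 + intercept)  # ~77.0, learned from the data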
Let’s build two basic ML algorithms—Linear Regression and K-Nearest Neighbors (KNN)—from scratch using Python.
Linear regression is a supervised learning algorithm used to predict a continuous target variable.
The equation of a line is:
y = mx + c
In ML terms:
\hat{y} = w \cdot x + b
We minimize the Mean Squared Error (MSE):
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2
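The code below minimizes this loss with gradient descent, so it helps to see where the update formulas come from. Differentiating the MSE with respect to the weight and the bias gives:

\frac{\partial\,\mathrm{MSE}}{\partial w} = -\frac{2}{n} \sum_{i=1}^{n} x_i (y_i - \hat{y}_i), \qquad \frac{\partial\,\mathrm{MSE}}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)

Each epoch then nudges the parameters a small step, scaled by the learning rate \eta, in the direction that decreases the error: w \leftarrow w - \eta \cdot \partial\mathrm{MSE}/\partial w, and likewise for b.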
Python code
import numpy as np

# Sample data
X = np.array([1, 2, 3, 4, 5])
y = np.array([2.2, 2.8, 4.5, 3.7, 5.5])

# Initialize parameters
w = 0  # Weight
b = 0  # Bias
learning_rate = 0.01
epochs = 1000

# Training the model
for epoch in range(epochs):
    # Predictions
    y_pred = w * X + b

    # Compute gradients
    dw = -2 * np.sum(X * (y - y_pred)) / len(X)
    db = -2 * np.sum(y - y_pred) / len(X)

    # Update parameters
    w -= learning_rate * dw
    b -= learning_rate * db

print(f"Trained Weight: {w}, Trained Bias: {b}")
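Once training finishes, prediction is just the learned line evaluated at new inputs. A quick sketch, assuming the training loop above has already run (the test inputs are arbitrary values chosen for illustration):
Python code
# Predict for unseen inputs using the trained parameters
X_new = np.array([6, 7])  # illustrative inputs, not from the training set
print(f"Predictions for {X_new}: {w * X_new + b}")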
KNN is a simple supervised learning algorithm that classifies a data point according to the majority class among its k nearest neighbors.
Python code
import numpy as np
from collections import Counter

# Sample data
X_train = np.array([[1, 2], [2, 3], [3, 1], [6, 5], [7, 8]])
y_train = np.array([0, 0, 0, 1, 1])  # 0: Class A, 1: Class B

def euclidean_distance(p1, p2):
    return np.sqrt(np.sum((p1 - p2) ** 2))

def knn_predict(X_train, y_train, X_query, k=3):
    # Compute the distance from the query point to every training point
    distances = []
    for i in range(len(X_train)):
        dist = euclidean_distance(X_train[i], X_query)
        distances.append((dist, y_train[i]))

    # Sort by distance and keep the k closest neighbors
    distances.sort(key=lambda x: x[0])
    k_nearest = distances[:k]

    # Return the majority class among the k nearest labels
    k_labels = [label for _, label in k_nearest]
    return Counter(k_labels).most_common(1)[0][0]

# Test the model
X_query = np.array([5, 5])
k = 3
predicted_class = knn_predict(X_train, y_train, X_query, k)
print(f"Predicted Class: {predicted_class}")
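The choice of k matters: a small k follows local noise, a large k smooths over class boundaries, and an even k can produce ties in the vote. A quick sketch of how the prediction behaves on the toy data above (this comparison loop is my own addition):
Python code
# Compare predictions for different (odd) values of k
for k in [1, 3, 5]:
    print(f"k={k}: predicted class {knn_predict(X_train, y_train, X_query, k)}")
With this data, k=1 and k=3 predict Class B, while k=5 flips the prediction to Class A, since three of the five nearest neighbors belong to it.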
While building algorithms from scratch is an excellent learning exercise, real-world applications often involve libraries like scikit-learn, TensorFlow, or PyTorch. These libraries provide optimized and tested implementations of various ML algorithms.
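As a sketch of what that looks like in practice (assuming scikit-learn is installed; the data reuses the toy arrays from above), the same two models take only a few lines:
Python code
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# Linear regression: scikit-learn expects 2-D feature arrays
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2.2, 2.8, 4.5, 3.7, 5.5])
reg = LinearRegression().fit(X, y)
print(reg.coef_, reg.intercept_)  # learned weight and bias

# KNN classification on the same toy points as before
X_train = np.array([[1, 2], [2, 3], [3, 1], [6, 5], [7, 8]])
y_train = np.array([0, 0, 0, 1, 1])
clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print(clf.predict([[5, 5]]))  # predicted class for the query point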
Building machine learning algorithms from scratch demystifies the underlying concepts and strengthens your foundation in ML. By following the step-by-step process, you can better understand how models work, their limitations, and their potential. As you progress, integrate pre-built libraries and frameworks to tackle more complex problems efficiently.
Are you ready to start your ML journey? Share your thoughts, questions, or feedback below! 🚀