
What is a Neural Network? Basics of Deep Learning

Neural networks are the building blocks of modern AI and power applications like self-driving cars, chatbots, and image recognition. Unlike traditional programs, which follow predefined rules, neural networks learn from data by recognizing patterns.

1. Understanding Neural Networks

A. What is a Neural Network?

A neural network is a system of artificial neurons that work together to process information. It is inspired by how the human brain works.

Imagine a self-driving car trying to recognize a stop sign:

  • The car’s camera captures an image.
  • The neural network analyzes the image and detects patterns.
  • It determines whether the image contains a stop sign.

B. Structure of a Neural Network


A neural network is made up of three layers:

  • Input layer → Receives the raw data.
  • Hidden layers → Process the data and extract patterns.
  • Output layer → Produces the final prediction.

Example: Recognizing handwritten digits (0-9) using a neural network.

  • Input: The pixels of the image (grayscale values).
  • Hidden Layers: Extract important features (like curves or lines).
  • Output: The number the image represents.
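The digit-recognition layout above can be sketched in PyTorch (the layer sizes here are illustrative assumptions, not a tuned model):

```python
import torch
import torch.nn as nn

# Input: a 28x28 grayscale image, flattened to 784 pixel values
digit_net = nn.Sequential(
    nn.Flatten(),         # 28x28 image -> 784 values (input layer)
    nn.Linear(784, 128),  # hidden layer: extracts features
    nn.ReLU(),
    nn.Linear(128, 10),   # output layer: one score per digit 0-9
)

fake_image = torch.rand(1, 28, 28)  # stand-in for a real digit image
scores = digit_net(fake_image)
print(scores.shape)  # torch.Size([1, 10])
```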

Think of a neural network like a chef preparing a meal:

  • Ingredients (input layer) → Raw data.
  • Cooking process (hidden layers) → Transforming data into something useful.
  • Final dish (output layer) → The result (e.g., prediction).

2. Types of Neural Networks

Neural networks come in different types, depending on what they are used for.

A. Multi-Layer Perceptron (MLP) – The Basic Neural Network

  • The simplest type of neural network.
  • Used for tasks like spam detection (email is spam or not).

B. Convolutional Neural Networks (CNNs) – For Images

  • Specially designed for image recognition.
  • Used in self-driving cars, facial recognition, and medical imaging.
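A minimal sketch of what makes CNNs image-friendly: a convolutional layer slides small filters over the image, producing one feature map per filter (the sizes below are arbitrary examples):

```python
import torch
import torch.nn as nn

# One convolutional layer scanning a 1-channel (grayscale) image
# with 8 filters of size 3x3; padding=1 keeps the spatial size unchanged
conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)

image = torch.rand(1, 1, 28, 28)  # a batch of one 28x28 grayscale image
feature_maps = conv(image)
print(feature_maps.shape)  # torch.Size([1, 8, 28, 28])
```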

C. Recurrent Neural Networks (RNNs) – For Sequences

  • Process sequential, time-based data like speech, music, and stock prices.
  • Used in chatbots and real-time translation (Google Translate).
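What "remembering sequences" looks like in code: an RNN reads a sequence one time step at a time and carries a hidden state forward (the dimensions below are made up for illustration):

```python
import torch
import torch.nn as nn

# An RNN reading sequences step by step, carrying a hidden state forward
rnn = nn.RNN(input_size=4, hidden_size=16, batch_first=True)

sequence = torch.rand(1, 10, 4)  # one sequence of 10 time steps, 4 features each
outputs, hidden = rnn(sequence)
print(outputs.shape)  # torch.Size([1, 10, 16]) - one output per time step
print(hidden.shape)   # torch.Size([1, 1, 16])  - final hidden state ("memory")
```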

Think of different neural networks like different types of athletes:

  • MLP → A runner (simple tasks).
  • CNN → A gymnast (identifies shapes and patterns).
  • RNN → A musician (remembers past data and sequences).

3. How Neural Networks Learn (Training Process)

A. The Learning Process

Neural networks learn by adjusting their internal connections (weights and biases).

Example: Teaching a child to recognize apples 

  1. Show the child pictures of apples and non-apples.
  2. If they make a mistake, correct them.
  3. Repeat the process until they can recognize apples correctly.

Neural networks learn in a similar way by:

  1. Receiving input data (e.g., an image of an apple).
  2. Making a prediction (e.g., “this is an apple”).
  3. Checking the error (Was the prediction correct?).
  4. Adjusting itself to make better future predictions.
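These four steps can be sketched with a single weight and plain Python; the toy "network" below learns y = 2·x by repeatedly shrinking its error (the numbers are made up for illustration):

```python
# A toy neuron with one weight learning y = 2*x
weight = 0.0
learning_rate = 0.1

x, target = 1.0, 2.0  # one input and its correct answer
for step in range(50):
    prediction = weight * x              # 1-2. receive input, make a prediction
    error = prediction - target         # 3. check the error
    weight -= learning_rate * error * x  # 4. adjust to do better next time

print(round(weight, 3))  # ~1.99, very close to the true value 2.0
```

Real networks do exactly this, just with millions of weights adjusted at once via backpropagation.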

B. The Role of Activation Functions

Activation functions decide whether a neuron should “fire” (activate) based on its input.

Simple Artificial Neuron Simulation

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define input, weights, and bias
inputs = np.array([1, 0])  # Binary input
weights = np.array([0.5, -0.5])  # Weight coefficients
bias = 0.1  # Bias term

# Compute output
output = sigmoid(np.dot(inputs, weights) + bias)
print("Output:", output)

Output:

Output: 0.6456563062257954

Code Breakdown

This code simulates a simple artificial neuron:

  • Imports NumPy for numerical operations.
  • Defines the sigmoid function, which squashes values between 0 and 1.
  • Initializes inputs, weights, and bias:
    • Inputs: [1, 0] (binary values).
    • Weights: [0.5, -0.5] (determines feature importance).
    • Bias: 0.1 (shifts activation threshold).
  • Computes the weighted sum using the dot product of inputs and weights, then adds the bias.
  • Applies the sigmoid function to transform the result into a probability-like value.
  • Prints the final output, which determines neuron activation.

Simple Neural Network Using PyTorch

import torch
import torch.nn as nn

# Define an MLP model
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(2, 4)  # Input layer to hidden layer
        self.relu = nn.ReLU()  # Activation function
        self.fc2 = nn.Linear(4, 1)  # Hidden layer to output layer
        self.sigmoid = nn.Sigmoid()  # Output activation

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.sigmoid(self.fc2(x))
        return x

# Instantiate model
model = MLP()
print(model)

Output:

MLP(
  (fc1): Linear(in_features=2, out_features=4, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=4, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)
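As a quick sanity check, an untrained network of the same shape (built here with nn.Sequential for brevity) can already be called on a sample input; the final Sigmoid guarantees the output lands between 0 and 1, even though the value is meaningless before training:

```python
import torch
import torch.nn as nn

# Same architecture as the MLP above: 2 -> 4 -> 1 with ReLU and Sigmoid
model = nn.Sequential(
    nn.Linear(2, 4), nn.ReLU(),
    nn.Linear(4, 1), nn.Sigmoid(),
)

sample = torch.tensor([[0.0, 1.0]])  # one input pair
prediction = model(sample)           # forward pass
print(prediction.item())  # some value between 0 and 1, random before training
```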

Code Breakdown

This code defines a small MLP in PyTorch:

  • fc1 maps the 2 input features to 4 hidden neurons, followed by a ReLU activation.
  • fc2 maps the 4 hidden neurons to a single output, and a Sigmoid squashes it into the range 0 to 1.
  • The forward method chains these layers together, defining how data flows through the network.

XOR Neural Network

import torch.optim as optim

# Define loss function and optimizer
criterion = nn.BCELoss()  # Binary Cross-Entropy Loss
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Sample training data
X_train = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_train = torch.tensor([[0.0], [1.0], [1.0], [0.0]])  # XOR dataset labels

# Training loop
epochs = 1000
for epoch in range(epochs):
    optimizer.zero_grad()
    outputs = model(X_train)  
    loss = criterion(outputs, y_train)
    loss.backward()
    optimizer.step()

    if epoch % 200 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item()}")

# Final output after training
print("Final model output:", model(X_train))

Output:

Epoch 0, Loss: 0.6961403489112854
Epoch 200, Loss: 0.22329354286193848
Epoch 400, Loss: 0.04864644259214401
Epoch 600, Loss: 0.019203947857022285
Epoch 800, Loss: 0.010208996012806892
Final model output: tensor([[0.0055],
        [0.9953],
        [0.9897],
        [0.0047]], grad_fn=<SigmoidBackward0>)

Code Breakdown

This code trains a neural network to learn the XOR function, a basic logic operation where:

  • 0 XOR 0 = 0
  • 0 XOR 1 = 1
  • 1 XOR 0 = 1
  • 1 XOR 1 = 0

Step-by-Step Breakdown:

1. Define Loss Function & Optimizer

  • The Binary Cross-Entropy Loss (BCELoss) measures how well the model’s predictions match actual outputs.
  • Adam optimizer updates the model’s weights to minimize the loss and improve predictions.
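To see what BCELoss actually computes, its formula −[y·log(p) + (1−y)·log(1−p)] can be evaluated by hand and compared against PyTorch's result (the 0.9/1.0 values below are arbitrary examples):

```python
import torch
import torch.nn as nn

prediction = torch.tensor([0.9])  # model says "90% sure the label is 1"
target = torch.tensor([1.0])      # the true label is 1

loss = nn.BCELoss()(prediction, target)
manual = -(target * torch.log(prediction)
           + (1 - target) * torch.log(1 - prediction)).mean()

print(loss.item(), manual.item())  # both ~0.1054, i.e. -log(0.9)
```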

2. Define the Training Data (XOR Dataset)

  • The input X_train consists of 4 possible binary combinations (0s and 1s).
  • The expected output y_train follows the XOR truth table.

3. Training the Model (1000 Epochs)

  • The model predicts outputs, calculates loss, and adjusts weights using gradient descent.
  • Loss is printed every 200 epochs to track improvement.

4. Model Evaluation

  • After training, the model makes predictions for X_train, and the results are displayed.

Understanding the Output

The model prints loss values and final predictions. Let’s analyze them:

1. Loss at Different Epochs

  • The loss drops from about 0.70 at epoch 0 to about 0.01 by epoch 800, showing that the model's predictions steadily improve during training.

2. Final Model Predictions

tensor([[0.0055],
        [0.9953],
        [0.9897],
        [0.0047]])

These values represent the model’s predicted probabilities (between 0 and 1) after applying the Sigmoid activation function.
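To turn these probabilities into hard 0/1 answers, a common convention is to threshold at 0.5 (using the predicted values from the output above):

```python
import torch

# Predicted probabilities from the trained model above
probs = torch.tensor([[0.0055], [0.9953], [0.9897], [0.0047]])
labels = (probs > 0.5).float()  # round to hard 0/1 decisions
print(labels.flatten().tolist())  # [0.0, 1.0, 1.0, 0.0] - the XOR truth table
```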

 

The model has successfully learned the XOR function!

  • Predictions are very close to the expected values.
  • It correctly classifies XOR outputs after training.