
What is a Neural Network? Basics of Deep Learning

Neural networks are the building blocks of modern AI and power applications like self-driving cars, chatbots, and image recognition. Unlike traditional programs, which follow predefined rules, neural networks learn from data by recognizing patterns.

1. Understanding Neural Networks

A. What is a Neural Network?

A neural network is a system of artificial neurons that work together to process information. It is inspired by how the human brain works.

Imagine a self-driving car trying to recognize a stop sign:

  • The car’s camera captures an image.
  • The neural network analyzes the image and detects patterns.
  • It determines whether the image contains a stop sign.

B. Structure of a Neural Network


A neural network is made up of three layers:

  • Input layer → Receives the raw data.
  • Hidden layers → Process the data and extract patterns.
  • Output layer → Produces the final prediction.

Example: Recognizing handwritten digits (0-9) using a neural network.

  • Input: The pixels of the image (grayscale values).
  • Hidden Layers: Extract important features (like curves or lines).
  • Output: The number the image represents.
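The digit-recognition layout above can be sketched in PyTorch (the layer sizes here are illustrative assumptions, not a tuned model):

```python
import torch
import torch.nn as nn

# Input: a 28x28 grayscale image, flattened to 784 pixel values
digit_net = nn.Sequential(
    nn.Flatten(),         # 28x28 image -> 784 values (input layer)
    nn.Linear(784, 128),  # hidden layer: extracts features
    nn.ReLU(),
    nn.Linear(128, 10),   # output layer: one score per digit 0-9
)

fake_image = torch.rand(1, 28, 28)  # stand-in for a real digit image
scores = digit_net(fake_image)
print(scores.shape)  # torch.Size([1, 10])
```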

Think of a neural network like a chef preparing a meal:

  • Ingredients (input layer) → Raw data.
  • Cooking process (hidden layers) → Transforming data into something useful.
  • Final dish (output layer) → The result (e.g., prediction).

2. Types of Neural Networks

Neural networks come in different types, depending on what they are used for.

A. Multi-Layer Perceptron (MLP) – The Basic Neural Network

  • The simplest type of neural network.
  • Used for tasks like spam detection (email is spam or not).

B. Convolutional Neural Networks (CNNs) – For Images

  • Specially designed for image recognition.
  • Used in self-driving cars, facial recognition, and medical imaging.
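A minimal sketch of what makes CNNs image-friendly: a convolutional layer slides small filters over the image, producing one feature map per filter (the sizes below are arbitrary examples):

```python
import torch
import torch.nn as nn

# One convolutional layer scanning a 1-channel (grayscale) image
# with 8 filters of size 3x3; padding=1 keeps the spatial size unchanged
conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)

image = torch.rand(1, 1, 28, 28)  # a batch of one 28x28 grayscale image
feature_maps = conv(image)
print(feature_maps.shape)  # torch.Size([1, 8, 28, 28])
```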

C. Recurrent Neural Networks (RNNs) – For Sequences

  • Process sequential, time-based data like speech, music, and stock prices.
  • Used in chatbots and real-time translation (Google Translate).
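What "remembering sequences" looks like in code: an RNN reads a sequence one time step at a time and carries a hidden state forward (the dimensions below are made up for illustration):

```python
import torch
import torch.nn as nn

# An RNN reading sequences step by step, carrying a hidden state forward
rnn = nn.RNN(input_size=4, hidden_size=16, batch_first=True)

sequence = torch.rand(1, 10, 4)  # one sequence of 10 time steps, 4 features each
outputs, hidden = rnn(sequence)
print(outputs.shape)  # torch.Size([1, 10, 16]) - one output per time step
print(hidden.shape)   # torch.Size([1, 1, 16])  - final hidden state ("memory")
```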

Think of different neural networks like different types of athletes:

  • MLP → A runner (simple tasks).
  • CNN → A gymnast (identifies shapes and patterns).
  • RNN → A musician (remembers past data and sequences).

3. How Neural Networks Learn (Training Process)

A. The Learning Process

Neural networks learn by adjusting their internal connections (weights and biases).

Example: Teaching a child to recognize apples 

  1. Show the child pictures of apples and non-apples.
  2. If they make a mistake, correct them.
  3. Repeat the process until they can recognize apples correctly.

Neural networks learn in a similar way by:

  1. Receiving input data (e.g., an image of an apple).
  2. Making a prediction (e.g., “this is an apple”).
  3. Checking the error (Was the prediction correct?).
  4. Adjusting itself to make better future predictions.
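These four steps can be sketched with a single weight and plain Python; the toy "network" below learns y = 2·x by repeatedly shrinking its error (the numbers are made up for illustration):

```python
# A toy neuron with one weight learning y = 2*x
weight = 0.0
learning_rate = 0.1

x, target = 1.0, 2.0  # one input and its correct answer
for step in range(50):
    prediction = weight * x              # 1-2. receive input, make a prediction
    error = prediction - target         # 3. check the error
    weight -= learning_rate * error * x  # 4. adjust to do better next time

print(round(weight, 3))  # ~1.99, very close to the true value 2.0
```

Real networks do exactly this, just with millions of weights adjusted at once via backpropagation.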

B. The Role of Activation Functions

Activation functions decide whether a neuron should “fire” (activate) based on its input.

Simple Artificial Neuron Simulation

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define input, weights, and bias
inputs = np.array([1, 0])  # Binary input
weights = np.array([0.5, -0.5])  # Weight coefficients
bias = 0.1  # Bias term

# Compute output
output = sigmoid(np.dot(inputs, weights) + bias)
print("Output:", output)

Output:

Output: 0.6456563062257954

Code Breakdown

This code simulates a simple artificial neuron:

  • Imports NumPy for numerical operations.
  • Defines the sigmoid function, which squashes values between 0 and 1.
  • Initializes inputs, weights, and bias:
    • Inputs: [1, 0] (binary values).
    • Weights: [0.5, -0.5] (determines feature importance).
    • Bias: 0.1 (shifts activation threshold).
  • Computes the weighted sum using the dot product of inputs and weights, then adds the bias.
  • Applies the sigmoid function to transform the result into a probability-like value.
  • Prints the final output, which determines neuron activation.

Simple Neural Network Using PyTorch

import torch
import torch.nn as nn

# Define an MLP model
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(2, 4)  # Input layer to hidden layer
        self.relu = nn.ReLU()  # Activation function
        self.fc2 = nn.Linear(4, 1)  # Hidden layer to output layer
        self.sigmoid = nn.Sigmoid()  # Output activation

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.sigmoid(self.fc2(x))
        return x

# Instantiate model
model = MLP()
print(model)

Output:

MLP(
  (fc1): Linear(in_features=2, out_features=4, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=4, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)
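As a quick sanity check, an untrained network of the same shape (built here with nn.Sequential for brevity) can already be called on a sample input; the final Sigmoid guarantees the output lands between 0 and 1, even though the value is meaningless before training:

```python
import torch
import torch.nn as nn

# Same architecture as the MLP above: 2 -> 4 -> 1 with ReLU and Sigmoid
model = nn.Sequential(
    nn.Linear(2, 4), nn.ReLU(),
    nn.Linear(4, 1), nn.Sigmoid(),
)

sample = torch.tensor([[0.0, 1.0]])  # one input pair
prediction = model(sample)           # forward pass
print(prediction.item())  # some value between 0 and 1, random before training
```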

Code Breakdown

This code defines a small MLP in PyTorch:

  • fc1 maps the 2 input features to 4 hidden neurons, followed by a ReLU activation.
  • fc2 maps the 4 hidden neurons to a single output, and a Sigmoid squashes it into the range 0 to 1.
  • The forward method chains these layers together, defining how data flows through the network.

XOR Neural Network

import torch.optim as optim

# Define loss function and optimizer
criterion = nn.BCELoss()  # Binary Cross-Entropy Loss
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Sample training data
X_train = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_train = torch.tensor([[0.0], [1.0], [1.0], [0.0]])  # XOR dataset labels

# Training loop
epochs = 1000
for epoch in range(epochs):
    optimizer.zero_grad()
    outputs = model(X_train)  
    loss = criterion(outputs, y_train)
    loss.backward()
    optimizer.step()

    if epoch % 200 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item()}")

# Final output after training
print("Final model output:", model(X_train))

Output:

Epoch 0, Loss: 0.6961403489112854
Epoch 200, Loss: 0.22329354286193848
Epoch 400, Loss: 0.04864644259214401
Epoch 600, Loss: 0.019203947857022285
Epoch 800, Loss: 0.010208996012806892
Final model output: tensor([[0.0055],
        [0.9953],
        [0.9897],
        [0.0047]], grad_fn=<SigmoidBackward0>)

Code Breakdown

This code trains a neural network to learn the XOR function, a basic logic operation where:

  • 0 XOR 0 = 0
  • 0 XOR 1 = 1
  • 1 XOR 0 = 1
  • 1 XOR 1 = 0

Step-by-Step Breakdown:

1. Define Loss Function & Optimizer

  • The Binary Cross-Entropy Loss (BCELoss) measures how well the model’s predictions match actual outputs.
  • Adam optimizer updates the model’s weights to minimize the loss and improve predictions.
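To see what BCELoss actually computes, its formula −[y·log(p) + (1−y)·log(1−p)] can be evaluated by hand and compared against PyTorch's result (the 0.9/1.0 values below are arbitrary examples):

```python
import torch
import torch.nn as nn

prediction = torch.tensor([0.9])  # model says "90% sure the label is 1"
target = torch.tensor([1.0])      # the true label is 1

loss = nn.BCELoss()(prediction, target)
manual = -(target * torch.log(prediction)
           + (1 - target) * torch.log(1 - prediction)).mean()

print(loss.item(), manual.item())  # both ~0.1054, i.e. -log(0.9)
```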

2. Define the Training Data (XOR Dataset)

  • The input X_train consists of 4 possible binary combinations (0s and 1s).
  • The expected output y_train follows the XOR truth table.

3. Training the Model (1000 Epochs)

  • The model predicts outputs, calculates loss, and adjusts weights using gradient descent.
  • Loss is printed every 200 epochs to track improvement.

4. Model Evaluation

  • After training, the model makes predictions for X_train, and the results are displayed.

Understanding the Output

The model prints loss values and final predictions. Let’s analyze them:

1. Loss at Different Epochs

  • The loss drops from about 0.70 at epoch 0 to about 0.01 by epoch 800, showing that the model's predictions steadily improve during training.

2. Final Model Predictions

tensor([[0.0055],
        [0.9953],
        [0.9897],
        [0.0047]])

These values represent the model’s predicted probabilities (between 0 and 1) after applying the Sigmoid activation function.
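To turn these probabilities into hard 0/1 answers, a common convention is to threshold at 0.5 (using the predicted values from the output above):

```python
import torch

# Predicted probabilities from the trained model above
probs = torch.tensor([[0.0055], [0.9953], [0.9897], [0.0047]])
labels = (probs > 0.5).float()  # round to hard 0/1 decisions
print(labels.flatten().tolist())  # [0.0, 1.0, 1.0, 0.0] - the XOR truth table
```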

 

The model has successfully learned the XOR function!

  • Predictions are very close to the expected values.
  • It correctly classifies XOR outputs after training.