What is an Algorithm?
An algorithm is simply a set of step-by-step instructions to solve a problem or accomplish a task. Think of it as a recipe for computers. Just as you follow a recipe to bake a cake, computers follow algorithms to perform tasks.
Examples in Everyday Life:
- Making a sandwich: The steps you follow (get bread, add ingredients, etc.) form an algorithm.
- Driving to work: Your route and the decisions you make along the way form an algorithm.
- Sorting books on a shelf: Whether by author, title, or genre, you’re using an algorithm.
The sections below cover some of the most important algorithms in computing, each with a short description, the core formula or pseudocode, a plain-English explanation, and a code example.
1. Elo Rating System
The Elo rating system (named after Arpad Elo) ranks players in competitive games. Players gain or lose points based on wins and losses, with wins over higher-rated opponents worth more points. It's used in chess, in matchmaking systems for video games, in "hot or not" sites (Zuckerberg's "Facemash"), and even in some dating apps to match users based on perceived attractiveness.
New Rating = Old Rating + K * (Actual Score - Expected Score)
where:
K = factor determining maximum rating change
Expected Score = 1 / (1 + 10^((Opponent Rating - Player Rating) / 400))
Simple explanation: Imagine a seesaw. If you beat someone “heavier” (higher-rated) than you, you get a big boost. If you lose to someone “lighter” (lower-rated), you drop a lot. It’s like a balancing act that adjusts your “weight” based on who you compete against.
def elo(player_rating, opponent_rating, score, k_factor=32):
expected_score = 1 / (1 + 10 ** ((opponent_rating - player_rating) / 400))
new_rating = player_rating + k_factor * (score - expected_score)
return round(new_rating)
# Example usage
new_rating = elo(1500, 1600, 1) # 1 for win, 0 for loss, 0.5 for draw
print(f"New rating: {new_rating}")
2. PageRank
PageRank ranks web pages for search results, originally used by Google. Pages are ranked based on the number and quality of links pointing to them, with links from important pages worth more. This algorithm revolutionized web search by providing more relevant results based on the web’s link structure.
PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
where:
PR(A) = PageRank of page A
d = damping factor
PR(Ti) = PageRank of pages linking to A
C(Ti) = number of outbound links from Ti
(The code below uses the common normalized variant, dividing (1-d) by the total number of pages so that the ranks form a probability distribution summing to 1.)
Simple explanation: Think of websites as politicians and links as votes. Popular politicians (websites) get more votes (links). But a vote from a well-known politician is worth more than one from an unknown person. PageRank is like counting these weighted votes to decide which politicians (websites) are most important.
function pageRank(graph, dampingFactor = 0.85, epsilon = 1e-8, maxIterations = 100) {
const numPages = Object.keys(graph).length;
let ranks = Object.fromEntries(Object.keys(graph).map(page => [page, 1 / numPages]));
for (let i = 0; i < maxIterations; i++) {
let newRanks = {};
let diff = 0;
for (const [page, links] of Object.entries(graph)) {
let rankSum = 0;
for (const incomingPage of Object.keys(graph)) {
if (graph[incomingPage].includes(page)) {
rankSum += ranks[incomingPage] / graph[incomingPage].length;
}
}
newRanks[page] = (1 - dampingFactor) / numPages + dampingFactor * rankSum;
diff += Math.abs(newRanks[page] - ranks[page]);
}
ranks = newRanks;
if (diff < epsilon) break;
}
return ranks;
}
// Example usage
const graph = {
'A': ['B', 'C'],
'B': ['C'],
'C': ['A']
};
console.log(pageRank(graph));
3. AES
AES (Advanced Encryption Standard) secures data by encrypting it, and is used in secure communication, online banking, and the protection of classified government information. It transforms data into an unreadable format using a secret key. This algorithm is widely used to protect sensitive information in digital systems, from messaging apps to secure file storage.
AES uses a substitution-permutation network with multiple rounds of:
1. SubBytes: Substitute each byte with another according to a lookup table
2. ShiftRows: Shift the rows of the state array
3. MixColumns: Mix data within each column of the state array
4. AddRoundKey: XOR the state with a round key derived from the main key
Simple explanation: Imagine you have a special box that scrambles anything you put inside in a very specific way. Only someone with the right key can unscramble it. AES is like this box, but for digital information. It jumbles up your data so well that even super-fast computers can’t figure out what’s inside without the key.
from Crypto.Cipher import AES
from Crypto.Random import get_random_bytes
def encrypt(plaintext, key):
cipher = AES.new(key, AES.MODE_EAX)
ciphertext, tag = cipher.encrypt_and_digest(plaintext.encode())
return (cipher.nonce, tag, ciphertext)
def decrypt(nonce, tag, ciphertext, key):
cipher = AES.new(key, AES.MODE_EAX, nonce)
plaintext = cipher.decrypt_and_verify(ciphertext, tag)
return plaintext.decode()
# Example usage
key = get_random_bytes(16) # 128-bit key
message = "Hello, AES!"
nonce, tag, ciphertext = encrypt(message, key)
decrypted = decrypt(nonce, tag, ciphertext, key)
print(f"Original: {message}")
print(f"Encrypted: {ciphertext}")
print(f"Decrypted: {decrypted}")
4. Dijkstra’s Algorithm
Dijkstra's algorithm finds the shortest path between points in a network, used in GPS navigation and network routing. It repeatedly visits the unvisited node with the smallest known distance and updates the distances of its neighbors, until the destination is reached. This algorithm is crucial in optimizing routes for navigation apps and efficiently routing data in computer networks.
1. Mark all nodes as unvisited. Create a set of unvisited nodes.
2. Assign every node a tentative distance value:
- 0 for initial node
- infinity for all other nodes
3. For the current node, consider all of its unvisited neighbors and calculate their tentative distances.
4. When done considering all neighbors, mark the current node as visited. A visited node will never be checked again.
5. If the destination node has been marked visited, stop. The algorithm has finished.
6. Otherwise, select the unvisited node with the smallest tentative distance, set it as the new current node, and go back to step 3.
Simple explanation: Imagine you’re in a maze trying to find the quickest way out. Dijkstra’s algorithm is like having a magic marker that colors the shortest path as you explore. At each intersection, it checks all possible routes and always chooses the shortest one, leaving a trail of the quickest path to any point in the maze.
import heapq
def dijkstra(graph, start, end):
distances = {node: float('infinity') for node in graph}
distances[start] = 0
pq = [(0, start)]
previous = {node: None for node in graph}
while pq:
current_distance, current_node = heapq.heappop(pq)
if current_node == end:
path = []
while current_node:
path.append(current_node)
current_node = previous[current_node]
return path[::-1], distances[end]
if current_distance > distances[current_node]:
continue
for neighbor, weight in graph[current_node].items():
distance = current_distance + weight
if distance < distances[neighbor]:
distances[neighbor] = distance
previous[neighbor] = current_node
heapq.heappush(pq, (distance, neighbor))
return None, float('infinity')
# Example usage
graph = {
'A': {'B': 4, 'C': 2},
'B': {'D': 3, 'C': 1},
'C': {'D': 5},
'D': {}
}
start, end = 'A', 'D'
path, distance = dijkstra(graph, start, end)
print(f"Shortest path from {start} to {end}: {path}")
print(f"Total distance: {distance}")
5. Rating Percentage Index (RPI)
RPI ranks sports teams based on win percentage and strength of schedule, used in NCAA basketball rankings and tournament seedings. It combines a team’s winning percentage with their opponents’ and opponents’ opponents’ winning percentages. This metric is widely used in college sports to evaluate team performance and determine tournament qualifications.
RPI = (WP * 0.25) + (OWP * 0.5) + (OOWP * 0.25)
where:
WP = team's winning percentage
OWP = opponents' winning percentage
OOWP = opponents' opponents' winning percentage
Simple explanation: Imagine you’re judging a pie-baking contest. You don’t just look at how many contests each baker has won (their win percentage). You also consider how tough their competition was (their opponents’ win percentage) and even how good their opponents’ opponents were (opponents’ opponents’ win percentage). RPI is like this for sports teams, giving a fuller picture of how good a team really is.
def calculate_rpi(team_wins, team_losses, opp_wins, opp_losses, opp_opp_wins, opp_opp_losses):
wp = team_wins / (team_wins + team_losses)
owp = opp_wins / (opp_wins + opp_losses)
oowp = opp_opp_wins / (opp_opp_wins + opp_opp_losses)
rpi = (wp * 0.25) + (owp * 0.5) + (oowp * 0.25)
return round(rpi, 4)
# Example usage
team_wins, team_losses = 20, 10
opp_wins, opp_losses = 300, 200
opp_opp_wins, opp_opp_losses = 4500, 3500
rpi = calculate_rpi(team_wins, team_losses, opp_wins, opp_losses, opp_opp_wins, opp_opp_losses)
print(f"Team's RPI: {rpi}")
6. TrueSkill™ Ranking
TrueSkill™ is a ranking system developed by Microsoft for Xbox Live, used to rank players in multiplayer games, especially those with team-based gameplay. It’s designed to work well with games that have more than two players or teams, and can handle draws. TrueSkill™ is used in various competitive gaming platforms and e-sports rankings.
TrueSkill uses a Bayesian inference algorithm. The core idea is:
Skill ~ N(μ, σ²)
where:
μ = the average skill of a player
σ = the uncertainty in the player's skill
After each match, μ and σ are updated based on the game outcome and the ratings of other players involved.
Simple explanation: Imagine each player has a “skill level” hidden inside a box. TrueSkill™ tries to guess what’s in the box by watching game results. It’s not just interested in who wins, but also how surprising the win was. If a lower-rated player beats a higher-rated one, TrueSkill™ might think, “Wow, maybe that lower-rated player is actually better than we thought!” and adjust accordingly.
from trueskill import Rating, quality_1vs1, rate_1vs1
def update_skills(player1, player2, player1_wins):
if player1_wins:
new_player1, new_player2 = rate_1vs1(player1, player2)
else:
new_player2, new_player1 = rate_1vs1(player2, player1)
return new_player1, new_player2
# Example usage
player1 = Rating() # Default is mu=25, sigma=8.333
player2 = Rating()
print(f"Initial ratings: Player1={player1}, Player2={player2}")
# Simulate a game where Player1 wins
new_player1, new_player2 = update_skills(player1, player2, player1_wins=True)
print(f"After Player1 wins: Player1={new_player1}, Player2={new_player2}")
# Calculate match quality (probability of a draw)
match_quality = quality_1vs1(new_player1, new_player2)
print(f"Match quality for a rematch: {match_quality:.2%}")
7. Merge Sort
Merge Sort is a divide-and-conquer algorithm used for sorting. It divides the input array into two halves, recursively sorts them, and then merges the two sorted halves. Merge Sort is widely used in various applications where stable, efficient sorting is required, such as in database systems and external sorting of large data sets.
MergeSort(arr):
If length of arr <= 1, return arr
mid = length of arr / 2
left = MergeSort(arr[0 ... mid-1])
right = MergeSort(arr[mid ... end])
return Merge(left, right)
Merge(left, right):
result = empty array
while left and right have elements:
if left[0] <= right[0]:
append left[0] to result
remove left[0] from left
else:
append right[0] to result
remove right[0] from right
append remaining elements of left to result
append remaining elements of right to result
return result
Simple explanation: Imagine you have a deck of cards to sort. You split the deck in half, then split each half again, and so on until you have single cards. Then you start pairing up cards, putting the lower card first each time. You keep pairing and merging these sorted mini-decks until you have one fully sorted deck. That’s Merge Sort!
def merge_sort(arr):
if len(arr) <= 1:
return arr
mid = len(arr) // 2
left = merge_sort(arr[:mid])
right = merge_sort(arr[mid:])
return merge(left, right)
def merge(left, right):
result = []
i, j = 0, 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
result.extend(left[i:])
result.extend(right[j:])
return result
# Example usage
arr = [38, 27, 43, 3, 9, 82, 10]
sorted_arr = merge_sort(arr)
print(f"Sorted array: {sorted_arr}")
8. Quick Sort
Quick Sort is another efficient, divide-and-conquer sorting algorithm. It works by selecting a ‘pivot’ element from the array and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The sub-arrays are then recursively sorted. Quick Sort is widely used due to its efficiency and is the standard sorting algorithm in many libraries.
QuickSort(arr, low, high):
if low < high:
pivot_index = Partition(arr, low, high)
QuickSort(arr, low, pivot_index - 1)
QuickSort(arr, pivot_index + 1, high)
Partition(arr, low, high):
pivot = arr[high]
i = low - 1
for j = low to high - 1:
if arr[j] <= pivot:
i = i + 1
swap arr[i] with arr[j]
swap arr[i + 1] with arr[high]
return i + 1
Simple explanation: Imagine you’re organizing books on a shelf by their height. You pick a book (the pivot) and put all shorter books to its left and all taller books to its right. Then you repeat this process for the left and right groups until all books are in order. That’s Quick Sort in action!
def quick_sort(arr):
if len(arr) <= 1:
return arr
else:
pivot = arr[len(arr) // 2]
left = [x for x in arr if x < pivot]
middle = [x for x in arr if x == pivot]
right = [x for x in arr if x > pivot]
return quick_sort(left) + middle + quick_sort(right)
# Example usage
arr = [38, 27, 43, 3, 9, 82, 10]
sorted_arr = quick_sort(arr)
print(f"Sorted array: {sorted_arr}")
9. Binary Search
Binary Search is an efficient algorithm for searching a sorted array by repeatedly dividing the search interval in half. It compares the target value to the middle element of the array; if they are unequal, the half in which the target cannot lie is eliminated and the search continues on the remaining half until the target is found or it’s clear the target is not in the array.
BinarySearch(arr, target):
left = 0
right = length of arr - 1
while left <= right:
mid = floor((left + right) / 2)
if arr[mid] == target:
return mid
else if arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return -1 # Target not found
Simple explanation: Imagine you’re looking for a word in a dictionary. You open it in the middle – if your word comes before that page, you only look in the first half; if it comes after, you only look in the second half. You keep dividing the remaining pages in half until you find your word. That’s how Binary Search works, but with numbers or any sorted data.
def binary_search(arr, target):
left, right = 0, len(arr) - 1
while left <= right:
mid = (left + right) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return -1 # Target not found
# Example usage
arr = [2, 3, 4, 10, 40]
target = 10
result = binary_search(arr, target)
print(f"Element {target} is {'present at index ' + str(result) if result != -1 else 'not present'}")
10. SHA-256
SHA-256 (Secure Hash Algorithm 256-bit) is a cryptographic hash function that generates a fixed-size 256-bit (32-byte) hash. It’s part of the SHA-2 family, designed by the U.S. National Security Agency. SHA-256 is widely used in security applications and protocols, including TLS, SSL, PGP, SSH, IPsec, and cryptocurrencies like Bitcoin for verifying transactions.
SHA-256 processes the input in 512-bit blocks, using a series of logical operations:
1. Break the data into 512-bit chunks
2. Expand the first 16 32-bit words into 64 32-bit words
3. Initialize hash values (h0 to h7)
4. Compress function:
- 64 rounds of logical operations (Ch, Maj, Σ0, Σ1, σ0, σ1)
5. Add compressed chunk to current hash value
6. Repeat 4-5 for all chunks
7. Produce final 256-bit hash value
Simple explanation: Imagine you have a machine that turns any book into a unique 64-character code. No matter how long or short the book is, you always get a 64-character code. If you change even one letter in the book, the code completely changes. That’s basically what SHA-256 does with digital data – it creates a unique “fingerprint” for any piece of information.
import hashlib
def sha256_hash(data):
return hashlib.sha256(data.encode()).hexdigest()
# Example usage
message = "Hello, SHA-256!"
hashed = sha256_hash(message)
print(f"SHA-256 hash of '{message}': {hashed}")
11. MD5
MD5 (Message Digest algorithm 5) is a widely used hash function that produces a 128-bit (16-byte) hash value. It was designed as a cryptographic hash function, but known collision vulnerabilities mean it is no longer recommended for security-critical applications; today it is used mainly for verifying data integrity.
MD5 processes input in 512-bit blocks:
1. Pad message to be a multiple of 512 bits
2. Initialize four 32-bit registers (A, B, C, D)
3. Process message in 16-word blocks:
- 4 rounds of 16 operations each (64 total)
- Each operation performs a nonlinear function on three of A, B, C, D
- Adds the result to the fourth variable, a word from the message, and a constant
- The result is then rotated and added to one of A, B, C, or D
4. The final output is the concatenation of A, B, C, and D
Simple explanation: Think of MD5 as a very complex paper shredder. You feed in a document of any size, and it shreds it into a very specific pattern of confetti that’s always the same size. If you feed in the exact same document, you get the exact same confetti pattern. But change even one letter, and the confetti pattern is completely different.
import hashlib
def md5_hash(data):
return hashlib.md5(data.encode()).hexdigest()
# Example usage
message = "Hello, MD5!"
hashed = md5_hash(message)
print(f"MD5 hash of '{message}': {hashed}")
12. A* Search Algorithm
A* (pronounced “A-star”) is a graph traversal and path search algorithm. It’s widely used in pathfinding and graph traversal, the process of finding a path between multiple points, especially in video games, robotics, and other applications requiring efficient route planning. A* uses a best-first search and finds a least-cost path from a given initial node to one goal node.
A*(start, goal):
openSet = {start}
cameFrom = an empty map
gScore = map with default value of Infinity
gScore[start] = 0
fScore = map with default value of Infinity
fScore[start] = heuristic(start, goal)
while openSet is not empty:
current = node in openSet with lowest fScore value
if current == goal:
return reconstruct_path(cameFrom, current)
openSet.Remove(current)
for each neighbor of current:
tentative_gScore = gScore[current] + d(current, neighbor)
if tentative_gScore < gScore[neighbor]:
cameFrom[neighbor] = current
gScore[neighbor] = tentative_gScore
fScore[neighbor] = gScore[neighbor] + heuristic(neighbor, goal)
if neighbor not in openSet:
openSet.add(neighbor)
return failure
Simple explanation: Imagine you’re in a maze trying to find the quickest way out. A* is like a smart explorer that not only knows where it’s been (like Dijkstra’s algorithm) but also has a rough idea of how far it is from the exit. It uses this combined knowledge to always explore the most promising path first, making it very efficient at finding the best route.
import heapq
def heuristic(a, b):
return abs(b[0] - a[0]) + abs(b[1] - a[1])
def a_star(graph, start, goal):
open_set = []
heapq.heappush(open_set, (0, start))
came_from = {}
g_score = {start: 0}
f_score = {start: heuristic(start, goal)}
while open_set:
current = heapq.heappop(open_set)[1]
if current == goal:
path = []
while current in came_from:
path.append(current)
current = came_from[current]
path.append(start)
return path[::-1]
for neighbor in graph[current]:
tentative_g_score = g_score[current] + graph[current][neighbor]
if tentative_g_score < g_score.get(neighbor, float('inf')):
came_from[neighbor] = current
g_score[neighbor] = tentative_g_score
f_score[neighbor] = g_score[neighbor] + heuristic(neighbor, goal)
heapq.heappush(open_set, (f_score[neighbor], neighbor))
return None
# Example usage
graph = {
(0, 0): {(0, 1): 1, (1, 0): 1},
(0, 1): {(0, 0): 1, (0, 2): 1, (1, 1): 1},
(0, 2): {(0, 1): 1, (1, 2): 1},
(1, 0): {(0, 0): 1, (1, 1): 1, (2, 0): 1},
(1, 1): {(0, 1): 1, (1, 0): 1, (1, 2): 1, (2, 1): 1},
(1, 2): {(0, 2): 1, (1, 1): 1, (2, 2): 1},
(2, 0): {(1, 0): 1, (2, 1): 1},
(2, 1): {(1, 1): 1, (2, 0): 1, (2, 2): 1},
(2, 2): {(1, 2): 1, (2, 1): 1}
}
start = (0, 0)
goal = (2, 2)
path = a_star(graph, start, goal)
print(f"Path from {start} to {goal}: {path}")
13. K-Means Clustering
K-Means Clustering is an unsupervised machine learning algorithm used to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centroid). It’s commonly used for market segmentation, document clustering, image segmentation, and pattern recognition.
K-Means Algorithm:
1. Choose the number of clusters k
2. Initialize k centroids randomly
3. Repeat until convergence:
a. Assign each point to the nearest centroid
b. Recompute centroids of clusters by taking the average of all points assigned to that centroid's cluster
4. The algorithm has converged when the assignments no longer change
Simple explanation: Imagine you have a bag of mixed candies and you want to sort them into k piles based on their color. You start by randomly placing k sheets of paper on a table, each representing a pile. Then you take each candy and put it on the sheet with the most similar color. After placing all candies, you adjust the position of each sheet to be in the center of its candy pile. You repeat this process of reassigning candies and moving sheets until no candies need to be moved. That’s essentially how K-Means works with data points instead of candies!
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
# Generate some random data
np.random.seed(42)
X = np.random.randn(300, 2)
# Perform K-Means clustering
kmeans = KMeans(n_clusters=3)
kmeans.fit(X)
# Get cluster centers and labels
centers = kmeans.cluster_centers_
labels = kmeans.labels_
# Plot the results
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.scatter(centers[:, 0], centers[:, 1], c='red', marker='x', s=200, linewidths=3)
plt.title('K-Means Clustering')
plt.show()
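The scikit-learn call above hides the assign-and-update loop. As a rough from-scratch sketch of the steps listed earlier (the random data, k=3, and the helper name kmeans are illustrative choices for this example, not part of any library API):
import numpy as np
def kmeans(X, k, n_iter=100, seed=42):
    rng = np.random.default_rng(seed)
    # Step 2: initialize centroids by picking k random data points
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # Step 3a: assign each point to its nearest centroid
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 3b: recompute each centroid as the mean of its assigned points
        new_centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                  else centroids[j] for j in range(k)])
        # Step 4: stop once the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
# Example usage on the same kind of random data as above
X = np.random.randn(300, 2)
centroids, labels = kmeans(X, k=3)
print("Cluster centers:\n", centroids)
Stopping when the centroids stop moving is one common convergence test; checking that the assignments no longer change works equally well.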
14. Support Vector Machine (SVM)
Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification and regression tasks. In classification, SVM finds the hyperplane that best divides a dataset into classes. SVM is widely used in various applications, including image classification, handwriting recognition, and bioinformatics.
SVM Optimization Problem:
Minimize: (1/2) ||w||^2
Subject to: yi(w · xi + b) ≥ 1 for all i
where:
w = weight vector
x = input vector
y = class label (+1 or -1)
b = bias
Simple explanation: Imagine you have a room full of red and blue balls, and you want to divide them with a sheet. SVM is like finding the best position for that sheet so that it keeps red and blue balls on different sides with the widest possible corridor between them. This ‘widest corridor’ approach helps SVM make good predictions even for new balls it hasn’t seen before.
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Generate a random dataset
X, y = make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, random_state=42)
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create and train the SVM model
clf = svm.SVC(kernel='linear')
clf.fit(X_train, y_train)
# Make predictions on the test set
y_pred = clf.predict(X_test)
# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
15. Naive Bayes Classifier
Naive Bayes is a probabilistic machine learning algorithm based on Bayes’ theorem, with an assumption of independence between features. It’s widely used for text classification, spam filtering, sentiment analysis, and recommendation systems due to its simplicity and effectiveness, especially with high-dimensional datasets.
Bayes' Theorem: P(A|B) = (P(B|A) * P(A)) / P(B)
For classification:
P(class|features) = (P(features|class) * P(class)) / P(features)
Naive Bayes assumes feature independence:
P(x1,...,xn|class) = P(x1|class) * ... * P(xn|class)
Simple explanation: Imagine you’re a weather forecaster with a very simple method. You look at today’s weather (sunny, rainy, or cloudy) and what you had for breakfast (cereal, toast, or fruit). Based on past data of how often it rained when you ate cereal and it was sunny, you predict tomorrow’s weather. This oversimplified approach, where you assume breakfast and today’s weather independently affect tomorrow’s weather, is similar to how Naive Bayes works with data.
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create and train the Naive Bayes model
clf = GaussianNB()
clf.fit(X_train, y_train)
# Make predictions on the test set
y_pred = clf.predict(X_test)
# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
16. Decision Trees
Decision Trees are a non-parametric supervised learning method used for classification and regression tasks. The model learns decision rules inferred from the data features to predict the target variable. Decision Trees are widely used due to their interpretability and their ability to handle both numerical and categorical data.
Decision Tree Algorithm:
1. Start at the root node
2. For each feature X:
a. Calculate Information Gain for splitting on X
3. Select feature with highest Information Gain
4. Create decision node using selected feature
5. Recursively repeat steps 2-4 for each child node until a stopping criterion is met
Information Gain = Entropy(parent) - Weighted Sum of Entropy(children)
Entropy = -Σ p(x) * log2(p(x))
Simple explanation: Imagine you’re playing a game of “20 Questions” to guess an animal. Each question you ask is like a node in a Decision Tree. “Does it fly?” If yes, you go down one path; if no, you go down another. Each answer narrows down the possibilities until you reach a final guess. That’s how a Decision Tree works: it asks a series of yes/no questions about the data to make a prediction or classification.
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create and train the Decision Tree model
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
# Make predictions on the test set
y_pred = clf.predict(X_test)
# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
17. Random Forest
Random Forest is an ensemble learning method that constructs multiple decision trees and merges them to get a more accurate and stable prediction. It’s used for both classification and regression tasks and is known for its high accuracy, ability to handle large datasets with higher dimensionality, and resistance to overfitting.
Random Forest Algorithm:
1. Create n bootstrap samples from the original dataset
2. For each sample:
a. Grow a decision tree
b. At each node, randomly select k features
c. Split the node using the best feature among k
d. Repeat until the tree is fully grown (no pruning)
3. For classification, use majority voting of all trees
For regression, take the average prediction of all trees
Simple explanation: Imagine you have a big decision to make, so you ask 100 of your friends for advice. Each friend has access to slightly different information and uses a slightly different method to make their decision. You then go with the majority opinion. This is similar to how Random Forest works: it creates many decision trees, each with a slightly different perspective on the data, and combines their predictions for a final result.
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create and train the Random Forest model
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
# Make predictions on the test set
y_pred = clf.predict(X_test)
# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
18. Gradient Descent
Gradient Descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of steepest descent as defined by the negative of the gradient. It’s widely used in machine learning and deep learning for finding the optimal parameters of a model.
Gradient Descent Algorithm:
1. Initialize parameters θ
2. Repeat until convergence:
θ = θ - α * ∇J(θ)
where:
α = learning rate
∇J(θ) = gradient of the cost function with respect to θ
Simple explanation: Imagine you’re in a hilly area trying to find the lowest point. You look around and take a step in the direction where the ground slopes down the most. You keep doing this, always stepping in the direction of the steepest descent, until you reach a point where any step would take you uphill. This process of repeatedly stepping downhill until you can’t go any lower is essentially how Gradient Descent works to find the minimum of a function.
import numpy as np
import matplotlib.pyplot as plt
def gradient_descent(start, gradient, learn_rate, n_iter=50, tolerance=1e-06):
vector = start
for _ in range(n_iter):
diff = -learn_rate * gradient(vector)
if np.all(np.abs(diff) <= tolerance):
break
vector += diff
return vector
# Example: Finding the minimum of f(x) = x^2 + 5
# The gradient of f(x) is 2x
gradient = lambda x: 2 * x
start = 10.0
learn_rate = 0.1
result = gradient_descent(start, gradient, learn_rate)
print(f"Minimum found at: {result}")
# Plotting
x = np.linspace(-15, 15, 100)
y = x**2 + 5
plt.plot(x, y)
plt.plot(result, result**2 + 5, 'ro')
plt.title('Gradient Descent Example')
plt.show()
19. Backpropagation
Backpropagation is a widely used algorithm for training feedforward neural networks. It efficiently computes the gradient of the loss function with respect to the weights of the network for a single input-output example, which is then used in stochastic gradient descent.
Backpropagation Algorithm:
1. Forward pass: Compute the output of the network for an input
2. Compute the error/loss at the output
3. Backward pass:
a. Compute the gradient of the error with respect to output layer weights
b. Propagate the gradients back through the network layers
c. Update all weights using the computed gradients
Simple explanation: Imagine you’re baking a cake that doesn’t taste quite right. You adjust your recipe slightly and bake again. If it tastes better, you keep those changes; if it’s worse, you try a different adjustment. Backpropagation is similar: the neural network makes a prediction, checks how far off it is, then adjusts its internal ‘recipe’ (weights) to do better next time. It does this over and over, gradually improving its predictions.
import numpy as np
def sigmoid(x):
return 1 / (1 + np.exp(-x))
def sigmoid_derivative(x):
return x * (1 - x)
class NeuralNetwork:
def __init__(self, x, y):
self.input = x
self.weights1 = np.random.rand(self.input.shape[1], 4)
self.weights2 = np.random.rand(4, 1)
self.y = y
self.output = np.zeros(self.y.shape)
def feedforward(self):
self.layer1 = sigmoid(np.dot(self.input, self.weights1))
self.output = sigmoid(np.dot(self.layer1, self.weights2))
def backprop(self):
d_weights2 = np.dot(self.layer1.T, (2*(self.y - self.output) * sigmoid_derivative(self.output)))
d_weights1 = np.dot(self.input.T, (np.dot(2*(self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T) * sigmoid_derivative(self.layer1)))
self.weights1 += d_weights1
self.weights2 += d_weights2
# Example usage
X = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
y = np.array([[0],[1],[1],[0]])
nn = NeuralNetwork(X, y)
for i in range(1500):
nn.feedforward()
nn.backprop()
print(nn.output)
20. Convolutional Neural Networks (CNNs)
Convolutional Neural Networks are a class of deep neural networks most commonly applied to analyze visual imagery. They use a variation of multilayer perceptrons designed to require minimal preprocessing. CNNs are widely used in image and video recognition, image classification, medical image analysis, and natural language processing.
CNN Architecture:
1. Convolutional Layer: Apply filters to detect features
2. ReLU Layer: Apply activation function
3. Pooling Layer: Reduce spatial dimensions
4. Fully Connected Layer: Compute class scores
Simple explanation: Imagine you’re trying to recognize objects in a picture, but instead of looking at the whole image at once, you slide a small viewfinder across the image, noticing important features at each position (like edges or colors). Then you combine these features to understand what’s in the image. This is similar to how a CNN processes images: it uses layers of small filters to detect features, combines them, and eventually recognizes the content of the image.
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
# Load and prepare the CIFAR10 dataset
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0
# Create the CNN model
model = models.Sequential([
layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
layers.MaxPooling2D((2, 2)),
layers.Conv2D(64, (3, 3), activation='relu'),
layers.MaxPooling2D((2, 2)),
layers.Conv2D(64, (3, 3), activation='relu'),
layers.Flatten(),
layers.Dense(64, activation='relu'),
layers.Dense(10)
])
# Compile and train the model
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
validation_data=(test_images, test_labels))
# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print(f"\nTest accuracy: {test_acc:.2f}")
21. Recurrent Neural Networks (RNNs)
Recurrent Neural Networks are a class of neural networks where connections between nodes form a directed graph along a temporal sequence. This allows RNNs to exhibit temporal dynamic behavior, making them suitable for tasks such as speech recognition, language modeling, translation, and image captioning.
RNN Architecture:
For each time step t:
1. Input: x(t)
2. Hidden state: h(t) = f(U * x(t) + W * h(t-1))
3. Output: y(t) = g(V * h(t))
where:
f, g are activation functions
U, V, W are weight matrices
Simple explanation: Imagine you’re reading a book, and with each word, you’re not just understanding its meaning in isolation, but in the context of all the words you’ve read before it. RNNs work similarly with sequential data: they process each piece of data (like a word in a sentence) while keeping in mind what they’ve seen before, giving the network a form of memory.
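Here is a minimal sketch of the forward pass described by the equations above, using NumPy. The layer sizes, the tanh activation for f, and the identity function for g are illustrative assumptions for this example rather than fixed choices:
import numpy as np
# Minimal vanilla RNN forward pass; sizes and activations are illustrative
np.random.seed(0)
input_size, hidden_size, output_size = 3, 4, 2
U = np.random.randn(hidden_size, input_size) * 0.1   # input-to-hidden weights
W = np.random.randn(hidden_size, hidden_size) * 0.1  # hidden-to-hidden weights
V = np.random.randn(output_size, hidden_size) * 0.1  # hidden-to-output weights
def rnn_forward(inputs):
    h = np.zeros(hidden_size)
    outputs = []
    for x in inputs:
        h = np.tanh(U @ x + W @ h)  # h(t) = f(U * x(t) + W * h(t-1))
        y = V @ h                   # y(t) = g(V * h(t)), with g as the identity here
        outputs.append(y)
    return outputs, h
# Example usage: a sequence of 5 random 3-dimensional inputs
sequence = [np.random.randn(input_size) for _ in range(5)]
outputs, final_hidden = rnn_forward(sequence)
print(f"Output at final time step: {outputs[-1]}")
In practice, frameworks such as TensorFlow and PyTorch provide RNN, LSTM, and GRU layers that add biases, gating, and training via backpropagation through time.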
22. Principal Component Analysis (PCA)
Principal Component Analysis is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. It’s commonly used for dimensionality reduction in data preprocessing for machine learning and for exploratory data analysis.
PCA Algorithm:
1. Standardize the dataset
2. Calculate the covariance matrix
3. Calculate eigenvectors and eigenvalues of the covariance matrix
4. Sort eigenvalues in descending order
5. Choose top k eigenvectors
6. Transform the original dataset
Simple explanation: Imagine you have a big, messy closet and you want to organize it efficiently. PCA is like finding the most important directions to arrange your items. If most of your clothes vary in size, the first ‘direction’ might be size. The second might be color, and so on. By focusing on these key ‘directions’, you can organize your closet (or data) in a way that captures the most important variations with the least amount of effort.
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Create a PCA object and fit the data
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
# Plot the results
plt.figure(figsize=(8, 6))
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y, cmap='viridis')
plt.xlabel('First Principal Component')
plt.ylabel('Second Principal Component')
plt.title('PCA of Iris Dataset')
plt.colorbar(label='Species')
plt.show()
print(f"Explained variance ratio: {pca.explained_variance_ratio_}")
23. Logistic Regression
Logistic Regression is a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. The outcome is measured with a dichotomous variable (in which there are only two possible outcomes). It’s used extensively in various fields, including machine learning, most medical fields, and social sciences.
Logistic Function: σ(z) = 1 / (1 + e^(-z))
Predicted Probability: P(Y=1) = σ(β0 + β1X1 + β2X2 + ... + βnXn)
Log-odds: ln(P(Y=1) / (1 - P(Y=1))) = β0 + β1X1 + β2X2 + ... + βnXn
Simple explanation: Imagine you’re trying to predict whether it will rain tomorrow based on today’s temperature and humidity. Logistic Regression is like drawing a line (or curve) that best separates the “rainy days” from the “non-rainy days” in your data. Once you have this line, you can use it to predict the likelihood of rain for any new combination of temperature and humidity.
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the breast cancer dataset
cancer = load_breast_cancer()
X = cancer.data
y = cancer.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create and train the Logistic Regression model
model = LogisticRegression(max_iter=10000)  # raise max_iter so the solver converges on this dataset
model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = model.predict(X_test)
# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
24. Linear Regression
Linear Regression is a linear approach to modelling the relationship between a scalar response and one or more explanatory variables. It’s used for predicting a continuous outcome variable (Y) based on one or more input predictor variables (X). The goal is to find the best-fitting straight line through the points.
Simple Linear Regression Equation:
Y = β0 + β1X + ε
where:
Y is the dependent variable
X is the independent variable
β0 is the y-intercept
β1 is the slope
ε is the error term
Simple explanation: Imagine you’re trying to guess how much an ice cream will cost based on its size. You plot all the ice creams you’ve bought on a graph, with size on one axis and price on the other. Linear Regression is like drawing the best straight line through all these points. This line then becomes your “prediction machine” – you can use it to estimate the price of any size ice cream.
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
# Load the Diabetes dataset (load_boston was removed from recent scikit-learn releases)
diabetes = load_diabetes()
X = diabetes.data
y = diabetes.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create and train the Linear Regression model
model = LinearRegression()
model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = model.predict(X_test)
# Calculate the mean squared error and R-squared score
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean squared error: {mse:.2f}")
print(f"R-squared score: {r2:.2f}")
25. K-Nearest Neighbors (KNN)
K-Nearest Neighbors is a simple, instance-based learning algorithm used for classification and regression. In KNN classification, an object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors. In KNN regression, the output is the average of the values of its k nearest neighbors.
KNN Algorithm:
1. Choose the number of neighbors, K
2. Calculate the distance between the query instance and all training samples
3. Sort the distances and determine the K nearest neighbors
4. For classification: perform majority voting
For regression: calculate the average of the K neighbors
5. Return the predicted class or value
Simple explanation: Imagine you’re new in a city and trying to guess if you’ll like a restaurant. You ask the 5 people nearest to you if they like it. If 3 or more say yes, you predict you’ll like it too. This is how KNN works – it looks at the ‘K’ nearest data points (where K is a number you choose) and makes a decision based on what category the majority of those neighbors belong to.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create and train the KNN model
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = model.predict(X_test)
# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
26. Apriori Algorithm
The Apriori algorithm is used for mining frequent itemsets and generating association rules from transactional databases. It’s commonly used in market basket analysis to discover relationships between products that people buy together frequently. The algorithm uses the principle that if an itemset is frequent, then all of its subsets must also be frequent.
Apriori Algorithm:
1. Set a minimum support threshold
2. Generate frequent itemsets:
a. Find all frequent 1-itemsets
b. Generate candidate k-itemsets from frequent (k-1)-itemsets
c. Prune candidates that have infrequent subsets
d. Count the support of each candidate in the database
e. Eliminate candidates below the minimum support threshold
f. Repeat b-e until no more frequent itemsets are found
3. Generate association rules from frequent itemsets
Simple explanation: Imagine you’re a supermarket manager trying to understand what products customers often buy together. The Apriori algorithm is like going through all the receipts and noticing patterns. If you see that bread and butter are often bought together, you might then look for a third item that’s often bought with both bread and butter. The algorithm keeps building up these combinations, always focusing on the most common ones to keep the task manageable.
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules
import pandas as pd
# Create a sample dataset
data = {
'TID': [1, 2, 3, 4, 5],
'Items': [
'Bread, Milk',
'Bread, Diaper, Beer, Eggs',
'Milk, Diaper, Beer, Coke',
'Bread, Milk, Diaper, Beer',
'Bread, Milk, Diaper, Coke'
]
}
df = pd.DataFrame(data)
# Convert the transaction strings into a one-hot encoded DataFrame
one_hot = df['Items'].str.get_dummies(', ').astype(bool)
# Apply the Apriori algorithm
frequent_itemsets = apriori(one_hot, min_support=0.6, use_colnames=True)
# Generate association rules
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)
print("Frequent Itemsets:")
print(frequent_itemsets)
print("\nAssociation Rules:")
print(rules)
27. Floyd-Warshall Algorithm
The Floyd-Warshall algorithm is used to find the shortest paths between all pairs of vertices in a weighted graph. It works with both positive and negative edge weights, but not with negative cycles. The algorithm is an example of dynamic programming.
Floyd-Warshall Algorithm:
1. Initialize the distance matrix with direct edge weights
2. For each intermediate vertex k:
For each pair of vertices (i, j):
If dist[i][j] > dist[i][k] + dist[k][j]:
dist[i][j] = dist[i][k] + dist[k][j]
3. The resulting matrix contains the shortest distances between all pairs of vertices
Simple explanation: Imagine you’re planning a road trip and want to know the shortest route between every pair of cities on your map. The Floyd-Warshall algorithm is like considering each city as a possible stopover and checking if going through that city makes any route shorter. You do this for every city, and by the end, you know the shortest path between all pairs of cities.
import numpy as np
def floyd_warshall(graph):
V = len(graph)
dist = np.array(graph)
for k in range(V):
for i in range(V):
for j in range(V):
dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
return dist
# Example usage
INF = float('inf')
graph = [
[0, 5, INF, 10],
[INF, 0, 3, INF],
[INF, INF, 0, 1],
[INF, INF, INF, 0]
]
result = floyd_warshall(graph)
print("Shortest distances between every pair of vertices:")
print(result)
28. Bellman-Ford Algorithm
The Bellman-Ford algorithm computes shortest paths from a single source vertex to all other vertices in a weighted digraph. It’s slower than Dijkstra’s algorithm for the same problem, but more versatile, as it can handle graphs with negative edge weights.
Bellman-Ford Algorithm:
1. Initialize distances from source to all vertices as infinite and distance to source as 0
2. Repeat V-1 times (where V is the number of vertices):
For each edge (u, v) with weight w:
If dist[u] + w < dist[v]:
dist[v] = dist[u] + w
3. Check for negative-weight cycles:
For each edge (u, v) with weight w:
If dist[u] + w < dist[v]:
Graph contains a negative-weight cycle
Simple explanation: Imagine you’re trying to find the cheapest way to fly from one city to any other city, but some airlines are offering promotional negative fares (they pay you to fly). Bellman-Ford is like checking every possible route up to the maximum number of stops, always updating if you find a cheaper way. It can handle those tricky negative fares, but it also checks if there’s any way to keep flying in circles and make infinite money (which would be a problem in real life!).
def bellman_ford(graph, source):
dist = {node: float('inf') for node in graph}
dist[source] = 0
for _ in range(len(graph) - 1):
for node in graph:
for neighbour, weight in graph[node].items():
if dist[node] + weight < dist[neighbour]:
dist[neighbour] = dist[node] + weight
# Check for negative-weight cycles
for node in graph:
for neighbour, weight in graph[node].items():
if dist[node] + weight < dist[neighbour]:
print("Graph contains a negative-weight cycle")
return None
return dist
# Example usage
graph = {
'A': {'B': -1, 'C': 4},
'B': {'C': 3, 'D': 2, 'E': 2},
'C': {},
'D': {'B': 1, 'C': 5},
'E': {'D': -3}
}
result = bellman_ford(graph, 'A')
if result:
print("Shortest distances from source 'A':")
for node, distance in result.items():
print(f"{node}: {distance}")
29. Huffman Coding
Huffman Coding is a data compression algorithm that assigns variable-length codes to characters based on their frequencies. Characters that appear more frequently get shorter codes, while less frequent characters get longer codes. This technique is used in various compression methods, including JPEG and MP3.
Huffman Coding Algorithm:
1. Count the frequency of each character in the input
2. Create a leaf node for each character and add it to a priority queue
3. While there is more than one node in the queue:
a. Remove the two nodes with the lowest frequency
b. Create a new internal node with these two nodes as children
c. Set the frequency of this new node as the sum of the two children's frequencies
d. Add the new node back into the queue
4. The remaining node is the root of the Huffman tree
5. Traverse the tree to assign codes (left child: 0, right child: 1)
Simple explanation: Imagine you’re creating a new alphabet for a telegram service where you pay per letter. You’d want to use shorter codes for common letters (like ‘e’ or ‘t’) and longer codes for rare letters (like ‘z’ or ‘q’). Huffman Coding does exactly this for data compression: it creates a special “alphabet” where frequently occurring data gets shorter codes, allowing the overall message to be shorter.
import heapq
from collections import Counter
def huffman_codes(data):
    # Each heap entry is [frequency, [symbol, code], ...]; repeatedly merging the two
    # lowest-frequency entries builds the Huffman tree implicitly
    heap = [[freq, [char, ""]] for char, freq in Counter(data).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return {char: code for char, code in heap[0][1:]}
# Example usage
print(f"Huffman codes: {huffman_codes('huffman coding example')}")
30. LZW Compression
LZW (Lempel-Ziv-Welch) is a universal lossless data compression algorithm. It’s widely used in GIF image compression and in Unix file compression utility. LZW compression works by building a dictionary of strings encountered in the data and replacing longer strings with shorter codes.
LZW Compression Algorithm:
1. Initialize dictionary with single-character strings
2. Read input string w
3. While there are more input symbols:
a. Read next input symbol k
b. If wk exists in the dictionary:
w = wk
c. Else:
Output the code for w
Add wk to the dictionary
w = k
4. Output the code for w
Simple explanation: Imagine you’re writing a story and you notice you repeat certain phrases often. Instead of writing out “once upon a time” every time, you decide to use a shorthand like “OuaT”. As you write, you keep adding new shorthands for phrases you use often. This is similar to how LZW works – it builds a ‘dictionary’ of repeated patterns in the data and replaces them with shorter codes.
def lzw_compress(data):
dictionary = {chr(i): i for i in range(256)}
result = []
w = ""
code = 256
for c in data:
wc = w + c
if wc in dictionary:
w = wc
else:
result.append(dictionary[w])
dictionary[wc] = code
code += 1
w = c
if w:
result.append(dictionary[w])
return result
# Example usage
data = "TOBEORNOTTOBEORTOBEORNOT"
compressed = lzw_compress(data)
print(f"Original data: {data}")
print(f"Compressed data: {compressed}")
31. Dynamic Programming
Dynamic Programming is a method for solving complex problems by breaking them down into simpler subproblems. It is applicable to problems exhibiting the properties of overlapping subproblems and optimal substructure. Dynamic Programming is used in optimization problems, graph algorithms, and many other areas of computer science.
General Dynamic Programming Approach:
1. Define the subproblem
2. Write down the recurrence relation that relates subproblems
3. Solve the base cases
4. Decide if you want to implement it top-down (memoization) or bottom-up (tabulation)
5. Implement the solution
Simple explanation: Imagine you’re climbing stairs and you can take either 1 or 2 steps at a time. To figure out how many ways you can climb n stairs, you realize it’s the sum of ways to climb (n-1) and (n-2) stairs. Instead of recalculating for each step, you store the results for each subproblem (number of stairs) and build up to your solution. This is Dynamic Programming – solving complex problems by breaking them down and reusing solutions to smaller parts.
def fibonacci_dp(n):
if n <= 1:
return n
fib = [0] * (n + 1)
fib[1] = 1
for i in range(2, n + 1):
fib[i] = fib[i - 1] + fib[i - 2]
return fib[n]
# Example usage
n = 10
result = fibonacci_dp(n)
print(f"The {n}th Fibonacci number is: {result}")
32. Monte Carlo Method
The Monte Carlo method is a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle. It’s used in physical sciences, engineering, finance, and many other fields.
General Monte Carlo Algorithm:
1. Define a domain of possible inputs
2. Generate inputs randomly from the domain
3. Perform a deterministic computation using the inputs
4. Aggregate the results of the individual computations
Simple explanation: Imagine you want to calculate the area of an irregularly shaped lake. You could overlay a square grid on it and count the squares, but that’s tedious. Instead, you decide to randomly throw pebbles over the entire square and count what fraction lands in the lake. If you throw enough pebbles, the fraction in the lake will approximate the lake’s area relative to the square. This is how Monte Carlo works – using random sampling to estimate results for problems that are hard to solve directly.
import random
import math
def estimate_pi(num_points):
inside_circle = 0
for _ in range(num_points):
x = random.uniform(-1, 1)
y = random.uniform(-1, 1)
if x*x + y*y <= 1:
inside_circle += 1
pi_estimate = 4 * inside_circle / num_points
return pi_estimate
# Example usage
num_points = 1000000
estimated_pi = estimate_pi(num_points)
print(f"Estimated Pi: {estimated_pi}")
print(f"Actual Pi: {math.pi}")
print(f"Error: {abs(estimated_pi - math.pi)}")
33. Expectation-Maximization Algorithm
The Expectation-Maximization (EM) algorithm is an iterative method to find maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables. It’s widely used in machine learning for clustering, density estimation, and handling missing data.
EM Algorithm:
1. Initialize the parameters
2. Expectation step (E-step):
Calculate the expected value of the log likelihood function, with respect to the conditional distribution of Z given X under the current estimate of the parameters θ(t)
3. Maximization step (M-step):
Find the parameter that maximizes this expected log likelihood
4. Repeat steps 2 and 3 until convergence
Simple explanation: Imagine you’re a teacher trying to grade a multiple-choice test, but the students’ names have been erased from the papers. You start by guessing which paper belongs to which student based on handwriting. Then, you use these guesses to estimate each student’s average score. You then use these averages to make better guesses about which paper belongs to which student. You keep alternating between guessing paper ownership and estimating scores until your guesses stop changing. This back-and-forth process is similar to how the EM algorithm works.
import numpy as np
from sklearn.mixture import GaussianMixture
import matplotlib.pyplot as plt
# Generate some sample data
np.random.seed(42)
n_samples = 300
# Create two clusters
X = np.concatenate([np.random.normal(0, 1, int(0.3 * n_samples)),
np.random.normal(5, 1, int(0.7 * n_samples))]).reshape(-1, 1)
# Fit a Gaussian Mixture Model
gmm = GaussianMixture(n_components=2, random_state=42)
gmm.fit(X)
# Generate the x-axis values
x = np.linspace(-5, 10, 1000).reshape(-1, 1)
# Get the log-probabilities
logprob = gmm.score_samples(x)
responsibilities = gmm.predict_proba(x)
# Plot the results
plt.hist(X, bins=50, density=True, alpha=0.5)
plt.plot(x, np.exp(logprob), '-k')
plt.plot(x, responsibilities[:, 0] * np.exp(logprob), '--r')
plt.plot(x, responsibilities[:, 1] * np.exp(logprob), '--g')
plt.title('Gaussian Mixture Model Fit')
plt.xlabel('Data')
plt.ylabel('Density')
plt.show()
print("Means:", gmm.means_)
print("Covariances:", gmm.covariances_)
34. Markov Chains
A Markov Chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Markov Chains are used in statistical modeling of random processes, speech recognition, bioinformatics, and many other fields.
Markov Chain Properties:
1. State space: Set of all possible states
2. Transition probabilities: Probability of moving from one state to another
3. Initial state distribution
4. Memoryless property: P(X_{n+1} = x | X_n, X_{n-1}, ..., X_1) = P(X_{n+1} = x | X_n)
Simple explanation: Imagine you’re predicting the weather. In a very simple model, you might say that if it’s sunny today, there’s a 70% chance it’ll be sunny tomorrow and a 30% chance it’ll rain. If it rains today, there’s a 60% chance it’ll rain tomorrow and a 40% chance it’ll be sunny. This is a basic Markov Chain – the weather tomorrow only depends on the weather today, not on what happened in the past.
import numpy as np
import matplotlib.pyplot as plt
# Define the transition matrix
# Rows: current state, Columns: next state
# States: 0 - Sunny, 1 - Rainy
P = np.array([[0.7, 0.3],
[0.4, 0.6]])
# Initial state (Sunny)
state = 0
# Number of days to simulate
n_days = 100
# Simulate the weather
weather = []
for _ in range(n_days):
weather.append(state)
state = np.random.choice([0, 1], p=P[state])
# Plot the results
plt.figure(figsize=(12, 6))
plt.plot(weather, 'b-', label='Weather')
plt.title('Markov Chain Weather Simulation')
plt.xlabel('Day')
plt.ylabel('Weather (0: Sunny, 1: Rainy)')
plt.yticks([0, 1], ['Sunny', 'Rainy'])
plt.legend()
plt.show()
print(f"Proportion of Sunny days: {weather.count(0)/n_days:.2f}")
print(f"Proportion of Rainy days: {weather.count(1)/n_days:.2f}")
35. Boosting Algorithms
Boosting is an ensemble learning method that combines multiple weak learners to create a strong learner. The idea is to train weak learners sequentially, with each new learner focusing on the mistakes of the previous ones. Common boosting algorithms include AdaBoost, Gradient Boosting, and XGBoost.
General Boosting Algorithm:
1. Initialize the model with a simple prediction
2. For t = 1 to T (number of iterations):
a. Fit a weak learner to the residuals of the current model
b. Add this weak learner to the ensemble with some weight
c. Update the residuals
3. Final model is the sum of all weak learners
Simple explanation: Imagine you’re trying to solve a complex puzzle. You start by making a rough guess, then you focus on the parts you got wrong and try to improve them. You keep doing this, each time concentrating on the areas where you’re still making mistakes. Boosting algorithms work similarly – they start with a simple model, then keep adding new models that focus on correcting the errors of the previous ones.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Create a random dataset
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train the model
model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
36. Q-Learning
Q-Learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment and can handle problems with stochastic transitions and rewards without requiring adaptations.
Q-Learning Update Rule:
Q(s,a) ← Q(s,a) + α[r + γ max(Q(s',a')) - Q(s,a)]
Where:
s, a are the current state and action
s', a' are the next state and action
r is the reward
α is the learning rate
γ is the discount factor
Simple explanation: Imagine you’re playing a video game where you need to find the best path through a maze. At first, you try random moves. Each time you make a move, you learn how good it was based on whether you got closer to the exit or found treasure. Over time, you build up a “map” of which moves are best in each part of the maze. This is similar to how Q-Learning works – it learns the best actions to take in each situation through trial and error.
import numpy as np
import random
# Define the environment
env = np.array([
[0, 1, 0, 1, 0],
[0, 0, 0, 0, 0],
[0, 1, 1, 1, 0],
[0, 0, 0, 1, 0],
[1, 1, 0, 0, 0]
])
# Q-table
Q = np.zeros((5, 5, 4)) # 5x5 grid, 4 actions (up, right, down, left)
# Hyperparameters
alpha = 0.1
gamma = 0.99
epsilon = 0.1
episodes = 1000
# Q-learning algorithm
for _ in range(episodes):
state = (0, 0)
while state != (4, 4):
if random.uniform(0, 1) < epsilon:
action = random.randint(0, 3)
else:
action = np.argmax(Q[state[0], state[1]])
if action == 0: # up
next_state = (max(0, state[0] - 1), state[1])
elif action == 1: # right
next_state = (state[0], min(4, state[1] + 1))
elif action == 2: # down
next_state = (min(4, state[0] + 1), state[1])
else: # left
next_state = (state[0], max(0, state[1] - 1))
reward = -1
if next_state == (4, 4):
reward = 100
elif env[next_state] == 1:
reward = -100
next_state = state
Q[state[0], state[1], action] = Q[state[0], state[1], action] + \
alpha * (reward + gamma * np.max(Q[next_state]) - Q[state[0], state[1], action])
state = next_state
print("Q-table:")
print(Q)
37. Alpha-Beta Pruning
Alpha-Beta pruning is an optimization technique for the minimax algorithm, used in two-player games. It reduces the number of nodes evaluated in the search tree by eliminating branches that don’t need to be explored.
Alpha-Beta Pruning Algorithm:
function alphabeta(node, depth, α, β, maximizingPlayer)
if depth = 0 or node is a terminal node
return the heuristic value of node
if maximizingPlayer
value := −∞
for each child of node
value := max(value, alphabeta(child, depth − 1, α, β, FALSE))
α := max(α, value)
if α ≥ β
break
return value
else
value := +∞
for each child of node
value := min(value, alphabeta(child, depth − 1, α, β, TRUE))
β := min(β, value)
if β ≤ α
break
return value
Simple explanation: Imagine you’re playing chess and thinking several moves ahead. You realize that if a particular move leads to a situation where your opponent can force a win, you don’t need to consider any other consequences of that move – you can “prune” that branch of possibilities. Alpha-Beta pruning works similarly in game-playing algorithms, cutting off branches of the search tree that don’t need to be explored, making the search much faster.
import math
def alpha_beta_pruning(node, depth, alpha, beta, maximizing_player):
if depth == 0 or node.is_terminal():
return node.evaluate()
if maximizing_player:
value = -math.inf
for child in node.get_children():
value = max(value, alpha_beta_pruning(child, depth - 1, alpha, beta, False))
alpha = max(alpha, value)
if alpha >= beta:
break # Beta cut-off
return value
else:
value = math.inf
for child in node.get_children():
value = min(value, alpha_beta_pruning(child, depth - 1, alpha, beta, True))
beta = min(beta, value)
if beta <= alpha:
break # Alpha cut-off
return value
# Example usage (assuming a Node class is defined)
class Node:
def __init__(self, value):
self.value = value
self.children = []
def is_terminal(self):
return len(self.children) == 0
def evaluate(self):
return self.value
def get_children(self):
return self.children
# Create a simple game tree
root = Node(0)
root.children = [Node(3), Node(5), Node(6)]
root.children[0].children = [Node(2), Node(4)]
root.children[1].children = [Node(1), Node(7)]
root.children[2].children = [Node(8), Node(9)]
result = alpha_beta_pruning(root, 2, -math.inf, math.inf, True)
print(f"Best possible score: {result}")
38. Minimax Algorithm
The Minimax algorithm is a decision rule used in artificial intelligence, decision theory, game theory, and other fields. It’s primarily used in two-player turn-based games to find the optimal move for a player, assuming that the opponent also plays optimally.
Minimax Algorithm:
function minimax(node, depth, maximizingPlayer)
if depth = 0 or node is a terminal node
return the heuristic value of node
if maximizingPlayer
value := −∞
for each child of node
value := max(value, minimax(child, depth − 1, FALSE))
return value
else
value := +∞
for each child of node
value := min(value, minimax(child, depth − 1, TRUE))
return value
Simple explanation: Imagine you’re playing a game of tic-tac-toe and thinking several moves ahead. You consider your best move, then your opponent’s best response, then your best response to that, and so on. You assume your opponent will always make the move that’s worst for you. By thinking through all possibilities this way, you can determine your best move. This is how the Minimax algorithm works – it alternates between finding the best move for you and the best move for your opponent to determine the optimal play.
import math
def minimax(node, depth, maximizing_player):
if depth == 0 or node.is_terminal():
return node.evaluate()
if maximizing_player:
value = -math.inf
for child in node.get_children():
value = max(value, minimax(child, depth - 1, False))
return value
else:
value = math.inf
for child in node.get_children():
value = min(value, minimax(child, depth - 1, True))
return value
# Example usage (assuming a Node class is defined)
class Node:
def __init__(self, value):
self.value = value
self.children = []
def is_terminal(self):
return len(self.children) == 0
def evaluate(self):
return self.value
def get_children(self):
return self.children
# Create a simple game tree
root = Node(0)
root.children = [Node(3), Node(5), Node(6)]
root.children[0].children = [Node(2), Node(4)]
root.children[1].children = [Node(1), Node(7)]
root.children[2].children = [Node(8), Node(9)]
result = minimax(root, 2, True)
print(f"Best possible score: {result}")
39. t-Distributed Stochastic Neighbor Embedding (t-SNE)
t-SNE is a machine learning algorithm for visualization developed by Laurens van der Maaten and Geoffrey Hinton. It is a nonlinear dimensionality reduction technique well-suited for embedding high-dimensional data for visualization in a low-dimensional space of two or three dimensions.
t-SNE Algorithm:
1. Compute pairwise similarities in high-dimensional space
2. Initialize low-dimensional embeddings
3. For a number of iterations:
a. Compute pairwise similarities in low-dimensional space
b. Compute gradient of the cost function
c. Update low-dimensional embeddings
Simple explanation: Imagine you’re trying to create a map of a city based only on the distances between buildings. t-SNE is like arranging little models of the buildings on a table, constantly adjusting their positions so that the distances between models on the table match the real distances as closely as possible. It’s particularly good at preserving local structures, so buildings that are close in reality end up close on your model table, even if the overall layout might be different.
from sklearn.manifold import TSNE
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
# Load the digits dataset
digits = load_digits()
X, y = digits.data, digits.target
# Apply t-SNE
tsne = TSNE(n_components=2, random_state=42)
X_tsne = tsne.fit_transform(X)
# Plot the results
plt.figure(figsize=(10, 8))
scatter = plt.scatter(X_tsne[:, 0], X_tsne[:, 1], c=y, cmap='viridis')
plt.colorbar(scatter)
plt.title('t-SNE visualization of the digits dataset')
plt.show()
40. Latent Dirichlet Allocation (LDA)
Latent Dirichlet Allocation is a generative statistical model that allows sets of observations to be explained by unobserved groups. It’s commonly used for topic modeling in natural language processing, allowing us to discover abstract “topics” that occur in a collection of documents.
LDA Model:
1. For each topic k:
- Draw a distribution over words βk ~ Dirichlet(η)
2. For each document d:
- Draw a vector of topic proportions θd ~ Dirichlet(α)
- For each word w in document d:
- Draw a topic assignment zd,n ~ Multinomial(θd)
- Draw a word wd,n ~ Multinomial(βzd,n)
Simple explanation: Imagine you’re trying to organize a large library of books without reading them all. LDA is like having a team of robots that skim through each book, noticing which words appear often together. They might notice one book uses “galaxy”, “star”, and “planet” a lot, so they label it as “space-related”. Another book might use “recipe”, “ingredient”, and “bake” frequently, so it’s labeled “cooking-related”. LDA does this for text data, finding common themes (topics) based on word co-occurrences.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
# Load the 20 newsgroups dataset
newsgroups = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))
# Create a document-term matrix
vectorizer = CountVectorizer(max_df=0.95, min_df=2, stop_words='english')
doc_term_matrix = vectorizer.fit_transform(newsgroups.data)
# Create and fit the LDA model
lda = LatentDirichletAllocation(n_components=10, random_state=42)
lda.fit(doc_term_matrix)
# Function to print top words for each topic
def print_top_words(model, feature_names, n_top_words):
for topic_idx, topic in enumerate(model.components_):
top_words = [feature_names[i] for i in topic.argsort()[:-n_top_words - 1:-1]]
print(f"Topic #{topic_idx}: {', '.join(top_words)}")
# Print the top 10 words for each topic
print_top_words(lda, vectorizer.get_feature_names_out(), 10)
41. Word2Vec
Word2Vec is a group of related models used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2Vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space.
Word2Vec Models:
1. Continuous Bag-of-Words (CBOW):
- Predicts target word from context words
2. Skip-gram:
- Predicts context words from target word
Training objective:
Maximize the average log probability:
(1/T) * Σ_{t=1..T} Σ_{-c ≤ j ≤ c, j ≠ 0} log p(w_{t+j} | w_t)
where T is the number of training words and c is the size of the context window
Simple explanation: Imagine you’re teaching a computer to understand language by playing a game. In one version of the game (CBOW), you show the computer a sentence with a word missing, and it has to guess the word. In another version (Skip-gram), you show it a word, and it has to guess what words might appear around it. As the computer plays this game with millions of sentences, it starts to understand how words relate to each other. Word2Vec does this, creating a “map” where similar words are close together.
from gensim.models import Word2Vec
from nltk.corpus import brown
import nltk
# Download the Brown corpus
nltk.download('brown')
# Prepare the data
sentences = brown.sents()
# Train the Word2Vec model
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)
# Find similar words
similar_words = model.wv.most_similar('man', topn=5)
print("Words most similar to 'man':")
for word, score in similar_words:
print(f"{word}: {score:.2f}")
# Perform word analogy
result = model.wv.most_similar(positive=['woman', 'king'], negative=['man'], topn=1)
print("\nWord analogy (woman - man + king):")
print(f"{result[0][0]}: {result[0][1]:.2f}")
42. TF-IDF (Term Frequency-Inverse Document Frequency)
TF-IDF is a numerical statistic intended to reflect how important a word is to a document in a collection or corpus. It’s commonly used as a weighting factor in information retrieval, text mining, and user modeling. The TF-IDF value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word.
TF-IDF calculation:
TF(t,d) = (Number of times term t appears in document d) / (Total number of terms in document d)
IDF(t) = log_e(Total number of documents / Number of documents with term t in it)
TF-IDF(t,d) = TF(t,d) * IDF(t)
Simple explanation: Imagine you’re trying to summarize a book by finding its most important words. You might think words that appear often are important (that’s the TF part). But then you realize common words like “the” or “and” aren’t actually that meaningful. So you also consider how rare the word is across all books (that’s the IDF part). Words that are frequent in this book but rare in others are probably the most important. TF-IDF does this mathematically, helping identify key words in documents.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
# Sample documents
documents = [
"The quick brown fox jumps over the lazy dog",
"Never jump over the lazy dog quickly",
"The lazy dog is always lazy",
"A quick brown dog outsmarts a quick fox"
]
# Create TF-IDF Vectorizer
vectorizer = TfidfVectorizer()
# Fit and transform the documents
tfidf_matrix = vectorizer.fit_transform(documents)
# Get feature names (words)
feature_names = vectorizer.get_feature_names_out()
# Print TF-IDF values for each document
for i, doc in enumerate(documents):
print(f"Document {i+1}:")
for j, word in enumerate(feature_names):
tfidf_value = tfidf_matrix[i, j]
if tfidf_value > 0:
print(f" {word}: {tfidf_value:.4f}")
print()
# Calculate cosine similarity between documents
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)
print("Cosine Similarity Matrix:")
print(cosine_sim)
43. Autoencoders
Autoencoders are a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning). The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal “noise”.
Autoencoder Architecture:
1. Input layer (size of input data)
2. One or more encoder layers (typically decreasing in size)
3. Bottleneck layer (compressed representation)
4. One or more decoder layers (typically increasing in size)
5. Output layer (same size as input layer)
Training objective:
Minimize reconstruction error: ||X - X'||^2
where X is the input and X' is the reconstructed output
Simple explanation: Imagine you’re trying to compress a large photo to send it via email, but in a way that you can later reconstruct it as closely as possible to the original. An autoencoder is like a machine that learns to do this compression and decompression. It first squeezes (encodes) the photo into a smaller size, then tries to expand (decode) it back to the original size. By doing this many times with many photos, it learns to compress photos in a way that preserves the most important features.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
# Load and preprocess the MNIST dataset
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
# Define the autoencoder model
encoding_dim = 32
input_img = tf.keras.Input(shape=(784,))
encoded = layers.Dense(encoding_dim, activation='relu')(input_img)
decoded = layers.Dense(784, activation='sigmoid')(encoded)
autoencoder = models.Model(input_img, decoded)
# Compile and train the model
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train,
epochs=50,
batch_size=256,
shuffle=True,
validation_data=(x_test, x_test))
# Use the autoencoder to reconstruct some test images
decoded_imgs = autoencoder.predict(x_test)
# Plot original and reconstructed images
import matplotlib.pyplot as plt
n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
# Original
ax = plt.subplot(2, n, i + 1)
plt.imshow(x_test[i].reshape(28, 28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
# Reconstruction
ax = plt.subplot(2, n, i + 1 + n)
plt.imshow(decoded_imgs[i].reshape(28, 28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()
44. GANs (Generative Adversarial Networks)
GANs are a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in 2014. In a GAN, two neural networks contest with each other in a game (in the sense of game theory, often but not always in the form of a zero-sum game). Given a training set, this technique learns to generate new data with the same statistics as the training set.
GAN Components:
1. Generator G: Generates fake samples
2. Discriminator D: Distinguishes real samples from fake ones
Training objective:
min_G max_D V(D,G) = E[log(D(x))] + E[log(1-D(G(z)))]
where:
x is data sampled from real data distribution
z is random noise
G(z) is generated fake data
Simple explanation: Imagine a game between an art forger (the generator) and an art detective (the discriminator). The forger tries to create fake paintings that look real, while the detective tries to distinguish between real and fake paintings. As they play this game, the forger gets better at creating convincing fakes, and the detective gets better at spotting even tiny flaws. GANs work similarly, with one part trying to generate convincing fake data, and another part trying to distinguish real from fake, both improving over time.
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt
# Define the generator
def make_generator_model():
model = tf.keras.Sequential([
layers.Dense(7*7*256, use_bias=False, input_shape=(100,)),
layers.BatchNormalization(),
layers.LeakyReLU(),
layers.Reshape((7, 7, 256)),
layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False),
layers.BatchNormalization(),
layers.LeakyReLU(),
layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False),
layers.BatchNormalization(),
layers.LeakyReLU(),
layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh')
])
return model
# Define the discriminator
def make_discriminator_model():
model = tf.keras.Sequential([
layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]),
layers.LeakyReLU(),
layers.Dropout(0.3),
layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'),
layers.LeakyReLU(),
layers.Dropout(0.3),
layers.Flatten(),
layers.Dense(1)
])
return model
# Define loss functions
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
def discriminator_loss(real_output, fake_output):
real_loss = cross_entropy(tf.ones_like(real_output), real_output)
fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
total_loss = real_loss + fake_loss
return total_loss
def generator_loss(fake_output):
return cross_entropy(tf.ones_like(fake_output), fake_output)
# Define optimizers
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)
# Create the models
generator = make_generator_model()
discriminator = make_discriminator_model()
# Training step
@tf.function
def train_step(images):
noise = tf.random.normal([BATCH_SIZE, 100])
with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
generated_images = generator(noise, training=True)
real_output = discriminator(images, training=True)
fake_output = discriminator(generated_images, training=True)
gen_loss = generator_loss(fake_output)
disc_loss = discriminator_loss(real_output, fake_output)
gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
# Load and preprocess the MNIST dataset
(train_images, _), (_, _) = tf.keras.datasets.mnist.load_data()
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5
BUFFER_SIZE = train_images.shape[0]  # shuffle over the whole training set
BATCH_SIZE = 256  # assumed batch size for this sketch
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
# A full run would then loop over epochs and call train_step(batch) for every batch in train_dataset
45. Levenshtein Distance
The Levenshtein distance is a string metric for measuring the difference between two sequences. It is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into another.
Levenshtein Distance Recursion:
lev(i,j) =
max(i,j) if min(i,j) = 0
min(
lev(i-1, j) + 1,
lev(i, j-1) + 1,
lev(i-1, j-1) + (s[i] ≠ t[j])
) otherwise
Simple explanation: Imagine you’re a proofreader comparing two versions of a text. The Levenshtein distance is like counting the minimum number of letter changes (adding, removing, or replacing) you’d need to make to turn one version into the other. For example, changing “kitten” to “sitting” requires 3 changes: substitute ‘s’ for ‘k’, substitute ‘i’ for ‘e’, and insert ‘g’ at the end.
def levenshtein_distance(s1, s2):
if len(s1) < len(s2):
return levenshtein_distance(s2, s1)
if len(s2) == 0:
return len(s1)
previous_row = range(len(s2) + 1)
for i, c1 in enumerate(s1):
current_row = [i + 1]
for j, c2 in enumerate(s2):
insertions = previous_row[j + 1] + 1
deletions = current_row[j] + 1
substitutions = previous_row[j] + (c1 != c2)
current_row.append(min(insertions, deletions, substitutions))
previous_row = current_row
return previous_row[-1]
# Example usage
word1 = "kitten"
word2 = "sitting"
distance = levenshtein_distance(word1, word2)
print(f"The Levenshtein distance between '{word1}' and '{word2}' is {distance}")
46. Bloom Filter
A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. It can have false positives, but false negatives are impossible. Elements can be added to the set, but not removed.
Bloom Filter Operations:
1. Initialize m bits to 0
2. Choose k independent hash functions
To add an element x:
for each hash function h_i(x):
set the bit at position h_i(x) to 1
To query for an element x:
for each hash function h_i(x):
if the bit at position h_i(x) is 0:
return "definitely not in set"
return "possibly in set"
Simple explanation: Imagine you have a huge garden and you want to keep track of which plants you’ve watered, but you can’t afford a tag for each plant. Instead, you have a grid of soil patches. When you water a plant, you mark a few patches based on the plant’s name (using hash functions). Later, to check if you’ve watered a plant, you look at those same patches. If any are unmarked, you definitely haven’t watered it. If all are marked, you probably have (but there’s a small chance you’re seeing marks from other plants).
import mmh3
from bitarray import bitarray
class BloomFilter:
def __init__(self, size, hash_count):
self.size = size
self.hash_count = hash_count
self.bit_array = bitarray(size)
self.bit_array.setall(0)
def add(self, item):
for i in range(self.hash_count):
index = mmh3.hash(item, i) % self.size
self.bit_array[index] = 1
def check(self, item):
for i in range(self.hash_count):
index = mmh3.hash(item, i) % self.size
if self.bit_array[index] == 0:
return False
return True
# Example usage
bloom = BloomFilter(20, 3)
bloom.add("apple")
bloom.add("banana")
bloom.add("orange")
print(bloom.check("apple")) # Should return True
print(bloom.check("pear")) # Should return False (probably)
print(bloom.check("banana")) # Should return True
47. Consistent Hashing
Consistent hashing is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed hash table. It allows servers and objects to scale without affecting the overall system.
Consistent Hashing Algorithm:
1. Map servers and keys to points on a ring (0 to 2^n - 1)
2. To find which server a key belongs to:
- Start at the key's position
- Move clockwise until finding the first server
3. When adding/removing servers, only keys between the new server and the next server are affected
Simple explanation: Imagine you’re organizing books on a circular bookshelf. Each section of the shelf represents a server, and books are placed based on their titles. When you need to find a book, you start at its title’s position and move clockwise until you hit a server section. This way, if you add or remove a section (server), you only need to move books in that immediate area, not reorganize the entire shelf. This is how consistent hashing efficiently distributes data across changing numbers of servers.
import hashlib
class ConsistentHash:
def __init__(self, nodes=None, virtual_nodes=100):
self.virtual_nodes = virtual_nodes
self.ring = {}
self.sorted_keys = []
if nodes:
for node in nodes:
self.add_node(node)
def add_node(self, node):
for i in range(self.virtual_nodes):
key = self.hash(f"{node}:{i}")
self.ring[key] = node
self.sorted_keys.append(key)
self.sorted_keys.sort()
def remove_node(self, node):
for i in range(self.virtual_nodes):
key = self.hash(f"{node}:{i}")
del self.ring[key]
self.sorted_keys.remove(key)
def get_node(self, key):
if not self.ring:
return None
hash_key = self.hash(key)
for node_key in self.sorted_keys:
if node_key >= hash_key:
return self.ring[node_key]
return self.ring[self.sorted_keys[0]]
def hash(self, key):
return hashlib.md5(key.encode('utf-8')).hexdigest()
# Example usage
ch = ConsistentHash(["server1", "server2", "server3"])
print(ch.get_node("object1"))
print(ch.get_node("object2"))
ch.add_node("server4")
print(ch.get_node("object1")) # Might or might not change
print(ch.get_node("object2")) # Might or might not change
ch.remove_node("server1")
print(ch.get_node("object1")) # Might or might not change
print(ch.get_node("object2")) # Might or might not change
48. MapReduce
MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster. It consists of a Map() procedure that performs filtering and sorting, and a Reduce() procedure that performs a summary operation.
MapReduce Algorithm:
1. Map phase:
- Input data is divided into splits
- Map function processes each split, emitting key-value pairs
2. Shuffle and Sort phase:
- Key-value pairs are grouped by key
3. Reduce phase:
- Reduce function processes each key group
- Outputs final result
Simple explanation: Imagine you’re counting votes in a large election. Instead of one person counting all votes, you divide the ballot boxes among many people (Map step). Each person counts votes for each candidate in their box. Then, you have people collect all counts for each candidate (Shuffle step) and sum them up (Reduce step). This divide-and-conquer approach is how MapReduce works, allowing it to process vast amounts of data across many computers efficiently.
from collections import defaultdict
def map_function(document):
words = document.split()
return [(word, 1) for word in words]
def reduce_function(word, counts):
return (word, sum(counts))
def mapreduce(documents):
# Map phase
mapped_values = []
for doc in documents:
mapped_values.extend(map_function(doc))
# Shuffle phase
shuffled_values = defaultdict(list)
for word, count in mapped_values:
shuffled_values[word].append(count)
# Reduce phase
reduced_values = []
for word, counts in shuffled_values.items():
reduced_values.append(reduce_function(word, counts))
return reduced_values
# Example usage
documents = [
"hello world",
"hello mapreduce",
"mapreduce is awesome",
"hello awesome world"
]
result = mapreduce(documents)
print("Word counts:")
for word, count in sorted(result, key=lambda x: x[1], reverse=True):
print(f"{word}: {count}")
49. Simpson’s Paradox
Simpson’s Paradox is a phenomenon in probability and statistics, in which a trend appears in several different groups of data but disappears or reverses when these groups are combined. This result is often encountered in social-science and medical-science statistics and is particularly confounding when frequency data is unduly given causal interpretations.
Simpson's Paradox occurs when:
P(A|B) > P(A|not B) for each subgroup
but
P(A|B) < P(A|not B) for the entire population
Where A and B are events, and P(A|B) is the probability of A given B
Simple explanation: Imagine two hospitals: a small one that mostly takes easy cases and a large one that mostly takes difficult cases. The large hospital might have a higher success rate for both easy and difficult cases individually, yet still show a worse overall success rate, simply because difficult cases (which fail more often) make up most of its caseload. This is Simpson’s Paradox – combining data can reverse the trend seen in the individual groups.
import pandas as pd
import matplotlib.pyplot as plt
# Create example data, chosen so that Treatment A has the higher success rate
# inside each subgroup, yet Treatment B has the higher success rate overall
data = {
    'Treatment': ['A', 'A', 'B', 'B'],
    'Group': ['Easy', 'Hard', 'Easy', 'Hard'],
    'Success': [8, 40, 70, 4],
    'Total': [10, 90, 90, 10]
}
df = pd.DataFrame(data)
df['Rate'] = df['Success'] / df['Total']
# Calculate overall rates
overall_a = df[df['Treatment'] == 'A']['Success'].sum() / df[df['Treatment'] == 'A']['Total'].sum()
overall_b = df[df['Treatment'] == 'B']['Success'].sum() / df[df['Treatment'] == 'B']['Total'].sum()
# Plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Group-wise plot
for treatment in ['A', 'B']:
group_data = df[df['Treatment'] == treatment]
ax1.bar(group_data['Group'], group_data['Rate'], label=f'Treatment {treatment}')
ax1.set_ylim(0, 1)
ax1.set_title('Group-wise Success Rates')
ax1.set_ylabel('Success Rate')
ax1.legend()
# Overall plot
ax2.bar(['A', 'B'], [overall_a, overall_b])
ax2.set_ylim(0, 1)
ax2.set_title('Overall Success Rates')
ax2.set_ylabel('Success Rate')
plt.tight_layout()
plt.show()
print("Group-wise rates:")
print(df)
print(f"\nOverall rates:\nTreatment A: {overall_a:.2f}\nTreatment B: {overall_b:.2f}")
50. Z-Algorithm
The Z-Algorithm is a linear time string matching algorithm that finds all occurrences of a pattern in a text. It uses a Z-array, where Z[i] is the length of the longest substring starting from str[i] which is also a prefix of the string.
Z-Algorithm steps:
1. Concatenate pattern + '$' + text
2. Construct Z-array:
- If i > r, compare from str[i] onwards
- If i ≤ r:
- If Z[i-l] < r-i+1, Z[i] = Z[i-l]
- Else, compare from str[r] onwards
3. All i where Z[i] = pattern length are matches
Simple explanation: Imagine you’re a detective trying to find a specific word in a long document. Instead of checking each letter one by one, you use a clever trick. You first look at how many letters at the start of the document match your word. Then, you use this information to skip unnecessary comparisons as you move through the document. This is essentially what the Z-Algorithm does, making it very efficient for finding patterns in text.
def z_algorithm(s):
n = len(s)
z = [0] * n
l, r = 0, 0
for i in range(1, n):
if i > r:
l = r = i
while r < n and s[r-l] == s[r]:
r += 1
z[i] = r - l
r -= 1
else:
k = i - l
if z[k] < r - i + 1:
z[i] = z[k]
else:
l = i
while r < n and s[r-l] == s[r]:
r += 1
z[i] = r - l
r -= 1
return z
def pattern_matching(text, pattern):
concat = pattern + "$" + text
z = z_algorithm(concat)
result = []
for i in range(len(concat)):
if z[i] == len(pattern):
result.append(i - len(pattern) - 1)
return result
# Example usage
text = "ABABDABACDABABCABAB"
pattern = "ABABCABAB"
matches = pattern_matching(text, pattern)
print(f"Pattern found at indices: {matches}")
51. Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form (BCNF) is a normal form used in database normalization. A table is in BCNF if and only if, for every one of its dependencies X → Y, at least one of the following conditions holds:
- X → Y is a trivial functional dependency (Y ⊆ X)
- X is a superkey for the table
BCNF Conditions:
1. The table must be in 3NF
2. For every non-trivial functional dependency X → Y:
X must be a superkey
Simple explanation: Imagine you’re organizing a library. BCNF is like ensuring that each book’s location is determined solely by its unique identifier (like ISBN), not by any other attribute. If you find that you’re using the author’s name to determine the book’s location, and two books by the same author could be in different places, your system isn’t in BCNF. It helps prevent redundancy and anomalies in your data organization.
def is_superkey(table, attributes):
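    # Note: this is a deliberately simplified check that treats only the full set of
    # table attributes as a superkey; a complete test would compute attribute closures
    # from the functional dependencies.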
return set(attributes) == set(table)
def is_bcnf(table, dependencies):
for dep in dependencies:
left, right = dep
if not (set(right).issubset(set(left)) or is_superkey(table, left)):
return False
return True
# Example usage
table = ['A', 'B', 'C', 'D']
dependencies = [
(['A'], ['B']),
(['B'], ['C']),
(['A', 'D'], ['B'])
]
if is_bcnf(table, dependencies):
print("The table is in BCNF")
else:
print("The table is not in BCNF")
52. Fisher-Yates Shuffle
The Fisher-Yates shuffle, also known as the Knuth shuffle, is an algorithm for generating a random permutation of a finite sequence. It’s an unbiased way to shuffle an array, meaning every permutation is equally likely.
Fisher-Yates Algorithm:
for i from n-1 down to 1 do
j = random integer such that 0 ≤ j ≤ i
swap a[i] with a[j]
Simple explanation: Imagine you have a deck of numbered cards. To shuffle, you start from the last card, pick a random card from the deck (including the last card), and swap it with the last card. Then you move to the second-last card and repeat, each time picking from the remaining unshuffled cards. This method ensures a truly random shuffle, just like the Fisher-Yates algorithm does for data.
import random
def fisher_yates_shuffle(arr):
for i in range(len(arr)-1, 0, -1):
j = random.randint(0, i)
arr[i], arr[j] = arr[j], arr[i]
return arr
# Example usage
original = list(range(1, 11)) # [1, 2, 3, ..., 10]
shuffled = fisher_yates_shuffle(original.copy())
print(f"Original: {original}")
print(f"Shuffled: {shuffled}")
53. Trie
A Trie, also called a prefix tree, is a tree-like data structure used to store and retrieve strings. Each node in the trie represents a common prefix of some strings. This structure is particularly useful for tasks like autocomplete, spell checking, and IP routing.
Trie Operations:
Insert(word):
For each character in word:
If character doesn't exist, create new node
Move to next node
Mark last node as end of word
Search(word):
For each character in word:
If character doesn't exist, return false
Move to next node
Return true if last node is marked as end of word
Simple explanation: Imagine a dictionary where each page split represents a letter. To find a word, you’d flip to the first letter, then within that section to the second letter, and so on. A trie works similarly for storing words efficiently. Each branch represents a letter, and you follow the branches to spell out words. This makes it very fast to check if a word exists or to find all words with a certain prefix.
class TrieNode:
def __init__(self):
self.children = {}
self.is_end_of_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_end_of_word = True
def search(self, word):
node = self.root
for char in word:
if char not in node.children:
return False
node = node.children[char]
return node.is_end_of_word
def starts_with(self, prefix):
node = self.root
for char in prefix:
if char not in node.children:
return False
node = node.children[char]
return True
# Example usage
trie = Trie()
words = ["apple", "app", "apricot", "banana"]
for word in words:
trie.insert(word)
print(trie.search("apple")) # True
print(trie.search("app")) # True
print(trie.search("appl")) # False
print(trie.starts_with("app")) # True
print(trie.starts_with("ban")) # True
print(trie.starts_with("cat")) # False
54. Red-Black Tree
A self-balancing binary search tree that ensures the tree remains balanced after insertions and deletions. It maintains balance by enforcing properties related to the coloring of nodes, leading to efficient operations such as insertions, deletions, and lookups, all in O(log n) time.
Properties:
1. Every node is either red or black.
2. The root is black.
3. All leaves (NIL nodes) are black.
4. If a node is red, both its children are black.
5. Every path from a node to its NIL descendants has the same number of black nodes.
Simple explanation: Imagine the tree is a team where each player (node) wears either a red or black jersey. The coach ensures no two red jerseys are together, and the path from the starting player (root) to any bench player (leaf) has the same number of black jerseys.
class RedBlackNode:
def __init__(self, data, color='red', left=None, right=None, parent=None):
self.data = data
self.color = color
self.left = left
self.right = right
self.parent = parent
class RedBlackTree:
def __init__(self):
self.TNULL = RedBlackNode(0, color='black')
self.root = self.TNULL
def insert(self, key):
# Normal BST insert
node = RedBlackNode(key)
node.left = self.TNULL
node.right = self.TNULL
node.color = 'red'
parent = None
current = self.root
while current != self.TNULL:
parent = current
if node.data < current.data:
current = current.left
else:
current = current.right
node.parent = parent
if parent is None:
self.root = node
elif node.data < parent.data:
parent.left = node
else:
parent.right = node
if node.parent is None:
node.color = 'black'
return
if node.parent.parent is None:
return
self.fix_insert(node)
def fix_insert(self, k):
while k.parent.color == 'red':
if k.parent == k.parent.parent.right:
u = k.parent.parent.left
if u.color == 'red':
u.color = 'black'
k.parent.color = 'black'
k.parent.parent.color = 'red'
k = k.parent.parent
else:
if k == k.parent.left:
k = k.parent
self.right_rotate(k)
k.parent.color = 'black'
k.parent.parent.color = 'red'
self.left_rotate(k.parent.parent)
else:
u = k.parent.parent.right
if u.color == 'red':
u.color = 'black'
k.parent.color = 'black'
k.parent.parent.color = 'red'
k = k.parent.parent
else:
if k == k.parent.right:
k = k.parent
self.left_rotate(k)
k.parent.color = 'black'
k.parent.parent.color = 'red'
self.right_rotate(k.parent.parent)
if k == self.root:
break
self.root.color = 'black'
def left_rotate(self, x):
y = x.right
x.right = y.left
if y.left != self.TNULL:
y.left.parent = x
y.parent = x.parent
if x.parent is None:
self.root = y
elif x == x.parent.left:
x.parent.left = y
else:
x.parent.right = y
y.left = x
x.parent = y
def right_rotate(self, x):
y = x.left
x.left = y.right
if y.right != self.TNULL:
y.right.parent = x
y.parent = x.parent
if x.parent is None:
self.root = y
elif x == x.parent.right:
x.parent.right = y
else:
x.parent.left = y
y.right = x
x.parent = y
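A minimal usage sketch; the inorder helper below is not part of the class above, it is added here only to show that keys come back in sorted order together with their colors:
def inorder(tree, node):
    if node != tree.TNULL:
        inorder(tree, node.left)
        print(node.data, node.color)
        inorder(tree, node.right)
rbt = RedBlackTree()
for key in [20, 15, 25, 10, 5]:
    rbt.insert(key)
inorder(rbt, rbt.root)  # prints 5, 10, 15, 20, 25 along with the color of each node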
55. AVL Tree
Another type of self-balancing binary search tree, in which the heights of the two child subtrees of any node differ by at most one. If an insertion or deletion pushes a node’s balance factor outside {-1, 0, 1}, the tree is rebalanced with rotations.
Balance factor=height(left subtree)−height(right subtree)
Simple explanation: Picture a tree where every branch (subtree) is carefully pruned so that it never leans too much to one side. Whenever it starts to lean, a gardener (the algorithm) trims it back to keep it balanced.
class TreeNode:
def __init__(self, key):
self.left = None
self.right = None
self.val = key
self.height = 1
class AVLTree:
def insert(self, root, key):
if not root:
return TreeNode(key)
elif key < root.val:
root.left = self.insert(root.left, key)
else:
root.right = self.insert(root.right, key)
root.height = 1 + max(self.get_height(root.left), self.get_height(root.right))
balance = self.get_balance(root)
if balance > 1 and key < root.left.val:
return self.right_rotate(root)
if balance < -1 and key > root.right.val:
return self.left_rotate(root)
if balance > 1 and key > root.left.val:
root.left = self.left_rotate(root.left)
return self.right_rotate(root)
if balance < -1 and key < root.right.val:
root.right = self.right_rotate(root.right)
return self.left_rotate(root)
return root
def left_rotate(self, z):
y = z.right
T2 = y.left
y.left = z
z.right = T2
z.height = 1 + max(self.get_height(z.left), self.get_height(z.right))
y.height = 1 + max(self.get_height(y.left), self.get_height(y.right))
return y
def right_rotate(self, z):
y = z.left
T3 = y.right
y.right = z
z.left = T3
z.height = 1 + max(self.get_height(z.left), self.get_height(z.right))
y.height = 1 + max(self.get_height(y.left), self.get_height(y.right))
return y
def get_height(self, root):
if not root:
return 0
return root.height
def get_balance(self, root):
if not root:
return 0
return self.get_height(root.left) - self.get_height(root.right)
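A minimal usage sketch; the preorder helper is added here only to make the rebalanced shape visible:
def preorder(node):
    if node:
        print(node.val, end=" ")
        preorder(node.left)
        preorder(node.right)
avl = AVLTree()
root = None
for key in [10, 20, 30, 40, 50, 25]:
    root = avl.insert(root, key)
preorder(root)  # 30 20 10 25 40 50 (the tree stays balanced after every insert)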
56. Breadth-First Search (BFS)
An algorithm for traversing or searching tree or graph data structures. It starts at the tree root (or some arbitrary node in a graph) and explores all nodes at the present depth level before moving on to nodes at the next depth level. BFS uses a queue to keep track of the next location to visit.
Simple explanation: Imagine you’re at a party where you want to meet everyone. You start by talking to everyone in the room you’re in (the current level) before moving to another room (next level) to meet more people. You use a list (queue) to remember which rooms to visit next.
from collections import deque
def bfs(graph, start):
visited = set()
queue = deque([start])
while queue:
vertex = queue.popleft()
if vertex not in visited:
print(vertex, end=" ")
visited.add(vertex)
queue.extend(graph[vertex] - visited)
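A quick usage sketch on a small example graph; note the adjacency values must be sets, since the function computes graph[vertex] - visited:
graph = {
    'A': {'B', 'C'},
    'B': {'A', 'D', 'E'},
    'C': {'A', 'F'},
    'D': {'B'},
    'E': {'B', 'F'},
    'F': {'C', 'E'}
}
bfs(graph, 'A')  # visits A, then B and C, then D, E and F (order within a level may vary)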
57. Depth-First Search (DFS)
Another algorithm for traversing or searching tree or graph data structures. It starts at the root (or an arbitrary node in a graph) and explores as far as possible along each branch before backtracking. DFS uses a stack (either implicitly via recursion or explicitly) to keep track of the next location to visit.
def dfs(graph, start, visited=None):
if visited is None:
visited = set()
visited.add(start)
print(start, end=" ")
for next in graph[start] - visited:
dfs(graph, next, visited)
return visited
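A matching usage sketch for DFS, on the same kind of set-valued example graph:
graph = {
    'A': {'B', 'C'},
    'B': {'A', 'D', 'E'},
    'C': {'A', 'F'},
    'D': {'B'},
    'E': {'B', 'F'},
    'F': {'C', 'E'}
}
dfs(graph, 'A')  # follows one branch as deep as possible before backtracking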
58. Hash Tables
A data structure that implements an associative array, a structure that can map keys to values. It uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
Index=hash(key) mod number of buckets
Simple explanation: Imagine a library where books are organized based on the first few letters of the title. The hash function is like the system that determines which shelf a book goes on, making it easy to find later.
class HashTable:
def __init__(self):
self.table = [None] * 10
def hash_function(self, key):
return key % len(self.table)
def insert(self, key, value):
index = self.hash_function(key)
self.table[index] = value
def search(self, key):
index = self.hash_function(key)
return self.table[index]
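A short usage sketch; note that this minimal table stores only the value and silently overwrites on collisions, a limitation the linear-probing variant in the next section addresses:
ht = HashTable()
ht.insert(15, "apple")   # 15 % 10 == 5, stored in slot 5
ht.insert(7, "banana")   # 7 % 10 == 7, stored in slot 7
print(ht.search(15))     # apple
print(ht.search(7))      # banana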
59. Linear Probing
A collision resolution technique in hash tables. When a collision occurs (i.e., two keys hash to the same index), linear probing checks the next slot in the array until an empty slot is found.
Index=(hash(key)+i) mod number of buckets
Simple explanation: Imagine a parking lot where each car is assigned a specific space based on its license plate number. If that space is taken, the car moves to the next available space.
class HashTableLinearProbing:
def __init__(self):
self.table = [None] * 10
def hash_function(self, key):
return key % len(self.table)
    def insert(self, key, value):
        index = self.hash_function(key)
        # Probe forward until an empty slot is found
        while self.table[index] is not None:
            index = (index + 1) % len(self.table)
        # Store the key together with the value so lookups can verify the key
        self.table[index] = (key, value)
    def search(self, key):
        index = self.hash_function(key)
        # Probe forward until the key is found or an empty slot proves it is absent
        while self.table[index] is not None:
            if self.table[index][0] == key:
                return True
            index = (index + 1) % len(self.table)
        return False
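A usage sketch showing a collision being resolved by probing to the next free slot:
ht = HashTableLinearProbing()
ht.insert(15, "apple")    # hashes to slot 5
ht.insert(25, "banana")   # also hashes to slot 5, probes on to slot 6
print(ht.search(15))      # True
print(ht.search(25))      # True
print(ht.search(35))      # False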
60. Euclidean Algorithm
An ancient algorithm for finding the greatest common divisor (GCD) of two integers. It is based on the principle that the GCD of two numbers also divides their difference.
GCD(a,b) = GCD(b,a mod b)
Simple explanation: Imagine you’re cutting two pieces of rope of different lengths. You keep cutting the longer piece by the length of the shorter piece until they’re both the same length—the GCD is that final length.
def gcd(a, b):
while b:
a, b = b, a % b
return a
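For example:
print(gcd(48, 18))    # 6
print(gcd(270, 192))  # 6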
61. Extended Euclidean Algorithm
Extends the Euclidean algorithm for finding the GCD of two numbers to also find the coefficients (usually denoted as x and y) that satisfy Bézout’s identity.
GCD(a,b) = ax + by
Simple explanation: Not only do you find the GCD of two ropes, but you also figure out how many times you cut each rope to get to that length.
def extended_gcd(a, b):
if a == 0:
return b, 0, 1
gcd, x1, y1 = extended_gcd(b % a, a)
x = y1 - (b // a) * x1
y = x1
return gcd, x, y
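For example, for 30 and 20 the function returns the GCD together with one valid pair of Bézout coefficients:
g, x, y = extended_gcd(30, 20)
print(g, x, y)  # 10 1 -1, and indeed 30*1 + 20*(-1) == 10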
62. Haversine Formula
Used to calculate the distance between two points on the Earth given their latitude and longitude. It accounts for the spherical shape of the Earth.
a = sin²((φB − φA)/2) + cos φA * cos φB * sin²((λB − λA)/2)
c = 2 * atan2( √a, √(1−a) )
d = R ⋅ c
Simple explanation: Imagine you’re measuring the shortest path between two cities on a globe, not just as the crow flies but along the curve of the Earth’s surface.
import math
def haversine(lat1, lon1, lat2, lon2):
R = 6371.0 # Radius of Earth in kilometers
dlat = math.radians(lat2 - lat1)
dlon = math.radians(lon2 - lon1)
a = math.sin(dlat / 2) ** 2 + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2)) * math.sin(dlon / 2) ** 2
c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
distance = R * c
return distance
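For example, with the approximate coordinates of London and Paris:
print(haversine(51.5074, -0.1278, 48.8566, 2.3522))  # roughly 344 km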
63. Boyer-Moore String Search Algorithm
An efficient string searching algorithm that skips sections of the text rather than checking each character one by one. It preprocesses the pattern and uses this information to skip alignments that cannot match the pattern.
Bad character rule: Skip based on the last occurrence of a mismatched character in the pattern.
Good suffix rule: Skip based on matching suffixes within the pattern.
Simple explanation: Imagine you’re looking for a word in a book. Instead of checking every single letter, you quickly skip over sections where you know the word can’t possibly be, based on the letters you’ve already seen.
def bad_char_heuristic(string, size):
bad_char = [-1] * 256
for i in range(size):
bad_char[ord(string[i])] = i
return bad_char
def boyer_moore_search(text, pattern):
m = len(pattern)
n = len(text)
bad_char = bad_char_heuristic(pattern, m)
s = 0
while s <= n - m:
j = m - 1
while j >= 0 and pattern[j] == text[s + j]:
j -= 1
if j < 0:
print(f"Pattern occurs at shift {s}")
s += (m - bad_char[ord(text[s + m])] if s + m < n else 1)
else:
s += max(1, j - bad_char[ord(text[s + j])])
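A quick usage sketch:
text = "ABAAABCD"
pattern = "ABC"
boyer_moore_search(text, pattern)  # prints: Pattern occurs at shift 4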
64. Knuth-Morris-Pratt (KMP) Algorithm
Another efficient string searching algorithm that avoids unnecessary comparisons by preprocessing the pattern to determine the longest prefix which is also a suffix (LPS array). This allows the algorithm to skip sections of the text that have already been matched with the pattern.
LPS[i]=length of the longest prefix of the pattern which is also a suffix for the substring pattern[0..i]
If mismatch occurs after j matches, skip to LPS[j-1] and continue matching.
Simple explanation: Imagine you’re assembling a puzzle and you’ve already placed some pieces together. If the next piece doesn’t fit, instead of starting over, you skip to the next possible fitting piece based on what you’ve already placed correctly.
def kmp_search(pattern, text):
lps = [0] * len(pattern)
j = 0
compute_lps(pattern, lps)
i = 0
while i < len(text):
if pattern[j] == text[i]:
i += 1
j += 1
if j == len(pattern):
print(f"Found pattern at index {i - j}")
j = lps[j - 1]
elif i < len(text) and pattern[j] != text[i]:
if j != 0:
j = lps[j - 1]
else:
i += 1
def compute_lps(pattern, lps):
length = 0
i = 1
while i < len(pattern):
if pattern[i] == pattern[length]:
length += 1
lps[i] = length
i += 1
else:
if length != 0:
length = lps[length - 1]
else:
lps[i] = 0
i += 1
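A quick usage sketch (note the argument order is pattern first, then text):
text = "ABABDABACDABABCABAB"
pattern = "ABABCABAB"
kmp_search(pattern, text)  # prints: Found pattern at index 10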
65. Suffix Array
A data structure used for fast substring searching. It consists of an array of all the suffixes of a string, sorted in lexicographical order. This allows for efficient pattern matching, as binary search can be used on the sorted array.
Suffix Array = An array of integers giving the starting positions of suffixes of a string in lexicographical order
Simple explanation: Think of a word and list all possible endings of that word. By sorting these endings alphabetically, you can quickly find where any smaller word fits among them, making searching faster.
def suffix_array_construction(text):
suffixes = [(text[i:], i) for i in range(len(text))]
suffixes.sort()
suffix_array = [s[1] for s in suffixes]
return suffix_array
text = "banana"
suffix_array = suffix_array_construction(text)
print("Suffix Array:", suffix_array)
66. Karatsuba Algorithm
A fast multiplication algorithm that reduces the multiplication of two n-digit numbers to at most n^(log2 3) ≈ n^1.585 single-digit multiplications, compared with the n^2 of the classical schoolbook method. It breaks down large numbers into smaller components and applies recursive multiplication.
Let x = 10^m * a + b and y = 10^m * c + d
x * y = 10^(2m) * ac + 10^m * (ad + bc) + bd
Compute ac, bd, and (a+b)(c+d) − ac − bd recursively (only three multiplications are needed).
Simple explanation: Imagine multiplying two big numbers by breaking them down into smaller pieces. Instead of doing it the long way, you combine results of smaller multiplications to get the final product faster.
def karatsuba(x, y):
if x < 10 or y < 10:
return x * y
n = max(len(str(x)), len(str(y)))
m = n // 2
x_high, x_low = divmod(x, 10**m)
y_high, y_low = divmod(y, 10**m)
z0 = karatsuba(x_low, y_low)
z1 = karatsuba((x_low + x_high), (y_low + y_high))
z2 = karatsuba(x_high, y_high)
return (z2 * 10**(2*m)) + ((z1 - z2 - z0) * 10**m) + z0
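For example:
print(karatsuba(1234, 5678))  # 7006652, the same result as 1234 * 5678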
67. Strassen’s Algorithm
Used for matrix multiplication, Strassen’s algorithm reduces the complexity of matrix multiplication from O(n^3) to approximately O(n^2.81) by breaking down larger matrices into smaller submatrices and recursively applying the multiplication.
Break each n×n matrix into four (n/2)×(n/2) matrices and combine them.
Simple explanation: Imagine multiplying two large matrices. Instead of doing it the traditional way, you break them into smaller pieces, multiply these pieces, and then combine the results in a way that saves work and time.
import numpy as np
def strassen(A, B):
if len(A) == 1:
return A * B
n = len(A)
mid = n // 2
A11 = A[:mid, :mid]
A12 = A[:mid, mid:]
A21 = A[mid:, :mid]
A22 = A[mid:, mid:]
B11 = B[:mid, :mid]
B12 = B[:mid, mid:]
B21 = B[mid:, :mid]
B22 = B[mid:, mid:]
M1 = strassen(A11 + A22, B11 + B22)
M2 = strassen(A21 + A22, B11)
M3 = strassen(A11, B12 - B22)
M4 = strassen(A22, B21 - B11)
M5 = strassen(A11 + A12, B22)
M6 = strassen(A21 - A11, B11 + B12)
M7 = strassen(A12 - A22, B21 + B22)
C11 = M1 + M4 - M5 + M7
C12 = M3 + M5
C21 = M2 + M4
C22 = M1 + M3 - M2 + M6
C = np.vstack((np.hstack((C11, C12)), np.hstack((C21, C22))))
return C
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = strassen(A, B)
print(result)
68. Gaussian Elimination
A method for solving systems of linear equations. It transforms the system’s augmented matrix into row echelon form (upper triangular matrix) by performing row operations, making it easy to solve for the variables.
Apply row operations to achieve an upper triangular matrix, then back substitute to find the solution.
Simple explanation: Think of it like organizing a stack of papers. You rearrange the equations to simplify them one by one, making it easy to pull out the answer from the last equation up to the first.
import numpy as np
def gaussian_elimination(A, b):
n = len(b)
    M = A.copy()  # work on copies so the caller's arrays are not modified
    b = b.copy()
for i in range(n):
maxEl = abs(M[i][i])
maxRow = i
for k in range(i + 1, n):
if abs(M[k][i]) > maxEl:
maxEl = abs(M[k][i])
maxRow = k
for k in range(i, n):
M[maxRow][k], M[i][k] = M[i][k], M[maxRow][k]
b[maxRow], b[i] = b[i], b[maxRow]
for k in range(i + 1, n):
c = -M[k][i] / M[i][i]
for j in range(i, n):
if i == j:
M[k][j] = 0
else:
M[k][j] += c * M[i][j]
b[k] += c * b[i]
x = [0 for i in range(n)]
for i in range(n - 1, -1, -1):
x[i] = b[i] / M[i][i]
for k in range(i - 1, -1, -1):
b[k] -= M[k][i] * x[i]
return x
A = np.array([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], dtype=float)
b = np.array([8, -11, -3], dtype=float)
x = gaussian_elimination(A, b)
print("Solution:", x)
69. Fast Fourier Transform (FFT)
An algorithm to compute the discrete Fourier transform (DFT) and its inverse. FFT is widely used in signal processing, image analysis, and solving partial differential equations. It reduces the complexity of DFT from O(n^2) to O(n log n), making it efficient for processing large datasets. FFT is commonly used in audio signal processing, such as converting a sound wave into its frequency components, which is essential for music applications, voice recognition, and even medical imaging like MRI scans.
DFT: X(k) = sum_{n=0}^{N-1} x(n) * exp(-2 * pi * i * k * n / N) for k = 0, 1, ..., N-1
Where N is the number of data points, x(n) is the input signal, X(k) is the DFT output, and i is the imaginary unit.
Simple explanation: Imagine you’re trying to figure out the different notes (frequencies) playing in a piece of music. The FFT helps you break down the sound wave quickly to identify all the individual notes.
import numpy as np
def fft(x):
    N = len(x)
    if N <= 1:
        return x
    # Recursively transform the even- and odd-indexed samples
    even = fft(x[0::2])
    odd = fft(x[1::2])
    # Combine using the twiddle factors exp(-2*pi*i*k/N)
    T = [np.exp(-2j * np.pi * k / N) * odd[k] for k in range(N // 2)]
    return [even[k] + T[k] for k in range(N // 2)] + [even[k] - T[k] for k in range(N // 2)]
x = np.array([0, 1, 2, 3])
fft_result = fft(x)
print("FFT result:", fft_result)
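As a sanity check, the recursive implementation should match NumPy’s built-in FFT on the same input.
# Compare with NumPy's optimized FFT
print("np.fft.fft result:", np.fft.fft(x))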
69. Discrete Cosine Transform (DCT)
Used in image compression (e.g., JPEG). The DCT converts a signal or image from the spatial domain to the frequency domain. It is similar to the Fourier Transform but only uses cosine functions, which makes it more efficient for certain types of data, especially images. DCT is widely used in image and video compression, making it a crucial component in JPEG images and MP3 audio compression.
DCT(k) = sum_{n=0}^{N-1} x(n) * cos((pi * k * (2 * n + 1)) / (2 * N)) for k = 0, 1, ..., N-1
Where N is the length of the input data, x(n) is the input signal, and DCT(k) is the DCT output for the k-th coefficient.
Simple explanation: Imagine taking a picture and breaking it down into patterns of light and dark. The DCT helps you figure out which patterns are most important for recreating the image with fewer details, which is how JPEG compression works.
import numpy as np
import math
def dct(signal):
    N = len(signal)
    result = np.zeros(N)
    for k in range(N):
        total = 0.0
        for n in range(N):
            total += signal[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
        result[k] = total * math.sqrt(2 / N)
    return result
signal = [255, 128, 64, 32]
dct_result = dct(signal)
print("DCT result:", dct_result)
70. Euler’s Totient Function
Used in number theory and cryptography, Euler’s Totient Function, denoted as phi(n), counts the number of integers up to a given integer n that are relatively prime to n. This function is particularly important in RSA encryption, where it helps in key generation.
phi(n) = n * product(1 - 1/p) for each distinct prime p dividing n
Simple explanation: Imagine you have n people standing in a circle, numbered 1 to n. Euler’s Totient Function tells you how many of them hold a number that shares no common factor with n other than 1.
def euler_totient(n):
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result
n = 36
print("Euler's Totient Function result:", euler_totient(n))
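If you want to double-check the formula-based version, a brute-force count of the integers up to n that are coprime to n gives the same answer (the helper name below is just for illustration).
from math import gcd

def euler_totient_bruteforce(n):
    # Count the integers 1..n that share no factor with n other than 1
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

print("Brute-force check:", euler_totient_bruteforce(36))  # should also print 12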
71. Fermat’s Little Theorem
A key theorem in number theory and cryptography. It states that if p is a prime number and a is an integer not divisible by p, then a^(p-1) is congruent to 1 modulo p. This theorem is the basis for many cryptographic algorithms, including RSA encryption.
a^(p-1) ≡ 1 (mod p)
Where p is a prime number, and a is an integer such that gcd(a, p) = 1.
Simple explanation: Imagine you’re performing a magic trick where you take any number not divisible by a prime p, raise it to the power p - 1, and the remainder after dividing by p always comes out to 1. Fermat’s Little Theorem is the math behind that trick, ensuring it works every time.
def fermat_little_theorem(a, p):
    # pow(a, p - 1, p) computes a^(p-1) mod p efficiently
    return pow(a, p - 1, p) == 1
a = 3
p = 7
print("Fermat's Little Theorem holds:", fermat_little_theorem(a, p))
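The theorem also underlies the Fermat primality test: pick a few random bases a and check whether a^(n-1) ≡ 1 (mod n). The sketch below (the function name is just for illustration) rejects most composites, though rare composites known as Carmichael numbers can still pass.
import random

def fermat_primality_test(n, trials=5):
    # A composite n usually fails pow(a, n-1, n) == 1 for some random base a
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randint(2, n - 2)
        if pow(a, n - 1, n) != 1:
            return False
    return True

print("Is 97 probably prime?", fermat_primality_test(97))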
72. Chinese Remainder Theorem
Used in solving systems of simultaneous congruences. The theorem states that if one knows the remainders of the division of an integer n by several pairwise coprime integers, then one can uniquely determine the remainder of the division of n by the product of these integers. It is used in cryptography, particularly in RSA decryption.
n ≡ a_1 (mod m_1)
n ≡ a_2 (mod m_2)
...
n ≡ a_k (mod m_k)
Where the m_i are pairwise coprime.
Simple explanation: Imagine you’re trying to figure out a number that gives you a certain remainder when divided by different numbers. The Chinese Remainder Theorem helps you find that special number that fits all the conditions at once.
def chinese_remainder(n, a):
    total = 0
    prod = 1
    for n_i in n:
        prod *= n_i
    for n_i, a_i in zip(n, a):
        p = prod // n_i
        total += a_i * mul_inv(p, n_i) * p
    return total % prod

def mul_inv(a, b):
    # Modular inverse of a modulo b via the extended Euclidean algorithm
    b0 = b
    x0, x1 = 0, 1
    if b == 1:
        return 1
    while a > 1:
        q = a // b
        a, b = b, a % b
        x0, x1 = x1 - q * x0, x0
    if x1 < 0:
        x1 += b0
    return x1
n = [3, 5, 7]
a = [2, 3, 2]
print("Chinese Remainder Theorem result:", chinese_remainder(n, a))
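You can confirm the answer by checking that it leaves the requested remainder for each modulus; for n = [3, 5, 7] and a = [2, 3, 2] the result is 23.
# Verify that the result satisfies every congruence
result = chinese_remainder(n, a)
for n_i, a_i in zip(n, a):
    print(f"{result} mod {n_i} = {result % n_i} (expected {a_i})")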
73. Kalman Filter
Used in signal processing and control systems, the Kalman Filter is an algorithm that estimates the state of a dynamic system from a series of incomplete and noisy measurements. It is widely used in applications such as GPS tracking, robotics, and econometrics for predicting the future state of a system based on past measurements.
Predict:
x_k = A * x_{k-1} + B * u_k
P_k = A * P_{k-1} * A^T + Q
Update:
K_k = P_k * H^T * (H * P_k * H^T + R)^(-1)
x_k = x_k + K_k * (z_k - H * x_k)
P_k = (I - K_k * H) * P_k
Where:
x_k is the estimated state at time k,
P_k is the estimated covariance,
A is the state transition model,
B is the control input model,
u_k is the control vector,
Q is the process noise covariance,
H is the observation model,
z_k is the measurement,
R is the measurement noise covariance,
K_k is the Kalman gain.
Simple explanation: Imagine you’re trying to track a moving object (like a car) based on noisy GPS data. The Kalman Filter helps you make accurate predictions about where the car is now and where it’s heading, even if the GPS readings aren’t perfect.
import numpy as np
def kalman_filter(z, A, H, Q, R, x, P):
    I = np.eye(A.shape[1])
    # Predict
    x = A @ x
    P = A @ P @ A.T + Q
    # Update
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x = x + K @ (z - H @ x)
    P = (I - K @ H) @ P
    return x, P
A = np.array([[1, 1], [0, 1]])
H = np.array([[1, 0]])
Q = np.array([[0.1, 0.1], [0.1, 0.1]])
R = np.array([[1]])
x = np.array([[0], [1]])
P = np.eye(2)
z = np.array([[2]])
x, P = kalman_filter(z, A, H, Q, R, x, P)
print("Kalman Filter estimate:", x)
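In a real application the filter runs over a whole stream of measurements, carrying the state estimate and covariance forward at every step. A minimal sketch, using made-up measurement values:
# Feed a short sequence of noisy position measurements through the filter
measurements = [np.array([[2.1]]), np.array([[3.0]]), np.array([[4.2]])]
for z_k in measurements:
    x, P = kalman_filter(z_k, A, H, Q, R, x, P)
    print("Estimated position:", x[0, 0])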
74. Proportional-Integral-Derivative Controller
A control loop mechanism widely used in industrial control systems. PID stands for Proportional-Integral-Derivative, which are the three basic operations used to control the process. It continuously calculates an error value as the difference between a desired setpoint and a measured process variable, and applies a correction based on proportional, integral, and derivative terms. PID controllers are used in various applications, from simple household appliances like thermostats to complex industrial processes like chemical reactors.
output = Kp * e(t) + Ki * integral of e(t) dt + Kd * de(t)/dt
Where:
e(t) is the error (difference between desired and actual),
Kp, Ki, and Kd are the proportional, integral, and derivative gains, respectively.
Simple explanation: Imagine you’re driving a car and trying to maintain a constant speed. If you’re too slow, you press the gas pedal (proportional), if you’ve been going slow for a while, you press even harder (integral), and if you’re approaching the right speed, you ease off the gas to avoid overshooting (derivative).
class PIDController:
    def __init__(self, Kp, Ki, Kd, setpoint):
        self.Kp = Kp
        self.Ki = Ki
        self.Kd = Kd
        self.setpoint = setpoint
        self.integral = 0
        self.previous_error = 0

    def compute(self, measurement):
        error = self.setpoint - measurement
        self.integral += error
        derivative = error - self.previous_error
        output = self.Kp * error + self.Ki * self.integral + self.Kd * derivative
        self.previous_error = error
        return output
pid = PIDController(Kp=1.0, Ki=0.1, Kd=0.05, setpoint=100)
measurement = 90
control_signal = pid.compute(measurement)
print("Control signal:", control_signal)
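A PID controller normally runs in a feedback loop, recomputing the control signal as each new measurement arrives. The toy simulation below uses an arbitrary, made-up process response just to show the measurement being pushed toward the setpoint:
# Toy closed-loop simulation with a hypothetical process response
pid_loop = PIDController(Kp=1.0, Ki=0.1, Kd=0.05, setpoint=100)
measurement = 90
for step in range(5):
    control_signal = pid_loop.compute(measurement)
    measurement += 0.1 * control_signal  # assumed, simplified plant model
    print(f"Step {step}: measurement = {measurement:.2f}")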
75. Smith-Waterman Algorithm
Used in bioinformatics for sequence alignment, the Smith-Waterman algorithm performs local sequence alignment to identify regions of similarity between two nucleotide or protein sequences. Unlike global alignment algorithms, it finds the optimal alignment in subsequences, which is particularly useful when the sequences are of different lengths or only partially related. This algorithm is commonly used in DNA sequencing and protein structure prediction.
H(i, j) = max(0, H(i-1, j-1) + match_score, H(i-1, j) - gap_penalty, H(i, j-1) - gap_penalty)
Where:
H(i, j) is the score at position (i, j) in the alignment matrix,
match_score is the score for aligning two matching symbols,
gap_penalty is the penalty for introducing a gap in the alignment.
Simple explanation: Imagine you’re comparing two long strings of text (like DNA sequences) and trying to find the best match between smaller sections of the two. The Smith-Waterman algorithm helps you focus on the most similar parts, even if the overall sequences aren’t identical.
def smith_waterman(seq1, seq2, match_score=3, gap_penalty=2):
    m, n = len(seq1), len(seq2)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    max_score = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = H[i - 1][j - 1] + (match_score if seq1[i - 1] == seq2[j - 1] else -match_score)
            delete = H[i - 1][j] - gap_penalty
            insert = H[i][j - 1] - gap_penalty
            H[i][j] = max(0, match, delete, insert)
            max_score = max(max_score, H[i][j])
    return max_score
seq1 = "GATTACA"
seq2 = "GCATGCU"
score = smith_waterman(seq1, seq2)
print("Smith-Waterman score:", score)
76. Needleman-Wunsch Algorithm
Another algorithm for sequence alignment in bioinformatics, the Needleman-Wunsch algorithm performs global sequence alignment, aligning two sequences along their entire lengths. It is used when comparing sequences that are expected to be similar over their entire length. This algorithm is also widely used in DNA sequencing and protein structure analysis.
F(i, j) = max(F(i-1, j-1) + match_score, F(i-1, j) - gap_penalty, F(i, j-1) - gap_penalty)
Where:
F(i, j) is the score at position (i, j) in the alignment matrix,
match_score is the score for aligning two matching symbols,
gap_penalty is the penalty for introducing a gap in the alignment.
Simple explanation: Imagine you’re trying to align two similar texts to see how they match up from beginning to end. It helps you align the entire length of both texts, showing where they match and where they differ.
def needleman_wunsch(seq1, seq2, match_score=1, gap_penalty=1):
    m, n = len(seq1), len(seq2)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = -i * gap_penalty
    for j in range(1, n + 1):
        F[0][j] = -j * gap_penalty
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = F[i - 1][j - 1] + (match_score if seq1[i - 1] == seq2[j - 1] else -match_score)
            delete = F[i - 1][j] - gap_penalty
            insert = F[i][j - 1] - gap_penalty
            F[i][j] = max(match, delete, insert)
    return F[m][n]
seq1 = "GATTACA"
seq2 = "GCATGCU"
score = needleman_wunsch(seq1, seq2)
print("Needleman-Wunsch score:", score)
77. Viterbi Algorithm
Used in decoding convolutional codes and hidden Markov models, the Viterbi algorithm finds the most likely sequence of hidden states (called the Viterbi path) that results in a sequence of observed events. It is widely used in speech recognition, bioinformatics, and digital communication systems.
V_t(k) = max_{j} (V_{t-1}(j) * a_{jk}) * b_k(O_t)
Where:
V_t(k) is the probability of the most likely path ending in state k at time t,
a_{jk} is the transition probability from state j to state k,
b_k(O_t) is the probability of observing O_t given state k,
O_t is the observed event at time t.
Simple explanation: Imagine you’re trying to decode a series of signals that have been scrambled. It helps you figure out the most likely original sequence by considering all possible sequences and choosing the one with the highest probability.
def viterbi(obs, states, start_p, trans_p, emit_p):
    V = [{}]
    for st in states:
        V[0][st] = {"prob": start_p[st] * emit_p[st][obs[0]], "prev": None}
    for t in range(1, len(obs)):
        V.append({})
        for st in states:
            max_tr_prob = max(V[t-1][prev_st]["prob"] * trans_p[prev_st][st] for prev_st in states)
            for prev_st in states:
                if V[t-1][prev_st]["prob"] * trans_p[prev_st][st] == max_tr_prob:
                    max_prob = max_tr_prob * emit_p[st][obs[t]]
                    V[t][st] = {"prob": max_prob, "prev": prev_st}
                    break
    opt = []
    max_prob = max(value["prob"] for value in V[-1].values())
    previous = None
    for st, data in V[-1].items():
        if data["prob"] == max_prob:
            opt.append(st)
            previous = st
            break
    for t in range(len(V) - 2, -1, -1):
        opt.insert(0, V[t + 1][previous]["prev"])
        previous = V[t + 1][previous]["prev"]
    return opt
obs = ('normal', 'cold', 'dizzy')
states = ('Healthy', 'Fever')
start_p = {'Healthy': 0.6, 'Fever': 0.4}
trans_p = {
'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
'Fever': {'Healthy': 0.4, 'Fever': 0.6}
}
emit_p = {
'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
'Fever': {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6}
}
result = viterbi(obs, states, start_p, trans_p, emit_p)
print("Viterbi path:", result)
78. Ford-Fulkerson Algorithm
Used to compute the maximum flow in a flow network, the Ford-Fulkerson algorithm finds the maximum flow possible from a source to a sink in a flow network by repeatedly searching for augmenting paths in the network. It is widely used in network routing, resource allocation, and scheduling problems.
Initialize the flow in all edges to 0.
While there is an augmenting path from the source to the sink, add this path to the flow.
Update the residual capacities of the edges and reverse edges along the path.
The algorithm terminates when no more augmenting paths are found.
Simple explanation: Imagine you’re trying to send as much water as possible through a network of pipes from a reservoir (source) to a storage tank (sink). It helps you figure out the maximum amount of water that can flow through the pipes without exceeding the capacity of any pipe.
class Graph:
    def __init__(self, graph):
        self.graph = graph
        self.ROW = len(graph)

    def bfs(self, s, t, parent):
        visited = [False] * self.ROW
        queue = []
        queue.append(s)
        visited[s] = True
        while queue:
            u = queue.pop(0)
            for ind, val in enumerate(self.graph[u]):
                if visited[ind] == False and val > 0:
                    queue.append(ind)
                    visited[ind] = True
                    parent[ind] = u
        return True if visited[t] else False

    def ford_fulkerson(self, source, sink):
        parent = [-1] * self.ROW
        max_flow = 0
        while self.bfs(source, sink, parent):
            path_flow = float("Inf")
            s = sink
            while s != source:
                path_flow = min(path_flow, self.graph[parent[s]][s])
                s = parent[s]
            max_flow += path_flow
            v = sink
            while v != source:
                u = parent[v]
                self.graph[u][v] -= path_flow
                self.graph[v][u] += path_flow
                v = parent[v]
        return max_flow
graph = [[0, 16, 13, 0, 0, 0],
[0, 0, 10, 12, 0, 0],
[0, 4, 0, 0, 14, 0],
[0, 0, 9, 0, 0, 20],
[0, 0, 0, 7, 0, 4],
[0, 0, 0, 0, 0, 0]]
g = Graph(graph)
source = 0
sink = 5
print("The maximum possible flow is", g.ford_fulkerson(source, sink))
79. Edmonds-Karp Algorithm
An implementation of the Ford-Fulkerson method that uses Breadth-First Search (BFS) to find the shortest augmenting path in a flow network. Because BFS always picks the augmenting path with the fewest edges, the number of augmenting iterations is bounded, giving the algorithm an overall time complexity of O(V * E^2), where V is the number of vertices and E is the number of edges. It is widely used in network flow problems, such as matching and resource allocation.
Simple explanation: Just like in the Ford-Fulkerson algorithm, imagine you’re sending water through pipes. But in Edmonds-Karp, you always choose the shortest path (in terms of the number of pipes) to send more water, making sure that the process is efficient.
from collections import deque
class Graph:
    def __init__(self, graph):
        self.graph = graph
        self.ROW = len(graph)

    def bfs(self, s, t, parent):
        visited = [False] * self.ROW
        queue = deque([s])
        visited[s] = True
        while queue:
            u = queue.popleft()
            for ind, val in enumerate(self.graph[u]):
                if not visited[ind] and val > 0:
                    queue.append(ind)
                    visited[ind] = True
                    parent[ind] = u
        return visited[t]

    def edmonds_karp(self, source, sink):
        parent = [-1] * self.ROW
        max_flow = 0
        while self.bfs(source, sink, parent):
            path_flow = float('Inf')
            s = sink
            while s != source:
                path_flow = min(path_flow, self.graph[parent[s]][s])
                s = parent[s]
            max_flow += path_flow
            v = sink
            while v != source:
                u = parent[v]
                self.graph[u][v] -= path_flow
                self.graph[v][u] += path_flow
                v = parent[v]
        return max_flow
graph = [[0, 16, 13, 0, 0, 0],
[0, 0, 10, 12, 0, 0],
[0, 4, 0, 0, 14, 0],
[0, 0, 9, 0, 0, 20],
[0, 0, 0, 7, 0, 4],
[0, 0, 0, 0, 0, 0]]
g = Graph(graph)
source = 0
sink = 5
print("The maximum possible flow is", g.edmonds_karp(source, sink))
80. Kruskal’s Algorithm
Used to find the minimum spanning tree (MST) for a connected, weighted graph, Kruskal’s algorithm works by sorting all edges in the graph by weight and adding the smallest edge to the MST if it doesn’t form a cycle. It is commonly used in network design, such as laying down cables or roads that connect multiple locations with the least cost.
Sort all the edges in non-decreasing order of their weight.
Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far. If a cycle is not formed, include this edge. If a cycle is formed, discard it.
Repeat step 2 until there are V-1 edges in the spanning tree, where V is the number of vertices.
Simple explanation: Imagine you’re building a road network between cities and want to use the least amount of material. It helps you connect all cities with the shortest and cheapest roads, avoiding loops.
class Graph:
    def __init__(self, vertices):
        self.V = vertices
        self.graph = []

    def add_edge(self, u, v, w):
        self.graph.append([u, v, w])

    def find(self, parent, i):
        if parent[i] == i:
            return i
        return self.find(parent, parent[i])

    def union(self, parent, rank, x, y):
        root_x = self.find(parent, x)
        root_y = self.find(parent, y)
        if rank[root_x] < rank[root_y]:
            parent[root_x] = root_y
        elif rank[root_x] > rank[root_y]:
            parent[root_y] = root_x
        else:
            parent[root_y] = root_x
            rank[root_x] += 1

    def kruskal(self):
        result = []
        i, e = 0, 0
        self.graph = sorted(self.graph, key=lambda item: item[2])
        parent = []
        rank = []
        for node in range(self.V):
            parent.append(node)
            rank.append(0)
        while e < self.V - 1:
            u, v, w = self.graph[i]
            i = i + 1
            x = self.find(parent, u)
            y = self.find(parent, v)
            if x != y:
                e = e + 1
                result.append([u, v, w])
                self.union(parent, rank, x, y)
        return result
g = Graph(4)
g.add_edge(0, 1, 10)
g.add_edge(0, 2, 6)
g.add_edge(0, 3, 5)
g.add_edge(1, 3, 15)
g.add_edge(2, 3, 4)
mst = g.kruskal()
print("Minimum Spanning Tree:", mst)
81. Prim’s Algorithm
Another algorithm for finding the minimum spanning tree (MST) of a connected, weighted graph, Prim’s algorithm starts with a single vertex and grows the MST by adding the cheapest possible connection from the tree to another vertex. This algorithm is also used in network design, especially when dealing with dense graphs.
Initialize a tree with a single vertex, chosen arbitrarily from the graph.
Grow the tree by adding the smallest edge that connects a vertex in the tree to a vertex outside the tree.
Repeat step 2 until all vertices are included in the tree.
Simple explanation: Imagine you’re setting up electrical wiring in a building, starting from the power source and connecting one room at a time using the shortest possible wires. It helps you find the most efficient way to wire the entire building.
import heapq
def prim(graph, start):
    mst = []
    visited = set()
    min_heap = [(0, start, None)]
    while min_heap:
        weight, u, prev = heapq.heappop(min_heap)
        if u in visited:
            continue
        visited.add(u)
        if prev is not None:
            mst.append((prev, u, weight))
        for v, w in graph[u]:
            if v not in visited:
                heapq.heappush(min_heap, (w, v, u))
    return mst
graph = {
0: [(1, 10), (2, 6), (3, 5)],
1: [(0, 10), (3, 15)],
2: [(0, 6), (3, 4)],
3: [(0, 5), (1, 15), (2, 4)]
}
mst = prim(graph, 0)
print("Minimum Spanning Tree:", mst)
82. Borůvka’s Algorithm
Yet another minimum spanning tree algorithm, Borůvka’s algorithm is designed to find the MST by repeatedly adding the smallest edge from each component to another component. It is particularly effective for parallel processing and can be implemented to run efficiently on distributed systems.
Start with a forest where each vertex is a separate component.
For each component, find the smallest edge connecting it to another component, and add this edge to the MST.
Merge the components connected by these edges.
Repeat steps 2 and 3 until only one component remains, which is the MST.
Simple explanation: Imagine a group of islands, each with its own village, and you want to build bridges to connect all the villages using the least amount of material. It helps you build the shortest and cheapest bridges, connecting one village at a time.
class Graph:
def __init__(self, vertices):
self.V = vertices
self.graph = []
def add_edge(self, u, v, w):
self.graph.append([u, v, w])
def find(self, parent, i):
if parent[i] == i:
return i
return self.find(parent, parent[i])
def union(self, parent, rank, x, y):
xroot = self.find(parent, x)
yroot = self.find(parent, y)
if rank[xroot] < rank[yroot]:
parent[xroot] = yroot
elif rank[xroot] > rank[yroot]:
parent[yroot] = xroot
else:
parent[yroot] = xroot
rank[xroot] += 1
def boruvka(self):
parent = []
rank = []
cheapest = [-1] * self.V
numTrees = self.V
MSTweight = 0
for node in range(self.V):
parent.append(node)
rank.append(0)
while numTrees > 1:
for i in range(len(self.graph)):
u, v, w = self.graph[i]
set1 = self.find(parent, u)
set2 = self.find(parent, v)
if set1 != set2:
if cheapest[set1] == -1 or cheapest[set1][2] > w:
cheapest[set1] = [u, v, w]
if cheapest[set2] == -1 or cheapest[set2][2] > w:
cheapest[set2] = [u, v, w]
for node in range(self.V):
if cheapest[node] != -1:
u, v, w = cheapest[node]
set1 = self.find(parent, u)
set2 = self.find(parent, v)
if set1 != set2:
MSTweight += w
self.union(parent, rank, set1, set2)
numTrees -= 1
cheapest = [-1] * self.V
return MSTweight
g = Graph(4)
g.add_edge(0, 1, 10)
g.add_edge(0, 2, 6)
g.add_edge(0, 3, 5)
g.add_edge(1, 3, 15)
g.add_edge(2, 3, 4)
mst_weight = g.boruvka()
print("Weight of MST:", mst_weight)
83. Heap Sort
A comparison-based sorting algorithm that uses a binary heap data structure. Heap Sort works by building a max heap from the input data, and then repeatedly extracting the maximum element from the heap and reconstructing the heap with the remaining elements. It is an in-place (though not stable) sorting algorithm and has a time complexity of O(n log n). Heap Sort is used in scenarios where an in-place, efficient sorting algorithm with guaranteed worst-case performance is required, such as in embedded systems or memory-constrained environments.
Build a max heap from the input data.
Extract the maximum element from the heap, and place it at the end of the array.
Reduce the heap size and heapify the root of the heap.
Repeat steps 2 and 3 until the heap is empty.
Simple explanation: Imagine a pile of rocks of different sizes, and you want to sort them from largest to smallest. Heap Sort helps you pick the largest rock first, then the next largest, and so on, until all the rocks are neatly sorted.
def heapify(arr, n, i):
    largest = i
    l = 2 * i + 1
    r = 2 * i + 2
    if l < n and arr[i] < arr[l]:
        largest = l
    if r < n and arr[largest] < arr[r]:
        largest = r
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

def heap_sort(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    for i in range(n - 1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]
        heapify(arr, i, 0)
arr = [12, 11, 13, 5, 6, 7]
heap_sort(arr)
print("Sorted array:", arr)
84. Shell Sort
An in-place comparison-based sorting algorithm that generalizes insertion sort by allowing the exchange of items that are far apart. The idea is to arrange the list of elements so that, starting anywhere, considering every h-th element gives a sorted list. Such a list is said to be h-sorted. The algorithm repeats this process with smaller and smaller values of h, eventually using h = 1. Shell Sort is used in scenarios where a simple, efficient, and in-place sorting algorithm is required, particularly for small to medium-sized datasets.
Start with a large gap and reduce the gap in each iteration.
Perform a gapped insertion sort for the current gap.
Repeat step 2 until the gap is 1.
Simple explanation: Imagine you have a row of books, and you want to sort them by height. Instead of sorting adjacent books, you first sort books that are a few places apart, then sort books that are closer together, and finally sort adjacent books. This method helps to organize the books faster.
def shell_sort(arr):
    n = len(arr)
    gap = n // 2
    while gap > 0:
        for i in range(gap, n):
            temp = arr[i]
            j = i
            while j >= gap and arr[j - gap] > temp:
                arr[j] = arr[j - gap]
                j -= gap
            arr[j] = temp
        gap //= 2
arr = [12, 34, 54, 2, 3]
shell_sort(arr)
print("Sorted array:", arr)
85. Counting Sort
A non-comparison-based sorting algorithm that works by counting the number of occurrences of each distinct element in the input array. The algorithm then uses this count to determine the position of each element in the output array. Counting Sort is particularly efficient when the range of input values (k) is not significantly greater than the number of elements to be sorted (n). It is commonly used in scenarios where all elements are integers within a known range, such as sorting exam scores or age data.
Find the maximum value in the input array.
Create a count array of size equal to the maximum value plus one.
Count the occurrences of each element in the input array and store them in the count array.
Modify the count array such that each element at each index stores the sum of previous counts.
Use the count array to place the elements in the correct position in the output array.
Simple explanation: Imagine you’re sorting a group of students based on their scores. Instead of comparing each student, you count how many students got each score, then use that count to arrange them in order.
def counting_sort(arr):
    max_val = max(arr)
    count = [0] * (max_val + 1)
    output = [0] * len(arr)
    for num in arr:
        count[num] += 1
    for i in range(1, len(count)):
        count[i] += count[i - 1]
    for num in reversed(arr):
        output[count[num] - 1] = num
        count[num] -= 1
    for i in range(len(arr)):
        arr[i] = output[i]
arr = [4, 2, 2, 8, 3, 3, 1]
counting_sort(arr)
print("Sorted array:", arr)
86. Radix Sort
Another non-comparison-based sorting algorithm, Radix Sort sorts numbers by processing individual digits. It works by sorting the numbers digit by digit, starting from the least significant digit to the most significant digit, using a stable sorting algorithm like Counting Sort at each step. Radix Sort is particularly efficient for sorting large numbers or strings where the number of digits or characters is fixed. It is commonly used in applications like sorting large sets of integers or strings, such as telephone numbers or social security numbers.
Find the maximum number to determine the number of digits.
Perform Counting Sort for each digit, starting from the least significant digit to the most significant digit.
Repeat until all digits have been processed.
Simple explanation: Imagine you have a stack of cards with numbers on them, and you want to sort them. Instead of comparing the entire number, you sort the cards based on the last digit first, then the second-to-last digit, and so on, until the entire number is sorted.
def counting_sort_for_radix(arr, exp):
    n = len(arr)
    output = [0] * n
    count = [0] * 10
    for i in range(n):
        index = arr[i] // exp
        count[index % 10] += 1
    for i in range(1, 10):
        count[i] += count[i - 1]
    for i in range(n - 1, -1, -1):
        index = arr[i] // exp
        output[count[index % 10] - 1] = arr[i]
        count[index % 10] -= 1
    for i in range(n):
        arr[i] = output[i]

def radix_sort(arr):
    max_val = max(arr)
    exp = 1
    while max_val // exp > 0:
        counting_sort_for_radix(arr, exp)
        exp *= 10
arr = [170, 45, 75, 90, 802, 24, 2, 66]
radix_sort(arr)
print("Sorted array:", arr)
87. Bucket Sort
A distribution sort algorithm that divides the input elements into several buckets, sorts each bucket individually (using a different sorting algorithm, such as Insertion Sort), and then concatenates all the sorted buckets to produce the final sorted array. Bucket Sort is particularly effective when the input is uniformly distributed across a range. It is commonly used in scenarios like sorting floating-point numbers in a limited range, such as GPA scores or distribution of grades.
Create an empty array of buckets.
Scatter the elements from the input array into the appropriate buckets.
Sort each bucket individually.
Concatenate the sorted buckets to get the sorted array.
Simple explanation: Imagine you’re sorting a collection of marbles of different colors. First, you put the marbles into buckets based on their color. Then, you sort the marbles in each bucket, and finally, you combine all the sorted buckets into one sorted collection.
def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key

def bucket_sort(arr):
    bucket_count = len(arr)
    buckets = [[] for _ in range(bucket_count)]
    for num in arr:
        # Clamp the index so a value of exactly 1.0 still lands in the last bucket
        index = min(int(num * bucket_count), bucket_count - 1)
        buckets[index].append(num)
    for bucket in buckets:
        insertion_sort(bucket)
    sorted_arr = []
    for bucket in buckets:
        sorted_arr.extend(bucket)
    return sorted_arr
arr = [0.897, 0.565, 0.656, 0.123, 0.665, 0.343]
sorted_arr = bucket_sort(arr)
print("Sorted array:", sorted_arr)
88. Topological Sort
Used in scheduling tasks or ordering data, Topological Sort is an algorithm that orders the vertices of a directed acyclic graph (DAG) such that for every directed edge u -> v, vertex u comes before vertex v in the ordering. It is commonly used in scenarios where there are dependencies between tasks, such as determining the order in which to compile files in a programming project or scheduling jobs in a workflow.
Identify a vertex with no incoming edges and remove it from the graph, adding it to the sorted list.
Remove the outgoing edges of the identified vertex.
Repeat steps 1 and 2 until the graph is empty.
The list of removed vertices forms the topological order.
Simple explanation: Imagine you’re organizing a set of tasks where some tasks must be completed before others. It helps you find the order in which to complete the tasks without violating any dependencies.
from collections import defaultdict, deque
def topological_sort(V, adj):
    in_degree = [0] * V
    for i in range(V):
        for j in adj[i]:
            in_degree[j] += 1
    queue = deque([i for i in range(V) if in_degree[i] == 0])
    topo_order = []
    while queue:
        u = queue.popleft()
        topo_order.append(u)
        for v in adj[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)
    return topo_order
V = 6
adj = defaultdict(list)
adj[5].append(2)
adj[5].append(0)
adj[4].append(0)
adj[4].append(1)
adj[2].append(3)
adj[3].append(1)
topo_order = topological_sort(V, adj)
print("Topological Order:", topo_order)
89. Gale-Shapley Algorithm
Used in matching problems (e.g., stable marriage problem), the Gale-Shapley algorithm finds a stable matching between two equally sized sets of elements based on their preferences. A matching is stable if there is no pair of elements that would rather be matched with each other than with their current partners. The algorithm is widely used in college admissions processes, matching residents to hospitals, and other applications where stable pairings are desired.
Each element in one set proposes to its most preferred partner in the other set that hasn't rejected it yet.
Each element in the other set either accepts the proposal (if it is unmatched or prefers the proposer over its current match) or rejects it.
Repeat steps 1 and 2 until all elements are matched.
Simple explanation: Imagine a group of students and a group of colleges, each with preferences for one another. It helps match students to colleges in such a way that no student and college would prefer each other over their current match.
def stable_marriage(n, men_preferences, women_preferences):
    free_men = list(range(n))
    women_partner = [-1] * n
    men_next_proposal = [0] * n
    women_preference_order = [[-1] * n for _ in range(n)]
    for w in range(n):
        for rank, m in enumerate(women_preferences[w]):
            women_preference_order[w][m] = rank
    while free_men:
        m = free_men.pop(0)
        w = men_preferences[m][men_next_proposal[m]]
        men_next_proposal[m] += 1
        if women_partner[w] == -1:
            women_partner[w] = m
        else:
            m_prime = women_partner[w]
            if women_preference_order[w][m] < women_preference_order[w][m_prime]:
                women_partner[w] = m
                free_men.append(m_prime)
            else:
                free_men.append(m)
    return women_partner
n = 4
men_preferences = [
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3]
]
women_preferences = [
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3]
]
matches = stable_marriage(n, men_preferences, women_preferences)
print("Stable matches:", matches)
90. Hungarian Algorithm
Used for assignment problems in combinatorial optimization, the Hungarian algorithm finds the optimal assignment of resources to tasks that minimizes the total cost. It is particularly effective for solving the assignment problem, where there is a one-to-one correspondence between tasks and agents, each with an associated cost. This algorithm is commonly used in tasks such as job assignments, task scheduling, and resource allocation.
Subtract the smallest value in each row from all the values in that row.
Subtract the smallest value in each column from all the values in that column.
Cover all the zeros in the resulting matrix using a minimum number of horizontal and vertical lines.
If the minimum number of covering lines is equal to the number of rows (or columns), an optimal assignment is possible. If not, modify the matrix and repeat until an optimal assignment is found.
Simple explanation: Imagine you’re trying to assign workers to tasks such that the total time to complete all tasks is minimized. It helps you figure out the best way to assign each worker to a task, minimizing the overall effort.
import numpy as np
def hungarian_algorithm(cost_matrix):
    cost_matrix = np.array(cost_matrix)
    n = cost_matrix.shape[0]
    # Step 1: Subtract the smallest value in each row from all elements of that row
    for i in range(n):
        cost_matrix[i] -= np.min(cost_matrix[i])
    # Step 2: Subtract the smallest value in each column from all elements of that column
    for j in range(n):
        cost_matrix[:, j] -= np.min(cost_matrix[:, j])
    # Step 3: Cover the zeros with horizontal and vertical lines
    # (simplified: every row and column containing a zero is marked as covered,
    # which over-counts compared with a true minimum line cover)
    while True:
        row_covered = [False] * n
        col_covered = [False] * n
        covered_zeros = [(i, j) for i in range(n) for j in range(n) if cost_matrix[i, j] == 0]
        for i, j in covered_zeros:
            row_covered[i] = True
            col_covered[j] = True
        if sum(row_covered) + sum(col_covered) >= n:
            break
        # Step 4: Modify the matrix if the number of covering lines is not sufficient
        min_uncovered = np.min(cost_matrix[np.ix_(~np.array(row_covered), ~np.array(col_covered))])
        for i in range(n):
            if not row_covered[i]:
                cost_matrix[i] -= min_uncovered
        for j in range(n):
            if col_covered[j]:
                cost_matrix[:, j] += min_uncovered
    # Note: this sketch stops at the reduced matrix; it does not extract the final assignment
    return cost_matrix
cost_matrix = [
[4, 2, 8, 7],
[6, 4, 6, 5],
[8, 3, 7, 4],
[7, 5, 6, 6]
]
result = hungarian_algorithm(cost_matrix)
print("Modified cost matrix after applying Hungarian Algorithm:")
print(result)
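Note that the snippet above stops at the reduced cost matrix rather than extracting the final assignment. If you need the actual optimal pairing in practice, SciPy ships a ready-made solver for the assignment problem:
from scipy.optimize import linear_sum_assignment

# Solve the assignment problem on the original cost matrix
row_ind, col_ind = linear_sum_assignment(np.array(cost_matrix))
print("Optimal assignment (row -> column):", list(zip(row_ind.tolist(), col_ind.tolist())))
print("Minimum total cost:", np.array(cost_matrix)[row_ind, col_ind].sum())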
91. Simulated Annealing
Used in optimization problems to find a good approximation to the global optimum of a given function, Simulated Annealing is a probabilistic technique inspired by the annealing process in metallurgy. The algorithm explores the solution space by occasionally accepting worse solutions to escape local optima, gradually reducing the acceptance probability over time. It is widely used in combinatorial optimization problems, such as the traveling salesman problem and job scheduling.
Start with an initial solution and an initial "temperature."
Randomly perturb the current solution to generate a new solution.
If the new solution is better, accept it. If it is worse, accept it with a probability that decreases with temperature.
Gradually reduce the temperature and repeat steps 2-3 until the system "freezes" at a low temperature.
Simple explanation: Imagine you’re searching for the lowest point in a bumpy landscape. Instead of just going downhill, you sometimes take small steps uphill to avoid getting stuck in a valley, gradually reducing your willingness to go uphill as you get closer to the lowest point.
import math
import random
def simulated_annealing(cost_function, initial_solution, temp, cooling_rate, stop_temp):
    current_solution = initial_solution
    current_cost = cost_function(current_solution)
    best_solution = current_solution
    best_cost = current_cost
    while temp > stop_temp:
        # Perturb the current solution slightly
        new_solution = current_solution + random.uniform(-1, 1)
        new_cost = cost_function(new_solution)
        # Always accept improvements; accept worse moves with a temperature-dependent probability
        if new_cost < current_cost or math.exp((current_cost - new_cost) / temp) > random.random():
            current_solution = new_solution
            current_cost = new_cost
            if new_cost < best_cost:
                best_solution = new_solution
                best_cost = new_cost
        temp *= cooling_rate
    return best_solution, best_cost

def cost_function(x):
    return x**2 + 4*math.sin(5*x) + math.cos(10*x)
initial_solution = random.uniform(-10, 10)
temp = 1000
cooling_rate = 0.95
stop_temp = 1e-3
best_solution, best_cost = simulated_annealing(cost_function, initial_solution, temp, cooling_rate, stop_temp)
print("Best solution:", best_solution)
print("Best cost:", best_cost)
92. Ant Colony Optimization
A probabilistic technique for finding optimal paths, Ant Colony Optimization (ACO) is inspired by the foraging behavior of ants. The algorithm simulates the movement of ants in a graph, where they leave pheromones on the paths they traverse. Over time, shorter paths accumulate more pheromones, attracting more ants and converging on the optimal solution. ACO is widely used in solving problems like the traveling salesman problem, network routing, and scheduling.
Initialize pheromone levels on all paths.
Each ant constructs a solution by moving through the graph based on pheromone levels and heuristic information.
Update the pheromone levels based on the quality of the solutions found.
Evaporate some of the pheromone to avoid convergence on suboptimal solutions.
Repeat steps 2-4 until convergence or a stopping criterion is met.
Simple explanation: Imagine you’re an ant looking for food. You start wandering randomly, but if you find food, you leave a trail (pheromone) on your way back. Other ants will follow the strongest trails, leading the colony to the shortest path to the food.
import random
class AntColonyOptimization:
def __init__(self, distance_matrix, n_ants, n_best, n_iterations, decay, alpha=1, beta=1):
self.distances = distance_matrix
self.pheromone = [[1 / len(distance_matrix) for _ in range(len(distance_matrix))] for _ in range(len(distance_matrix))]
self.all_inds = range(len(distance_matrix))
self.n_ants = n_ants
self.n_best = n_best
self.n_iterations = n_iterations
self.decay = decay
self.alpha = alpha
self.beta = beta
def run(self):
shortest_path = None
all_time_shortest_path = ("placeholder", float("inf"))
for i in range(self.n_iterations):
all_paths = self.gen_all_paths()
self.spread_pheromone(all_paths, self.n_best)
shortest_path = min(all_paths, key=lambda x: x[1])
if shortest_path[1] < all_time_shortest_path[1]:
all_time_shortest_path = shortest_path
self.pheromone = [[self.decay_pheromone(self.pheromone[i][j]) for j in range(len(self.pheromone[i]))] for i in range(len(self.pheromone))]
return all_time_shortest_path
def gen_path_dist(self, path):
total_dist = 0
for i in range(len(path)):
total_dist += self.distances[path[i-1]][path[i]]
return total_dist
def spread_pheromone(self, all_paths, n_best):
sorted_paths = sorted(all_paths, key=lambda x: x[1])
for path, dist in sorted_paths[:n_best]:
for move in path:
self.pheromone[move[0]][move[1]] += 1.0 / self.distances[move[0]][move[1]]
def gen_all_paths(self):
all_paths = []
for i in range(self.n_ants):
path = self.gen_path(0)
all_paths.append((path, self.gen_path_dist(path)))
return all_paths
def gen_path(self, start):
path = []
visited = set()
visited.add(start)
prev = start
for i in range(len(self.distances) - 1):
move = self.pick_move(self.pheromone[prev], self.distances[prev], visited)
path.append((prev, move))
prev = move
visited.add(move)
path.append((prev, start))
return path
def pick_move(self, pheromone, dist, visited):
pheromone = [pow(pheromone[i], self.alpha) * pow((1.0 / dist[i]), self.beta) if i not in visited else 0 for i in range(len(pheromone))]
norm_pheromone = [float(i) / sum(pheromone) for i in pheromone]
move = 0
rand = random.uniform(0, 1)
for i, prob in enumerate(norm_pheromone):
rand -= prob
if rand <= 0:
move = i
break
return move
def decay_pheromone(self, pheromone):
return pheromone * self.decay
distance_matrix = [
[0, 2, 2, 5, 7],
[2, 0, 4, 8, 2],
[2, 4, 0, 1, 3],
[5, 8, 1, 0, 2],
[7, 2, 3, 2, 0]
]
aco = AntColonyOptimization(distance_matrix, n_ants=10, n_best=2, n_iterations=100, decay=0.95, alpha=1, beta=2)
shortest_path = aco.run()
print("Shortest path:", shortest_path)
93. Particle Swarm Optimization
Used in computational intelligence and optimization, Particle Swarm Optimization (PSO) is inspired by the social behavior of birds flocking or fish schooling. Each particle in the swarm represents a potential solution and moves through the solution space by adjusting its position based on its own experience and the experience of neighboring particles. PSO is commonly used in optimization problems where the solution space is large and complex, such as neural network training, function optimization, and industrial design.
Initialize a swarm of particles with random positions and velocities.
Evaluate the fitness of each particle based on a predefined objective function.
Update the velocity and position of each particle based on its personal best position and the global best position found by the swarm.
Repeat steps 2-3 until convergence or a stopping criterion is met.
Simple explanation: Imagine a group of birds flying around looking for food. Each bird adjusts its flight path based on where it found food in the past and where it sees other birds finding food. Together, they converge on the best feeding spot.
import random
def particle_swarm_optimization(cost_function, bounds, num_particles, num_iterations):
    particles = [{'position': [random.uniform(bounds[i][0], bounds[i][1]) for i in range(len(bounds))],
                  'velocity': [random.uniform(-1, 1) for _ in range(len(bounds))],
                  'best_position': None,
                  'best_cost': float('inf')}
                 for _ in range(num_particles)]
    global_best_position = None
    global_best_cost = float('inf')
    # Initialise personal and global bests (copy the lists so later in-place updates don't alias them)
    for particle in particles:
        particle['best_position'] = list(particle['position'])
        particle['best_cost'] = cost_function(particle['position'])
        if particle['best_cost'] < global_best_cost:
            global_best_position = list(particle['position'])
            global_best_cost = particle['best_cost']
    for _ in range(num_iterations):
        for particle in particles:
            for i in range(len(bounds)):
                r1 = random.random()
                r2 = random.random()
                cognitive_velocity = 2 * r1 * (particle['best_position'][i] - particle['position'][i])
                social_velocity = 2 * r2 * (global_best_position[i] - particle['position'][i])
                particle['velocity'][i] = 0.5 * particle['velocity'][i] + cognitive_velocity + social_velocity
                particle['position'][i] += particle['velocity'][i]
            current_cost = cost_function(particle['position'])
            if current_cost < particle['best_cost']:
                particle['best_position'] = list(particle['position'])
                particle['best_cost'] = current_cost
                if current_cost < global_best_cost:
                    global_best_position = list(particle['position'])
                    global_best_cost = current_cost
    return global_best_position, global_best_cost

def cost_function(x):
    # Sphere function: minimum at the origin
    return sum(x_i**2 for x_i in x)
bounds = [(-10, 10), (-10, 10)]
num_particles = 30
num_iterations = 100
best_position, best_cost = particle_swarm_optimization(cost_function, bounds, num_particles, num_iterations)
print("Best position:", best_position)
print("Best cost:", best_cost)
94. Genetic Algorithm
Used in search and optimization problems, Genetic Algorithms (GAs) are inspired by the process of natural selection. GAs evolve a population of candidate solutions towards better solutions by applying genetic operators such as selection, crossover, and mutation. GAs are widely used in fields such as artificial intelligence, robotics, bioinformatics, and engineering design.
Initialize a population of individuals with random genes.
Evaluate the fitness of each individual based on a predefined fitness function.
Select individuals to reproduce based on their fitness.
Apply crossover and mutation to generate new offspring.
Replace the old population with the new offspring.
Repeat steps 2-5 until convergence or a stopping criterion is met.
Simple explanation: Imagine you have a group of creatures, and you’re trying to breed them to get the strongest offspring. Over many generations, by combining the best traits of the parents and occasionally introducing new traits (mutations), you end up with creatures that are better adapted to their environment.
import random
def genetic_algorithm(fitness_function, bounds, population_size, mutation_rate, generations):
    population = [[random.uniform(bounds[i][0], bounds[i][1]) for i in range(len(bounds))] for _ in range(population_size)]
    best_individual = None
    best_fitness = float('inf')
    for _ in range(generations):
        population = sorted(population, key=lambda x: fitness_function(x))
        if fitness_function(population[0]) < best_fitness:
            best_individual = population[0]
            best_fitness = fitness_function(population[0])
        next_population = population[:population_size//2]
        while len(next_population) < population_size:
            parent1 = random.choice(next_population)
            parent2 = random.choice(next_population)
            child = crossover(parent1, parent2)
            child = mutate(child, mutation_rate, bounds)
            next_population.append(child)
        population = next_population
    return best_individual, best_fitness

def crossover(parent1, parent2):
    return [(p1 + p2) / 2 for p1, p2 in zip(parent1, parent2)]

def mutate(individual, mutation_rate, bounds):
    return [i + random.uniform(-mutation_rate, mutation_rate) if random.random() < mutation_rate else i for i in individual]

def fitness_function(x):
    return sum(x_i**2 for x_i in x)
bounds = [(-10, 10), (-10, 10)]
population_size = 100
mutation_rate = 0.1
generations = 100
best_individual, best_fitness = genetic_algorithm(fitness_function, bounds, population_size, mutation_rate, generations)
print("Best individual:", best_individual)
print("Best fitness:", best_fitness)
95. Newton’s Method
An iterative method for finding successively better approximations to the roots (or zeros) of a real-valued function. Newton’s Method is widely used in numerical analysis, physics, engineering, and economics to solve nonlinear equations. The method is particularly useful when an analytic solution is difficult or impossible to obtain.
Start with an initial guess x_0.
Compute the next approximation using the formula: x_{n+1} = x_n - f(x_n) / f'(x_n).
Repeat step 2 until the difference between successive approximations is smaller than a predetermined tolerance level, or a maximum number of iterations is reached.
Simple explanation: Imagine you’re trying to find the exact point where a curve crosses the x-axis (a root). Newton’s Method helps you start with a rough guess and gradually refine it, getting closer and closer to the root with each step.
def newtons_method(f, f_prime, x0, tolerance=1e-7, max_iterations=100):
    x_n = x0
    for n in range(max_iterations):
        fx_n = f(x_n)
        if abs(fx_n) < tolerance:
            return x_n
        fpx_n = f_prime(x_n)
        if fpx_n == 0:
            return None
        x_n = x_n - fx_n / fpx_n
    return None
f = lambda x: x**2 - 2
f_prime = lambda x: 2*x
initial_guess = 1.0
root = newtons_method(f, f_prime, initial_guess)
print("Root:", root)
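Since f(x) = x^2 - 2 here, the root being approximated is the square root of 2, so the result should agree with math.sqrt(2) to roughly seven decimal places given the default tolerance.
import math
print("math.sqrt(2):", math.sqrt(2))  # the Newton iterate should match to ~7 decimal places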
96. Lagrange Interpolation
A polynomial interpolation method used to estimate the value of a function given a set of points. Lagrange Interpolation finds the polynomial of the lowest degree that passes through all the given points. It is commonly used in numerical analysis, data fitting, and computer graphics to reconstruct curves and surfaces from discrete data points.
Given a set of n data points (x_0, y_0), (x_1, y_1), ..., (x_{n-1}, y_{n-1}), construct the Lagrange basis polynomials: L_i(x) = product((x - x_j) / (x_i - x_j)) for j != i.
The interpolating polynomial is given by: P(x) = sum(y_i * L_i(x)) for i from 0 to n-1.
Simple explanation: Imagine you have a few points on a graph, and you want to draw a smooth curve that passes through all of them. Lagrange Interpolation helps you find the exact curve that connects the dots perfectly.
def lagrange_interpolation(x_values, y_values, x):
    def basis_polynomial(i, x):
        result = 1.0
        for j in range(len(x_values)):
            if i != j:
                result *= (x - x_values[j]) / (x_values[i] - x_values[j])
        return result

    result = 0.0
    for i in range(len(x_values)):
        result += y_values[i] * basis_polynomial(i, x)
    return result
x_values = [0, 1, 2]
y_values = [1, 3, 2]
x = 1.5
y = lagrange_interpolation(x_values, y_values, x)
print("Interpolated value at x = 1.5:", y)
97. Bell Curve (Gaussian Distribution)
Used in statistics to represent the normal distribution, the Bell Curve is a symmetric curve centered around the mean of the data. The shape of the curve is determined by the mean and standard deviation of the data, with about 68% of the data points falling within one standard deviation of the mean. The Bell Curve is widely used in fields such as psychology, finance, and natural sciences to model the distribution of various phenomena.
f(x) = (1 / (sigma * sqrt(2 * pi))) * exp(-((x - mu)^2) / (2 * sigma^2))
Where:
mu is the mean,
sigma is the standard deviation,
pi is a constant approximately equal to 3.14159,
exp represents the exponential function.
Simple explanation: Imagine you’re measuring the heights of a large group of people. Most people will have heights close to the average, with fewer people being much shorter or taller. The Bell Curve helps you visualize this distribution, showing that the majority of data points cluster around the mean.
import numpy as np
import matplotlib.pyplot as plt
mu = 0 # mean
sigma = 1 # standard deviation
x = np.linspace(mu - 3*sigma, mu + 3*sigma, 100)
y = (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
plt.plot(x, y)
plt.title('Bell Curve (Gaussian Distribution)')
plt.xlabel('x')
plt.ylabel('Probability Density')
plt.show()
98. Markowitz Portfolio Optimization
Used in finance to allocate assets for maximum return for a given risk level, Markowitz Portfolio Optimization is based on Modern Portfolio Theory (MPT). The goal is to find the optimal portfolio that maximizes expected return for a given level of risk or minimizes risk for a given level of expected return by diversifying investments.
minimize: (1/2) * w^T * Cov * w
subject to: w^T * mu = R_target
sum(w) = 1
Where:
w is the vector of portfolio weights,
Cov is the covariance matrix of asset returns,
mu is the vector of expected returns,
R_target is the target return.
Simple explanation: Imagine you have money to invest in different stocks, and you want to make the most money with the least amount of risk. Markowitz Portfolio Optimization helps you figure out how much to invest in each stock to get the best balance between risk and reward.
import numpy as np
import scipy.optimize as sco
# Expected returns and covariance matrix
mu = np.array([0.12, 0.18, 0.15])
cov = np.array([[0.1, 0.02, 0.03],
[0.02, 0.08, 0.01],
[0.03, 0.01, 0.09]])
def portfolio_variance(weights, cov_matrix):
    return weights.T @ cov_matrix @ weights

def portfolio_return(weights, mu):
    return weights.T @ mu

def objective(weights, mu, cov_matrix):
    # Minimize portfolio variance for the given covariance matrix
    return portfolio_variance(weights, cov_matrix)

def constraint(weights, mu, R_target):
    # Equality constraint: the portfolio return must hit the target return
    return portfolio_return(weights, mu) - R_target
R_target = 0.15
n_assets = len(mu)
bounds = tuple((0, 1) for _ in range(n_assets))
constraints = ({'type': 'eq', 'fun': lambda weights: np.sum(weights) - 1},
{'type': 'eq', 'fun': constraint, 'args': (mu, R_target)})
result = sco.minimize(objective, n_assets * [1. / n_assets,], args=(mu, cov),
method='SLSQP', bounds=bounds, constraints=constraints)
print("Optimal portfolio weights:", result.x)