B Tree
Overview
With all the knowledge in Trees, we can very quickly grasp the basics of B-Tree and how to remember it - like a Binary Search Tree but with more than 1 key inside a node, and more than 2 children for a node (which makes it a non-binary tree). But lets get to the definition and theory on B-Tree now.
There are myths on what the B in B-Tree means but we can't tell for sure so we need to accept the name as it is.
For the curious - the B could potentially mean Balanced or it could be after the name of one of its authors Rudolf Bayer or... after the company that the author worked for - Boeing Research Labs. Still, these are only speculations.
Enough preludes, let's now learn what is a B-Tree exactly?
It's a self-balancing tree that is used to store huge sets of data and that lets you access, search, insert and delete data - all at logaritmic time. It's generally a preferred for file system and databases. The self-balancing property it has means that all leaf nodes should be at the same level. If a new node is inserted or an old one is deleted and it breaks this rule (leaf nodes are no longer on the same level), the tree needs to rebalance itself. This allows the operations on the tree, as already mentioned, to take logaritmic time.
Other important property (already mentioned in the beginning) is that each node can have multiple keys and multiple children.
How we denote their numbers? By 'Degree' - the maximum number of children a node can have.
If we have a B-Tree of Degree 3 (like in the animation), it means that we have a node with maximum of 3 children and maximum of Degree minus 1 keys (which means maximum 2 keys per node). If you take a closer look at the animation on top, you will see how it makes sense. If we have a node with 2 keys - say 20 and 30 at the end, this node has 3 children. The left child is less than the left key - less than 20. The mid child's value is in-between left and right keys - between 20 and 30. And the right child is more than the right key - more than 30.
This arrangement of data allows for sequential/predictable sorted data which makes it a breeze for binary search.
As for the main operations on B Tree:
Deletion in B-tree is done in the following manner.
If the key is in a leaf node, we can directly remove it. If the node from which we removed a key remains with enough keys "(children / 2) - 1" or in our case 1 key, we are done. If the node becomes empty, it gets deleted along with that key and the tree needs to be rebalanced.
If the key to be deleted is in an internal node and a child can spare the key, it's swapped with the next highest or lowest key from that child. Then, the deletion process continues in that child. This one is simple.
But if the key is in an internal node and its child holds the minimum keys, rebalancing occurs. This could involve:
Key redistribution - If a sibling has more keys than required, a key can be borrowed from it to fill the underfilled node. This action balances the key distribution between nodes.
Node Merge - When there are no suitable siblings for redistribution, merging might be necessary. The underfilled node and one of its siblings are combined into a single node. The parent node's key that previously separated these nodes might need to be adjusted.
Insertion - Deletion is generally difficult but thankfully, insertion is much simpler. In short- if the new key is moving towards a non-full node (less than Degree minus 1 keys), simply insert it.
If the target node is already full, it needs to be split in two nodes and the middle node moves up the tree. You can clearly see this process in the animation (the animation builds the B-Tree via the insertion process)
Code
class BTreeNode:
def __init__(self, leaf=True):
self.leaf = leaf
self.keys = []
self.children = []
class BTree:
def __init__(self, t):
self.root = BTreeNode(True)
self.t = t
def insert(self, key):
root = self.root
if len(root.keys) == (2 * self.t) - 1:
new_root = BTreeNode(False)
new_root.children.append(root)
self.split_child(new_root, 0)
self.root = new_root
self.insert_non_full(root, key)
def insert_non_full(self, node, key):
i = len(node.keys) - 1
if node.leaf:
node.keys.append(0)
while i >= 0 and key < node.keys[i]:
node.keys[i + 1] = node.keys[i]
i -= 1
node.keys[i + 1] = key
else:
while i >= 0 and key < node.keys[i]:
i -= 1
i += 1
if len(node.children[i].keys) == (2 * self.t) - 1:
self.split_child(node, i)
if key > node.keys[i]:
i += 1
self.insert_non_full(node.children[i], key)
def split_child(self, parent, index):
t = self.t
y = parent.children[index]
z = BTreeNode(y.leaf)
parent.children.insert(index + 1, z)
parent.keys.insert(index, y.keys[t - 1])
z.keys = y.keys[t:]
y.keys = y.keys[:t - 1]
if not y.leaf:
z.children = y.children[t:]
y.children = y.children[:t]
def delete(self, key):
self.delete_recursive(self.root, key)
def delete_recursive(self, node, key):
i = 0
while i < len(node.keys) and key > node.keys[i]:
i += 1
if i < len(node.keys) and key == node.keys[i]:
if node.leaf:
node.keys.pop(i)
else:
self.delete_internal_node(node, i)
elif not node.leaf:
self.delete_recursive(node.children[i], key)
def delete_internal_node(self, node, index):
key = node.keys[index]
pred = node.children[index]
succ = node.children[index + 1]
if len(pred.keys) >= self.t:
pred_max = self.get_max_key(pred)
node.keys[index] = pred_max
self.delete_recursive(pred, pred_max)
elif len(succ.keys) >= self.t:
succ_min = self.get_min_key(succ)
node.keys[index] = succ_min
self.delete_recursive(succ, succ_min)
else:
merged_node = self.merge_nodes(pred, succ)
node.keys.pop(index)
node.children[index] = merged_node
if node == self.root and len(node.keys) == 0:
self.root = merged_node
def merge_nodes(self, pred, succ):
merged_node = BTreeNode(pred.leaf)
merged_node.keys = pred.keys + succ.keys
merged_node.children = pred.children + succ.children
return merged_node
def get_max_key(self, node):
while not node.leaf:
node = node.children[-1]
return node.keys[-1]
def get_min_key(self, node):
while not node.leaf:
node = node.children[0]
return node.keys[0]
def search(self, key):
return self.search_recursive(self.root, key)
def search_recursive(self, node, key):
i = 0
while i < len(node.keys) and key > node.keys[i]:
i += 1
if i < len(node.keys) and key == node.keys[i]:
return True
elif node.leaf:
return False
else:
return self.search_recursive(node.children[i], key)
# Example usage:
btree = BTree(t=3)
numbers = [50, 40, 70, 30, 20, 90, 80, 10, 25, 60, 76]
for num in numbers:
btree.insert(num)
print(btree.search(25)) # Output: True
print(btree.search(100)) # Output: False
btree.delete(25)
print(btree.search(25)) # Output: False
class BTreeNode {
constructor(leaf = true) {
this.leaf = leaf;
this.keys = [];
this.children = [];
}
}
class BTree {
constructor(t) {
this.root = new BTreeNode(true);
this.t = t;
}
insert(key) {
let root = this.root;
if (root.keys.length === 2 * this.t - 1) {
const newRoot = new BTreeNode(false);
newRoot.children.push(root);
this.splitChild(newRoot, 0);
this.root = newRoot;
}
this.insertNonFull(root, key);
}
insertNonFull(node, key) {
let i = node.keys.length - 1;
if (node.leaf) {
node.keys.push(0);
while (i >= 0 && key < node.keys[i]) {
node.keys[i + 1] = node.keys[i];
i--;
}
node.keys[i + 1] = key;
} else {
while (i >= 0 && key < node.keys[i]) {
i--;
}
i++;
if (node.children[i].keys.length === 2 * this.t - 1) {
this.splitChild(node, i);
if (key > node.keys[i]) {
i++;
}
}
this.insertNonFull(node.children[i], key);
}
}
splitChild(parent, index) {
const t = this.t;
const y = parent.children[index];
const z = new BTreeNode(y.leaf);
parent.children.splice(index + 1, 0, z);
parent.keys.splice(index, 0, y.keys[t - 1]);
z.keys = y.keys.splice(t);
if (!y.leaf) {
z.children = y.children.splice(t);
}
}
delete(key) {
this.deleteRecursive(this.root, key);
}
deleteRecursive(node, key) {
let i = 0;
while (i < node.keys.length && key > node.keys[i]) {
i++;
}
if (i < node.keys.length && key === node.keys[i]) {
if (node.leaf) {
node.keys.splice(i, 1);
} else {
this.deleteInternalNode(node, i);
}
} else if (!node.leaf) {
this.deleteRecursive(node.children[i], key);
}
}
deleteInternalNode(node, index) {
const key = node.keys[index];
const pred = node.children[index];
const succ = node.children[index + 1];
if (pred.keys.length >= this.t) {
const predMax = this.getMaxKey(pred);
node.keys[index] = predMax;
this.deleteRecursive(pred, predMax);
} else if (succ.keys.length >= this.t) {
const succMin = this.getMinKey(succ);
node.keys[index] = succMin;
this.deleteRecursive(succ, succMin);
} else {
const mergedNode = this.mergeNodes(pred, succ);
node.keys.splice(index, 1);
node.children[index] = mergedNode;
if (node === this.root && node.keys.length === 0) {
this.root = mergedNode;
}
}
}
mergeNodes(pred, succ) {
const mergedNode = new BTreeNode(pred.leaf);
mergedNode.keys = pred.keys.concat(succ.keys);
mergedNode.children = pred.children.concat(succ.children);
return mergedNode;
}
getMaxKey(node) {
while (!node.leaf) {
node = node.children[node.children.length - 1];
}
return node.keys[node.keys.length - 1];
}
getMinKey(node) {
while (!node.leaf) {
node = node.children[0];
}
return node.keys[0];
}
search(key) {
return this.searchRecursive(this.root, key);
}
searchRecursive(node, key) {
let i = 0;
while (i < node.keys.length && key > node.keys[i]) {
i++;
}
if (i < node.keys.length && key === node.keys[i]) {
return true;
} else if (node.leaf) {
return false;
} else {
return this.searchRecursive(node.children[i], key);
}
}
}
// Example usage:
const btree = new BTree(3);
const numbers = [50, 40, 70, 30, 20, 90, 80, 10, 25, 60, 76];
numbers.forEach((num) => btree.insert(num));
console.log(btree.search(25)); // Output: true
console.log(btree.search(100)); // Output: false
btree.delete(25);
console.log(btree.search(25)); // Output: false
import java.util.ArrayList;
import java.util.List;
class BTreeNode {
boolean leaf;
List<Integer> keys;
List<BTreeNode> children;
BTreeNode(boolean leaf) {
this.leaf = leaf;
this.keys = new ArrayList<>();
this.children = new ArrayList<>();
}
}
public class BTree {
private BTreeNode root;
private int t;
public BTree(int t) {
this.root = new BTreeNode(true);
this.t = t;
}
public void insert(int key) {
BTreeNode rootNode = this.root;
if (rootNode.keys.size() == (2 * t) - 1) {
BTreeNode newRoot = new BTreeNode(false);
newRoot.children.add(rootNode);
splitChild(newRoot, 0);
root = newRoot;
}
insertNonFull(rootNode, key);
}
private void insertNonFull(BTreeNode node, int key) {
int i = node.keys.size() - 1;
if (node.leaf) {
node.keys.add(0);
while (i >= 0 && key < node.keys.get(i)) {
node.keys.set(i + 1, node.keys.get(i));
i--;
}
node.keys.set(i + 1, key);
} else {
while (i >= 0 && key < node.keys.get(i)) {
i--;
}
i++;
if (node.children.get(i).keys.size() == (2 * t) - 1) {
splitChild(node, i);
if (key > node.keys.get(i)) {
i++;
}
}
insertNonFull(node.children.get(i), key);
}
}
private void splitChild(BTreeNode parent, int index) {
int t = this.t;
BTreeNode y = parent.children.get(index);
BTreeNode z = new BTreeNode(y.leaf);
parent.children.add(index + 1, z);
parent.keys.add(index, y.keys.get(t - 1));
z.keys = new ArrayList<>(y.keys.subList(t, y.keys.size()));
y.keys.subList(t - 1, y.keys.size()).clear();
if (!y.leaf) {
z.children = new ArrayList<>(y.children.subList(t, y.children.size()));
y.children.subList(t, y.children.size()).clear();
}
}
public void delete(int key) {
deleteRecursive(root, key);
}
private void deleteRecursive(BTreeNode node, int key) {
int i = 0;
while (i < node.keys.size() && key > node.keys.get(i)) {
i++;
}
if (i < node.keys.size() && key == node.keys.get(i)) {
if (node.leaf) {
node.keys.remove(i);
} else {
deleteInternalNode(node, i);
}
} else if (!node.leaf) {
deleteRecursive(node.children.get(i), key);
}
}
private void deleteInternalNode(BTreeNode node, int index) {
int key = node.keys.get(index);
BTreeNode pred = node.children.get(index);
BTreeNode succ = node.children.get(index + 1);
if (pred.keys.size() >= t) {
int predMax = getMaxKey(pred);
node.keys.set(index, predMax);
deleteRecursive(pred, predMax);
} else if (succ.keys.size() >= t) {
int succMin = getMinKey(succ);
node.keys.set(index, succMin);
deleteRecursive(succ, succMin);
} else {
BTreeNode mergedNode = mergeNodes(pred, succ);
node.keys.remove(index);
node.children.set(index, mergedNode);
if (node == root && node.keys.size() == 0) {
root = mergedNode;
}
}
}
private BTreeNode mergeNodes(BTreeNode pred, BTreeNode succ) {
BTreeNode mergedNode = new BTreeNode(pred.leaf);
mergedNode.keys.addAll(pred.keys);
mergedNode.keys.addAll(succ.keys);
mergedNode.children.addAll(pred.children);
mergedNode.children.addAll(succ.children);
return mergedNode;
}
private int getMaxKey(BTreeNode node) {
while (!node.leaf) {
node = node.children.get(node.children.size() - 1);
}
return node.keys.get(node.keys.size() - 1);
}
private int getMinKey(BTreeNode node) {
while (!node.leaf) {
node = node.children.get(0);
}
return node.keys.get(0);
}
public boolean search(int key) {
return searchRecursive(root, key);
}
private boolean searchRecursive(BTreeNode node, int key) {
int i = 0;
while (i < node.keys.size() && key > node.keys.get(i)) {
i++;
}
if (i < node.keys.size() && key == node.keys.get(i)) {
return true;
} else if (node.leaf) {
return false;
} else {
return searchRecursive(node.children.get(i), key);
}
}
public static void main(String[] args) {
BTree btree = new BTree(3);
int[] numbers = {50, 40, 70, 30, 20, 90, 80, 10, 25, 60, 76};
for (int num : numbers) {
btree.insert(num);
}
System.out.println(btree.search(25)); // Output: true
System.out.println(btree.search(100)); // Output: false
btree.delete(25);
System.out.println(btree.search(25)); // Output: false
}
}
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#define MAX_KEYS (2 * t - 1)
#define MAX_CHILDREN (2 * t)
typedef struct BTreeNode {
bool leaf;
int num_keys;
int keys[MAX_KEYS];
struct BTreeNode* children[MAX_CHILDREN];
} BTreeNode;
BTreeNode* create_node(bool leaf) {
BTreeNode* new_node = (BTreeNode*)malloc(sizeof(BTreeNode));
new_node->leaf = leaf;
new_node->num_keys = 0;
return new_node;
}
BTreeNode* create_btree(int t) {
BTreeNode* root = create_node(true);
return root;
}
void insert(BTreeNode* root, int key, int t);
void split_child(BTreeNode* parent, int index, int t) {
BTreeNode* y = parent->children[index];
BTreeNode* z = create_node(y->leaf);
z->num_keys = t - 1;
for (int j = 0; j < t - 1; j++) {
z->keys[j] = y->keys[j + t];
}
if (!y->leaf) {
for (int j = 0; j < t; j++) {
z->children[j] = y->children[j + t];
}
}
y->num_keys = t - 1;
for (int j = parent->num_keys; j >= index + 1; j--) {
parent->children[j + 1] = parent->children[j];
}
parent->children[index + 1] = z;
for (int j = parent->num_keys - 1; j >= index; j--) {
parent->keys[j + 1] = parent->keys[j];
}
parent->keys[index] = y->keys[t - 1];
parent->num_keys++;
}
void insert_non_full(BTreeNode* node, int key, int t) {
int i = node->num_keys - 1;
if (node->leaf) {
while (i >= 0 && key < node->keys[i]) {
node->keys[i + 1] = node->keys[i];
i--;
}
node->keys[i + 1] = key;
node->num_keys++;
} else {
while (i >= 0 && key < node->keys[i]) {
i--;
}
i++;
if (node->children[i]->num_keys == MAX_KEYS) {
split_child(node, i, t);
if (key > node->keys[i]) {
i++;
}
}
insert_non_full(node->children[i], key, t);
}
}
void insert(BTreeNode* root, int key, int t) {
if (root->num_keys == MAX_KEYS) {
BTreeNode* new_root = create_node(false);
new_root->children[0] = root;
split_child(new_root, 0, t);
root = new_root;
}
insert_non_full(root, key, t);
}
void delete(BTreeNode* root, int key, int t);
void delete_recursive(BTreeNode* node, int key, int t) {
int i = 0;
while (i < node->num_keys && key > node->keys[i]) {
i++;
}
if (i < node->num_keys && key == node->keys[i]) {
if (node->leaf) {
for (int j = i; j < node->num_keys - 1; j++) {
node->keys[j] = node->keys[j + 1];
}
node->num_keys--;
} else {
delete(node->children[i], key, t);
}
} else if (!node->leaf) {
delete_recursive(node->children[i], key, t);
}
}
void delete_internal_node(BTreeNode* node, int index, int t) {
int key = node->keys[index];
BTreeNode* pred = node->children[index];
BTreeNode* succ = node->children[index + 1];
if (pred->num_keys >= t) {
int pred_max = get_max_key(pred);
node->keys[index] = pred_max;
delete_recursive(pred, pred_max, t);
} else if (succ->num_keys >= t) {
int succ_min = get_min_key(succ);
node->keys[index] = succ_min;
delete_recursive(succ, succ_min, t);
} else {
BTreeNode* merged_node = merge_nodes(pred, succ, t);
for (int j = index; j < node->num_keys - 1; j++) {
node->keys[j] = node->keys[j + 1];
node->children[j + 1] = node->children[j + 2];
}
node->num_keys--;
node->children[index] = merged_node;
free(pred);
free(succ);
}
}
BTreeNode* merge_nodes(BTreeNode* pred, BTreeNode* succ, int t) {
BTreeNode* merged_node = create_node(pred->leaf);
for (int i = 0; i < t - 1; i++) {
merged_node->keys[i] = pred->keys[i];
}
merged_node->keys[t - 1] = pred->keys[t - 1];
for (int i = 0; i < t; i++) {
merged_node->keys[t + i] = succ->keys[i];
}
if (!pred->leaf) {
for (int i = 0; i < t; i++) {
merged_node->children[i] = pred->children[i];
}
for (int i = 0; i < t; i++) {
merged_node->children[t + i] = succ->children[i];
}
}
merged_node->num_keys = 2 * t - 1;
return merged_node;
}
int get_max_key(BTreeNode* node) {
while (!node->leaf) {
node = node->children[node->num_keys];
}
return node->keys[node->num_keys - 1];
}
int get_min_key(BTreeNode* node) {
while (!node->leaf) {
node = node->children[0];
}
return node->keys[0];
}
void delete(BTreeNode* root, int key, int t) {
delete_recursive(root, key, t);
if (root->num_keys == 0) {
BTreeNode* new_root = root->children[0];
free(root);
root = new_root;
}
}
int search_recursive(BTreeNode* node, int key) {
int i = 0;
while (i < node->num_keys && key > node->keys[i]) {
i++;
}
if (i < node->num_keys && key == node->keys[i]) {
return 1;
} else if (node->leaf) {
return 0;
} else {
return search_recursive(node->children[i], key);
}
}
int search(BTreeNode* root, int key) {
return search_recursive(root, key);
}
int main() {
int t = 3;
BTreeNode* root = create_btree(t);
int numbers[] = {50, 40, 70, 30, 20, 90, 80, 10, 25, 60, 76};
int num_count = sizeof(numbers) / sizeof(numbers[0]);
for (int i = 0; i < num_count; i++) {
insert(root, numbers[i], t);
}
printf("%d\n", search(root, 25)); // Output: 1 (true)
printf("%d\n", search(root, 100)); // Output: 0 (false)
delete(root, 25, t);
printf("%d\n", search(root, 25)); // Output: 0 (false)
return 0;
}
class BTreeNode
{
public $leaf;
public $keys;
public $children;
public function __construct($leaf = true)
{
$this->leaf = $leaf;
$this->keys = array();
$this->children = array();
}
}
class BTree
{
private $root;
private $t;
public function __construct($t)
{
$this->root = new BTreeNode(true);
$this->t = $t;
}
public function insert($key)
{
$root = $this->root;
if (count($root->keys) == (2 * $this->t) - 1) {
$newRoot = new BTreeNode(false);
array_push($newRoot->children, $root);
$this->splitChild($newRoot, 0);
$this->root = $newRoot;
}
$this->insertNonFull($root, $key);
}
private function insertNonFull($node, $key)
{
$i = count($node->keys) - 1;
if ($node->leaf) {
array_push($node->keys, 0);
while ($i >= 0 && $key < $node->keys[$i]) {
$node->keys[$i + 1] = $node->keys[$i];
$i--;
}
$node->keys[$i + 1] = $key;
} else {
while ($i >= 0 && $key < $node->keys[$i]) {
$i--;
}
$i++;
if (count($node->children[$i]->keys) == (2 * $this->t) - 1) {
$this->splitChild($node, $i);
if ($key > $node->keys[$i]) {
$i++;
}
}
$this->insertNonFull($node->children[$i], $key);
}
}
private function splitChild(&$parent, $index)
{
$t = $this->t;
$y = $parent->children[$index];
$z = new BTreeNode($y->leaf);
array_splice($parent->children, $index + 1, 0, array($z));
array_splice($parent->keys, $index, 0, array($y->keys[$t - 1]));
$z->keys = array_splice($y->keys, $t, $t - 1);
if (!$y->leaf) {
$z->children = array_splice($y->children, $t, $t);
}
}
public function delete($key)
{
$this->deleteRecursive($this->root, $key);
}
private function deleteRecursive($node, $key)
{
$i = 0;
while ($i < count($node->keys) && $key > $node->keys[$i]) {
$i++;
}
if ($i < count($node->keys) && $key == $node->keys[$i]) {
if ($node->leaf) {
array_splice($node->keys, $i, 1);
} else {
$this->deleteInternalNode($node, $i);
}
} elseif (!$node->leaf) {
$this->deleteRecursive($node->children[$i], $key);
}
}
private function deleteInternalNode($node, $index)
{
$key = $node->keys[$index];
$pred = $node->children[$index];
$succ = $node->children[$index + 1];
if (count($pred->keys) >= $this->t) {
$predMax = $this->getMaxKey($pred);
$node->keys[$index] = $predMax;
$this->deleteRecursive($pred, $predMax);
} elseif (count($succ->keys) >= $this->t) {
$succMin = $this->getMinKey($succ);
$node->keys[$index] = $succMin;
$this->deleteRecursive($succ, $succMin);
} else {
$mergedNode = $this->mergeNodes($pred, $succ);
array_splice($node->keys, $index, 1);
$node->children[$index] = $mergedNode;
if ($node === $this->root && count($node->keys) == 0) {
$this->root = $mergedNode;
}
}
}
private function mergeNodes($pred, $succ)
{
$mergedNode = new BTreeNode($pred->leaf);
$mergedNode->keys = array_merge($pred->keys, $succ->keys);
$mergedNode->children = array_merge($pred->children, $succ->children);
return $mergedNode;
}
private function getMaxKey($node)
{
while (!$node->leaf) {
$node = $node->children[count($node->children) - 1];
}
return $node->keys[count($node->keys) - 1];
}
private function getMinKey($node)
{
while (!$node->leaf) {
$node = $node->children[0];
}
return $node->keys[0];
}
public function search($key)
{
return $this->searchRecursive($this->root, $key);
}
private function searchRecursive($node, $key)
{
$i = 0;
while ($i < count($node->keys) && $key > $node->keys[$i]) {
$i++;
}
if ($i < count($node->keys) && $key === $node->keys[$i]) {
return true;
} elseif ($node->leaf) {
return false;
} else {
return $this->searchRecursive($node->children[$i], $key);
}
}
}
// Example usage:
$btree = new BTree(3);
$numbers = [50, 40, 70, 30, 20, 90, 80, 10, 25, 60, 76];
foreach ($numbers as $num) {
$btree->insert($num);
}
echo $btree->search(25) ? "true\n" : "false\n"; // Output: true
echo $btree->search(100) ? "true\n" : "false\n"; // Output: false
$btree->delete(25);
echo $btree->search(25) ? "true\n" : "false\n"; // Output: false