B-Tree Index Files
A B-Tree is a balanced tree data structure commonly used in database indexing to improve data access speeds. It serves as a mechanism to locate records efficiently within a file. Like other index files, it consists of index entries containing a search-key and a pointer.
Info
There are 3 types of B Tree i.e. Standard B Tree, B* Tree and B+ Tree. We will be focusing on Standard B Tree here.
Properties
- No node has more than children.
- Every node except for the root and the terminal node has at least .
- The root, unless the tree only has one node, has at least two children.
- All terminal nodes appear on the same level, that is, are the same distance from the root.
- A non-terminal node with k children contain record and a terminal node contains at least and at most records.
Operations
Standard operations performed on B-Trees include insertion and deletion, which must maintain the structural properties listed above.
Example
Perform the following operations in the following B Tree with Order,
- Insert
- Delete
graph TD L((•L•)) E((•E•)) RV((•R•V•)) AB((•A•B•)) GH((•G•H•)) N((•N•)) SU((•S•U•)) Y((•Y•)) L --> E L --> RV E --> AB E --> GH RV --> N RV --> SU RV --> Y
Solution
For, Order
- Root Child
- Internal node Child
Insert M
goes between and (Lexicographical order). Insert into leaf
graph TD L((•L•)) E((•E•)) RV((•R•V•)) AB((•A•B•)) GH((•G•H•)) MN((•M•N•)) SU((•S•U•)) Y((•Y•)) L --> E L --> RV E --> AB E --> GH RV --> MN RV --> SU RV --> Y style MN fill:#D1FFC2
Insert J
goes between and . Insert into leaf , making it . Since this node now has 3 keys (overflow for ), we split it.
graph TD L((•L•)) EH((•E•H•)) RV((•R•V•)) AB((•A•B•)) G((•G•)) J((•J•)) MN((•M•N•)) SU((•S•U•)) Y((•Y•)) L --> EH L --> RV EH --> AB EH --> G EH --> J RV --> MN RV --> SU RV --> Y style J fill:#D1FFC2
Insert P
goes between and . Insert into leaf , making it . This causes overflow, so we split and push up and push to the root.
graph TD LR((•L•R•)) EH((•E•H•)) N((•N•)) V((•V•)) AB((•A•B•)) G((•G•)) J((•J•)) M((•M•)) P((•P•)) SU((•S•U•)) Y((•Y•)) LR --> EH LR --> N LR --> V EH --> AB EH --> G EH --> J N --> M N --> P V --> SU V --> Y style P fill:#D1FFC2
Insert D
goes between and . Insert into leaf , making it . So we push to the leaf node and replace it with .
graph TD LR((•L•R•)) DH((•D•H•)) N((•N•)) V((•V•)) AB((•A•B•)) EG((•E•G•)) J((•J•)) M((•M•)) P((•P•)) SU((•S•U•)) Y((•Y•)) LR --> DH LR --> N LR --> V DH --> AB DH --> EG DH --> J N --> M N --> P V --> SU V --> Y style DH fill:#D1FFC2
Delete B
is in the leaf node which has 3 children. So we can simply delete without violating the rules.
graph TD LR((•L•R•)) DH((•D•H•)) N((•N•)) V((•V•)) A((•A•)) EG((•E•G•)) J((•J•)) M((•M•)) P((•P•)) SU((•S•U•)) Y((•Y•)) LR --> DH LR --> N LR --> V DH --> A DH --> EG DH --> J N --> M N --> P V --> SU V --> Y style A fill:#FFCFCF
Delete D
is in internal node. So we delete from that node, replacing the space with from the leaf node.
graph TD LR((•L•R•)) EH((•E•H•)) N((•N•)) V((•V•)) A((•A•)) G((•G•)) J((•J•)) M((•M•)) P((•P•)) SU((•S•U•)) Y((•Y•)) LR --> EH LR --> N LR --> V EH --> A EH --> G EH --> J N --> M N --> P V --> SU V --> Y style EH fill:#FFCFCF
Delete E
is an internal node which has 3 child. We delete and merge and to to decrease the number of child.
graph TD LR((•L•R•)) H((•H•)) N((•N•)) V((•V•)) AG((•A•G•)) J((•J•)) M((•M•)) P((•P•)) SU((•S•U•)) Y((•Y•)) LR --> H LR --> N LR --> V H --> AG H --> J N --> M N --> P V --> SU V --> Y style H fill:#FFCFCF
Delete L
is in the root node. So after deleting , we merge and to to reduce the number of root child by 1 and redistribute the leaf nodes.
graph TD R((•R•)) HN((•H•N•)) V((•V•)) AG((•A•G•)) JM((•J•M•)) P((•P•)) SU((•S•U•)) Y((•Y•)) R --> HN R --> V HN --> AG HN --> JM HN --> P V --> SU V --> Y style R fill:#FFCFCF
And this is the final B Tree after all the operation mentioned.