Chapter 11
Tree Project


11.2 B-Trees

<1> Some properties

  1. B-tree may have more than two children

  2. each node of B-tree may contains more than a single entry

  3. any two entries in the B-tree follows a
    total order semantics

  4. the entries of each node are stores as a set of
    ( or bag ) entries by using set or Bag class.

<2> The B-tree Rule

  1. The root have at least one entry
    every other node has at least MINIMUM entries.

  2. the max number of entries in a node is
    twice the value of MINIMUM.

  3. The entries in each node is stored orderly
    from smallest to largest in a partially filled array.

  4. The number of subtrees for a non-leaf node (p.502)
    is th entry number of node + 1.

  5. for any non-leaf node
    1. an index : entry is > all entries in subtree index of i
    2. an index : entry is < all entries in subtree index of i+1

  6. every leaf has the same depth

eq.

  1. MINIMUM = 1

  2. MINIMUM = 1

<3> The set ADT using B-tres structure

  1. For each root node of B-tree

    size_t data_count;
    Item data[MAXIMUM +1];
    size_t child_count
    Set *subset[MAXIMUM +2];
    
    //here we use the property
    
    Note :

    MINIMUM, MAXIMUM
       are class constants
    eq.
       static const size_t MINIMUM = 20;
       static const size_t MAXIMUM = 2 * MAXIMUM  ;
    
    every child of the root is also
    the root of a smaller B-Tree

    --->recursion

  2. Invariant for the Set ADT implemented with B-tree

    1. items of the set are stored in B-tree

    2. data_cout number of entries in the root
      child_count number of subtree

    3. data[] stores the entries in the root
      from 0 to data_count - 1
    4. pointer array subset[] store the pointers
      pointing to subtree from 0 to child_count -1

  3. Interface member functions

    1. search for an item in a B-tree

      find 10 in the B-tree ?

      ^^^^target

      Approach : function bool contain(Item target)

      1. find min. index i such that data[i] >= target

      2. if ( target == data[i] )
                return true
        else if ( the root has no children )
                return false
        else
                return subset[i] -> contains(target); 
      3. Insert an item into a B-tree

      ex1. insert 18

      ex2.

      Note :

      1. at the first step of insertion, the item is always
        inserted into the end node (loose insertion function)

      2. if the total number of items at the inserted node
        is greater than MAXIMUM then
      • find the middle item of nodes,
      • separate the node into two nodes,
        one contains items > the middle item
        one contains items < the middle item
      • raise the middle item of the inserted node into its parent node.

      eq.

    2. the b. step is implemented reccusively such that
      no node in the B-tree with item number > MAXIMUM.

    Algorithm :

    since the above proceure is implemented recursively
    one can have the following algorithm.

    loose_insent (entry) ;    (*)
    if (data_count > MAXIMUM)
        fix_excess (data_count);    (**)
    (*) loose insection function (private)

    template void Set :: loose_insert(coust Item& entry){ find the min index i such that data[i] >= entry if (no such i ) { i = data_corst; } if (data[i] == entry) return; // nothing need to be done else if (root has no children) insert the entry as i-th position of data else{ subset[i] -> loose_insert (entry); // go to thesubset fix_excess (i); } (**) fix excess function (fix the excess data for the subset node) template void Set :: fix_excess (size_t i ) ;

    where i is the index of subset[i] for the corrent node.

    HW.how to implement fix_excess?

  4. remove an item from a B-tree too complicate to explain.

    Note :

    When B-tree become large, one has to store
    some of nodes of B-tree in the secondary memory.
    This situation is called external B-tree.