class PEROBS::BTree

This BTree class is very similar to a classic B+Tree implementation. It manages a tree that is always balanced. The BTree is stored in the specified directory and partially kept in memory to speed up operations. The order of the tree specifies how many keys each node will be able to hold. Leaf nodes will have a value associated with each key. Branch nodes have N + 1 references to child nodes instead.

Attributes

first_leaf[R]
last_leaf[R]
node_cache[R]
nodes[R]
order[R]
size[R]

Public Class Methods

new(dir, name, order, progressmeter) click to toggle source

Create a new BTree object. @param dir [String] Directory to store the tree file @param name [String] Base name of the BTree related files in ‘dir’ @param order [Integer] The maximum number of keys per node. This number

must be odd and larger than 2 and smaller than 2**16 - 1

@param progressmeter [ProgressMeter] reference to a ProgressMeter object

# File lib/perobs/BTree.rb, line 51
def initialize(dir, name, order, progressmeter)
  @dir = dir
  @name = name
  @progressmeter = progressmeter

  unless order > 4
    PEROBS.log.fatal "BTree order must be larger than 4, not #{order}"
  end
  unless order % 2 == 1
    PEROBS.log.fatal "BTree order must be an uneven number, not #{order}"
  end
  unless order < 2 ** 16 - 1
    PEROBS.log.fatal "BTree order must be smaller than #{2**16 - 1}"
  end
  @order = order

  # This EquiBlobsFile contains the nodes of the BTree.
  @nodes = EquiBlobsFile.new(@dir, @name, @progressmeter,
                             BTreeNode::node_bytes(@order))
  @nodes.register_custom_data('first_leaf')
  @nodes.register_custom_data('last_leaf')
  @nodes.register_custom_data('btree_size')
  @node_cache = PersistentObjectCache.new(2**13, 2**13, BTreeNode, self)
  @root = @first_leaf = @last_leaf = nil
  @size = 0

  # This BTree implementation uses a write cache to improve write
  # performance of multiple successive read/write operations. This also
  # means that data may not be written on the backing store until the
  # sync() or close() methods have been called. A bug in the program or a
  # premature program termination can lead to data loss. To detect such
  # situations, we use a lock file whenever there are pending writes.
  @is_dirty = false
  @dirty_flag = LockFile.new(File.join(@dir, name + '.dirty'),
                             { :timeout_secs => 0 })
end

Public Instance Methods

check() { |k, v| ... } click to toggle source

Check if the tree file contains any errors. @return [Boolean] true if no erros were found, false otherwise

# File lib/perobs/BTree.rb, line 161
def check(&block)
  sync
  return false unless @nodes.check

  entries = 0
  stats = nil
  @progressmeter.start('Checking index structure', @size) do |pm|
    stats = @root.check do |k, v|
      pm.update(entries += 1)
      block_given? ? yield(k, v) : true
    end
  end

  return false unless stats

  unless entries == @size
    PEROBS.log.error "The BTree size (#{@size}) and the number of " +
      "found entries (#{entries}) don't match"
    return false
  end
  unless stats.nodes_count == @nodes.total_entries
    PEROBS.log.error "The BTree nodes count (#{stats.nodes_count}) and " +
      "the number of entries in the nodes file (#{@nodes.total_entries}) " +
      "don't match"
      return false
  end
  PEROBS.log.info "Statistics for the BTree #{@name}: " +
    "Number of nodes: #{stats.nodes_count}; " +
    "Branch depth: #{stats.branch_depth}; " +
    "Number of leave nodes: #{stats.leave_nodes}; " +
    "Number of leaves: #{stats.leaves}"

  true
end
clear() click to toggle source

Clear all pools and forget any registered spaces.

# File lib/perobs/BTree.rb, line 134
def clear
  @node_cache.clear
  @nodes.clear
  @size = 0
  set_root(BTreeNode::create(self))
end
close() click to toggle source

Close the tree file.

# File lib/perobs/BTree.rb, line 119
def close
  sync
  PEROBS.log.info "BTree file #{@name} has currently " +
    "#{@nodes.total_entries} used entries and #{@nodes.total_spaces} " +
    "unused entries"
  @nodes.close
  @root = nil
end
delete_node(address) click to toggle source

Delete the node at the given address in the BTree file. @param address [Integer] address in file

# File lib/perobs/BTree.rb, line 271
def delete_node(address)
  @node_cache.delete(address)
  @nodes.delete_blob(address)
end
each(&block) click to toggle source

Iterate over all key/value pairs that are stored in the tree. @yield [key, value]

# File lib/perobs/BTree.rb, line 265
def each(&block)
  @root.each(&block)
end
entries_count() click to toggle source

@return [Integer] The number of entries stored in the tree.

# File lib/perobs/BTree.rb, line 277
def entries_count
  @size
end
erase() click to toggle source

Erase the backing store of the BTree. This method should only be called when not having the BTree open. And it obviously and permanently erases all stored data from the BTree.

# File lib/perobs/BTree.rb, line 144
def erase
  @nodes.erase
  @size = 0
  @root = nil
  @dirty_flag.forced_unlock
end
get(key) click to toggle source

Retrieve the value associated with the given key. If no entry was found, return nil. @param key [Integer] Unique key @return [Integer or nil] found value or nil

# File lib/perobs/BTree.rb, line 232
def get(key)
  @root.get(key)
end
get_best_match(key, min_miss_increment) click to toggle source

Either return the key/value pair that exactly matches the key or a key/value pair that has a key that is at least min_miss_increment larger than the key.

# File lib/perobs/BTree.rb, line 239
def get_best_match(key, min_miss_increment)
  @root.get_best_match(key, min_miss_increment)
end
insert(key, value) click to toggle source

Insert a new value into the tree using the key as a unique index. If the key already exists the old value will be overwritten. @param key [Integer] Unique key @param value [Integer] value

# File lib/perobs/BTree.rb, line 221
def insert(key, value)
  if @root.insert(key, value)
    @size += 1
    @node_cache.flush
  end
end
is_open?() click to toggle source

@return true if file is currently open

# File lib/perobs/BTree.rb, line 129
def is_open?
  !@root.nil?
end
open(file_must_exist = false) click to toggle source

Open the tree file.

# File lib/perobs/BTree.rb, line 89
def open(file_must_exist = false)
  if @dirty_flag.is_locked?
    PEROBS.log.fatal "Index file #{@nodes.file_name} is already " +
      "locked"
  end
  if file_must_exist && !@nodes.file_exist?
    PEROBS.log.fatal "Index file #{@nodes.file_name} does not exist"
  end

  @node_cache.clear
  @nodes.open

  if @nodes.total_entries == 0
    # We've created a new nodes file
    node = BTreeNode::create(self)
  else
    # We are loading an existing tree.
    node = BTreeNode::load_and_link(self, @nodes.first_entry)
    @first_leaf = BTreeNode::load_and_link(
      self, @nodes.get_custom_data('first_leaf'))
    @last_leaf = BTreeNode::load_and_link(
      self, @nodes.get_custom_data('last_leaf'))
  end
  set_root(node)

  # Get the total number of entries that are stored in the tree.
  @size = @nodes.get_custom_data('btree_size')
end
remove(key) click to toggle source

Find and remove the value associated with the given key. If no entry was found, return nil, otherwise the found value.

# File lib/perobs/BTree.rb, line 245
def remove(key)
  @size -= 1 unless (removed_value = @root.remove(key)).nil?

  # Check if the root node only contains one child link after the delete
  # operation. Then we can delete that node and pull the tree one level
  # up. This could happen for a sequence of nodes that all got merged to
  # single child nodes.
  while !@root.is_leaf && @root.children.size == 1
    old_root = @root
    set_root(@root.children.first)
    @root.parent = nil
    delete_node(old_root.node_address)
  end

  @node_cache.flush
  removed_value
end
set_first_leaf(node) click to toggle source

Set the address of the first leaf node. @param node [BTreeNode]

# File lib/perobs/BTree.rb, line 205
def set_first_leaf(node)
  @first_leaf = node
  @nodes.set_custom_data('first_leaf', node.node_address)
end
set_last_leaf(node) click to toggle source

Set the address of the last leaf node. @param node [BTreeNode]

# File lib/perobs/BTree.rb, line 212
def set_last_leaf(node)
  @last_leaf = node
  @nodes.set_custom_data('last_leaf', node.node_address)
end
set_root(node) click to toggle source

Register a new node as root node of the tree. @param node [BTreeNode]

# File lib/perobs/BTree.rb, line 198
def set_root(node)
  @root = node
  @nodes.first_entry = node.node_address
end
sync() click to toggle source

Flush all pending modifications into the tree file.

# File lib/perobs/BTree.rb, line 152
def sync
  @node_cache.flush(true)
  @nodes.set_custom_data('btree_size', @size)
  @nodes.sync
  @dirty_flag.unlock if @dirty_flag.is_locked?
end
to_s() click to toggle source

@return [String] Human reable form of the tree.

# File lib/perobs/BTree.rb, line 282
def to_s
  @root.to_s
end