class IndexBuilder

Class containing methods for building a search index.

Public Class Methods

build(samples, mismatches_max)

Internal: Class method that builds a search index from a given Array of samples. The index is a Google Hash, which is not subject to Ruby’s garbage collection and is therefore much more efficient. Each Hash key is derived from index1 and index2 concatenated and, if mismatches_max is greater than zero, index1 and index2 are permutated accordingly. The Hash values are the sample numbers.

samples        - Array of samples (Sample objects with id, index1 and index2).
mismatches_max - Integer denoting the maximum number of mismatches allowed in an index sequence.

Examples

IndexBuilder.build(samples, 1)
  # => <Google Hash>

Returns a Google Hash where each key is derived from an index pair and each value is the sample number.

# File lib/index_builder.rb, line 45
def self.build(samples, mismatches_max)
  index_builder = new(samples, mismatches_max)
  index_hash    = index_builder.index_init
  index_builder.index_populate(index_hash)
end

new(samples, mismatches_max)

Internal: Constructor method for IndexBuilder object. The given Array of samples and mismatches_max are saved as instance variables.

samples        - Array of Sample objects.
mismatches_max - Integer denoting the maximum number of mismatches allowed in an index sequence.

Examples

IndexBuilder.new(samples, 2)
  # => <IndexBuilder>

Returns an IndexBuilder object.

# File lib/index_builder.rb, line 64
def initialize(samples, mismatches_max)
  @samples        = samples
  @mismatches_max = mismatches_max
end

Public Instance Methods

index_init()

Internal: Method to initialize the index. If @mismatches_max is <= 1, GoogleHashSparseLongToInt is used; otherwise GoogleHashDenseLongToInt is used, for memory and performance reasons.

Returns a Google Hash.

# File lib/index_builder.rb, line 74
def index_init
  if @mismatches_max <= 1
    index_hash = GoogleHashSparseLongToInt.new
  else
    index_hash = GoogleHashDenseLongToInt.new
  end

  index_hash
end
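The source attributes the choice of hash type only to memory and performance. As background, the number of keys generated per sample grows quickly with mismatches_max, which may be part of that trade-off (this link is an assumption, not something the source states). The sketch below counts the distinct words within a given number of substitutions; the helper names and the index length of 8 are hypothetical and chosen only for illustration.

```ruby
# Binomial coefficient C(n, k) using exact integer arithmetic.
def binomial(n, k)
  (1..k).reduce(1) { |acc, i| acc * (n - i + 1) / i }
end

# Number of distinct words within `mismatches_max` substitutions of a word
# of the given length, for an alphabet of `alphabet_size` characters.
def variant_count(length, mismatches_max, alphabet_size = 4)
  (0..mismatches_max).sum do |i|
    binomial(length, i) * (alphabet_size - 1)**i
  end
end

INDEX_LENGTH = 8 # hypothetical index length, not taken from the source

(0..2).each do |mismatches_max|
  per_index  = variant_count(INDEX_LENGTH, mismatches_max)
  per_sample = per_index * per_index # index1 variants x index2 variants
  puts "mismatches_max=#{mismatches_max}: " \
       "#{per_index} per index, #{per_sample} keys per sample"
end
```

Under these assumptions a sample contributes 1, 625 and 76729 keys for mismatches_max of 0, 1 and 2 respectively.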

index_populate(index_hash)

Internal: Method to populate the index.

index_hash - Google Hash with initialized index.

Returns a Google Hash.

# File lib/index_builder.rb, line 89
def index_populate(index_hash)
  @samples.each_with_index do |sample, i|
    index_list1 = permutate([sample.index1], @mismatches_max)
    index_list2 = permutate([sample.index2], @mismatches_max)

    index_list1.product(index_list2).each do |index1, index2|
      key = "#{index1}#{index2}".hash

      index_check_existing(index_hash, key, sample, index1, index2)

      index_hash[key] = i
    end
  end

  index_hash
end
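The population step can be sketched without the google_hash gem. This is a minimal illustration, assuming a plain Ruby Hash as a stand-in for the Google Hash and a hypothetical Sample Struct. It uses mismatches_max = 0, so each variant list holds only the original index. The lookup at the end is an assumption about how the index is meant to be queried.

```ruby
# Hypothetical stand-in for the Sample objects described above.
Sample = Struct.new(:id, :index1, :index2)

samples = [
  Sample.new('s1', 'AA', 'CC'),
  Sample.new('s2', 'GG', 'TT')
]

# Plain Hash used instead of a Google Hash (assumption for illustration).
index_hash = {}

samples.each_with_index do |sample, i|
  # With mismatches_max = 0 each list holds only the original index.
  index_list1 = [sample.index1]
  index_list2 = [sample.index2]

  # Every index1 variant is paired with every index2 variant.
  index_list1.product(index_list2).each do |index1, index2|
    key = "#{index1}#{index2}".hash
    index_hash[key] = i
  end
end

# Assumed lookup: hash the concatenated pair and fetch the sample number.
sample_number = index_hash['GGTT'.hash]
puts samples[sample_number].id
```

Because the key is an Integer produced by String#hash, a lookup has to hash the concatenated index pair in the same way and within the same Ruby process.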

Private Instance Methods

index_check_existing(index_hash, key, sample, index1, index2)

Internal: Method to check if an index key already exists in the index; if so, an exception is raised.

index_hash - Google Hash with index.
key        - Integer key computed from the concatenated index sequences.
sample     - Sample object whose index to check.
index1     - String with index1 sequence.
index2     - String with index2 sequence.

Returns nothing. Raises IndexBuilderError if the key already exists.

# File lib/index_builder.rb, line 118
def index_check_existing(index_hash, key, sample, index1, index2)
  return unless index_hash[key]

  fail IndexBuilderError, "Index combo of #{index1} and #{index2} already " \
       "exists for sample id: #{@samples[index_hash[key]].id} and #{sample.id}"
end
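The duplicate check can be illustrated in isolation. The sketch below assumes a plain Hash in place of the Google Hash, a hypothetical Sample Struct, and an IndexBuilderError declared as a StandardError subclass (the source does not show its definition). The check_existing helper is a standalone variant of the method above that takes the samples as an argument.

```ruby
# Assumed definition; the source does not show how this error is declared.
class IndexBuilderError < StandardError; end

# Hypothetical stand-in for the Sample objects described above.
Sample = Struct.new(:id, :index1, :index2)

# Standalone variant of the check: raises if the key is already present.
def check_existing(index_hash, samples, key, sample, index1, index2)
  return unless index_hash[key]

  existing = samples[index_hash[key]]
  raise IndexBuilderError,
        "Index combo of #{index1} and #{index2} already " \
        "exists for sample id: #{existing.id} and #{sample.id}"
end

# Two samples sharing the same index pair trigger the error.
samples    = [Sample.new('s1', 'AA', 'CC'), Sample.new('s2', 'AA', 'CC')]
index_hash = {}
message    = nil

begin
  samples.each_with_index do |sample, i|
    key = "#{sample.index1}#{sample.index2}".hash
    check_existing(index_hash, samples, key, sample,
                   sample.index1, sample.index2)
    index_hash[key] = i
  end
rescue IndexBuilderError => e
  message = e.message
end

puts message
# => Index combo of AA and CC already exists for sample id: s1 and s2
```

The check has to run before the assignment, since afterwards the key would always be present.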

permutate(list, permutations = 2, alphabet = 'ATCG')

Internal: Method that, for each word in a given Array of words, permutates the word a given number of times (permutations) using a given alphabet, such that an Array of all resulting unique words is returned.

list         - Array of words (Strings) to permutate.
permutations - Number of permutations (Integer).
alphabet     - String with alphabet used for permutation.

Examples

permutate(["AA"], 1, "ATCG")
# => ["AA", "TA", "CA", "GA", "AT", "AC", "AG"]

Returns an Array with permutated words (Strings).

# File lib/index_builder.rb, line 139
def permutate(list, permutations = 2, alphabet = 'ATCG')
  permutations.times do
    set = list.each_with_object(Set.new) { |e, a| a.add(e.to_sym) }

    list.each do |word|
      new_words = permutate_word(word, alphabet)
      new_words.map { |new_word| set.add(new_word.to_sym) }
    end

    list = set.map(&:to_s)
  end

  list
end
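To see how repeated rounds and the Set interact, the two methods can be run as top-level functions. The sketch below copies them from the source with only one cosmetic change: `each` replaces `map` where the return value is discarded.

```ruby
require 'set'

# Copied from the source so the example is self-contained.
def permutate_word(word, alphabet)
  new_words = []

  (0...word.size).each do |pos|
    alphabet.each_char do |char|
      new_words << "#{word[0...pos]}#{char}#{word[pos + 1..-1]}"
    end
  end

  new_words
end

def permutate(list, permutations = 2, alphabet = 'ATCG')
  permutations.times do
    # Seed the Set with the current words so originals are retained.
    set = list.each_with_object(Set.new) { |e, a| a.add(e.to_sym) }

    list.each do |word|
      permutate_word(word, alphabet).each { |w| set.add(w.to_sym) }
    end

    list = set.map(&:to_s)
  end

  list
end

one_round  = permutate(['AA'], 1)
two_rounds = permutate(['AA'], 2)

p one_round          # => ["AA", "TA", "CA", "GA", "AT", "AC", "AG"]
puts two_rounds.size # 16, every two-letter word over ATCG
```

Each round substitutes at most one more position, so after n rounds the list holds every word within n substitutions of the originals, with duplicates removed by the Set.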

permutate_word(word, alphabet)

Internal: Method that permutates a given word using a given alphabet, such that an Array with every single-position substitution of the word is returned. The original word appears once per position, so the result can contain duplicates.

word     - String with word to permutate.
alphabet - String with alphabet used for permutation.

Examples

permutate_word("AA", "ATCG")
# => ["AA", "TA", "CA", "GA", "AA", "AT", "AC", "AG"]

Returns an Array with permutated words (Strings).

# File lib/index_builder.rb, line 166
def permutate_word(word, alphabet)
  new_words = []

  (0...word.size).each do |pos|
    alphabet.each_char do |char|
      new_words << "#{word[0...pos]}#{char}#{word[pos + 1..-1]}"
    end
  end

  new_words
end
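The substitution expression is easier to follow when taken apart. The sketch below uses a hypothetical three-letter word. It first shows one substitution and then counts the results of applying it at every position.

```ruby
word = 'ACG'
pos  = 1
char = 'T'

prefix = word[0...pos]     # characters before the position: "A"
suffix = word[pos + 1..-1] # characters after the position: "G"
puts "#{prefix}#{char}#{suffix}" # => ATG

# Apply the same substitution at every position with every character.
alphabet = 'ATCG'
words = (0...word.size).flat_map do |i|
  alphabet.each_char.map { |c| "#{word[0...i]}#{c}#{word[i + 1..-1]}" }
end

puts words.size        # 12 (word length x alphabet size)
puts words.count(word) # 3  (the original word, once per position)
puts words.uniq.size   # 10 (distinct words)
```

The repeated original word is harmless in this class because permutate collects the results in a Set.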