class BBLib::FuzzyMatcher

Applies several string comparison algorithms to a pair of strings, after normalizing them, and combines the results into a weighted similarity percentage for words or phrases.

Public Instance Methods

best_match(string_a, *string_b)

Returns the string from b that has the highest match percentage against string a.

# File lib/bblib/classes/fuzzy_matcher.rb, line 30
def best_match(string_a, *string_b)
  similarities(string_a, *string_b).max_by { |_k, v| v }[0]
end
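The selection step can be sketched without BBLib. The helper name pick_best and the scores below are hypothetical; the pairs have the same [candidate, score] shape that similarities produces. Note that max_by returns nil for an empty list, so the method as listed would raise NoMethodError when no candidates are passed; the sketch uses safe navigation to return nil instead.

```ruby
# Hypothetical helper mirroring the selection step in best_match.
# `pairs` is an array of [candidate, score] pairs; the scores are made up.
def pick_best(pairs)
  # max_by returns nil for an empty array, so &. avoids a NoMethodError.
  pairs.max_by { |_candidate, score| score }&.first
end

pairs = [['apple', 80.0], ['ample', 62.5], ['maple', 71.0]]
puts pick_best(pairs)        # prints "apple"
puts pick_best([]).inspect   # prints "nil"
```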
match?(string_a, string_b)

Checks whether the match percentage between strings a and b is equal to or greater than the threshold.

# File lib/bblib/classes/fuzzy_matcher.rb, line 25
def match?(string_a, string_b)
  similarity(string_a, string_b) >= threshold.to_f
end
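A minimal sketch of the comparison, assuming the score and threshold are supplied directly rather than computed by similarity or read from the matcher. The name meets_threshold? is hypothetical. It shows that the check is inclusive and that to_f lets the threshold be an Integer or a numeric String.

```ruby
# Hypothetical stand-in for match?: the score and threshold are passed in
# directly instead of being computed by similarity or read from the matcher.
def meets_threshold?(score, threshold)
  score >= threshold.to_f
end

puts meets_threshold?(75.0, 75)    # prints "true" (the comparison is inclusive)
puts meets_threshold?(74.9, 75)    # prints "false"
puts meets_threshold?(80.0, '75')  # prints "true" (to_f converts numeric strings)
```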
set_weight(algorithm, weight)
# File lib/bblib/classes/fuzzy_matcher.rb, line 40
def set_weight(algorithm, weight)
  return nil unless algorithms.include? algorithm
  algorithms[algorithm] = BBLib.keep_between(weight, 0, nil)
end
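Read from the source above, set_weight ignores unknown algorithm names (returning nil) and otherwise stores the weight with a lower bound of 0. The sketch below reproduces that behavior on a plain hash. It assumes BBLib.keep_between(weight, 0, nil) clamps to a minimum of 0 with no maximum, and set_weight_sketch is a hypothetical name.

```ruby
# Hypothetical re-implementation of set_weight on a plain hash.
# [weight, 0].max is an assumption about what
# BBLib.keep_between(weight, 0, nil) does (minimum 0, no maximum).
def set_weight_sketch(algorithms, algorithm, weight)
  return nil unless algorithms.include?(algorithm)
  algorithms[algorithm] = [weight, 0].max
end

# Weights copied from simple_setup.
algorithms = { levenshtein: 10, composition: 5, numeric: 0, phrase: 0 }
p set_weight_sketch(algorithms, :phrase, 3)    # prints 3
p set_weight_sketch(algorithms, :numeric, -4)  # prints 0 (clamped)
p set_weight_sketch(algorithms, :soundex, 2)   # prints nil (unknown key)
```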
similarities(string_a, *string_b)

Returns an array of [string, percentage] pairs, one for each string in b, giving its percentage match to a. Pairs are returned in the order the strings were passed, and each score is also stored in matches.

# File lib/bblib/classes/fuzzy_matcher.rb, line 36
def similarities(string_a, *string_b)
  [*string_b].map { |word| [word, matches[word] = similarity(string_a, word)] }
end
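The mapping step can be sketched as follows. Everything here is illustrative: toy_similarity is a made-up scorer, not one of BBLib's algorithms, and the cache keyword stands in for the matcher's matches hash.

```ruby
# Toy scorer (an assumption, not a BBLib algorithm): percentage of positions
# at which both strings hold the same character, relative to the longer string.
def toy_similarity(a, b)
  longest = [a.length, b.length].max
  return 100.0 if longest.zero?
  same = (0...longest).count { |i| a[i] && a[i] == b[i] }
  (same * 100.0 / longest).round(1)
end

# Hypothetical counterpart of similarities: returns [candidate, score] pairs
# in input order and records each score in the `cache` hash.
def similarities_sketch(string_a, *candidates, cache: {})
  candidates.map { |word| [word, cache[word] = toy_similarity(string_a, word)] }
end

cache = {}
p similarities_sketch('cat', 'car', 'dog', 'cat', cache: cache)
# prints [["car", 66.7], ["dog", 0.0], ["cat", 100.0]]
p cache['car']  # prints 66.7
```

Because the candidates are collected with a splat, an existing array should be passed as similarities_sketch('cat', *words) so that each element is scored separately.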
similarity(string_a, string_b)

Calculates a percentage match between string a and string b.

# File lib/bblib/classes/fuzzy_matcher.rb, line 12
def similarity(string_a, string_b)
  string_a, string_b = prep_strings(string_a, string_b)
  return 100.0 if string_a == string_b
  score = 0
  total_weight = algorithms.values.inject { |sum, weight| sum + weight }
  algorithms.each do |algorithm, weight|
    next unless weight.positive?
    score += string_a.send("#{algorithm}_similarity", string_b) * weight
  end
  score / total_weight
end
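The result is a weighted average: each enabled algorithm's percentage is multiplied by its weight, and the sum is divided by the total weight. A self-contained sketch follows, with lambdas standing in for the String#<algorithm>_similarity methods that BBLib is assumed to provide. The scorer names and formulas are hypothetical, and the zero-total guard is an addition of this sketch; the method as listed would divide by zero if every weight were 0.

```ruby
# Hypothetical weighted scorer. `scorers` maps a name to a lambda returning
# a percentage; in BBLib the scores come from String#<name>_similarity methods.
def weighted_similarity(a, b, weights, scorers)
  return 100.0 if a == b
  total_weight = weights.values.sum
  return 0.0 if total_weight.zero? # guard added in this sketch only
  score = weights.sum do |name, weight|
    # Zero-weight algorithms are skipped and never called.
    weight.positive? ? scorers.fetch(name).call(a, b) * weight : 0
  end
  score / total_weight.to_f
end

weights = { length: 10, first_char: 5, unused: 0 }
scorers = {
  length:     ->(a, b) { [a.length, b.length].min * 100.0 / [a.length, b.length].max },
  first_char: ->(a, b) { a[0] == b[0] ? 100.0 : 0.0 }
}
# length scores 75.0 and first_char scores 0.0: (75.0 * 10 + 0.0 * 5) / 15
puts weighted_similarity('cat', 'bath', weights, scorers)  # prints "50.0"
```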

Private Instance Methods

prep_strings(string_a, string_b)
# File lib/bblib/classes/fuzzy_matcher.rb, line 56
def prep_strings(string_a, string_b)
  string_a = string_a.to_s.dup
  string_b = string_b.to_s.dup
  [
    case_sensitive? ? nil : :downcase,
    remove_symbols? ? :drop_symbols : nil,
    convert_roman? ? :from_roman : nil,
    move_articles? ? :move_articles : nil
  ].compact.each do |method|
    string_a = string_a.send(method)
    string_b = string_b.send(method)
  end
  [string_a, string_b]
end
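The pattern in prep_strings is to build a list of optional steps, drop the disabled ones with compact, and apply the rest to both strings in order. Below is a rough sketch of that pattern with only two options. The gsub step is an approximation standing in for BBLib's drop_symbols, from_roman and move_articles are left out, and prep_sketch and its keyword arguments are hypothetical.

```ruby
# Hypothetical normalizer. The gsub step is a rough stand-in for BBLib's
# String#drop_symbols; from_roman and move_articles are omitted.
def prep_sketch(a, b, case_sensitive: false, remove_symbols: true)
  steps = [
    case_sensitive ? nil : ->(s) { s.downcase },
    remove_symbols ? ->(s) { s.gsub(/[^\p{Alnum}\s]/, '') } : nil
  ].compact # disabled options leave nil entries, which compact removes

  # Apply every remaining step, in order, to each string.
  [a, b].map { |s| steps.reduce(s.to_s) { |str, step| step.call(str) } }
end

p prep_sketch('Hello, World!', 'hello world')
# prints ["hello world", "hello world"]
p prep_sketch('Hello, World!', 'hello world', case_sensitive: true)
# prints ["Hello World", "hello world"]
```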
simple_setup()
# File lib/bblib/classes/fuzzy_matcher.rb, line 47
def simple_setup
  self.algorithms = {
    levenshtein: 10,
    composition: 5,
    numeric:     0,
    phrase:      0
  }
end
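With these defaults only levenshtein and composition contribute to the score, in a 2:1 ratio. The short sketch below turns a weights hash into each algorithm's share of the total; weight_shares is a hypothetical helper, not part of BBLib.

```ruby
# Hypothetical helper: converts raw weights into percentage shares of the total.
def weight_shares(weights)
  total = weights.values.sum
  weights.transform_values { |w| (w * 100.0 / total).round(1) }
end

defaults = { levenshtein: 10, composition: 5, numeric: 0, phrase: 0 }
weight_shares(defaults).each { |name, pct| puts "#{name}: #{pct}%" }
# prints:
# levenshtein: 66.7%
# composition: 33.3%
# numeric: 0.0%
# phrase: 0.0%
```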