module EnumerableWeightedSample

Constants

VERSION

Public Instance Methods

inverse_weighted_sample(count=nil) click to toggle source

Inverse weighted sample sugar for objects that respond to weight

# File lib/enumerable_weighted_sample.rb, line 46
def inverse_weighted_sample(count=nil)
  inverse_weighted_sample_by(count) { |obj| obj.weight }
end
inverse_weighted_sample_by(count=nil) { |obj| ... } click to toggle source

Variation to invert the weights. Inverting is a bit subjective. I've gone with a formula that calculates the difference from the original maximum. In addition, I'm also adding a constant so that the max retains a nonzero adjusted weight, and items with perfectly equal input weights converge on equal final probability.

# File lib/enumerable_weighted_sample.rb, line 32
def inverse_weighted_sample_by(count=nil)
  wraw = each_with_object({}) { |obj, h| h[obj] = yield(obj) }
  wmax = wraw.values.max
  weights = {}
  wraw.keys.each { |obj| weights[obj] = wmax - wraw[obj] + 1 }
  weighted_sample_by(count) { |obj| weights[obj] }
end
weighted_sample(count=nil) click to toggle source

Weighted sample sugar for objects that respond to weight

# File lib/enumerable_weighted_sample.rb, line 41
def weighted_sample(count=nil)
  weighted_sample_by(count) { |obj| obj.weight }
end
weighted_sample_by(count=nil) { |obj| ... } click to toggle source

Produce a weighted random sampling based on the weights calculated from a given block. The weight function must produce positive real numbers. For more, see Ruby Enumerable#max_by and Efraimidis & Spirakis (2005) ruby-doc.org/core-2.2.1/Enumerable.html#method-i-max_by utopia.duth.gr/~pefraimi/research/data/2007EncOfAlg.pdf

Parameters

- count: The number of samples to return. If not provided, will return a
         single sample. Else if specified returns an array of that length.
- block: Function for retrieving the absolute weight of each object.

Returns a single item if count is not provided, else an array of samples.

# File lib/enumerable_weighted_sample.rb, line 17
def weighted_sample_by(count=nil)
  wraw = each_with_object({}) { |obj, h| h[obj] = yield(obj) }
  wsum = wraw.values.inject(&:+)
  weights = wraw.keys.each_with_object({}) { |obj, h| h[obj] = Float(wraw[obj]) / wsum }
  results = (count || 1).times.map do
    weights.max_by { |_obj, weight| rand ** (1.0 / weight) }.first
  end
  count.nil? ? results.first : results
end