module Kalibera

Constants

CONSTANTS
ConfRange

Public Class Methods

bootstrap_geomean(l_data_a, l_data_b, iterations=10000, confidence='0.95')
# File lib/kalibera/data.rb, line 351
def self.bootstrap_geomean(l_data_a, l_data_b, iterations=10000, confidence='0.95')
  raise "lists need to match" unless l_data_a.size == l_data_b.size
  geomeans = []
  iterations.times do
    ratios = []
    l_data_a.zip(l_data_b).each do |a, b|
      ra = a.bootstrap_sample
      rb = b.bootstrap_sample
      mean_ra = mean(ra)
      mean_rb = mean(rb)
      ratios << mean_ra / mean_rb
    end
    geomeans << geomean(ratios)
  end
  geomeans.sort!
  confidence_slice(geomeans, confidence)
end
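The resampling loop above can be sketched in a standalone form. This is a minimal illustration, not Kalibera's implementation: `resample` is a hypothetical flat resampling with replacement that stands in for `bootstrap_sample` (whose definition is not shown here), the data are plain nested arrays, and the names `bootstrap_geomean_sketch`, `iterations:` and `seed:` are assumptions made for the example.

```ruby
# Hypothetical stand-in for bootstrap_sample: draw xs.size values
# from xs with replacement (an assumption; the real method may differ).
def resample(xs, rng)
  Array.new(xs.size) { xs[rng.rand(xs.size)] }
end

def mean(xs)
  xs.sum / Float(xs.size)
end

def geomean(xs)
  xs.inject(1, :*)**(1.0 / xs.size)
end

# Returns the sorted bootstrap distribution of the geometric mean of
# the per-benchmark ratios mean(a) / mean(b).
def bootstrap_geomean_sketch(runs_a, runs_b, iterations: 1000, seed: 42)
  raise ArgumentError, 'lists need to match' unless runs_a.size == runs_b.size

  rng = Random.new(seed) # fixed seed so the sketch is repeatable
  geomeans = Array.new(iterations) do
    ratios = runs_a.zip(runs_b).map do |a, b|
      mean(resample(a, rng)) / mean(resample(b, rng))
    end
    geomean(ratios)
  end
  geomeans.sort
end

a = [[10.0, 11.0, 9.5], [20.0, 21.0, 19.0]]
b = [[20.0, 22.0, 19.0], [40.0, 42.0, 38.0]]
dist = bootstrap_geomean_sketch(a, b)
puts "smallest geomean: #{dist.first}, largest: #{dist.last}"
```

In this made-up data set every value in `b` is roughly twice its counterpart in `a`, so the distribution clusters around 0.5. Slicing the sorted distribution at the desired confidence level then gives the interval.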
confidence_slice(means, confidence="0.95")

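Returns a triple (lower, median, upper) as a ConfRange, where:

lower: lower bound of the confidence interval
median: the median value of the data
upper: upper bound of the confidence interval

The interval covers the given confidence level (95% by default).

Arguments:

means – the list of means (need not be sorted).
confidence – the desired confidence level; it is passed on to confidence_slice_indicies, which rejects Float values.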

# File lib/kalibera/data.rb, line 76
def self.confidence_slice(means, confidence="0.95")
  means = means.sort
  # There may be two median indices, i.e. when the data is even-sized.
  lower, middle_indicies, upper = confidence_slice_indicies(means.size, confidence)
  median = mean(middle_indicies.map { |i| means[i] })
  ConfRange.new(means[lower], median, means[upper - 1]) # upper is *exclusive*
end
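A compact, standalone sketch of the same slicing idea may help. It is not the library code: `SketchRange` is a hypothetical stand-in for `ConfRange` (whose definition is not shown here), `slice_sketch` is an illustrative name, and `floor`/`ceil` replace the explicit rounding modes, which is equivalent only because the values involved are non-negative.

```ruby
require 'bigdecimal'

# Hypothetical stand-in for Kalibera::ConfRange.
SketchRange = Struct.new(:lower, :median, :upper)

def slice_sketch(values, confidence = '0.95')
  sorted  = values.sort
  n       = sorted.size
  exclude = (1 - BigDecimal(confidence)) / 2 # share cut from each tail
  lower   = (exclude * n).floor              # first index kept
  upper   = ((1 - exclude) * n).ceil         # one past the last index kept
  middle  = n.even? ? [n / 2 - 1, n / 2] : [n / 2]
  median  = middle.sum { |i| sorted[i] } / Float(middle.size)
  SketchRange.new(sorted[lower], median, sorted[upper - 1])
end

r = slice_sketch((1..100).to_a.shuffle, '0.90')
puts r.lower  # 6
puts r.median # 50.5
puts r.upper  # 95
```

With 100 values and a 90% level, five values are dropped from each tail, so the interval runs from the 6th to the 95th smallest value.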
confidence_slice_indicies(length, confidence_level=BigDecimal('0.95'))

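Returns a triple (lower, mean_indicies, upper) such that the slice of a sorted list from lower (inclusive) to upper (exclusive) covers confidence_level of all samples. mean_indicies is an array of one or two indices that correspond to the middle (median) position: one index for an odd length, two for an even length.

Keyword arguments: confidence_level – desired level of confidence as a BigDecimal instance, or a value that BigDecimal() can convert, such as a String. Float values are rejected.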

# File lib/kalibera/data.rb, line 90
def self.confidence_slice_indicies(length, confidence_level=BigDecimal('0.95'))
  raise unless !confidence_level.instance_of?(Float)
  confidence_level = BigDecimal(confidence_level)
  raise unless confidence_level.instance_of?(BigDecimal)
  exclude = (1 - confidence_level) / 2

  if length % 2 == 0
    mean_indicies = [length / 2 - 1, length / 2]  # TRANSLITERATION: was //
  else
    mean_indicies = [length / 2]  # TRANSLITERATION: was //
  end

  lower_index = Integer(
      (exclude * length).round(0, BigDecimal::ROUND_DOWN) # TRANSLITERATION: was quantize 1.
  )

  upper_index = Integer(
      ((1 - exclude) * length).round(0, BigDecimal::ROUND_UP) # TRANSLITERATION: was quantize 1.
  )

  [lower_index, mean_indicies, upper_index]
end
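The index arithmetic can be checked by hand with a standalone restatement. The sketch below uses the same rounding calls as the listing above; the name `slice_indices` is an assumption for illustration, and the type checks are left out.

```ruby
require 'bigdecimal'

# Illustrative restatement of the index arithmetic (no type checks).
def slice_indices(length, confidence_level = BigDecimal('0.95'))
  exclude = (1 - confidence_level) / 2 # share cut from each tail
  middle  = length.even? ? [length / 2 - 1, length / 2] : [length / 2]
  lower   = Integer((exclude * length).round(0, BigDecimal::ROUND_DOWN))
  upper   = Integer(((1 - exclude) * length).round(0, BigDecimal::ROUND_UP))
  [lower, middle, upper]
end

p slice_indices(10_000) # => [250, [4999, 5000], 9750]
p slice_indices(5)      # => [0, [2], 5]
```

For 10,000 samples at 95%, 2.5% (250 values) is excluded from each tail, leaving the slice 250...9750. For very short lists the rounding is conservative: with five samples nothing is excluded, because 0.125 rounds down to 0 and 4.875 rounds up to 5.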
geomean(l)
# File lib/kalibera/data.rb, line 117
def self.geomean(l)
  l.inject(1, :*) ** (1.0 / Float(l.size))
end
mean(l)
# File lib/kalibera/data.rb, line 113
def self.mean(l)
  l.inject(0, :+) / Float(l.size)
end
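As a small worked example, the two helpers can be restated and exercised on their own. The `log_geomean` variant is not part of Kalibera; it is a commonly used alternative, shown here as an assumption, that averages logarithms instead of multiplying all values first, so the intermediate product cannot overflow for long lists of large positive values.

```ruby
def mean(l)
  l.inject(0, :+) / Float(l.size)
end

def geomean(l)
  l.inject(1, :*)**(1.0 / Float(l.size))
end

# Alternative (not in the listing above): geometric mean via logarithms.
# Only valid for strictly positive values.
def log_geomean(l)
  Math.exp(l.sum { |x| Math.log(x) } / l.size)
end

puts mean([1, 2, 3, 4])  # 2.5
puts geomean([2, 8])     # 4.0
puts log_geomean([2, 8]) # approximately 4.0
```

The geometric mean of 2 and 8 is the square root of their product, 16, which is 4.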
student_t_quantile95(ndeg)

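Looks up the 95% quantile of Student's t-distribution for ndeg degrees of freedom in the CONSTANTS table. If ndeg lies beyond the end of the table, the last entry is returned, because the quantile converges as the degrees of freedom grow.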

# File lib/kalibera/data.rb, line 55
def self.student_t_quantile95(ndeg)
  index = ndeg - 1
  if index >= CONSTANTS.size
    index = -1 # the quantile converges, we just take the last value
  end
  CONSTANTS[index]
end
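The clamping behaviour of the lookup can be illustrated with a tiny table. `SKETCH_TABLE` and `lookup` are hypothetical names; the three values are the familiar two-sided 95% critical values of Student's t for 1 to 3 degrees of freedom and serve only as placeholders, since the contents of Kalibera's `CONSTANTS` are not shown here.

```ruby
# Placeholder table (not Kalibera's CONSTANTS): entry i holds the
# value for i + 1 degrees of freedom.
SKETCH_TABLE = [12.706, 4.303, 3.182].freeze

def lookup(ndeg, table = SKETCH_TABLE)
  index = ndeg - 1
  index = -1 if index >= table.size # past the end: reuse the last entry
  table[index]
end

puts lookup(1)   # 12.706
puts lookup(3)   # 3.182
puts lookup(100) # 3.182 (clamped to the last entry)
```

Because Ruby treats a negative index as counting from the end of the array, an ndeg of 0 would also yield index -1 and silently return the last entry, so callers are presumably expected to pass ndeg of at least 1.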