class Kalibera::Data

Public Class Methods

new(data, reps)

Instances of this class store measurements (corresponding to the Y_… in the papers).

Arguments: data – Hash mapping arrays of all but the last index to arrays of values. reps – Array of repetition counts for each level, high to low.

# File lib/kalibera/data.rb, line 131
def initialize(data, reps)
  @data = data
  @reps = reps

  # check that all data is there

  array = reps.map { |i| (0...i).to_a }
  array[0].product(*array.drop(1)).each do |index|
    self[*index] # does not crash
  end
end
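
As a rough illustration of the expected layout, the standalone sketch below describes a two-level experiment with 2 executions of 3 iterations each. It does not use the Kalibera classes: `EXAMPLE_DATA`, `EXAMPLE_REPS`, and `lookup_measurement` are hypothetical names, and `lookup_measurement` only imitates what `[]` appears to do with a full index.

```ruby
# Hypothetical two-level experiment: 2 executions (high level),
# each with 3 iterations (low level).
EXAMPLE_REPS = [2, 3].freeze

# Keys are arrays holding all but the last index; values are the
# measurements addressed by the last index.
EXAMPLE_DATA = {
  [0] => [1.0, 2.0, 3.0],
  [1] => [4.0, 5.0, 6.0]
}.freeze

# Stand-in for Data#[]: split the full index into the hash key
# (all but the last component) and the position within the value array.
def lookup_measurement(data, reps, *indices)
  raise ArgumentError, "expected #{reps.size} indices" unless indices.size == reps.size

  row = data.fetch(indices[0...-1]) # KeyError if the execution is missing
  row.fetch(indices[-1])            # IndexError if the iteration is missing
end
```

With this layout, `lookup_measurement(EXAMPLE_DATA, EXAMPLE_REPS, 1, 2)` addresses the third iteration of the second execution.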

Public Instance Methods

Si2(i)

Biased estimator S_i^2.

Arguments: i – the mathematical index of the level from which to compute S_i^2

# File lib/kalibera/data.rb, line 197
def Si2(i)
  raise unless 1 <= i
  raise unless i <= n
  # @reps is indexed from the left to right
  index = n - i
  factor = 1.0

  # We compute this iteratively leveraging the fact that
  # 1 / (a * b) = (1 / a) / b
  for rep in @reps[0, index]
    factor /= rep
  end
  # Then at this point we have:
  # factor * (1 / (r_i - 1)) = factor / (r_i - 1)
  factor /= @reps[index] - 1

  # Sum the squared differences between each mean at this level and the
  # mean one level up; the sum is then multiplied by the factor above.
  indicies = index_iterator(0, index+1)
  sum = 0.0
  for fixed in indicies
    a = mean(fixed)
    b = mean(fixed[0,fixed.size-1])
    sum += (a - b) ** 2
  end
  factor * sum
end
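
To make the computation concrete, here is a minimal standalone sketch of the same estimator for the two-level case only, assuming a balanced design (every execution has the same number of iterations). The helper names `mean_of`, `s1_squared`, and `s2_squared` are hypothetical and are not part of the class.

```ruby
# Rows are executions (level 2); entries are iterations (level 1).
def mean_of(values)
  values.sum / values.size.to_f
end

# S_1^2: squared deviations of each measurement from its execution mean,
# scaled by 1 / (r_2 * (r_1 - 1)).
def s1_squared(rows)
  r2 = rows.size
  r1 = rows.first.size
  sum = rows.sum do |row|
    m = mean_of(row)
    row.sum { |y| (y - m)**2 }
  end
  sum / (r2 * (r1 - 1))
end

# S_2^2: squared deviations of each execution mean from the grand mean,
# scaled by 1 / (r_2 - 1). With a balanced design the grand mean equals
# the mean of the execution means.
def s2_squared(rows)
  means = rows.map { |row| mean_of(row) }
  grand = mean_of(means)
  means.sum { |m| (m - grand)**2 } / (rows.size - 1)
end
```

For the rows `[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]`, working through the sums by hand gives S_1^2 = 4 / (2 * 2) = 1.0 and S_2^2 = 4.5 / 1 = 4.5.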

Ti2(i)

Compute the unbiased T_i^2 variance estimator.

Arguments: i – the mathematical index from which to compute T_i^2.

# File lib/kalibera/data.rb, line 230
def Ti2(i)
  # This is the broken implementation of T_i^2 shown in the published
  # version of "Rigorous benchmarking in reasonable time". Tomas has
  # since fixed this in local versions of the paper.
  #@memoize
  #def broken_Ti2(self, i)
  #  """ Compute the unbiased T_i^2 variance estimator.
  #
  #  Arguments:
  #  i -- the mathematical index from which to compute T_i^2.
  #  """
  #
  #  raise unless 1 <= i <= n
  #  if i == 1:
  #    return self.Si2(1)
  #  return self.Si2(i) - self.Ti2(i - 1) / self.r(i - 1)

  # This is the correct definition of T_i^2

  raise unless 1 <= i
  raise unless i <= n
  if i == 1
    return Si2(1)
  end
  Si2(i) - Si2(i - 1) / r(i - 1)
end
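
The corrected recurrence can be sketched on its own, given precomputed S_i^2 values. This is an illustration only: `t_squared` and its array-based arguments are hypothetical, with both arrays indexed by mathematical level (position 0 holds level 1, the lowest).

```ruby
# s2_by_level[i - 1] holds S_i^2 and reps_by_level[i - 1] holds r_i.
def t_squared(i, s2_by_level, reps_by_level)
  raise ArgumentError, "level out of range" unless (1..s2_by_level.size).cover?(i)

  # The lowest level has nothing beneath it to subtract.
  return s2_by_level[0] if i == 1

  # T_i^2 = S_i^2 - S_{i-1}^2 / r_{i-1}
  s2_by_level[i - 1] - s2_by_level[i - 2] / reps_by_level[i - 2]
end
```

For example, with assumed values S_1^2 = 1.0, S_2^2 = 4.5, and r_1 = 3, `t_squared(2, [1.0, 4.5], [3, 2])` evaluates 4.5 - 1.0 / 3.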

[](*indicies)
# File lib/kalibera/data.rb, line 143
def [](*indicies)
  raise unless indicies.size == @reps.size
  x = @data[indicies[0...indicies.size-1]]
  raise if x.nil?
  x[indicies[-1]]
end

bootstrap_confidence_interval(iterations=10000, confidence="0.95")

Compute a confidence interval via bootstrap method.

Keyword arguments: iterations – Number of resamplings to base result upon. Default is 10000. confidence – The required confidence. Default is “0.95” (95%).

# File lib/kalibera/data.rb, line 306
def bootstrap_confidence_interval(iterations=10000, confidence="0.95")
  means = bootstrap_means(iterations)
  Kalibera.confidence_slice(means, confidence)
end

bootstrap_means(iterations=1000)

Compute a list of simulated means from bootstrap resampling.

Note that resampling occurs with replacement.

Keyword arguments: iterations – Number of resamples (and thus means) generated.

# File lib/kalibera/data.rb, line 291
def bootstrap_means(iterations=1000)
  means = []
  for i in 0...iterations
    values = bootstrap_sample()
    means.push(Kalibera.mean(values))
  end
  means.sort!
  means
end
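
The general idea behind the bootstrap methods can be sketched without the class. The example below is a simplification under stated assumptions: it resamples a flat list of values rather than the multi-level structure, and `percentile_interval` is a plain percentile cut that may differ from what `Kalibera.confidence_slice` does. Both function names are hypothetical.

```ruby
# Flat (single-level) bootstrap of the mean. Each resample draws
# values.size items with replacement and records their mean.
def flat_bootstrap_means(values, iterations = 1000, rng = Random.new(42))
  means = Array.new(iterations) do
    resample = Array.new(values.size) { values[rng.rand(values.size)] }
    resample.sum / resample.size.to_f
  end
  means.sort
end

# Percentile interval: cut (1 - confidence) / 2 off each tail of the
# sorted bootstrap means.
def percentile_interval(sorted_means, confidence = 0.95)
  tail = (1.0 - confidence) / 2.0
  last = sorted_means.size - 1
  lower = sorted_means[(tail * last).floor]
  upper = sorted_means[((1.0 - tail) * last).ceil]
  [lower, upper]
end
```

A call such as `percentile_interval(flat_bootstrap_means([1.0, 2.0, 3.0, 4.0]))` returns a two-element array holding the lower and upper bounds.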

bootstrap_quotient(other, iterations=10000, confidence='0.95')
# File lib/kalibera/data.rb, line 331
def bootstrap_quotient(other, iterations=10000, confidence='0.95')
  ratios = []
  for _ in 0...iterations
    ra = bootstrap_sample()
    rb = other.bootstrap_sample()
    mean_ra = Kalibera.mean(ra)
    mean_rb = Kalibera.mean(rb)

    if mean_rb == 0 # protect against divide by zero
      ratios.push(Float::INFINITY)
    else
      ratios.push(mean_ra / mean_rb)
    end
  end
  ratios.sort!
  Kalibera.confidence_slice(ratios, confidence).values
end

bootstrap_sample()
# File lib/kalibera/data.rb, line 327
def bootstrap_sample
  random_measurement_sample
end

confidence95()

Compute the half-width of the 95% confidence interval around the mean.

# File lib/kalibera/data.rb, line 279
def confidence95
  degfreedom = @reps[0] - 1
  student_t_quantile95(degfreedom) *
    (Si2(n) / @reps[0]) ** 0.5
end
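
As a worked illustration of this formula, the sketch below computes the same quantity from a top-level S_n^2 and repetition count. It assumes that `student_t_quantile95` returns the two-sided 95% critical value of Student's t distribution; the small lookup table `T_CRITICAL_95` and the function `half_width_95` are hypothetical stand-ins, not part of the library.

```ruby
# Two-sided 95% Student's t critical values for 1 to 4 degrees of freedom
# (standard table values, rounded to three decimals).
T_CRITICAL_95 = { 1 => 12.706, 2 => 4.303, 3 => 3.182, 4 => 2.776 }.freeze

# Half-width = t(r_n - 1) * sqrt(S_n^2 / r_n), where r_n is the number of
# repetitions at the highest level.
def half_width_95(s2_top, reps_top)
  t = T_CRITICAL_95.fetch(reps_top - 1)
  t * Math.sqrt(s2_top / reps_top)
end
```

With assumed values S_n^2 = 4.5 and r_n = 2, this is 12.706 * sqrt(2.25), so the interval would be the overall mean plus or minus roughly 19.06.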

index_iterator(start=0, stop=nil)

Computes a list of all possible index combinations for the levels from start (inclusive) to stop (exclusive). stop defaults to n.

# File lib/kalibera/data.rb, line 152
def index_iterator(start=0, stop=nil)
  if stop.nil?
    stop = n
  end

  maximum_indicies = @reps[start...stop]
  remaining_indicies = maximum_indicies.map { |maximum| (0...maximum).to_a }
  return [[]] if remaining_indicies.empty?
  remaining_indicies[0].product(*remaining_indicies.drop(1))
end
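
The enumeration relies on `Array#product` to build a cross product of per-level index ranges. A minimal standalone sketch, taking the repetition counts as an explicit argument (the name `index_combinations` is hypothetical):

```ruby
# Build every index combination for the levels in reps[start...stop].
def index_combinations(reps, start = 0, stop = reps.size)
  ranges = reps[start...stop].map { |maximum| (0...maximum).to_a }
  # With no levels left to vary, the only combination is the empty index.
  return [[]] if ranges.empty?

  ranges[0].product(*ranges.drop(1))
end
```

For `reps = [2, 3]`, `index_combinations(reps)` yields six pairs from `[0, 0]` to `[1, 2]`, `index_combinations(reps, 1)` yields `[[0], [1], [2]]`, and `index_combinations(reps, 2)` yields `[[]]`.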

mean(indicies=[])

Compute the mean across a number of values.

Arguments: indicies – array of fixed indices over which to compute the mean, given from left to right. The remaining indices are variable.

# File lib/kalibera/data.rb, line 184
def mean(indicies=[])
  # Ruby has no keyword-style call here, so pass the start level positionally.
  remaining_indicies_cross_product =
      index_iterator(indicies.size)
  alldata = remaining_indicies_cross_product.map { |remaining| self[*(indicies + remaining)] }
  Kalibera.mean(alldata)
end

n()

The number of levels in the experiment.

# File lib/kalibera/data.rb, line 164
def n
  @reps.size
end

optimalreps(i, costs)

Computes the optimal number of repetitions for a given level.

Note that the resulting number of reps is not rounded.

Arguments: i – the mathematical level for which to compute optimal reps. costs – A list of costs for each level, high to low.

# File lib/kalibera/data.rb, line 266
def optimalreps(i, costs)
  # NOTE: Does not round
  costs = costs.map { |x| Float(x) }
  raise unless 1 <= i
  raise unless i < n
  index = n - i
  return (costs[index - 1] / costs[index] *
      Ti2(i) / Ti2(i + 1)) ** 0.5
end
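
The expression above is the square root of (cost of level i + 1 / cost of level i) * (T_i^2 / T_{i+1}^2). A small standalone sketch with the four inputs passed explicitly may make this easier to follow; `optimal_reps_sketch` and its parameter names are hypothetical.

```ruby
# cost_above: cost of one repetition at level i + 1
# cost_here:  cost of one repetition at level i
# t2_here:    T_i^2
# t2_above:   T_{i+1}^2
def optimal_reps_sketch(cost_above, cost_here, t2_here, t2_above)
  Math.sqrt(cost_above.to_f / cost_here * t2_here / t2_above)
end
```

For instance, if starting an execution costs 100 units, one iteration costs 1 unit, T_1^2 = 1.0, and T_2^2 = 4.0, the result is sqrt(100 * 0.25) = 5.0, which the caller would then round as needed.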

r(i)

The number of repetitions for level i.

Arguments: i – mathematical index.

# File lib/kalibera/data.rb, line 172
def r(i)
  raise unless 1 <= i
  raise unless i <= n
  index = n - i
  @reps[index]
end

random_measurement_sample(index=[])
# File lib/kalibera/data.rb, line 311
def random_measurement_sample(index=[])
  results = []
  if index.size == n
    results.push self[*index]
  else
    indicies = (0...@reps[index.size]).map { |i| rand(@reps[index.size]) }
    for single_index in indicies
      newindex = index + [single_index]
      for value in random_measurement_sample(newindex)
        results.push value
      end
    end
  end
  results
end
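
The recursion resamples at every level: it draws as many indices as there are repetitions at the current level, with replacement, and descends into each one. The same idea can be sketched on plain nested arrays. This is an illustration under that assumption, and `hierarchical_resample` is a hypothetical name.

```ruby
# Nested arrays stand in for the Data object: each level is an array of
# sub-levels, and the innermost arrays hold measurements.
def hierarchical_resample(node, rng = Random.new)
  # A bare measurement is returned as a one-element sample.
  return [node] unless node.is_a?(Array)

  # Draw node.size children with replacement, then recurse into each.
  drawn = Array.new(node.size) { node[rng.rand(node.size)] }
  drawn.flat_map { |child| hierarchical_resample(child, rng) }
end
```

Resampling `[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]` this way always returns six values, although the same execution or iteration may appear more than once.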