class Kalibera::Data
Public Class Methods
Instances of this class store measurements (corresponding to the Y_… in the papers).
Arguments: data – Hash mapping arrays of all but the last index to arrays of values. reps – Array of repetition counts for each level, high to low.
# File lib/kalibera/data.rb, line 131
def initialize(data, reps)
  @data = data
  @reps = reps

  # check that all data is there
  array = reps.map { |i| (0...i).to_a }
  array[0].product(*array.drop(1)).each do |index|
    self[*index] # does not crash
  end
end
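To make the expected layout of data and reps concrete, here is a minimal standalone sketch that does not use the gem. The names EXAMPLE_REPS, EXAMPLE_DATA and missing_indices are hypothetical, and the check is slightly stricter than the constructor's: it also reports a measurement that is nil inside an existing row.

```ruby
# Hypothetical two-level experiment: 2 executions, 3 measurements each.
# Keys hold every index except the last; values are the innermost measurements.
EXAMPLE_REPS = [2, 3]
EXAMPLE_DATA = {
  [0] => [1.0, 2.0, 3.0],
  [1] => [2.0, 3.0, 4.0],
}

# Enumerate every full index implied by reps and collect the ones
# for which no measurement is stored.
def missing_indices(data, reps)
  ranges = reps.map { |r| (0...r).to_a }
  ranges[0].product(*ranges.drop(1)).reject do |index|
    row = data[index[0...-1]]
    !row.nil? && !row[index[-1]].nil?
  end
end

p missing_indices(EXAMPLE_DATA, EXAMPLE_REPS)          # complete data set
p missing_indices({ [0] => [1.0, 2.0, 3.0] }, [2, 3])  # execution 1 is absent
```

The first call prints an empty array; the second lists the three indices belonging to the missing execution.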
Public Instance Methods
Biased estimator S_i^2.
Arguments: i – the mathematical index of the level from which to compute S_i^2
# File lib/kalibera/data.rb, line 197
def Si2(i)
  raise unless 1 <= i
  raise unless i <= n

  # @reps is indexed from the left to right
  index = n - i
  factor = 1.0

  # We compute this iteratively leveraging the fact that
  # 1 / (a * b) = (1 / a) / b
  for rep in @reps[0, index]
    factor /= rep
  end

  # Then at this point we have:
  # factor * (1 / (r_i - 1)) = factor / (r_i - 1)
  factor /= @reps[index] - 1

  # Second line of the above definition, the lines are multiplied.
  indicies = index_iterator(0, index+1)
  sum = 0.0
  for index in indicies
    a = mean(index)
    b = mean(index[0,index.size-1])
    sum += (a - b) ** 2
  end
  factor * sum
end
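As a worked illustration of what the method above computes, the following sketch hard-codes the two-level case. It is not the gem's implementation; SI2_DATA, avg, s1_squared and s2_squared are hypothetical names, and the data is made up.

```ruby
# Hypothetical data: 2 executions (level 2) of 3 measurements each (level 1).
SI2_DATA = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]

def avg(values)
  values.sum / values.size
end

# S_1^2: squared deviations of each measurement from its execution mean,
# scaled by 1 / (r_2 * (r_1 - 1)).
def s1_squared(data)
  r2 = data.size
  r1 = data[0].size
  sum = data.sum { |run| m = avg(run); run.sum { |y| (y - m)**2 } }
  sum / (r2 * (r1 - 1))
end

# S_2^2: squared deviations of each execution mean from the grand mean,
# scaled by 1 / (r_2 - 1).
def s2_squared(data)
  grand = avg(data.flatten)
  data.sum { |run| (avg(run) - grand)**2 } / (data.size - 1)
end

puts s1_squared(SI2_DATA)  # 1.0
puts s2_squared(SI2_DATA)  # 0.5
```

With reps = [2, 3], these correspond to Si2(1) and Si2(2) respectively, since level 1 is the innermost (rightmost) index.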
Compute the unbiased T_i^2 variance estimator.
Arguments: i – the mathematical index from which to compute T_i^2.
# File lib/kalibera/data.rb, line 230
def Ti2(i)
  # This is the broken implementation of T_i^2 shown in the published
  # version of "Rigorous benchmarking in reasonable time". Tomas has
  # since fixed this in local versions of the paper.
  #@memoize
  #def broken_Ti2(self, i)
  #  """ Compute the unbiased T_i^2 variance estimator.
  #
  #  Arguments:
  #  i -- the mathematical index from which to compute T_i^2.
  #  """
  #
  #  raise unless 1 <= i <= n
  #  if i == 1:
  #    return self.Si2(1)
  #  return self.Si2(i) - self.Ti2(i - 1) / self.r(i - 1)

  # This is the correct definition of T_i^2
  raise unless 1 <= i
  raise unless i <= n

  if i == 1
    return Si2(1)
  end
  Si2(i) - Si2(i - 1) / r(i - 1)
end
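The corrected definition can be checked by hand with a small standalone sketch. The function ti2 and the constants EXAMPLE_SI2 and EXAMPLE_R below are hypothetical; the S_i^2 values are assumed inputs rather than something computed by the gem.

```ruby
# si2:  assumed S_i^2 values keyed by mathematical level (1 = innermost).
# reps: repetition counts r_i keyed the same way.
def ti2(i, si2, reps)
  raise ArgumentError, "level out of range" unless si2.key?(i)
  return si2[i] if i == 1
  # T_i^2 = S_i^2 - S_(i-1)^2 / r_(i-1)
  si2[i] - si2[i - 1] / reps[i - 1]
end

EXAMPLE_SI2 = { 1 => 1.0, 2 => 0.5 }
EXAMPLE_R = { 1 => 3, 2 => 2 }

puts ti2(1, EXAMPLE_SI2, EXAMPLE_R)  # equals S_1^2
puts ti2(2, EXAMPLE_SI2, EXAMPLE_R)  # 0.5 - 1.0 / 3
```

Note that the subtraction uses S_(i-1)^2, not T_(i-1)^2; that is the difference from the broken version quoted in the source comment.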
# File lib/kalibera/data.rb, line 143
def [](*indicies)
  raise unless indicies.size == @reps.size
  x = @data[indicies[0...indicies.size-1]]
  raise unless !x.nil?
  x[indicies[-1]]
end
Compute a confidence interval via bootstrap method.
Arguments (optional, positional): iterations – number of resamples to base the result upon. Default is 10000. confidence – the required confidence level, given as a string. Default is "0.95" (95%).
# File lib/kalibera/data.rb, line 306
def bootstrap_confidence_interval(iterations=10000, confidence="0.95")
  means = bootstrap_means(iterations)
  Kalibera.confidence_slice(means, confidence)
end
Compute a list of simulated means from bootstrap resampling.
Note that resampling occurs with replacement.
Arguments (optional, positional): iterations – number of resamples (and thus means) generated. Default is 1000.
# File lib/kalibera/data.rb, line 291
def bootstrap_means(iterations=1000)
  means = []
  for i in 0...iterations
    values = bootstrap_sample()
    means.push(Kalibera.mean(values))
  end
  # sort! sorts in place; a bare means.sort() would discard its result
  means.sort!
  means
end
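The resampling idea can be sketched without the gem. The version below is a simplification under stated assumptions: it resamples a flat list of values, whereas the method above resamples hierarchically across levels. The names mean_of and flat_bootstrap_means are hypothetical, and the fixed seed is only there to make runs repeatable.

```ruby
def mean_of(values)
  values.sum / values.size.to_f
end

# Draw `iterations` resamples (with replacement) from a flat list of
# measurements and return the sorted list of their means.
def flat_bootstrap_means(values, iterations = 1000, rng = Random.new(42))
  means = Array.new(iterations) do
    resample = Array.new(values.size) { values[rng.rand(values.size)] }
    mean_of(resample)
  end
  means.sort
end

means = flat_bootstrap_means([1.0, 2.0, 3.0, 4.0], 200)
puts means.first, means.last
```

Every simulated mean lies between the smallest and largest original measurement, because each resample only reuses existing values.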
# File lib/kalibera/data.rb, line 331
def bootstrap_quotient(other, iterations=10000, confidence='0.95')
  ratios = []
  for _ in 0...iterations
    ra = bootstrap_sample()
    rb = other.bootstrap_sample()
    mean_ra = Kalibera.mean(ra)
    mean_rb = Kalibera.mean(rb)

    if mean_rb == 0 # protect against divide by zero
      ratios.push(Float::INFINITY)
    else
      ratios.push(mean_ra / mean_rb)
    end
  end
  ratios.sort!
  Kalibera.confidence_slice(ratios, confidence).values
end
# File lib/kalibera/data.rb, line 327
def bootstrap_sample
  random_measurement_sample
end
Compute the half-width of the 95% confidence interval for the mean, using the Student's t quantile with r_n - 1 degrees of freedom.
# File lib/kalibera/data.rb, line 279
def confidence95
  degfreedom = @reps[0] - 1
  student_t_quantile95(degfreedom) * (Si2(n) / @reps[0]) ** 0.5
end
Computes a list of all possible index combinations for the levels at positions start up to (but not including) stop, counted from the left of reps. stop defaults to n.
# File lib/kalibera/data.rb, line 152
def index_iterator(start=0, stop=nil)
  if stop.nil?
    stop = n
  end

  maximum_indicies = @reps[start...stop]
  remaining_indicies = maximum_indicies.map { |maximum| (0...maximum).to_a }
  return [[]] if remaining_indicies.empty?
  remaining_indicies[0].product(*remaining_indicies.drop(1))
end
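A standalone sketch of the same enumeration, with reps passed explicitly instead of read from the instance. The name enumerate_indices is hypothetical.

```ruby
# Cartesian product of the index ranges for reps[start...stop].
def enumerate_indices(reps, start = 0, stop = reps.size)
  ranges = reps[start...stop].map { |maximum| (0...maximum).to_a }
  return [[]] if ranges.empty?
  ranges[0].product(*ranges.drop(1))
end

p enumerate_indices([2, 3])     # => [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2]]
p enumerate_indices([2, 3], 1)  # => [[0], [1], [2]]
p enumerate_indices([2, 3], 2)  # => [[]]
```

The last case matters for mean: when every index is already fixed, the result is a single empty suffix, so exactly one measurement is selected.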
Compute the mean across a number of values.
Arguments: indicies – array of fixed indices over which to compute the mean, given from left to right. The remaining indices are variable. Default is an empty array, which yields the mean over all measurements.
# File lib/kalibera/data.rb, line 184
def mean(indicies=[])
  remaining_indicies_cross_product = index_iterator(indicies.size)
  alldata = remaining_indicies_cross_product.map { |remaining| self[*(indicies + remaining)] }
  Kalibera.mean(alldata)
end
The number of levels in the experiment.
# File lib/kalibera/data.rb, line 164
def n
  @reps.size
end
Computes the optimal number of repetitions for a given level.
Note that the resulting number of reps is not rounded.
Arguments: i – the mathematical level for which to compute optimal reps. costs – A list of costs for each level, high to low.
# File lib/kalibera/data.rb, line 266
def optimalreps(i, costs)
  # NOTE: Does not round
  costs = costs.map { |x| Float(x) }
  raise unless 1 <= i
  raise unless i < n

  index = n - i
  return (costs[index - 1] / costs[index] * Ti2(i) / Ti2(i + 1)) ** 0.5
end
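The formula can be tried in isolation with assumed variance estimates. In this sketch the function name optimal_reps_sketch and the T_i^2 values are hypothetical; only the index arithmetic and the square-root expression follow the method above.

```ruby
# costs: cost per level, high to low (same convention as the method above).
# ti2:   assumed T_i^2 values keyed by mathematical level.
def optimal_reps_sketch(i, costs, ti2)
  n = costs.size
  raise ArgumentError, "need 1 <= i < n" unless i >= 1 && i < n
  index = n - i
  # sqrt( c_(i+1) / c_i * T_i^2 / T_(i+1)^2 )
  Math.sqrt(costs[index - 1].to_f / costs[index] * ti2[i] / ti2[i + 1])
end

r1 = optimal_reps_sketch(1, [9.0, 1.0], { 1 => 1.0, 2 => 0.25 })
puts r1       # 6.0
puts r1.ceil  # round up when planning whole repetitions
```

Because the result is not rounded, a caller that needs a whole number of repetitions has to round it explicitly, as the last line does.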
The number of repetitions for level i.
Arguments: i – mathematical index.
# File lib/kalibera/data.rb, line 172
def r(i)
  raise unless 1 <= i
  raise unless i <= n
  index = n - i
  @reps[index]
end
# File lib/kalibera/data.rb, line 311
def random_measurement_sample(index=[])
  results = []
  if index.size == n
    results.push self[*index]
  else
    indicies = (0...@reps[index.size]).map { |i| rand(@reps[index.size]) }
    for single_index in indicies
      newindex = index + [single_index]
      for value in random_measurement_sample(newindex)
        results.push value
      end
    end
  end
  results
end
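The recursion above resamples with replacement at every level, then descends into each chosen branch. A minimal sketch of that idea on plain nested arrays follows; hierarchical_resample is a hypothetical name, and nested arrays stand in for the Hash-based storage used by the class.

```ruby
# node is either a single measurement or an array of child nodes.
# At each array, pick node.size children with replacement, then recurse.
def hierarchical_resample(node, rng = Random.new)
  return [node] unless node.is_a?(Array)
  Array.new(node.size) { node[rng.rand(node.size)] }
       .flat_map { |child| hierarchical_resample(child, rng) }
end

nested = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
sample = hierarchical_resample(nested, Random.new(7))
puts sample.size  # 6, the same number of values as the original data
```

The sample always has as many values as the original data set, but the same execution or measurement may appear more than once.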