class Bmg::Summarizer
This class provides a basis for implementing aggregation operators.
Aggregation operators are made available through factory methods on the Summarizer
class itself:
  Summarizer.count
  Summarizer.sum(:qty)
  Summarizer.sum{|t| t[:qty] * t[:price] }
Once built, a summarizer can be used in either black-box or white-box mode.
  relation = ...
  agg = Summarizer.sum(:qty)

  # Black box mode:
  result = agg.summarize(relation)

  # White box mode:
  memo = agg.least
  relation.each do |tuple|
    memo = agg.happens(memo, tuple)
  end
  result = agg.finalize(memo)
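The two modes can be compared with a self-contained sketch. `SketchSum` below is a hypothetical stand-in written for this illustration, not the Bmg class; in particular, using 0 as its least value is an assumption.

```ruby
# Stand-in for illustration only: NOT the Bmg implementation.
class SketchSum
  def initialize(attr)
    @attr = attr
  end

  # Value used when the input is empty (assumed to be 0 for a sum).
  def least
    0
  end

  # Folds one tuple into the running total.
  def happens(memo, tuple)
    memo + tuple[@attr]
  end

  # A sum needs no post-processing.
  def finalize(memo)
    memo
  end

  # Black-box entry point: least, then happens per tuple, then finalize.
  def summarize(enum)
    finalize(enum.inject(least) { |m, t| happens(m, t) })
  end
end

relation = [{ qty: 2 }, { qty: 5 }, { qty: 3 }]
agg = SketchSum.new(:qty)

# Black box mode
black_box = agg.summarize(relation)

# White box mode: the caller drives the same three steps by hand
memo = agg.least
relation.each { |tuple| memo = agg.happens(memo, tuple) }
white_box = agg.finalize(memo)

puts black_box == white_box
```

White-box mode is mainly useful when the caller already iterates over the tuples for another purpose and wants to fold the aggregation into that loop.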
Attributes
@return the underlying functor, either a Symbol or a Proc
@return Aggregation options as a Hash
Public Class Methods
Factors an average summarizer
# File lib/bmg/summarizer/avg.rb, line 31
def self.avg(*args, &bl)
  Avg.new(*args, &bl)
end
Factors a summarizer driven by a proc
# File lib/bmg/summarizer/by_proc.rb, line 35
def self.by_proc(least = nil, proc = nil, &bl)
  least, proc = nil, least if least.is_a?(Proc)
  ByProc.new(least, proc || bl)
end
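The argument shuffling in `by_proc` accepts several call shapes. The sketch below isolates just that normalization step; `sketch_by_proc_args` is a hypothetical helper that returns the pair that would be handed to `ByProc.new`, and it makes no claim about the arguments the proc itself receives.

```ruby
# Hypothetical helper mirroring the argument handling shown above.
# Returns [least, callable] instead of building a summarizer.
def sketch_by_proc_args(least = nil, proc = nil, &bl)
  # A Proc in first position means no least value was given.
  least, proc = nil, least if least.is_a?(Proc)
  [least, proc || bl]
end

fn = ->(*args) { args }

with_least   = sketch_by_proc_args(0, fn)        # least and proc both given
proc_only    = sketch_by_proc_args(fn)           # proc passed positionally
block_only   = sketch_by_proc_args { |*a| a }    # proc passed as a block
least_block  = sketch_by_proc_args(0) { |*a| a } # least plus a block
```

In every shape the callable ends up second, and the least value defaults to `nil` when omitted.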
Factors a collect summarizer
# File lib/bmg/summarizer/collect.rb, line 26
def self.collect(*args, &bl)
  Collect.new(*args, &bl)
end
Factors a concatenation summarizer
# File lib/bmg/summarizer/concat.rb, line 37
def self.concat(*args, &bl)
  Concat.new(*args, &bl)
end
Factors a count summarizer
# File lib/bmg/summarizer/count.rb, line 26
def self.count(*args, &bl)
  Count.new(*args, &bl)
end
Factors a distinct summarizer
# File lib/bmg/summarizer/distinct.rb, line 31
def self.distinct(*args, &bl)
  Distinct.new(*args, &bl)
end
Factors a distinct count summarizer
# File lib/bmg/summarizer/distinct_count.rb, line 31
def self.distinct_count(*args, &bl)
  DistinctCount.new(*args, &bl)
end
Factors a max summarizer
# File lib/bmg/summarizer/max.rb, line 26
def self.max(*args, &bl)
  Max.new(*args, &bl)
end
# File lib/bmg/summarizer/percentile.rb, line 66
def self.median(*args, &bl)
  Percentile.new(*(args + [50]), &bl)
end

# File lib/bmg/summarizer/percentile.rb, line 70
def self.median_cont(*args, &bl)
  Percentile.new(*(args + [50, {:variant => :continuous}]), &bl)
end

# File lib/bmg/summarizer/percentile.rb, line 74
def self.median_disc(*args, &bl)
  Percentile.new(*(args + [50, {:variant => :discrete}]), &bl)
end
Factors a min summarizer
# File lib/bmg/summarizer/min.rb, line 26
def self.min(*args, &bl)
  Min.new(*args, &bl)
end
Factors a multiple summarizer
# File lib/bmg/summarizer/multiple.rb, line 41
def self.multiple(defs)
  Multiple.new(defs)
end
Creates a Summarizer instance.
Private method; please use the factory methods instead.
# File lib/bmg/summarizer.rb, line 40
def initialize(*args, &block)
  @options = default_options
  args.push(block) if block
  args.each do |arg|
    case arg
    when Symbol, Proc then @functor = arg
    when Hash then @options = @options.merge(arg)
    else raise ArgumentError, "Unexpected `#{arg}`"
    end
  end
end
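The constructor sorts its arguments by type rather than by position. The following sketch reproduces that dispatch in a hypothetical `SketchSummarizer` class (a stand-in, with empty default options and a made-up `:note` option) to show which arguments end up where.

```ruby
# Stand-in for illustration only: NOT the Bmg class.
class SketchSummarizer
  attr_reader :functor, :options

  def initialize(*args, &block)
    @options = {}                 # assumed defaults for this sketch
    args.push(block) if block     # a block is treated like a Proc argument
    args.each do |arg|
      case arg
      when Symbol, Proc then @functor = arg
      when Hash then @options = @options.merge(arg)
      else raise ArgumentError, "Unexpected `#{arg}`"
      end
    end
  end
end

# :note is a hypothetical option used only for this demonstration.
by_symbol = SketchSummarizer.new(:qty, { note: "demo" })
by_block  = SketchSummarizer.new { |t| t[:qty] * t[:price] }

# Anything that is not a Symbol, Proc or Hash is rejected.
rejected = begin
  SketchSummarizer.new(42)
  false
rescue ArgumentError
  true
end
```

Because a block is pushed onto the argument list last, it would replace a Symbol functor given in the same call.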
# File lib/bmg/summarizer/percentile.rb, line 54
def self.percentile(*args, &bl)
  Percentile.new(*args, &bl)
end

# File lib/bmg/summarizer/percentile.rb, line 58
def self.percentile_cont(*args, &bl)
  Percentile.new(*(args + [{:variant => :continuous}]), &bl)
end

# File lib/bmg/summarizer/percentile.rb, line 62
def self.percentile_disc(*args, &bl)
  Percentile.new(*(args + [{:variant => :discrete}]), &bl)
end
Factors a standard deviation summarizer
# File lib/bmg/summarizer/stddev.rb, line 21
def self.stddev(*args, &bl)
  Stddev.new(*args, &bl)
end
Factors a sum summarizer
# File lib/bmg/summarizer/sum.rb, line 26
def self.sum(*args, &bl)
  Sum.new(*args, &bl)
end
Converts some summarization definitions to a Hash of summarizers.
# File lib/bmg/summarizer.rb, line 55
def self.summarization(defs)
  Hash[defs.map{|k,v|
    summarizer = case v
      when Summarizer then v
      when Symbol then Summarizer.send(v, k)
      when Proc then Summarizer.by_proc(&v)
      else
        raise ArgumentError, "Unexpected summarizer #{k} => #{v}"
    end
    [ k, summarizer ]
  }]
end
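To make the Symbol shorthand concrete: a definition such as `:qty => :sum` resolves to `Summarizer.sum(:qty)`, because the key is passed as the factory argument. The sketch below uses a hypothetical helper, `sketch_resolve`, that reports the factory call each entry maps to instead of building real summarizers; the pass-through case for existing Summarizer instances is left out.

```ruby
# Hypothetical helper: describes the factory call for each definition.
def sketch_resolve(defs)
  defs.map { |key, definition|
    call = case definition
           when Symbol then "Summarizer.#{definition}(#{key.inspect})"
           when Proc   then "Summarizer.by_proc(&proc)"
           else raise ArgumentError, "Unexpected summarizer #{key} => #{definition}"
           end
    [key, call]
  }.to_h
end

# The proc body is a placeholder; its expected arguments are not shown here.
resolved = sketch_resolve({
  qty:    :sum,
  price:  :max,
  custom: proc { |*args| args }
})

resolved.each { |key, call| puts "#{key}: #{call}" }
```

One consequence of this shorthand is that the result key and the aggregated attribute share a name; aggregating a differently named attribute requires passing a Summarizer instance or a Proc.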
# File lib/bmg/summarizer/value_by.rb, line 57
def self.value_by(*args, &bl)
  ValueBy.new(*args, &bl)
end
Factors a variance summarizer
# File lib/bmg/summarizer/variance.rb, line 37
def self.variance(*args, &bl)
  Variance.new(*args, &bl)
end
Public Instance Methods
This method finalizes an aggregation.
Argument memo is either least or the result of aggregating through happens. The default implementation simply returns memo. The method is intended to be overridden for complex aggregations that need stateful information, such as `avg`.
@param [Object] memo the current aggregation value
@return [Object] the aggregation value, as finalized
# File lib/bmg/summarizer.rb, line 120
def finalize(memo)
  memo
end
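An average illustrates why finalize exists: the running memo has to carry both a count and a sum, and only the final step can turn them into a single number. `SketchAvg` below is a hypothetical stand-in, not the Bmg `Avg` class; the `[count, sum]` memo layout and the `nil` result for empty input are assumptions made for this sketch.

```ruby
# Stand-in for illustration only: NOT the Bmg Avg implementation.
class SketchAvg
  def initialize(attr)
    @attr = attr
  end

  # Stateful memo: [number of tuples seen, running sum].
  def least
    [0, 0.0]
  end

  def happens(memo, tuple)
    [memo[0] + 1, memo[1] + tuple[@attr]]
  end

  # Collapses the stateful memo into the final value.
  def finalize(memo)
    count, sum = memo
    count.zero? ? nil : sum / count
  end

  def summarize(enum)
    finalize(enum.inject(least) { |m, t| happens(m, t) })
  end
end

avg = SketchAvg.new(:qty)
mean  = avg.summarize([{ qty: 2 }, { qty: 4 }])
empty = avg.summarize([])
```

Without a finalize step, the caller would receive the intermediate `[count, sum]` pair instead of the average.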
This method is called on each aggregated tuple and must return an updated memo value. It can be seen as the block typically given to Enumerable#inject.
The default implementation extracts the value from the tuple (through the functor) and delegates to _happens.
@param memo the current aggregation value
@param tuple the current iterated tuple
@return updated memo value
# File lib/bmg/summarizer.rb, line 97
def happens(memo, tuple)
  value = extract_value(tuple)
  _happens(memo, value)
end
Returns the least value, which is the one to use on an empty set.
This method is intended to be overridden by subclasses; the default implementation returns nil.
@return the least value for this summarizer
# File lib/bmg/summarizer.rb, line 83
def least
  nil
end
Summarizes an enumeration of tuples.
@param enum an enumerable of tuples
@return the computed summarization value
# File lib/bmg/summarizer.rb, line 128
def summarize(enum)
  finalize(enum.inject(least){|m,t| happens(m, t) })
end
Returns the canonical summarizer name
# File lib/bmg/summarizer.rb, line 133
def to_summarizer_name
  self.class.name
    .gsub(/[a-z][A-Z]/){|x| x.split('').join('_') }
    .downcase[/::([a-z_]+)$/, 1]
    .to_sym
end
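The name derivation is a snake-case conversion followed by extraction of the last namespace segment. The sketch below applies the same chain to plain strings; `sketch_summarizer_name` is a hypothetical helper that takes the class name as an argument instead of reading `self.class.name`.

```ruby
# Hypothetical helper applying the same transformation to a class-name string.
def sketch_summarizer_name(class_name)
  class_name
    .gsub(/[a-z][A-Z]/) { |x| x.split('').join('_') } # "tC" becomes "t_C"
    .downcase[/::([a-z_]+)$/, 1]                       # keep the last segment
    .to_sym
end

simple   = sketch_summarizer_name("Bmg::Summarizer::Avg")
compound = sketch_summarizer_name("Bmg::Summarizer::DistinctCount")

puts simple.inspect
puts compound.inspect
```

Note that the pattern requires a `::` before the final segment, so the chain relies on the class being namespaced.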
Protected Instance Methods
@see happens.
This method is intended to be overridden. By default it returns value, making this summarizer a “Last(…)” summarizer.
# File lib/bmg/summarizer.rb, line 106
def _happens(memo, value)
  value
end
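The split between happens and _happens means a subclass usually only has to say how a value combines with the memo. The sketch below uses two hypothetical stand-in classes (not the Bmg ones): `SketchBase` keeps the default "last value" behavior, and `SketchMax` overrides only `_happens`.

```ruby
# Stand-ins for illustration only: NOT the Bmg classes.
class SketchBase
  def initialize(attr)
    @attr = attr
  end

  def least
    nil
  end

  # Extracts the value, then delegates the combination step.
  def happens(memo, tuple)
    _happens(memo, tuple[@attr])
  end

  def summarize(enum)
    enum.inject(least) { |m, t| happens(m, t) }
  end

  protected

  # Default: ignore the memo and keep the latest value ("Last").
  def _happens(memo, value)
    value
  end
end

class SketchMax < SketchBase
  protected

  # Keep the larger of the memo and the new value; nil means "nothing seen yet".
  def _happens(memo, value)
    (memo.nil? || value > memo) ? value : memo
  end
end

relation = [{ qty: 4 }, { qty: 9 }, { qty: 1 }]
last_value = SketchBase.new(:qty).summarize(relation)
max_value  = SketchMax.new(:qty).summarize(relation)
```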
Returns the default options to use
@return the default aggregation options
# File lib/bmg/summarizer.rb, line 71
def default_options
  {}
end
# File lib/bmg/summarizer.rb, line 142
def extract_value(tuple)
  value = case @functor
    when Proc
      @functor.call(tuple)
    when NilClass
      tuple
    when Symbol
      tuple[@functor]
    else
      tuple[@functor]
    end
end
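The functor therefore decides what value each tuple contributes. A minimal sketch of the three cases, using a hypothetical free-standing helper `sketch_extract_value` that takes the functor as a parameter instead of reading `@functor`:

```ruby
# Hypothetical helper mirroring the dispatch shown above.
def sketch_extract_value(functor, tuple)
  case functor
  when Proc     then functor.call(tuple) # computed value
  when NilClass then tuple               # no functor: the tuple itself
  else               tuple[functor]      # Symbol (or any other key): lookup
  end
end

tuple = { qty: 3, price: 2.5 }

by_symbol = sketch_extract_value(:qty, tuple)
by_proc   = sketch_extract_value(->(t) { t[:qty] * t[:price] }, tuple)
by_nil    = sketch_extract_value(nil, tuple)
```

The nil case is what lets a summarizer built without arguments, such as a plain count, operate on whole tuples.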