class Bmg::Summarizer

Summarizer.

This class provides a basis for implementing aggregation operators.

Aggregation operators are made available through factory methods on the Summarizer class itself:

Summarizer.count
Summarizer.sum(:qty)
Summarizer.sum{|t| t[:qty] * t[:price] }

Once built, summarizers can be used either in black-box or white-box modes.

relation = ...
agg = Summarizer.sum(:qty)

# Black box mode:
result = agg.summarize(relation)

# White box mode:
memo = agg.least
relation.each do |tuple|
  memo = agg.happens(memo, tuple)
end
result = agg.finalize(memo)
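The two modes can be tried without Bmg installed. The sketch below uses a hypothetical stand-in class, `SketchSum`, that only mimics the `least` / `happens` / `finalize` protocol described here; it is not Bmg's own `Sum` class, and the sample tuples are made up.

```ruby
# Hypothetical stand-in for a sum summarizer; not Bmg's real Sum class.
class SketchSum
  def initialize(attr)
    @attr = attr
  end

  # Value to start from, and the result for an empty input.
  def least
    0
  end

  # Folds one tuple into the running memo.
  def happens(memo, tuple)
    memo + tuple[@attr]
  end

  # A plain sum needs no post-processing.
  def finalize(memo)
    memo
  end

  # Black-box mode: the same three steps, driven by inject.
  def summarize(enum)
    finalize(enum.inject(least) { |m, t| happens(m, t) })
  end
end

relation = [{ qty: 2 }, { qty: 5 }, { qty: 3 }] # made-up sample tuples
agg = SketchSum.new(:qty)

# White-box mode: the caller drives the loop.
memo = agg.least
relation.each { |tuple| memo = agg.happens(memo, tuple) }
white_box = agg.finalize(memo)

# Black-box mode: one call.
black_box = agg.summarize(relation)
```

Both variables end up holding the same value, which is the point of the protocol: `summarize` is only a convenience wrapper around the three white-box steps.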

Attributes

functor[R]

@return the underlying functor, either a Symbol or a Proc

options[R]

@return Aggregation options as a Hash

Public Class Methods

avg(*args, &bl)

Factors an average summarizer

# File lib/bmg/summarizer/avg.rb, line 31
def self.avg(*args, &bl)
  Avg.new(*args, &bl)
end
by_proc(least = nil, proc = nil, &bl)

Factors a summarizer backed by a user-supplied Proc

# File lib/bmg/summarizer/by_proc.rb, line 35
def self.by_proc(least = nil, proc = nil, &bl)
  least, proc = nil, least if least.is_a?(Proc)
  ByProc.new(least, proc || bl)
end
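The first line of `by_proc` lets callers omit `least` and pass the Proc in first position. A minimal sketch of that argument shuffling, using a hypothetical helper `normalize_by_proc_args` that returns the pair that would be handed to `ByProc.new` (how `ByProc` later calls the Proc is not shown on this page, so the sample proc accepts any arguments):

```ruby
# Hypothetical helper mirroring only the argument handling of by_proc.
def normalize_by_proc_args(least = nil, proc = nil, &bl)
  # If the first argument is already a Proc, treat it as the proc
  # and fall back to a nil least value.
  least, proc = nil, least if least.is_a?(Proc)
  [least, proc || bl]
end

fn = ->(*args) { args.first } # placeholder proc, arity left open on purpose

a = normalize_by_proc_args(0, fn)  # least and proc both given
b = normalize_by_proc_args(fn)     # proc only, least defaults to nil
c = normalize_by_proc_args(0) { |*args| args.first } # block instead of proc
```

An explicit proc argument takes precedence over a block when both are supplied, because of `proc || bl`.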
collect(*args, &bl)

Factors a collect summarizer

# File lib/bmg/summarizer/collect.rb, line 26
def self.collect(*args, &bl)
  Collect.new(*args, &bl)
end
concat(*args, &bl)

Factors a concatenation summarizer

# File lib/bmg/summarizer/concat.rb, line 37
def self.concat(*args, &bl)
  Concat.new(*args, &bl)
end
count(*args, &bl)

Factors a count summarizer

# File lib/bmg/summarizer/count.rb, line 26
def self.count(*args, &bl)
  Count.new(*args, &bl)
end
distinct(*args, &bl)

Factors a distinct summarizer

# File lib/bmg/summarizer/distinct.rb, line 31
def self.distinct(*args, &bl)
  Distinct.new(*args, &bl)
end
distinct_count(*args, &bl)

Factors a distinct count summarizer

# File lib/bmg/summarizer/distinct_count.rb, line 31
def self.distinct_count(*args, &bl)
  DistinctCount.new(*args, &bl)
end
max(*args, &bl)

Factors a max summarizer

# File lib/bmg/summarizer/max.rb, line 26
def self.max(*args, &bl)
  Max.new(*args, &bl)
end
median(*args, &bl)
# File lib/bmg/summarizer/percentile.rb, line 66
def self.median(*args, &bl)
  Percentile.new(*(args + [50]), &bl)
end
median_cont(*args, &bl)
# File lib/bmg/summarizer/percentile.rb, line 70
def self.median_cont(*args, &bl)
  Percentile.new(*(args + [50, {:variant => :continuous}]), &bl)
end
median_disc(*args, &bl)
# File lib/bmg/summarizer/percentile.rb, line 74
def self.median_disc(*args, &bl)
  Percentile.new(*(args + [50, {:variant => :discrete}]), &bl)
end
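The three `median*` factories are thin wrappers: each appends the percentile number 50, and optionally a `:variant` option, to the caller's arguments before delegating to `Percentile.new`. The sketch below reproduces only that argument building with hypothetical helper methods; how `Percentile` interprets the number and the option is not shown on this page.

```ruby
# Hypothetical helpers that build the argument list each factory forwards.
def median_args(*args)
  args + [50]
end

def median_cont_args(*args)
  args + [50, { :variant => :continuous }]
end

def median_disc_args(*args)
  args + [50, { :variant => :discrete }]
end

plain = median_args(:qty)       # => [:qty, 50]
cont  = median_cont_args(:qty)  # => [:qty, 50, {:variant=>:continuous}]
disc  = median_disc_args(:qty)  # => [:qty, 50, {:variant=>:discrete}]
```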
min(*args, &bl)

Factors a min summarizer

# File lib/bmg/summarizer/min.rb, line 26
def self.min(*args, &bl)
  Min.new(*args, &bl)
end
multiple(defs)

Factors a multiple summarizer from the given definitions

# File lib/bmg/summarizer/multiple.rb, line 41
def self.multiple(defs)
  Multiple.new(defs)
end
new(*args, &block)

Creates a Summarizer instance.

Private method; please use the factory methods instead.

# File lib/bmg/summarizer.rb, line 40
def initialize(*args, &block)
  @options = default_options
  args.push(block) if block
  args.each do |arg|
    case arg
    when Symbol, Proc then @functor = arg
    when Hash         then @options = @options.merge(arg)
    else
      raise ArgumentError, "Unexpected `#{arg}`"
    end
  end
end
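The constructor sorts its arguments by type: a Symbol or Proc becomes the functor, a Hash is merged into the options, and anything else is rejected. A minimal sketch with a hypothetical stand-in class, `SketchSummarizer`, that copies only this logic (the `:distinct` option key is made up for illustration):

```ruby
# Hypothetical stand-in reproducing only the constructor logic shown above.
class SketchSummarizer
  attr_reader :functor, :options

  def initialize(*args, &block)
    @options = {}
    args.push(block) if block # a block is treated like a trailing Proc
    args.each do |arg|
      case arg
      when Symbol, Proc then @functor = arg
      when Hash         then @options = @options.merge(arg)
      else
        raise ArgumentError, "Unexpected `#{arg}`"
      end
    end
  end
end

by_symbol = SketchSummarizer.new(:qty, { distinct: true })
by_block  = SketchSummarizer.new { |t| t[:qty] * t[:price] }

# Any other argument type raises.
rejected = begin
  SketchSummarizer.new(42)
  false
rescue ArgumentError
  true
end
```

Because the block is pushed last, it overrides a Symbol or Proc passed positionally in the same call.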
percentile(*args, &bl)
# File lib/bmg/summarizer/percentile.rb, line 54
def self.percentile(*args, &bl)
  Percentile.new(*args, &bl)
end
percentile_cont(*args, &bl)
# File lib/bmg/summarizer/percentile.rb, line 58
def self.percentile_cont(*args, &bl)
  Percentile.new(*(args + [{:variant => :continuous}]), &bl)
end
percentile_disc(*args, &bl)
# File lib/bmg/summarizer/percentile.rb, line 62
def self.percentile_disc(*args, &bl)
  Percentile.new(*(args + [{:variant => :discrete}]), &bl)
end
stddev(*args, &bl)

Factors a standard deviation summarizer

# File lib/bmg/summarizer/stddev.rb, line 21
def self.stddev(*args, &bl)
  Stddev.new(*args, &bl)
end
sum(*args, &bl)

Factors a sum summarizer

# File lib/bmg/summarizer/sum.rb, line 26
def self.sum(*args, &bl)
  Sum.new(*args, &bl)
end
summarization(defs)

Converts some summarization definitions to a Hash of summarizers.

# File lib/bmg/summarizer.rb, line 55
def self.summarization(defs)
  Hash[defs.map{|k,v|
    summarizer = case v
    when Summarizer then v
    when Symbol     then Summarizer.send(v, k)
    when Proc       then Summarizer.by_proc(&v)
    else
      raise ArgumentError, "Unexpected summarizer #{k} => #{v}"
    end
    [ k, summarizer ]
  }]
end
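Note what the Symbol branch implies: in `{ :qty => :sum }` the key is used both as the name of the result entry and as the attribute passed to the factory, since the call is `Summarizer.send(:sum, :qty)`. The sketch below illustrates the dispatch with a hypothetical `SketchFactory` module whose factories return descriptive arrays instead of real summarizers.

```ruby
# Hypothetical module; its factories return [name, argument] pairs so the
# dispatch can be inspected without Bmg.
module SketchFactory
  def self.sum(attr)
    [:sum, attr]
  end

  def self.max(attr)
    [:max, attr]
  end

  def self.by_proc(&bl)
    [:by_proc, bl]
  end

  # Same dispatch as summarization, minus the Summarizer pass-through case.
  def self.summarization(defs)
    Hash[defs.map { |k, v|
      summarizer = case v
      when Symbol then send(v, k)    # factory name => called with the key
      when Proc   then by_proc(&v)
      else
        raise ArgumentError, "Unexpected summarizer #{k} => #{v}"
      end
      [k, summarizer]
    }]
  end
end

counter = ->(*args) { args.first } # placeholder proc
result = SketchFactory.summarization(qty: :sum, price: :max, custom: counter)
```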
value_by(*args, &bl)
# File lib/bmg/summarizer/value_by.rb, line 57
def self.value_by(*args, &bl)
  ValueBy.new(*args, &bl)
end
variance(*args, &bl)

Factors a variance summarizer

# File lib/bmg/summarizer/variance.rb, line 37
def self.variance(*args, &bl)
  Variance.new(*args, &bl)
end

Public Instance Methods

finalize(memo)

This method finalizes an aggregation.

Argument memo is either least or the result of aggregating through happens. The default implementation simply returns memo. The method is intended to be overridden for complex aggregations, such as `avg`, whose memo carries intermediate state.

@param [Object] memo the current aggregation value
@return [Object] the aggregation value, as finalized

# File lib/bmg/summarizer.rb, line 120
def finalize(memo)
  memo
end
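To see why an average needs `finalize`, consider a memo that tracks a running sum and a count, which only become a single number at the end. The following is a sketch of that idea with a hypothetical `SketchAvg` class; it is not Bmg's `Avg` implementation, and returning nil for an empty input is an assumption of the sketch.

```ruby
# Hypothetical average summarizer with a [sum, count] memo.
class SketchAvg
  def initialize(attr)
    @attr = attr
  end

  # Intermediate state: running sum and number of tuples seen.
  def least
    [0.0, 0]
  end

  def happens(memo, tuple)
    sum, count = memo
    [sum + tuple[@attr], count + 1]
  end

  # Collapses the intermediate state into the final value.
  def finalize(memo)
    sum, count = memo
    count.zero? ? nil : sum / count
  end
end

tuples = [{ qty: 2 }, { qty: 4 }, { qty: 9 }] # made-up sample tuples
avg = SketchAvg.new(:qty)
memo = tuples.inject(avg.least) { |m, t| avg.happens(m, t) }
average = avg.finalize(memo) # => 5.0
```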
happens(memo, tuple)

This method is called on each aggregated tuple and must return an updated memo value. It can be seen as the block typically given to Enumerable#inject.

The default implementation collects the pre-value on the tuple and delegates to _happens.

@param memo the current aggregation value
@param tuple the current iterated tuple
@return updated memo value

# File lib/bmg/summarizer.rb, line 97
def happens(memo, tuple)
  value = extract_value(tuple)
  _happens(memo, value)
end
least()

Returns the least value, which is the one to use on an empty set.

This method is intended to be overridden by subclasses; the default implementation returns nil.

@return the least value for this summarizer

# File lib/bmg/summarizer.rb, line 83
def least
  nil
end
summarize(enum)

Summarizes an enumeration of tuples.

@param enum an enumerable of tuples
@return the computed summarization value

# File lib/bmg/summarizer.rb, line 128
def summarize(enum)
  finalize(enum.inject(least){|m,t| happens(m, t) })
end
to_summarizer_name()

Returns the canonical summarizer name

# File lib/bmg/summarizer.rb, line 133
def to_summarizer_name
  self.class.name
    .gsub(/[a-z][A-Z]/){|x| x.split('').join('_') }
    .downcase[/::([a-z_]+)$/, 1]
    .to_sym
end
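The name derivation is plain string manipulation and can be checked on class-name strings directly. The sketch below applies the same chain in a hypothetical helper, `sketch_summarizer_name`, that takes the class name as an argument instead of reading `self.class.name`.

```ruby
# Hypothetical helper applying the same transformation to a given string.
def sketch_summarizer_name(class_name)
  class_name
    .gsub(/[a-z][A-Z]/) { |x| x.split('').join('_') } # "tC" becomes "t_C"
    .downcase[/::([a-z_]+)$/, 1]                      # keep the last segment
    .to_sym
end

simple   = sketch_summarizer_name("Bmg::Summarizer::Sum")
compound = sketch_summarizer_name("Bmg::Summarizer::DistinctCount")
# simple   => :sum
# compound => :distinct_count
```

The result matches the factory method names listed above, so a summarizer's name can be passed back to `Summarizer.send`.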

Protected Instance Methods

_happens(memo, value)

@see happens.

This method is intended to be overridden and returns value by default, making this summarizer a “Last(…)” summarizer.

# File lib/bmg/summarizer.rb, line 106
def _happens(memo, value)
  value
end
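The split between `happens` and `_happens` is a template-method arrangement: the public method extracts the value, and subclasses override only the protected combining step. A minimal sketch with hypothetical classes `SketchBase` and `SketchMax` (neither is Bmg code; value extraction is simplified to a Symbol lookup):

```ruby
# Hypothetical base class: with the default _happens it keeps the last value.
class SketchBase
  def initialize(attr)
    @attr = attr
  end

  def least
    nil
  end

  # Extracts the value, then delegates the combining step.
  def happens(memo, tuple)
    _happens(memo, tuple[@attr])
  end

  def summarize(enum)
    enum.inject(least) { |m, t| happens(m, t) }
  end

  protected

  def _happens(memo, value)
    value
  end
end

# Hypothetical subclass: only the combining step changes.
class SketchMax < SketchBase
  protected

  def _happens(memo, value)
    memo.nil? || value > memo ? value : memo
  end
end

tuples = [{ qty: 2 }, { qty: 7 }, { qty: 4 }] # made-up sample tuples
last_qty = SketchBase.new(:qty).summarize(tuples) # => 4
max_qty  = SketchMax.new(:qty).summarize(tuples)  # => 7
```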
default_options()

Returns the default options to use

@return the default aggregation options

# File lib/bmg/summarizer.rb, line 71
def default_options
  {}
end
extract_value(tuple)
# File lib/bmg/summarizer.rb, line 142
def extract_value(tuple)
  value = case @functor
  when Proc
    @functor.call(tuple)
  when NilClass
    tuple
  when Symbol
    tuple[@functor]
  else
    tuple[@functor]
  end
end
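`extract_value` has no description of its own, but the source shows three behaviours depending on the functor: a Proc is called with the tuple, a nil functor yields the whole tuple, and anything else is used as a key into the tuple. A sketch of the same dispatch as a standalone, hypothetical helper that takes the functor as a parameter:

```ruby
# Hypothetical helper with the same dispatch as extract_value.
def sketch_extract_value(functor, tuple)
  case functor
  when Proc     then functor.call(tuple) # computed value
  when NilClass then tuple               # no functor: the tuple itself
  else               tuple[functor]      # Symbol (or other key) lookup
  end
end

tuple = { qty: 3, price: 2.5 } # made-up sample tuple

computed = sketch_extract_value(->(t) { t[:qty] * t[:price] }, tuple) # => 7.5
looked_up = sketch_extract_value(:qty, tuple)                         # => 3
whole = sketch_extract_value(nil, tuple)                              # => tuple
```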