class Traject::Indexer::Context
Attributes
Sometimes we have multiple inputs: input_name describes the current one, and position_in_input
is the position of the record within the current input. Both can be blank when we don't know them.
'position' is the 1-based position of the record in the stream of processed records.
Should we be skipping this record?
Public Class Methods
# File lib/traject/indexer/context.rb, line 8
def initialize(hash_init = {})
  # TODO, argument checking for required args?

  self.clipboard   = {}
  self.output_hash = {}

  hash_init.each_pair do |key, value|
    self.send("#{key}=", value)
  end

  @skip = false
end
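As a rough illustration of the constructor above, the sketch below uses a hypothetical stand-in class (SketchContext, which is not part of traject) that copies only the initialization logic from the listing. It assumes nothing beyond what the listing shows: each key of the hash is sent to the matching attribute writer, so a key without a writer raises NoMethodError.

```ruby
# Hypothetical stand-in for Traject::Indexer::Context, reduced to a few of the
# attributes and the constructor logic from the listing. Not the real class.
class SketchContext
  attr_accessor :clipboard, :output_hash, :position, :input_name

  def initialize(hash_init = {})
    self.clipboard   = {}
    self.output_hash = {}

    # Each key is dispatched to its writer, e.g. :position -> position=
    hash_init.each_pair { |key, value| send("#{key}=", value) }

    @skip = false
  end

  def skip?
    @skip
  end
end

ctx = SketchContext.new(position: 1, input_name: "batch.mrc")

# A key with no matching writer is rejected by Ruby itself.
unknown_key_error =
  begin
    SketchContext.new(no_such_attribute: true)
    nil
  rescue NoMethodError => e
    e
  end
```

Because the defaults are assigned before the hash is applied, a caller can still pass its own clipboard or output_hash to replace the empty ones.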
Public Instance Methods
Add values to an array in context.output_hash with the specified key/field_name(s). Creates the array in output_hash if it is currently nil.
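Post-processing/filtering:
- Uniqs the accumulator, unless the setting named by Traject::Indexer::ToFieldStep::ALLOW_DUPLICATE_VALUES is set.
- Removes nil values, unless the setting named by Traject::Indexer::ToFieldStep::ALLOW_NIL_VALUES is set.
- Will not add an empty array to output_hash (will leave it nil instead), unless the setting named by Traject::Indexer::ToFieldStep::ALLOW_EMPTY_FIELDS is set.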
Multiple values can be added with multiple arguments. (We avoid treating an array argument as multiple values, to accommodate odd use cases where the array itself is desired as a value in output_hash.)
@param field_name [String, Symbol, Array<String, Symbol>] a key in output_hash, or an array of such keys.
@example add one value
context.add_output(:additional_title, "a title")
@example add multiple values as multiple params
context.add_output("additional_title", "a title", "another title")
@example add multiple values as multiple params from an array, using the Ruby splat operator
context.add_output(:some_key, *array_of_values)
@example add to multiple keys in output hash
context.add_output(["key1", "key2"], "value")
@return [Traject::Indexer::Context] self
Note: for historical reasons, the relevant settings key names are in constants in Traject::Indexer::ToFieldStep, but the settings don't apply only to ToFieldSteps.
# File lib/traject/indexer/context.rb, line 117
def add_output(field_name, *values)
  values.compact! unless self.settings && self.settings[Traject::Indexer::ToFieldStep::ALLOW_NIL_VALUES]

  return self if values.empty? and not (self.settings && self.settings[Traject::Indexer::ToFieldStep::ALLOW_EMPTY_FIELDS])

  Array(field_name).each do |key|
    accumulator = (self.output_hash[key.to_s] ||= [])
    accumulator.concat values
    accumulator.uniq! unless self.settings && self.settings[Traject::Indexer::ToFieldStep::ALLOW_DUPLICATE_VALUES]
  end

  return self
end
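The filtering rules of add_output can be tried out with a small stand-in. The sketch below is an assumption-laden copy of the logic in the listing: SketchContext and the three setting key strings are hypothetical (the real key names live in constants on Traject::Indexer::ToFieldStep and are not reproduced here), and settings is taken to be a plain Hash.

```ruby
# Hypothetical stand-in that mirrors the add_output logic from the listing.
class SketchContext
  # Made-up setting keys, for illustration only.
  ALLOW_NIL_VALUES       = "sketch.allow_nil_values"
  ALLOW_EMPTY_FIELDS     = "sketch.allow_empty_fields"
  ALLOW_DUPLICATE_VALUES = "sketch.allow_duplicate_values"

  attr_accessor :output_hash, :settings

  def initialize(settings = nil)
    self.output_hash = {}
    self.settings    = settings
  end

  def add_output(field_name, *values)
    # Drop nils unless explicitly allowed.
    values.compact! unless settings && settings[ALLOW_NIL_VALUES]

    # Leave the key absent rather than storing an empty array.
    return self if values.empty? && !(settings && settings[ALLOW_EMPTY_FIELDS])

    Array(field_name).each do |key|
      accumulator = (output_hash[key.to_s] ||= [])
      accumulator.concat(values)
      accumulator.uniq! unless settings && settings[ALLOW_DUPLICATE_VALUES]
    end

    self
  end
end

# Default behavior: nils dropped, duplicates removed, empty adds ignored.
default_ctx = SketchContext.new
default_ctx.add_output(:title, "a", nil, "a")
           .add_output(["title", :alt_title], "b")
           .add_output(:empty_field)
           .add_output(:nested, ["x", "y"]) # the array itself is one value

# With the nil and duplicate settings on, values are kept as given.
permissive_ctx = SketchContext.new(
  SketchContext::ALLOW_NIL_VALUES       => true,
  SketchContext::ALLOW_DUPLICATE_VALUES => true
)
permissive_ctx.add_output(:title, "a", nil, "a")
```

Returning self is what allows the chained calls; symbol and string keys end up in the same slot because each key is converted with to_s.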
A string label that can be used to refer to a particular record in log messages and exceptions. Which parts it includes depends on what information is available.
# File lib/traject/indexer/context.rb, line 59
def record_inspect
  str = "<"

  str << "record ##{position}" if position

  if input_name && position_in_input
    str << " (#{input_name} ##{position_in_input}), "
  elsif position
    str << ", "
  end

  if source_id = source_record_id
    str << "source_id:#{source_id} "
  end

  if output_id = self.output_hash["id"]
    str << "output_id:#{[output_id].join(',')}"
  end

  str.chomp!(" ")
  str.chomp!(",")
  str << ">"

  str
end
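To make the shape of the label concrete, here is a minimal sketch that copies the record_inspect logic into a hypothetical stand-in class. SketchContext is not part of traject, and source_record_id is simplified to a plain attribute here (an assumption) so that the example stays self-contained.

```ruby
# Hypothetical stand-in; only the label-building logic is taken from the listing.
class SketchContext
  attr_accessor :position, :input_name, :position_in_input,
                :source_record_id, :output_hash

  def initialize(attrs = {})
    self.output_hash = {}
    attrs.each_pair { |key, value| send("#{key}=", value) }
  end

  def record_inspect
    str = +"<" # unary plus yields a mutable string even under frozen literals

    str << "record ##{position}" if position

    if input_name && position_in_input
      str << " (#{input_name} ##{position_in_input}), "
    elsif position
      str << ", "
    end

    if (source_id = source_record_id)
      str << "source_id:#{source_id} "
    end

    if (output_id = output_hash["id"])
      str << "output_id:#{[output_id].join(',')}"
    end

    # Trim a trailing separator left behind when later parts are missing.
    str.chomp!(" ")
    str.chomp!(",")
    str << ">"

    str
  end
end

full_label = SketchContext.new(
  position: 3, input_name: "file.mrc", position_in_input: 2,
  source_record_id: "abc", output_hash: { "id" => ["out1"] }
).record_inspect
# "<record #3 (file.mrc #2), source_id:abc output_id:out1>"

position_only_label = SketchContext.new(position: 3).record_inspect
# "<record #3>"

empty_label = SketchContext.new.record_inspect
# "<>"
```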
Mark this record as one that should be skipped, with an optional message.
# File lib/traject/indexer/context.rb, line 35
def skip!(msg = '(no message given)')
  @skipmessage = msg
  @skip = true
end
Should we skip this record?
# File lib/traject/indexer/context.rb, line 41
def skip?
  @skip
end
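One way skip! and skip? might be used together is sketched below with a hypothetical stand-in class. The skipmessage reader, the Hash-based records, and the "missing id" rule are all assumptions made for the example, not traject behavior.

```ruby
# Hypothetical stand-in carrying only the skip logic from the listings above.
class SketchContext
  attr_reader :source_record, :skipmessage

  def initialize(source_record)
    @source_record = source_record
    @skip = false
  end

  def skip!(msg = '(no message given)')
    @skipmessage = msg
    @skip = true
  end

  def skip?
    @skip
  end
end

# Made-up records: plain hashes, one of them lacking an "id".
records  = [{ "id" => "a1" }, { "title" => "no id here" }, { "id" => "c3" }]
contexts = records.map { |record| SketchContext.new(record) }

# Flag records without an id instead of raising.
contexts.each do |context|
  context.skip!("missing id") unless context.source_record.key?("id")
end

kept_ids         = contexts.reject(&:skip?).map { |context| context.source_record["id"] }
skipped_messages = contexts.select(&:skip?).map(&:skipmessage)
```

Flagging the context rather than raising lets later stages decide what to do with a skipped record, for example logging the message and moving on.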
Useful for describing a record in a log or, especially, an error message. It may be useful to combine with position in output messages, since this method may return nothing when no information on the record id is available.
Returns the id of source_record, if we can get it from a source_record_id_proc. As the code below shows, only the source id is returned here; record_inspect is the method that combines it with the "id" from output_hash.
# File lib/traject/indexer/context.rb, line 53
def source_record_id
  source_record_id_proc && source_record_id_proc.call(source_record)
end
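A short sketch of how a source_record_id_proc could plug in, using a hypothetical stand-in class. The Hash-shaped source record and its "001" key are assumptions for illustration; a real proc would extract the id however the source record type requires.

```ruby
# Hypothetical stand-in with only the source_record_id logic from the listing.
class SketchContext
  attr_accessor :source_record, :source_record_id_proc

  def initialize(source_record:, source_record_id_proc: nil)
    self.source_record         = source_record
    self.source_record_id_proc = source_record_id_proc
  end

  def source_record_id
    source_record_id_proc && source_record_id_proc.call(source_record)
  end
end

# Made-up extraction rule: read the id from the "001" key of a Hash record.
id_from_hash = ->(record) { record["001"] }
record       = { "001" => "ocm123" }

with_proc    = SketchContext.new(source_record: record, source_record_id_proc: id_from_hash)
without_proc = SketchContext.new(source_record: record)

found_id   = with_proc.source_record_id    # "ocm123"
missing_id = without_proc.source_record_id # nil, since no proc was given
```

Callers that build log messages should therefore be prepared for a nil id.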