class Traject::MarcExtractor::Spec
Constants
- CONTROLFIELD_PATTERN
- DATAFIELD_PATTERN
Converts a string MARC spec like “008:245abc:700a” to a hash used internally to represent the specification. The argument may also be an array of such strings. See the comments at the head of the class for documentation of the string specification format.
## Return value
The hash returned is keyed by tag, and has as values an array of zero or more MarcExtractor::Spec objects representing the specified extraction operations for that tag. The value is an array because you can specify multiple extractions on the same tag: for instance “245a:245abc”.
See tests for more examples.
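To make the shape of the return value concrete, here is a minimal standalone sketch that does not load traject. It groups the parts of a spec string by tag, assuming purely for illustration that the tag is the first three characters of each part. The real method matches each part against DATAFIELD_PATTERN or CONTROLFIELD_PATTERN and stores Spec objects rather than strings. The name `group_by_tag` is hypothetical.

```ruby
# Hypothetical helper, not part of traject: it illustrates only the
# "hash keyed by tag, array of entries per tag" shape of the result.
def group_by_tag(spec_string)
  grouped = {}
  # Same split as hash_from_string: colons with optional whitespace.
  spec_string.split(/\s*:\s*/).each do |part|
    tag = part[0, 3] # assumption: the tag is the first three characters
    (grouped[tag] ||= []) << part
  end
  grouped
end

grouped = group_by_tag("008:245a:245abc:700a")
p grouped
```

In this sketch the "245" key ends up with two entries, one per extraction requested for that tag, while "008" and "700" each get a single-element array.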
Attributes
Public Class Methods
Create a new controlfield spec
    # File lib/traject/marc_extractor_spec.rb, line 218
    def self.create_controlfield_spec(tag, byte1, byte2)
      spec = Spec.new(:tag => tag.freeze)
      spec.set_bytes(byte1.freeze, byte2.freeze)
      spec
    end
Create a new datafield spec. Most of the logic about how to deal with special characters is built into the Spec class.
    # File lib/traject/marc_extractor_spec.rb, line 204
    def self.create_datafield_spec(tag, ind1, ind2, subfields)
      spec = Spec.new(:tag => tag)
      spec.indicator1 = ind1.freeze
      spec.indicator2 = ind2.freeze
      if subfields and !subfields.empty?
        spec.subfields = subfields.split('')
      end
      spec
    end
    # File lib/traject/marc_extractor_spec.rb, line 168
    def self.hash_from_string(spec_string)
      # plain hash with no default; per-tag arrays are created on demand below
      hash = Hash.new

      # Split the string(s) given on colon
      spec_strings = spec_string.is_a?(Array) ? spec_string.map { |s| s.split(/\s*:\s*/) }.flatten : spec_string.split(/\s*:\s*/)

      spec_strings.each do |part|
        if m = DATAFIELD_PATTERN.match(part)
          tag, ind1, ind2, subfields = m[1], m[3], m[4], m[5]

          spec = create_datafield_spec(tag, ind1, ind2, subfields)

          hash[spec.tag] ||= []
          hash[spec.tag] << spec
        elsif m = CONTROLFIELD_PATTERN.match(part)
          tag, byte1, byte2 = m[1], m[3], m[5]

          spec = create_controlfield_spec(tag, byte1, byte2)

          hash[spec.tag] ||= []
          hash[spec.tag] << spec
        else
          raise ArgumentError.new("Unrecognized marc extract specification: #{part}")
        end
      end

      return hash
    end
Allow use of a hash to initialize. Should ditch this and use optional keyword args once folks move to Ruby 2.x syntax.
    # File lib/traject/marc_extractor_spec.rb, line 77
    def initialize(hash = nil)
      if hash
        hash.each_pair do |key, value|
          self.send("#{key}=", value)
        end
      end
    end
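The hash-based initializer can be sketched on its own as follows. `ExampleSpec` is a hypothetical stand-in class with two plain attributes, not traject's Spec; it only shows how each hash key is routed to the writer method of the same name.

```ruby
# Hypothetical stand-in class, not traject's Spec: shows the
# hash-to-setter initialization pattern described above.
class ExampleSpec
  attr_accessor :tag, :subfields

  def initialize(hash = nil)
    return unless hash
    # Each key is dispatched to the matching "key=" writer, so custom
    # setters (not just plain attributes) run during construction.
    hash.each_pair { |key, value| send("#{key}=", value) }
  end
end

spec  = ExampleSpec.new(:tag => "245", :subfields => %w[a b])
empty = ExampleSpec.new # no hash given: every attribute stays nil
```

One consequence of dispatching through `send` is that a key with no matching writer raises NoMethodError instead of being silently ignored.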
Public Instance Methods
Simple equality definition
    # File lib/traject/marc_extractor_spec.rb, line 138
    def ==(spec)
      return false unless spec.kind_of?(Spec)

      return (self.tag == spec.tag) &&
        (self.subfields == spec.subfields) &&
        (self.indicator1 == spec.indicator1) &&
        (self.indicator2 == spec.indicator2) &&
        (self.bytes == spec.bytes)
    end

    # File lib/traject/marc_extractor_spec.rb, line 104
    def byte1=(byte1)
      @byte1 = byte1.to_i if byte1
      set_bytes(@byte1, @byte2)
    end

    # File lib/traject/marc_extractor_spec.rb, line 109
    def byte2=(byte2)
      @byte2 = byte2.to_i if byte2
      set_bytes(@byte1, @byte2)
    end
Pass in a string subfield code like 'a'; does this spec include it?
    # File lib/traject/marc_extractor_spec.rb, line 132
    def includes_subfield_code?(code)
      # subfields nil means include them all
      self.subfields.nil? || self.subfields.include?(code)
    end
    # File lib/traject/marc_extractor_spec.rb, line 96
    def indicator1=(ind1)
      ind1 == '*' ? @indicator1 = nil : @indicator1 = ind1.freeze
    end

    # File lib/traject/marc_extractor_spec.rb, line 100
    def indicator2=(ind2)
      ind2 == '*' ? @indicator2 = nil : @indicator2 = ind2.freeze
    end
Should the extracted subfields be joined, if we have a separator?
* '630' no subfields specified => join all subfields
* '630abc' multiple subfields specified => join all subfields
* '633a' one subfield => do not join, return one value for each $a in the field
* '633aa' one subfield, doubled => do join after all, will return a single string joining all the values of all the $a's.

The last case is handled implicitly at the moment: subfields == ['a', 'a'] has two elements, so it counts as joinable.
    # File lib/traject/marc_extractor_spec.rb, line 92
    def joinable?
      (self.subfields.nil? || self.subfields.size != 1)
    end
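The four cases can be checked with a small standalone sketch that restates the rule without loading traject. The method name `joinable_subfields?` is hypothetical, and the subfield lists are written out by hand in the form create_datafield_spec would store them: nil when no subfields are given, otherwise one element per character.

```ruby
# Standalone restatement of the rule in joinable?; not traject code.
def joinable_subfields?(subfields)
  subfields.nil? || subfields.size != 1
end

# Spec string => the subfield list stored for it.
cases = {
  "630"    => nil,       # no subfields specified
  "630abc" => %w[a b c], # several subfields
  "633a"   => %w[a],     # exactly one subfield
  "633aa"  => %w[a a],   # one subfield, doubled
}

results = cases.transform_values { |subfields| joinable_subfields?(subfields) }
results.each { |spec, joinable| puts "#{spec}: #{joinable}" }
```

Only "633a" comes out as not joinable. The doubled "633aa" is joinable because the rule looks at the length of the list, not at the number of distinct codes.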
Pass in a MARC field; do its indicators match the indicators in this spec? nil indicators in the spec mean we don't care, everything matches.
    # File lib/traject/marc_extractor_spec.rb, line 125
    def matches_indicators?(field)
      return (indicator1.nil? || indicator1 == field.indicator1) &&
        (indicator2.nil? || indicator2 == field.indicator2)
    end
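A minimal sketch of the indicator test, under the assumption that a field is anything responding to `indicator1` and `indicator2`. `ExampleField` and `indicators_match?` are hypothetical names used only for illustration; the spec's indicators are passed in as plain arguments rather than read from a Spec object.

```ruby
# Hypothetical stand-in for a MARC data field; a real caller would pass
# a field object that responds to indicator1 and indicator2.
ExampleField = Struct.new(:indicator1, :indicator2)

# Same test as matches_indicators?, with the spec's indicators passed in.
def indicators_match?(spec_ind1, spec_ind2, field)
  (spec_ind1.nil? || spec_ind1 == field.indicator1) &&
    (spec_ind2.nil? || spec_ind2 == field.indicator2)
end

field = ExampleField.new("1", "0")

any_indicators  = indicators_match?(nil, nil, field) # nil on both: always matches
first_only      = indicators_match?("1", nil, field) # only indicator1 is checked
second_mismatch = indicators_match?(nil, "4", field) # indicator2 is "0", not "4"
```

Because the indicator1= and indicator2= writers store nil for '*', a '*' in a spec string ends up on the "don't care" branch here.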
    # File lib/traject/marc_extractor_spec.rb, line 114
    def set_bytes(byte1, byte2)
      if byte1 && byte2
        @bytes = ((byte1.to_i)..(byte2.to_i))
      elsif byte1
        @bytes = byte1.to_i
      end
    end
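The two shapes that set_bytes can produce, a Range or a single Integer, are sketched below as a standalone function. `bytes_for` is a hypothetical name, and the 40-character `value` string is made up so that positions are easy to read off. How the extractor actually consumes `@bytes` is not shown on this page; slicing with `String#[]`, which accepts both an Integer and a Range, is an assumption used here for illustration.

```ruby
# Standalone restatement of what set_bytes computes; not traject code.
def bytes_for(byte1, byte2)
  if byte1 && byte2
    (byte1.to_i)..(byte2.to_i) # both given: inclusive range of positions
  elsif byte1
    byte1.to_i                 # only the first given: a single position
  end
end

# Made-up control field value: the character at index n is easy to predict.
value = "0123456789abcdefghijklmnopqrstuvwxyzABCD"

range_bytes  = bytes_for("35", "37") # strings, as captured from a spec string
single_byte  = bytes_for("7", nil)
no_bytes     = bytes_for(nil, nil)   # nothing specified

range_slice  = value[range_bytes] # assumption: bytes are used to slice the value
single_slice = value[single_byte]
```

Unlike this sketch, which returns nil when neither byte is given, the real set_bytes leaves `@bytes` untouched in that case.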