class Stockboy::Readers::CSV

Parse data from CSV into hashes

All standard ::CSV options are respected and passed through

@see http://www.ruby-doc.org/stdlib-2.0.0/libdoc/csv/rdoc/CSV.html#DEFAULT_OPTIONS

Public Class Methods

new(opts={}, &block)

Initialize a new CSV reader

All stdlib ::CSV options are respected. @see ruby-doc.org/stdlib-2.0.0/libdoc/csv/rdoc/CSV.html#method-c-new

@param [Hash] opts

Calls superclass method Stockboy::Reader::new
# File lib/stockboy/readers/csv.rb, line 70
def initialize(opts={}, &block)
  super
  @csv_options = opts.reject {|k,v| !::CSV::DEFAULT_OPTIONS.keys.include?(k) }
  @csv_options[:headers] = @csv_options.fetch(:headers, true)
  @skip_header_rows = opts.fetch(:skip_header_rows, 0)
  @skip_footer_rows = opts.fetch(:skip_footer_rows, 0)
  DSL.new(self).instance_eval(&block) if block_given?
end
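
The option handling in the constructor can be tried outside Stockboy with a minimal sketch. The helper name `split_reader_options` is hypothetical and not part of the library; it only mirrors the filtering shown above using the stdlib `CSV::DEFAULT_OPTIONS` constant.

```ruby
require "csv"

# Hypothetical helper (not part of Stockboy): separates stdlib CSV options
# from the reader-specific row-skipping options, as the constructor does.
def split_reader_options(opts = {})
  # Keep only keys that the stdlib CSV class recognizes
  csv_options = opts.select { |key, _| CSV::DEFAULT_OPTIONS.key?(key) }
  # Headers default to true unless the caller says otherwise
  csv_options[:headers] = csv_options.fetch(:headers, true)
  {
    csv: csv_options,
    skip_header_rows: opts.fetch(:skip_header_rows, 0),
    skip_footer_rows: opts.fetch(:skip_footer_rows, 0)
  }
end

settings = split_reader_options(col_sep: "|", skip_header_rows: 2, unknown: 1)
```

Keys that the stdlib does not recognize (here the made-up `:unknown`) are dropped from the CSV options, while the skip settings are read from the original hash.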

Public Instance Methods

options()

Hash of all CSV-specific options

@!attribute [r] options

@return [Hash]
# File lib/stockboy/readers/csv.rb, line 91
def options
  @csv_options
end

parse(data)
# File lib/stockboy/readers/csv.rb, line 79
def parse(data)
  # Build a new converter list so the stored options are not mutated on each call
  chain = Array(options[:header_converters]) + [proc { |h| h.freeze }]
  opts = options.merge(header_converters: chain)
  ::CSV.parse(sanitize(data), **opts).map(&:to_hash)
end
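
As a rough illustration of what `parse` produces, the following sketch calls the stdlib `CSV.parse` directly with a header converter that freezes each header, then maps rows to hashes. The sample data is made up, and the sanitizing step is skipped here.

```ruby
require "csv"

# Made-up sample input; a real reader would sanitize the data first
data = "name,email\nAda,ada@example.com\n"

# Header converter that freezes each header string, as in the method above
freeze_header = proc { |h| h.freeze }

rows = CSV.parse(data, headers: true, header_converters: [freeze_header])
          .map(&:to_hash)
```

Each element of `rows` is a plain Hash keyed by the header strings from the first line of the input.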

Private Instance Methods

row_end_index(data, skip_rows)
# File lib/stockboy/readers/csv.rb, line 127
def row_end_index(data, skip_rows)
  Array.new(skip_rows).inject(-1) { |i| data.rindex(/$/, i) - 1 }
end

row_start_index(data, skip_rows)
# File lib/stockboy/readers/csv.rb, line 123
def row_start_index(data, skip_rows)
  Array.new(skip_rows).inject(0) { |i| data.index(/$/, i) + 1 }
end
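
The two index helpers are private, so the sketch below copies them as standalone methods to show how they trim leading and trailing lines. The sample string is made up; the input is assumed to have had its trailing newline removed already.

```ruby
# Standalone copies of the two private helpers, for illustration only
def row_start_index(data, skip_rows)
  # Step forward past one end-of-line per skipped row
  Array.new(skip_rows).inject(0) { |i| data.index(/$/, i) + 1 }
end

def row_end_index(data, skip_rows)
  # Step backward past one end-of-line per skipped row
  Array.new(skip_rows).inject(-1) { |i| data.rindex(/$/, i) - 1 }
end

# Made-up input with one junk line at the top and one at the bottom
data = "junk line\nname,age\nAda,36\nTOTAL 1"

from = row_start_index(data, 1)
to = row_end_index(data, 1)
trimmed = data[from..to]
# trimmed is "name,age\nAda,36"
```

With `skip_rows` set to 0 the helpers return 0 and -1, so `data[from..to]` yields the whole string unchanged.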

sanitize(data)

Clean incoming data based on set encoding or best information

1. Assign the given input encoding setting if available
2. Scrub invalid characters for the encoding. (Scrubbing does not apply
   for BINARY input, which is undefined.)
3. Encode to UTF-8 with considerations for undefined input. The main
   issue is control characters that are absent in UTF-8 (and ISO-8859-1)
   but are common printable characters in Windows-1252, so we preserve
   this range as a best guess.
4. Delete null bytes that are inserted as terminators by some "CSV" output
5. Delete leading/trailing garbage lines based on settings
# File lib/stockboy/readers/csv.rb, line 109
def sanitize(data)
  data = data.dup
  data.force_encoding encoding if encoding
  data.scrub!
  data.encode! Encoding::UTF_8, universal_newline: true, fallback: proc { |c|
    c.force_encoding(Encoding::Windows_1252) if (127..159).cover? c.ord
  }
  data.delete! 0.chr
  data.chomp!
  from = row_start_index(data, skip_header_rows)
  to = row_end_index(data, skip_footer_rows)
  data[from..to]
end
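
Steps 1 through 4 can be sketched in isolation as follows. The method name `sketch_sanitize` and the sample bytes are hypothetical, and the sketch leaves out the `fallback` proc and the row trimming, so it is a simplified approximation rather than the reader's actual behavior.

```ruby
# Hypothetical standalone sketch of the encoding clean-up (steps 1-4 only)
def sketch_sanitize(data, encoding = nil)
  data = data.dup
  data.force_encoding(encoding) if encoding              # 1. apply the declared input encoding
  data.scrub!                                            # 2. replace invalid byte sequences
  data.encode!(Encoding::UTF_8, universal_newline: true) # 3. transcode, normalizing CRLF/CR to LF
  data.delete!(0.chr)                                    # 4. drop null-byte terminators
  data.chomp!
  data
end

# Made-up Windows-1252 input: "caf" + 0xE9 (e-acute) + a null byte + CRLF
raw = "caf\xE9\x00\r\n".b
clean = sketch_sanitize(raw, Encoding::Windows_1252)
# clean is "café" encoded as UTF-8
```

Passing the encoding explicitly matters here: without it, the binary input has no defined mapping to UTF-8 for the 0xE9 byte.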