class IOStreams::Record::Reader
Converts each line of an input stream into a Hash, one per row.
Public Class Methods
file(file_name, original_file_name: file_name, delimiter: $/, **args) { |new(io, original_file_name: original_file_name, **args)| ... }
When reading from a file, the line reader stream is added automatically.
# File lib/io_streams/record/reader.rb, line 18
def self.file(file_name, original_file_name: file_name, delimiter: $/, **args)
  IOStreams::Line::Reader.file(file_name, original_file_name: original_file_name, delimiter: delimiter) do |io|
    yield new(io, original_file_name: original_file_name, **args)
  end
end
new(line_reader, cleanse_header: true, original_file_name: nil, **args)
Create a Tabular reader to return the stream as Hash records.
Parse a delimited data source.
Parameters
  format: [Symbol]
    :csv, :hash, :array, :json, :psv, :fixed

  file_name: [String]
    When `:format` is not supplied, the file name can be used to infer the required format.
    Optional. Default: nil

  format_options: [Hash]
    Any specialized, format-specific options. For example, the `:fixed` format requires the file definition.

  columns [Array<String>]
    The header columns when the file does not include a header row.
    Note:
      It is recommended to keep all columns as strings to avoid issues when persisting
      to MongoDB, which converts symbol keys to strings.

  allowed_columns [Array<String>]
    List of columns to allow.
    Default: nil ( Allow all columns )
    Note:
      When supplied, any columns that are rejected are returned in the cleansed columns
      as nil so that they can be ignored during processing.

  required_columns [Array<String>]
    List of columns that must be present, otherwise an Exception is raised.

  skip_unknown [true|false]
    true:
      Skip columns not present in `allowed_columns` by cleansing them to nil.
      #as_hash will skip these additional columns entirely, as if they were not in the file at all.
    false:
      Raises Tabular::InvalidHeader when a column is supplied that is not in the whitelist.
# File lib/io_streams/record/reader.rb, line 60
def initialize(line_reader, cleanse_header: true, original_file_name: nil, **args)
  unless line_reader.respond_to?(:each)
    raise(ArgumentError, "Stream must be a IOStreams::Line::Reader or implement #each")
  end

  @tabular        = IOStreams::Tabular.new(file_name: original_file_name, **args)
  @line_reader    = line_reader
  @cleanse_header = cleanse_header
end
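The `allowed_columns` and `skip_unknown` behavior described in the parameters can be pictured with a small stand-alone sketch. It does not call IOStreams: `cleanse_columns` and `to_record` are hypothetical helpers written for illustration, and `ArgumentError` stands in for `Tabular::InvalidHeader`.

```ruby
# Hypothetical stand-in for the header cleansing described above;
# this is a sketch, not the IOStreams implementation.
def cleanse_columns(columns, allowed_columns: nil, skip_unknown: true)
  return columns if allowed_columns.nil? # nil allows all columns

  columns.map do |column|
    next column if allowed_columns.include?(column)
    # ArgumentError stands in for Tabular::InvalidHeader in this sketch.
    raise ArgumentError, "Unknown column: #{column}" unless skip_unknown

    nil # rejected columns are kept as nil placeholders
  end
end

# Build a Hash record, dropping values whose column was cleansed to nil.
def to_record(columns, values)
  columns.zip(values).reject { |column, _| column.nil? }.to_h
end

header = cleanse_columns(%w[name zip secret], allowed_columns: %w[name zip])
record = to_record(header, %w[Jack 12345 hunter2])
```

Here `header` keeps a nil placeholder in the position of the rejected column, so the values still line up by index, while `record` omits that column entirely.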
stream(line_reader, **args) { |line_reader| ... }
Read a record at a time from a line stream.
Note:
- The supplied stream must already be a line stream, or a stream that responds to :each
# File lib/io_streams/record/reader.rb, line 10
def self.stream(line_reader, **args)
  # Pass-through if already a record reader
  return yield(line_reader) if line_reader.is_a?(self.class)

  yield new(line_reader, **args)
end
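The requirement that the supplied stream responds to `:each` is plain Ruby duck typing. The sketch below uses only the standard library; `line_stream?` is a hypothetical helper that mirrors the `respond_to?(:each)` guard in the constructor and is not part of IOStreams.

```ruby
require "stringio"

# Hypothetical helper mirroring the constructor's guard:
# anything that implements #each qualifies as a line stream.
def line_stream?(object)
  object.respond_to?(:each)
end

lines_from_array = ["name,zip", "Jack,12345"]
lines_from_io    = StringIO.new("name,zip\nJack,12345\n")

array_ok  = line_stream?(lines_from_array) # Array implements #each
io_ok     = line_stream?(lines_from_io)    # StringIO#each yields one line at a time
string_ok = line_stream?("name,zip")       # String has #each_line but no #each
```

An Array of lines and a StringIO both pass the check, while a bare String does not, so raw text would need to be wrapped or split into lines first.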
Public Instance Methods
each() { |record_parse| ... }
# File lib/io_streams/record/reader.rb, line 70
def each
  @line_reader.each do |line|
    if @tabular.header?
      @tabular.parse_header(line)
      @tabular.cleanse_header! if @cleanse_header
    else
      yield @tabular.record_parse(line)
    end
  end
end
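The loop in `each` treats the first line as the header and every later line as a record. A minimal stand-alone sketch of that pattern follows. It assumes comma-delimited lines with no quoting and skips header cleansing; `SketchRecordReader` is a hypothetical class, not part of IOStreams.

```ruby
# Minimal stand-in for the header-then-records loop.
# Assumes comma-delimited lines without quoted fields.
class SketchRecordReader
  def initialize(lines)
    @lines  = lines
    @header = nil
  end

  def each
    @lines.each do |line|
      values = line.chomp.split(",")
      if @header.nil?
        @header = values # first line supplies the column names
      else
        yield @header.zip(values).to_h # later lines become Hash records
      end
    end
  end
end

records = []
reader  = SketchRecordReader.new(["name,zip", "Jack,12345", "Jill,54321"])
reader.each { |record| records << record }
```

The header line itself is never yielded, so `records` holds one Hash per data line, keyed by the column names.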