module NdrImport::Helpers::File::Excel

This mixin adds excel spreadsheet functionality to unified importers. It provides a file reader method and methods to cast raw values appropriately. These methods can be overridden or aliased as required.

Protected Instance Methods

cast_excel_datetime_as_date(raw_value) click to toggle source
# File lib/ndr_import/helpers/file/excel.rb, line 34
def cast_excel_datetime_as_date(raw_value)
  raw_value.to_formatted_s(:db)
end
cast_excel_value(raw_value) click to toggle source
# File lib/ndr_import/helpers/file/excel.rb, line 17
def cast_excel_value(raw_value)
  return raw_value if raw_value.nil?

  if raw_value.is_a?(Date) || raw_value.is_a?(DateTime) || raw_value.is_a?(Time)
    cast_excel_datetime_as_date(raw_value)
  elsif raw_value.is_a?(Float)
    if raw_value.to_f == raw_value.to_i
      # Whole number
      return raw_value.to_i.to_s
    else
      return raw_value.to_f.to_s
    end
  else
    return raw_value.to_s.strip
  end
end
excel_tables(path) { |tablename, excel_rows(workbook, sheet)| ... } click to toggle source

Iterate through the file table by table, yielding each one in turn.

# File lib/ndr_import/helpers/file/excel.rb, line 39
def excel_tables(path)
  return enum_for(:excel_tables, path) unless block_given?

  workbook = load_workbook(path)
  workbook.each_with_pagename do |tablename, sheet|
    yield tablename, excel_rows(workbook, sheet)
  end
end

Private Instance Methods

copy_file(source, dest) click to toggle source

Note that this method can produce insecure calls. All callers must protect their arguments. Arguments:

* source - SafeFile
* dest - SafeFile
# File lib/ndr_import/helpers/file/excel.rb, line 144
def copy_file(source, dest)
  # SECURE: TVB Mon Aug 13 13:53:02 BST 2012 : Secure SafePath will do the security checks
  # before it is converted to string.
  # SafeFile will make sure that the arguments are actually SafePath
  FileUtils.mkdir_p(SafeFile.safepath_to_string(SafeFile.dirname(dest)))
  FileUtils.cp(SafeFile.safepath_to_string(source), SafeFile.safepath_to_string(dest))
end
excel_rows(workbook, sheet, &block) click to toggle source

Iterate through the sheet line by line, yielding each one in turn.

# File lib/ndr_import/helpers/file/excel.rb, line 70
def excel_rows(workbook, sheet, &block)
  return enum_for(:excel_rows, workbook, sheet) unless block

  if workbook.is_a?(Roo::Excelx)
    # FIXME: xlsx_rows(sheet, &block) should produce the same output as xls_rows
    xls_rows(sheet, &block)
  else
    xls_rows(sheet, &block)
  end
end
get_excel_sheets_name(path) click to toggle source
# File lib/ndr_import/helpers/file/excel.rb, line 105
def get_excel_sheets_name(path)
  workbook = load_workbook(path)
  workbook.sheets
end
load_workbook(path) click to toggle source
# File lib/ndr_import/helpers/file/excel.rb, line 110
def load_workbook(path)
  case SafeFile.extname(path).downcase
  when '.xls'
    Roo::Excel.new(SafeFile.safepath_to_string(path))
  when '.xlsx'
    Roo::Excelx.new(SafeFile.safepath_to_string(path))
  else
    fail "Received file path with unexpected extension #{SafeFile.extname(path)}"
  end
rescue Ole::Storage::FormatError => e
  # TODO: Do we need to remove the new_file after using it?

  # try to load the .xls file as an .xlsx file, useful for sources like USOM
  # roo check file extensions in file_type_check (GenericSpreadsheet),
  # so we create a duplicate file in xlsx extension
  if /(.*)\.xls$/.match(path)
    new_file_name = SafeFile.basename(path).gsub(/(.*)\.xls$/, '\1_amend.xlsx')
    new_file_path = SafeFile.dirname(path).join(new_file_name)
    copy_file(path, new_file_path)

    load_workbook(new_file_path)
  else
    raise e.message
  end
rescue => e
  raise ["Unable to read the file '#{path}'", e.message].join('; ')
end
read_excel_file(path, selected_sheet = nil) click to toggle source
# File lib/ndr_import/helpers/file/excel.rb, line 50
def read_excel_file(path, selected_sheet = nil)
  # SECURE: TVB Mon Aug 13 15:30:32 BST 2012 SafeFile.safepath_to_string makes sure that
  # the path is SafePath.

  # Load the workbook
  workbook = load_workbook(path)

  # Choose selected worksheet (if provided and exist) or the first worksheet
  workbook.default_sheet =
    if selected_sheet.nil? || !workbook.sheets.include?(selected_sheet.to_s)
      workbook.sheets.first
    else
      selected_sheet.to_s
    end

  # Read the cells from working worksheet into a nested array
  excel_rows(workbook, workbook).to_a
end
xls_rows(sheet) { |line| ... } click to toggle source

Iterate through an xls sheet line by line, yielding each one in turn.

# File lib/ndr_import/helpers/file/excel.rb, line 82
def xls_rows(sheet)
  return enum_for(:xls_rows, sheet) unless block_given?
  return unless sheet.first_row # NULL Sheet (TODO: perhaps XLSX only, use xlsx_rows...)

  sheet.first_row.upto(sheet.last_row) do |row|
    line = []
    sheet.first_column.upto(sheet.last_column) do |col|
      line << cast_excel_value(sheet.cell(row, col))
    end
    yield line
  end
end
xlsx_rows(sheet) { |map { |cell| cast_excel_value(value) }| ... } click to toggle source

Iterate through an xlsx sheet line by line, yielding each one in turn. This method uses streaming github.com/roo-rb/roo#excel-xlsx-and-xlsm-support

# File lib/ndr_import/helpers/file/excel.rb, line 97
def xlsx_rows(sheet)
  return enum_for(:xlsx_rows, sheet) unless block_given?

  sheet.each_row_streaming(:pad_cells => true) do |row|
    yield row.map { |cell| cast_excel_value(cell.value) }
  end
end