class ParallelParser

Public: Parser which runs in parallel.

Examples

parser = ParallelParser.new(4)
parser.parse(['Betula L.', 'Pardosa moesta'])

Public Class Methods

new(processes_num = nil)

Public: Initialize ParallelParser.

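processes_num - an Integer setting the number of processes (default: nil).

If processes_num is a positive number, it is capped at one less than the
number of CPUs. If it is not set, the number of processes is determined
automatically: the number of CPUs minus two on machines with more than
three CPUs, otherwise one.
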
# File lib/biodiversity/parser.rb, line 47
def initialize(processes_num = nil)
  require 'parallel'
  cpu_num
  if processes_num.to_i > 0
    @processes_num = [processes_num, cpu_num - 1].min
  else
    @processes_num = cpu_num > 3 ? cpu_num - 2 : 1
  end
end
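
The selection rule in initialize can be tried out in isolation. The sketch below is illustrative only: pick_processes_num is a hypothetical helper, not part of the library, and it takes the CPU count as an argument instead of reading it from Parallel.

```ruby
# Hypothetical helper that mirrors the branch logic of initialize.
# cpu_num is passed in explicitly so the rule can be checked on any machine.
def pick_processes_num(processes_num, cpu_num)
  if processes_num.to_i > 0
    # An explicit request is capped at one less than the CPU count.
    [processes_num, cpu_num - 1].min
  else
    # No request: leave two CPUs free on larger machines, else use one process.
    cpu_num > 3 ? cpu_num - 2 : 1
  end
end

pick_processes_num(4, 8)    # => 4
pick_processes_num(16, 8)   # => 7
pick_processes_num(nil, 8)  # => 6
pick_processes_num(nil, 2)  # => 1
```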

Public Instance Methods

cpu_num()

Public: Returns the number of cores/CPUs.

Returns the number of cores/CPUs as an Integer.

# File lib/biodiversity/parser.rb, line 83
def cpu_num
  @cpu_num ||= Parallel.processor_count
end
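
The ||= idiom above caches the count after the first call. A minimal sketch of the same pattern, assuming Etc.nprocessors from the Ruby standard library in place of Parallel.processor_count; CpuInfo and its lookups counter are hypothetical names used only for illustration.

```ruby
require 'etc'

# Hypothetical class showing memoization with ||=.
class CpuInfo
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def cpu_num
    # The begin block runs only while @cpu_num is still nil.
    @cpu_num ||= begin
      @lookups += 1
      Etc.nprocessors
    end
  end
end

info = CpuInfo.new
info.cpu_num
info.cpu_num
info.lookups # => 1
```
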

parse(names_list)

Public: Parses an array of scientific names using several processes in parallel.

Scientific names are deduplicated in the process, so every string is parsed only once.

names_list - an Array of scientific names; each element should be a String.

Examples

parser = ParallelParser.new(4)
parser.parse(['Homo sapiens L.', 'Quercus quercus'])

Returns a Hash with scientific names as keys and parsing results as values.

# File lib/biodiversity/parser.rb, line 73
def parse(names_list)
  parsed = Parallel.map(names_list.uniq, in_processes: @processes_num) do |n|
    [n, parse_process(n)]
  end
  parsed.inject({}) { |res, x| res[x[0]] = x[1]; res }
end
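
The deduplicate, map, and collect-into-a-Hash flow of parse can be sketched without the parallel gem. This is an illustration under stated assumptions: fake_parse is a hypothetical stand-in for the real per-name parser, and a plain map replaces Parallel.map so nothing is forked.

```ruby
# Hypothetical stand-in for the real per-name parser.
fake_parse = ->(name) { { verbatim: name, words: name.split.size } }

names = ['Homo sapiens L.', 'Quercus quercus', 'Homo sapiens L.']

# uniq drops the repeated name, so each distinct string is handled once.
# Plain map stands in for Parallel.map(..., in_processes: n) here.
pairs = names.uniq.map { |n| [n, fake_parse.(n)] }

# Array of [name, result] pairs becomes a Hash keyed by name.
result = pairs.to_h

result.keys # => ["Homo sapiens L.", "Quercus quercus"]
```

Because the result is keyed by name, a name that appears several times in the input maps to a single entry in the returned Hash.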

Private Instance Methods

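parse_process(name)

Private: Parses a single name. If parsing raises an error, returns ScientificNameParser::FAILED_RESULT for that name instead.
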
# File lib/biodiversity/parser.rb, line 88
def parse_process(name)
  p = ScientificNameParser.new
  p.parse(name) rescue ScientificNameParser::FAILED_RESULT.(name)
end
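
The rescue-with-fallback pattern in parse_process can be shown on its own. The sketch below is an assumption-laden illustration: risky_parse, safe_parse, and failed_result are hypothetical names, and the shape of the fallback Hash is invented for the example rather than taken from ScientificNameParser.

```ruby
# Hypothetical fallback builder, playing the role of FAILED_RESULT.
# Like the original, it is a callable invoked with .(name).
failed_result = ->(name) { { verbatim: name, parsed: false } }

# Hypothetical parser that raises on blank input.
def risky_parse(name)
  raise ArgumentError, 'empty name' if name.strip.empty?
  { verbatim: name, parsed: true }
end

# Returns the parse result, or the fallback if a StandardError is raised.
def safe_parse(name, fallback)
  risky_parse(name)
rescue StandardError
  fallback.(name)
end

safe_parse('Betula L.', failed_result) # => {verbatim: "Betula L.", parsed: true}
safe_parse('   ', failed_result)       # => {verbatim: "   ", parsed: false}
```

One design consequence is that a single bad name does not abort the whole batch; it simply yields a failure record for that name.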