class MorMor::FSA
@private
This class and its subclasses contain a loose, simplified port of the whole `morfologik-fsa` package. Original source: github.com/morfologik/morfologik-stemming/tree/master/morfologik-fsa/src/main/java/morfologik/fsa
NB: TBH, I don't always deeply understand what I am doing here. I just ported the Java algorithms statement by statement, then rubyfied them a bit and debugged them side by side with the original package to make sure they produce the same result.
The code contains some of my own comments, with references to the original implementations where appropriate. In the more straightforwardly ported code, the original comments are kept and marked with “OC:”.
Constants
Public Class Methods
read(path)
# File lib/mormor/fsa.rb, line 32
def read(path)
  io = File.open(path, 'rb')
  io.read(4) == '\\fsa' or fail ArgumentError, 'Invalid file header, probably not an FSA.'
  choose_impl(io.getbyte).new(io)
end
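The header check in `read` can be tried without a dictionary file. The sketch below is only an illustration: `fsa_version_byte` is a hypothetical helper, not part of MorMor, and the version byte `5` is an arbitrary value chosen for the demo. It assumes nothing about the file beyond what `read` itself shows: four magic bytes followed by one version byte.

```ruby
require 'stringio'

# Hypothetical helper (not part of MorMor): repeats the header check from
# `read` against any IO-like object and returns the version byte.
def fsa_version_byte(io)
  magic = io.read(4)
  # '\\fsa' in single quotes is the four characters: backslash, f, s, a.
  raise ArgumentError, 'Invalid file header, probably not an FSA.' unless magic == '\\fsa'

  io.getbyte
end

# In-memory stand-in for a file opened in 'rb' mode; the byte 5 is arbitrary.
version = fsa_version_byte(StringIO.new("\\fsa\x05payload".b))

# Anything that does not start with the magic bytes is rejected.
rejected =
  begin
    fsa_version_byte(StringIO.new('not an automaton'))
    false
  rescue ArgumentError
    true
  end
```

Note that `read` as documented never closes the file handle it opens; whether that matters depends on how the chosen implementation consumes `io`.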
Private Class Methods
choose_impl(version_byte)
# File lib/mormor/fsa.rb, line 40
def choose_impl(version_byte)
  VERSIONS
    .fetch(version_byte) { fail ArgumentError, 'Unsupported version byte, probably not FSA' }
    .tap { |name| constants.include?(name.to_sym) or fail ArgumentError, "Unsupported version: #{name}" }
    .then(&method(:const_get))
end
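The lookup chain in `choose_impl` (version byte, then class name, then constant) can be sketched in isolation. Everything below is hypothetical: `DemoFSA`, its `VERSIONS` table, and the `Alpha` class are invented for the demo, since the real `VERSIONS` constant is not shown on this page. The sketch also assumes Ruby 2.6 or later for `then`.

```ruby
module DemoFSA
  # Hypothetical mapping from version byte to implementation name.
  VERSIONS = { 1 => 'Alpha', 2 => 'Beta' }.freeze

  # Only 'Alpha' has an implementation; 'Beta' is known but not ported.
  class Alpha; end

  def self.choose_impl(version_byte)
    VERSIONS
      .fetch(version_byte) { raise ArgumentError, 'Unsupported version byte' }
      .tap { |name| raise ArgumentError, "Unsupported version: #{name}" unless constants.include?(name.to_sym) }
      .then(&method(:const_get))
  end
end

impl = DemoFSA.choose_impl(1) # resolves to the DemoFSA::Alpha class

# A known name without a matching class fails in the `tap` step.
beta_error =
  begin
    DemoFSA.choose_impl(2)
    nil
  rescue ArgumentError => e
    e.message
  end

# An unknown byte fails earlier, in the `fetch` block.
unknown_error =
  begin
    DemoFSA.choose_impl(9)
    nil
  rescue ArgumentError => e
    e.message
  end
```

Splitting the two failure modes this way keeps "byte not recognized" distinct from "format recognized but not implemented", which is presumably why the original separates them.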
Public Instance Methods
each_arc(from:) { |arc| ... }
# File lib/mormor/fsa.rb, line 59
def each_arc(from:)
  return to_enum(__method__, from: from) unless block_given?

  arc = first_arc(from)
  until arc.zero?
    yield arc
    arc = next_arc(arc)
  end
end
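`each_arc` relies on two conventions: arcs are integers with `0` meaning "no arc", and calling the method without a block returns an enumerator. A minimal sketch of both follows, using a hypothetical `ToyArcs` class in which two hashes stand in for the binary arc table. This is not MorMor's storage format, only a way to exercise the loop.

```ruby
# Hypothetical toy, not MorMor's storage format: arcs are positive integers
# and 0 means "no arc".
class ToyArcs
  def initialize(first_arcs, next_arcs)
    @first_arcs = first_arcs # node => id of its first arc
    @next_arcs = next_arcs   # arc  => id of the next sibling arc
  end

  # Missing entries fall back to 0, the "no arc" sentinel.
  def first_arc(node)
    @first_arcs.fetch(node, 0)
  end

  def next_arc(arc)
    @next_arcs.fetch(arc, 0)
  end

  # Same body as the documented method.
  def each_arc(from:)
    return to_enum(__method__, from: from) unless block_given?

    arc = first_arc(from)
    until arc.zero?
      yield arc
      arc = next_arc(arc)
    end
  end
end

toy = ToyArcs.new({ 1 => 10 }, { 10 => 11, 11 => 12 })

arcs = toy.each_arc(from: 1).to_a   # => [10, 11, 12]
empty = toy.each_arc(from: 99).to_a # => []

# The block form visits the same arcs without building an array first.
seen = []
toy.each_arc(from: 1) { |arc| seen << arc }
```

Because the enumerator form is available, callers such as `find_arc` can chain `detect` or other `Enumerable` methods directly onto `each_arc(from: node)`.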
each_sequence(from: root_node, &block)
# File lib/mormor/fsa.rb, line 51
def each_sequence(from: root_node, &block)
  Enumerator.new(self, from).then { |e| block ? e.each(&block) : e }
end
find_arc(node, label)
# File lib/mormor/fsa.rb, line 69
def find_arc(node, label)
  each_arc(from: node).detect { |a| arc_label(a) == label } || 0
end
match(sequence, node = root_node)
Port of FSATraversal.java. The method is left unsplit to keep the original algorithm recognizable, hence the rubocop:disable directives.
# File lib/mormor/fsa.rb, line 75
def match(sequence, node = root_node) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity
  return Match.new(:no) if node.zero?

  sequence.each_with_index do |byte, i|
    a = find_arc(node, byte)

    case
    when a.zero?
      return i.zero? ? Match.new(:no, i, node) : Match.new(:automaton_has_prefix, i, node)
    when i + 1 == sequence.size && final_arc?(a)
      return Match.new(:exact, i, node)
    when terminal_arc?(a)
      return Match.new(:automaton_has_prefix, i + 1, node)
    else
      node = end_node(a)
    end
  end

  Match.new(:sequence_is_a_prefix, 0, node)
end
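To see which result each branch of `match` produces, the method body can be run against a tiny hand-built automaton. The sketch below is an assumption-laden toy: `ToyFSA`, its `ARCS` table, and the `Struct`-based `Match` are hypothetical stand-ins, not MorMor's real classes or binary layout. The automaton accepts only the byte sequence for "ab"; the `match` body itself is copied from the listing above.

```ruby
class ToyFSA
  # Hypothetical result type; the real Match class is not shown on this page.
  Match = Struct.new(:kind, :index, :node)

  # Arc ids are array indexes; index 0 is reserved as the "no arc" sentinel
  # and node 0 as "no node". Path: node 1 --a--> node 2 --b--> (end, final).
  ARCS = [
    nil,
    { from: 1, label: 'a'.ord, to: 2, final: false },
    { from: 2, label: 'b'.ord, to: 0, final: true }
  ].freeze

  def root_node
    1
  end

  def find_arc(node, label)
    ARCS.index { |arc| arc && arc[:from] == node && arc[:label] == label } || 0
  end

  def final_arc?(arc)
    ARCS[arc][:final]
  end

  def terminal_arc?(arc)
    ARCS[arc][:to].zero?
  end

  def end_node(arc)
    ARCS[arc][:to]
  end

  # Body copied from the documented method.
  def match(sequence, node = root_node)
    return Match.new(:no) if node.zero?

    sequence.each_with_index do |byte, i|
      a = find_arc(node, byte)

      case
      when a.zero?
        return i.zero? ? Match.new(:no, i, node) : Match.new(:automaton_has_prefix, i, node)
      when i + 1 == sequence.size && final_arc?(a)
        return Match.new(:exact, i, node)
      when terminal_arc?(a)
        return Match.new(:automaton_has_prefix, i + 1, node)
      else
        node = end_node(a)
      end
    end

    Match.new(:sequence_is_a_prefix, 0, node)
  end
end

fsa = ToyFSA.new
exact    = fsa.match('ab'.bytes).kind  # whole input consumed on a final arc
prefix   = fsa.match('a'.bytes).kind   # input ends before the automaton does
overlong = fsa.match('abc'.bytes).kind # automaton ends before the input does
miss     = fsa.match('x'.bytes).kind   # first byte has no arc
```

In this toy, `:sequence_is_a_prefix` means the input ran out while the automaton could still continue, and `:automaton_has_prefix` means the automaton stopped (or had no matching arc) partway through the input.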
next_arc(arc)
# File lib/mormor/fsa.rb, line 55
def next_arc(arc)
  last_arc?(arc) ? 0 : skip_arc(arc)
end