class Pascoale::SyllableSeparator

Constants

CODA
KERNEL

The concept of “rhyme” (nucleus plus coda) does not help in this algorithm. It seems to make no sense for syllable separation in Portuguese, at least when the separation is done by an algorithm.

NUCLEUS
NUCLEUS_RULES

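The biggest problems are “sinéreses” (synaeresis, two adjacent vowels pronounced in one syllable) and “diéreses” (diaeresis, two adjacent vowels split across syllables). It seems that some consonants in the next syllable, such as “n” and “m”, can cause them.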

ONSET

Public Class Methods

new(word)
# File lib/pascoale/syllable_separator.rb, line 32
def initialize(word)
  @word = word
end

Public Instance Methods

separate()
# File lib/pascoale/syllable_separator.rb, line 36
def separate
  rest = @word
  result = []
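  # Peel one syllable off the front of rest on each pass.
  while rest && rest.size > 0
    # $1 is this syllable's kernel. The optional group captures either the
    # next kernel ($2, $4, $6), a coda before it ($3, $5), or a coda with no
    # kernel after it ($7, $8). $9 is whatever text remains.
    if rest =~ /^(#{KERNEL})(?:(#{KERNEL})|(#{CODA})(#{KERNEL})|(#{CODA}#{CODA})(#{KERNEL})|(#{CODA}#{CODA})|(#{CODA}))?(.*)$/
      # Kernel plus any coda forms the syllable; a captured next kernel is
      # pushed back in front of the remaining text.
      result << $1 + $3.to_s + $5.to_s + $7.to_s + $8.to_s
      rest = $2.to_s + $4.to_s + $6.to_s + $9.to_s
      # Special case! Hate them :(
      # Word-initial consonant clusters: Pneu, Gnomo, Mnemônica, Pseudônimo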
    elsif result.size == 0
      if rest =~ /^([#{CONSONANTS}]#{KERNEL})(?:(#{KERNEL})|(#{CODA})(#{KERNEL})|(#{CODA}#{CODA})(#{KERNEL})|(#{CODA}#{CODA})|(#{CODA}))?(.*)$/
        result << $1 + $3.to_s + $5.to_s + $7.to_s + $8.to_s
        rest = $2.to_s + $4.to_s + $6.to_s + $9.to_s
      else
        raise %(Cannot separate "#{@word}". No rule match next syllable at "#{result.join('')}|>#{rest}")
      end
    else
      raise %(Cannot separate "#{@word}". No rule match next syllable at "#{result.join('')}|>#{rest}")
    end
  end
  result
end
Also aliased as: separated
separated()
Alias for: separate
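
The peel-one-syllable-per-pass loop in separate can be tried out in isolation with the sketch below. It is only an illustration under stated assumptions: the TOY_* patterns and toy_separate are hypothetical names invented here, not the gem's real KERNEL and CODA constants, and they ignore diphthongs, accented vowels, double codas, and the word-initial cluster special case. A kernel is assumed to be an optional onset followed by one vowel.

```ruby
# Hypothetical, heavily simplified patterns (NOT the gem's real constants):
# no diphthongs, no accented vowels, no double codas.
TOY_ONSET   = '(?:[bcdfgptv][lr]|[cln]h|[b-df-hj-np-tv-z])'
TOY_NUCLEUS = '[aeiou]'
TOY_KERNEL  = "(?:#{TOY_ONSET})?#{TOY_NUCLEUS}"
TOY_CODA    = '[lmnrs]'

# Same shape as the pattern in separate, with fewer alternatives:
# kernel, then optionally (next kernel | coda + next kernel | lone coda), then the tail.
TOY_SYLLABLE = /\A(#{TOY_KERNEL})(?:(#{TOY_KERNEL})|(#{TOY_CODA})(#{TOY_KERNEL})|(#{TOY_CODA}))?(.*)\z/

def toy_separate(word)
  rest = word
  result = []
  until rest.empty?
    match = TOY_SYLLABLE.match(rest)
    raise ArgumentError, %(No rule matches at "#{result.join}|>#{rest}") unless match

    kernel, next_kernel_a, coda, next_kernel_b, lone_coda, tail = match.captures
    # The syllable keeps its kernel plus whichever coda matched (nil joins as "").
    result << [kernel, coda, lone_coda].join
    # A look-ahead kernel is pushed back so that it starts the next syllable.
    rest = [next_kernel_a, next_kernel_b, tail].join
  end
  result
end

toy_separate('casa')  # => ["ca", "sa"]
toy_separate('carta') # => ["car", "ta"]
toy_separate('abril') # => ["a", "bril"]
```

Because the "next kernel" alternative is tried before the coda alternatives, a consonant that could serve as either coda or onset is given to the following syllable (“ca-sa”, not “cas-a”). Words such as “pneu” raise ArgumentError in this sketch, since it has no counterpart to the special-case branch in separate.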