class Pascoale::SyllableSeparator
Constants
- CODA
- KERNEL
The concept of “rhyme” does not help in this algorithm. It does not seem to make sense for syllable separation in Portuguese, at least when the separation is done by an algorithm.
- NUCLEUS
- NUCLEUS_RULES
The biggest problems are “sinéreses” and “diéreses” (synaeresis and diaeresis). It seems that some consonants in the next syllable, such as “n” and “m”, can cause them.
- ONSET
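The note on NUCLEUS_RULES can be illustrated with a small sketch. It is not the library's actual rule set: the names `HIATUS_TRIGGER`, `AI_NUCLEUS` and `leading_nucleus` are hypothetical, only the vowel pair “ai” is handled, and the trigger (a following “nh”, or an “m”/“n” not followed by a vowel) is an assumption based on words such as “rainha” (ra-i-nha) and “ainda” (a-in-da).

```ruby
# Hypothetical trigger: "nh", or "m"/"n" that is not followed by a vowel.
HIATUS_TRIGGER = "(?:nh|[mn](?![aeiou]))"

# "ai" is kept together as one nucleus unless the trigger follows the "i".
AI_NUCLEUS = /\Aa(?:i(?!#{HIATUS_TRIGGER}))?/

# Returns the nucleus found at the start of the fragment, or nil.
def leading_nucleus(fragment)
  fragment[AI_NUCLEUS]
end

leading_nucleus("aixa")  # => "ai"  (as in "baixa": bai-xa)
leading_nucleus("ainha") # => "a"   (as in "rainha": ra-i-nha)
leading_nucleus("ainda") # => "a"   ("ainda": a-in-da)
```

A negative lookahead keeps the decision local to the nucleus pattern, so the consonant itself is not consumed and remains available for the next syllable.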
Public Class Methods
new(word)
  # File lib/pascoale/syllable_separator.rb, line 32
  def initialize(word)
    @word = word
  end
Public Instance Methods
separate()
  # File lib/pascoale/syllable_separator.rb, line 36
  def separate
    rest = @word
    result = []
    while rest && rest.size > 0
      if rest =~ /^(#{KERNEL})(?:(#{KERNEL})|(#{CODA})(#{KERNEL})|(#{CODA}#{CODA})(#{KERNEL})|(#{CODA}#{CODA})|(#{CODA}))?(.*)$/
        result << $1 + $3.to_s + $5.to_s + $7.to_s + $8.to_s
        rest = $2.to_s + $4.to_s + $6.to_s + $9.to_s
      # Special case! Hate them :(
      # Pneu, Gnomo, Mnemônica, Pseudônimo
      elsif result.size == 0
        if rest =~ /^([#{CONSONANTS}]#{KERNEL})(?:(#{KERNEL})|(#{CODA})(#{KERNEL})|(#{CODA}#{CODA})(#{KERNEL})|(#{CODA}#{CODA})|(#{CODA}))?(.*)$/
          result << $1 + $3.to_s + $5.to_s + $7.to_s + $8.to_s
          rest = $2.to_s + $4.to_s + $6.to_s + $9.to_s
        else
          raise %(Cannot separate "#{@word}". No rule match next syllable at "#{result.join('')}|>#{rest}")
        end
      else
        raise %(Cannot separate "#{@word}". No rule match next syllable at "#{result.join('')}|>#{rest}")
      end
    end
    result
  end
Also aliased as: separated
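The matching loop in `separate` can be tried in isolation with the sketch below. The constant values are simplified, hypothetical stand-ins (unaccented vowels only, no diphthongs), not the definitions from the library, and `TAIL`, `SYLLABLE`, `ODD_ONSET` and `sketch_separate` are names introduced here for illustration. The capture groups keep the roles they have in `separate`: groups 1, 3, 5, 7 and 8 form the current syllable, while groups 2, 4, 6 and 9 are carried over to the next iteration.

```ruby
# Simplified stand-ins for the library's constants (assumptions).
# None contains a capturing group, so group numbering is unaffected.
CONSONANTS = "bcdfghjklmnpqrstvwxz"
ONSET      = "(?:[bcdfgptv][rl]|[cln]h|[#{CONSONANTS}])"
NUCLEUS    = "[aeiou]"
CODA       = "[lmnrsz]"
KERNEL     = "(?:#{ONSET})?#{NUCLEUS}"

# What may follow the first kernel: the next kernel, one or two codas
# followed by the next kernel, or trailing codas; then the remainder.
TAIL = "(?:(#{KERNEL})|(#{CODA})(#{KERNEL})|(#{CODA}#{CODA})(#{KERNEL})" \
       "|(#{CODA}#{CODA})|(#{CODA}))?(.*)"

SYLLABLE  = /^(#{KERNEL})#{TAIL}$/
# Word-initial special case (e.g. "gnomo"): one extra leading consonant.
ODD_ONSET = /^([#{CONSONANTS}]#{KERNEL})#{TAIL}$/

def sketch_separate(word)
  rest = word
  result = []
  until rest.empty?
    match = SYLLABLE.match(rest)
    # The special case is only tried for the first syllable.
    match ||= ODD_ONSET.match(rest) if result.empty?
    unless match
      raise ArgumentError,
            %(Cannot separate "#{word}" at "#{result.join}|>#{rest}")
    end
    # Kernel plus any coda that belongs to this syllable.
    result << match[1] + match[3].to_s + match[5].to_s + match[7].to_s + match[8].to_s
    # The next kernel (if one was consumed) is put back in front of the remainder.
    rest = match[2].to_s + match[4].to_s + match[6].to_s + match[9].to_s
  end
  result
end

sketch_separate("porta")     # => ["por", "ta"]
sketch_separate("gnomo")     # => ["gno", "mo"]
sketch_separate("perspicaz") # => ["pers", "pi", "caz"]
```

Matching the next kernel before deciding on the coda is what lets a consonant such as “s” open the following syllable (“ca-sa”) instead of closing the current one.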