class Slaw::ActGenerator

Base class for generating Act documents

Attributes

builder[RW]
Slaw::Parse::Builder

builder used by the generator

parser[RW]
Treetop::Runtime::CompiledParser

compiled parser

Public Class Methods

new(grammar)
# File lib/slaw/generator.rb, line 15
def initialize(grammar)
  @grammar = grammar

  @parser = build_parser
  @builder = Slaw::Parse::Builder.new(parser: @parser)
  @parser = @builder.parser
  @cleanser = Slaw::Parse::Cleanser.new
end

Public Instance Methods

build_parser()
# File lib/slaw/generator.rb, line 24
def build_parser
  unless @@parsers[@grammar]
    # load the grammar with polyglot and treetop
    # this will ensure the class below is available
    # see: http://cjheath.github.io/treetop/using_in_ruby.html
    require "slaw/grammars/#{@grammar}/act"
    grammar_class = "Slaw::Grammars::#{@grammar.upcase}::ActParser"
    # resolve the class by name without evaluating arbitrary code
    @@parsers[@grammar] = Object.const_get(grammar_class)
  end

  @parser = @@parsers[@grammar].new
  @parser.root = :act

  @parser
end
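The caching pattern used by build_parser can be looked at in isolation: the parser class is resolved once per grammar name, but every call returns a fresh parser instance. What follows is a minimal sketch and not Slaw's own code. `DemoGrammars`, `ParserCache` and the plain `ActParser` class are hypothetical stand-ins for a Treetop-compiled parser.

```ruby
# Hypothetical stand-in for a compiled grammar class; not part of Slaw.
module DemoGrammars
  module ZA
    class ActParser
      attr_accessor :root
    end
  end
end

# Caches the parser *class* per grammar name and hands out a new
# parser instance on every call.
class ParserCache
  @classes = {}

  class << self
    attr_reader :classes

    def build(grammar)
      # Resolve the constant only the first time this grammar is requested.
      classes[grammar] ||= Object.const_get("DemoGrammars::#{grammar.upcase}::ActParser")

      parser = classes[grammar].new
      parser.root = :act
      parser
    end
  end
end

first = ParserCache.build("za")
second = ParserCache.build("za")
```

Here `first` and `second` are distinct objects of the same cached class, so per-parse state on one instance does not leak into the other.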
cleanup(text)

Run basic cleanup on text, such as ensuring clean newlines and removing tabs. This is always automatically done before processing.

# File lib/slaw/generator.rb, line 52
def cleanup(text)
  @cleanser.cleanup(text)
end
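To make "clean newlines and removing tabs" concrete, here is a rough sketch of that kind of normalisation. It is an assumption about the general idea only: `basic_cleanup` is a hypothetical helper, and Slaw::Parse::Cleanser may do more than this or differ in detail.

```ruby
# Hypothetical helper for illustration; not Slaw::Parse::Cleanser itself.
def basic_cleanup(text)
  # Turn Windows (\r\n) and bare \r line endings into \n.
  text = text.gsub(/\r\n?/, "\n")
  # Replace each tab with a single space.
  text.gsub("\t", " ")
end

sample = "1. Title\r\n\tSome text\rMore text"
cleaned = basic_cleanup(sample)
```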
generate_from_text(text)

Generate a Slaw::Act instance from plain text.

@param text [String] plain text

@return [Nokogiri::XML::Document] the resulting XML

# File lib/slaw/generator.rb, line 45
def generate_from_text(text)
  @builder.parse_and_process_text(cleanup(text))
end
guess_section_number_after_title(text)

Try to determine if section numbers come after titles, rather than before.

For example:

Section title
1. Section content

versus

1. Section title
Section content
# File lib/slaw/generator.rb, line 75
def guess_section_number_after_title(text)
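  # title lines (a word of 4+ characters) directly followed by a numbered line,
  # i.e. the title comes before the number
  before = text.scan(/^\w{4,}[^\n]+\n\d+\. /).length
  # blank lines followed by a number and then a title on the same line
  after  = text.scan(/^\s*\n\d+\. \w{4,}/).length

  # require a clear majority before deciding that numbers follow titles
  before > after * 1.25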
end
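The heuristic is easier to follow on concrete input. The sketch below copies the two regular expressions into a standalone method so it runs without Slaw. The method name `number_after_title?` and both sample texts are made up for illustration.

```ruby
# Standalone copy of the heuristic, for illustration only.
def number_after_title?(text)
  # Title line immediately followed by a line starting with "N. ".
  title_then_number = text.scan(/^\w{4,}[^\n]+\n\d+\. /).length
  # Blank line, then "N. Title" on a single line.
  number_then_title = text.scan(/^\s*\n\d+\. \w{4,}/).length

  title_then_number > number_then_title * 1.25
end

title_first = <<~TEXT
  Definitions
  1. In this Act, unless the context indicates otherwise.

  Application
  2. This Act applies to all persons.
TEXT

number_first = <<~TEXT
  Preamble text.

  1. Definitions
  In this Act, unless the context indicates otherwise.

  2. Application
  This Act applies to all persons.
TEXT

puts number_after_title?(title_first)   # prints "true"
puts number_after_title?(number_first)  # prints "false"
```

The 1.25 factor means a document with a mix of both layouts is only treated as "number after title" when that layout is noticeably more common.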
reformat(text)

Reformat some common errors in text to help make parsing more successful. Optional, and only recommended when processing a document for the first time.

# File lib/slaw/generator.rb, line 59
def reformat(text)
  @cleanser.reformat(text)
end
text_from_act(doc)

Transform an Akoma Ntoso XML document back into a plain-text version that can be re-parsed into XML with no loss of structure.

# File lib/slaw/generator.rb, line 84
def text_from_act(doc)
  # look on the load path for an XSL file for this grammar
  filename = "/slaw/grammars/#{@grammar}/act_text.xsl"

  if dir = $LOAD_PATH.find { |p| File.exist?(p + filename) }
    xslt = Nokogiri::XSLT(File.read(dir + filename))
    xslt.apply_to(doc).gsub(/^( *\n){2,}/, "\n")
  else
    raise "Unable to find text XSL for grammar #{@grammar}: #{filename}"
  end
end
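The final `gsub` in text_from_act collapses runs of two or more blank (or space-only) lines into a single blank line, while leaving single blank lines alone. A small sketch of just that step, using the same regular expression; `collapse_blank_lines` and the sample string are hypothetical and stand in for the XSLT output.

```ruby
# Same substitution as in text_from_act, wrapped in a hypothetical helper.
def collapse_blank_lines(text)
  text.gsub(/^( *\n){2,}/, "\n")
end

# Three consecutive blank-ish lines after "PART 1", one after "Section text".
raw = "PART 1\n\n   \n\nSection text\n\nMore text\n"
collapsed = collapse_blank_lines(raw)
```

After the call, `collapsed` has exactly one blank line between each pair of text lines.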