class Sanzang::Translator

Translator is the main class for performing text translations with Sanzang. A Translator uses a TranslationTable, which is passed to it at creation. The Translator can then apply the table's translation rules, generate full translation listings, and perform translations by reading from and writing to IO objects.

Attributes

table[R]

The TranslationTable used by the Translator

Public Class Methods

new(translation_table)

Creates a new Translator object with the given TranslationTable. The TranslationTable stores rules for translation, while the Translator is the worker who applies these rules and can create translation listings.

# File lib/sanzang/translator.rb, line 32
def initialize(translation_table)
  @table = translation_table
end

Public Instance Methods

gen_listing(source_text, pos = 1)

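Generate a translation listing as a String, in which the output of Translator#translate is collated line by line and numbered for reference purposes. Each entry is labeled [line.column], with line numbering starting at pos. The listing is returned in the original encoding of source_text; note that source_text itself is converted to UTF-8 in place. This is the normal text listing output of the Sanzang Translator.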

# File lib/sanzang/translator.rb, line 71
def gen_listing(source_text, pos = 1)
  source_encoding = source_text.encoding
  source_text.encode!(Encoding::UTF_8)

  newline = source_text.include?("\r") ? "\r\n" : "\n"
  texts = translate(source_text).collect { |t| t.split(newline) }

  listing = ""
  texts[0].length.times do |line_i|
    texts.length.times do |col_i|
      listing << "[#{pos + line_i}.#{col_i + 1}] #{texts[col_i][line_i]}" \
              << newline
    end
    listing << newline
  end
  listing.encode!(source_encoding)
end
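
The collation step can be sketched on its own, without the gem installed. This is a minimal illustration and not the library's code path: collate is a hypothetical helper that mirrors the nested loop above, and the sample strings are hand-written stand-ins for what translate would return.

```ruby
# Hypothetical helper: collate parallel texts into a numbered listing,
# following the nested loop in gen_listing. `texts` stands in for the
# Array returned by translate (source text first, then each column).
def collate(texts, pos = 1, newline = "\n")
  columns = texts.map { |t| t.split(newline) }
  listing = +""
  columns[0].length.times do |line_i|
    columns.length.times do |col_i|
      listing << "[#{pos + line_i}.#{col_i + 1}] #{columns[col_i][line_i]}"
      listing << newline
    end
    listing << newline # blank line after each group of parallel lines
  end
  listing
end

# Hand-written sample data. The leading space on translated lines comes
# from the separator handling in translate.
texts = [
  "三藏\n法師\n",
  " sānzàng\n fǎshī\n",
  " tripitaka\n dharma master\n"
]

listing = collate(texts, 101)
puts listing
# [101.1] 三藏
# [101.2]  sānzàng
# [101.3]  tripitaka
#
# [102.1] 法師
# [102.2]  fǎshī
# [102.3]  dharma master
```

Passing pos = 101 is how a caller that processes a long text in pieces can keep the labels aligned with the line numbers of the original file.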

text_vocab(source_text)

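Return an Array of all translation rules used by a particular text. These records represent the vocabulary used by the text. Each matched term is masked in a working copy of the text as it is found, so a later rule cannot match text already claimed by an earlier one; the source text itself is not modified.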

# File lib/sanzang/translator.rb, line 39
def text_vocab(source_text)
  text_copy = String.new(source_text)
  @table.records.select do |r|
    text_copy.include?(r[0]) ? text_copy.gsub!(r[0], "\x1F") : false
  end
end
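
The effect of the masking can be seen in a small standalone sketch. The rule records below are made up for illustration, their ordering (longer terms first) is an assumption, and vocab_for is a hypothetical helper that takes the records as an argument instead of reading them from a TranslationTable.

```ruby
# Hypothetical rule records: [source term, column 1, column 2].
# Listing longer terms first is an assumption made for this example.
records = [
  ["三藏法師", "sānzàng fǎshī", "tripitaka master"],
  ["三藏",     "sānzàng",       "tripitaka"],
  ["法師",     "fǎshī",         "dharma master"],
  ["玄奘",     "xuánzàng",      "xuanzang"]
]

# Same selection logic as text_vocab, with the records passed in directly.
def vocab_for(records, source_text)
  text_copy = String.new(source_text)
  records.select do |r|
    # gsub! returns the (truthy) string when it replaces something, so the
    # record is kept. The matched text is replaced by the unit separator
    # (0x1F) and can no longer be matched by later rules.
    text_copy.include?(r[0]) ? text_copy.gsub!(r[0], "\x1F") : false
  end
end

source = "三藏法師玄奘"
vocab = vocab_for(records, source)
puts vocab.map { |r| r[0] }.join(", ")
# 三藏法師, 玄奘
```

Here the rules for 三藏 and 法師 are not selected, because the text they would match was already masked by the longer rule that preceded them.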

translate(source_text)

Use the TranslationTable of the Translator to create translations for each destination language column of the translation table. The result is a simple Array of String objects: the first element is the source text itself, and each following String corresponds to a destination language column in the TranslationTable.

# File lib/sanzang/translator.rb, line 51
def translate(source_text)
  text_collection = [source_text]
  vocab_terms = text_vocab(source_text)
  1.upto(@table.width - 1) do |column_i|
    translation = String.new(source_text)
    translation.delete!("\x1F")
    vocab_terms.each do |term|
      translation.gsub!(term[0], "\x1F#{term[column_i]}\x1F")
    end
    translation.gsub!(/\x1F(?=[\r\n])/, "")
    translation.gsub!(/\x1F+/, " ")
    text_collection << translation
  end
  text_collection
end
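
The substitution and separator handling can be traced with a standalone sketch. It assumes a stand-in table object (a Struct with only records and width, not the real TranslationTable) and made-up rules; translate_with is a hypothetical condensed copy of the logic above that takes the table as an argument.

```ruby
# Stand-in for TranslationTable (hypothetical): only the two members
# that the translate logic relies on are modeled.
fake_table = Struct.new(:records, :width)
table = fake_table.new(
  [["三藏", "sānzàng", "tripitaka"],
   ["法師", "fǎshī",   "dharma master"]],
  3
)

# Condensed copy of the translate logic, taking the table as an argument.
def translate_with(table, source_text)
  # Select the rules whose source term occurs in the text (as text_vocab does).
  masked = String.new(source_text)
  vocab = table.records.select do |r|
    masked.include?(r[0]) ? masked.gsub!(r[0], "\x1F") : false
  end

  texts = [source_text]
  1.upto(table.width - 1) do |col|
    t = String.new(source_text)
    t.delete!("\x1F")
    # Wrap every replacement in unit separators (0x1F).
    vocab.each { |r| t.gsub!(r[0], "\x1F#{r[col]}\x1F") }
    # Drop a separator that sits directly before a line break, then
    # collapse each remaining run of separators into a single space.
    t.gsub!(/\x1F(?=[\r\n])/, "")
    t.gsub!(/\x1F+/, " ")
    texts << t
  end
  texts
end

result = translate_with(table, "三藏法師\n")
puts result
# 三藏法師
#  sānzàng fǎshī
#  tripitaka dharma master
```

Adjacent terms end up separated by exactly one space, and no trailing space is left before the line break. The separator in front of the first term on a line is kept, which is why the translated lines start with a space.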

translate_io(input, output)

Read a text from input and write its translation listing to output. If a parameter is a String, it is interpreted as the path to a file, which is opened using the source encoding of the TranslationTable and closed when the method finishes. Otherwise, the parameter is treated as an open IO object and is left open. Input is processed in batches of 100 lines, which improves performance and avoids reading entire texts into memory.

# File lib/sanzang/translator.rb, line 95
def translate_io(input, output)
  if input.kind_of?(String)
    io_in = File.open(input, "rb", encoding: @table.source_encoding)
  else
    io_in = input
  end
  if output.kind_of?(String)
    io_out = File.open(output, "wb", encoding: @table.source_encoding)
  else
    io_out = output
  end

  buf_size = 100
  buffer = ""
  io_in.each do |line|
    buffer << line
    if io_in.lineno % buf_size == 0
      io_out.write(gen_listing(buffer, io_in.lineno - buf_size + 1))
      buffer = ""
    end
  end

  newline = "\n".encode!(buffer.encoding)
  io_out.write(gen_listing(buffer,
      io_in.lineno - buffer.rstrip.count(newline)))
ensure
  if input.kind_of?(String) and defined?(io_in) and io_in
    io_in.close if not io_in.closed?
  end
  if output.kind_of?(String) and defined?(io_out) and io_out
    io_out.close if not io_out.closed?
  end
end
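
The batching and line-number bookkeeping can be sketched separately from the translation itself. In this illustration, each_batch is a hypothetical helper, a StringIO stands in for a file, and the batch size is lowered to 2 so the behavior is visible with a short input. Unlike the method above, the sketch skips an empty final batch instead of passing it on.

```ruby
require "stringio"

# Hypothetical helper: read `io` in batches of `buf_size` lines and yield
# each batch together with the 1-based number of its first line, which is
# the value translate_io passes to gen_listing as pos.
def each_batch(io, buf_size = 100)
  buffer = +""
  io.each do |line|
    buffer << line
    if io.lineno % buf_size == 0
      # A full batch ends at io.lineno, so it started buf_size - 1 lines earlier.
      yield buffer, io.lineno - buf_size + 1
      buffer = +""
    end
  end
  # Leftover lines: the newlines remaining after rstrip tell how many
  # lines precede the last one in this batch.
  yield buffer, io.lineno - buffer.rstrip.count("\n") unless buffer.empty?
end

input = StringIO.new((1..5).map { |i| "line #{i}\n" }.join)
starts = []
each_batch(input, 2) { |text, pos| starts << [pos, text.lines.length] }
p starts
# [[1, 2], [3, 2], [5, 1]]
```

Each pair is the starting line number of a batch and the number of lines it holds. One property of the leftover formula worth knowing: because rstrip also removes trailing blank lines, a final batch that ends in blank lines would be assigned a later starting number than its true first line.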