class Sanzang::Translator
Translator is the main class for performing text translations with Sanzang. A Translator utilizes a TranslationTable, which is passed to it at the time of creation. The Translator can then apply these translation rules, generate full translation listings, and perform translations by reading and writing to IO objects.
Attributes
The TranslationTable used by the Translator
Public Class Methods
Creates a new Translator object with the given TranslationTable. The TranslationTable stores rules for translation, while the Translator is the worker that applies these rules and can create translation listings.
    # File lib/sanzang/translator.rb, line 32
    def initialize(translation_table)
      @table = translation_table
    end
Public Instance Methods
Generate a translation listing text string, in which the output of Translator#translate is collated and numbered for reference purposes. This is the normal text listing output of the Sanzang Translator.
    # File lib/sanzang/translator.rb, line 71
    def gen_listing(source_text, pos = 1)
      source_encoding = source_text.encoding
      source_text.encode!(Encoding::UTF_8)

      newline = source_text.include?("\r") ? "\r\n" : "\n"
      texts = translate(source_text).collect {|t| t = t.split(newline) }

      listing = ""
      texts[0].length.times do |line_i|
        texts.length.times do |col_i|
          listing << "[#{pos + line_i}.#{col_i + 1}] #{texts[col_i][line_i]}" \
            << newline
        end
        listing << newline
      end
      listing.encode!(source_encoding)
    end
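The collation step can be tried in isolation. The following is a minimal sketch, not the library's code: `collate_listing` is a hypothetical helper that takes an already-translated Array of texts (one String per column) and assumes "\n" line endings and UTF-8 input, so the encoding handling of gen_listing is left out.

```ruby
# Hypothetical helper: interleave the lines of several parallel texts,
# labelling each as [line.column], with a blank line after each group.
def collate_listing(texts, pos = 1, newline = "\n")
  columns = texts.map { |t| t.split(newline) }
  listing = "".dup
  columns[0].length.times do |line_i|
    columns.length.times do |col_i|
      listing << "[#{pos + line_i}.#{col_i + 1}] #{columns[col_i][line_i]}"
      listing << newline
    end
    listing << newline
  end
  listing
end

# Two columns of two lines each, numbered from line 5.
puts collate_listing(["a\nb\n", "x\ny\n"], 5)
```

Each source line is followed by the matching line of every other column, so a reader can compare `[5.1]` with `[5.2]` directly.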
Return an Array of all translation rules used by a particular text. These records represent the vocabulary used by the text.
    # File lib/sanzang/translator.rb, line 39
    def text_vocab(source_text)
      text_copy = String.new(source_text)
      @table.records.select do |r|
        text_copy.include?(r[0]) ? text_copy.gsub!(r[0], "\x1F") : false
      end
    end
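To see why each matched term is replaced with the "\x1F" marker in the working copy, here is a small standalone sketch. `vocab_for` is a hypothetical stand-in for text_vocab that takes the records as a plain Array, and the sample records are invented for illustration. It assumes the records are ordered with longer source terms first.

```ruby
# Hypothetical stand-in for text_vocab: records are [source_term, translation]
# pairs, assumed to be ordered with longer source terms first.
def vocab_for(records, source_text)
  text_copy = String.new(source_text)
  records.select do |r|
    # Masking a matched term keeps shorter terms from matching inside it.
    text_copy.include?(r[0]) ? text_copy.gsub!(r[0], "\x1F") : false
  end
end

records = [["如來", "Tathāgata"], ["如", "thus"], ["法", "dharma"]]
p vocab_for(records, "如來說法")
```

Here "如" is not selected, because its only occurrence was already consumed by the longer term "如來".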
Use the TranslationTable of the Translator to create translations for each destination language column of the translation table. The result is a simple Array of String objects: the source text comes first, followed by one String for each destination language column in the TranslationTable.
    # File lib/sanzang/translator.rb, line 51
    def translate(source_text)
      text_collection = [source_text]
      vocab_terms = text_vocab(source_text)
      1.upto(@table.width - 1) do |column_i|
        translation = String.new(source_text)
        translation.delete!("\x1F")
        vocab_terms.each do |term|
          translation.gsub!(term[0], "\x1F#{term[column_i]}\x1F")
        end
        translation.gsub!(/\x1F(?=[\r\n])/, "")
        translation.gsub!(/\x1F+/, " ")
        text_collection << translation
      end
      text_collection
    end
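The marker handling for a single column can be followed with a short standalone sketch. `translate_column` is a hypothetical helper that mirrors the loop body above for one column index, and the vocabulary rows are invented sample data rather than entries from a real table.

```ruby
# Hypothetical helper mirroring the per-column steps of translate.
def translate_column(source_text, vocab_terms, column_i)
  translation = String.new(source_text)
  translation.delete!("\x1F")                      # drop stray markers from the input
  vocab_terms.each do |term|
    # Wrap each replacement in markers so word boundaries can be spaced later.
    translation.gsub!(term[0], "\x1F#{term[column_i]}\x1F")
  end
  translation.gsub!(/\x1F(?=[\r\n])/, "")          # no trailing space before a line break
  translation.gsub!(/\x1F+/, " ")                  # runs of markers become one space
  translation
end

vocab = [["如來", "rúlái", "Tathāgata"], ["法", "fǎ", "dharma"]]
p translate_column("如來說法\n", vocab, 2)
# => " Tathāgata 說 dharma\n"
```

Untranslated characters such as "說" pass through unchanged, and adjacent translated terms end up separated by a single space because consecutive markers are collapsed.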
Read a text from input and write its translation listing to output. If either parameter is a String, it is interpreted as the path to a file; that file is opened, used, and closed again when the method finishes. Otherwise, the parameter is treated as an already open IO object and is left open. Input is processed in batches of 100 lines, for better performance and to avoid reading entire texts into memory.
    # File lib/sanzang/translator.rb, line 95
    def translate_io(input, output)
      if input.kind_of?(String)
        io_in = File.open(input, "rb", encoding: @table.source_encoding)
      else
        io_in = input
      end
      if output.kind_of?(String)
        io_out = File.open(output, "wb", encoding: @table.source_encoding)
      else
        io_out = output
      end

      buf_size = 100
      buffer = ""
      io_in.each do |line|
        buffer << line
        if io_in.lineno % buf_size == 0
          io_out.write(gen_listing(buffer, io_in.lineno - buf_size + 1))
          buffer = ""
        end
      end

      newline = "\n".encode!(buffer.encoding)
      io_out.write(gen_listing(buffer,
        io_in.lineno - buffer.rstrip.count(newline)))
    ensure
      if input.kind_of?(String) and defined?(io_in) and io_in
        io_in.close if not io_in.closed?
      end
      if output.kind_of?(String) and defined?(io_out) and io_out
        io_out.close if not io_out.closed?
      end
    end
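The batching idea, reading a fixed number of lines and passing each batch on together with the number of its first line, can be sketched on its own. `each_batch` is a hypothetical helper, not part of Sanzang; it tracks the batch size with its own counter instead of reproducing the arithmetic in translate_io, and it uses a StringIO in place of a file.

```ruby
require "stringio"

# Hypothetical helper: yield [first_line_number, text] for every batch of
# up to buf_size lines read from io.
def each_batch(io, buf_size = 100)
  buffer = "".dup
  count = 0
  io.each do |line|
    buffer << line
    count += 1
    if count == buf_size
      yield(io.lineno - count + 1, buffer)
      buffer = "".dup
      count = 0
    end
  end
  # Flush whatever is left when the input is not a multiple of buf_size.
  yield(io.lineno - count + 1, buffer) unless buffer.empty?
end

each_batch(StringIO.new("a\nb\nc\nd\ne\n"), 2) do |pos, text|
  puts "batch starting at line #{pos}: #{text.inspect}"
end
```

Passing the starting line number along with each batch is what lets a numbered listing continue seamlessly from one batch to the next.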