class Sanzang::TranslationTable

A translation table encapsulates a set of rules for translating with the Sanzang system.

Attributes

records[R]

The records of the translation table, as an array in which each record is itself an array of field strings

source_encoding[R]

Original encoding when the table was read

Public Class Methods

new(rules)

The translation table file format is summarized as follows:

  • Each line of text is a record for a translation rule.

  • Fields in the record are separated by the “|” character.

  • The first field contains the term in the source language.

  • Subsequent fields are equivalent terms in destination languages.

  • The number of columns must be consistent for the entire table.

The rules passed in here may be either a string or an object that responds to read, such as an open File or other IO.
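
As a rough illustration of the format summarized above, the following standalone sketch parses a small table held in a string using plain Ruby. It does not load the Sanzang library; the sample terms and the variable names (rules, records, widths) are made up for the example.

```ruby
# Hypothetical three-column table: source term, then two destination terms.
rules = "佛|fó|buddha\n法|fǎ|dharma\n僧|sēng|sangha\n"

# One record per line, with fields separated by "|".
records = rules.strip.split("\n").map { |line| line.strip.split("|") }

# Collect the distinct record widths; a consistent table has exactly one.
widths = records.map(&:size).uniq

p records
puts "consistent width: #{widths.size == 1}"
```

The first field of each record is the source term, and the remaining fields are the destination terms.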

# File lib/sanzang/translation_table.rb, line 43
def initialize(rules)
  contents = rules.kind_of?(String) ? rules : rules.read
  @source_encoding = contents.encoding
  contents.encode!(Encoding::UTF_8)

  if contents =~ /~\||\|~|\| /       # If there is any old formatting...
    contents.gsub!(/~\||\|~/, "")    # Rm old style "~|" and "|~"
    contents.gsub!(/^\s+|\s+$/, "")  # Rm WS around lines
    contents.gsub!(/\s*\|\s*/, "|")  # Rm WS around delimiters
  end

  @records = contents.strip.split("\n").collect {|r| r.strip.split("|") }
  @sorted = false
  check_dims
 #sort!
end
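
The old-formatting cleanup in the constructor can be tried in isolation. The sketch below applies the same three substitutions to a made-up old-style table; the sample data is an assumption, and only the regular expressions are taken from the source shown above.

```ruby
# Hypothetical table in the old style, with "~|" and "|~" markers and padding.
contents = "~| A | B | C |~\n~| D | E | F |~\n".dup

if contents =~ /~\||\|~|\| /          # Old formatting detected
  contents.gsub!(/~\||\|~/, "")       # Remove "~|" and "|~" markers
  contents.gsub!(/^\s+|\s+$/, "")     # Remove whitespace around lines
  contents.gsub!(/\s*\|\s*/, "|")     # Remove whitespace around delimiters
end

# Split into records the same way the constructor does.
records = contents.strip.split("\n").collect { |r| r.strip.split("|") }
p records
```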

Public Instance Methods

[](index)

Retrieve a record by its numeric index.

# File lib/sanzang/translation_table.rb, line 62
def [](index)
  @records[index]
end
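check_dims()

Check the basic dimensions of the translation table: at least one row, at least two columns, and the same number of columns in every record. An error is raised if any check fails.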

# File lib/sanzang/translation_table.rb, line 68
def check_dims
  if @records.size < 1
    raise "Table must have at least 1 row"
  elsif records[0].size < 2
    raise "Table must have at least 2 columns"
  end
  # Track the index so the error can report a 1-based line number.
  @records.each_with_index do |r, i|
    if r.size != width
      raise "Column mismatch: Line #{i + 1}"
    end
  end
end
encoding()

The text encoding used internally for all translation table data

# File lib/sanzang/translation_table.rb, line 97
def encoding
  Encoding::UTF_8
end
find(term)

Find a record by the source language term (first column).

# File lib/sanzang/translation_table.rb, line 103
def find(term)
  @records.find {|rec| rec[0] == term }
end
length()

The number of records in the table

# File lib/sanzang/translation_table.rb, line 149
def length
  @records.length
end
merge!(tab2)

Merge another table into this one. Both tables must have the same width; otherwise an error is raised. If the same source term exists in both tables, the record from the other table replaces the one in this table. Note: after a merge, the resulting table is marked as unsorted.

# File lib/sanzang/translation_table.rb, line 129
def merge!(tab2)
  if tab2.width != width
    raise "Table widths must match when merging tables"
  end
  h1 = to_h
  tab2.records.each do |rec|
    h1[rec[0]] = rec
  end
  @records = h1.values
  @sorted = false
end
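
The merge semantics can be sketched with plain arrays and a hash, without the Sanzang classes. The tables and names below (table1, table2, merged) are hypothetical; the logic mirrors the to_h and merge! code shown on this page.

```ruby
# Two hypothetical tables as arrays of records (source term first).
table1 = [["A", "a1"], ["B", "b1"]]
table2 = [["B", "b2"], ["C", "c2"]]

# Index the first table by source term, keeping the first record per term.
merged = {}
table1.each { |rec| merged[rec[0]] ||= rec }

# Records from the second table overwrite any with the same source term.
table2.each { |rec| merged[rec[0]] = rec }

records = merged.values
p records
```

Because a Ruby Hash preserves insertion order, the record for "B" keeps its original position while its content comes from the second table.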
sort!()

Sort all records in descending order of source term length (first column), so that the longest terms come first

# File lib/sanzang/translation_table.rb, line 89
def sort!
  @records.sort! {|x,y| y[0].size <=> x[0].size }
  @sorted = true
  nil
end
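
A minimal standalone sketch of the same comparison, using made-up records, shows the resulting order. Placing longer terms first is a common way to prefer the longest match during substitution; whether the rest of Sanzang depends on this order is an assumption here.

```ruby
# Hypothetical records with source terms of different lengths.
records = [["ab", "x"], ["abcd", "y"], ["a", "z"], ["abc", "w"]]

# Same comparison as sort!: the longer source term sorts earlier.
records.sort! { |x, y| y[0].size <=> x[0].size }

terms = records.map { |r| r[0] }
p terms
```

Ruby's sort is not guaranteed to be stable, so the relative order of terms with equal length should not be relied on.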
sorted?()

Check if the table records are sorted

# File lib/sanzang/translation_table.rb, line 83
def sorted?
  @sorted
end
to_csv()

Return the table as a delimited string: one record per line, with fields joined by the “|” character

# File lib/sanzang/translation_table.rb, line 143
def to_csv
  @records.map {|r| r.join("|") }.join("\n")
end
to_h()

Convert to a hash keyed by source term. The original records are the values. If a source term appears more than once, only the first record for it is kept.

For example: “A” => [“A”, “B”, “C”]

# File lib/sanzang/translation_table.rb, line 111
def to_h
  h = Hash.new
  @records.each {|rec| h[rec[0]] = rec if not h[rec[0]] }
  h
end
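
The first-record-wins behavior can be seen in a short standalone sketch. The records below are made up, and the loop is an equivalent rewrite of the to_h body rather than the library code itself.

```ruby
# Hypothetical records; the source term "A" appears twice.
records = [["A", "B", "C"], ["D", "E", "F"], ["A", "X", "Y"]]

# Keep only the first record seen for each source term.
h = {}
records.each { |rec| h[rec[0]] = rec unless h.key?(rec[0]) }

p h.keys
p h["A"]
```

The later ["A", "X", "Y"] record is dropped, which is also why uniq! keeps the first occurrence of each source term.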
uniq!()

Remove records with duplicate source terms, keeping the first occurrence of each. The resulting table is marked as unsorted.

# File lib/sanzang/translation_table.rb, line 119
def uniq!
  @records = to_h.values
  @sorted = false
  nil
end
width()

The number of columns in the table

# File lib/sanzang/translation_table.rb, line 155
def width
  @records[0].length
end