class PROIEL::Sentence

A sentence object in a treebank.

Attributes

alignment_id[R]

@return [nil, Integer] ID of the sentence that this sentence is aligned to

annotated_at[R]

@return [nil, DateTime] time of annotation

annotated_by[R]

@return [nil, String] annotator of sentence

div[R]

@return [Div] parent div object

id[R]

@return [Integer] ID of the sentence

presentation_after[R]

@return [nil, String] presentation material after sentence

presentation_before[R]

@return [nil, String] presentation material before sentence

reviewed_at[R]

@return [nil, DateTime] time of review

reviewed_by[R]

@return [nil, String] reviewer of sentence

status[R]

@return [Symbol] annotation status of sentence

Public Class Methods

new(parent, id, status, presentation_before, presentation_after, alignment_id, annotated_by, reviewed_by, annotated_at, reviewed_at, &block)

Creates a new sentence object.

# File lib/proiel/sentence.rb, line 42
def initialize(parent, id, status, presentation_before, presentation_after, alignment_id, annotated_by, reviewed_by, annotated_at, reviewed_at, &block)
  @div = parent

  raise ArgumentError, 'integer expected' unless id.is_a?(Integer)
  @id = id

  raise ArgumentError, 'string or symbol expected' unless status.is_a?(String) or status.is_a?(Symbol)
  @status = status.to_sym

  raise ArgumentError, 'string or nil expected' unless presentation_before.nil? or presentation_before.is_a?(String)
  @presentation_before = presentation_before.freeze

  raise ArgumentError, 'string or nil expected' unless presentation_after.nil? or presentation_after.is_a?(String)
  @presentation_after = presentation_after.freeze

  raise ArgumentError, 'integer or nil expected' unless alignment_id.nil? or alignment_id.is_a?(Integer)
  @alignment_id = alignment_id

  unless annotated_at.nil? or PROIEL::Utilities.xmlschema_datetime?(annotated_at)
    raise ArgumentError, 'XML schema date time or nil expected'
  end
  @annotated_at = annotated_at ? DateTime.xmlschema(annotated_at).freeze : nil

  unless reviewed_at.nil? or PROIEL::Utilities.xmlschema_datetime?(reviewed_at)
    raise ArgumentError, 'XML schema date time or nil expected'
  end
  @reviewed_at = reviewed_at ? DateTime.xmlschema(reviewed_at).freeze : nil

  raise ArgumentError, 'string or nil expected' unless annotated_by.nil? or annotated_by.is_a?(String)
  @annotated_by = annotated_by.freeze

  raise ArgumentError, 'string or nil expected' unless reviewed_by.nil? or reviewed_by.is_a?(String)
  @reviewed_by = reviewed_by.freeze

  @children = block.call(self) if block_given?
end
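
The timestamp handling in the constructor can be tried in isolation. The sketch below is a minimal illustration, not the library's code: `xmlschema_datetime?` is a hypothetical stand-in for `PROIEL::Utilities.xmlschema_datetime?` (the real helper may validate differently), and `parse_timestamp` is a made-up name for the validate-then-parse step applied to `annotated_at` and `reviewed_at`.

```ruby
require 'date'

# Hypothetical stand-in for PROIEL::Utilities.xmlschema_datetime?.
# Assumption: a value is acceptable if DateTime.xmlschema can parse it.
def xmlschema_datetime?(value)
  DateTime.xmlschema(value)
  true
rescue ArgumentError, TypeError
  false
end

# Mirrors the constructor's pattern: reject anything that is neither nil
# nor a parseable timestamp, then store a frozen DateTime (or nil).
def parse_timestamp(value)
  unless value.nil? || xmlschema_datetime?(value)
    raise ArgumentError, 'XML schema date time or nil expected'
  end

  value ? DateTime.xmlschema(value).freeze : nil
end
```

Under these assumptions, `parse_timestamp('2015-03-01T12:00:00+01:00')` yields a frozen `DateTime`, `parse_timestamp(nil)` yields `nil`, and any other input raises `ArgumentError`.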

Public Instance Methods

alignment(aligned_source)

Returns the aligned sentence if any.

@return [Sentence, NilClass] aligned sentence

# File lib/proiel/sentence.rb, line 231
def alignment(aligned_source)
  alignment_id ? aligned_source.treebank.find_sentence(alignment_id) : nil
end
annotated?()

Checks if the sentence is annotated.

Since only annotated sentences can be reviewed, a sentence is annotated if its `status` is either `:reviewed` or `:annotated`.

@return [true,false]

# File lib/proiel/sentence.rb, line 145
def annotated?
  @status == :reviewed or @status == :annotated
end
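
How the three status predicates relate can be tabulated with a small stand-in. `StatusDemo` and `STATUS_TABLE` are hypothetical names introduced for illustration only; the predicate bodies copy the comparisons shown in the source listings for `reviewed?`, `annotated?` and `unannotated?`.

```ruby
# Hypothetical stand-in exposing the same three predicates as the sentence class.
StatusDemo = Struct.new(:status) do
  def reviewed?
    status == :reviewed
  end

  # A reviewed sentence also counts as annotated.
  def annotated?
    status == :reviewed || status == :annotated
  end

  def unannotated?
    status == :unannotated
  end
end

# One row per status: [status, unannotated?, annotated?, reviewed?]
STATUS_TABLE = %i[unannotated annotated reviewed].map do |s|
  demo = StatusDemo.new(s)
  [s, demo.unannotated?, demo.annotated?, demo.reviewed?]
end
```

In this sketch `:reviewed` is the only status for which both `annotated?` and `reviewed?` hold.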
citation()

@return [String] the complete citation for the sentence

# File lib/proiel/sentence.rb, line 97
def citation
  [source.citation_part, citation_part].join(' ')
end
citation_part()

Computes an appropriate citation component for the sentence.

The computed citation component must be concatenated with the citation component provided by the source to produce a complete citation.

@see citation

@return [String] the citation component

# File lib/proiel/sentence.rb, line 109
def citation_part
  tc = @children.select(&:has_citation?)
  x = tc.first ? tc.first.citation_part : nil
  y = tc.last ? tc.last.citation_part : nil

  Citations.citation_make_range(x, y)
end
inferred_alignment(aligned_source)

Returns inferred aligned sentences if any.

@return [Array<Sentence>] inferred aligned sentences

# File lib/proiel/sentence.rb, line 238
def inferred_alignment(aligned_source)
  tokens.select(&:alignment_id).map do |token|
    token.alignment(aligned_source)
  end.flatten.compact.map(&:sentence).uniq
end
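
The select/map/flatten/compact/uniq chain can be followed on toy data. This is a sketch under stated assumptions: `SourceToken`, `AlignedToken`, `demo_inferred_alignment` and the plain hash used as a lookup index are all hypothetical; in the library the lookup goes through `token.alignment(aligned_source)`, whose exact return shape is not shown here.

```ruby
# Hypothetical stand-ins: an aligned token knows its sentence, a source
# token knows the ID of the token it is aligned to (or nil).
AlignedToken = Struct.new(:id, :sentence)
SourceToken = Struct.new(:id, :alignment_id) do
  # Assumption: alignment lookup is a plain hash access that may return nil.
  def alignment(index)
    index[alignment_id]
  end
end

# Same chain as inferred_alignment: keep aligned tokens, resolve them,
# drop failed lookups, map to sentences and remove duplicates.
def demo_inferred_alignment(tokens, index)
  tokens.select(&:alignment_id).map do |token|
    token.alignment(index)
  end.flatten.compact.map(&:sentence).uniq
end

DEMO_INDEX = {
  10 => AlignedToken.new(10, :sentence_a),
  11 => AlignedToken.new(11, :sentence_a),
  12 => AlignedToken.new(12, :sentence_b)
}.freeze

DEMO_SOURCE_TOKENS = [
  SourceToken.new(1, 10),
  SourceToken.new(2, 11),
  SourceToken.new(3, nil), # unaligned token, filtered out by select
  SourceToken.new(4, 99),  # dangling reference, removed by compact
  SourceToken.new(5, 12)
].freeze
```

With this data, `demo_inferred_alignment(DEMO_SOURCE_TOKENS, DEMO_INDEX)` returns each target sentence once, in order of first occurrence.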
language()

@return [String] language of the sentence as an ISO 639-3 language tag

# File lib/proiel/sentence.rb, line 90
def language
  source.language
end
printable_form(custom_token_formatter: nil)

Returns the printable form of the sentence with all token forms and any presentation data.

@param custom_token_formatter [Proc, nil] formatting function for tokens, which is passed the token as its sole argument

@return [String] the printable form of the sentence

# File lib/proiel/sentence.rb, line 124
def printable_form(custom_token_formatter: nil)
  [presentation_before,
   @children.reject(&:is_empty?).map { |t| t.printable_form(custom_token_formatter: custom_token_formatter) },
   presentation_after].compact.join
end
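
The effect of `custom_token_formatter` and of the `compact.join` step can be seen with a stand-in token. `DemoToken` and `demo_printable_form` are hypothetical; in particular, how a real token combines its form with its own presentation data is an assumption made for this sketch.

```ruby
# Hypothetical token stand-in; real tokens carry far more state.
DemoToken = Struct.new(:form, :presentation_after, :empty) do
  def is_empty?
    empty
  end

  # Assumption: the formatter, when given, replaces the plain form.
  def printable_form(custom_token_formatter: nil)
    text = custom_token_formatter ? custom_token_formatter.call(self) : form
    [text, presentation_after].compact.join
  end
end

# Same shape as the sentence-level method: skip empty tokens, drop nil
# presentation strings, and let Array#join flatten the nested token array.
def demo_printable_form(before, tokens, after, custom_token_formatter: nil)
  [before,
   tokens.reject(&:is_empty?).map { |t| t.printable_form(custom_token_formatter: custom_token_formatter) },
   after].compact.join
end

DEMO_FORM_TOKENS = [
  DemoToken.new('Gallia', ' ', false),
  DemoToken.new(nil, nil, true), # empty token, never printed
  DemoToken.new('est', nil, false)
].freeze
```

A possible call is `demo_printable_form(nil, DEMO_FORM_TOKENS, '.', custom_token_formatter: ->(t) { t.form.upcase })`; the lambda receives each non-empty token as its only argument.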
reviewed?()

Checks if the sentence is reviewed.

A sentence has been reviewed if its `status` is `:reviewed`.

@return [true,false]

# File lib/proiel/sentence.rb, line 135
def reviewed?
  @status == :reviewed
end
source()

@return [Source] parent source object

# File lib/proiel/sentence.rb, line 80
def source
  @div.source
end
syntax_graph()

Builds a syntax graph for the dependency annotation of the sentence and inserts a dummy root node. The graph is represented as a hash of hashes. Each hash contains the ID of the token, its relation (to its syntactically dominating token), a list of its children and a list of secondary edges.

@return [Hash] a single graph with a dummy root node represented as a hash

@example Get a single syntax graph with a dummy root node

sentence.syntax_graph # => { id: nil, relation: nil, children: [{ id: 1000, relation: "pred", children: [{ id: 1001, relation: "xcomp", children: [], slashes: [["xsub", 1000]] }], slashes: [] }], slashes: [] }
# File lib/proiel/sentence.rb, line 169
def syntax_graph
  { id: nil, relation: nil, children: syntax_graphs, slashes: [] }
end
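
Because the graph is a plain nested hash, it can be walked with ordinary recursion. The helper below is a hypothetical illustration (`graph_outline` and `SAMPLE_GRAPH` are not part of the library); it assumes only the `id`, `relation` and `children` keys described above.

```ruby
# Depth-first walk over a hash shaped like the return value of syntax_graph.
# Returns one indented line per node, parents before children.
def graph_outline(node, depth = 0)
  line = "#{'  ' * depth}#{node[:id].inspect} #{node[:relation].inspect}"
  [line] + node[:children].flat_map { |child| graph_outline(child, depth + 1) }
end

# Hand-written sample with a dummy root (id and relation are nil).
SAMPLE_GRAPH = {
  id: nil, relation: nil, slashes: [],
  children: [
    { id: 1000, relation: 'pred', slashes: [],
      children: [
        { id: 1001, relation: 'xcomp', children: [], slashes: [['xsub', 1000]] }
      ] }
  ]
}.freeze
```

Calling `puts graph_outline(SAMPLE_GRAPH)` prints the dummy root first, followed by each token indented by its depth in the tree.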
syntax_graphs()

Builds syntax graphs for the dependency annotation of the sentence. Multiple graphs may be returned as the function does not insert an empty dummy root node. Each graph is represented as a hash of hashes. Each hash contains the ID of the token, its relation (to its syntactically dominating token), a list of its children and a list of secondary edges.

@return [Array] zero or more syntax graphs represented as hashes

@example Get the syntax graphs without a dummy root node

sentence.syntax_graphs # => [{ id: 1000, relation: "pred", children: [{ id: 1001, relation: "xcomp", children: [], slashes: [["xsub", 1000]] }], slashes: [] }]
# File lib/proiel/sentence.rb, line 185
def syntax_graphs
  Array.new.tap do |graphs|
    token_map = {}

    # Pass 1: create new attribute hashes for each token and index each hash by token ID
    @children.each do |token|
      token_map[token.id] =
        {
          id: token.id,
          relation: token.relation,
          children: [],
          slashes: token.slashes,
        }
    end

    # Pass 2: append attribute hashes for tokens with a head ID to the head's children list; append attribute hashes for tokens without a head ID to the list of graphs to return
    @children.each do |token|
      if token.head_id
        token_map[token.head_id][:children] << token_map[token.id]
      else
        graphs << token_map[token.id]
      end
    end
  end
end
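
The two-pass construction can be run outside the library on stand-in tokens. `DemoNode`, `build_graphs` and the sample data are hypothetical; the sketch assumes only that each token exposes `id`, `head_id`, `relation` and `slashes`, as the listing above does.

```ruby
# Hypothetical token stand-in with just the fields the algorithm reads.
DemoNode = Struct.new(:id, :head_id, :relation, :slashes)

def build_graphs(tokens)
  token_map = {}

  # Pass 1: one attribute hash per token, indexed by token ID.
  tokens.each do |t|
    token_map[t.id] = { id: t.id, relation: t.relation, children: [], slashes: t.slashes }
  end

  # Pass 2: attach each token to its head, or collect it as a root.
  tokens.each_with_object([]) do |t, graphs|
    if t.head_id
      token_map[t.head_id][:children] << token_map[t.id]
    else
      graphs << token_map[t.id]
    end
  end
end

# Two headless tokens, so two separate graphs come back.
DEMO_NODES = [
  DemoNode.new(1000, nil, 'pred', []),
  DemoNode.new(1001, 1000, 'xcomp', [['xsub', 1000]]),
  DemoNode.new(1002, nil, 'voc', [])
].freeze
```

Indexing every token before linking means a dependent may precede its head in the token order. As in the original listing, a `head_id` that points outside the sentence would fail in pass 2, since the lookup in `token_map` returns `nil`.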
tokens()

Finds all tokens in the sentence.

@return [Enumerator] tokens in the sentence

@example Iterating tokens

tokens.each { |t| puts t.id }

@example Create an array with only empty tokens

tokens.select(&:is_empty?)

@example Counting tokens

puts tokens.count #=> 200
# File lib/proiel/sentence.rb, line 224
def tokens
  @children.to_enum
end
treebank()

@return [Treebank] parent treebank object

# File lib/proiel/sentence.rb, line 85
def treebank
  @div.source.treebank
end
unannotated?()

Checks if the sentence is unannotated.

A sentence is unannotated if its `status` is `:unannotated`.

@return [true,false]

# File lib/proiel/sentence.rb, line 154
def unannotated?
  @status == :unannotated
end