class SPARQL::Grammar::Parser

A parser for the SPARQL 1.1 grammar.

@see www.w3.org/TR/sparql11-query/#grammar
@see en.wikipedia.org/wiki/LR_parser

Constants

AGGREGATE_RULES
BUILTINS

Builtin functions

BUILTIN_RULES

Attributes

input[RW]

The current input string being processed.

@return [String]

nd_var_gen[RW]

Used for generating BNode labels

options[R]

Any additional options for the parser.

@return [Hash]

result[RW]

The internal representation of the result, using a hierarchy of RDF objects and SPARQL::Algebra::Operator objects.

@return [Array]

tokens[R]

The current input tokens being processed.

@return [Array<Token>]

Public Class Methods

new(input = nil, **options, &block)

Initializes a new parser instance.

@param [String, IO, StringIO, #to_s] input
@param [Hash{Symbol => Object}] options
@option options [Hash] :prefixes (Hash.new)

the prefix mappings to use (for accessing intermediate parser productions)

@option options [#to_s] :base_uri (nil)

the base URI to use when resolving relative URIs (for accessing intermediate parser productions)

@option options [#to_s] :anon_base ("b0")

Basis for generating anonymous Nodes

@option options [Boolean] :resolve_iris (false)

Resolve prefixed and relative IRIs. Otherwise, when serializing the parsed SSE
as S-Expressions, use the original prefixed and relative IRIs along with `base` and `prefix`
definitions.

@option options [Boolean] :validate (false)

whether to validate the parsed statements and values

@option options [Logger, #write, #<<] :logger

Record error/info/debug output

@yield [parser] `self`
@yieldparam [SPARQL::Grammar::Parser] parser
@yieldreturn [void] ignored
@return [SPARQL::Grammar::Parser]

# File lib/sparql/grammar/parser11.rb, line 1562
def initialize(input = nil, **options, &block)
  @input = case input
  when IO, StringIO then input.read
  else input.to_s.dup
  end
  @input.encode!(Encoding::UTF_8) if @input.respond_to?(:encode!)
  @options = {anon_base: "b0", validate: false}.merge(options)

  debug("base IRI") {base_uri.inspect}
  debug("validate") {validate?.inspect}

  @vars = {}
  @nd_var_gen = "0"

  if block_given?
    case block.arity
      when 0 then instance_eval(&block)
      else block.call(self)
    end
  end
end
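
The constructor treats its block differently depending on arity: a block with no parameters is evaluated with the parser as `self`, while a block with parameters receives the parser as an argument. The following is a minimal standalone sketch of that dispatch. `MiniParser` is a hypothetical stand-in that mirrors only the block handling shown above; it is not part of the library.

```ruby
# Hypothetical stand-in for the parser; only the block dispatch is mirrored.
class MiniParser
  attr_reader :input

  def initialize(input = nil, &block)
    @input = input.to_s.dup
    if block_given?
      case block.arity
      when 0 then instance_eval(&block)  # block runs with self == the parser
      else block.call(self)              # block receives the parser
      end
    end
  end
end

# Block with a parameter: the parser is passed in.
MiniParser.new("ASK {}") { |parser| puts parser.input }

# Block without parameters: `input` resolves against the parser itself.
MiniParser.new("ASK {}") { puts input }
```

Both calls print the same input string; the zero-arity form trades explicitness for brevity, since the block's `self` is no longer the caller.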

Public Instance Methods

ll1_parse(prod = START)
Alias for: parse
parse(prod = START)

Parse query

The result is a SPARQL Algebra S-List. Productions return an array such as the following:

(prefix ((: <http://example/>))
  (union
    (bgp (triple ?s ?p ?o))
    (graph ?g
      (bgp (triple ?s ?p ?o)))))

@param [Symbol, #to_s] prod The starting production for the parser.

It may be a URI from the grammar, or a symbol representing the local_name portion of the grammar URI.

@return [Array]
@see www.w3.org/TR/sparql11-query/#sparqlAlgebra
@see axel.deri.ie/sparqltutorial/ESWC2007_SPARQL_Tutorial_unit2b.pdf

# File lib/sparql/grammar/parser11.rb, line 1621
def parse(prod = START)
  ll1_parse(@input,
    prod.to_sym,
    branch: BRANCH,
    first: FIRST,
    follow: FOLLOW,
    whitespace: WS,
    **@options
  )

  # The last thing on the @prod_data stack is the result
  @result = case
  when !prod_data.is_a?(Hash)
    prod_data
  when prod_data.empty?
    nil
  when prod_data[:query]
    Array(prod_data[:query]).length == 1 ? prod_data[:query].first : prod_data[:query]
  when prod_data[:update]
    prod_data[:update]
  else
    key = prod_data.keys.first
    [key] + Array(prod_data[key])  # Creates [:key, [:triple], ...]
  end

  # Validate resulting expression
  @result.validate! if @result && validate?
  @result
end
Also aliased as: ll1_parse
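
The `case` expression in `parse` decides what ends up in `@result` based on the shape of the final production data. As a rough illustration, the same selection can be reproduced with plain hashes and arrays. `select_result` is a hypothetical helper name, and the sample data is made up for the example.

```ruby
# Hypothetical helper mirroring the result selection in #parse.
# `prod_data` stands for whatever sits on top of the production stack.
def select_result(prod_data)
  case
  when !prod_data.is_a?(Hash) then prod_data      # already a finished value
  when prod_data.empty? then nil                  # nothing was produced
  when prod_data[:query]
    queries = Array(prod_data[:query])
    queries.length == 1 ? queries.first : queries # unwrap a single query
  when prod_data[:update] then prod_data[:update]
  else
    key = prod_data.keys.first
    [key] + Array(prod_data[key])                 # [:key, element, ...]
  end
end

p select_result({})
p select_result({query: [[:bgp]]})
p select_result({pattern: [[:triple, :s, :p, :o]]})
```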
to_s()
# File lib/sparql/grammar/parser11.rb, line 1600
def to_s
  @result.to_sxp
end
to_sxp_bin()

@return [Array]

# File lib/sparql/grammar/parser11.rb, line 1596
def to_sxp_bin
  @result
end
valid?()

Returns `true` if the input string is syntactically valid.

@return [Boolean]

# File lib/sparql/grammar/parser11.rb, line 1588
def valid?
  parse
  true
rescue Error
  false
end

Private Instance Methods

accumulate_operator_expressions(operator, production, data)

Accumulate joined expressions for `prod1 (op prod2)*` to form `(op (op 1 2) 3)`.

# File lib/sparql/grammar/parser11.rb, line 2014
def accumulate_operator_expressions(operator, production, data)
  if data[operator]
    # Add [op data] to stack based on "production"
    add_prod_datum(production, [data[operator], data[:Expression]])
    # Add previous [op data] information
    add_prod_datum(production, data[production])
  else
    # No operator, forward :Expression
    add_prod_datum(:Expression, data[:Expression])
  end
end
add_operator_expressions(production, data)

Add joined expressions for `prod1 (op prod2)*` to form `(op (op 1 2) 3)`.

# File lib/sparql/grammar/parser11.rb, line 2003
def add_operator_expressions(production, data)
  # Iterate through expression to create binary operations
  lhs = data[:Expression]
  while data[production] && !data[production].empty?
    op, rhs = data[production].shift, data[production].shift
    lhs = SPARQL::Algebra::Expression[op + lhs + rhs]
  end
  add_prod_datum(:Expression, lhs)
end
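
The loop above is a left fold: each `(op, rhs)` pair wraps the expression built so far, which yields left-associative nesting. Here is a minimal sketch of the same idea using plain arrays instead of SPARQL::Algebra::Expression objects. The helper name `fold_left` and the flat `[op, operand, op, operand, ...]` input shape are assumptions made for illustration.

```ruby
# Fold `first (op operand)*` into left-nested binary expressions.
# `rest` is assumed to be a flat list alternating operator and operand.
def fold_left(first, rest)
  rest.each_slice(2).reduce(first) { |lhs, (op, rhs)| [op, lhs, rhs] }
end

p fold_left(1, [:+, 2, :-, 3])  # prints [:-, [:+, 1, 2], 3]
```

With no trailing pairs the first operand is returned unchanged, which matches the case where the production matched only `prod1`.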
add_pattern(production, **options)

Add a pattern.

@param [String] production Production generating pattern
@param [Hash{Symbol => Object}] options

# File lib/sparql/grammar/parser11.rb, line 1873
def add_pattern(production, **options)
  progress(production, "[:pattern, #{options[:subject]}, #{options[:predicate]}, #{options[:object]}]")
  triple = {}
  options.each_pair do |r, v|
    if v.is_a?(Array) && v.flatten.length == 1
      v = v.flatten.first
    end
    if validate? && !v.is_a?(RDF::Term)
      error("add_pattern", "Expected #{r} to be a resource, but it was #{v.inspect}",
        production: production)
    end
    triple[r] = v
  end
  add_prod_datum(:pattern, RDF::Query::Pattern.new(triple))
end
base_uri()

Returns the Base URI defined for the parser, as specified or when parsing a BASE prologue element.

@example

base_uri  #=> RDF::URI('http://example.com/')

@return [RDF::URI]

# File lib/sparql/grammar/parser11.rb, line 1709
def base_uri
  RDF::URI(@options[:base_uri])
end
base_uri=(iri)

Set the Base URI to use for this parser.

@param [RDF::URI, #to_s] iri

@example

base_uri = RDF::URI('http://purl.org/dc/terms/')

@return [RDF::URI]

# File lib/sparql/grammar/parser11.rb, line 1722
def base_uri=(iri)
  @options[:base_uri] = RDF::URI(iri)
end
bnode(id = nil)

Generate a BNode identifier

# File lib/sparql/grammar/parser11.rb, line 1768
def bnode(id = nil)
  if @nd_var_gen
    # Use non-distinguished variables within patterns
    variable(id, false)
  else
    unless id
      id = @options[:anon_base]
      @options[:anon_base] = @options[:anon_base].succ
    end
    @bnode_cache ||= {}
    raise Error, "Illegal attempt to reuse a BNode" if @bnode_cache[id] && @bnode_cache[id].frozen?
    @bnode_cache[id] ||= RDF::Node.new(id)
  end
end
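
When BNode generation is active, labels are taken from `:anon_base` and advanced with `String#succ`, and a frozen cache entry marks a label that must not be reused. The sketch below reproduces that bookkeeping in isolation. `BNodeLabels` and `ReuseError` are hypothetical names, and plain strings stand in for RDF::Node objects.

```ruby
# Hypothetical standalone model of the BNode cache; strings replace RDF::Node.
class BNodeLabels
  class ReuseError < StandardError; end

  def initialize(anon_base = "b0")
    @anon_base = anon_base.dup
    @cache = {}
  end

  # Return the node for `id`, allocating the next label when none is given.
  def bnode(id = nil)
    unless id
      id = @anon_base
      @anon_base = @anon_base.succ   # "b0" -> "b1" -> "b2" ...
    end
    raise ReuseError, "Illegal attempt to reuse a BNode" if @cache[id]&.frozen?
    @cache[id] ||= String.new("_:#{id}")
  end

  # Freezing marks every node seen so far as closed for reuse.
  def freeze_bnodes
    @cache.each_value(&:freeze)
  end
end

labels = BNodeLabels.new
puts labels.bnode       # first generated label
puts labels.bnode       # next label via String#succ
labels.bnode("x")
labels.freeze_bnodes
begin
  labels.bnode("x")     # same label after freezing
rescue BNodeLabels::ReuseError => e
  puts e.message
end
```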
clear_bnode_cache()

Clear cached BNodes.

@return [void]

# File lib/sparql/grammar/parser11.rb, line 1756
def clear_bnode_cache
  @bnode_cache = {}
end
expand_collection(data)

Take a collection of objects and create an RDF Collection using rdf:first, rdf:rest and rdf:nil.

@param [Hash] data Production Data

# File lib/sparql/grammar/parser11.rb, line 1847
def expand_collection(data)
  # Add any triples generated from deeper productions
  add_prod_datum(:pattern, data[:pattern])

  # Create list items for each element in data[:GraphNode]
  first = data[:Collection]
  list = Array(data[:GraphNode]).flatten.compact
  last = list.pop

  list.each do |r|
    add_pattern(:Collection, subject: first, predicate: RDF["first"], object: r)
    rest = bnode()
    add_pattern(:Collection, subject: first, predicate: RDF["rest"], object: rest)
    first = rest
  end

  if last
    add_pattern(:Collection, subject: first, predicate: RDF["first"], object: last)
  end
  add_pattern(:Collection, subject: first, predicate: RDF["rest"], object: RDF["nil"])
end
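
The expansion walks the list, emitting an rdf:first and an rdf:rest triple per cell and terminating the chain with rdf:nil. A simplified sketch of that walk follows. `expand_list` is a hypothetical helper; triples are plain `[subject, predicate, object]` arrays, and cell labels come from a local counter rather than the parser's `bnode` method.

```ruby
# Expand `items` into rdf:first / rdf:rest triples starting at `head`.
# Plain arrays and counter-based labels are simplifications for illustration.
def expand_list(head, items)
  triples = []
  counter = 0
  cell = head
  *init, last = items
  init.each do |item|
    triples << [cell, :"rdf:first", item]
    rest = "_:c#{counter += 1}"            # fresh cell for the remainder
    triples << [cell, :"rdf:rest", rest]
    cell = rest
  end
  triples << [cell, :"rdf:first", last] unless last.nil?
  triples << [cell, :"rdf:rest", :"rdf:nil"]  # terminate the list
  triples
end

expand_list("_:l", [1, 2]).each { |t| p t }
```

As in the method above, an empty list still produces the single terminating rdf:rest triple.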
flatten_filter(data)

Flatten data of the form `[op+ bgp?]` into a filter expression and an optional query, creating an exprlist when there is more than one expression.

@return [Array[:expr, query]]

# File lib/sparql/grammar/parser11.rb, line 1891
def flatten_filter(data)
  query = data.pop if data.last.is_a?(SPARQL::Algebra::Query)
  expr = data.length > 1 ? SPARQL::Algebra::Operator::Exprlist.new(*data) : data.first
  [expr, query]
end
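
To make the three outcomes concrete, here is a small sketch of the same split using stand-in types. `StubQuery` and `split_filter` are hypothetical names, and an `[:exprlist, ...]` array replaces SPARQL::Algebra::Operator::Exprlist.

```ruby
# Stand-in for SPARQL::Algebra::Query, for illustration only.
StubQuery = Struct.new(:patterns)

# Split [op+ query?] into [expression, query_or_nil].
def split_filter(data)
  data = data.dup                                   # leave the caller's array alone
  query = data.pop if data.last.is_a?(StubQuery)    # trailing query, if any
  expr = data.length > 1 ? [:exprlist, *data] : data.first
  [expr, query]
end

q = StubQuery.new([])
p split_filter([:a])          # single expression, no query
p split_filter([:a, :b, q])   # several expressions plus a query
```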
freeze_bnodes()

Freeze BNodes, which allows us to detect if they're re-used.

@return [void]

# File lib/sparql/grammar/parser11.rb, line 1762
def freeze_bnodes
  @bnode_cache ||= {}
  @bnode_cache.each_value(&:freeze)
end
gen_bnodes(value = true)

Generate BNodes rather than non-distinguished variables.

@param [Boolean] value
@return [void]

# File lib/sparql/grammar/parser11.rb, line 1750
def gen_bnodes(value = true)
  @nd_var_gen = value ? false : "0"
end
iri(value)

Create URIs

# File lib/sparql/grammar/parser11.rb, line 1810
def iri(value)
  # If we have a base URI, use that when constructing a new URI
  value = RDF::URI(value)
  if base_uri && value.relative?
    u = base_uri.join(value)
    u.lexical = "<#{value}>" unless resolve_iris?
    u
  else
    value
  end
end
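
The core of this method is ordinary relative-reference resolution against a base. The sketch below shows that step with Ruby's standard `uri` library swapped in for RDF::URI, so it does not reproduce the `lexical` bookkeeping used when IRIs are left unresolved. `resolve` is a hypothetical helper name.

```ruby
require "uri"

# Resolve `value` against `base` when it is relative; otherwise return it as-is.
# Uses the stdlib URI class in place of RDF::URI (an assumption for illustration).
def resolve(base, value)
  ref = URI(value)
  return ref if base.nil? || ref.absolute?
  URI.join(base, value)
end

puts resolve("http://example.org/a/b", "c")
puts resolve("http://example.org/a/b", "http://other.example/x")
```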
literal(value, **options)

Create a literal

# File lib/sparql/grammar/parser11.rb, line 1833
def literal(value, **options)
  options = options.dup
  # Internal representation is to not use xsd:string, although it could arguably go the other way.
  options.delete(:datatype) if options[:datatype] == RDF::XSD.string
  debug("literal") do
    "value: #{value.inspect}, " +
    "options: #{options.inspect}, " +
    "validate: #{validate?.inspect}, "
  end
  RDF::Literal.new(value, validate: validate?, **options)
end
merge_modifiers(data)

Merge query modifiers, datasets, and projections

This includes transforming aggregates if also used with a GROUP BY

@see www.w3.org/TR/sparql11-query/#convertGroupAggSelectExpressions

# File lib/sparql/grammar/parser11.rb, line 1902
def merge_modifiers(data)
  debug("merge modifiers") {data.inspect}
  query = data[:query] ? data[:query].first : SPARQL::Algebra::Operator::BGP.new

  vars = data[:Var] || []
  order = data[:order] ? data[:order].first : []
  extensions = data.fetch(:extend, [])
  having = data.fetch(:having, [])
  values = data.fetch(:ValuesClause, []).first

  # extension variables must not appear in projected variables.
  # Add them to the projection otherwise
  extensions.each do |(var, _)|
    raise Error, "Extension variable #{var} also in SELECT" if vars.map(&:to_s).include?(var.to_s)
    vars << var
  end

  # If any extension contains an aggregate, and there is no group, implicitly group by 1
  if !data[:group] &&
     extensions.any? {|(_, function)| function.aggregate?} ||
     having.any? {|c| c.aggregate? }
    debug {"Implicit group"}
    data[:group] = [[]]
  end

  # Add datasets and modifiers in order
  if data[:group]
    group_vars = data[:group].first

    # For creating temporary variables
    agg = 0

    # Find aggregated variables in extensions
    aggregates = []
    aggregated_vars = extensions.map do |(var, function)|
      var if function.aggregate?
    end.compact

    # Common function for replacing aggregates with temporary variables,
    # as defined in http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#convertGroupAggSelectExpressions
    aggregate_expression = lambda do |expr|
      # Replace unaggregated variables in expr
      # - For each unaggregated variable V in X
      expr.replace_vars! do |v|
        aggregated_vars.include?(v) ? v : SPARQL::Algebra::Expression[:sample, v]
      end

      # Replace aggregates in expr as above
      expr.replace_aggregate! do |function|
        if avf = aggregates.detect {|(_, f)| f == function}
          avf.first
        else
          # Allocate a temporary variable for this function, and retain the mapping for outside the group
          av = RDF::Query::Variable.new(".#{agg}", distinguished: false)
          agg += 1
          aggregates << [av, function]
          av
        end
      end
    end

    # If there are extensions, they are aggregated if necessary and bound
    # to temporary variables
    extensions.map! do |(var, expr)|
      [var, aggregate_expression.call(expr)]
    end

    # Having clauses
    having.map! do |expr|
      aggregate_expression.call(expr)
    end

    query = if aggregates.empty?
      SPARQL::Algebra::Expression[:group, group_vars, query]
    else
      SPARQL::Algebra::Expression[:group, group_vars, aggregates, query]
    end
  end

  if values
    query = query ? SPARQL::Algebra::Expression[:join, query, values] : values
  end

  query = SPARQL::Algebra::Expression[:extend, extensions, query] unless extensions.empty?

  query = SPARQL::Algebra::Expression[:filter, *having, query] unless having.empty?

  query = SPARQL::Algebra::Expression[:order, data[:order].first, query] unless order.empty?

  query = SPARQL::Algebra::Expression[:project, vars, query] unless vars.empty?

  query = SPARQL::Algebra::Expression[data[:DISTINCT_REDUCED], query] if data[:DISTINCT_REDUCED]

  query = SPARQL::Algebra::Expression[:slice, data[:slice][0], data[:slice][1], query] if data[:slice]

  query = SPARQL::Algebra::Expression[:dataset, data[:dataset], query] if data[:dataset]

  query
end
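
The final statements of `merge_modifiers` wrap the query in one operator after another, so whichever modifier is applied last ends up outermost. A rough sketch of that ordering follows, using nested arrays in place of SPARQL::Algebra::Expression objects. `wrap_modifiers` and its hash keys are hypothetical, and the grouping and VALUES join steps are left out.

```ruby
# Wrap `query` in modifiers, innermost first: extend, filter, order,
# project, distinct/reduced, slice, dataset. Arrays stand in for operators.
def wrap_modifiers(query, data)
  query = [:extend, data[:extend], query] if data[:extend]
  query = [:filter, *data[:having], query] if data[:having]
  query = [:order, data[:order], query] if data[:order]
  query = [:project, data[:vars], query] if data[:vars]
  query = [data[:distinct_reduced], query] if data[:distinct_reduced]
  query = [:slice, *data[:slice], query] if data[:slice]
  query = [:dataset, data[:dataset], query] if data[:dataset]
  query
end

p wrap_modifiers([:bgp], {vars: [:s], distinct_reduced: :distinct, slice: [0, 10]})
# prints [:slice, 0, 10, [:distinct, [:project, [:s], [:bgp]]]]
```

Reading the result from the inside out gives the evaluation order: project, then distinct, then slice.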
ns(prefix, suffix)
# File lib/sparql/grammar/parser11.rb, line 1822
def ns(prefix, suffix)
  base = prefix(prefix).to_s
  suffix = suffix.to_s.sub(/^\#/, "") if base.index("#")
  debug {"ns(#{prefix.inspect}): base: '#{base}', suffix: '#{suffix}'"}
  iri = iri(base + suffix.to_s)
  # Cause URI to be serialized as a lexical
  iri.lexical = "#{prefix}:#{suffix}" unless resolve_iris?
  iri
end
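
In essence, `ns` expands a prefixed name by concatenating the prefix's IRI with the local part, dropping a leading `#` from the local part when the prefix IRI already contains one. The following sketch shows only that string handling. `expand_pname` is a hypothetical helper and the prefix table is sample data.

```ruby
# Expand prefix:suffix against a prefix table of plain strings.
def expand_pname(prefixes, prefix, suffix)
  base = prefixes.fetch(prefix).to_s
  suffix = suffix.to_s
  suffix = suffix.sub(/\A#/, "") if base.include?("#")  # avoid a doubled '#'
  base + suffix
end

prefixes = {
  dc:  "http://purl.org/dc/terms/",
  rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
}
puts expand_pname(prefixes, :dc, "title")
puts expand_pname(prefixes, :rdf, "#type")
```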
prefix(name, iri = nil)

Defines the given named URI prefix for this parser.

@example Defining a URI prefix

prefix :dc, RDF::URI('http://purl.org/dc/terms/')

@example Returning a URI prefix

prefix(:dc)    #=> RDF::URI('http://purl.org/dc/terms/')

@overload prefix(name, uri)

@param  [Symbol, #to_s]   name
@param  [RDF::URI, #to_s] uri

@overload prefix(name)

@param  [Symbol, #to_s]   name

@return [RDF::URI]

# File lib/sparql/grammar/parser11.rb, line 1696
def prefix(name, iri = nil)
  name = name.to_s.empty? ? nil : (name.respond_to?(:to_sym) ? name.to_sym : name.to_s.to_sym)
  iri.nil? ? prefixes[name] : prefixes[name] = iri
end
prefixes()

Returns the URI prefixes currently defined for this parser.

@example

prefixes[:dc]  #=> RDF::URI('http://purl.org/dc/terms/')

@return [Hash{Symbol => RDF::URI}]
@since 0.3.0

# File lib/sparql/grammar/parser11.rb, line 1660
def prefixes
  @options[:prefixes] ||= {}
end
prefixes=(prefixes)

Defines the given URI prefixes for this parser.

@example

prefixes = {
  dc: RDF::URI('http://purl.org/dc/terms/'),
}

@param [Hash{Symbol => RDF::URI}] prefixes
@return [Hash{Symbol => RDF::URI}]
@since 0.3.0

# File lib/sparql/grammar/parser11.rb, line 1675
def prefixes=(prefixes)
  @options[:prefixes] = prefixes
end
resolve_iris?()

Returns `true` when resolving IRIs, otherwise BASE and PREFIX are retained in the output algebra.

@return [Boolean] `true` or `false`
@since 0.3.0

# File lib/sparql/grammar/parser11.rb, line 1731
def resolve_iris?
  @options[:resolve_iris]
end
validate?()

Returns `true` if parsed statements and values should be validated.

@return [Boolean] `true` or `false`
@since 1.0.3

# File lib/sparql/grammar/parser11.rb, line 1740
def validate?
  @options[:validate]
end
variable(id, distinguished = true)

Return variable allocated to an ID. If no ID is provided, a new variable is allocated. Otherwise, any previous assignment will be used.

The variable responds to `distinguished?` according to whether it is a distinguished or non-distinguished variable. Non-distinguished variables are effectively the same as BNodes.

@return [RDF::Query::Variable]

# File lib/sparql/grammar/parser11.rb, line 1792
def variable(id, distinguished = true)
  id = nil if id.to_s.empty?

  if id
    @vars[id] ||= begin
      RDF::Query::Variable.new(id, distinguished: distinguished)
    end
  else
    unless distinguished
      # Allocate a non-distinguished variable identifier
      id = @nd_var_gen
      @nd_var_gen = id.succ
    end
    RDF::Query::Variable.new(id, distinguished: distinguished)
  end
end
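
The allocation rules can be summarised as: named variables are memoised by ID, while unnamed non-distinguished variables draw sequential names from a counter string advanced with `String#succ`. The sketch below models this with a hypothetical `VarAllocator` class, using a `Struct` in place of RDF::Query::Variable.

```ruby
# Hypothetical model of variable allocation; Var replaces RDF::Query::Variable.
class VarAllocator
  Var = Struct.new(:name, :distinguished)

  def initialize
    @vars = {}
    @nd_var_gen = "0"
  end

  def variable(id, distinguished = true)
    id = nil if id.to_s.empty?
    if id
      @vars[id] ||= Var.new(id, distinguished)  # reuse a previous assignment
    else
      unless distinguished
        id = @nd_var_gen                        # next non-distinguished name
        @nd_var_gen = id.succ
      end
      Var.new(id, distinguished)
    end
  end
end

alloc = VarAllocator.new
puts alloc.variable("s").equal?(alloc.variable("s"))  # same object both times
puts alloc.variable(nil, false).name                  # first generated name
puts alloc.variable(nil, false).name                  # next generated name
```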