class SPARQL::Grammar::Parser
A parser for the SPARQL 1.1 grammar.
@see www.w3.org/TR/sparql11-query/#grammar
@see en.wikipedia.org/wiki/LR_parser
Constants
- AGGREGATE_RULES
- BUILTINS
Builtin functions
- BUILTIN_RULES
Attributes
The current input string being processed.
@return [String]
Used for generating BNode labels
Any additional options for the parser.
@return [Hash]
The internal representation of the result, using a hierarchy of RDF objects and SPARQL::Algebra::Operator objects.
@return [Array]
The current input tokens being processed.
@return [Array<Token>]
Public Class Methods
Initializes a new parser instance.
@param [String, IO, StringIO, #to_s] input
@param [Hash{Symbol => Object}] options
@option options [Hash] :prefixes (Hash.new)
the prefix mappings to use (for accessing intermediate parser productions)
@option options [#to_s] :base_uri (nil)
the base URI to use when resolving relative URIs (for accessing intermediate parser productions)
@option options [#to_s] :anon_base ("b0")
Basis for generating anonymous Nodes
@option options [Boolean] :resolve_iris (false)
Resolve prefix and relative IRIs. Otherwise, when serializing the parsed SSE as S-Expressions, use the original prefixed and relative URIs along with `base` and `prefix` definitions.
@option options [Boolean] :validate (false)
whether to validate the parsed statements and values
@option options [Logger, #write, #<<] :logger
Record error/info/debug output
@yield [parser] `self`
@yieldparam [SPARQL::Grammar::Parser] parser
@yieldreturn [void] ignored
@return [SPARQL::Grammar::Parser]
# File lib/sparql/grammar/parser11.rb, line 1562
def initialize(input = nil, **options, &block)
  @input = case input
  when IO, StringIO then input.read
  else input.to_s.dup
  end
  @input.encode!(Encoding::UTF_8) if @input.respond_to?(:encode!)
  @options = {anon_base: "b0", validate: false}.merge(options)

  debug("base IRI") {base_uri.inspect}
  debug("validate") {validate?.inspect}

  @vars = {}
  @nd_var_gen = "0"

  if block_given?
    case block.arity
    when 0 then instance_eval(&block)
    else block.call(self)
    end
  end
end
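The constructor hands the parser to an optional block in one of two ways, chosen by the block's arity. The following is a minimal sketch of that dispatch only; `MiniParser` is a hypothetical stand-in and not part of the library.

```ruby
# Hypothetical stand-in for the parser; only the block dispatch is modelled.
class MiniParser
  attr_reader :input, :options

  def initialize(input = nil, **options, &block)
    @input = input.to_s.dup
    @options = {anon_base: "b0", validate: false}.merge(options)
    if block_given?
      case block.arity
      when 0 then instance_eval(&block)  # block runs with self set to the parser
      else block.call(self)              # block receives the parser as an argument
      end
    end
  end
end

# A block with one parameter is called with the parser.
seen = nil
MiniParser.new("SELECT * WHERE {}") { |parser| seen = parser.input }

# A block with no parameters is evaluated in the parser's own context.
zero_arity = nil
MiniParser.new("ASK {}", validate: true) { zero_arity = options[:validate] }
```

In the zero-arity form, bare method names such as `options` resolve against the parser instance, which is why no block parameter is needed.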
Public Instance Methods
Parse query
The result is a SPARQL Algebra S-List. Productions return an array such as the following:
(prefix ((: <http://example/>)) (union (bgp (triple ?s ?p ?o)) (graph ?g (bgp (triple ?s ?p ?o)))))
@param [Symbol, #to_s] prod The starting production for the parser.
It may be a URI from the grammar, or a symbol representing the local_name portion of the grammar URI.
@return [Array]
@see www.w3.org/TR/sparql11-query/#sparqlAlgebra
@see axel.deri.ie/sparqltutorial/ESWC2007_SPARQL_Tutorial_unit2b.pdf
# File lib/sparql/grammar/parser11.rb, line 1621
def parse(prod = START)
  ll1_parse(@input,
    prod.to_sym,
    branch: BRANCH,
    first: FIRST,
    follow: FOLLOW,
    whitespace: WS,
    **@options
  )

  # The last thing on the @prod_data stack is the result
  @result = case
  when !prod_data.is_a?(Hash)
    prod_data
  when prod_data.empty?
    nil
  when prod_data[:query]
    Array(prod_data[:query]).length == 1 ? prod_data[:query].first : prod_data[:query]
  when prod_data[:update]
    prod_data[:update]
  else
    key = prod_data.keys.first
    [key] + Array(prod_data[key])  # Creates [:key, [:triple], ...]
  end

  # Validate resulting expression
  @result.validate! if @result && validate?
  @result
end
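The `case` at the end of `parse` decides what is returned from the accumulated production data. A small sketch of just that selection step, using plain arrays and hashes; `select_result` is a hypothetical helper name and does not exist in the library.

```ruby
# Hypothetical helper mirroring the result selection in `parse`.
def select_result(prod_data)
  case
  when !prod_data.is_a?(Hash)
    prod_data                      # non-Hash data is returned unchanged
  when prod_data.empty?
    nil                            # nothing was produced
  when prod_data[:query]
    # A single query is unwrapped; several queries stay in an array
    Array(prod_data[:query]).length == 1 ? prod_data[:query].first : prod_data[:query]
  when prod_data[:update]
    prod_data[:update]
  else
    key = prod_data.keys.first
    [key] + Array(prod_data[key])  # e.g. [:pattern, [:triple, ...]]
  end
end

single = select_result(query: [[:bgp, [:triple, :s, :p, :o]]])
other  = select_result(pattern: [[:triple, :s, :p, :o]])
empty  = select_result({})
```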
# File lib/sparql/grammar/parser11.rb, line 1600
def to_s
  @result.to_sxp
end
@return [String]
# File lib/sparql/grammar/parser11.rb, line 1596
def to_sxp_bin
  @result
end
Returns `true` if the input string is syntactically valid.
@return [Boolean]
# File lib/sparql/grammar/parser11.rb, line 1588
def valid?
  parse
  true
rescue Error
  false
end
Private Instance Methods
Accumulate joined expressions for `prod1 (op prod2)*` so they can later be combined into `(op (op 1 2) 3)`
# File lib/sparql/grammar/parser11.rb, line 2014
def accumulate_operator_expressions(operator, production, data)
  if data[operator]
    # Add [op data] to stack based on "production"
    add_prod_datum(production, [data[operator], data[:Expression]])
    # Add previous [op data] information
    add_prod_datum(production, data[production])
  else
    # No operator, forward :Expression
    add_prod_datum(:Expression, data[:Expression])
  end
end
Add joined expressions for `prod1 (op prod2)*` to form `(op (op 1 2) 3)`
# File lib/sparql/grammar/parser11.rb, line 2003
def add_operator_expressions(production, data)
  # Iterate through expression to create binary operations
  lhs = data[:Expression]
  while data[production] && !data[production].empty?
    op, rhs = data[production].shift, data[production].shift
    lhs = SPARQL::Algebra::Expression[op + lhs + rhs]
  end
  add_prod_datum(:Expression, lhs)
end
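The loop above builds a left-associative tree by repeatedly taking an operator and a right-hand side off the front of a flat list. A minimal sketch of the same fold, with plain arrays standing in for SPARQL::Algebra::Expression objects; `fold_left` is a hypothetical name used only for illustration.

```ruby
# Fold `lhs (op rhs)*` into left-nested prefix form.
# `pairs` is a flat list [op1, rhs1, op2, rhs2, ...].
def fold_left(lhs, pairs)
  pairs = pairs.dup                      # leave the caller's list untouched
  until pairs.empty?
    op, rhs = pairs.shift, pairs.shift   # consume one operator/operand pair
    lhs = [op, lhs, rhs]                 # the previous result becomes the new left side
  end
  lhs
end

folded = fold_left(1, [:+, 2, :-, 3])
# folded is [:-, [:+, 1, 2], 3], i.e. (1 + 2) - 3
```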
Add a pattern
@param [String] production Production generating pattern
@param [Hash{Symbol => Object}] options
# File lib/sparql/grammar/parser11.rb, line 1873
def add_pattern(production, **options)
  progress(production, "[:pattern, #{options[:subject]}, #{options[:predicate]}, #{options[:object]}]")
  triple = {}
  options.each_pair do |r, v|
    if v.is_a?(Array) && v.flatten.length == 1
      v = v.flatten.first
    end
    if validate? && !v.is_a?(RDF::Term)
      error("add_pattern", "Expected #{r} to be a resource, but it was #{v.inspect}",
        production: production)
    end
    triple[r] = v
  end
  add_prod_datum(:pattern, RDF::Query::Pattern.new(triple))
end
Returns the Base URI defined for the parser, as specified or when parsing a BASE prologue element.
@example
base_uri #=> RDF::URI('http://example.com/')
@return [RDF::URI]
# File lib/sparql/grammar/parser11.rb, line 1709
def base_uri
  RDF::URI(@options[:base_uri])
end
Set the Base URI to use for this parser.
@param [RDF::URI, #to_s] iri
@example
base_uri = RDF::URI('http://purl.org/dc/terms/')
@return [RDF::URI]
# File lib/sparql/grammar/parser11.rb, line 1722
def base_uri=(iri)
  @options[:base_uri] = RDF::URI(iri)
end
Generate a BNode identifier
# File lib/sparql/grammar/parser11.rb, line 1768
def bnode(id = nil)
  if @nd_var_gen
    # Use non-distinguished variables within patterns
    variable(id, false)
  else
    unless id
      id = @options[:anon_base]
      @options[:anon_base] = @options[:anon_base].succ
    end
    @bnode_cache ||= {}
    raise Error, "Illegal attempt to reuse a BNode" if @bnode_cache[id] && @bnode_cache[id].frozen?
    @bnode_cache[id] ||= RDF::Node.new(id)
  end
end
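When BNodes rather than non-distinguished variables are generated, fresh identifiers come from calling `succ` on the `:anon_base` string, and frozen cache entries signal illegal reuse. The sketch below models only that branch. `LabelGenerator` is hypothetical, plain strings stand in for RDF::Node, and ArgumentError stands in for the parser's own Error class.

```ruby
# Hypothetical generator: models the label sequence and the reuse check only.
class LabelGenerator
  def initialize(anon_base = "b0")
    @anon_base = anon_base
    @cache = {}
  end

  # Return the node for `id`, allocating a fresh identifier when id is nil.
  def bnode(id = nil)
    unless id
      id = @anon_base
      @anon_base = @anon_base.succ   # "b0" -> "b1" -> ... -> "b9" -> "c0"
    end
    # ArgumentError is an assumption here; the parser raises its own Error
    raise ArgumentError, "Illegal attempt to reuse a BNode" if @cache[id]&.frozen?
    @cache[id] ||= "_:#{id}"
  end

  # Freezing marks every node seen so far as closed for reuse.
  def freeze_bnodes
    @cache.each_value(&:freeze)
  end
end

gen = LabelGenerator.new
first  = gen.bnode
second = gen.bnode
named  = gen.bnode("x")
again  = gen.bnode("x")    # same object while the cache is not frozen

gen.freeze_bnodes
reuse_rejected = begin
  gen.bnode("x")
  false
rescue ArgumentError
  true
end
```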
Clear cached BNodes
@return [void]
# File lib/sparql/grammar/parser11.rb, line 1756
def clear_bnode_cache
  @bnode_cache = {}
end
Take collection of objects and create RDF Collection using rdf:first, rdf:rest and rdf:nil
@param [Hash] data Production Data
# File lib/sparql/grammar/parser11.rb, line 1847
def expand_collection(data)
  # Add any triples generated from deeper productions
  add_prod_datum(:pattern, data[:pattern])

  # Create list items for each element in data[:GraphNode]
  first = data[:Collection]
  list = Array(data[:GraphNode]).flatten.compact
  last = list.pop

  list.each do |r|
    add_pattern(:Collection, subject: first, predicate: RDF["first"], object: r)
    rest = bnode()
    add_pattern(:Collection, subject: first, predicate: RDF["rest"], object: rest)
    first = rest
  end

  if last
    add_pattern(:Collection, subject: first, predicate: RDF["first"], object: last)
  end
  add_pattern(:Collection, subject: first, predicate: RDF["rest"], object: RDF["nil"])
end
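The expansion above chains list cells with rdf:first and rdf:rest and closes the chain with rdf:nil. A minimal sketch of the same walk, with symbols and strings standing in for RDF terms; `expand_list` and the `fresh` lambda are hypothetical names introduced for illustration.

```ruby
# Expand `items` into rdf:first / rdf:rest triples starting at `head`.
# `fresh` is called for each new list cell and returns its label.
def expand_list(head, items, fresh)
  triples = []
  node = head
  *init, last = items
  init.each do |item|
    triples << [node, :"rdf:first", item]
    rest = fresh.call
    triples << [node, :"rdf:rest", rest]
    node = rest                         # continue from the new cell
  end
  triples << [node, :"rdf:first", last] unless last.nil?
  triples << [node, :"rdf:rest", :"rdf:nil"]   # terminate the list
  triples
end

counter = 0
fresh = -> { "_:b#{counter += 1}" }
triples = expand_list("_:head", [1, 2, 3], fresh)
```

For three items this yields six triples: a first/rest pair for each of the three cells, with the final rest pointing at rdf:nil.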
Flatten data of the form filter: [op+ bgp?] into a filter expression and a query (nil when no query is present), creating an exprlist if necessary
@return [Array[:expr, query]]
# File lib/sparql/grammar/parser11.rb, line 1891
def flatten_filter(data)
  query = data.pop if data.last.is_a?(SPARQL::Algebra::Query)
  expr = data.length > 1 ? SPARQL::Algebra::Operator::Exprlist.new(*data) : data.first
  [expr, query]
end
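A short sketch of the split performed by `flatten_filter`, assuming simple Struct stand-ins: `Query` and `Exprlist` below are hypothetical substitutes for SPARQL::Algebra::Query and SPARQL::Algebra::Operator::Exprlist, and `split_filter` is an illustrative name.

```ruby
# Hypothetical stand-ins for the algebra classes.
Query = Struct.new(:patterns)
Exprlist = Struct.new(:operands)

# Split [expr+, query?] into [expr_or_exprlist, query_or_nil].
def split_filter(data)
  data = data.dup                                   # do not mutate the caller's array
  query = data.pop if data.last.is_a?(Query)        # a trailing query is peeled off
  expr = data.length > 1 ? Exprlist.new(data) : data.first
  [expr, query]
end

one  = split_filter([[:<, :x, 3], Query.new([])])
many = split_filter([[:<, :x, 3], [:>, :y, 1]])
```

A single expression is returned as is, while several expressions are wrapped in one expression list so the caller always receives exactly one filter operand.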
Freeze BNodes, which allows us to detect if they're re-used
@return [void]
# File lib/sparql/grammar/parser11.rb, line 1762
def freeze_bnodes
  @bnode_cache ||= {}
  @bnode_cache.each_value(&:freeze)
end
Generate BNodes, not non-distinguished variables
@param [Boolean] value
@return [void]
# File lib/sparql/grammar/parser11.rb, line 1750
def gen_bnodes(value = true)
  @nd_var_gen = value ? false : "0"
end
Create URIs
# File lib/sparql/grammar/parser11.rb, line 1810
def iri(value)
  # If we have a base URI, use that when constructing a new URI
  value = RDF::URI(value)
  if base_uri && value.relative?
    u = base_uri.join(value)
    u.lexical = "<#{value}>" unless resolve_iris?
    u
  else
    value
  end
end
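The method above resolves a relative reference against the base URI and leaves absolute IRIs alone. The sketch below illustrates that rule with Ruby's standard `uri` library as a stand-in for RDF::URI; `resolve_iri` is a hypothetical helper, and the retention of the original lexical form is not modelled.

```ruby
require "uri"

# Resolve `value` against `base` when it is relative; otherwise return it unchanged.
def resolve_iri(value, base = nil)
  uri = URI(value)
  if base && uri.relative?
    URI.join(base, value).to_s
  else
    uri.to_s
  end
end

relative = resolve_iri("foo#bar", "http://example.com/a/b")
absolute = resolve_iri("http://other.example/x", "http://example.com/")
no_base  = resolve_iri("foo")
```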
Create a literal
# File lib/sparql/grammar/parser11.rb, line 1833
def literal(value, **options)
  options = options.dup
  # Internal representation is to not use xsd:string, although it could arguably go the other way.
  options.delete(:datatype) if options[:datatype] == RDF::XSD.string
  debug("literal") do
    "value: #{value.inspect}, " +
    "options: #{options.inspect}, " +
    "validate: #{validate?.inspect}, "
  end
  RDF::Literal.new(value, validate: validate?, **options)
end
Merge query modifiers, datasets, and projections
This includes transforming aggregates if also used with a GROUP BY
@see www.w3.org/TR/sparql11-query/#convertGroupAggSelectExpressions
# File lib/sparql/grammar/parser11.rb, line 1902
def merge_modifiers(data)
  debug("merge modifiers") {data.inspect}
  query = data[:query] ? data[:query].first : SPARQL::Algebra::Operator::BGP.new

  vars = data[:Var] || []
  order = data[:order] ? data[:order].first : []
  extensions = data.fetch(:extend, [])
  having = data.fetch(:having, [])
  values = data.fetch(:ValuesClause, []).first

  # extension variables must not appear in projected variables.
  # Add them to the projection otherwise
  extensions.each do |(var, _)|
    raise Error, "Extension variable #{var} also in SELECT" if vars.map(&:to_s).include?(var.to_s)
    vars << var
  end

  # If any extension contains an aggregate, and there is no group, implicitly group by 1
  if !data[:group] &&
     extensions.any? {|(_, function)| function.aggregate?} ||
     having.any? {|c| c.aggregate? }
    debug {"Implicit group"}
    data[:group] = [[]]
  end

  # Add datasets and modifiers in order
  if data[:group]
    group_vars = data[:group].first

    # For creating temporary variables
    agg = 0

    # Find aggregated variables in extensions
    aggregates = []
    aggregated_vars = extensions.map do |(var, function)|
      var if function.aggregate?
    end.compact

    # Common function for replacing aggregates with temporary variables,
    # as defined in http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#convertGroupAggSelectExpressions
    aggregate_expression = lambda do |expr|
      # Replace unaggregated variables in expr
      # - For each unaggregated variable V in X
      expr.replace_vars! do |v|
        aggregated_vars.include?(v) ? v : SPARQL::Algebra::Expression[:sample, v]
      end

      # Replace aggregates in expr as above
      expr.replace_aggregate! do |function|
        if avf = aggregates.detect {|(_, f)| f == function}
          avf.first
        else
          # Allocate a temporary variable for this function, and retain the mapping for outside the group
          av = RDF::Query::Variable.new(".#{agg}", distinguished: false)
          agg += 1
          aggregates << [av, function]
          av
        end
      end
    end

    # If there are extensions, they are aggregated if necessary and bound
    # to temporary variables
    extensions.map! do |(var, expr)|
      [var, aggregate_expression.call(expr)]
    end

    # Having clauses
    having.map! do |expr|
      aggregate_expression.call(expr)
    end

    query = if aggregates.empty?
      SPARQL::Algebra::Expression[:group, group_vars, query]
    else
      SPARQL::Algebra::Expression[:group, group_vars, aggregates, query]
    end
  end

  if values
    query = query ? SPARQL::Algebra::Expression[:join, query, values] : values
  end

  query = SPARQL::Algebra::Expression[:extend, extensions, query] unless extensions.empty?
  query = SPARQL::Algebra::Expression[:filter, *having, query] unless having.empty?
  query = SPARQL::Algebra::Expression[:order, data[:order].first, query] unless order.empty?
  query = SPARQL::Algebra::Expression[:project, vars, query] unless vars.empty?
  query = SPARQL::Algebra::Expression[data[:DISTINCT_REDUCED], query] if data[:DISTINCT_REDUCED]
  query = SPARQL::Algebra::Expression[:slice, data[:slice][0], data[:slice][1], query] if data[:slice]
  query = SPARQL::Algebra::Expression[:dataset, data[:dataset], query] if data[:dataset]

  query
end
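The tail of `merge_modifiers` wraps the query in modifiers from the inside out: extend, filter, order, project, distinct or reduced, slice, then dataset. The sketch below shows only that nesting order, with plain arrays standing in for algebra operators. `wrap_modifiers` and its hash keys are hypothetical; grouping, aggregates and VALUES are left out.

```ruby
# Wrap `query` in modifiers, innermost first.
def wrap_modifiers(query, data)
  query = [:extend, data[:extend], query]    if data[:extend]
  query = [:filter, *data[:having], query]   if data[:having]
  query = [:order, data[:order], query]      if data[:order]
  query = [:project, data[:vars], query]     if data[:vars]
  query = [data[:distinct_reduced], query]   if data[:distinct_reduced]
  query = [:slice, *data[:slice], query]     if data[:slice]
  query = [:dataset, data[:dataset], query]  if data[:dataset]
  query
end

wrapped = wrap_modifiers([:bgp], vars: [:s], distinct_reduced: :distinct, slice: [:_, 10])
# wrapped is [:slice, :_, 10, [:distinct, [:project, [:s], [:bgp]]]]
```

Because each step wraps the previous result, the modifier applied last ends up outermost, so slicing happens after projection and duplicate elimination.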
# File lib/sparql/grammar/parser11.rb, line 1822
def ns(prefix, suffix)
  base = prefix(prefix).to_s
  suffix = suffix.to_s.sub(/^\#/, "") if base.index("#")
  debug {"ns(#{prefix.inspect}): base: '#{base}', suffix: '#{suffix}'"}
  iri = iri(base + suffix.to_s)
  # Cause URI to be serialized as a lexical
  iri.lexical = "#{prefix}:#{suffix}" unless resolve_iris?
  iri
end
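`ns` expands a prefixed name by concatenating the prefix's IRI with the local part, dropping a leading `#` from the local part when the prefix IRI already contains one. A minimal sketch of that string handling with plain strings; `expand_pname` is a hypothetical helper, and the lexical-form bookkeeping is omitted.

```ruby
# Expand prefix:suffix against a prefix table of plain strings.
def expand_pname(prefixes, prefix, suffix)
  base = prefixes.fetch(prefix).to_s
  suffix = suffix.to_s
  # Avoid a doubled "#" when the namespace already contains one
  suffix = suffix.sub(/^\#/, "") if base.index("#")
  base + suffix
end

prefixes = {dc: "http://purl.org/dc/terms/", ex: "http://example.org/ns#"}
title = expand_pname(prefixes, :dc, "title")
thing = expand_pname(prefixes, :ex, "#thing")
```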
Defines the given named URI prefix for this parser.
@example Defining a URI prefix
prefix :dc, RDF::URI('http://purl.org/dc/terms/')
@example Returning a URI prefix
prefix(:dc) #=> RDF::URI('http://purl.org/dc/terms/')
@overload prefix(name, uri)
@param [Symbol, #to_s] name
@param [RDF::URI, #to_s] uri
@overload prefix(name)
@param [Symbol, #to_s] name
@return [RDF::URI]
# File lib/sparql/grammar/parser11.rb, line 1696
def prefix(name, iri = nil)
  name = name.to_s.empty? ? nil : (name.respond_to?(:to_sym) ? name.to_sym : name.to_s.to_sym)
  iri.nil? ? prefixes[name] : prefixes[name] = iri
end
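`prefix` acts as both getter and setter, and it normalizes the name so that strings and symbols address the same entry while the empty prefix is stored under `nil`. A small sketch of that behaviour, using a hypothetical `PrefixTable` class and plain strings in place of RDF::URI values.

```ruby
# Hypothetical holder modelling the combined getter/setter.
class PrefixTable
  def prefixes
    @prefixes ||= {}
  end

  def prefix(name, iri = nil)
    # Empty or nil names map to the nil key; everything else becomes a Symbol
    name = name.to_s.empty? ? nil : name.to_s.to_sym
    iri.nil? ? prefixes[name] : prefixes[name] = iri
  end
end

table = PrefixTable.new
table.prefix(:dc, "http://purl.org/dc/terms/")   # define with a Symbol
table.prefix("", "http://example/")              # define the empty prefix

dc_lookup    = table.prefix("dc")   # look up with a String
empty_lookup = table.prefix(nil)    # look up the empty prefix
```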
Returns the URI prefixes currently defined for this parser.
@example
prefixes[:dc] #=> RDF::URI('http://purl.org/dc/terms/')
@return [Hash{Symbol => RDF::URI}]
@since 0.3.0
# File lib/sparql/grammar/parser11.rb, line 1660
def prefixes
  @options[:prefixes] ||= {}
end
Defines the given URI prefixes for this parser.
@example
prefixes = { dc: RDF::URI('http://purl.org/dc/terms/'), }
@param [Hash{Symbol => RDF::URI}] prefixes
@return [Hash{Symbol => RDF::URI}]
@since 0.3.0
# File lib/sparql/grammar/parser11.rb, line 1675
def prefixes=(prefixes)
  @options[:prefixes] = prefixes
end
Returns `true` when resolving IRIs, otherwise BASE and PREFIX are retained in the output algebra.
@return [Boolean] `true` or `false`
@since 1.0.3
# File lib/sparql/grammar/parser11.rb, line 1731
def resolve_iris?
  @options[:resolve_iris]
end
Returns `true` if parsed statements and values should be validated.
@return [Boolean] `true` or `false`
@since 0.3.0
# File lib/sparql/grammar/parser11.rb, line 1740
def validate?
  @options[:validate]
end
Return variable allocated to an ID. If no ID is provided, a new variable is allocated. Otherwise, any previous assignment will be used.
The variable's `distinguished?` method reports whether it is a distinguished or a non-distinguished variable. Non-distinguished variables are effectively the same as BNodes.
@return [RDF::Query::Variable]
# File lib/sparql/grammar/parser11.rb, line 1792
def variable(id, distinguished = true)
  id = nil if id.to_s.empty?

  if id
    @vars[id] ||= begin
      RDF::Query::Variable.new(id, distinguished: distinguished)
    end
  else
    unless distinguished
      # Allocate a non-distinguished variable identifier
      id = @nd_var_gen
      @nd_var_gen = id.succ
    end
    RDF::Query::Variable.new(id, distinguished: distinguished)
  end
end
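Named variables are cached by identifier, while anonymous non-distinguished variables draw their identifiers from a string counter advanced with `succ`. The sketch below models that allocation. `Var` and `VariableAllocator` are hypothetical stand-ins for RDF::Query::Variable and the parser.

```ruby
# Hypothetical stand-in for RDF::Query::Variable.
Var = Struct.new(:name, :distinguished)

class VariableAllocator
  def initialize
    @vars = {}
    @nd_var_gen = "0"
  end

  def variable(id, distinguished = true)
    id = nil if id.to_s.empty?
    if id
      # Named variables are allocated once and reused afterwards
      @vars[id] ||= Var.new(id, distinguished)
    else
      unless distinguished
        # Allocate the next non-distinguished identifier: "0", "1", "2", ...
        id = @nd_var_gen
        @nd_var_gen = id.succ
      end
      Var.new(id, distinguished)
    end
  end
end

alloc = VariableAllocator.new
x1  = alloc.variable("x")
x2  = alloc.variable("x")          # same object as x1
nd1 = alloc.variable(nil, false)   # named "0"
nd2 = alloc.variable("", false)    # an empty id is treated as nil, named "1"
```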