class Oga::XML::SaxParser
The SaxParser class provides the basic interface for writing custom SAX parsers. All callback methods defined in {Oga::XML::Parser} are delegated to a dedicated handler class.
To write a custom handler for the SAX parser, create a class that implements one (or many) of the following callback methods:
- `on_document`
- `on_doctype`
- `on_cdata`
- `on_comment`
- `on_proc_ins`
- `on_xml_decl`
- `on_text`
- `on_element`
- `on_element_children`
- `on_attribute`
- `on_attributes`
- `after_element`
For example:
    class SaxHandler
      def on_element(namespace, name, attrs = {})
        puts name
      end
    end
You can then use it as follows:
    handler = SaxHandler.new
    parser  = Oga::XML::SaxParser.new(handler, '<foo />')

    parser.parse
For information on the callback arguments see the documentation of the corresponding methods in {Oga::XML::Parser}.
## Element Callbacks
The SAX parser changes the behaviour of both `on_element` and `after_element`. In the regular parser, `after_element` only takes an {Oga::XML::Element} instance. In the SAX parser it instead takes the namespace name and the element name, which makes it easier to tell which element a callback belongs to.
An example:
    class SaxHandler
      def on_element(namespace, name, attrs = {})
        # ...
      end

      def after_element(namespace, name)
        puts name # => "foo", "bar", etc
      end
    end
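Because both callbacks receive the namespace and name, a handler can keep its own stack of open elements. The following is a minimal sketch of that idea. `PathTrackingHandler` is a hypothetical name, and the callbacks are invoked by hand in the order a parser would presumably fire them for `<root><a/><x:b/></root>`, so the sketch runs without Oga installed.

```ruby
# Hypothetical handler (not part of Oga): records the path of every element
# it enters, using only the namespace and name passed to the callbacks.
class PathTrackingHandler
  attr_reader :paths

  def initialize
    @stack = []
    @paths = []
  end

  # Called when an element is opened. The attributes are ignored here.
  def on_element(namespace, name, attrs = {})
    @stack.push(namespace ? "#{namespace}:#{name}" : name)
    @paths << @stack.join('/')
  end

  # Called when the element is closed, with the same namespace and name.
  def after_element(namespace, name)
    @stack.pop
  end
end

# Assumption: the calls below imitate the callback order for
# <root><a/><x:b/></root>. With Oga available, the handler would instead be
# passed to Oga::XML::SaxParser.new together with the XML source.
handler = PathTrackingHandler.new
handler.on_element(nil, 'root')
handler.on_element(nil, 'a')
handler.after_element(nil, 'a')
handler.on_element('x', 'b')
handler.after_element('x', 'b')
handler.after_element(nil, 'root')

handler.paths.each { |path| puts path }
```

The stack only works if every `on_element` call is matched by an `after_element` call, which is why the handler pops unconditionally.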
## Attributes
Attributes returned by `on_attribute` are passed as a Hash as the 3rd argument of the `on_element` callback. The keys of this Hash are the attribute names (optionally prefixed by their namespace) and the values are the attribute values. You can override `on_attribute` to control individual attributes and `on_attributes` to control the final set.
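To make the shape of that Hash concrete, here is a small standalone sketch that mirrors the default `on_attribute` and `on_attributes` behaviour listed further down in this document. The helper names `attribute_pair` and `merge_attribute_pairs` are hypothetical, and entity decoding is left out.

```ruby
# Hypothetical helper: builds a one-pair Hash for a single attribute, using
# "namespace:name" as the key when a namespace is present.
def attribute_pair(name, ns = nil, value = nil)
  key = ns ? "#{ns}:#{name}" : name

  { key => value }
end

# Hypothetical helper: merges the one-pair Hashes into a single Hash. Later
# pairs overwrite earlier ones that share the same key.
def merge_attribute_pairs(pairs)
  merged = {}

  pairs.each do |pair|
    pair.each { |key, value| merged[key] = value }
  end

  merged
end

pairs = [
  attribute_pair('class', nil, 'note'),
  attribute_pair('lang', 'xml', 'en')
]

attrs = merge_attribute_pairs(pairs)

puts attrs['class']    # prints "note"
puts attrs['xml:lang'] # prints "en"
```

A handler that defines its own `on_attribute` or `on_attributes` replaces the corresponding step, so the value it returns is what ends up in (or as) the Hash handed to `on_element`.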
Public Class Methods
@param [Object] handler The SAX handler to delegate callbacks to.
@see [Oga::XML::Parser#initialize]
Calls superclass method Oga::XML::Parser::new
    # File lib/oga/xml/sax_parser.rb, line 71
    def initialize(handler, *args)
      @handler = handler

      super(*args)
    end
Public Instance Methods
Manually define `after_element` so it can take a namespace and name. This differs a bit from the regular `after_element` which only takes an {Oga::XML::Element} instance.
@param [Array] namespace_with_name
    # File lib/oga/xml/sax_parser.rb, line 93
    def after_element(namespace_with_name)
      run_callback(:after_element, *namespace_with_name)

      return
    end
Manually define this method because, unlike the other callbacks, its return value is needed: it is passed on to `on_element`.
@see [Oga::XML::Parser#on_attribute]
    # File lib/oga/xml/sax_parser.rb, line 103
    def on_attribute(name, ns = nil, value = nil)
      if @handler.respond_to?(:on_attribute)
        return run_callback(:on_attribute, name, ns, value)
      end

      key = ns ? "#{ns}:#{name}" : name

      if value
        value = EntityDecoder.try_decode(value, @lexer.html?)
      end

      {key => value}
    end
Merges the attributes together into a Hash.
@param [Array] attrs
@return [Hash]

    # File lib/oga/xml/sax_parser.rb, line 121
    def on_attributes(attrs)
      if @handler.respond_to?(:on_attributes)
        return run_callback(:on_attributes, attrs)
      end

      merged = {}

      attrs.each do |pair|
        # Hash#merge requires an extra allocation, this doesn't.
        pair.each { |key, value| merged[key] = value }
      end

      merged
    end
Manually define `on_element` so we can ensure that `after_element` always receives the namespace and name.
@see [Oga::XML::Parser#on_element]
@return [Array]

    # File lib/oga/xml/sax_parser.rb, line 82
    def on_element(namespace, name, attrs = [])
      run_callback(:on_element, namespace, name, attrs)

      [namespace, name]
    end
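The hand-off between `on_element` and `after_element` can be illustrated in isolation. The sketch below assumes that the parser passes the return value of `on_element` back to `after_element` once the element is closed. `ElementBridge` and `LoggingHandler` are hypothetical stand-ins, not Oga classes.

```ruby
# Hypothetical stand-in for the two SaxParser methods: on_element returns
# [namespace, name] and after_element splats that Array for the handler.
class ElementBridge
  def initialize(handler)
    @handler = handler
  end

  def on_element(namespace, name, attrs = {})
    @handler.on_element(namespace, name, attrs)

    [namespace, name]
  end

  def after_element(namespace_with_name)
    @handler.after_element(*namespace_with_name)

    nil
  end
end

# Hypothetical handler that records what it receives.
class LoggingHandler
  attr_reader :events

  def initialize
    @events = []
  end

  def on_element(namespace, name, attrs = {})
    @events << [:open, namespace, name]
  end

  def after_element(namespace, name)
    @events << [:close, namespace, name]
  end
end

handler = LoggingHandler.new
bridge  = ElementBridge.new(handler)

# Assumption: this imitates what the parser does around an element's children.
opened = bridge.on_element('svg', 'rect', { 'width' => '10' })
bridge.after_element(opened)

handler.events.each { |event| puts event.inspect }
```

Returning a plain two-item Array keeps the handler's `after_element` signature identical to the first two arguments of `on_element`, whatever the handler's own `on_element` returns.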
@param [String] text
    # File lib/oga/xml/sax_parser.rb, line 137
    def on_text(text)
      if @handler.respond_to?(:on_text)
        unless inside_literal_html?
          text = EntityDecoder.try_decode(text, @lexer.html?)
        end

        run_callback(:on_text, text)
      end

      return
    end
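The effect of the `inside_literal_html?` check is that text inside HTML `<script>` and `<style>` elements reaches the handler undecoded. A rough sketch of that branch follows. `decode_basic_entities` is a deliberately tiny, hypothetical stand-in for `EntityDecoder.try_decode` that only knows three entities, and `text_for_handler` is likewise a made-up name.

```ruby
# Hypothetical stand-in for Oga's entity decoder: handles &lt;, &gt; and
# &amp; only, in a single pass so decoded text is never decoded twice.
BASIC_ENTITIES = { '&lt;' => '<', '&gt;' => '>', '&amp;' => '&' }.freeze

def decode_basic_entities(text)
  text.gsub(/&(?:lt|gt|amp);/, BASIC_ENTITIES)
end

# Returns the text a handler would receive. `literal` is true when the text
# sits inside an HTML <script> or <style> element.
def text_for_handler(text, literal)
  literal ? text : decode_basic_entities(text)
end

puts text_for_handler('a &lt; b', false) # prints "a < b"
puts text_for_handler('a &lt; b', true)  # prints "a &lt; b"
```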
Private Instance Methods
@return [TrueClass|FalseClass]
    # File lib/oga/xml/sax_parser.rb, line 167
    def inside_literal_html?
      @lexer.html_script? || @lexer.html_style?
    end
@param [Symbol] method
@param [Array] args

    # File lib/oga/xml/sax_parser.rb, line 173
    def run_callback(method, *args)
      @handler.send(method, *args) if @handler.respond_to?(method)
    end