class HTML::Pipeline::Filter
Base class for user content HTML filters. Each filter takes an HTML string or Nokogiri::HTML::DocumentFragment, performs modifications and/or writes information to the result hash. Filters must return a DocumentFragment (typically the same instance provided to the call method) or a String with HTML markup.
Example filter that replaces all images with trollface:
    class FuuuFilter < HTML::Pipeline::Filter
      def call
        doc.search('img').each do |img|
          img['src'] = "http://paradoxdgn.com/junk/avatars/trollface.jpg"
        end
        doc # return the modified fragment, as filters are required to
      end
    end
The context Hash passes options to filters and should not be changed in place. A Result Hash allows filters to make extracted information available to the caller and is mutable.
Common context options:
    :base_url   - The site's base URL
    :repository - A Repository providing context for the HTML being processed
Each filter may define additional options and output values. See the class docs for more info.
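The split between the read-only context Hash and the mutable result Hash can be sketched without the gem. In the sketch below, SketchFilter is a stand-in that mirrors only the constructor and class-level call shown on this page, and MentionSketchFilter with its :mentioned_users key is a hypothetical filter invented for illustration; neither is part of the library.

```ruby
# Stand-in for HTML::Pipeline::Filter (an assumption, so the sketch runs
# without the gem). It handles String input only.
class SketchFilter
  attr_reader :context, :result

  def initialize(html, context = nil, result = nil)
    @html = html
    @context = context || {}
    @result = result || {}
  end

  def self.call(html, context = nil, result = nil)
    new(html, context, result).call
  end
end

# Hypothetical filter: reads :base_url from the context and records every
# @mention in the result hash while linking it.
class MentionSketchFilter < SketchFilter
  def call
    base = context[:base_url] || '/'
    result[:mentioned_users] ||= []
    @html.gsub(/@(\w+)/) do
      name = Regexp.last_match(1)
      result[:mentioned_users] << name
      %(<a href="#{base}#{name}">@#{name}</a>)
    end
  end
end

# Freezing the context makes accidental in-place changes fail loudly.
context = { base_url: 'https://example.com/' }.freeze
result = {}
html = MentionSketchFilter.call('<p>Hi @alice and @bob</p>', context, result)
```

After the call, the caller reads the extracted names from result[:mentioned_users], and the context is unchanged.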
Attributes
context - Public: Returns a simple Hash used to pass extra information into filters and also to allow filters to make extracted information available to the caller.
result - Public: Returns a Hash used to allow filters to pass back information to callers of the various Pipelines. This can be used for mentioned_users, for example.
Public Class Methods
Perform a filter on doc with the given context.
Returns an HTML::Pipeline::DocumentFragment or a String containing HTML markup.
    # File lib/html/pipeline/filter.rb, line 141
    def self.call(doc, context = nil, result = nil)
      new(doc, context, result).call
    end
    # File lib/html/pipeline/filter.rb, line 32
    def initialize(doc, context = nil, result = nil)
      if doc.kind_of?(String)
        @html = doc.to_str
        @doc = nil
      else
        @doc = doc
        @html = nil
      end
      @context = context || {}
      @result = result || {}
      validate
    end
Like call but guarantees that a DocumentFragment is returned, even when the last filter returns a String.
    # File lib/html/pipeline/filter.rb, line 147
    def self.to_document(input, context = nil)
      html = call(input, context)
      HTML::Pipeline.parse(html)
    end
Like call but guarantees that a String of HTML markup is returned.
    # File lib/html/pipeline/filter.rb, line 153
    def self.to_html(input, context = nil)
      output = call(input, context)
      if output.respond_to?(:to_html)
        output.to_html
      else
        output.to_s
      end
    end
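The normalization in to_html relies on duck typing: anything that responds to to_html is serialized, and everything else is converted with to_s. A minimal sketch of that dispatch follows; FragmentSketch and to_html_sketch are hypothetical names standing in for a Nokogiri fragment and the class method.

```ruby
# Stand-in for a DocumentFragment (an assumption): it only knows how to
# serialize itself.
FragmentSketch = Struct.new(:markup) do
  def to_html
    markup
  end
end

# Same branch as the body of to_html, applied to an already-filtered output.
def to_html_sketch(output)
  if output.respond_to?(:to_html)
    output.to_html
  else
    output.to_s
  end
end

from_fragment = to_html_sketch(FragmentSketch.new('<b>x</b>'))
from_string   = to_html_sketch('<i>y</i>')
```

Both calls yield a String, whichever type the filter returned.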
Public Instance Methods
The site’s base URL provided in the context hash, or ‘/’ when no base URL was specified.
    # File lib/html/pipeline/filter.rb, line 111
    def base_url
      context[:base_url] || '/'
    end
The main filter entry point. The doc attribute is guaranteed to be a Nokogiri::HTML::DocumentFragment when invoked. Subclasses should modify this document in place or extract information and add it to the result hash.
    # File lib/html/pipeline/filter.rb, line 74
    def call
      raise NotImplementedError
    end
Returns whether the filter can access a given repo while applying a filter.
A repo can be accessed only if it is pullable by the user who submitted the content being filtered, or if it is the same as the repository context in which the filter runs.
    # File lib/html/pipeline/filter.rb, line 103
    def can_access_repo?(repo)
      return false if repo.nil?
      return true if repo == repository
      repo.pullable_by?(current_user)
    end
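The three branches of the access check can be exercised with a small stand-in. The Repo struct, its readers list, and the explicit repository and current_user parameters are assumptions made for this sketch; in the filter both values come from the context hash.

```ruby
# Stand-in for a Repository object (an assumption): pullable_by? is true
# for users listed in readers.
Repo = Struct.new(:name, :readers) do
  def pullable_by?(user)
    readers.include?(user)
  end
end

# Same three-step check as can_access_repo?, with the context values
# passed in explicitly.
def can_access_repo?(repo, repository, current_user)
  return false if repo.nil?
  return true if repo == repository
  repo.pullable_by?(current_user)
end

home         = Repo.new('home', [])
shared       = Repo.new('shared', ['alice'])
private_repo = Repo.new('private', ['bob'])

no_repo      = can_access_repo?(nil, home, 'alice')          # false: nothing to access
same_repo    = can_access_repo?(home, home, 'alice')         # true: it is the context repository
pullable     = can_access_repo?(shared, home, 'alice')       # true: alice can pull it
not_pullable = can_access_repo?(private_repo, home, 'alice') # false: alice cannot pull it
```

Note that the context repository is accepted without calling pullable_by?, which is why its permissions are assumed to have been checked earlier.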
The User object provided in the context hash, or nil when no user was specified.
    # File lib/html/pipeline/filter.rb, line 93
    def current_user
      context[:current_user]
    end
The Nokogiri::HTML::DocumentFragment to be manipulated. If the filter was provided a String, parse into a DocumentFragment the first time this method is called.
    # File lib/html/pipeline/filter.rb, line 58
    def doc
      @doc ||= parse_html(html)
    end
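The parse-on-first-use behavior comes from the ||= memoization. The sketch below shows the same pattern in isolation; LazyDocSketch, its parse_count counter, and the Hash returned by the stand-in parse_html are all assumptions used to make the effect observable without Nokogiri.

```ruby
class LazyDocSketch
  attr_reader :parse_count

  def initialize(html)
    @html = html
    @doc = nil
    @parse_count = 0
  end

  # Parses on the first call only; later calls return the cached object.
  def doc
    @doc ||= parse_html(@html)
  end

  private

  # Stand-in for HTML::Pipeline.parse (an assumption): counts invocations
  # and returns a plain Hash instead of a DocumentFragment.
  def parse_html(html)
    @parse_count += 1
    { parsed: html }
  end
end

filter = LazyDocSketch.new('<p>hello</p>')
count_before = filter.parse_count
first  = filter.doc
second = filter.doc
```

Because both calls return the same object, in-place modifications made through doc are visible to every later access.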
Helper method for filter subclasses used to determine if any of a node’s ancestors have one of the tag names specified.
    node - The Node object to check.
    tags - An Array of tag name Strings to check. These should be lowercase.

Returns true when the node has a matching ancestor, nil otherwise.
    # File lib/html/pipeline/filter.rb, line 129
    def has_ancestor?(node, tags)
      while node = node.parent
        if tags.include?(node.name.downcase)
          break true
        end
      end
    end
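The ancestor walk can be tried on a hand-built chain of nodes. The Node struct below is a stand-in for a Nokogiri node and exposes only name and parent, which is all the method uses; the tree itself is made up for illustration.

```ruby
# Stand-in for a Nokogiri node (an assumption): just a name and a parent.
Node = Struct.new(:name, :parent)

# Same loop as has_ancestor?: climb parents until one matches or the
# chain ends.
def has_ancestor?(node, tags)
  while node = node.parent
    if tags.include?(node.name.downcase)
      break true
    end
  end
end

# div > PRE > code > text
div  = Node.new('div', nil)
pre  = Node.new('PRE', div)
code = Node.new('code', pre)
text = Node.new('text', code)

inside_pre   = has_ancestor?(text, %w[pre])   # true: node names are downcased before comparison
inside_link  = has_ancestor?(text, %w[a])     # nil: the loop ends without a match
self_ignored = has_ancestor?(code, %w[code])  # nil: the node itself is not checked
```

Only the node names are downcased, so an uppercase entry in tags would never match.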
The String representation of the document. If a DocumentFragment was provided to the Filter, it is serialized into a String when this method is called.
    # File lib/html/pipeline/filter.rb, line 65
    def html
      raise InvalidDocumentException if @html.nil? && @doc.nil?
      @html || doc.to_html
    end
Validator for required context. Checks that every key passed in keys is present in the context hash.
If any keys are missing, an ArgumentError is raised with a message naming the filter class and listing the missing keys.
    # File lib/html/pipeline/filter.rb, line 168
    def needs(*keys)
      missing = keys.reject { |key| context.include? key }
      if missing.any?
        raise ArgumentError, "Missing context keys for #{self.class.name}: #{missing.map(&:inspect).join ', '}"
      end
    end
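A typical use of needs is from an overridden validate, so that a filter fails at construction time when its context is incomplete. The sketch below copies the needs body into a stand-in class, NeedsSketch, which is an assumption made so the example runs without the gem; the required keys are chosen for illustration.

```ruby
class NeedsSketch
  attr_reader :context

  # Mirrors the constructor's tail: store the context, then validate it.
  def initialize(context = nil)
    @context = context || {}
    validate
  end

  # Hypothetical requirement: this filter wants both keys.
  def validate
    needs :base_url, :repository
  end

  def needs(*keys)
    missing = keys.reject { |key| context.include? key }
    if missing.any?
      raise ArgumentError,
            "Missing context keys for #{self.class.name}: #{missing.map(&:inspect).join ', '}"
    end
  end
end

message = nil
begin
  NeedsSketch.new({ base_url: '/' })
rescue ArgumentError => e
  message = e.message
end
# message is "Missing context keys for NeedsSketch: :repository"
```

The check tests key presence with include?, so a key whose value is nil still counts as provided.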
Ensure the passed argument is a DocumentFragment. When a string is provided, it is parsed and returned; otherwise, the DocumentFragment is returned unmodified.
    # File lib/html/pipeline/filter.rb, line 118
    def parse_html(html)
      HTML::Pipeline.parse(html)
    end
The Repository object provided in the context hash, or nil when no :repository was specified.
It’s assumed that the repository context has already been checked for permissions.
    # File lib/html/pipeline/filter.rb, line 87
    def repository
      context[:repository]
    end
Make sure the context has everything we need. No-op by default; subclasses can override.
    # File lib/html/pipeline/filter.rb, line 79
    def validate
    end