class Webgen::PathHandler

Namespace for all path handlers.

About

A path handler is a webgen extension that uses source Path objects to create Node objects and that provides methods for rendering these nodes. The nodes are stored in a hierarchy, the root of which is a Tree object. Path handlers can do simple things, like copying a path from the source to the destination, or complex things, like generating a whole set of nodes from one input path (e.g. generating a whole image gallery).

The paths that are handled by a path handler are generally specified via path patterns. The create_nodes method of a path handler is called for each source path that should be handled. And when it is time to write out a node, the content method on the path handler associated with the node is called to retrieve the rendered content of the node.

Tree creation

The method populate_tree is used for creating the initial node tree, the internal representation of all paths. It is only the initial tree because it is possible that additional, secondary nodes are created during the rendering phase by using the create_secondary_nodes method.

Tree creation works like this:

  1. All path handlers on the invocation list are used in turn. The order is important; it allows avoiding unnecessary write phases and it makes sure that, for example, directory nodes are created before their file nodes.

  2. When a path handler is used for creating nodes, all source paths (retrieved by using the Webgen::Source#paths method) that match one of the associated patterns and/or all paths with the 'handler' meta information set to the path handler are used.

  3. The meta information of a used source path is then updated with the meta information applied by methods registered for the :apply_meta_info_to_path blackboard message.

    After that the source path is given to the parse_meta_info! method of the path handler so that meta information of the path can be updated with meta information stored in the content of the path itself.

    Then the meta information 'versions' is used to determine if multiple versions of the path should be used for creating nodes. Each path version is then given to the create_nodes method of the path handler so that it can create one or more nodes.

  4. Nodes returned by create_nodes of a path handler are assumed to have the Node#node_info keys :path and :path_handler and the meta info key 'modified_at' correctly set (this is automatically done if the Webgen::PathHandler::Base#create_node method is used).
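The version handling in step 3 can be illustrated with a small, self-contained sketch. It does not use webgen itself: ToyPath and expand_versions are hypothetical names, and the logic only mirrors what the source of create_nodes_with_path_handler further down appears to do (one copy of the path per entry in 'versions', or a single 'default' version if the key is absent).

```ruby
# Toy stand-in for a source path; only the meta information matters here.
ToyPath = Struct.new(:path, :meta_info) do
  # Copy the meta information as well so that versions do not share state.
  def initialize_copy(other)
    super
    self.meta_info = other.meta_info.dup
  end
end

# Expand one path into one copy per entry of the 'versions' meta information.
# Without 'versions', a single version named 'default' is produced.
def expand_versions(path)
  versions = path.meta_info.delete('versions') || {'default' => {}}
  versions.map do |name, mi|
    vpath = path.dup
    mi = (mi || {}).dup
    mi['version'] ||= name        # the version name is stored as meta information
    vpath.meta_info.merge!(mi)    # version-specific values override the common ones
    vpath
  end
end

plain = ToyPath.new('/index.page', {'title' => 'Home'})
multi = ToyPath.new('/image.jpg', {'versions' => {'small' => {'width' => 100}, 'large' => nil}})

p expand_versions(plain).map {|v| v.meta_info['version']}
p expand_versions(multi).map {|v| v.meta_info['version']}
```

Each returned copy would then be handed to create_nodes separately, which is why one source path can yield several nodes.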

Path Patterns and Invocation order

Path patterns define which paths are handled by a specific path handler. These patterns are specified when a path handler is registered using the register method. The patterns need to have a format that Dir.glob can handle. Note that a user can always associate any path with a path handler through a meta information path and the 'handler' meta information key.
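To get a feeling for how such patterns behave, the following sketch uses Ruby's File.fnmatch directly. This is an assumption made for illustration: webgen matches via Webgen::Path.matches_pattern?, whose exact behavior may differ. The flag combination follows the paths_for_handler source shown further down; pattern_flags and handled? are hypothetical helper names.

```ruby
# Combine the fnmatch flags from two booleans that play the role of the
# 'path_handler.patterns.*' configuration options.
def pattern_flags(case_sensitive:, match_leading_dot:)
  (case_sensitive ? 0 : File::FNM_CASEFOLD) |
    (match_leading_dot ? File::FNM_DOTMATCH : 0) |
    File::FNM_PATHNAME
end

# True if at least one of the patterns matches the path.
def handled?(path, patterns, flags)
  patterns.any? {|pat| File.fnmatch(pat, path, flags)}
end

patterns = ['**/*.jpg', '**/*.png']
strict = pattern_flags(case_sensitive: true, match_leading_dot: false)
loose  = pattern_flags(case_sensitive: false, match_leading_dot: true)

puts handled?('/images/photo.jpg', patterns, strict)
puts handled?('/images/PHOTO.JPG', patterns, strict)
puts handled?('/images/PHOTO.JPG', patterns, loose)
puts handled?('/images/.thumbs/photo.jpg', patterns, loose)
```

With File::FNM_PATHNAME set, a single * does not cross directory separators, so **/ is needed to match files in nested directories.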

In addition to specifying the patterns a path handler uses, one can also specify the place in the invocation list which the path handler should use. The invocation list is used from the front to the back when the Tree is created.
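The positioning rules can be sketched with a plain array. The helper insert_handler is hypothetical and only imitates the position handling of the register method documented below (:end or no value appends, :front prepends, anything else is treated as an index); the handler names are example strings.

```ruby
# Insert a handler name into the invocation list at the requested position.
def insert_handler(order, name, insert_at = nil)
  pos = case insert_at
        when nil, :end then -1
        when :front then 0
        else insert_at.to_i
        end
  order.delete(name)        # registering again moves the entry instead of duplicating it
  order.insert(pos, name)
end

order = []
insert_handler(order, 'directory')
insert_handler(order, 'page')
insert_handler(order, 'meta_info', :front)
insert_handler(order, 'template', 1)
p order
```

Because the list is processed front to back, a handler that other handlers depend on (for example one that creates directory nodes) should end up near the front.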

Implementing a path handler

A path handler must take the website as the only parameter on initialization and needs to define the following methods:

parse_meta_info!(path)

Update path.meta_info with meta information found in the content of the path. The return values of this method are given to the create_nodes method as additional parameters!

This allows one to use a single pass for reading the meta information and the normal content of the path.

create_nodes(path, …)

Create one or more nodes from the path and return them. If parse_meta_info! returns one or more values, these values are provided as additional parameters to this method.
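The hand-over between the two methods can be pictured with a toy handler that does not depend on webgen. ToySource, ToyHandler and build are hypothetical names, and the "key: value" header format is made up for this sketch; only the forwarding of the return value via a splat follows the create_nodes_with_path_handler source shown further down.

```ruby
# Toy stand-in for a path: a name, the raw data and a meta information hash.
ToySource = Struct.new(:name, :raw, :meta_info)

class ToyHandler
  # Read the raw data once: "key: value" lines before the first blank line
  # become meta information, everything after it is returned as content.
  def parse_meta_info!(path)
    header, body = path.raw.split("\n\n", 2)
    header.each_line do |line|
      key, value = line.chomp.split(': ', 2)
      path.meta_info[key] = value
    end
    body.to_s
  end

  # Receives the value returned by parse_meta_info! as an extra parameter,
  # so the raw data does not have to be read a second time.
  def create_nodes(path, body)
    {name: path.name, title: path.meta_info['title'], content: body}
  end
end

# Driver that forwards the return values of parse_meta_info! to create_nodes.
def build(handler, path)
  *data = handler.parse_meta_info!(path)
  handler.create_nodes(path, *data)
end

source = ToySource.new('/index.page', "title: Home\n\nHello world", {})
p build(ToyHandler.new, source)
```

If parse_meta_info! returned nil, the splat would yield an empty list, so a create_nodes method taking only the path keeps working.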

It is a good idea to use the helper method Webgen::PathHandler::Base#create_node for actually creating a node.

content(node)

Return the content of the given node. This method is only called for nodes that have been created by the path handler.

Also note that a path handler does not need to reside under the Webgen::PathHandler namespace but all built-in ones do so that auto-loading of the path handlers works.

The Webgen::PathHandler::Base module provides default implementations of the needed methods (except for create_nodes) and should be used by all path handlers! If a path handler processes paths in Webgen Page Format, it should probably also use Webgen::PathHandler::PageUtils.

Information that is used by a path handler only for processing purposes should be stored in the node_info hash of a node as the meta_info hash is reserved for user provided node meta information.

Following is a simple path handler class example which copies paths from the source to the destination and modifies the extension in the process:

class SimpleCopy

  include Webgen::PathHandler::Base

  def create_nodes(path)
    path.ext += '.copied'
    create_node(path)
  end

  def content(node)
    node.node_info[:path]
  end

end

website.ext.path_handler.register(SimpleCopy, patterns: ['**/*.jpg', '**/*.png'])

Attributes

current_dest_node[R]

The destination node if one is currently written (only during the invocation of write_tree) or nil otherwise.

Public Class Methods

new(website)

Create a new path handler object for the given website.

Calls superclass method Webgen::ExtensionManager::new
    # File lib/webgen/path_handler.rb
def initialize(website)
  super()
  @website = website
  @current_dest_node = nil
  @invocation_order = []
  @instances = {}
  @secondary_nodes = {}

  @website.blackboard.add_listener(:website_generated, 'path_handler') do
    @website.cache[:path_handler_secondary_nodes] = @secondary_nodes
  end

  used_secondary_paths = {}
  written_nodes = Set.new
  @website.blackboard.add_listener(:before_secondary_nodes_created, 'path_handler') do |path, source_alcn|
    (used_secondary_paths[source_alcn] ||= Set.new) << path if source_alcn
  end
  @website.blackboard.add_listener(:before_all_nodes_written, 'path_handler') do |node|
    used_secondary_paths = {}
    written_nodes = Set.new
  end
  @website.blackboard.add_listener(:after_node_written, 'path_handler') do |node|
    written_nodes << node.alcn
  end
  @website.blackboard.add_listener(:after_all_nodes_written, 'path_handler') do
    @secondary_nodes.delete_if do |path, data|
      if written_nodes.include?(data[1]) && (!used_secondary_paths[data[1]] ||
                                             !used_secondary_paths[data[1]].include?(path))
        data[2].each {|alcn| @website.tree.delete_node(@website.tree[alcn])}
        true
      end
    end
  end
end

Public Instance Methods

create_secondary_nodes(path, content = '', source_alcn = nil)

Create nodes for the given path (a Path object which must not be a source path).

The content of the path also needs to be specified. Note that if an IO block is associated with the path, it is discarded!

If the path has the 'handler' meta information set, nodes are only created with the specified handler.

If the secondary nodes are created during the rendering phase (and not during node creation, i.e. in a create_nodes method of a path handler), the source_alcn has to be set to the alcn of the node from which these nodes are created!
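The role of source_alcn becomes clearer with a toy model of the bookkeeping. ToySecondaryRegistry is a hypothetical class that does not use webgen; it only imitates two aspects visible in the method source below: a secondary path may not be reused by a different source node, and nodes that an earlier call created but the current call no longer produces are removed.

```ruby
require 'set'

class ToySecondaryRegistry
  attr_reader :tree

  def initialize
    @entries = {}      # path => [content, source_alcn, created_alcns]
    @tree = Set.new    # alcns of all toy nodes that currently exist
  end

  # Record the nodes (given as alcn strings) created from a secondary path.
  def create(path, content, source_alcn, new_alcns)
    if (entry = @entries[path]) && entry[1] != source_alcn
      raise ArgumentError, "Duplicate secondary path name <#{path}>"
    end
    _, _, stored = @entries.delete(path)
    # Drop nodes from the previous call that are not created anymore.
    (stored - new_alcns).each {|alcn| @tree.delete(alcn)} if stored
    @tree.merge(new_alcns)
    @entries[path] = [content, source_alcn, new_alcns]
    new_alcns
  end
end

registry = ToySecondaryRegistry.new
registry.create('/tags/', 'v1', '/blog.html', ['/tags/ruby.html', '/tags/web.html'])
registry.create('/tags/', 'v2', '/blog.html', ['/tags/ruby.html'])
p registry.tree.to_a
```

In this model the second call replaces the result of the first one because both come from the same source node; a call with a different source alcn for the same path would raise an error.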

    # File lib/webgen/path_handler.rb
def create_secondary_nodes(path, content = '', source_alcn = nil)
  if (sn = @secondary_nodes[path]) && sn[1] != source_alcn
    raise Webgen::NodeCreationError.new("Duplicate secondary path name <#{path}>", 'path_handler', path)
  end
  @website.blackboard.dispatch_msg(:before_secondary_nodes_created, path, source_alcn)

  path['modified_at'] ||= @website.tree[source_alcn]['modified_at'] if source_alcn
  path.set_io { StringIO.new(content) }

  nodes = if path['handler']
            @website.blackboard.dispatch_msg(:apply_meta_info_to_path, path)
            create_nodes_with_path_handler(path, path['handler'])
          else
            create_nodes([path])
          end
  @website.blackboard.dispatch_msg(:after_secondary_nodes_created, path, nodes)

  if source_alcn
    path.set_io(&nil)
    _, _, stored_alcns = @secondary_nodes.delete(path)
    cur_alcns = nodes.map {|n| n.alcn}
    (stored_alcns - cur_alcns).each {|n| @website.tree.delete_node(@website.tree[n])} if stored_alcns
    @secondary_nodes[path.dup] = [content, source_alcn, cur_alcns]
  end

  nodes
end

instance(handler)

Return the instance of the path handler class with the given name.

    # File lib/webgen/path_handler.rb
def instance(handler)
  @instances[handler.intern] ||= extension(handler).new(@website)
end

populate_tree()

Populate the website tree with nodes.

Can only be called once because the tree can only be populated once!

    # File lib/webgen/path_handler.rb
def populate_tree
  raise Webgen::NodeCreationError.new("Can't populate tree twice", 'path_handler') if @website.tree.root

  time = Benchmark.measure do
    meta_info, rest = @website.ext.source.paths.partition {|path| path.path =~ /[\/.]metainfo$/}

    used_paths = []

    @website.blackboard.add_listener(:before_node_created, 'path_handler (temp_populate_tree)') do |path|
      used_paths << path
    end
    create_nodes(meta_info, [:meta_info])
    create_nodes(rest)
    @website.blackboard.remove_listener(:before_node_created, 'path_handler (temp_populate_tree)')

    unused_paths = rest - used_paths
    @website.logger.vinfo do
      "The following source paths have not been used: #{unused_paths.join(', ')}"
    end if unused_paths.length > 0

    (@website.cache[:path_handler_secondary_nodes] || {}).each do |path, (content, source_alcn, _)|
      next if !@website.tree[source_alcn]
      create_secondary_nodes(path, content, source_alcn)
    end
  end
  @website.logger.vinfo do
    "Populating node tree took " << ('%2.2f' % time.real) << ' seconds'
  end

  @website.blackboard.dispatch_msg(:after_tree_populated)
end

register(klass, options={}, &block)

Register a path handler.

The parameter klass has to contain the name of the path handler class or the class object itself. If the class is located under this namespace, only the class name without the hierarchy part is needed, otherwise the full class name including parent module/class names is needed.

Options:

:name

The name for the path handler. If not set, it defaults to the snake-case version of the class name (without the hierarchy part). It should only contain letters.

:patterns

A list of path patterns for which the path handler should be used. If not specified, defaults to an empty list.

:insert_at

Specifies the position in the invocation list. If not specified or if :end is specified, the handler is added to the end of the list. If :front is specified, it is added to the beginning of the list. Otherwise the value is expected to be a position number and the path handler is added at the specified position in the list.

Examples:

path_handler.register('Template')     # registers Webgen::PathHandler::Template

path_handler.register('::Template')   # registers Template !!!

path_handler.register('MyModule::Doit', name: 'template', patterns: ['**/*.template'])
    # File lib/webgen/path_handler.rb
def register(klass, options={}, &block)
  name = do_register(klass, options, false, &block)
  ext_data(name).patterns = options[:patterns] || []
  pos = if options[:insert_at].nil? || options[:insert_at] == :end
          -1
        elsif options[:insert_at] == :front
          0
        else
          options[:insert_at].to_i
        end
  @invocation_order.delete(name)
  @invocation_order.insert(pos, name)
end

write_tree()

Write all changed nodes of the website tree to their respective destination using the Destination object at website.ext.destination.

Returns the number of passes needed for correctly writing out all paths.
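Why more than one pass can be needed is easiest to see in a toy model. The sketch below does not use webgen: write_all, the changed hash and the triggers hash are hypothetical, and "writing" a node simply marks its dependent nodes as changed again. The loop structure (repeat until a pass writes nothing, count only passes that wrote something) follows the method source below.

```ruby
# changed:  node name => true if the node needs to be written
# triggers: node name => names of nodes that become changed when it is written
def write_all(changed, triggers)
  passes = 0
  written = []
  begin
    at_least_one_node_written = false
    changed.keys.sort.each do |name|
      next unless changed[name]
      changed[name] = false
      written << name
      at_least_one_node_written = true
      # Writing this node invalidates the nodes that depend on it.
      (triggers[name] || []).each {|dep| changed[dep] = true}
    end
    passes += 1 if at_least_one_node_written
  end while at_least_one_node_written
  [passes, written]
end

changed  = {'/a.html' => false, '/b.html' => true, '/c.html' => false}
triggers = {'/b.html' => ['/a.html', '/c.html']}
passes, written = write_all(changed, triggers)
puts "passes: #{passes}, written: #{written.join(', ')}"
```

In this model /a.html is only marked as changed after it was already visited in the first pass, so a second pass is required. Note that the toy loop would not terminate for cyclic triggers.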

    # File lib/webgen/path_handler.rb
def write_tree
  passes = 0
  content = nil

  begin
    at_least_one_node_written = false
    @website.cache.reset_volatile_cache
    @website.blackboard.dispatch_msg(:before_all_nodes_written)
    @website.tree.node_access[:alcn].sort_by {|a, n| [n['write_order'].to_s, a]}.each do |name, node|
      begin
        next if node == @website.tree.dummy_root ||
          (node['passive'] && !node['no_output'] && !@website.ext.item_tracker.node_referenced?(node)) ||
          ((@website.config['website.dry_run'] || node['no_output'] || @website.ext.destination.exists?(node.dest_path)) &&
           !@website.ext.item_tracker.node_changed?(node))

        @website.blackboard.dispatch_msg(:before_node_written, node)
        if !node['no_output']
          content = write_node(node)
          at_least_one_node_written = true
        end
        @website.blackboard.dispatch_msg(:after_node_written, node, content)
      rescue Webgen::Error => e
        e.path = node.alcn if e.path.to_s.empty?
        e.location = "path_handler.#{name_of_instance(node.node_info[:path_handler])}" unless e.location
        raise
      rescue Exception => e
        raise Webgen::RenderError.new(e, "path_handler.#{name_of_instance(node.node_info[:path_handler])}", node)
      end
    end
    @website.blackboard.dispatch_msg(:after_all_nodes_written)
    passes += 1 if at_least_one_node_written
  end while at_least_one_node_written

  @website.blackboard.dispatch_msg(:website_generated)
  passes
end

Private Instance Methods

create_nodes(paths, handlers = @invocation_order)

Use the registered path handlers to create nodes which are all returned.

    # File lib/webgen/path_handler.rb
def create_nodes(paths, handlers = @invocation_order)
  nodes = []
  paths.each {|path| @website.blackboard.dispatch_msg(:apply_meta_info_to_path, path)}
  handlers.each do |name|
    paths_for_handler(name.to_s, paths).each do |path|
      nodes += create_nodes_with_path_handler(path, name)
    end
  end
  nodes
end

create_nodes_with_path_handler(path, handler) { |path| ... }

Prepare everything to create nodes from the path using the given handler. After the nodes are created, it is checked if they have all needed properties.

Returns an array with all created nodes.

    # File lib/webgen/path_handler.rb
def create_nodes_with_path_handler(path, handler) #:yields: path
  *data = instance(handler).parse_meta_info!(path)

  (path.meta_info.delete('versions') || {'default' => {}}).map do |name, mi|
    vpath = path.dup
    (mi ||= {})['version'] ||= name
    vpath.meta_info.merge!(mi)
    @website.logger.debug do
      "Creating node version '#{vpath['version']}' from path <#{vpath}> with #{handler} handler"
    end
    @website.blackboard.dispatch_msg(:before_node_created, vpath)
    instance(handler).create_nodes(vpath, *data)
  end.flatten.compact.each do |node|
    @website.blackboard.dispatch_msg(:after_node_created, node)
  end
rescue Webgen::Error => e
  e.path = path.to_s if e.path.to_s.empty?
  e.location = "path_handler.#{handler}" unless e.location
  raise
rescue Exception => e
  raise Webgen::NodeCreationError.new(e, "path_handler.#{handler}", path)
end

name_of_instance(instance)

Return the name of the given path handler instance.

    # File lib/webgen/path_handler.rb
def name_of_instance(instance)
  @instances.rassoc(instance)[0] || 'unknown'
end

paths_for_handler(name, paths)

Return the paths which are handled by the path handler name (where name is a String).

    # File lib/webgen/path_handler.rb
def paths_for_handler(name, paths)
  patterns = ext_data(name).patterns || []

  options = (@website.config['path_handler.patterns.case_sensitive'] ? 0 : File::FNM_CASEFOLD) |
    (@website.config['path_handler.patterns.match_leading_dot'] ? File::FNM_DOTMATCH : 0) |
    File::FNM_PATHNAME

  paths.select do |path|
    path.meta_info['handler'] == name ||
      patterns.any? {|pat| Webgen::Path.matches_pattern?(path, pat, options)}
  end.sort {|a,b| a.path <=> b.path}
end

write_node(node)

Write the given node to the destination.

    # File lib/webgen/path_handler.rb
def write_node(node)
  @current_dest_node = node
  @website.logger.info do
    "[#{(@website.ext.destination.exists?(node.dest_path) ? 'update' : 'create')}] <#{node.dest_path}>"
  end
  content = nil
  time = Benchmark.measure { content = node.content }
  @website.ext.destination.write(node.dest_path, content)
  @website.logger.vinfo do
    "[timing] <#{node.dest_path}> rendered in " << ('%2.2f' % time.real) << ' seconds'
  end
  content
ensure
  @current_dest_node = nil
end