module Webgen::ContentProcessor::Fragments

Uses the HTML headers h1, h2, …, h6 (only those with valid IDs) to generate nested fragment nodes.

Constants

HTML_ATTR_REGEXP
HTML_HEADER_REGEXP

Public Class Methods

call(context)

Create the nested fragment nodes under the content node from the content, but only if no block is associated with the context or the block is named content.

   # File lib/webgen/content_processor/fragments.rb
   def self.call(context)
     if !context[:block_name] || context[:block_name] == 'content'
       sections = parse_html_headers(context.content)
       create_fragment_nodes(context, sections, context.content_node)
     end
     context
   end

create_fragment_nodes(context, sections, parent, si = 1000)

Create nested fragment nodes under parent from sections (which can be created using parse_html_headers).

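The sort_info meta information of each node is derived from the base value si: si is incremented for every section at the current level, and the sub-sections of a section are numbered starting from the value after that section's sort_info.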

   # File lib/webgen/content_processor/fragments.rb
   def self.create_fragment_nodes(context, sections, parent, si = 1000)
     sections.each do |level, id, title, sub_sections|
       # Fragment path: the parent's alcn without any fragment part, plus '#' and the header id
       path = Webgen::Path.new(parent.alcn.sub(/#.*$/, '') + '#' + id)
       path['parent_alcn'] = parent.alcn
       path['handler'] = 'copy'
       path['pipeline'] = []
       path['no_output'] = true
       path['title'] = title
       path['sort_info'] = si = si.succ
       node = context.website.ext.path_handler.create_secondary_nodes(path, '', context.content_node.alcn).first

       # Sub-sections become children of the node that was just created
       create_fragment_nodes(context, sub_sections, node, si.succ)
     end
   end
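
The numbering is easier to follow in isolation. The following is a minimal sketch that mirrors only the sort_info arithmetic of create_fragment_nodes and creates no webgen nodes; sort_info_for and the sample sections are hypothetical and are not part of webgen.

```ruby
# Sketch only: reproduces the sort_info arithmetic, not the node creation.
# `sort_info_for` is a hypothetical helper, not a webgen method.
def sort_info_for(sections, si = 1000, result = [])
  sections.each do |_level, id, _title, sub_sections|
    si = si.succ                                  # next value for this sibling
    result << [id, si]
    sort_info_for(sub_sections, si.succ, result)  # children use si + 1 as their base
  end
  result
end

# Sections in the [level, id, title, sub_sections] form described for parse_html_headers
sections = [
  [1, 'intro', 'Intro', [
    [2, 'goals', 'Goals', []],
    [2, 'scope', 'Scope', []]
  ]],
  [1, 'usage', 'Usage', []]
]

numbering = sort_info_for(sections)
numbering.each { |id, si| puts "#{id}: #{si}" }
```

Because si is a local integer, the increments made while numbering sub-sections do not carry back to the parent level. In this sketch the values increase among siblings but are not unique across the whole tree.
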

parse_html_headers(content)

Parse the string content for headers h1, …, h6 and return the found, nested sections.

Only headers that have an id attribute are used. The method returns a list of arrays of the form [level, id, title, sub_sections], where sub_sections is itself such a list.

   # File lib/webgen/content_processor/fragments.rb
   def self.parse_html_headers(content)
     sections = []
     stack = []
     content.scan(HTML_HEADER_REGEXP).each do |level,attrs,title|
       # Skip headers that have no attributes or no id attribute
       next if attrs.nil?
       id_attr = attrs.scan(HTML_ATTR_REGEXP).find {|name,sep,value| name == 'id'}
       next if id_attr.nil?
       id = id_attr[2]

       section = [level.to_i, id, title, []]
       # Pop the stack until a header with a lower level number (the parent) is on top,
       # then attach the section to it; with an empty stack the section is top-level
       success = false
       while !success
         if stack.empty?
           sections << section
           stack << section
           success = true
         elsif stack.last.first < section.first
           stack.last.last << section
           stack << section
           success = true
         else
           stack.pop
         end
       end
     end
     sections
   end
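
To illustrate the shape of the result, here is a standalone sketch of the same stack-based nesting. The values of HTML_HEADER_REGEXP and HTML_ATTR_REGEXP are not shown on this page, so HEADER_RE and ATTR_RE below are simplified stand-ins assumed for illustration. nest_headers is a hypothetical name, and the loop is condensed but intended to behave like the one above.

```ruby
# Stand-in patterns (assumptions): the real webgen constants may differ.
HEADER_RE = /<h([1-6])(?:\s([^>]*))?>(.*?)<\/h\1\s*>/im
ATTR_RE   = /(\w+)\s*=\s*('|")(.*?)\2/

# Hypothetical helper mirroring the nesting logic of parse_html_headers.
def nest_headers(content)
  sections = []
  stack = []
  content.scan(HEADER_RE).each do |level, attrs, title|
    next if attrs.nil?                       # header without attributes
    id_attr = attrs.scan(ATTR_RE).find { |name, _sep, _value| name == 'id' }
    next if id_attr.nil?                     # header without an id
    section = [level.to_i, id_attr[2], title, []]
    # Drop stack entries that cannot be the parent of this section
    stack.pop until stack.empty? || stack.last.first < section.first
    (stack.empty? ? sections : stack.last.last) << section
    stack << section
  end
  sections
end

html = <<~HTML
  <h1 id="intro">Intro</h1>
  <h2 id="goals">Goals</h2>
  <h2>Skipped, no id</h2>
  <h3 id="detail">Detail</h3>
  <h1 id="usage">Usage</h1>
HTML

tree = nest_headers(html)
p tree
```

In this sketch the h2 without an id is ignored entirely, so the following h3 is nested under the preceding h2 that does have an id.
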