module RequestLogAnalyzer::FileFormat::CommonRegularExpressions

This module contains methods that construct regular expressions for commonly used log fragments, such as IP addresses and timestamps.

To use these regular expression constructors, extend your file format with this module (or, in the unlikely case that you need them as instance methods, include it).
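The difference between extending and including can be sketched as follows. This is a minimal illustration, not the library's own code: `RegexHelpersSketch`, `ExampleFormat`, and `ExampleParser` are hypothetical names, and the stand-in module copies only `anchored` so the snippet runs without the gem installed.

```ruby
# Hypothetical stand-in for the real module, so the sketch runs on its own.
module RegexHelpersSketch
  def anchored(regexp)
    /^#{regexp}$/
  end
end

# `extend` adds the helpers as class-level methods, so they can be
# called while the class body is being evaluated.
class ExampleFormat
  extend RegexHelpersSketch

  STATUS_LINE = anchored(/status=\d{3}/)
end

# `include` adds the helpers as instance methods instead.
class ExampleParser
  include RegexHelpersSketch

  def status_line
    anchored(/status=\d{3}/)
  end
end

puts ExampleFormat::STATUS_LINE.match?("status=200")
puts ExampleParser.new.status_line.match?("status=404")
```

With `extend`, the helpers are available where a format is typically declared (at class level); with `include`, an instance is needed first.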

Constants

TIMESTAMP_PARTS

Public Instance Methods

anchored(regexp)
    # File lib/request_log_analyzer/file_format.rb
    # Wraps the given regexp in line anchors (^ and $).
    def anchored(regexp)
      /^#{regexp}$/
    end
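A short sketch of how the anchors behave may help here. It copies `anchored` as a top-level method purely for illustration; `DIGITS_LINE` is a hypothetical name. Note that in Ruby `^` and `$` anchor to line boundaries, not to the start and end of the whole string.

```ruby
# Copy of the method above, defined at top level for illustration only.
def anchored(regexp)
  /^#{regexp}$/
end

DIGITS_LINE = anchored(/\d+/)

puts DIGITS_LINE.match?("123")       # the whole line consists of digits
puts DIGITS_LINE.match?("abc123")    # the line has a non-digit prefix
puts DIGITS_LINE.match?("abc\n123")  # the second line consists of digits
```

The last call matches because the second line of the input satisfies the pattern on its own. If whole-string anchoring is needed, `\A` and `\z` would be the usual alternative.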

hostname(blank = false)

Creates a regular expression to match a hostname

    # File lib/request_log_analyzer/file_format.rb
    def hostname(blank = false)
      regexp = /(?:(?:[a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*(?:[A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])/
      add_blank_option(regexp, blank)
    end
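To see which strings the hostname pattern accepts, the sketch below copies the regular expression from the listing and wraps it in `\A` and `\z` so that whole strings are tested. `HOSTNAME_SKETCH` and `FULL_HOSTNAME` are hypothetical names, and the blank option is left out.

```ruby
# The pattern from the listing above, copied for illustration.
HOSTNAME_SKETCH = /(?:(?:[a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*(?:[A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])/

# Anchor to the whole string so partial matches do not count.
FULL_HOSTNAME = /\A#{HOSTNAME_SKETCH}\z/

puts FULL_HOSTNAME.match?("www.example.com")  # dot-separated labels
puts FULL_HOSTNAME.match?("localhost")        # a single label
puts FULL_HOSTNAME.match?("1example.com")     # label starts with a digit
puts FULL_HOSTNAME.match?("bad-")             # label ends with a hyphen
```

Each label has to start with a letter and may not end with a hyphen, so the last two inputs are rejected when the pattern is anchored like this.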

hostname_or_ip_address(blank = false)

Creates a regular expression to match a hostname or an IP address.

    # File lib/request_log_analyzer/file_format.rb
    def hostname_or_ip_address(blank = false)
      regexp = Regexp.union(hostname, ip_address)
      add_blank_option(regexp, blank)
    end
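The combination relies on `Regexp.union`, which builds an alternation of its arguments. The sketch below uses deliberately simplified stand-in patterns (`HOST_STANDIN` and `IPV4_STANDIN` are hypothetical and much looser than the real ones) to show the idea.

```ruby
# Simplified, hypothetical stand-ins for the hostname and IP patterns.
HOST_STANDIN = /[a-z]+(?:\.[a-z]+)*/
IPV4_STANDIN = /\d{1,3}(?:\.\d{1,3}){3}/

# Regexp.union returns one pattern that matches either alternative.
HOST_OR_IP = Regexp.union(HOST_STANDIN, IPV4_STANDIN)

# Interpolating a Regexp wraps it in a group, so the anchors apply
# to the whole alternation.
FULL_HOST_OR_IP = /\A#{HOST_OR_IP}\z/

puts FULL_HOST_OR_IP.match?("example.com")
puts FULL_HOST_OR_IP.match?("10.0.0.1")
puts FULL_HOST_OR_IP.match?("10.0.0")
```

Alternatives are tried left to right, so the hostname pattern is attempted before the IP address pattern.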

ip_address(blank = false)

Construct a regular expression to parse IPv4 and IPv6 addresses.

Allows blank values if the blank option is given: pass true to also match an empty string, or pass a string to also match that string as a substitute for the nil value.

    # File lib/request_log_analyzer/file_format.rb
    def ip_address(blank = false)
      # IP address regexp copied from Resolv::IPv4 and Resolv::IPv6,
      # but adjusted to work for the purpose of request-log-analyzer.
      ipv4_regexp                     = /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/
      ipv6_regex_8_hex                = /(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}/
      ipv6_regex_compressed_hex       = /(?:(?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?)::(?:(?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?)/
      ipv6_regex_6_hex_4_dec          = /(?:(?:[0-9A-Fa-f]{1,4}:){6})#{ipv4_regexp}/
      ipv6_regex_compressed_hex_4_dec = /(?:(?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?)::(?:(?:[0-9A-Fa-f]{1,4}:)*)#{ipv4_regexp}/
      ipv6_regexp                     = Regexp.union(ipv6_regex_8_hex, ipv6_regex_compressed_hex, ipv6_regex_6_hex_4_dec, ipv6_regex_compressed_hex_4_dec)

      add_blank_option(Regexp.union(ipv4_regexp, ipv6_regexp), blank)
    end
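The following sketch exercises the address patterns from the listing. `ip_address_sketch` and `FULL_IP` are hypothetical names; the method copies the regular expressions but leaves out the blank option, and the result is anchored with `\A` and `\z` to test whole strings.

```ruby
# Copy of the patterns above without the blank option (illustration only).
def ip_address_sketch
  ipv4 = /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/
  hex  = /[0-9A-Fa-f]{1,4}/
  ipv6_8_hex            = /(?:#{hex}:){7}#{hex}/
  ipv6_compressed       = /(?:(?:#{hex}(?::#{hex})*)?)::(?:(?:#{hex}(?::#{hex})*)?)/
  ipv6_6_hex_4_dec      = /(?:(?:#{hex}:){6})#{ipv4}/
  ipv6_compressed_4_dec = /(?:(?:#{hex}(?::#{hex})*)?)::(?:(?:#{hex}:)*)#{ipv4}/
  ipv6 = Regexp.union(ipv6_8_hex, ipv6_compressed, ipv6_6_hex_4_dec, ipv6_compressed_4_dec)
  Regexp.union(ipv4, ipv6)
end

FULL_IP = /\A#{ip_address_sketch}\z/

puts FULL_IP.match?("192.168.0.1")             # dotted IPv4
puts FULL_IP.match?("999.1.1.1")               # octet ranges are not checked
puts FULL_IP.match?("::1")                     # compressed IPv6
puts FULL_IP.match?("2001:db8::ff00:42:8329")  # compressed IPv6
puts FULL_IP.match?("::ffff:192.0.2.1")        # IPv6 with an embedded IPv4 part
puts FULL_IP.match?("1.2.3")                   # too few octets
```

The IPv4 part only checks the shape of the address (one to three digits per octet), so out-of-range values such as `999.1.1.1` are accepted.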

timestamp(format_string, blank = false)

Creates a regular expression for a timestamp generated by a strftime call. Provide the format string to construct a matching regular expression; a directive that is not listed in TIMESTAMP_PARTS raises an error. Set blank to true to allow an empty string, or set blank to a string to use it as a substitute for the nil value.

    # File lib/request_log_analyzer/file_format.rb
    def timestamp(format_string, blank = false)
      regexp = ''
      format_string.scan(/([^%]*)(?:%([A-Za-z%]))?/) do |literal, variable|
        regexp << Regexp.quote(literal)
        if variable
          if TIMESTAMP_PARTS.key?(variable)
            regexp << TIMESTAMP_PARTS[variable]
          else
            fail "Unknown variable: %#{variable}"
          end
        end
      end

      add_blank_option(Regexp.new(regexp), blank)
    end
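The translation from a strftime format to a regular expression can be sketched as below. The contents of TIMESTAMP_PARTS are not shown in this section, so `PARTS_SKETCH` is a hypothetical subset chosen for illustration; `timestamp_sketch` and `LOG_TIME` are likewise hypothetical names, and the blank option is omitted.

```ruby
# Hypothetical subset of directive patterns; the real TIMESTAMP_PARTS
# constant is not shown here and may differ.
PARTS_SKETCH = {
  'Y' => '\d{4}', 'm' => '\d{2}', 'd' => '\d{2}',
  'H' => '\d{2}', 'M' => '\d{2}', 'S' => '\d{2}',
  '%' => '%'
}.freeze

def timestamp_sketch(format_string)
  regexp = +''
  # Each scan result is a run of literal text plus an optional directive letter.
  format_string.scan(/([^%]*)(?:%([A-Za-z%]))?/) do |literal, variable|
    regexp << Regexp.quote(literal)
    next unless variable
    raise "Unknown variable: %#{variable}" unless PARTS_SKETCH.key?(variable)
    regexp << PARTS_SKETCH[variable]
  end
  Regexp.new(regexp)
end

LOG_TIME = timestamp_sketch('%Y-%m-%d %H:%M:%S')

puts LOG_TIME.match?('2024-01-31 23:59:59')
puts LOG_TIME.match?('2024-1-31 23:59:59')
```

Literal text between directives is escaped with `Regexp.quote`, so characters such as `.` or `[` in a format string are matched literally rather than interpreted as regular expression syntax.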

Protected Instance Methods

add_blank_option(regexp, blank)

Allows the field to be blank if this option is given. Pass true to also match an empty string, or pass a string to also match that string as an alternative for the nil value. Any other value returns the regular expression unchanged.

    # File lib/request_log_analyzer/file_format.rb
184 def add_blank_option(regexp, blank)
185   case blank
186     when String then Regexp.union(regexp, Regexp.new(Regexp.quote(blank)))
187     when true then   Regexp.union(regexp, //)
188     else regexp
189   end
190 end