class RequestLogAnalyzer::FileFormat::Apache

The Apache file format is able to log Apache access.log files.

The access.log can be configured in Apache to have many different formats. In theory, this FileFormat can handle any format, but it must be aware of the log formatting that is used by sending the formatting string as parameter to the create method, e.g.:

RequestLogAnalyzer::FileFormat::Apache.create('%h %l %u %t "%r" %>s %b')

It also supports the predefined Apache log formats “common” and “combined”. The line definition and the report definition will be constructed using this file format string. From the command line, you can provide the format string using the --apache-format command line option.

Constants

APACHE_TIMESTAMP

I have encountered two timestamp types, with timezone and without. Parse both.

LOG_DIRECTIVES

A hash that defines how the log format directives should be parsed.

LOG_FORMAT_DEFAULTS

A hash of predefined Apache log formats

Public Class Methods

access_line_definition(format_string) click to toggle source

Creates the access log line definition based on the Apache log format string

   # File lib/request_log_analyzer/file_format/apache.rb
66 def self.access_line_definition(format_string)
67   format_string ||= :common
68   format_string   = LOG_FORMAT_DEFAULTS[format_string.to_sym] || format_string
69 
70   line_regexp = ''
71   captures    = []
72   format_string.scan(/([^%]*)(?:%(?:\{([^\}]+)\})?>?([A-Za-z%]))?/) do |literal, arg, variable|
73 
74     line_regexp << Regexp.quote(literal) # Make sure to parse the literal before the directive
75 
76     if variable
77       # Check if we recognize the log directive
78       directive = LOG_DIRECTIVES[variable][arg] rescue nil
79 
80       if directive
81         line_regexp << directive[:regexp]   # Parse the value of the directive
82         captures    += directive[:captures] # Add the directive's information to the captures
83       else
84         puts "Apache log directive %#{arg}#{variable} is not yet supported by RLA, the field will be ignored."
85         line_regexp << '.*' # Just accept any input for this literal
86       end
87     end
88   end
89 
90   # Return a new line definition object
91   RequestLogAnalyzer::LineDefinition.new(:access, regexp: Regexp.new(line_regexp),
92                                     captures: captures, header: true, footer: true)
93 end
create(*args) click to toggle source

Creates the Apache log format language based on a Apache log format string. It will set up the line definition and the report trackers according to the Apache access log format, which should be passed as first argument. By default, is uses the ‘combined’ log format.

   # File lib/request_log_analyzer/file_format/apache.rb
59 def self.create(*args)
60   access_line = access_line_definition(args.first)
61   trackers = report_trackers(access_line) + report_definer.trackers
62   new(line_definer.line_definitions.merge(access: access_line), trackers)
63 end
report_trackers(line_definition) click to toggle source

Sets up the report trackers according to the fields captured by the access line definition.

    # File lib/request_log_analyzer/file_format/apache.rb
 96 def self.report_trackers(line_definition)
 97   analyze = RequestLogAnalyzer::Aggregator::Summarizer::Definer.new
 98 
 99   analyze.timespan      if line_definition.captures?(:timestamp)
100   analyze.hourly_spread if line_definition.captures?(:timestamp)
101 
102   analyze.frequency category: :http_method, title: 'HTTP methods'  if line_definition.captures?(:http_method)
103   analyze.frequency category: :http_status, title: 'HTTP statuses' if line_definition.captures?(:http_status)
104   analyze.frequency category: lambda { |r| r.category }, title: 'Most popular URIs'    if line_definition.captures?(:path)
105 
106   analyze.frequency category: :user_agent, title: 'User agents'    if line_definition.captures?(:user_agent)
107   analyze.frequency category: :referer,    title: 'Referers'       if line_definition.captures?(:referer)
108 
109   analyze.duration duration: :duration,  category: lambda { |r| r.category }, title: 'Request duration' if line_definition.captures?(:duration)
110   analyze.traffic traffic: :bytes_sent, category: lambda { |r| r.category }, title: 'Traffic'          if line_definition.captures?(:bytes_sent)
111 
112   analyze.trackers
113 end