module Failbot

Failbot asynchronously takes exceptions and reports them to the exception logger du jour. This keeps the main app from failing or lagging if the exception logger service is down or slow.

This file exists so that the unhandled exception hook may easily be injected into programs that don’t register it themselves. It also provides a lightweight failbot interface that doesn’t bring in any other libraries until a report is made, which is useful for environments where boot time is important.

To use, set RUBYOPT or pass an -r argument to ruby:

RUBYOPT=rfailbot/exit_hook some-program.rb

Or:

ruby -rfailbot/exit_hook some-program.rb

Your program can also require this library instead of ‘failbot’ to minimize the amount of up-front processing required and automatically install the exit hook.

require 'failbot/exit_hook'

The ‘failbot’ lib is loaded in full the first time an actual report is made.

Constants

DEFAULT_ROLLUP

Default rollup for an exception. Exceptions with the same rollup are grouped together in Haystack. The rollup is an MD5 hash of the exception class and the raising file, line, and method.
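
A minimal sketch of a rollup of this kind. The helper name `sketch_rollup` and the `|` separator are assumptions made for illustration; the exact string that Failbot hashes is not shown in this document.

```ruby
require 'digest/md5'

# Hypothetical helper: hash the exception class together with the first
# backtrace frame, which carries the raising file, line, and method.
# The "|" separator is an assumption.
def sketch_rollup(exception)
  first_frame = Array(exception.backtrace).first.to_s
  Digest::MD5.hexdigest([exception.class.name, first_frame].join("|"))
end

def risky(message)
  raise ArgumentError, message
end

errors = ["first", "second"].map do |message|
  begin
    risky(message)
  rescue => e
    e
  end
end

# Both exceptions were raised from the same line of the same method, so they
# share a rollup even though their messages differ.
same_rollup = sketch_rollup(errors[0]) == sketch_rollup(errors[1])
```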

EXCEPTION_DETAIL
EXCEPTION_FORMATS

Enumerates the exception formats this gem supports. The original format, :haystack, is the default; the newer format is :structured.

MAXIMUM_CAUSE_DEPTH

We’ll include this many nested Exception#cause objects in the needle context. We limit the number of objects to prevent excessive recursion and large needle contexts.

VERSION

Attributes

backtrace_parser[R]
instrumenter[RW]

Public: Set an instrumenter to be called when exceptions are reported.

class CustomInstrumenter
  def instrument(name, payload = {})
    warn "Exception: #{payload["class"]}\n#{payload.inspect}"
  end
end

# `instrument` is an instance method, so assign an instance, not the class.
Failbot.instrumenter = CustomInstrumenter.new

The instrumenter must conform to the `ActiveSupport::Notifications` interface, which defines `#instrument` and accepts:

name    - the String name of the event (e.g. "report.failbot")
payload - a Hash of the exception context.
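
A sketch of an object that satisfies this interface. `RecordingInstrumenter` is a hypothetical name, and the payload below is made up for illustration; the call is simulated by hand rather than triggered by Failbot.

```ruby
# Hypothetical instrumenter that records events in memory.
class RecordingInstrumenter
  attr_reader :events

  def initialize
    @events = []
  end

  # Same shape as ActiveSupport::Notifications.instrument(name, payload).
  def instrument(name, payload = {})
    @events << [name, payload]
  end
end

instrumenter = RecordingInstrumenter.new

# With Failbot loaded, the object would be registered with:
#   Failbot.instrumenter = instrumenter
# Here the call is simulated with an illustrative payload.
instrumenter.instrument("report.failbot", "class" => "RuntimeError", "report_status" => "success")
```

An in-memory recorder like this can be handy in tests, where the recorded events are inspected after the code under test runs.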

source_root[R]

Public Class Methods

backtrace_parser=(callable)

Set a callable that is responsible for parsing and formatting Ruby backtraces. Setting this is only necessary if your app deals with exceptions that are manipulated to contain something other than actual stack frame strings in the format produced by `caller`. The argument must respond to `call` with an arity of 1. The callable is passed Exception instances as its argument.

# File lib/failbot.rb, line 100
def self.backtrace_parser=(callable)
  unless callable.respond_to?(:call)
    raise ArgumentError, "backtrace_parser= passed #{callable.inspect}, which is not callable"
  end
  if callable.method(:call).arity != 1
    raise ArgumentError, "backtrace_parser= passed #{callable.inspect}, whose `#call` has arity =! 1"
  end
  @backtrace_parser = callable
end
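
A sketch of a callable that passes both checks in the setter. `PassthroughBacktraceParser` is a hypothetical name, and what Failbot expects the callable to return is not covered here, so this one simply returns the raw backtrace lines.

```ruby
# Hypothetical parser: a module whose `call` takes exactly one argument.
module PassthroughBacktraceParser
  def self.call(exception)
    Array(exception.backtrace)
  end
end

# The same two checks the setter performs.
callable = PassthroughBacktraceParser
responds = callable.respond_to?(:call)
arity    = callable.method(:call).arity

# With Failbot loaded, the parser would be registered with:
#   Failbot.backtrace_parser = PassthroughBacktraceParser
```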
exception_classname_from_hash(hash)
# File lib/failbot.rb, line 120
def self.exception_classname_from_hash(hash)
  @exception_formatter.exception_classname_from_hash(hash)
end
exception_format=(identifier)

Set the current exception format.

# File lib/failbot.rb, line 84
def self.exception_format=(identifier)
  @exception_formatter = EXCEPTION_FORMATS.fetch(identifier) do
    fail ArgumentError, "#{identifier} is not an available exception_format (want one of #{EXCEPTION_FORMATS.keys})"
  end
end
exception_message_from_hash(hash)

Helpers needed to parse hashes included in e.g. Failbot.reports.

# File lib/failbot.rb, line 116
def self.exception_message_from_hash(hash)
  @exception_formatter.exception_message_from_hash(hash)
end

Public Instance Methods

already_reporting()
# File lib/failbot.rb, line 502
def already_reporting
  @thread_local_already_reporting.value
end
already_reporting=(bool)
# File lib/failbot.rb, line 498
def already_reporting=(bool)
  @thread_local_already_reporting.value = bool
end
before_report(&block)

Public: your last chance to modify the context that is to be reported with an exception.

The key value pairs that are returned from your block will get squashed into the context, replacing the values of any keys that were already present.

Example:

Failbot.before_report do |exception, context|
  # context is { "a" => 1, "b" => 2 }
  { :a => 0, :c => 3 }
end

context gets reported as { "a" => 0, "b" => 2, "c" => 3 }

# File lib/failbot.rb, line 274
def before_report(&block)
  @before_report = block
end
clear_before_report()

Clears the before_report callback. Intended for tests.

# File lib/failbot.rb, line 279
def clear_before_report
  @before_report = nil
end
context()

Stack of context information to include in the next failbot report. These hashes are condensed down into one and included in the next report. Don’t mess with this structure directly - use the push and pop methods.

# File lib/failbot.rb, line 209
def context
  @thread_local_context.value
end
disable(&block)

Public: Disable exception reporting. This is equivalent to calling `Failbot.setup("FAILBOT_REPORT" => "0")`, but can be called after setup.

Failbot.disable do
  something_that_might_go_kaboom
end

block - an optional block to perform while reporting is disabled. If a block is passed, reporting is restored to its previous state after the block is called.

# File lib/failbot.rb, line 373
def disable(&block)
  original_report_errors = @thread_local_report_errors.value
  @thread_local_report_errors.value = false

  if block
    begin
      block.call
    ensure
      @thread_local_report_errors.value = original_report_errors
    end
  end
end
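
A sketch of the save-and-restore behaviour in the listing above, using a hypothetical `ReportingSwitch` class in place of Failbot's thread-local flag. It shows that the flag is restored even when the block raises, and that a call without a block leaves reporting off.

```ruby
# Hypothetical stand-in for the reporting flag; not Failbot's implementation.
class ReportingSwitch
  attr_reader :enabled

  def initialize
    @enabled = true
  end

  def disable
    original = @enabled
    @enabled = false
    return unless block_given?

    begin
      yield
    ensure
      # Restore the previous value, even if the block raised.
      @enabled = original
    end
  end
end

switch = ReportingSwitch.new
inside = nil
begin
  switch.disable do
    inside = switch.enabled
    raise "kaboom"
  end
rescue RuntimeError
  # The error still propagates; only the flag is restored.
end
after_block = switch.enabled

switch.disable
after_bare_call = switch.enabled
```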
enable()

Public: Enable exception reporting. Reporting is enabled by default, but this can be called if it was explicitly disabled by calling `Failbot.disable` or by setting `"FAILBOT_REPORT" => "0"` in `Failbot.setup`.

# File lib/failbot.rb, line 389
def enable
  @thread_local_report_errors.value = true
end
exception_info(e)

Extract exception info into a simple Hash.

e - The exception object to turn into a Hash.

Returns a Hash.

# File lib/failbot.rb, line 458
def exception_info(e)
  res = @exception_formatter.call(e)

  if exception_context = (e.respond_to?(:failbot_context) && e.failbot_context)
    res.merge!(exception_context)
  end

  if original = (e.respond_to?(:original_exception) && e.original_exception)
    remote_backtrace  = []
    remote_backtrace << original.message
    if original.backtrace
      remote_backtrace.concat(Array(original.backtrace)[0,500])
    end
    res['remote_backtrace'] = remote_backtrace.join("\n")
  end

  res
end
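
A sketch of how an exception class can contribute extra context through `failbot_context`, the method the listing above checks for. `PaymentError` and the `info` hash are hypothetical; the real keys produced by the formatter depend on the exception format.

```ruby
# Hypothetical exception class that carries extra report context.
class PaymentError < StandardError
  def initialize(message, order_id)
    super(message)
    @order_id = order_id
  end

  # Discovered via respond_to?(:failbot_context) and merged into the report.
  def failbot_context
    { "order_id" => @order_id }
  end
end

error = PaymentError.new("card declined", 42)

# Stand-in for the formatter output (assumed keys, for illustration only).
info = { "class" => error.class.name, "message" => error.message }
if (extra = error.respond_to?(:failbot_context) && error.failbot_context)
  info.merge!(extra)
end
```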
hostname()
# File lib/failbot.rb, line 494
def hostname
  @hostname ||= Socket.gethostname
end
install_unhandled_exception_hook!()

Installs an at_exit hook to report exceptions that raise all the way out of the stack and halt the interpreter. This is useful for catching boot time errors and even signal kills.

To use, call this method very early during the program’s boot to cover as much code as possible:

require 'failbot'
Failbot.install_unhandled_exception_hook!

Returns true when the hook was installed, nil when the hook had previously been installed by another component.

# File lib/failbot/exit_hook.rb, line 51
def install_unhandled_exception_hook!
  # only install the hook once, even when called from multiple locations
  return if @unhandled_exception_hook_installed

  # the $! is set when the interpreter is exiting due to an exception
  at_exit do
    boom = $!
    if boom && !@raise_errors && !boom.is_a?(SystemExit)
      report(boom, 'argv' => ([$0]+ARGV).join(" "), 'halting' => true)
    end
  end

  @unhandled_exception_hook_installed = true
end
logger()
# File lib/failbot.rb, line 477
def logger
  @logger ||= Logger.new($stderr, formatter: proc { |severity, datetime, progname, msg|
    log = case msg
          when Hash
            msg.map { |k,v| "#{k}=#{v.inspect}" }.join(" ")
          else
            %Q|msg="#{msg.inspect}"|
          end
    log_line = %Q|ts="#{datetime.utc.iso8601}" level=#{severity} logger=Failbot #{log}\n|
    log_line.lstrip
  })
end
logger=(logger)
# File lib/failbot.rb, line 490
def logger=(logger)
  @logger = logger
end
method_missing(method, *args, &block)

Tap into any other method invocation on the Failbot module (especially report) and lazy load and configure everything the first time.

Calls superclass method
# File lib/failbot/exit_hook.rb, line 75
def method_missing(method, *args, &block)
  return super if @failbot_loaded
  require 'failbot'
  send(method, *args, &block)
end
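
A sketch of the lazy-loading pattern on a hypothetical `LazyThing` module. The full implementation is defined inline here as a stand-in for `require 'failbot'`, so the example runs without the gem.

```ruby
# Hypothetical module illustrating lazy loading through method_missing.
module LazyThing
  @loaded = false

  def self.method_missing(method, *args, &block)
    # Once loaded, an unknown method really is unknown.
    return super if @loaded

    load_full_implementation
    send(method, *args, &block)
  end

  def self.load_full_implementation
    @loaded = true
    # Stand-in for `require 'failbot'`: define the real method on first use.
    define_singleton_method(:greet) { |name| "hello #{name}" }
  end

  def self.loaded?
    @loaded
  end
end

before   = LazyThing.loaded?
greeting = LazyThing.greet("world")   # triggers the load, then re-dispatches
after    = LazyThing.loaded?
```

After the first call, `greet` is a real method, so later calls bypass `method_missing` entirely.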
pop()

Remove the last info hash from the context stack.

# File lib/failbot.rb, line 233
def pop
  context.pop if context.size > 1
end
push(info={}) { || ... }

Add info to be sent in the next failbot report, should one occur.

info  - Hash of name => value pairs to include in the exception report.
block - When given, the info is removed from the current context after the block is executed.

Returns the value returned by the block when given; otherwise, returns nil.

# File lib/failbot.rb, line 220
def push(info={})
  info.each do |key, value|
    if value.kind_of?(Proc)
      raise ArgumentError, "Proc usage has been removed from Failbot"
    end
  end
  context.push(info)
  yield if block_given?
ensure
  pop if block_given?
end
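
A sketch of the push and pop semantics, using a hypothetical `ContextStack` class in place of Failbot's thread-local context. The pushed values are made up for illustration.

```ruby
# Hypothetical stand-in for the context stack.
class ContextStack
  def initialize(base = {})
    @stack = [base]
  end

  def push(info = {})
    @stack.push(info)
    yield if block_given?
  ensure
    # Scoped push: remove the info again once the block has finished.
    pop if block_given?
  end

  # Never pops the base (default) context.
  def pop
    @stack.pop if @stack.size > 1
  end

  def depth
    @stack.size
  end
end

stack = ContextStack.new("server" => "example-host")

depth_inside = nil
result = stack.push("user" => "alice") do
  depth_inside = stack.depth
  :block_value
end
depth_after_block = stack.depth

stack.push("request_id" => "abc123")   # no block: stays until popped
depth_after_bare_push = stack.depth
```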
remove_from_report(key)

Loops through the stack of contexts and deletes the given key if it exists.

key - Name of key to remove.

Examples

remove_from_report(:some_key)

remove_from_report("another_key")

Returns nothing.

# File lib/failbot.rb, line 253
def remove_from_report(key)
  context.each do |hash|
    hash.delete(key.to_s)
    hash.delete(key.to_sym)
  end
end
report(e, other = {})

Public: Sends an exception to the exception tracking service along with a hash of custom attributes to be included with the report. When the raise_errors option is set, this method raises the exception instead of reporting to the exception tracking service.

e     - The Exception object. Must respond to message and backtrace.
other - Hash of additional attributes to include with the report.

Examples

begin
  my_code
rescue => e
  Failbot.report(e, :user => current_user)
end

Returns nothing.

# File lib/failbot.rb, line 334
def report(e, other = {})
  return if ignore_error?(e)

  if @raise_errors
    squash_contexts(context, exception_info(e), other) # surface problems squashing
    raise e
  else
    report!(e, other)
  end
end
report!(e, other = {})
# File lib/failbot.rb, line 345
def report!(e, other = {})
  report_with_context!(Thread.current, context, e, other)
end
report_from_thread(thread, e, other = {})
# File lib/failbot.rb, line 349
def report_from_thread(thread, e, other = {})
  if @raise_errors
    squash_contexts(@thread_local_context.value_from_thread(thread), exception_info(e), other) # surface problems squashing
    raise e
  else
    report_from_thread!(thread, e, other)
  end
end
report_from_thread!(thread, e, other = {})
# File lib/failbot.rb, line 358
def report_from_thread!(thread, e, other = {})
  return if ignore_error?(e)

  report_with_context!(thread, @thread_local_context.value_from_thread(thread), e, other)
end
reports()

Public: exceptions that were reported. Only available when using the memory and file backends.

Returns an Array of exception data Hashes.

# File lib/failbot.rb, line 397
def reports
  backend.reports
end
reset!()

Reset the context stack to a pristine state.

# File lib/failbot.rb, line 238
def reset!
  @thread_local_context.value = [context[0]].dup
end
rollup(&block)

Specify a custom block for calculating rollups. It should accept:

exception - The exception object
context   - The context hash

The block must return a String.

If a `rollup` attribute is supplied at the time of reporting, either via the `failbot_context` method on an exception, or passed to `Failbot.report`, it will be used as the rollup and this block will not be called.

# File lib/failbot.rb, line 293
def rollup(&block)
  @rollup = block
end
sanitize(attrs)
# File lib/failbot.rb, line 421
def sanitize(attrs)
  result = {}

  attrs.each do |key, value|
    result[key] =
      case value
      when Time
        value.iso8601
      when Date
        value.strftime("%F") # equivalent to %Y-%m-%d
      when Numeric
        value
      when String, true, false
        value.to_s
      when Proc
        "proc usage is deprecated"
      when Array
        if key == EXCEPTION_DETAIL
          # special-casing for the exception_detail key, which is allowed to
          # be an array with a specific structure.
          value
        else
          value.inspect
        end
      else
        value.inspect
      end
  end

  result
end
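
A sketch of what the individual branches of the listing above produce, using the same standard-library calls on made-up sample values. It does not call Failbot itself.

```ruby
require 'time'   # Time#iso8601
require 'date'

# Illustrative values only.
samples = {
  "when"   => Time.utc(2024, 1, 2, 3, 4, 5),
  "day"    => Date.new(2024, 1, 2),
  "count"  => 3,
  "flag"   => true,
  "tags"   => ["a", "b"],
  "status" => :ok,
}

converted = {
  "when"   => samples["when"].iso8601,        # Time branch
  "day"    => samples["day"].strftime("%F"),  # Date branch
  "count"  => samples["count"],               # Numeric branch: kept as-is
  "flag"   => samples["flag"].to_s,           # String/true/false branch
  "tags"   => samples["tags"].inspect,        # Array branch (ordinary key)
  "status" => samples["status"].inspect,      # fallback branch
}
```

Everything except numbers, and arrays stored under the EXCEPTION_DETAIL key, ends up as a String.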
setup(settings={}, default_context={})

Public: Setup the backend for reporting exceptions.

# File lib/failbot.rb, line 125
def setup(settings={}, default_context={})
  deprecated_settings = %w[
    backend host port haystack
    raise_errors
  ]

  if settings.empty? ||
    settings.keys.any? { |key| deprecated_settings.include?(key) }
    warn "%s Deprecated Failbot.setup usage. See %s for details." % [
      caller[0], "https://github.com/github/failbot"
    ]
    return setup_deprecated(settings)
  end

  initial_context = if default_context.respond_to?(:to_hash) && !default_context.to_hash.empty?
                      default_context.to_hash
                    else
                      { 'server' => hostname }
                    end

  @thread_local_context = ::Failbot::ThreadLocalVariable.new do
    [initial_context]
  end
  @thread_local_already_reporting = ::Failbot::ThreadLocalVariable.new { false }

  populate_context_from_settings(settings)
  @enable_timeout = false
  if settings.key?("FAILBOT_TIMEOUT_MS")
    @timeout_seconds = settings["FAILBOT_TIMEOUT_MS"].to_f / 1000
    @enable_timeout = (@timeout_seconds > 0.0)
  end

  @connect_timeout_seconds = nil
  if settings.key?("FAILBOT_CONNECT_TIMEOUT_MS")
    @connect_timeout_seconds = settings["FAILBOT_CONNECT_TIMEOUT_MS"].to_f / 1000
    # unset the value if it's not parsing to something valid
    @connect_timeout_seconds = nil unless @connect_timeout_seconds > 0
  end

  self.backend =
    case (name = settings["FAILBOT_BACKEND"])
    when "memory"
      Failbot::MemoryBackend.new
    when "waiter"
      Failbot::WaiterBackend.new
    when "file"
      Failbot::FileBackend.new(settings["FAILBOT_BACKEND_FILE_PATH"])
    when "http"
      Failbot::HTTPBackend.new(URI(settings["FAILBOT_HAYSTACK_URL"]), @connect_timeout_seconds, @timeout_seconds)
    when 'json'
      Failbot::JSONBackend.new(settings["FAILBOT_BACKEND_JSON_HOST"], settings["FAILBOT_BACKEND_JSON_PORT"])
    when 'console'
      Failbot::ConsoleBackend.new
    else
      raise ArgumentError, "Unknown backend: #{name.inspect}"
    end

  @raise_errors  = !settings["FAILBOT_RAISE"].to_s.empty?
  @thread_local_report_errors = ::Failbot::ThreadLocalVariable.new do
    settings["FAILBOT_REPORT"] != "0"
  end

  # allows overriding the 'app' value to send to single haystack bucket.
  # used primarily on ghe.io.
  @app_override = settings["FAILBOT_APP_OVERRIDE"]

  # Support setting exception_format from ENV/settings
  if settings["FAILBOT_EXCEPTION_FORMAT"]
    self.exception_format = settings["FAILBOT_EXCEPTION_FORMAT"].to_sym
  end

  @ignored_error_classes = settings.fetch("FAILBOT_IGNORED_ERROR_CLASSES", "").split(",").map do |class_name|
    Module.const_get(class_name.strip)
  end
end
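
A sketch of how a few of these settings are parsed, reusing the expressions from the listing above on a made-up settings hash. Values are Strings because settings commonly come from ENV.

```ruby
# Hypothetical settings, shaped like ENV (String keys and values).
settings = {
  "FAILBOT_TIMEOUT_MS"            => "250",
  "FAILBOT_CONNECT_TIMEOUT_MS"    => "not-a-number",
  "FAILBOT_IGNORED_ERROR_CLASSES" => "ArgumentError, KeyError",
}

# Milliseconds become seconds; the timeout is only enabled when positive.
timeout_seconds = settings["FAILBOT_TIMEOUT_MS"].to_f / 1000
enable_timeout  = timeout_seconds > 0.0

# An unparseable value becomes 0.0 and is then unset.
connect_timeout = settings["FAILBOT_CONNECT_TIMEOUT_MS"].to_f / 1000
connect_timeout = nil unless connect_timeout > 0

# Comma-separated class names are resolved to the classes themselves.
ignored = settings.fetch("FAILBOT_IGNORED_ERROR_CLASSES", "").split(",").map do |class_name|
  Module.const_get(class_name.strip)
end
```

A class name that does not resolve raises NameError from `Module.const_get`, so a typo in FAILBOT_IGNORED_ERROR_CLASSES surfaces when the settings are parsed.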
source_root=(str)

Root directory of the project’s source. Used to clean up stack traces if the exception format supports it.

# File lib/failbot.rb, line 53
def source_root=(str)
  @source_root = if str
    File.join(str, '')
  end
end
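
A sketch of what `File.join(str, '')` does: it guarantees exactly one trailing separator. The prefix stripping at the end is only an illustration of why that is convenient, not a description of how Failbot cleans frames; the paths are made up.

```ruby
with_slash    = File.join("/srv/app", "")    # separator appended
already_slash = File.join("/srv/app/", "")   # separator not doubled

# Illustration only: a root that ends in "/" strips cleanly from a frame
# and cannot match a sibling directory such as "/srv/app-old".
frame    = "/srv/app/lib/foo.rb:10:in `bar'"
relative = frame.delete_prefix(with_slash)
sibling  = "/srv/app-old/lib/foo.rb:10".start_with?(with_slash)
```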
squash_contexts(*contexts_to_squash)

Combines all context hashes into a single hash with String keys. Hashes are applied in order, so values from later hashes replace values from earlier ones.

contexts_to_squash - The context stack, followed by any additional hashes to squash in on top of the context stack hashes.

Returns a Hash with all keys and values.

# File lib/failbot.rb, line 409
def squash_contexts(*contexts_to_squash)
  squashed = {}

  contexts_to_squash.flatten.each do |hash|
    hash.each do |key, value|
      squashed[key.to_s] = value
    end
  end

  squashed
end
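
A usage sketch with the same logic copied into a standalone helper (the name `squash` and the sample hashes are hypothetical). It shows that keys are stringified and that later hashes win.

```ruby
# Standalone copy of the merging logic, for illustration.
def squash(*contexts)
  squashed = {}
  contexts.flatten.each do |hash|
    hash.each { |key, value| squashed[key.to_s] = value }
  end
  squashed
end

stack  = [{ "server" => "example-host" }, { :user => "alice" }]
merged = squash(stack, { "user" => "bob", :request_id => "abc123" })
```

Because `:user` and `"user"` both become `"user"`, the later value replaces the earlier one.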
use_default_rollup()
# File lib/failbot.rb, line 311
def use_default_rollup
  rollup(&DEFAULT_ROLLUP)
end

Private Instance Methods

ignore_error?(error)
# File lib/failbot.rb, line 612
def ignore_error?(error)
  @cache ||= Hash.new do |hash, error_class|
    hash[error_class] = @ignored_error_classes.any? do |ignored_error_class|
      error_class.ancestors.include?(ignored_error_class)
    end
  end

  @cache[error.class]
end
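
A sketch of the memoized ancestor check, run outside Failbot with a made-up list of ignored classes. The Hash's default block computes the answer once per exception class and stores it.

```ruby
# Illustrative list; in Failbot this comes from FAILBOT_IGNORED_ERROR_CLASSES.
ignored_error_classes = [ArgumentError, IOError]

cache = Hash.new do |hash, error_class|
  hash[error_class] = ignored_error_classes.any? do |ignored|
    error_class.ancestors.include?(ignored)
  end
end

eof_ignored     = cache[EOFError]       # EOFError inherits from IOError
runtime_ignored = cache[RuntimeError]
cached_classes  = cache.keys
```

Subclasses of an ignored class are ignored as well, since the check walks `ancestors`.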
instrument(name, payload = {})

Internal: Publish an event to the instrumenter

# File lib/failbot.rb, line 566
def instrument(name, payload = {})
  Failbot.instrumenter.instrument(name, payload) if Failbot.instrumenter
end
log_failure(action, data, original_exception, exception)
# File lib/failbot.rb, line 581
def log_failure(action, data, original_exception, exception)
  begin
    record = {
      "msg" => "exception",
      "action" => action,
      "data" => data,
    }

    record.merge!(to_semconv(exception))
    logger.debug record

    record = {
      "msg" => "report-failed",
      "action" => action,
      "data" => data,
    }
    record.merge!(to_semconv(original_exception))
    logger.debug record
  rescue => e
    raise e
  end
end
populate_context_from_settings(settings)

Populate default context from settings. Since settings commonly comes from ENV, this allows setting defaults for the context via the environment.

# File lib/failbot.rb, line 572
def populate_context_from_settings(settings)
  settings.each do |key, value|
    if /\AFAILBOT_CONTEXT_(.+)\z/ =~ key
      key = $1.downcase
      context[0][key] = value unless context[0][key]
    end
  end
end
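
A sketch of the key extraction on a made-up settings hash, with a plain Hash standing in for the base context. Existing context keys are not overwritten.

```ruby
# Hypothetical settings, shaped like ENV.
settings = {
  "FAILBOT_BACKEND"            => "memory",
  "FAILBOT_CONTEXT_APP"        => "billing",
  "FAILBOT_CONTEXT_DATACENTER" => "us-east",
  "FAILBOT_CONTEXT_SERVER"     => "from-env",
}

# Stand-in for context[0], the default context.
base_context = { "server" => "example-host" }

settings.each do |key, value|
  if /\AFAILBOT_CONTEXT_(.+)\z/ =~ key
    name = $1.downcase
    base_context[name] = value unless base_context[name]
  end
end
```

`FAILBOT_BACKEND` does not match the pattern and is skipped, and `server` keeps its original value because it was already present.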
report_with_context!(thread, provided_context, e, other = {})
# File lib/failbot.rb, line 508
def report_with_context!(thread, provided_context, e, other = {})
  return unless @thread_local_report_errors.value
  return if ignore_error?(e)

  if already_reporting
    logger.warn "FAILBOT: asked to report while reporting!" rescue nil
    logger.warn e.message rescue nil
    logger.warn e.backtrace.join("\n") rescue nil
    return
  end
  self.already_reporting = true

  begin
    data = squash_contexts(provided_context, exception_info(e), other)

    if !data.has_key?("rollup")
      data = data.merge("rollup" => @rollup.call(e, data, thread))
    end

    if defined?(@before_report) && @before_report
      data = squash_contexts(data, @before_report.call(e, data, thread))
    end

    if @app_override
      data = data.merge("app" => @app_override)
    end

    data = scrub(sanitize(data))
  rescue Object => i
    log_failure("processing", data, e, i)
    self.already_reporting = false
    return
  end

  start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  instrumentation_data = {
    "report_status" => "error",
  }
  begin
    if @enable_timeout
      Timeout.timeout(@timeout_seconds) do
        backend.report(data)
      end
    else
      backend.report(data)
    end
    instrumentation_data["report_status"] = "success"
  rescue Object => i
    log_failure("reporting", data, e, i)
    instrumentation_data["exception_type"] = i.class.name
  ensure
    instrumentation_data["elapsed_ms"] = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time) * 1000).to_i
    instrument("report.failbot", data.merge(instrumentation_data)) rescue nil
    self.already_reporting = false
  end
end
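
A sketch of the timing and timeout handling around the backend call, reduced to a hypothetical `timed_report` helper. The block stands in for `backend.report(data)`, and this sketch rescues StandardError rather than Object.

```ruby
require 'timeout'

# Hypothetical helper: run the block under a timeout and return
# instrumentation data similar in shape to the listing above.
def timed_report(timeout_seconds)
  start  = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = { "report_status" => "error" }
  begin
    Timeout.timeout(timeout_seconds) { yield }
    result["report_status"] = "success"
  rescue StandardError => error
    result["exception_type"] = error.class.name
  ensure
    # The monotonic clock is unaffected by wall-clock adjustments.
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    result["elapsed_ms"] = (elapsed * 1000).to_i
  end
  result
end

fast = timed_report(1) { :ok }
slow = timed_report(0.05) { sleep 1 }   # stand-in for a slow backend
```

A slow backend costs the caller at most roughly the configured timeout, and the failure is recorded in the instrumentation data instead of being raised.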
to_semconv(exception)
# File lib/failbot.rb, line 604
def to_semconv(exception)
  {
    "exception.type" => exception.class.to_s,
    "exception.message" => exception.message.encode("UTF-8", invalid: :replace, undef: :replace, replace: '�'),
    "exception.backtrace" => exception.full_message(highlight: false, order: :top).encode('UTF-8', invalid: :replace, undef: :replace, replace: '�'),
  }
end