class DaimonSkycrawlers::Filter::RobotsTxtChecker

This filter checks whether fetching a given URL is allowed by the site's robots.txt. It lets crawlers obey the robots.txt a web site provides.

Public Class Methods

new(base_url: nil, user_agent: "DaimonSkycrawlers/#{DaimonSkycrawlers::VERSION}")
Calls superclass method DaimonSkycrawlers::Filter::Base::new
# File lib/daimon_skycrawlers/filter/robots_txt_checker.rb, line 12
def initialize(base_url: nil, user_agent: "DaimonSkycrawlers/#{DaimonSkycrawlers::VERSION}")
  super()
  @base_url = base_url
  # WebRobots (the webrobots gem) fetches, caches, and evaluates robots.txt per site
  @webrobots = WebRobots.new(user_agent)
end
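
For illustration, a short construction sketch; the base_url and user_agent values here are hypothetical, and DaimonSkycrawlers::VERSION resolves to the installed gem version:

require "daimon_skycrawlers/filter/robots_txt_checker"

# Default user agent: "DaimonSkycrawlers/<VERSION>"
checker = DaimonSkycrawlers::Filter::RobotsTxtChecker.new

# Custom base URL and user agent
checker = DaimonSkycrawlers::Filter::RobotsTxtChecker.new(
  base_url: "https://example.com/",
  user_agent: "MyCrawler/1.0"
)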

Public Instance Methods

allowed?(message)
Alias for: call
call(message)

@param message [Hash] a message containing the URL to check against robots.txt
@return [true, false] true when the web site allows fetching the URL, false otherwise

# File lib/daimon_skycrawlers/filter/robots_txt_checker.rb, line 22
def call(message)
  url = normalize_url(message[:url])  # normalize the URL taken from the message
  @webrobots.allowed?(url)            # true if robots.txt permits fetching the URL
end
Also aliased as: allowed?
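
A hedged usage sketch, assuming a message Hash carrying a :url key (as call reads message[:url] above) and a checker built as in the constructor example:

message = { url: "https://example.com/some/page.html" }

if checker.allowed?(message)  # alias of call
  # the site's robots.txt permits fetching this URL
else
  # disallowed by robots.txt; skip the URL
end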