class DaimonSkycrawlers::Filter::DuplicateChecker

This filter checks whether a given URL has already been seen.

Use it to skip processing of duplicated URLs.

Public Class Methods

new(base_url: nil)
# File lib/daimon_skycrawlers/filter/duplicate_checker.rb, line 12
def initialize(base_url: nil)
  @base_url = nil
  @base_url = URI(base_url) if base_url
  @urls = Set.new
end

Public Instance Methods

call(message)

@param message [Hash] Message whose `:url` is checked for duplication. If the given URL is a relative URL, `@base_url + url` is used as the absolute URL.

@return [true|false] Returns false when the URL is duplicated, otherwise returns true.

# File lib/daimon_skycrawlers/filter/duplicate_checker.rb, line 23
def call(message)
  url = normalize_url(message[:url])
  return false if @urls.include?(url)
  @urls << url
  true
end

duplicated?(message)

@param message [Hash] Message whose `:url` is checked for duplication. If the given URL is a relative URL, `@base_url + url` is used as the absolute URL.

@return [true|false] Returns true when the URL is duplicated, otherwise returns false.

# File lib/daimon_skycrawlers/filter/duplicate_checker.rb, line 35
def duplicated?(message)
  !call(message)
end
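
The behaviour of the methods above can be tried in isolation with a minimal sketch. `SketchDuplicateChecker` is a hypothetical stand-in written for this example, not the gem's class, and its `normalize_url` is an assumption: the real `normalize_url` is defined elsewhere and is not shown on this page, so here it only resolves a relative URL against `@base_url`.

```ruby
require "set"
require "uri"

# Hypothetical stand-in that mirrors the logic shown above.
class SketchDuplicateChecker
  def initialize(base_url: nil)
    @base_url = base_url ? URI(base_url) : nil
    @urls = Set.new
  end

  # Returns true the first time a URL is seen, false on every repeat.
  def call(message)
    url = normalize_url(message[:url])
    return false if @urls.include?(url)
    @urls << url
    true
  end

  # Inverse of #call. It delegates to #call, so it also records the URL.
  def duplicated?(message)
    !call(message)
  end

  private

  # Assumed behaviour: resolve relative URLs against @base_url when one is set.
  def normalize_url(url)
    return url.to_s unless @base_url
    (@base_url + url).to_s
  end
end

checker = SketchDuplicateChecker.new(base_url: "http://example.com/")
first  = checker.call(url: "/a.html")                    # new URL
second = checker.call(url: "http://example.com/a.html")  # same URL, absolute form
third  = checker.duplicated?(url: "/b.html")             # new URL, recorded by the check
fourth = checker.duplicated?(url: "/b.html")             # now a duplicate
```

Since `duplicated?` is implemented as `!call(message)`, asking whether an unseen URL is duplicated also registers it, so a later `call` with the same URL returns false.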