class DaimonSkycrawlers::Filter::DuplicateChecker
This filter checks whether a given URL has already been seen.
Use it to skip processing of duplicated URLs.
Public Class Methods
new(base_url: nil)
# File lib/daimon_skycrawlers/filter/duplicate_checker.rb, line 12
def initialize(base_url: nil)
  @base_url = nil
  @base_url = URI(base_url) if base_url
  @urls = Set.new
end
Public Instance Methods
call(message)
@param message [Hash] message to check for duplication. If the given URL is
relative, `@base_url + url` is used as the absolute URL.
@return [true|false] false when the URL is duplicated, otherwise true. A URL that has not been seen before is recorded.
# File lib/daimon_skycrawlers/filter/duplicate_checker.rb, line 23
def call(message)
  url = normalize_url(message[:url])
  return false if @urls.include?(url)
  @urls << url
  true
end
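The behavior of `call` can be illustrated with a minimal, self-contained sketch. `SketchDuplicateChecker` is a hypothetical stand-in written for this example, not the library class, and its `normalize_url` is an assumption: the real helper is defined elsewhere and may normalize differently. Here it only resolves relative URLs against the base URL, as the `@param` note describes.

```ruby
require "set"
require "uri"

# Hypothetical stand-in that mirrors the source shown above.
class SketchDuplicateChecker
  def initialize(base_url: nil)
    @base_url = base_url ? URI(base_url) : nil
    @urls = Set.new
  end

  # Returns true the first time a URL is seen, false afterwards.
  def call(message)
    url = normalize_url(message[:url])
    return false if @urls.include?(url)
    @urls << url
    true
  end

  private

  # Assumption: resolve relative URLs against @base_url and compare as strings.
  def normalize_url(url)
    return url.to_s unless @base_url
    (@base_url + url).to_s
  end
end

checker = SketchDuplicateChecker.new(base_url: "http://example.com/")
first    = checker.call(url: "/a")                    # new URL
second   = checker.call(url: "/a")                    # same relative URL again
absolute = checker.call(url: "http://example.com/a")  # same page, absolute form
```

Under these assumptions, `first` is true while `second` and `absolute` are false, because the relative and absolute forms resolve to the same string.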
duplicated?(message)
@param message [Hash] message to check for duplication. If the given URL is
relative, `@base_url + url` is used as the absolute URL.
@return [true|false] true when the URL is duplicated, otherwise false. Because this method delegates to `call`, a URL that has not been seen before is recorded.
# File lib/daimon_skycrawlers/filter/duplicate_checker.rb, line 35
def duplicated?(message)
  !call(message)
end
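Since `duplicated?` is implemented as `!call(message)`, querying it is not a read-only operation. The sketch below shows the consequence using a hypothetical `TinyChecker` stand-in (not the library class) that omits URL normalization for brevity.

```ruby
require "set"

# Hypothetical stand-in; URL normalization is omitted in this sketch.
class TinyChecker
  def initialize
    @urls = Set.new
  end

  def call(message)
    url = message[:url]
    return false if @urls.include?(url)
    @urls << url
    true
  end

  # Delegates to call, so an unseen URL is recorded as a side effect.
  def duplicated?(message)
    !call(message)
  end
end

checker = TinyChecker.new
probe       = checker.duplicated?(url: "http://example.com/a")  # not a duplicate yet
after_probe = checker.call(url: "http://example.com/a")         # already recorded by the probe
```

Here `probe` is false, yet `after_probe` is also false: the earlier `duplicated?` query registered the URL, so it is safest to use one of the two methods per message rather than both.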