class ProxyFetcher::Providers::Base

Base class for all the ProxyFetcher providers.

Public Class Methods

fetch_proxies!(*args)

Syntactic sugar that makes fetch_proxies! callable on the class itself: it instantiates the provider and delegates to the instance method.

# File lib/proxy_fetcher/providers/base.rb, line 42
def self.fetch_proxies!(*args)
  new.fetch_proxies!(*args)
end
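
The delegation pattern can be tried in isolation. The sketch below is not the library's code: ExampleProvider and its stubbed instance method are hypothetical stand-ins, used only to show that the class-level call and the instance-level call give the same result.

```ruby
# Hypothetical stand-in for a provider; not part of ProxyFetcher.
class ExampleProvider
  # Class-level shortcut: build an instance and delegate to it.
  def self.fetch_proxies!(*args)
    new.fetch_proxies!(*args)
  end

  # Instance method doing the real work (stubbed: it echoes the filters).
  def fetch_proxies!(filters = {})
    [filters]
  end
end

# Both calls go through the same instance method.
via_class = ExampleProvider.fetch_proxies!({ country: "US" })
via_instance = ExampleProvider.new.fetch_proxies!({ country: "US" })
```

Because the class method calls new without arguments, this shortcut only suits providers whose constructor needs no arguments.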

Public Instance Methods

fetch_proxies(filters = {})

Loads the proxy provider's page content, extracts the proxy list from it, and converts every entry to a proxy object. Entries that fail to convert, or that have no address, are dropped.

# File lib/proxy_fetcher/providers/base.rb, line 9
def fetch_proxies(filters = {})
  raw_proxies = load_proxy_list(filters)
  proxies = raw_proxies.map { |html_node| build_proxy(html_node) }.compact
  proxies.reject { |proxy| proxy.addr.nil? }
end
Also aliased as: fetch_proxies!
fetch_proxies!(filters = {})

Kept for backward compatibility.

Alias for: fetch_proxies
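
The map, compact, reject chain in fetch_proxies can be illustrated without the gem. This is a minimal sketch under stated assumptions: FakeProxy and build_fake_proxy are hypothetical stand-ins for the real proxy object and for build_proxy, and plain arrays stand in for HTML nodes.

```ruby
# Hypothetical stand-in for a proxy object; only addr and port are modelled.
FakeProxy = Struct.new(:addr, :port)

# Mimics build_proxy: returns nil when a row cannot be converted.
def build_fake_proxy(row)
  return nil unless row.is_a?(Array)

  FakeProxy.new(row[0], row[1])
end

raw_rows = [["10.0.0.1", 8080], "garbage", [nil, 3128], ["10.0.0.2", 80]]

# Step 1: convert every row, dropping the ones that failed (nil).
proxies = raw_rows.map { |row| build_fake_proxy(row) }.compact

# Step 2: drop entries that were built but carry no address.
proxies = proxies.reject { |proxy| proxy.addr.nil? }
```

The two filtering steps catch different failures: compact removes rows where conversion returned nil, while reject removes rows that converted but are unusable.
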
provider_headers()

@return [Hash]

Provider headers required to fetch the proxy list
# File lib/proxy_fetcher/providers/base.rb, line 33
def provider_headers
  {}
end
provider_method()
# File lib/proxy_fetcher/providers/base.rb, line 22
def provider_method
  :get
end
provider_params()
# File lib/proxy_fetcher/providers/base.rb, line 26
def provider_params
  {}
end
provider_url()
# File lib/proxy_fetcher/providers/base.rb, line 18
def provider_url
  raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
end
xpath()
# File lib/proxy_fetcher/providers/base.rb, line 37
def xpath
  raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
end
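
The hook methods above split into overridable defaults (provider_method, provider_params, provider_headers) and required overrides (provider_url, xpath). The following sketch shows how a descendant might fill them in. SketchBase is a minimal stand-in that mirrors the listed methods so the example runs without the gem; ExampleListProvider, its URL, its XPath expression, and its header are all made up for illustration.

```ruby
# Minimal stand-in mirroring the hook methods listed above.
# Not the real ProxyFetcher base class.
class SketchBase
  def provider_method
    :get
  end

  def provider_params
    {}
  end

  def provider_headers
    {}
  end

  def provider_url
    raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
  end

  def xpath
    raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
  end
end

# Hypothetical provider: every value below is an assumption.
class ExampleListProvider < SketchBase
  def provider_url
    "https://proxies.example.com/list"
  end

  def xpath
    '//table[@id="proxies"]/tbody/tr'
  end

  # Optional override; the inherited default is an empty Hash.
  def provider_headers
    { "Accept" => "text/html" }
  end
end

provider = ExampleListProvider.new
provider.provider_method # inherited default
provider.provider_url    # overridden in the descendant
```

A descendant that overrides neither provider_url nor xpath fails with NotImplementedError as soon as either method is called.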

Protected Instance Methods

build_proxy(*args)
# File lib/proxy_fetcher/providers/base.rb, line 107
def build_proxy(*args)
  to_proxy(*args)
rescue StandardError => e
  ProxyFetcher.logger.warn(
    "Failed to build Proxy for #{self.class.name.split("::").last} " \
    "due to error: #{e.message}"
  )

  nil
end
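
The rescue-and-log shape of build_proxy can be reproduced with the standard library alone. In this sketch RowBuilder, its convert method, and the row hashes are hypothetical; a Logger writing to a StringIO stands in for ProxyFetcher.logger.

```ruby
require "logger"
require "stringio"

# Hypothetical builder mirroring the structure of build_proxy.
class RowBuilder
  def initialize(logger)
    @logger = logger
  end

  # Returns the converted row, or logs a warning and returns nil.
  def build(row)
    convert(row)
  rescue StandardError => e
    @logger.warn(
      "Failed to build Proxy for #{self.class.name.split("::").last} " \
      "due to error: #{e.message}"
    )

    nil
  end

  private

  # Stand-in for to_proxy: fails when the address is missing.
  def convert(row)
    raise ArgumentError, "missing address" if row[:addr].nil?

    "#{row[:addr]}:#{row[:port]}"
  end
end

log_output = StringIO.new
builder = RowBuilder.new(Logger.new(log_output))

results = [{ addr: "10.0.0.1", port: 8080 }, { port: 3128 }].map do |row|
  builder.build(row)
end
```

One consequence worth knowing: NotImplementedError descends from ScriptError, not StandardError, so a rescue StandardError clause like this one does not swallow the error raised by an unimplemented to_proxy.
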
load_document(url, filters = {})

Loads the provider HTML and parses it into the internal document object.

@param url [String]

URL to fetch

@param filters [Hash]

Filters for the proxy provider

@return [ProxyFetcher::Document]

ProxyFetcher document object
# File lib/proxy_fetcher/providers/base.rb, line 90
def load_document(url, filters = {})
  html = load_html(url, filters)
  ProxyFetcher::Document.parse(html)
end
load_html(url, filters = {})

Loads raw provider HTML with proxies.

@param url [String]

Provider URL

@param filters [#to_h]

Provider filters (Hash-like object)

@return [String]

HTML body from the response
# File lib/proxy_fetcher/providers/base.rb, line 59
def load_html(url, filters = {})
  unless filters.respond_to?(:to_h)
    raise ArgumentError, "filters must be a Hash or respond to #to_h"
  end

  if filters&.any?
    # TODO: query for post request?
    uri = URI.parse(url)
    uri.query = URI.encode_www_form(provider_params.merge(filters.to_h))
    url = uri.to_s
  end

  ProxyFetcher.config.http_client.fetch(
    url,
    method: provider_method,
    headers: provider_headers,
    params: provider_params
  )
end
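
The query-string step in load_html relies only on Ruby's standard URI library, so it can be checked on its own. In this sketch the URL, the provider_params value, and the filter keys are assumptions chosen for illustration.

```ruby
require "uri"

# Hypothetical values standing in for a provider's params and user filters.
provider_params = { "type" => "http" }
filters = { country: "US", anonymity: "elite" }

uri = URI.parse("https://proxies.example.com/list")

# Filters are merged over the provider params; on a key clash the filter wins.
uri.query = URI.encode_www_form(provider_params.merge(filters.to_h))
url = uri.to_s
```

Assigning uri.query replaces any query string already present in the URL, so a provider URL that carries its own query parameters would lose them once filters are given.
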
load_proxy_list(filters = {})

Fetches the HTML content by sending an HTTP request to the provider URL, parses it into a document (an abstract ProxyFetcher::Document), and returns all the proxy entries (HTML nodes) matched by the provider's XPath expression.

@return [Array<ProxyFetcher::Document::Node>]

Collection of extracted HTML nodes with full proxy info
# File lib/proxy_fetcher/providers/base.rb, line 102
def load_proxy_list(filters = {})
  doc = load_document(provider_url, filters)
  doc.xpath(xpath)
end
to_proxy(*)

Converts an HTML element with proxy info to a ProxyFetcher::Proxy instance.

Abstract method. Must be implemented in a descendant class.

@return [Proxy]

New proxy object built from the HTML node
# File lib/proxy_fetcher/providers/base.rb, line 125
def to_proxy(*)
  raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
end