class Html2rss::ItemExtractors::Href

Returns the value of the href attribute as an absolute URL. If the extracted href value is a relative URL, it is resolved against the channel's URL.

Imagine this HTML element with an href attribute:

<a href="/posts/latest-findings">...</a>

YAML usage example:

channel:
  url: http://blog-without-a-feed.example.com
  ...
selectors:
  link:
    selector: a
    extractor: href

Would return:

'http://blog-without-a-feed.example.com/posts/latest-findings'
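
The resolution step can be illustrated with Ruby's standard library alone. This is a minimal sketch, not the code html2rss runs: the `absolute_url` helper is hypothetical, and it assumes the behaviour described above (relative hrefs are joined onto the channel URL, absolute ones are left untouched).

```ruby
require 'uri'

# Hypothetical helper for illustration only; it is not part of html2rss.
# Returns a URI object, resolving relative hrefs against the channel URL.
def absolute_url(href, channel_url)
  url = URI(href)
  # An href that already carries a scheme needs no resolution.
  return url if url.absolute?

  URI.join(channel_url, href)
end

channel_url = 'http://blog-without-a-feed.example.com'

absolute_url('/posts/latest-findings', channel_url).to_s
# => "http://blog-without-a-feed.example.com/posts/latest-findings"
```

An href that is already absolute, such as `https://other.example.com/page`, would be returned unchanged by this sketch.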

Public Class Methods

new(xml, options)
# File lib/html2rss/item_extractors/href.rb, line 24
def initialize(xml, options)
  @options = options
  element = ItemExtractors.element(xml, options)
  @href = Html2rss::Utils.sanitize_url(element.attr('href'))
end

Public Instance Methods

get()

@return [URI::HTTPS, URI::HTTP]

# File lib/html2rss/item_extractors/href.rb, line 31
def get
  Html2rss::Utils.build_absolute_url_from_relative(@href, @options[:channel][:url])
end