module NewsScraper::Trainer

Public Instance Methods

train(query: '')

Fetches articles from Extraction sources and trains on the results

Training takes an untrained URL (one whose root domain is not in article_scrape_patterns.yml), determines the patterns and methods that match the data_types listed in article_scrape_patterns.yml, and records them to that file.
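The "untrained" check described above can be sketched roughly as follows. This is a minimal illustration, not the gem's implementation: the YAML layout (a top-level domains key), the helper names root_domain and untrained?, and the "strip a leading www." notion of root domain are all assumptions made for the example.

```ruby
require 'uri'
require 'yaml'

# Hypothetical stand-in for the contents of article_scrape_patterns.yml.
# The real file's layout is not shown in this document and may differ.
patterns_yaml = <<~YAML
  domains:
    example.com:
      title:
        method: css
        pattern: h1
YAML

# Assumed helper: returns the host without a leading "www.".
# This is a simplification; it does not resolve other subdomains.
def root_domain(url)
  URI.parse(url).host.sub(/\Awww\./, '')
end

# Assumed helper: a URL is untrained when its root domain has no entry
# under the (hypothetical) "domains" key.
def untrained?(url, patterns)
  !patterns.fetch('domains', {}).key?(root_domain(url))
end

patterns = YAML.safe_load(patterns_yaml)

untrained?('https://www.example.com/story', patterns)  # already has patterns
untrained?('https://news.example.org/story', patterns) # would need training
```

Only URLs for which the check is true would need the pattern-discovery step; the others already have recorded patterns.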

Params

  • query: a keyword argument specifying the query to train on

# File lib/news_scraper/trainer.rb, line 18
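def train(query: '')
  # Collect article URLs for the query from the Google News RSS extractor
  article_urls = Extractors::GoogleNewsRss.new(query: query).extract
  # Train on each URL in turn
  article_urls.each do |url|
    Trainer::UrlTrainer.new(url).train
  end
end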
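
The control flow of train can be exercised without network access by swapping in stand-ins. The sketch below is only an illustration: the extract and train_url lambdas are hypothetical replacements for Extractors::GoogleNewsRss#extract and Trainer::UrlTrainer#train, and the URLs are made up.

```ruby
# Hypothetical stand-in for the extractor: returns a fixed list of
# article URLs instead of querying a feed.
extract = lambda do |query|
  ["https://example.com/#{query}/1", "https://example.org/#{query}/2"]
end

# Hypothetical stand-in for the URL trainer: records each URL it is given.
trained = []
train_url = ->(url) { trained << url }

# Same shape as train above: extract URLs for the query, then train on each.
train = lambda do |query: ''|
  extract.call(query).each { |url| train_url.call(url) }
end

train.call(query: 'shopify')
```

After the call, trained holds every URL the extractor stand-in returned, in order, which mirrors how train hands each extracted URL to a trainer.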