Module: Textminer
- Extended by: Configuration
- Defined in: lib/textminer/mined.rb, lib/textminer.rb, lib/textminer/miner.rb, lib/textminer/request.rb, lib/textminer/version.rb, lib/textminer/response.rb
Overview
Textminer::Miner
Class that returns a text mining object.
Defined Under Namespace
Classes: Mined, Miner, Request, Response
Constant Summary
- VERSION = "0.1.5"
Class Method Summary
- + (Object) extract(path)
  Thin layer around pdf-reader gem's PDF::Reader.
- + (Mined) fetch(url)
  Get full text.
- + (Array) search(doi: nil, member: nil, filter: nil, limit: nil, options: nil)
  Search for papers and get full text links.
Methods included from Configuration
Class Method Details
+ (Object) extract(path)
Thin layer around pdf-reader gem's PDF::Reader
This method is used internally within fetch to parse PDFs.
    # File 'lib/textminer.rb', line 140
    def self.extract(path)
      rr = PDF::Reader.new(path)
      rr.pages.map { |page| page.text }.join("\n")
    end
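The page-joining step in extract can be tried without a PDF file. The sketch below is only an illustration: FakePage, FakeReader and join_page_text are hypothetical stand-ins (not part of Textminer or pdf-reader) that mimic the one thing extract relies on, namely a reader whose pages each respond to text.

```ruby
# Hypothetical stand-ins for PDF::Reader and its page objects.
# They exist only so the joining logic can run without the pdf-reader gem.
FakePage   = Struct.new(:text)
FakeReader = Struct.new(:pages)

# Same shape as the body of Textminer.extract, but it takes an
# already-built reader instead of a file path.
def join_page_text(reader)
  reader.pages.map { |page| page.text }.join("\n")
end

reader = FakeReader.new([FakePage.new("Title page"), FakePage.new("Body text")])
puts join_page_text(reader)
```

With the real gem, the reader would come from PDF::Reader.new(path), as in the source above; each page's text ends up separated by a single newline.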
+ (Mined) fetch(url)
Get full text
Works easily for open access papers, but not for closed-access ones. For non-OA papers, use Crossref's Text and Data Mining (TDM) service, which requires authentication and a pre-authorized IP address. Go to apps.crossref.org/clickthrough/researchers to sign up for the TDM service and get your key. The only publishers taking part at this time are Elsevier and Wiley.
Returns a Mined object that provides the URL requested, the file path, and methods for parsing the plain text or XML, or extracting text from the PDF.
    # File 'lib/textminer.rb', line 120
    def self.fetch(url)
      Miner.new(url).perform
    end
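The description above mentions three kinds of full text: plain text, XML, and PDF. One way such a choice could be made is by looking at the response's Content-Type. The sketch below assumes that approach; parse_strategy is a hypothetical helper, not a Textminer method, and the source shown here does not confirm how Miner actually decides.

```ruby
# Hypothetical helper: map a Content-Type header value to a parsing strategy.
# This is an assumption about how dispatch might work, not Textminer's code.
def parse_strategy(content_type)
  # Drop parameters such as "; charset=utf-8" and normalise case.
  media_type = content_type.to_s.split(";").first.to_s.strip.downcase

  case media_type
  when "text/plain"                  then :plain_text
  when "application/xml", "text/xml" then :xml
  when "application/pdf"             then :pdf # would go through extract(path)
  else :unknown
  end
end

puts parse_strategy("application/pdf")
puts parse_strategy("text/xml; charset=utf-8")
```

Returning a symbol rather than parsing directly keeps the decision separate from the work, so each branch can be tested on its own.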
+ (Array) search(doi: nil, member: nil, filter: nil, limit: nil, options: nil)
Search for papers and get full text links
    # File 'lib/textminer.rb', line 48
    def self.search(doi: nil, member: nil, filter: nil, limit: nil, options: nil)
      Request.new(doi, member, filter, limit, ).perform
    end
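The source above only shows search delegating to Request, so what Request does with doi, member, filter and limit is not documented here. As a rough sketch, assuming these arguments end up as a query against the Crossref REST API (api.crossref.org), the mapping might look like the following. crossref_works_url is a hypothetical helper, and the member id in the usage line is arbitrary.

```ruby
require "uri"

CROSSREF_API = "https://api.crossref.org"

# Hypothetical helper: build a Crossref works URL from search-style arguments.
# Assumes doi selects a single work, member scopes to a publisher, filter is a
# Hash of Crossref filter names to values, and limit maps to the rows parameter.
def crossref_works_url(doi: nil, member: nil, filter: nil, limit: nil)
  path =
    if doi
      "/works/#{doi}"
    elsif member
      "/members/#{member}/works"
    else
      "/works"
    end

  params = {}
  params["filter"] = filter.map { |name, value| "#{name}:#{value}" }.join(",") if filter
  params["rows"]   = limit if limit

  query = URI.encode_www_form(params)
  query.empty? ? "#{CROSSREF_API}#{path}" : "#{CROSSREF_API}#{path}?#{query}"
end

puts crossref_works_url(member: 1234, filter: { "has-full-text" => true }, limit: 5)
```

The helper only builds the URL; sending the request and pulling full text links out of the response are left out, since the source does not show how Request and Response handle them.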