class DoverToCalais::Dover
This class reads and parses text from a data source and sends it to OpenCalais. The data source is passed to the class constructor and can be almost any kind of document or URL. The user can register one or more callbacks ({#to_calais}), which are called once OpenCalais has processed the data source.
@!attribute [r] data_src
@return [String] the data source to be processed, either a file path or a URL.
@!attribute [r] error
@return [String, nil] any error that occurred during data-source processing, nil if none occurred
Constants
- CALAIS_SERVICE
Attributes
Public Class Methods
Public Instance Methods
Parses the source text. If parsing succeeds, the text is POSTed to OpenCalais via an EventMachine request, and a callback is set to handle the OpenCalais response. All Dover object callbacks are then called with the request result yielded to them.
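@param [Object, nil] output_format if given (truthy), 'application/json' output is requested from OpenCalais; otherwise 'Text/Simple' is used
@return a {Class ResponseData} object, passed to each registered callback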
# File lib/dover_to_calais.rb, line 531
def analyse_this(output_format=nil)

  if output_format
    @output_format = 'application/json'
  else
    @output_format = 'Text/Simple'
  end

  @document = get_src_data(@data_src)
  begin
    if @document[0..2].eql?('ERR')
      raise 'Invalid data source'
    else
      response = nil

      connection_options = {:inactivity_timeout => 0}

      if DoverToCalais::PROXY &&
          DoverToCalais::PROXY.class.eql?('Hash') &&
          DoverToCalais::PROXY.keys[0].eql?(:proxy)
        connection_options = connection_options.merge(DoverToCalais::PROXY)
      end

      request_options = {
        :body => @document.to_s,
        :head => {
          'x-calais-licenseID' => DoverToCalais::API_KEY,
          :content_type => 'TEXT/RAW',
          :enableMetadataType => 'GenericRelations,SocialTags',
          :outputFormat => @output_format}
      }

      http = EventMachine::HttpRequest.new(CALAIS_SERVICE, connection_options ).post request_options

      http.callback do
        if http.response_header.status == 200
          if @output_format == 'Text/Simple'
            http.response.match(/<OpenCalaisSimple>/) do |m|
              response = Nokogiri::XML('<OpenCalaisSimple>' + m.post_match) do |config|
                #strict xml parsing, disallow network connections
                config.strict.nonet
              end #block
            end
          else #@output_format == 'application/json'
            response = JSON.parse(http.response) #response should now be a Hash
          end #if

          case response.class.to_s
          when 'NilClass'
            result = ResponseData.new(nil,'ERR: cannot parse response data - source invalid?')
          when 'Nokogiri::XML::Document'
            result = ResponseData.new(response, nil)
          when 'Hash'
            result = ResponseData.new(response, nil)
          else
            result = ResponseData.new(nil,'ERR: cannot parse response data - unrecognized format!')
          end
        else #non-200 response
          result = ResponseData.new nil,
            "ERR: OpenCalais service responded with #{http.response_header.status} - response body: '#{http.response}'"
        end

        @callbacks.each { |c| c.call(result) }
      end #callback

      http.errback do
        result = ResponseData.new nil, "ERR: #{http.error}"
        @callbacks.each { |c| c.call(result) }
      end #errback
    end #if
  rescue Exception=>e
    #result = ResponseData.new nil, "ERR: #{e}"
    #@callbacks.each { |c| c.call(result) }
    @error = "ERR: #{e}"
  end
end
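The status check and response classification performed inside the HTTP callback can be illustrated in isolation. The following is only a minimal sketch for the JSON branch: `build_result` and the `ResponseData` struct are hypothetical stand-ins, not part of the library, and the `rescue` for malformed JSON is an addition that the listing above does not contain.

```ruby
require 'json'

# Hypothetical stand-in for the library's ResponseData (a data/error pair).
ResponseData = Struct.new(:data, :error)

# Hypothetical helper: turns an HTTP status and body into a ResponseData,
# following the same decisions as the callback in analyse_this (JSON only).
def build_result(status, body)
  unless status == 200
    return ResponseData.new(nil,
      "ERR: OpenCalais service responded with #{status} - response body: '#{body}'")
  end

  parsed = JSON.parse(body)
  if parsed.is_a?(Hash)
    ResponseData.new(parsed, nil)
  else
    ResponseData.new(nil, 'ERR: cannot parse response data - unrecognized format!')
  end
rescue JSON::ParserError => e
  # Assumption: report malformed JSON as an error result instead of raising.
  ResponseData.new(nil, "ERR: #{e.message}")
end

result = build_result(200, '{"doc": {"info": "sample"}}')
puts result.error ? result.error : result.data.keys.inspect
```

In every case the caller receives exactly one object, with either `data` or `error` set, which is what lets each registered callback handle success and failure through the same block.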
Defines the user callbacks. If the data source has been read successfully, this method stores a user-defined block, which is called on completion of the OpenCalais HTTP request. If the data source cannot be read, for whatever reason, the block is called immediately with a result object carrying the error that caused the read failure.
@param block a user-defined block
@return N/A
# File lib/dover_to_calais.rb, line 510
def to_calais(&block) #fred rules ok
  if !@error
    @callbacks << block
  else
    result = ResponseData.new nil, @error
    block.call(result)
  end
end
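The store-or-call-immediately behaviour of the callback registration can be tried without EventMachine or an OpenCalais key. This is a sketch under stated assumptions: `MiniDover`, its `finish` method and the `ResponseData` struct are hypothetical miniatures written for illustration and are not the library's classes.

```ruby
# Hypothetical stand-in for the library's ResponseData (a data/error pair).
ResponseData = Struct.new(:data, :error)

# Hypothetical miniature of the callback handling; not the real Dover class.
class MiniDover
  attr_reader :error

  def initialize(error = nil)
    @error = error      # a previously recorded read error, if any
    @callbacks = []
  end

  # Store the block when no error is recorded, otherwise call it at once.
  def to_calais(&block)
    if @error
      block.call(ResponseData.new(nil, @error))
    else
      @callbacks << block
    end
  end

  # Stand-in for the moment the HTTP response arrives.
  def finish(data)
    result = ResponseData.new(data, nil)
    @callbacks.each { |c| c.call(result) }
  end
end

seen = []

ok = MiniDover.new
ok.to_calais { |r| seen << [:ok, r.data, r.error] }
ok.finish('entities')           # stored block runs only now

failed = MiniDover.new('ERR: no such file')
failed.to_calais { |r| seen << [:failed, r.data, r.error] }  # runs immediately
```

Because both paths hand the block the same kind of object, a caller only needs to check `error` before using `data`.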
Private Instance Methods
Uses the {github.com/Erol/yomu yomu} gem to extract text from a number of document formats and URLs. If an exception occurs, its message is written to the {@error} instance variable.
@param [String] src the name of the data source (file path or URI)
@return [String] the extracted text, or an 'ERR: ...' string (also stored in @error) if an exception occurred
# File lib/dover_to_calais.rb, line 491
def get_src_data(src)
  begin
    yomu = Yomu.new src
  rescue Exception=>e
    @error = "ERR: #{e}"
  else
    yomu.text
  end
end
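The value a `begin`/`rescue`/`else` expression yields on failure is easy to overlook, so here is a small sketch of the same shape. `extract_text` and `SourceReader` are hypothetical stand-ins for Yomu and the Dover class, and the sketch rescues `StandardError` rather than `Exception`, which is a deliberate substitution and not what the listing above does.

```ruby
# Hypothetical extractor standing in for Yomu; raises for an empty source.
def extract_text(src)
  raise ArgumentError, "no such source: #{src}" if src.nil? || src.empty?
  "text of #{src}"
end

# Hypothetical reader mirroring the begin/rescue/else shape of get_src_data.
class SourceReader
  attr_reader :error

  def read(src)
    begin
      text = extract_text(src)
    rescue StandardError => e
      # The assignment is the last expression evaluated in this branch,
      # so the "ERR: ..." string is also the method's return value.
      @error = "ERR: #{e}"
    else
      text
    end
  end
end

reader = SourceReader.new
good = reader.read('report.pdf')
bad  = reader.read('')

puts good
puts bad
puts bad[0..2].eql?('ERR')
```

This is why a caller can detect a failed read by inspecting the first three characters of the returned string, as `analyse_this` does.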