class PgImporter
Imports the StackOverflow data into a PostgreSQL database.
Public Class Methods
import_from_argv(argv)
# File lib/so2pg.rb, line 46
def self.import_from_argv(argv)
  # Parse the command-line options (from the argv parameter, not the global ARGV)
  cmd_opts = PgOptionsParser.parse(argv)

  # If all validation passed, then execute the import!
  if cmd_opts
    start = Time.now
    pg = PgImporter.new(cmd_opts.has_key?(:relationships),
                        cmd_opts.has_key?(:optionals),
                        cmd_opts)
    pg.import(cmd_opts[:dir])
    puts "Import completed in #{Time.now - start}s"
  end
end
new(relations = false, optionals = false, options = {})
(See SO2DB::Importer.initialize documentation)
Calls superclass method SO2DB::Importer::new
# File lib/so2pg.rb, line 42
def initialize(relations = false, optionals = false, options = {})
  super(relations, optionals, "postgresql", options)
end
Private Instance Methods
build_cmd(sql)
Builds the import command with the given SQL command and the global connection options.
Example:
>> sql = "COPY ..."
>> puts build_cmd(sql)
=> psql -d test -h localhost -c "COPY ..."
# File lib/so2pg.rb, line 91
def build_cmd(sql)
  # Only exists within the context of this script (not exported), so this
  # does not degrade security posture after the script has completed
  ENV['PGPASSWORD'] = conn_opts[:password] if conn_opts.has_key? :password

  cmd = "psql"
  cmd << " -d #{conn_opts[:database]}" if conn_opts.has_key? :database
  cmd << " -h #{conn_opts[:host]}" if conn_opts.has_key? :host
  cmd << " -U #{conn_opts[:username]}" if conn_opts.has_key? :username
  cmd << " -p #{conn_opts[:port]}" if conn_opts.has_key? :port
  cmd << " -c \"#{sql}\""

  return cmd
end
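Because the command above is assembled by plain string interpolation, an option value containing spaces or quotes would change how the shell splits it. The following is a minimal sketch of one way to guard against that, assuming a hypothetical standalone helper named `build_psql_cmd` that receives the connection options as an argument. It is not part of so2pg, and the table name is made up for illustration.

```ruby
require 'shellwords'

# Hypothetical standalone variant of build_cmd (not part of so2pg).
# conn_opts is passed in explicitly instead of being read from the importer.
def build_psql_cmd(sql, conn_opts = {})
  # Map each supported option key to the psql flag used in build_cmd.
  flags = { database: '-d', host: '-h', username: '-U', port: '-p' }

  args = ['psql']
  flags.each do |key, flag|
    args << flag << conn_opts[key].to_s if conn_opts.key?(key)
  end
  args << '-c' << sql

  # Shellwords.join escapes every argument so the shell sees each one intact.
  Shellwords.join(args)
end

puts build_psql_cmd('COPY posts FROM STDIN', database: 'test', host: 'localhost')
```

Splitting the result with `Shellwords.split` gives back the original argument list, which is a quick way to check that nothing was mangled.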
build_sql(value_str)
Builds the SQL command used for bulk loading the tables.
# File lib/so2pg.rb, line 80
def build_sql(value_str)
  "COPY #{value_str} FROM STDIN WITH (FORMAT csv, DELIMITER E'\x0B')"
end
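To see what this method produces, here is a standalone copy of the same one-line body. The argument `'posts(id, title)'` is an assumed shape for `value_str` (a table name followed by a column list), chosen only for illustration. Note that Ruby expands `\x0B` inside the double-quoted string, so the generated SQL carries an actual vertical-tab character as the delimiter.

```ruby
# Standalone copy of the build_sql body, for illustration only.
def build_sql(value_str)
  "COPY #{value_str} FROM STDIN WITH (FORMAT csv, DELIMITER E'\x0B')"
end

# 'posts(id, title)' is a hypothetical value_str, not taken from so2pg.
sql = build_sql('posts(id, title)')

# inspect makes the embedded control character visible.
puts sql.inspect
puts sql.include?("\v")
```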
execute_cmd(cmd, formatter)
Executes the provided shell command and pumps the data from the formatter to it over stdin.
# File lib/so2pg.rb, line 108
def execute_cmd(cmd, formatter)
  IO.popen(cmd, 'r+') do |s|
    formatter.format(s)
    s.close_write
  end
end
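The pattern above (open a child process, write to its stdin, then close the write end) can be tried without a database. The sketch below assumes a Unix-like system where `cat` is available as a stand-in for `psql`, and uses a hypothetical `DemoFormatter` class in place of the real formatter. Unlike `execute_cmd`, it also reads the child's output so the result can be inspected.

```ruby
# Hypothetical stand-in for the formatter: writes two rows to the given IO,
# using a vertical tab between fields.
class DemoFormatter
  def format(io)
    io.puts "1\vfirst"
    io.puts "2\vsecond"
  end
end

# Same shape as execute_cmd, but returns what the child process printed.
def pump(cmd, formatter)
  IO.popen(cmd, 'r+') do |s|
    formatter.format(s)
    s.close_write # signals end-of-input so the child can finish
    s.read        # collects the child's stdout
  end
end

# 'cat' simply echoes its stdin; it stands in for psql here.
output = pump('cat', DemoFormatter.new)
puts output.lines.length
```

Calling `close_write` is what tells the child that no more data is coming. Without it, a command that reads until end-of-input would wait indefinitely.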
import_stream(formatter)
Imports the data from the formatter into the PostgreSQL database.
Note that this is just one way to implement the importer. You could just as easily write the formatted data to a file and then have the database load that file.
# File lib/so2pg.rb, line 68
def import_stream(formatter)
  puts "Importing file #{formatter.file_name}..."
  start = Time.now

  sql = build_sql(formatter.value_str)
  cmd = build_cmd(sql)
  execute_cmd(cmd, formatter)

  puts " -> #{Time.now - start}s"
end
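The file-based alternative mentioned in the note could look roughly like the sketch below. It is one possible reading of that idea, not so2pg's implementation: the formatted rows are staged in a temporary file, and a shell redirect feeds that file to the same kind of `psql` command. `DemoFormatter`, `stage_to_file`, and `with_file_input` are hypothetical names, and the sketch only builds the command string rather than running `psql`.

```ruby
require 'tempfile'
require 'shellwords'

# Hypothetical stand-in for the formatter; the real one comes from SO2DB.
class DemoFormatter
  def format(io)
    io.puts "1\vfirst"
    io.puts "2\vsecond"
  end
end

# Writes the formatted rows to a temporary file and returns the Tempfile.
def stage_to_file(formatter)
  file = Tempfile.new(['so2pg', '.csv'])
  formatter.format(file)
  file.flush # make sure the rows are on disk before anything reads them
  file
end

# Appends a shell redirect so the command reads the staged file on stdin.
def with_file_input(cmd, path)
  "#{cmd} < #{Shellwords.escape(path)}"
end

staged = stage_to_file(DemoFormatter.new)
puts with_file_input('psql -d test -c "COPY posts FROM STDIN"', staged.path)
```

Staging to a file trades extra disk usage for the ability to inspect or retry a load without reformatting the source data.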