class RemoteHadoopGem
Public Class Methods
cleanTempFiles(username, host, keyPathFile)
****** cleanTempFiles ******
Removes all the temporary files created by commandToHadoopJobID on the remote Hadoop machine. Requires:
1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
# File lib/remoteHadoopGem.rb, line 114
def self.cleanTempFiles(username, host, keyPathFile)
  `ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; rm ./tempJobID_* "`
end
commandToHadoop(username, host, keyPathFile, command)
****** commandToHadoop ******
Runs a command on a remote Hadoop machine and returns its output as a string. Requires:
1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) command (the command to execute)
# File lib/remoteHadoopGem.rb, line 15
def self.commandToHadoop(username, host, keyPathFile, command)
  res=`ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; #{command}"`
  return "#{res}"
end
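To see what commandToHadoop hands to the shell, the command line can be assembled as a plain string without running it. The following is a minimal sketch, not part of the gem: build_remote_command is a hypothetical helper that mirrors the interpolation in the source above, and the user, host, key path and Hadoop command are made-up example values.

```ruby
# Hypothetical helper (not part of RemoteHadoopGem): returns the command line
# that commandToHadoop would pass to the local shell, without executing it.
def build_remote_command(username, host, key_path_file, command)
  # Same remote prefix as in the gem source: extend PATH, change to /hadoop/.
  remote = "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; #{command}"
  %(ssh -i "#{key_path_file}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "#{remote}")
end

# Example values are assumptions chosen for illustration only.
cmd = build_remote_command("hadoop", "192.0.2.10", "/home/me/.ssh/id_rsa", "hadoop fs -ls /")
puts cmd
```

Printing the string makes one detail visible: the remote part sits inside double quotes, so when it is run through backticks the local shell expands $PATH before ssh starts. A command argument that itself contains double quotes would also end the quoted section early, so such commands likely need escaping by the caller.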
commandToHadoopJobID(username, host, keyPathFile, command)
****** commandToHadoopJobID ******
Runs a command on a remote Hadoop machine and returns the <job-id>. Requires:
1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) command (the command to execute)
# File lib/remoteHadoopGem.rb, line 30
def self.commandToHadoopJobID(username, host, keyPathFile, command)
  # create a file with a random name to obtain the jobID
  filename="tempJobID_"+UUIDTools::UUID.random_create.to_s
  #`ssh -n -f -i "#{keyPathFile}" "#{username}"@"#{host}" 'sh -c "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; nohup #{command} >/dev/null 2>#{filename}.txt & " '`
  `ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" 'sh -c "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; nohup #{command} >/dev/null 2>#{filename}.txt & " '`
  res=`sleep 10; ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "cat /hadoop/#{filename}.txt | grep 'Running job' " `
  `ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "rm /hadoop/#{filename}.txt"`
  jobid=res.split[6]
  return "#{jobid}"
end
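The job ID is taken from the line that grep finds for 'Running job', using res.split[6]. The sketch below shows that extraction on a sample line. The sample is an assumption written in the style of a Hadoop 1.x JobClient log line, not output captured from the gem, and the pattern-based variant is a suggestion rather than the gem's code.

```ruby
# Illustrative log line (an assumption, in the Hadoop 1.x JobClient style).
sample = "13/05/02 10:15:42 INFO mapred.JobClient: Running job: job_201305021000_0001"

# What the gem source does: take the seventh whitespace-separated token.
job_id_by_position = sample.split[6]

# A less position-dependent alternative (a suggestion, not the gem's code):
# match the job_<timestamp>_<sequence> shape wherever it appears on the line.
job_id_by_pattern = sample[/job_\d+_\d+/]

puts job_id_by_position
puts job_id_by_pattern
```

If no matching line has been written within the 10-second wait, res is empty, res.split[6] is nil, and the method returns an empty string, so callers may want to check for that before passing the result on.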
copyFileTo(username, host, keyPathFile, filePath, destFilePath)
****** copyFileTo ******
Copies a local file to a remote Hadoop machine. Requires:
1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) filePath (path of the local file)
5) destFilePath (path on the remote Hadoop node)
# File lib/remoteHadoopGem.rb, line 53
def self.copyFileTo(username, host, keyPathFile, filePath, destFilePath)
  res=`scp -r -i "#{keyPathFile}" -o StrictHostKeyChecking=no #{filePath} "#{username}"@"#{host}":#{destFilePath}`
end
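In the scp call above, filePath is interpolated without quotes, so a local path containing spaces would be split into several arguments by the shell. One way a caller could guard against that is to escape the path with Ruby's standard Shellwords module before passing it in. The sketch below only builds the command string; build_scp_command is a hypothetical helper and all values are examples.

```ruby
require "shellwords"

# Hypothetical helper (not part of RemoteHadoopGem): builds the scp command line
# in the same shape as copyFileTo, with the local path shell-escaped.
def build_scp_command(username, host, key_path_file, file_path, dest_file_path)
  local = Shellwords.escape(file_path) # "a b.txt" becomes "a\ b.txt"
  %(scp -r -i "#{key_path_file}" -o StrictHostKeyChecking=no #{local} "#{username}"@"#{host}":#{dest_file_path})
end

# Example values are assumptions chosen for illustration only.
scp_cmd = build_scp_command("hadoop", "192.0.2.10", "/home/me/.ssh/id_rsa",
                            "data/input file.txt", "/hadoop/input/")
puts scp_cmd
```

The remote path is left exactly as in the source; how spaces in it are treated depends on the scp version in use, so that case is not covered here.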
jobStatus(username, host, keyPathFile, job_id)
****** jobStatus ******
Reads the status of a running job on a remote Hadoop machine. Requires:
1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) job_id (ID of the job)
# File lib/remoteHadoopGem.rb, line 99
def self.jobStatus(username, host, keyPathFile, job_id)
  res=`ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; hadoop job -status #{job_id} "`
  return "#{res}"
end
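jobStatus returns the raw text printed by hadoop job -status, so any interpretation is left to the caller. As a rough sketch, progress could be pulled out of that text as shown below. The sample text is an assumption modelled on Hadoop 1.x output and may differ on other versions; completion is a hypothetical helper, not part of the gem.

```ruby
# Illustrative status text (an assumption, in the Hadoop 1.x style).
sample_status = <<~STATUS
  Job: job_201305021000_0001
  map() completion: 1.0
  reduce() completion: 0.5
STATUS

# Hypothetical helper: extracts the map and reduce completion fractions.
# A value is nil when the corresponding line is not present in the text.
def completion(status_text)
  map    = status_text[/map\(\) completion: ([\d.]+)/, 1]
  reduce = status_text[/reduce\(\) completion: ([\d.]+)/, 1]
  { map: map && map.to_f, reduce: reduce && reduce.to_f }
end

progress = completion(sample_status)
puts progress.inspect
```

Returning nil for missing lines keeps "no data" distinguishable from a genuine 0.0, which matters if the command failed and the returned string is empty.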
jobsList(username, host, keyPathFile)
****** jobsList ******
Lists the running jobs on a remote Hadoop machine. Requires:
1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
# File lib/remoteHadoopGem.rb, line 83
def self.jobsList(username, host, keyPathFile)
  res=`ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; hadoop job -list "`
  return "#{res}"
end
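jobsList also returns raw command output. If only the job IDs are needed, for example to feed them to jobStatus, they can be collected with a pattern match. The sample text below is an assumption about what hadoop job -list prints on Hadoop 1.x; the scan itself relies only on the job_<timestamp>_<sequence> shape of the IDs.

```ruby
# Illustrative listing (an assumption, in the Hadoop 1.x style).
sample_list = [
  "2 jobs currently running",
  "JobId\tState\tStartTime\tUserName\tPriority\tSchedulingInfo",
  "job_201305021000_0001\t1\t1367489742000\thadoop\tNORMAL\tNA",
  "job_201305021000_0002\t1\t1367489801000\thadoop\tNORMAL\tNA"
].join("\n")

# Collect every substring that looks like a job ID, in order of appearance.
job_ids = sample_list.scan(/job_\d+_\d+/)
puts job_ids
```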
readFileFrom(username, host, keyPathFile, destFilePath)
****** readFileFrom ******
Reads a file from a remote Hadoop machine and returns its contents as a string. Requires:
1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) destFilePath (path of the remote file to read)
# File lib/remoteHadoopGem.rb, line 68
def self.readFileFrom(username, host, keyPathFile, destFilePath)
  res=`ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "cat #{destFilePath}"`
  return "#{res}"
end