class RemoteHadoopGem

Public Class Methods

cleanTempFiles(username, host, keyPathFile)

Removes all the temporary files (tempJobID_*) created by commandToHadoopJobID in /hadoop/ on the remote machine. Requires:

1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
# File lib/remoteHadoopGem.rb, line 114
def self.cleanTempFiles(username, host, keyPathFile)
  `ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; rm ./tempJobID_* "`
end
commandToHadoop(username, host, keyPathFile, command)

Runs a command on a remote Hadoop machine and returns its standard output. Requires:

1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) command (command to execute)
# File lib/remoteHadoopGem.rb, line 15
def self.commandToHadoop(username, host, keyPathFile, command)
  res=`ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; #{command}"`
  return "#{res}"
end
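commandToHadoop interpolates its arguments into a string that the local shell parses, so a key path or command containing spaces or shell metacharacters may be split or expanded before ssh sees it. The following is a minimal sketch of one alternative, building the same invocation as an argument array. ssh_argv is a hypothetical helper and is not part of RemoteHadoopGem; the host and key path are placeholder values.

```ruby
require 'shellwords'

# Hypothetical helper (not part of RemoteHadoopGem): builds the ssh invocation
# used by commandToHadoop as an argument array instead of a shell string.
def ssh_argv(username, host, key_path_file, command)
  # When the array is executed without a local shell, $PATH reaches the
  # remote shell unexpanded and is resolved on the remote machine.
  remote = 'export PATH=/hadoop/bin/:$PATH; cd /hadoop/; ' + command
  ['ssh', '-i', key_path_file, '-o', 'StrictHostKeyChecking=no',
   "#{username}@#{host}", remote]
end

argv = ssh_argv('hadoop', '192.0.2.10', '/home/me/my key.pem', 'hadoop job -list')

# Shellwords.join escapes every element, which is useful for logging the
# exact command that would be run.
puts Shellwords.join(argv)
```

An array like this can be passed to IO.popen(argv, &:read) or Open3.capture2(*argv), both of which start the program without going through a local shell.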
commandToHadoopJobID(username, host, keyPathFile, command)

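Runs a command on a remote Hadoop machine in the background and returns the <job-id> of the Hadoop job that it starts, or an empty string if no "Running job" line is found after 10 seconds. Requires:

1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) command (command to execute)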
# File lib/remoteHadoopGem.rb, line 30
def self.commandToHadoopJobID(username, host, keyPathFile, command)
  # create a file with a random name to capture the jobID
  filename="tempJobID_"+UUIDTools::UUID.random_create.to_s
  #`ssh -n -f -i "#{keyPathFile}" "#{username}"@"#{host}" 'sh -c "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; nohup #{command} >/dev/null 2>#{filename}.txt & " '`
  # start the command in the background on the remote machine, redirecting its stderr to the temporary file
  `ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" 'sh -c "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; nohup #{command} >/dev/null 2>#{filename}.txt & " '`
  # wait 10 seconds, then read the "Running job" line from the temporary file
  res=`sleep 10; ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "cat /hadoop/#{filename}.txt | grep 'Running job' " `
  # remove the temporary file
  `ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "rm /hadoop/#{filename}.txt"`
  # the jobID is the seventh whitespace-separated token of the "Running job" line
  jobid=res.split[6]
  
  # an empty string is returned when no "Running job" line was found
  return "#{jobid}"
end
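The job ID is taken from a fixed token position, which only works if the "Running job" line always has the same number of leading fields. The sketch below compares that approach with a pattern-based extraction. SAMPLE_RUNNING_LINE is an assumed example of the line written by Hadoop's job client (the exact format depends on the Hadoop version), and extract_job_id is a hypothetical helper that is not part of RemoteHadoopGem.

```ruby
# Assumed sample of the "Running job" line; the real format depends on the
# Hadoop version in use.
SAMPLE_RUNNING_LINE =
  "13/05/02 10:15:42 INFO mapred.JobClient: Running job: job_201305021010_0001\n"

# Positional extraction, as in commandToHadoopJobID: the seventh token.
def job_id_by_position(text)
  text.split[6].to_s
end

# Hypothetical helper: pattern-based extraction that does not depend on the
# number of fields before "Running job:". Returns "" when nothing matches.
def extract_job_id(text)
  match = text.match(/Running job:\s*(job_\S+)/)
  match ? match[1] : ""
end

puts job_id_by_position(SAMPLE_RUNNING_LINE)
puts extract_job_id(SAMPLE_RUNNING_LINE)
puts extract_job_id("").inspect
```

Both functions agree on the sample line; they differ only when the log prefix has more or fewer fields than assumed.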
copyFileTo(username, host, keyPathFile, filePath, destFilePath)

Copies a local file (or a directory, since scp is run with -r) to a remote Hadoop machine. Requires:

1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) filePath (path of the local file)
5) destFilePath (destination path on the remote Hadoop node)
# File lib/remoteHadoopGem.rb, line 53
def self.copyFileTo(username, host, keyPathFile, filePath, destFilePath)
  res=`scp -r -i "#{keyPathFile}" -o StrictHostKeyChecking=no #{filePath}  "#{username}"@"#{host}":#{destFilePath}`
end
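The backticks in copyFileTo capture only standard output, so the return value alone does not show whether the copy succeeded. The following is a minimal sketch of checking the exit status instead. run_checked and scp_argv are hypothetical helpers that are not part of RemoteHadoopGem, and the demonstration runs local Ruby commands so that no remote machine is needed.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper (not part of RemoteHadoopGem): runs a command given as
# an argument array and raises if it exits with a non-zero status.
def run_checked(argv)
  output, status = Open3.capture2e(*argv)
  unless status.success?
    raise "#{argv.first} failed (exit #{status.exitstatus}): #{output}"
  end
  output
end

# Hypothetical helper: the scp invocation used by copyFileTo, as an array.
def scp_argv(username, host, key_path_file, file_path, dest_file_path)
  ['scp', '-r', '-i', key_path_file, '-o', 'StrictHostKeyChecking=no',
   file_path, "#{username}@#{host}:#{dest_file_path}"]
end

# Demonstration with local commands: one succeeds, one exits with status 3.
puts run_checked([RbConfig.ruby, '-e', 'print "copied"'])
begin
  run_checked([RbConfig.ruby, '-e', 'exit 3'])
rescue RuntimeError => e
  puts e.message
end
```

Under these assumptions, run_checked(scp_argv(username, host, keyPathFile, filePath, destFilePath)) would perform the copy and raise if scp reports a failure.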
jobStatus(username, host, keyPathFile, job_id)

Reads the status of a job on a remote Hadoop machine and returns the output of hadoop job -status. Requires:

1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) job_id (id of the job)
# File lib/remoteHadoopGem.rb, line 99
def self.jobStatus(username, host, keyPathFile, job_id)
  res=`ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; hadoop job -status #{job_id} "`
  return "#{res}"
end
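jobStatus returns the raw text printed by hadoop job -status, so the caller has to interpret it. The sketch below shows one way to extract progress and poll until a job finishes. SAMPLE_STATUS is an assumed excerpt of that output (the real format depends on the Hadoop version), and completion and wait_for_job are hypothetical helpers that are not part of RemoteHadoopGem. Treating a job as finished when both fractions reach 1.0 is a simplification.

```ruby
# Assumed excerpt of `hadoop job -status` output; the real format depends on
# the Hadoop version in use.
SAMPLE_STATUS = <<~TEXT
  Job: job_201305021010_0001
  map() completion: 1.0
  reduce() completion: 0.25
TEXT

# Hypothetical helper: extracts the completion fractions from the status
# text. A phase whose line is missing is reported as nil.
def completion(status_text)
  {
    map: status_text[/map\(\) completion:\s*([\d.]+)/, 1]&.to_f,
    reduce: status_text[/reduce\(\) completion:\s*([\d.]+)/, 1]&.to_f
  }
end

# Hypothetical helper: calls the given block until both phases report 1.0 or
# the attempts run out. The block stands in for a call to jobStatus.
def wait_for_job(attempts: 10, interval: 5)
  attempts.times do
    progress = completion(yield)
    return true if progress[:map] == 1.0 && progress[:reduce] == 1.0
    sleep interval
  end
  false
end

p completion(SAMPLE_STATUS)
```

A caller could poll with wait_for_job { RemoteHadoopGem.jobStatus(username, host, keyPathFile, job_id) }, adjusting attempts and interval to the expected job duration.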
jobsList(username, host, keyPathFile)

Lists the running jobs on a remote Hadoop machine and returns the output of hadoop job -list. Requires:

1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
# File lib/remoteHadoopGem.rb, line 83
def self.jobsList(username, host, keyPathFile)
  res=`ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "export PATH=/hadoop/bin/:$PATH; cd /hadoop/; hadoop job -list "`
  return "#{res}"
end
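jobsList returns the listing as a single string. A minimal sketch of pulling the job IDs out of it follows. SAMPLE_LIST is an assumed example of the output (the columns vary between Hadoop versions), job_ids is a hypothetical helper that is not part of RemoteHadoopGem, and the pattern assumes IDs of the form job_<digits>_<digits>.

```ruby
# Assumed sample of `hadoop job -list` output; the columns vary between
# Hadoop versions.
SAMPLE_LIST = <<~TEXT
  2 jobs currently running
  JobId  State  StartTime  UserName
  job_201305021010_0001  1  1367489742000  hadoop
  job_201305021010_0002  1  1367489801000  hadoop
TEXT

# Hypothetical helper: collects every job ID that appears in the text,
# assuming IDs of the form job_<digits>_<digits>.
def job_ids(list_text)
  list_text.scan(/\bjob_\d+_\d+\b/).uniq
end

puts job_ids(SAMPLE_LIST)
```

Each ID returned this way can be passed to jobStatus as its job_id argument.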
readFileFrom(username, host, keyPathFile, destFilePath)

Reads a file from a remote Hadoop machine and returns its contents. Requires:

1) username (to access the machine via ssh)
2) host (IP of the Hadoop machine)
3) keyPathFile (path of the private key used to access the machine)
4) destFilePath (path of the remote file to read)
# File lib/remoteHadoopGem.rb, line 68
def self.readFileFrom(username, host, keyPathFile, destFilePath)
  res=`ssh -i "#{keyPathFile}" -o StrictHostKeyChecking=no "#{username}"@"#{host}" "cat #{destFilePath}"`
  return "#{res}"
end