class AWS::Flow::S3DataConverter

S3DataConverter uses YAMLDataConverter internally to serialize and deserialize Ruby objects. In addition, it stores any serialized object larger than 32,768 characters (32K) in Amazon S3 and returns a serialized S3 reference in its place, which is resolved when the data is deserialized. It caches objects locally to minimize calls to S3.

To prevent loss of data, the AWS Flow Framework for Ruby doesn't delete files from S3. Users are advised to configure Object Lifecycle Management on the S3 bucket so that files are deleted automatically.

More information about object expiration can be found at: docs.aws.amazon.com/AmazonS3/latest/dev/ObjectExpiration.html

Attributes

conv[RW]
bucket[R]
cache[R]
converter[R]

Public Class Methods

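converter()

Returns the shared S3DataConverter instance, creating it on first use with the bucket named in the AWS_SWF_BUCKET_NAME environment variable. Raises an error if the variable is not set.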

# File lib/aws/decider/data_converter.rb, line 97
def converter
  return self.conv if self.conv
  name = ENV['AWS_SWF_BUCKET_NAME']
  if name.nil?
    raise "Need a valid S3 bucket name to initialize S3DataConverter."\
      " Please set the AWS_SWF_BUCKET_NAME environment variable with the"\
      " bucket name."
  end
  self.conv ||= self.new(name)
  return self.conv
end
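
The memoization pattern in the listing above can be tried without AWS credentials. The following is a minimal sketch, not the library's code: SketchConverter is a hypothetical stand-in class that makes no S3 calls, and example-bucket is a made-up bucket name.

```ruby
# Hypothetical stand-in for S3DataConverter; it performs no AWS calls.
class SketchConverter
  class << self
    # Class-level slot holding the shared instance (mirrors conv[RW]).
    attr_accessor :conv

    # Returns the shared instance, building it on first use from the
    # AWS_SWF_BUCKET_NAME environment variable.
    def converter
      return conv if conv
      name = ENV['AWS_SWF_BUCKET_NAME']
      raise 'AWS_SWF_BUCKET_NAME must be set' if name.nil?
      self.conv = new(name)
    end
  end

  attr_reader :bucket

  def initialize(bucket)
    @bucket = bucket
  end
end

ENV['AWS_SWF_BUCKET_NAME'] = 'example-bucket' # hypothetical bucket name
first = SketchConverter.converter
second = SketchConverter.converter
```

Both calls return the same object, so the environment variable is read only on the first call. Changing the variable afterwards has no effect until conv is reset.
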
new(bucket)
# File lib/aws/decider/data_converter.rb, line 111
def initialize(bucket)
  @bucket = bucket
  @cache = S3Cache.new
  s3 = AWS::S3.new
  s3.buckets.create(bucket) unless s3.buckets[bucket].exists?
  @converter = FlowConstants.default_data_converter
end

Public Instance Methods

delete_from_s3(s3_filename)

Helper method to delete an S3 file.

@param s3_filename

The name of the file to be deleted.

@api private

# File lib/aws/decider/data_converter.rb, line 197
def delete_from_s3(s3_filename)
  s3 = AWS::S3.new
  s3.buckets[@bucket].objects.delete(s3_filename)
end
dump(object)

Serializes a Ruby object into a string. If the serialized string is longer than 32,768 characters, it is uploaded to a file in Amazon S3 and a serialized hash containing the filename is returned instead. The filename is randomly generated in the format rubyflow_data_<UUID>.

The returned serialized hash has the format { s3_filename: <filename> }.

@param object

The object that needs to be serialized into a string. By default it
serializes the object into a YAML string.
# File lib/aws/decider/data_converter.rb, line 131
def dump(object)
  string = @converter.dump(object)
  ret = string
  if string.size > 32768
    filename = put_to_s3(string)
    ret = @converter.dump({ s3_filename: filename })
  end
  ret
end
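
The size check in dump can be exercised locally. This is a sketch under stated assumptions: the Ruby standard library YAML module stands in for the default data converter, and SKETCH_BUCKET is a hypothetical in-memory Hash standing in for the S3 bucket.

```ruby
require 'yaml'
require 'securerandom'

SIZE_LIMIT = 32_768 # same limit as in the listing

# Hypothetical in-memory stand-in for the S3 bucket.
SKETCH_BUCKET = {}

# Returns the YAML string itself when it is small enough; otherwise
# stores it under a random key and returns a serialized pointer hash.
def sketch_dump(object)
  string = YAML.dump(object)
  return string if string.size <= SIZE_LIMIT
  filename = "rubyflow_data_#{SecureRandom.uuid}"
  SKETCH_BUCKET[filename] = string
  YAML.dump({ s3_filename: filename })
end

small = sketch_dump('hello')      # stays inline
large = sketch_dump('x' * 40_000) # replaced by a pointer hash
```

Note that the comparison is made on the serialized string, not on the original object, so the YAML overhead counts toward the limit.
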
get_from_s3(s3_filename)

Helper method to read an S3 file. The local cache is checked first, and S3 is queried only on a cache miss.

@param s3_filename

The name of the file to be read.

@api private

# File lib/aws/decider/data_converter.rb, line 179
def get_from_s3(s3_filename)
  return @cache[s3_filename] if @cache[s3_filename]
  s3 = AWS::S3.new
  s3_object = s3.buckets[@bucket].objects[s3_filename]
  begin
    ret = s3_object.read
    @cache[s3_filename] = ret
  rescue AWS::S3::Errors::NoSuchKey => e
    raise "Could not find key #{s3_filename} in bucket #{@bucket} on S3. #{e}"
  end
  return ret
end
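
The read-through caching in get_from_s3 can be illustrated without S3. In this sketch CountingStore and CachedReader are hypothetical classes: the first stands in for the bucket and counts remote reads, and a plain Hash stands in for S3Cache.

```ruby
# Hypothetical remote store that counts reads; it stands in for S3.
class CountingStore
  attr_reader :reads

  def initialize(data)
    @data = data
    @reads = 0
  end

  def read(key)
    @reads += 1
    @data.fetch(key) { raise "Could not find key #{key}" }
  end
end

# Checks the local cache first and falls back to the store on a miss.
class CachedReader
  def initialize(store)
    @store = store
    @cache = {} # a plain Hash stands in for S3Cache
  end

  def get(key)
    return @cache[key] if @cache[key]
    @cache[key] = @store.read(key)
  end
end

store = CountingStore.new('rubyflow_data_example' => "--- hello\n")
reader = CachedReader.new(store)
first_read = reader.get('rubyflow_data_example')
second_read = reader.get('rubyflow_data_example') # served from the cache
```

Only the first call reaches the store. A missing key raises an error instead of being cached, which matches the rescue branch in the listing.
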
load(source)

Deserializes a string into a Ruby object. If the deserialized object is a Ruby hash of the format { s3_filename: <filename> }, the file is first looked up in the local cache. On a cache miss, the file is downloaded from Amazon S3. In either case, the contents of the file are deserialized and the resulting object is returned.

@param source

The source that needs to be deserialized into a ruby object. By
default it expects the source to be a YAML string.
# File lib/aws/decider/data_converter.rb, line 150
def load(source)
  object = @converter.load(source)
  ret = object
  if object.is_a?(Hash) && object[:s3_filename]
    ret = @converter.load(get_from_s3(object[:s3_filename]))
  end
  ret
end
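
The pointer resolution in load can be sketched the same way. The assumptions here: the standard library YAML module stands in for the default data converter, STORED_FILES is a hypothetical in-memory Hash standing in for both the cache and the bucket, and yaml_load is a helper introduced only for this example.

```ruby
require 'yaml'

# Psych 4 (Ruby 3.1+) makes YAML.load safe by default, which rejects the
# symbol key used in the pointer hash, so use unsafe_load when it exists.
def yaml_load(source)
  YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(source) : YAML.load(source)
end

# Hypothetical in-memory stand-in for the S3 bucket and the local cache.
STORED_FILES = {
  'rubyflow_data_example' => YAML.dump([1, 2, 3])
}

# Deserializes the source, then follows the pointer if the result is a
# hash carrying an :s3_filename key.
def sketch_load(source)
  object = yaml_load(source)
  if object.is_a?(Hash) && object[:s3_filename]
    return yaml_load(STORED_FILES.fetch(object[:s3_filename]))
  end
  object
end

inline = sketch_load(YAML.dump('hello'))
pointer = sketch_load(YAML.dump({ s3_filename: 'rubyflow_data_example' }))
```

One consequence of the check in the listing is that any hash with a truthy :s3_filename key is treated as a pointer, so a small payload that happens to have that shape would trigger a lookup rather than being returned as is.
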
put_to_s3(string)

Helper method to write a string to an S3 file. A random filename is generated in the format rubyflow_data_<UUID>, and the string is also added to the local cache under that filename.

@param string

The string to be uploaded to S3

@api private

# File lib/aws/decider/data_converter.rb, line 166
def put_to_s3(string)
  filename = "rubyflow_data_#{SecureRandom.uuid}"
  s3 = AWS::S3.new
  s3.buckets[@bucket].objects.create(filename, string)
  @cache[filename] = string
  return filename
end
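
The write path can be sketched as follows. SKETCH_REMOTE and SKETCH_CACHE are hypothetical in-memory Hashes standing in for the S3 bucket and for S3Cache; only SecureRandom.uuid is taken from the listing.

```ruby
require 'securerandom'

SKETCH_REMOTE = {} # hypothetical stand-in for the S3 bucket
SKETCH_CACHE = {}  # hypothetical stand-in for S3Cache

# Stores the string under a random rubyflow_data_<UUID> key and seeds the
# local cache so that a later read of the same key needs no remote call.
def sketch_put(string)
  filename = "rubyflow_data_#{SecureRandom.uuid}"
  SKETCH_REMOTE[filename] = string
  SKETCH_CACHE[filename] = string
  filename
end

key = sketch_put("--- hello\n")
other_key = sketch_put("--- hello\n")
```

Because each key is random, writing the same string twice produces two separate files, which is one reason the lifecycle-based cleanup recommended at the top of this page matters.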