digest_extensions

Extension to the Digest module of the Ruby standard library that adds support for marshalling.

With this gem you can compute the digest of a byte sequence incrementally across several executions by storing the marshalled digest object in a database:

class FakeDatabase
  attr_accessor :value
end
@db = FakeDatabase.new

def update_digest(part)
  digest = Marshal.load(@db.value)    # Fetch the last digest state from the database
  digest << part                      # Feed the next part into the digest
  @db.value = Marshal.dump(digest)    # Store the updated digest state in the database
end

# parts_array is provided, for example as the contents of a series of files
# generated by running the split command on a larger file.
@db.value = Marshal.dump(Digest::MD5.new) # An initial value
parts_array.each do |part|
  # Assume this method is invoked in a new process each time, and
  # the only thing they share is the database (faked in the example with @db)
  update_digest(part)
end
Marshal::load(@db.value).hexdigest.should == Digest::MD5.hexdigest(parts_array.inject("", :+))
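The final assertion above relies on a property of the standard Digest classes: feeding the parts one at a time yields the same result as hashing the concatenated input in one call. The following is a minimal sketch of that property using only the standard library, without this gem and without any marshalling. The method name incremental_md5 is made up for this illustration.

```ruby
require 'digest'

# Hypothetical helper for illustration only: feeds each part into a single
# Digest::MD5 object and returns the resulting hex digest.
def incremental_md5(parts)
  digest = Digest::MD5.new
  parts.each { |part| digest << part } # << appends data to the running digest
  digest.hexdigest
end

parts = ["first chunk, ", "second chunk, ", "third chunk"]

# Hashing the concatenated string in one call gives the same value.
puts incremental_md5(parts) == Digest::MD5.hexdigest(parts.join)
```

What this gem adds on top of that is the ability to dump the digest object between the calls to <<, so that each part can be processed in a separate process.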

The plan is for this feature to be included in the Ruby standard library.

Known issues

Currently, the marshalled values are specific to the underlying native library used to compute the digest. If your Ruby is compiled and linked against OpenSSL as its MD5 library, the marshalled values stored in the database cannot be used with a Ruby that is compiled to use Ruby's own native MD5 implementation. Keep this in mind when you consider upgrading your Ruby version.
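One possible way to guard against this, sketched here as an assumption and not as something this gem provides, is to store a build tag next to the marshalled state and refuse to load state written under a different tag. GuardedState, BuildMismatch and the build_tag argument are hypothetical names. The default tag RUBY_DESCRIPTION would catch a Ruby upgrade, but not two builds of the same Ruby version linked against different MD5 libraries, so a tag chosen at deployment time may be the safer choice.

```ruby
# Hypothetical helper, not part of digest_extensions. It wraps an already
# marshalled state string in an envelope that records which build wrote it.
module GuardedState
  class BuildMismatch < StandardError; end

  # payload:   the string to protect (for example a marshalled digest)
  # build_tag: any string identifying the Ruby build; the default is an
  #            assumption and only a coarse stand-in for "same digest backend"
  def self.wrap(payload, build_tag = RUBY_DESCRIPTION)
    Marshal.dump({ build_tag: build_tag, payload: payload })
  end

  # Returns the payload, or raises BuildMismatch if the state was written
  # under a different build tag.
  def self.unwrap(blob, build_tag = RUBY_DESCRIPTION)
    envelope = Marshal.load(blob)
    unless envelope[:build_tag] == build_tag
      raise BuildMismatch,
            "state written by #{envelope[:build_tag]}, now running #{build_tag}"
    end
    envelope[:payload]
  end
end

blob = GuardedState.wrap("opaque state bytes", "ruby-linked-with-openssl")
puts GuardedState.unwrap(blob, "ruby-linked-with-openssl")
```

With this gem loaded, the payload would be the result of Marshal.dump on the digest object, and a BuildMismatch would signal that the digest has to be recomputed from the original data.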

Contributing to digest_extensions

Copyright © 2011-2012 Jarl Friis. See LICENSE.txt for further details.