zlib {zlib} R Documentation

zlib

Description

What My Package Offers

This package provides several key features:

Robustness:

Built to handle even corrupted or incomplete gzip data efficiently without causing system failures.

Demonstration:
  compressed_data <- memCompress(charToRaw(paste0(rep("This is an example string. It contains more than just 'hello, world!'", 1000), collapse = ", ")))
  decompressor <- zlib$decompressobj(zlib$MAX_WBITS)
  rawToChar(c(decompressor$decompress(compressed_data[1:300]), decompressor$flush()))  # Still works on the truncated input
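
The demonstration pairs memCompress() with zlib$MAX_WBITS rather than zlib$MAX_WBITS + 16 because base R's memCompress(type = "gzip") emits a zlib-wrapped (RFC 1950) stream, not a gzip file. The following is a minimal base-R sketch for checking this from the first two bytes. The helper name is_zlib_stream is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of the zlib package): TRUE when the first
# two bytes form a valid RFC 1950 (zlib) header.
is_zlib_stream <- function(x) {
  if (length(x) < 2L) return(FALSE)
  cmf <- as.integer(x[1])
  flg <- as.integer(x[2])
  # The low nibble of CMF is the compression method (8 = deflate), and the
  # 16-bit value CMF * 256 + FLG must be a multiple of 31.
  (cmf %% 16L == 8L) && ((cmf * 256L + flg) %% 31L == 0L)
}

z <- memCompress(charToRaw("example payload"), type = "gzip")
is_zlib_stream(z)                            # zlib wrapper produced by memCompress
is_zlib_stream(as.raw(c(0x1f, 0x8b, 0x08)))  # gzip magic bytes are not a zlib header
```

A check like this only inspects the header; it says nothing about whether the rest of the stream is intact.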
  
Compliance:

Strict adherence to the GZIP File Format Specification (RFC 1952), ensuring compatibility across systems.

Demonstration:
  compressor <- zlib$compressobj(zlib$Z_DEFAULT_COMPRESSION, zlib$DEFLATED, zlib$MAX_WBITS + 16)
  c(compressor$compress(charToRaw("Hello World")), compressor$flush())  # wbits = 31 (MAX_WBITS + 16) yields a gzip wrapper; a custom wbits value is honored as well
  # [1] 1f 8b 08 00 00 00 00 00 00 03 f3 48 cd c9 c9 57 08 cf 2f ca 49 01 00 56 b1 17 4a 0b 00 00 00
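
The bytes printed above can be checked against the gzip layout by hand. The following base-R sketch reads the fixed header fields and the 8-byte trailer from that output. The helper le_uint32 is a hypothetical name introduced here for illustration, and the sketch assumes a member with no optional header fields (FLG = 0), as in the output shown.

```r
# Bytes copied from the demonstration output for "Hello World".
gz <- as.raw(c(
  0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03,
  0xf3, 0x48, 0xcd, 0xc9, 0xc9, 0x57, 0x08, 0xcf, 0x2f, 0xca,
  0x49, 0x01, 0x00, 0x56, 0xb1, 0x17, 0x4a, 0x0b, 0x00, 0x00, 0x00
))

# Hypothetical helper: decode a 4-byte little-endian unsigned integer.
le_uint32 <- function(b) sum(as.numeric(b) * 256^(0:3))

n <- length(gz)
has_magic <- identical(gz[1:2], as.raw(c(0x1f, 0x8b)))  # ID1, ID2
method    <- as.integer(gz[3])                           # CM, 8 = deflate
flags     <- as.integer(gz[4])                           # FLG, 0 = no optional fields
isize     <- le_uint32(gz[(n - 3):n])                    # uncompressed length mod 2^32
crc_hex   <- paste(rev(as.character(gz[(n - 7):(n - 4)])), collapse = "")  # CRC-32 field

has_magic
method
isize == nchar("Hello World")
crc_hex
```

The trailer's length field matching the 11 input bytes is a quick sanity check that the stream was finished with flush().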
  
Flexibility:

Handles gzip streams from REST APIs in memory, without the need for temporary files or other workarounds.

Demonstration:
    # Byte-Range Request and decompression in chunks

    # Initialize the decompressor
    decompressor <- zlib$decompressobj(zlib$MAX_WBITS + 16)

    # Define the URL and initial byte ranges
    url <- "https://example.com/api/data.gz"
    range_start <- 0
    range_increment <- 5000  # Adjust based on desired chunk size

    # Placeholder for the decompressed bytes
    decompressed_content <- raw(0)

    # Loop to make multiple requests and decompress chunk by chunk
    for (i in 1:5) {  # Adjust the loop count based on the number of chunks you want to retrieve
      range_end <- range_start + range_increment

      # Make a byte-range request (HTTP ranges are inclusive at both ends)
      response <- httr::GET(url, httr::add_headers(`Range` = paste0("bytes=", range_start, "-", range_end)))

      # Check if the request was successful
      if (httr::http_type(response) != "application/octet-stream" || httr::http_status(response)$category != "Success") {
        stop("Failed to retrieve data.")
      }

      # Decompress the received chunk and keep it as raw bytes, so that
      # multi-byte characters split across chunks are not corrupted
      compressed_data <- httr::content(response, "raw")
      decompressed_chunk <- decompressor$decompress(compressed_data)
      decompressed_content <- c(decompressed_content, decompressed_chunk)

      # Update the byte range for the next request
      range_start <- range_end + 1
    }

    # Flush the decompressor after all chunks have been processed
    decompressed_content <- c(decompressed_content, decompressor$flush())

    # Convert to text once all bytes are available
    decompressed_text <- rawToChar(decompressed_content)
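
HTTP byte ranges are inclusive at both ends, so a request for bytes=0-5000 returns up to 5001 bytes. When the total size of the remote file is known in advance (an assumption here, for example from a Content-Length header), the ranges can be computed up front. The following is a small base-R sketch. The function name byte_ranges is hypothetical and is not part of the package or of httr.

```r
# Hypothetical helper: build inclusive, non-overlapping Range header values
# that cover `total_bytes` bytes in pieces of at most `chunk_bytes` bytes.
byte_ranges <- function(total_bytes, chunk_bytes) {
  stopifnot(total_bytes > 0, chunk_bytes > 0)
  starts <- seq(0, total_bytes - 1, by = chunk_bytes)
  ends <- pmin(starts + chunk_bytes - 1, total_bytes - 1)
  # sprintf("%.0f") avoids scientific notation such as "1e+05" in the header
  sprintf("bytes=%.0f-%.0f", starts, ends)
}

byte_ranges(12000, 5000)
# "bytes=0-4999" "bytes=5000-9999" "bytes=10000-11999"
```

Each element can be passed as the value of the Range header, and the response bodies fed to the same decompressor in order.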
  

In summary, although R's built-in methods may eventually catch up in functionality, the zlib package currently fills an important gap by providing a more robust and flexible way to handle compression and decompression tasks.

Usage

.onLoad(libname, pkgname)

Details

The following 'zlib' environment is generated by the .onLoad behavior for R packages.

The .onLoad function is called automatically when the package is loaded with library() or require(). It initializes an environment that can be reached from anywhere and is unique (i.e. cannot be overwritten), and defines a variety of constants and methods related to the zlib compression library.

Specifically, the function assigns a new environment named "zlib" containing constants such as DEFLATED, DEF_BUF_SIZE, MAX_WBITS, and various flush and compression strategies like Z_FINISH, Z_BEST_COMPRESSION, etc.
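
To illustrate what an environment that cannot be overwritten means in practice, here is a minimal base-R sketch of one way a load hook could publish read-only constants. It is not the package's actual implementation. The name demo_env is hypothetical, and the two values are the ones defined in zlib.h for the C library.

```r
# Illustrative only: an environment with locked bindings, similar in spirit
# to what an .onLoad hook might create. Names and layout are assumptions.
demo_env <- new.env(parent = emptyenv())
assign("MAX_WBITS", 15L, envir = demo_env)
assign("DEFLATED", 8L, envir = demo_env)
lockEnvironment(demo_env, bindings = TRUE)

ls(demo_env)
demo_env$MAX_WBITS

# Attempts to modify a locked binding raise an error instead of succeeding.
result <- tryCatch({
  assign("MAX_WBITS", 0L, envir = demo_env)
  "modified"
}, error = function(e) "blocked")
result
```

With the package loaded, ls(zlib) should list the constants and methods that the real environment exposes.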

Value

No return value; called for its side effect, which is an environment containing the zlib constants, created on load.

Methods

Constants

See Also

publicEval() for the method used to set up the public environment.

zlib_constants() for the method used to set up the constants in the environment. https://www.zlib.net/manual.html#Constants

Examples

# Load the package
library(zlib)
# Create a temporary file
temp_file <- tempfile(fileext = ".txt")

# Generate example data and write to the temp file
example_data <- "This is an example string. It contains more than just 'hello, world!'"
writeBin(charToRaw(example_data), temp_file)

# Read data from the temp file into a raw vector
file_con <- file(temp_file, "rb")
raw_data <- readBin(file_con, "raw", file.info(temp_file)$size)
close(file_con)
# Create a Compressor object for gzip
compressor <- zlib$compressobj(zlib$Z_DEFAULT_COMPRESSION, zlib$DEFLATED, zlib$MAX_WBITS + 16)

# Initialize variables for chunked compression
chunk_size <- 1024
compressed_data <- raw(0)

# Compress the data in chunks
for (i in seq(1, length(raw_data), by = chunk_size)) {
  chunk <- raw_data[i:min(i + chunk_size - 1, length(raw_data))]
   compressed_chunk <- compressor$compress(chunk)
   compressed_data <- c(compressed_data, compressed_chunk)
}

# Flush the compressor buffer
compressed_data <- c(compressed_data, compressor$flush())


# Create a Decompressor object for gzip
decompressor <- zlib$decompressobj(zlib$MAX_WBITS + 16)

# Initialize variable for decompressed data
decompressed_data <- raw(0)

# Decompress the data in chunks
for (i in seq(1, length(compressed_data), by = chunk_size)) {
  chunk <- compressed_data[i:min(i + chunk_size - 1, length(compressed_data))]
  decompressed_chunk <- decompressor$decompress(chunk)
  decompressed_data <- c(decompressed_data, decompressed_chunk)
}

# Flush the decompressor buffer
decompressed_data <- c(decompressed_data, decompressor$flush())

# Compress / Decompress data in a single step

original_data <- charToRaw("some data")
compressed_data <- zlib$compress(original_data,
                                 zlib$Z_DEFAULT_COMPRESSION,
                                 zlib$DEFLATED,
                                 zlib$MAX_WBITS + 16)
decompressed_data <- zlib$decompress(compressed_data, zlib$MAX_WBITS + 16)


[Package zlib version 1.0.3 Index]