ZipAccel-D
GUNZIP/ZLIB/Inflate Data Decompression

ZipAccel-D is a custom hardware implementation of a lossless data decompression engine that complies with the Inflate/Deflate, GZIP/GUNZIP, and ZLIB compression standards. 

The core features fast processing, with low latency and high throughput. On average, the core outputs three bytes of decompressed data per clock cycle, providing over 15Gbps in a typical 40nm technology. Designers can scale the throughput further by instantiating the core multiple times to achieve throughput rates exceeding 100Gbps. The latency is on the order of a few tens of clock cycles for blocks coded with static Huffman tables, and typically less than 2,000 cycles for blocks encoded with dynamic Huffman tables.
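The relationship between clock rate and throughput quoted above can be sketched in a few lines of Python. This is only back-of-the-envelope arithmetic based on the stated three-bytes-per-cycle average; the function names are illustrative and not part of any CAST deliverable.

```python
def throughput_gbps(freq_mhz: float, bytes_per_cycle: float = 3.0) -> float:
    """Average decompressed output rate in Gbps at a given clock frequency."""
    bits_per_second = freq_mhz * 1e6 * bytes_per_cycle * 8
    return bits_per_second / 1e9


def required_freq_mhz(target_gbps: float, bytes_per_cycle: float = 3.0) -> float:
    """Clock frequency in MHz needed to sustain a target average output rate."""
    return target_gbps * 1e9 / (bytes_per_cycle * 8) / 1e6


print(round(throughput_gbps(135), 2))    # 3.24
print(round(required_freq_mhz(15), 1))   # 625.0
```

Under this model, 15Gbps corresponds to a clock of roughly 625 MHz, and a 135 MHz clock corresponds to 3.24 Gbps. Actual throughput depends on the data being decompressed, since three bytes per cycle is an average.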

The decompression core has been designed for ease of use and integration. It operates on a standalone basis, off-loading the host CPU from the demanding task of data decompression. The core receives compressed input files and outputs decompressed files. No preprocessing of the compressed files is required, as the core parses the file headers, checks the input files for errors, and outputs the decompressed data payload. With extensive error tracking and reporting, the core enables smooth system operation and error recovery, even in the presence of errors in the compressed input files. Furthermore, internal memories can optionally support Error Correction Codes (ECC) to simplify achievement of Enterprise Class reliability requirements.
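As a software illustration of what "parsing the file headers and checking for errors" involves for the GZIP format, the sketch below walks an RFC 1952 member header, inflates the payload, and compares the CRC-32 and size fields in the trailer. It uses Python's standard zlib module and says nothing about how the core implements these steps internally; the function name and returned dictionary keys are hypothetical.

```python
import gzip
import struct
import zlib

# Header flag bits defined by RFC 1952.
FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10


def inspect_gzip_member(blob: bytes) -> dict:
    """Parse a gzip header, inflate the payload, and check the trailer."""
    if len(blob) < 18 or blob[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip member")
    method, flags, mtime, xfl, os_id = struct.unpack("<BBIBB", blob[2:10])
    if method != 8:
        raise ValueError("unsupported compression method")

    # Skip the optional header fields in the order RFC 1952 defines them.
    pos = 10
    if flags & FEXTRA:
        (xlen,) = struct.unpack("<H", blob[pos:pos + 2])
        pos += 2 + xlen
    if flags & FNAME:
        pos = blob.index(b"\x00", pos) + 1
    if flags & FCOMMENT:
        pos = blob.index(b"\x00", pos) + 1
    if flags & FHCRC:
        pos += 2

    # Raw inflate (negative wbits); bytes after the final block are the trailer.
    inflater = zlib.decompressobj(-15)
    payload = inflater.decompress(blob[pos:]) + inflater.flush()
    trailer = inflater.unused_data[:8]
    if len(trailer) < 8:
        raise ValueError("missing gzip trailer")
    crc32, isize = struct.unpack("<II", trailer)

    return {
        "payload_len": len(payload),
        "crc_ok": zlib.crc32(payload) == crc32,
        "size_ok": (len(payload) & 0xFFFFFFFF) == isize,
    }


print(inspect_gzip_member(gzip.compress(b"hello" * 100)))
```

A trailer that does not match the inflated data shows up as `crc_ok` or `size_ok` being false, which corresponds to the CRC and file size error classes the core reports.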

The ZipAccel-D core is a microcode-free design developed for reuse in ASIC and FPGA implementations. Streaming data interfaces, optionally bridged to AMBA AXI4-Stream, ease SoC integration. Technology mapping is straightforward, as the design is scan-ready and uses easily replaceable, generic memory models.
 

Verification

The core has been verified through extensive synthesis, place and route, and simulation runs. It has also been embedded in several commercially-shipping products, and is proven in both ASIC and FPGA technologies.

The core has been verified for interoperability with a number of software applications that use GZIP, ZLIB, or Deflate compression.
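For readers who want to prepare their own interoperability vectors, the following sketch produces the same payload in all three supported container formats using Python's standard zlib module, and round-trips each one in software. The helper names and the sample payload are assumptions for illustration; this is not the test environment shipped with the core.

```python
import zlib

# zlib's wbits argument selects the container:
# negative = raw Deflate, 15 = ZLIB wrapper, 31 (16 + 15) = GZIP wrapper.
WBITS = {"deflate": -15, "zlib": 15, "gzip": 31}


def make_vector(payload: bytes, fmt: str, level: int = 6) -> bytes:
    """Compress payload into the requested container format."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, WBITS[fmt])
    return compressor.compress(payload) + compressor.flush()


def software_inflate(stream: bytes, fmt: str) -> bytes:
    """Reference decompression to compare against hardware output."""
    return zlib.decompress(stream, WBITS[fmt])


payload = b"ZipAccel-D interoperability sample " * 64
for fmt in WBITS:
    vector = make_vector(payload, fmt)
    assert software_inflate(vector, fmt) == payload
    print(fmt, len(vector), "bytes")
```

The compressed vectors would be fed to the decompressor under test, and its output compared byte for byte with `payload`.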

Deliverables

The core is available in ASIC (synthesizable HDL) and FPGA (netlist) forms, and includes everything required for successful implementation. The ASIC version includes:

  • HDL (Verilog) RTL source code
  • Sophisticated Test Environment
  • Simulation scripts, test vectors and expected results
  • Synthesis script
  • Comprehensive user documentation

Support

The core as delivered is warranted against defects for ninety days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available.

ZipAccel-D silicon resource requirements and throughput depend on its configuration. Performance can also scale by using multiple core instances.

Over 100 Gbps throughputs are feasible, and the silicon footprint can be less than 200k Gates. Contact CAST Sales for help defining likely configuration options and estimating implementation results for your specific system.

ZipAccel-D can be mapped to any ASIC technology or FPGA device, provided sufficient silicon resources are available. Its silicon resource requirements and throughput depend on its configuration, and performance can scale further by instantiating more Huffman decoders and by using multiple core instances.
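A rough way to size a multi-instance design is to divide the target aggregate rate by the per-instance rate and round up. The sketch below assumes ideal linear scaling, with no loss to arbitration, load balancing, or I/O limits, which a real system would need to account for. The function name is illustrative.

```python
import math


def instances_needed(target_gbps: float, per_core_gbps: float) -> int:
    """Core instances required for a target aggregate rate (ideal scaling)."""
    return math.ceil(target_gbps / per_core_gbps)


print(instances_needed(100, 15))     # 7
print(instances_needed(100, 3.24))   # 31
```

At the 15Gbps ASIC figure, seven instances would be needed for 100Gbps; at 3.24 Gbps per instance, the count rises to 31.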

The following table provides sample performance and resource utilization data for different configurations of the core on an Arria10 device. These figures do not represent the smallest possible area requirements or the highest possible clock frequency. Please contact CAST to get characterization data for your target configuration and technology.

Family / Device   Huffman Tables   History Window (bytes)   Freq. (MHz)    ALMs   Memory Bits   Gbps
Arria10 GX-1150   Dynamic                             512           135   7,348       154,178   3.24
Arria10 GX-1150   Dynamic                           1,024           135   7,470       158,786   3.24
Arria10 GX-1150   Dynamic                           2,048           135   7,334       171,074   3.24
Arria10 GX-1150   Dynamic                           4,096           135   7,329       187,714   3.24
Arria10 GX-1150   Dynamic                           8,192           135   7,378       220,738   3.24
Arria10 GX-1150   Dynamic                          16,384           135   7,360       286,530   3.24
Arria10 GX-1150   Dynamic                          32,768           130   7,385       417,858   3.12
Arria10 GX-1150   Static                              512           160   4,914        25,680   3.84
Arria10 GX-1150   Static                            1,024           160   5,012        30,288   3.84
Arria10 GX-1150   Static                            2,048           135   4,824        42,576   3.24
Arria10 GX-1150   Static                            4,096           160   4,862        59,216   3.84
Arria10 GX-1150   Static                            8,192           125   4,882        92,240   3.00
Arria10 GX-1150   Static                           16,384           140   4,917       158,032   3.36
Arria10 GX-1150   Static                           32,768           155   4,957       289,350   3.72

 

The following table provides sample performance and resource utilization data for different configurations of the core on a Kintex UltraScale device. These figures do not represent the smallest possible area requirements or the highest possible clock frequency. Please contact CAST to get characterization data for your target configuration and technology.

Family / Device               Huffman Tables   History Window (bytes)   Freq. (MHz)    LUTs   BRAMs   Gbps
Kintex UltraScale xcku060-2   Dynamic                           8,192           125   8,221    23.0   3.00
Kintex UltraScale xcku060-2   Dynamic                          16,384           130   8,243    25.0   3.12
Kintex UltraScale xcku060-2   Dynamic                          32,768           125   8,250    29.0   3.00
Kintex UltraScale xcku060-2   Static                            8,192           165   5,375     4.5   3.96
Kintex UltraScale xcku060-2   Static                           16,384           165   5,416     6.5   3.96
Kintex UltraScale xcku060-2   Static                           32,768           165   5,392    10.5   3.96

Features List

Compression Standards 

  • ZLIB (RFC-1950)
  • Inflate/Deflate (RFC-1951)
  • GZIP/GUNZIP (RFC-1952)
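The three formats can usually be told apart from their first two bytes, as the following sketch shows. GZIP members begin with the magic bytes 0x1F 0x8B, and ZLIB streams begin with a CMF/FLG pair whose 16-bit value is a multiple of 31 with compression method 8. Raw Deflate has no signature, so the last case is only a fallback, and a raw stream could in principle pass the ZLIB check by coincidence. The function name is hypothetical.

```python
import gzip
import zlib


def sniff_format(stream: bytes) -> str:
    """Guess the container format from the leading header bytes."""
    if len(stream) >= 3 and stream[:2] == b"\x1f\x8b" and stream[2] == 8:
        return "gzip"
    if len(stream) >= 2:
        cmf, flg = stream[0], stream[1]
        method_ok = (cmf & 0x0F) == 8        # CM = 8 means Deflate
        window_ok = (cmf >> 4) <= 7          # CINFO up to 7 means a 32KB window
        check_ok = (cmf * 256 + flg) % 31 == 0
        if method_ok and window_ok and check_ok:
            return "zlib"
    return "raw-deflate-or-unknown"


raw = zlib.compressobj(6, zlib.DEFLATED, -15)
print(sniff_format(gzip.compress(b"abc")))
print(sniff_format(zlib.compress(b"abc")))
print(sniff_format(raw.compress(b"abc") + raw.flush()))
```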

Inflate/Deflate Features

  • Up to 32KB History Window Size
  • All Deflate Block Types 
    • Static and Dynamic Huffman-Coded Blocks
    • Stored Deflate Blocks 
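To exercise each block type, test streams can be generated in software by steering the compressor. The sketch below uses Python's zlib module: compression level 0 yields stored blocks, the Z_FIXED strategy yields static Huffman blocks, and Z_HUFFMAN_ONLY on a small, evenly used alphabet yields a dynamic Huffman block. The block type is read from the BTYPE field, bits 1 and 2 of the first byte of a raw Deflate stream, as defined in RFC 1951. The helper name is an assumption, and the fallback value 4 for Z_FIXED is the constant from zlib.h for Python builds that do not export it.

```python
import zlib

BTYPE_NAMES = {0: "stored", 1: "static Huffman", 2: "dynamic Huffman"}
Z_FIXED = getattr(zlib, "Z_FIXED", 4)  # value from zlib.h if not exported


def first_block_type(payload: bytes, level: int, strategy: int) -> int:
    """Return the BTYPE field of the first block of a raw Deflate stream."""
    compressor = zlib.compressobj(
        level, zlib.DEFLATED, -15, zlib.DEF_MEM_LEVEL, strategy
    )
    raw = compressor.compress(payload) + compressor.flush()
    return (raw[0] >> 1) & 0x3  # bit 0 is BFINAL, bits 1-2 are BTYPE


payload = b"ACGT" * 1000
cases = {
    "level 0": (0, zlib.Z_DEFAULT_STRATEGY),
    "Z_FIXED": (6, Z_FIXED),
    "Z_HUFFMAN_ONLY": (6, zlib.Z_HUFFMAN_ONLY),
}
for label, (level, strategy) in cases.items():
    print(label, "->", BTYPE_NAMES[first_block_type(payload, level, strategy)])
```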

High Performance & Low Latency 

  • Three bytes per clock average processing rate, for throughputs exceeding 20Gbps in modern ASIC technologies, and scalable to more than 100Gbps with multiple core instances
  • Latency from 20 clock cycles for Static Huffman blocks, and typically less than 2000 cycles for Dynamic Huffman Blocks 

Easy to Use and Integrate

  • Processor-Free, Standalone Operation  
  • Extensive Error-Catching & Reporting for Smooth Operation and Recovery in the Presence of Errors
    • Header Syntax Errors
    • CRC/Adler 32 Errors
    • File Size Errors
    • Coding Errors
    • Huffman Table Errors
    • Non-Correctable ECC Memory Errors
  • Optional ECC Memories, Necessary for Enterprise-Class RAMs  
  • Streaming-Capable Interfaces
  • Microcode-Free, Scan-Ready Design
  • Complete, Turn-Key Accelerator Designs Available on FPGA Boards from Different Vendors
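The error classes listed above have direct counterparts in software decompressors, which makes it straightforward to build negative test vectors. The sketch below corrupts a valid ZLIB stream in several ways and shows that Python's zlib module rejects each one. It illustrates the kinds of faults involved and does not describe the core's status or error-reporting interface; the helper name is hypothetical.

```python
import zlib


def try_inflate(stream: bytes, wbits: int = 15) -> str:
    """Return 'ok' or the error text reported by the software inflater."""
    try:
        zlib.decompress(stream, wbits)
        return "ok"
    except zlib.error as exc:
        return f"error: {exc}"


good = zlib.compress(b"payload " * 50)

vectors = {
    "valid stream": (good, 15),
    # Invert the 4-byte Adler-32 trailer: checksum error.
    "bad checksum": (good[:-4] + bytes(b ^ 0xFF for b in good[-4:]), 15),
    # Drop the second half of the stream: truncated input.
    "truncated": (good[: len(good) // 2], 15),
    # Overwrite the 2-byte ZLIB header: header syntax error.
    "bad header": (b"\x00\x00" + good[2:], 15),
    # Raw Deflate with BFINAL=1 and reserved BTYPE=11: coding error.
    "reserved block type": (b"\x07", -15),
}

for name, (stream, wbits) in vectors.items():
    print(f"{name}: {try_inflate(stream, wbits)}")
```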

Configuration Options

  • Synthesis-time configuration options allow fine-tuning the core’s size and performance (partial list):
    • Input and Output Bus Width
    • FIFO Sizes
    • Maximum History Window
    • Static-Only or Dynamic and Static Huffman Tables support
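The Maximum History Window option interacts with how the input data was compressed: a decompressor can only resolve back-references that fall within its window. The software analogy below uses zlib's wbits parameter, where a window of 2^wbits bytes covers the 512-byte to 32KB range. A stream compressed with a 512-byte window inflates with any window size, while a stream compressed with a 32KB window is rejected by an inflater limited to 512 bytes. The helper names are illustrative, and the behavior of the core with an undersized window should be confirmed against its documentation.

```python
import zlib


def wbits_for(window_bytes: int) -> int:
    """Map a power-of-two window size (512..32768 bytes) to zlib wbits."""
    if window_bytes < 512 or window_bytes > 32768 or window_bytes & (window_bytes - 1):
        raise ValueError("window must be a power of two from 512 to 32768")
    return window_bytes.bit_length() - 1  # 512 -> 9, 32768 -> 15


def compress_with_window(payload: bytes, window_bytes: int) -> bytes:
    compressor = zlib.compressobj(6, zlib.DEFLATED, wbits_for(window_bytes))
    return compressor.compress(payload) + compressor.flush()


def inflate_with_window(stream: bytes, window_bytes: int) -> bytes:
    return zlib.decompress(stream, wbits_for(window_bytes))


payload = bytes(range(256)) * 64
small = compress_with_window(payload, 512)
large = compress_with_window(payload, 32768)

assert inflate_with_window(small, 512) == payload
assert inflate_with_window(small, 32768) == payload
try:
    inflate_with_window(large, 512)
except zlib.error as exc:
    print("rejected:", exc)
```

When the compressor's window size is under the system designer's control, choosing a smaller window on both sides trades compression ratio for memory, as the Memory Bits and BRAMs columns in the implementation results suggest.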
       

Resources

Applicable Standards

  • RFC 1952 – GZIP file format
  • RFC 1950 – ZLIB Compressed Data Format
  • RFC 1951 – DEFLATE Compressed Data Format

Background & More Info

  • Data Compression in Solid State Storage, presentation at Flash Memory Summit 2013 (PDF)
  • Wikipedia entries on GZIP, ZLIB, and Deflate
  • An explanation of the Deflate algorithm by Antaeus Feldspar
  • GZIP Project website
  • ZLIB Project website
