Silicon IP Cores
JPEG-LS-E
Lossless & Near-Lossless JPEG-LS Encoder
The JPEG-LS-E core implements a highly efficient, low-power, lossless and near-lossless image compression engine that is compliant with the JPEG-LS standard (ISO/IEC 14495-1).
Based on LOCO-I (LOw COmplexity LOssless COmpression for Images), the JPEG-LS algorithm leads in numerically lossless compression efficiency, attaining compression ratios similar or superior to those obtained with more advanced algorithms such as JPEG 2000. JPEG-LS also enables hardware implementations with a much smaller silicon footprint and lower memory requirements, thanks to its lower computational complexity and line-based processing. Further, the Near-Lossless mode of the JPEG-LS standard makes higher compression ratios and visually lossless compressed images feasible, allowing the user to set the maximum acceptable difference between a reconstructed and an original image sample.
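The prediction and near-lossless steps described above can be illustrated with a short software sketch. It is a simplified illustration only, not the core's implementation or its bit-accurate model: it shows the LOCO-I median edge detector predictor and the near-lossless error quantization, and omits context modeling, bias correction, and entropy coding. The function names and the sample values are hypothetical.

```python
def med_predict(a: int, b: int, c: int) -> int:
    """Median edge detector: a = left, b = above, c = above-left neighbor."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def quantize_error(err: int, near: int) -> int:
    """Quantize a prediction error into steps of 2*near + 1 (near = 0 is lossless)."""
    step = 2 * near + 1
    if err > 0:
        return (near + err) // step
    return -((near - err) // step)


def reconstruct(pred: int, q_err: int, near: int, maxval: int = 255) -> int:
    """Rebuild the sample as a decoder would, clamped to the valid sample range."""
    value = pred + q_err * (2 * near + 1)
    return min(max(value, 0), maxval)


if __name__ == "__main__":
    near = 2                         # maximum acceptable sample difference
    a, b, c, x = 100, 110, 104, 113  # hypothetical neighbors and current sample
    pred = med_predict(a, b, c)
    q = quantize_error(x - pred, near)
    rec = reconstruct(pred, q, near)
    print(pred, q, rec, abs(rec - x) <= near)
```

With `near` set to 0 the quantization step is 1 and the reconstruction is exact; larger values trade a bounded per-sample error for smaller quantized residuals.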
The JPEG-LS-E core delivers the full compression efficiency of the standard in a compact and easy-to-use hardware block. The core interfaces to the system via standardized AMBA® interfaces: it accepts images and outputs compressed data via AXI4-Stream interfaces, and provides access to its control and status registers via a 32-bit APB interface. After its registers are programmed, the core can encode an arbitrary number of images without requiring any further assistance or action from the system. Users can optionally insert timestamps or other metadata in the compressed stream using a dedicated AXI Streaming interface.
The core is designed with industry best practices, and its reliability has been proven through both rigorous verification and silicon validation. The deliverables include a complete verification environment and a bit-accurate software model.
Despite its lower computational complexity, JPEG-LS offers exceptionally high lossless compression efficiency. JPEG-LS is expected to outperform PNG, and to provide compression ratios similar to those of lossless JPEG 2000 for both color and greyscale images.
The following shows the lossless compression advantage of JPEG-LS over other, more complex algorithms using several indicative example images.
The JPEG-LS-E is suitable for systems requiring numerically or visually lossless compression of images or video of potentially high color or greyscale accuracy. Application areas include medical imaging (DICOM), aerospace imaging or surveillance, and advanced driver assistance systems.
The core is available in two versions: size-optimized JPEG-LS-ES and scalable throughput JPEG-LS-EF.
The JPEG-LS-ES version uses just 40K gates, provides a throughput of one sample per cycle, and requires only one image line of buffering. A single JPEG-LS-ES core can compress several hundred Msamples per second when mapped to an ASIC technology.
The scalable-throughput JPEG-LS-EF version can process multiple samples per cycle by internally aggregating a user-defined number of JPEG-LS-ES cores. It is suitable for compressing images or video with ultra-high resolutions and/or frame rates.
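As a rough sizing aid, the relationship between video format, clock frequency, and samples per cycle can be sketched as follows. This is a back-of-the-envelope estimate under stated assumptions: it counts active pixels only, assumes the core accepts input on every clock cycle, and ignores blanking and any per-image overhead. The function names are hypothetical and are not part of the core's deliverables.

```python
import math


def required_msamples_per_s(width: int, height: int, fps: float, components: int = 1) -> float:
    """Sample rate of a video format in Msamples/s (active pixels only)."""
    return width * height * fps * components / 1e6


def samples_per_cycle_needed(msamples_per_s: float, clock_mhz: float) -> int:
    """Smallest whole number of samples per cycle that sustains the given rate."""
    return math.ceil(msamples_per_s / clock_mhz)


if __name__ == "__main__":
    # Hypothetical case: 1080p30 with three color components at an 80 MHz clock.
    rate = required_msamples_per_s(1920, 1080, 30, components=3)
    print(f"{rate:.3f} Msamples/s, {samples_per_cycle_needed(rate, 80)} samples/cycle")
```

A result of 1 suggests the JPEG-LS-ES version may suffice at that clock; a larger result points to a JPEG-LS-EF configuration. Actual achievable clock rates depend on the target technology.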
Support
The core as delivered is warranted against defects for ninety days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available.
Deliverables
The core is available in source code RTL (Verilog) or as an FPGA netlist, and its deliverables include everything required for successful implementation:
- Sophisticated self-checking Testbench
- Software (C++) Bit-Accurate Model
- Sample simulation and synthesis scripts
- Comprehensive user documentation
The JPEG-LS-E can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The JPEG-LS-ES version requires about 40K gates and can run at up to 500 MHz in a typical 28 nm technology. The size of a JPEG-LS-EF version depends on its configuration. Please contact CAST to get characterization data for your target configuration and technology.
The JPEG-LS-E can be mapped to any Altera® FPGA device, provided sufficient silicon resources are available. The following table provides sample performance and resource utilization data for different Altera® FPGA device families for the JPEG-LS-ES version of the core.
 | 720p30 (2) | 1080p25 (2) | 720p60 (2) | 1080p30 (2) |
---|---|---|---|---|
Arria® 10 | | | | |
Cyclone® 10 | | | | |
Max® 10 | | | | |
Logic (1) | 9.3K LEs or 3.6K ALMs (for max bits/sample: 8) | | | |
Memory Bits (1) | 54K (for max bits/sample: 8) | | | |
DSPs / MULTs | 0 | | | |
Notes
1: Exact resource utilization and maximum performance depend on the target device and core configuration.
2: The list of video formats is not exhaustive. Indicated video formats may not be supported on devices of all speed grades.
The JPEG-LS-E can be mapped to any AMD device, provided sufficient silicon resources are available. The following table provides sample performance and resource utilization data for different AMD (Xilinx) device families for the JPEG-LS-ES version of the core.
 | 720p30 (2) | 720p50 (2) | 720p60 (2) | 1080p30 (2) |
---|---|---|---|---|
KINTEX ULTRASCALE | | | | |
KINTEX-7 | | | | |
ARTIX-7 | | | | |
LUTs (1) | 5.6K (for max bits/sample: 8) | | | |
BRAMs (1) | 2.5 Block RAM Tiles (for max bits/sample: 8) | | | |
DSPs | 0 | | | |
Notes
1: Exact resource utilization and maximum performance depend on the target device and core configuration.
2: The list of video formats is not exhaustive. Indicated video formats may not be supported on devices of all speed grades.
The JPEG-LS-E can be mapped to any Lattice device, provided sufficient silicon resources are available. The following table provides sample performance and resource utilization data for a limited set of configurations of the core. The sample results do not represent the highest speed or smallest area achievable for the core.
Family / Device | Configuration | Logic Resources | Memory Resources | Freq. (MHz) |
---|---|---|---|---|
CrossLink-NX LIFCL-40 (-8HP) | 1 sample/cycle, 8 bits/sample | 6,877 LUT4 | 8 EBR | 99 |
CrossLink-NX LIFCL-40 (-8HP) | 2 samples/cycle, 8 bits/sample | 15,007 LUT4 | 16 EBR | 88 |
ECP3 LFE-35 | 2 samples/cycle, 10 bits/sample | 15,597 LUT4 | 16 EBR | 73 |
Table 1: Sample results for the core configured to support a max image width of 2048 pixels, and 1 color component.
The JPEG-LS-E can be mapped to any Microchip device, provided sufficient silicon resources are available. The following tables provide sample performance and resource utilization data for different Microchip (Microsemi) device families for the JPEG-LS-ES and the JPEG-LS-EF versions of the core. The sample results do not represent the highest speed or smallest area achievable for the core.
Family / Device | Logic Resources | Memory Resources | Freq. (MHz) | MSamples/s |
---|---|---|---|---|
Igloo2 M2GL150-STD | 9,402 4LUT | 25 RAM64x18, 4 RAM1K18 | 80 | 80 |
PolarFire MPF500T-STD | 8,892 4LUT | 31 uSRAM, 6 LSRAM | 120 | 120 |
RTG4 RT4G150-STD | 9,289 4LUT | 25 RAM64x18, 4 RAM1K18 | 60 | 60 |
SmartFusion2 M2S150-STD | 9,402 4LUT | 25 RAM64x18, 4 RAM1K18 | 80 | 80 |
Table 1: Sample results for the JPEG-LS-ES version of the core configured to support a max image width of 2048 pixels, 8 bits per sample, and 1 color component.
Family / Device | Logic Resources | Memory Resources | Freq. (MHz) | MSamples/s |
---|---|---|---|---|
Igloo2 M2GL150-STD | 15,889 4LUT | 40 RAM64x18, 12 RAM1K18 | 80 | 160 |
PolarFire MPF500T-STD | 15,463 4LUT | 40 uSRAM, 12 LSRAM | 120 | 240 |
RTG4 RT4G150-STD | 15,571 4LUT | 40 RAM64x18, 14 RAM1K18 | 50 | 100 |
SmartFusion2 M2S150-STD | 15,889 4LUT | 40 RAM64x18, 12 RAM1K18 | 80 | 160 |
Table 2: Sample results for the JPEG-LS-EF version of the core configured to support a max image width of 2048 pixels, 8 bits per sample, 1 color component, and 2 samples/cycle throughput.
Engineered by Beyond Semiconductor.
Features List
JPEG-LS Encoder
- Highly Efficient Numerically Lossless Compression
- Compression ratios similar or superior to those of other lossless compression algorithms (JPEG 2000, PNG, etc.)
- Near-Lossless Compression
- Enables greater compression with visually lossless quality by constraining the maximum difference between reconstructed and original image samples
- Maximum image resolution of 64Kx64K, or higher via support for oversize image dimension parameters
- Up to 16 bits per color sample; up to four color components
Easy to Use and Integrate
- Run-time programmable input and encoding parameters
- Image resolution, number of color components, color depth
- Maximum reconstruction error, Point-Transform, Local Gradient, Reset Frequency
- Automatic program-once encode-many operation
- AXI4-Stream interfaces for image and compressed data, and 32-bit wide APB for register access
- Dedicated, easy-to-use timestamps interface
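For orientation on the Local Gradient and Reset Frequency parameters listed above, the sketch below computes the default gradient thresholds (T1, T2, T3) as they are understood to be defined by the JPEG-LS standard for a given maximum sample value and maximum reconstruction error (NEAR). It is an assumption-laden illustration: it covers only sample depths of 7 bits or more (MAXVAL >= 128), the function names are hypothetical, the formulas should be checked against ISO/IEC 14495-1 before use, and it says nothing about how the core's registers encode these values.

```python
# Baseline thresholds for 8-bit lossless coding and the default RESET value.
BASIC_T1, BASIC_T2, BASIC_T3 = 3, 7, 21
DEFAULT_RESET = 64


def _clamp(value: int, low: int, maxval: int) -> int:
    """Fall back to the lower bound when the value is out of range."""
    return low if value > maxval or value < low else value


def default_thresholds(maxval: int, near: int = 0) -> tuple[int, int, int]:
    """Default (T1, T2, T3) for MAXVAL >= 128; other cases are not covered here."""
    if maxval < 128:
        raise ValueError("this sketch only covers MAXVAL >= 128")
    factor = (min(maxval, 4095) + 128) >> 8
    t1 = _clamp(factor * (BASIC_T1 - 2) + 2 + 3 * near, near + 1, maxval)
    t2 = _clamp(factor * (BASIC_T2 - 3) + 3 + 5 * near, t1, maxval)
    t3 = _clamp(factor * (BASIC_T3 - 4) + 4 + 7 * near, t2, maxval)
    return t1, t2, t3


if __name__ == "__main__":
    for bits in (8, 12, 16):
        print(bits, default_thresholds((1 << bits) - 1), DEFAULT_RESET)
```

Programming non-default thresholds or reset values is an encoder tuning choice; a decoder learns them from the compressed stream.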
Versions and Throughput
- Area-optimized JPEG-LS-ES: one sample per cycle
- 40,000 eq. gates and up to 500 Msamples/sec on a typical 28nm technology
- Throughput-optimized JPEG-LS-EF: synthesis-configurable number of samples per cycle
Deliverables
- Source code RTL (Verilog) or Targeted FPGA Netlist
- Bit Accurate Model
- Sample simulation and synthesis scripts
- Verification testbenches
- Comprehensive documentation
Resources
HP Labs LOCO-I/JPEG-LS Home Page
JPEG-LS file viewers: ffplay, IrfanView, XnView
SDKs supporting JPEG-LS: ffmpeg, PICTools, VintaSoft Imaging .NET SDK, LEADTOOLS imaging tool kits
JPEG-LS software implementations: UBC, CharLS, Clunie
Comparison paper (PDF): Lossless Compression of Grayscale Medical Images - Effectiveness of Traditional and State of the Art Approaches