Silicon IP Cores
JPEG-LS-D
Lossless & Near-Lossless JPEG-LS Decoder
The JPEG-LS-D core implements a highly efficient, low-power, lossless and near-lossless image decompression engine that is compliant with the JPEG-LS standard, ISO/IEC 14495-1.
The decoder core can decompress any JPEG-LS stream or JPEG-LS payload of image container formats, such as DICOM (Digital Imaging and Communications in Medicine). It accepts compressed streams of images with up to 16 bits per color sample and up to four color components, in all widely used color subsampling formats. Supporting oversize image dimension parameters, the core can decode images with resolutions exceeding 64k x 64k pixels.
The easy-to-use JPEG-LS-D core operates on a standalone basis, parsing marker segments and decompressing coded data with no assistance from a host processor. The decoder reports the image format (i.e., resolution, subsampling format, and color sample depth) to the system, so that the decoded images can be correctly processed further and/or displayed. Application (APP) and comment (COM) marker segments, which are typically used to embed metadata in the compressed stream, are also passed to the system via a dedicated interface.
SoC integration is straightforward thanks to standardized AMBA® interfaces. The core accepts compressed data and outputs pixel data, frame format information, and APP or COM marker segments via AXI4-Stream interfaces, and it provides access to its control and status registers via a 32-bit APB interface. A wrapper that bridges the AXI4-Stream interfaces to AXI4 can optionally be delivered with the core.
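For readers who want a feel for the image format information the decoder extracts, the following Python sketch walks the marker segments of a JPEG-LS stream and reads the frame header. It is a software illustration only, not the core's implementation: the function name `read_frame_format` and the keys of the returned dictionary are hypothetical. It assumes the standard frame header layout (SOF55 marker `0xFFF7`, followed by sample precision, height, width, and per-component sampling factors) and deliberately ignores oversize dimensions and DNL segments.

```python
import struct

SOI = 0xFFD8    # start of image
SOF55 = 0xFFF7  # JPEG-LS start of frame
SOS = 0xFFDA    # start of scan


def read_frame_format(stream: bytes) -> dict:
    """Return the image format announced by the first SOF55 segment.

    Hypothetical helper for illustration; a zero width or height (used
    when the dimension is signalled elsewhere) is returned unchanged.
    """
    if len(stream) < 2 or struct.unpack_from(">H", stream, 0)[0] != SOI:
        raise ValueError("stream does not start with SOI")
    pos = 2
    while pos + 4 <= len(stream):
        marker, length = struct.unpack_from(">HH", stream, pos)
        if marker >> 8 != 0xFF:
            raise ValueError(f"marker syntax error at offset {pos}")
        if marker == SOF55:
            # Payload: precision (1 byte), height (2), width (2), components (1)
            precision, height, width, ncomp = struct.unpack_from(
                ">BHHB", stream, pos + 4
            )
            components = []
            for i in range(ncomp):
                # Per component: id, packed H/V sampling factors, unused table id
                cid, hv, _tq = struct.unpack_from(">BBB", stream, pos + 10 + 3 * i)
                components.append({"id": cid, "h": hv >> 4, "v": hv & 0x0F})
            return {
                "width": width,
                "height": height,
                "bits_per_sample": precision,
                "components": components,
            }
        if marker == SOS:
            break  # coded data follows; the frame header must come earlier
        # The length field counts itself but not the 2-byte marker
        pos += 2 + length
    raise ValueError("no SOF55 segment found before the first scan")
```

The loop also steps over APP and COM segments using their length fields, which is the same information a system would need in order to forward them as metadata.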
The core is designed with industry best practices, and its reliability has been proven through both rigorous verification and silicon validation. The deliverables include a complete verification environment and a bit-accurate software model.
Despite its lower computational complexity, JPEG-LS offers exceptionally high lossless compression efficiency. JPEG-LS is expected to outperform PNG, and to provide compression ratios similar to those of lossless JPEG2000 for both color and greyscale images. The following illustration shows several indicative examples.
The JPEG-LS-D is suitable for applications requiring numerically or visually lossless compression of images or video of potentially high color or greyscale accuracy, such as medical imaging (DICOM), aerospace imaging/surveillance, and advanced driver assistance systems (ADAS).
The core is available in two versions, size-optimized and scalable-throughput. The size-optimized version, JPEG-LS-DS, provides a throughput of one sample per cycle and requires only one image line of buffering. A single JPEG-LS-DS core can decompress several hundreds of Msamples per second when mapped on an ASIC technology.
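The single line of buffering follows from the way JPEG-LS predicts each sample from its already-decoded neighbors: the sample to the left and samples on the line above. The Python sketch below shows the fixed (median edge detector) predictor defined by the JPEG-LS standard for an interior sample. It illustrates the algorithm only and says nothing about the core's internal architecture; the helper names are hypothetical, and the special handling of the first line and of line edges is omitted.

```python
def med_predict(a: int, b: int, c: int) -> int:
    """JPEG-LS fixed predictor.

    a: reconstructed sample to the left (current line)
    b: reconstructed sample above (previous line)
    c: reconstructed sample above-left (previous line)
    """
    if c >= max(a, b):
        return min(a, b)   # edge detected: pick the smaller neighbor
    if c <= min(a, b):
        return max(a, b)   # edge detected: pick the larger neighbor
    return a + b - c       # smooth region: planar prediction


def predict_interior(prev_line: list, cur_line: list, x: int) -> int:
    """Predict sample x (x >= 1) using one buffered line plus the current one."""
    a = cur_line[x - 1]    # comes from the line being decoded
    b = prev_line[x]       # comes from the line buffer
    c = prev_line[x - 1]   # comes from the line buffer
    return med_predict(a, b, c)
```

Only `prev_line` has to be stored between lines, which is consistent with the one-line buffering requirement stated above.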
The scalable-throughput version, JPEG-LS-DF, can process multiple samples per cycle by internally aggregating a user-defined number of JPEG-LS-DS cores. The JPEG-LS-DF is suitable for decompressing images or video with ultra-high resolutions and/or frame rates, but it assumes the use of restart markers in the encoded stream.
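As a rough sizing aid, the required sample rate of a video format can be compared with the one-sample-per-cycle throughput of a single JPEG-LS-DS core. The Python sketch below is a back-of-the-envelope estimate under stated assumptions: no chroma subsampling, no blanking or interface overhead, a stream encoded with restart markers, and a clock frequency chosen by the user. The function names are hypothetical, and actual achievable rates should be confirmed with characterization data.

```python
def samples_per_second(width: int, height: int, components: int, fps: int) -> int:
    """Sample rate of an uncompressed video format without subsampling."""
    return width * height * components * fps


def cores_needed(required_sps: int, clock_hz: int) -> int:
    """Number of one-sample-per-cycle cores needed (integer ceiling division)."""
    return -(-required_sps // clock_hz)


# Example: 3840x2160, three color components, 60 frames per second,
# with an assumed 300 MHz clock.
rate = samples_per_second(3840, 2160, 3, 60)   # 1,492,992,000 samples/s
print(cores_needed(rate, 300_000_000))         # prints 5
```

Color subsampling lowers the sample count per frame, so the estimate above is an upper bound for subsampled formats.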
Support
The core as delivered is warranted against defects for ninety days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available.
Deliverables
The core is available in source code RTL (Verilog) or as an FPGA netlist, and its deliverables include everything required for successful implementation:
- Sophisticated self-checking Testbench
- Software (C++) Bit-Accurate Model
- Sample simulation and synthesis scripts
- Comprehensive user documentation
The JPEG-LS-D can be mapped to any ASIC technology. The size of a core depends on its configuration. The following table provides sample area and performance data for the JPEG-LS-DS core mapped on a typical 16nm technology using SVT cells.
Max. Sample Depth (bits) | Max. NEAR Value | Logic Area (um2) | Logic Area (eq. gates) | Freq. (MHz) |
---|---|---|---|---|
8 | 0 | 6,511 | 37,681 | 300 |
10 | 0 | 7,066 | 40,892 | 300 |
16 | 0 | 8,581 | 49,660 | 300 |
8 | 4 | 7,139 | 41,315 | 300 |
10 | 4 | 7,974 | 46,143 | 300 |
16 | 4 | 10,023 | 58,003 | 300 |
8 | 8 | 7,377 | 42,689 | 300 |
10 | 8 | 8,381 | 48,499 | 300 |
16 | 8 | 11,036 | 63,865 | 280 |
Note that the list of core configurations is not exhaustive and that these sample implementation figures do not represent the highest speed or smallest area possible for the core. Please consult with CAST to get accurate characterization data for your target technology and required core configuration.
The JPEG-LS-D can be mapped to any Altera® FPGA device, provided enough silicon resources are available. The size of a core depends on its configuration. The following table provides sample area and performance data for the JPEG-LS-D core mapped on an Arria® 10 device (speed grade 3) and excludes the image line buffer.
Core Version | Max. Bits per Sample | Max. NEAR Value | FPGA Resources: ALMs | FPGA Resources: Mem. bits | MSamples per sec |
---|---|---|---|---|---|
JPEG-LS-DS | 8 | 0 | 5,095 | 25,046 | 63 |
JPEG-LS-DS | 8 | 4 | 5,455 | 25,046 | 55 |
JPEG-LS-DS | 8 | 8 | 5,535 | 25,046 | 50 |
JPEG-LS-DS | 10 | 0 | 5,503 | 30,014 | 61 |
JPEG-LS-DS | 10 | 4 | 5,958 | 30,014 | 50 |
JPEG-LS-DS | 10 | 8 | 5,976 | 30,014 | 41 |
JPEG-LS-DS | 16 | 0 | 7,767 | 45,283 | 77 |
JPEG-LS-DS | 16 | 4 | 8,479 | 45,283 | 37 |
JPEG-LS-DS | 16 | 8 | 8,492 | 45,283 | 31 |
JPEG-LS-DF with 3 cores | 8 | 0 | 12,850 | 75,208 | 189 |
JPEG-LS-DF with 3 cores | 10 | 0 | 13,824 | 90,112 | 183 |
JPEG-LS-DF with 3 cores | 16 | 0 | 20,378 | 135,919 | 168 |
The JPEG-LS-D can be mapped to any AMD device, provided enough silicon resources are available. The size of a core depends on its configuration. The following table provides sample area and performance data for the JPEG-LS-D core mapped on a Kintex UltraScale+ device with -1 speed grade and excludes the image line buffer.
Core Version | Max. Bits per Sample | Max. NEAR Value | FPGA Resources: LUTs | FPGA Resources: BRAM Tiles | MSamples per sec |
---|---|---|---|---|---|
JPEG-LS-DS | 8 | 0 | 7,076 | 2 | 100 |
JPEG-LS-DS | 8 | 4 | 7,442 | 2 | 100 |
JPEG-LS-DS | 8 | 8 | 7,858 | 2 | 95 |
JPEG-LS-DS | 10 | 0 | 7,414 | 2 | 100 |
JPEG-LS-DS | 10 | 4 | 8,067 | 2 | 92 |
JPEG-LS-DS | 10 | 8 | 8,301 | 2 | 85 |
JPEG-LS-DS | 16 | 0 | 9,782 | 2.5 | 98 |
JPEG-LS-DS | 16 | 4 | 10,823 | 2.5 | 78 |
JPEG-LS-DS | 16 | 8 | 11,021 | 2.5 | 57 |
JPEG-LS-DF with 3 cores | 8 | 0 | 16,330 | 6 | 300 |
JPEG-LS-DF with 3 cores | 10 | 0 | 17,820 | 6 | 300 |
JPEG-LS-DF with 3 cores | 16 | 0 | 25,384 | 7.5 | 285 |
The JPEG-LS-D can be mapped to any Microchip device, provided enough silicon resources are available. The size of a core depends on its configuration. The following tables provide sample performance and resource utilization data for different Microsemi device families for the JPEG-LS-DS and JPEG-LS-DF versions of the core. The sample results do not represent the highest speed or smallest area possible for the core.
Family / Device | Logic Resources | Memory Resources | Freq. (MHz) | MSamples/s |
---|---|---|---|---|
Igloo2 M2GL150-STD | 13,358 4LUT | 12 RAM64x18, 4 RAM1K18 | 30 | 30 |
PolarFire MPF500T-STD | 12,910 4LUT | 13 uSRAM, 6 LSRAM | 50 | 50 |
RTG4 RT4G150-STD | 12,943 4LUT | 12 RAM64x18, 4 RAM1K18 | 25 | 25 |
SmartFusion2 M2S150-STD | 13,358 4LUT | 12 RAM64x18, 4 RAM1K18 | 30 | 30 |
Table 1: Sample results for the JPEG-LS-DS version of the core configured to support a max image width of 2018 pixels, 8 bits per sample, and 1 color component.
Family / Device | Logic Resources | Memory Resources | Freq. (MHz) | MSamples/s |
---|---|---|---|---|
Igloo2 M2GL150-STD | 22,950 4LUT | 25 RAM64x18, 8 RAM1K18 | 30 | 60 |
PolarFire MPF500T-STD | 22,715 4LUT | 29 uSRAM, 8 LSRAM | 50 | 100 |
RTG4 RT4G150-STD | 22,794 4LUT | 25 RAM64x18, 8 RAM1K18 | 25 | 50 |
SmartFusion2 M2S150-STD | 22,950 4LUT | 25 RAM64x18, 8 RAM1K18 | 30 | 60 |
Table 2: Sample results for the JPEG-LS-DF version of the core configured to support a max image width of 2018 pixels, 8 bits per sample, 1 color component, and 2 samples/cycle throughput.
Engineered by Beyond Semiconductor.
Features List
JPEG-LS ISO/IEC 14495-1 Standard Support
- All JPEG-LS encoding parameters
- Optional near-lossless support
- All interleaved modes
- All marker segments, including APP, COM, DNL, and restart markers
- Image resolution higher than 64K x 64K (supports oversize image dimension parameters)
- Up to 16 bits per color sample, and up to four color components
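The near-lossless mode listed above is controlled by the NEAR parameter of the standard, which bounds the absolute error of every reconstructed sample. The Python sketch below illustrates the prediction-error quantization and reconstruction steps as defined in ISO/IEC 14495-1, to show why the error never exceeds NEAR. It is a simplified model rather than the core's datapath: the function names are hypothetical, and the modulo reduction of the error and the context modeling are left out.

```python
def quantize_error(err: int, near: int) -> int:
    """Quantize a prediction error to a multiple of (2*NEAR + 1)."""
    step = 2 * near + 1
    if err > 0:
        return (near + err) // step
    return -((near - err) // step)


def reconstruct(pred: int, qerr: int, near: int, maxval: int) -> int:
    """Rebuild a sample from its prediction and quantized error."""
    value = pred + qerr * (2 * near + 1)
    return min(max(value, 0), maxval)   # clamp to the valid sample range


# Example with 8-bit samples and NEAR = 4: the original value 200 is
# predicted as 180, so the error 20 is quantized to 2 steps of 9.
q = quantize_error(200 - 180, near=4)
print(q, reconstruct(180, q, near=4, maxval=255))   # prints "2 198"
```

With NEAR set to 0 the step size becomes 1 and the reconstruction is exact, which corresponds to the lossless mode.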
Easy to Use and Integrate
- Requires no programming or control from host
- Reports image format
- Detects and reports marker syntax errors
- Delivered with bit-accurate software model
- AXI4-Stream Interfaces for image and compressed data, and 32-bit wide APB for register access
- Dedicated interface for APP and COM markers to pass metadata to system
Versions and Throughput
- One sample per cycle, for the area-optimized JPEG-LS-DS version
- From 40,000 eq. gates and up to 350 Msamples/sec on a typical 16nm technology
- Synthesis-configurable number of samples per cycle, for the throughput-optimized JPEG-LS-DF version. Maximum throughput is only possible when images are encoded using restart markers.
Resources
HP Labs LOCO-I/JPEG-LS Home Page
JPEG-LS file viewers: ffplay, IrfanView, XnView
SDKs supporting JPEG-LS:
- ffmpeg
- PICTools
- VintaSoft Imaging .NET SDK
- LEADTOOLS imaging tool kits
JPEG-LS software implementations: UBC, CharLS, Clunie
Comparison paper (PDF): Lossless Compression of Grayscale Medical Images - Effectiveness of Traditional and State of the Art Approaches