Silicon IP Cores
JPEG-EX-F
Ultra-Fast Baseline and Extended JPEG Encoder
This JPEG compression IP core supports the Baseline Sequential DCT and the Extended Sequential DCT modes of the ISO/IEC 10918-1 standard. It implements a scalable, ultra-high-performance hardware JPEG encoder for ASIC or FPGA that can compress high-pixel-rate video using significantly fewer silicon resources and less power than encoders for video compression standards such as HEVC/H.265, DSC, AVC/H.264, or JPEG2000.
The JPEG-EX-F encoder produces compressed JPEG images and the video payload for Motion-JPEG container formats. It accepts images with up to 12-bit color samples and up to four color components, in all widely used color subsampling formats.
Depending on its configuration, the encoder processes from two to 32 color samples per clock cycle, enabling it to compress UHD (4K/8K) video and/or very high frame rate video.
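As a rough illustration of how the samples-per-cycle setting relates to video formats, the sketch below estimates the active sample rate of a format and the lowest clock frequency at which a core consuming a given number of samples per cycle would keep up. It is back-of-the-envelope arithmetic, not a CAST sizing tool: the function names are hypothetical, UHD is assumed to be 3840x2160, and blanking intervals and any core-internal overhead are ignored.

```python
# Average color samples per pixel for common three-component subsampling formats.
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}


def sample_rate(width, height, fps, subsampling):
    """Active color samples per second for a video format (blanking ignored)."""
    return width * height * fps * SAMPLES_PER_PIXEL[subsampling]


def min_clock_mhz(width, height, fps, subsampling, samples_per_cycle):
    """Lowest clock (MHz) at which a core consuming `samples_per_cycle`
    samples every cycle keeps up with the format's active sample rate."""
    return sample_rate(width, height, fps, subsampling) / samples_per_cycle / 1e6


if __name__ == "__main__":
    # Assumed example: UHD (3840x2160) at 60 fps, 4:2:2, for three configurations.
    for spc in (2, 8, 32):
        mhz = min_clock_mhz(3840, 2160, 60, "4:2:2", spc)
        print(f"UHD 60 fps 4:2:2 at {spc} samples/cycle: about {mhz:.1f} MHz")
```

Under these assumptions, UHD at 60 fps in 4:2:2 needs roughly 498 MHz at 2 samples per cycle, and proportionally less as the samples-per-cycle setting increases.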
Once programmed, the easy-to-use encoder requires no assistance from a host processor to compress an arbitrary number of frames. SoC integration is straightforward thanks to standardized AMBA® interfaces: AXI Streaming for pixel and compressed data, and a 32-bit APB slave interface for register access. Users can optionally insert timestamps or other metadata in the compressed stream using a dedicated AXI Streaming interface.
Customers for whom short time to market is a priority can use CAST’s IP Integration Services to receive complete JPEG subsystems. These integrate the JPEG encoder with video interface controllers, Hardware UDPIP or Transport Stream networking stacks, or other IP cores available from CAST.
The core is designed with industry best practices, and its reliability and low risk have been proven through both rigorous verification and customer production. Its deliverables include a complete verification environment and a bit-accurate software model.
The core has been verified through extensive synthesis, place and route and simulation runs. It has also been embedded in several products, and is proven in both ASIC and FPGA technologies.
Support
The core as delivered is warranted against defects for ninety days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available.
Deliverables
The core is available in ASIC (synthesizable HDL) and FPGA (netlist) forms, and includes everything required for successful implementation. The ASIC version includes:
- Verilog RTL source code
- Sophisticated self-checking Testbench
- Software (C++) Bit-Accurate Model
- Sample simulation and synthesis scripts
- Comprehensive user documentation
The JPEG-EX-F core can be mapped to any ASIC and optimized to suit the particular project’s requirements. The following table provides sample implementation and performance data with the core configured to process 2 samples per cycle (JPEG-EX-F/2) and under its default configuration.
JPEG-EX-F/2 | UHD/4K 60 fps 4:2:2 | UHD/4K 120 fps 4:2:0 | UHD/4K 120 fps 4:2:2 | Area (Kgates) | Freq (MHz)
---|---|---|---|---|---
TSMC 40g | | | | 145 | 500
TSMC 28hpm | | | | 120 | 800
TSMC 16 | | | | 130 | 1,000
Note that the list of video formats is not exhaustive, and that these sample implementation figures do not represent the highest speed or smallest area possible for the core. Please contact CAST to discuss silicon resource utilization and performance for your target technology.
The JPEG-EX-F can be mapped to any Altera FPGA Device (provided sufficient silicon resources are available) and optimized to suit the particular project’s requirements. The following table provides sample implementation and performance data with the core configured to process 2 samples per cycle (JPEG-EX-F/2) and under its default configuration.
JPEG-EX-F/2 | 1080p60 4:4:4 | UHD/4K 25 fps 4:2:0 | UHD/4K 30 fps 4:2:2 | Logic Resources | DSPs | Memory Bits
---|---|---|---|---|---|---
Max10 | | | | 16,310 LEs | 16 | 77,432
CycloneV | | | | 6,060 ALMs | 8 | 78,146
Arria10 | | | | 6,204 ALMs | 8 | 78,034
Note that the list of video formats is not exhaustive, and that these sample implementation figures do not represent the highest speed or smallest area possible for the core. Please contact CAST to discuss silicon resource utilization and performance for your target technology.
The JPEG-EX-F can be mapped to any AMD Device (provided sufficient silicon resources are available) and optimized to suit the particular project’s requirements. The following table provides sample implementation and performance data with the core configured to process 2 samples per cycle (JPEG-EX-F/2) and under its default configuration.
Device Family | LUTs | DSP | BRAMs | Freq. (MHz)
---|---|---|---|---
Artix-7 (speed grade -1) | 6,378 | 32 | 5 | 166
Kintex-7 (speed grade -1) | 6,354 | 32 | 5 | 200
Spartan-7 (speed grade -2) | 6,356 | 32 | 5 | 130
Kintex UltraScale+ (speed grade -1) | 6,221 | 32 | 4 | 250
Artix UltraScale+ (speed grade -1) | 6,223 | 32 | 4 | 240
Versal Prime (speed grade -2) | 6,372 | 32 | 4 | 250
Please note that these sample implementation figures do not represent the highest speed or smallest area possible for the core. Please contact CAST to discuss silicon resource utilization and performance for your target technology.
The JPEG-EX-F can be mapped to any Lattice Device (provided sufficient silicon resources are available) and optimized to suit the particular project’s requirements. The following table provides sample implementation and performance data with the core configured to process 2 samples per cycle (JPEG-EX-F/2) and under its default configuration.
Family / Device | Logic Resources | Memory Resources | Fmax (MHz)
---|---|---|---
ECP5U LAE5U-12F | 21,962 LUT4s, 20 MULT18 | 21 EBR | 104
CrossLink-NX LIFCL-40 (-8HP) | 17,539 LUT4s, 12 MULT18 | 22 EBR | 142
Note that the implementation figures do not represent the highest speed or smallest area possible for the core. Please contact CAST to discuss silicon resource utilization and performance for your target technology.
The JPEG-EX-F can be mapped to any Microchip Device (provided sufficient silicon resources are available) and optimized to suit the particular project’s requirements. The following table provides sample implementation and performance data with the core configured to process 2 samples per cycle (JPEG-EX-F/2) and under its default configuration.
Family / Device | Logic Resources | Memory Resources | Freq (MHz) | MSamples/sec
---|---|---|---|---
IGLOO2 / M2GL150-STD | 16,197 4LUTs | 41 RAM64x18, 6 RAM1Kx18 | 130 | 260
PolarFire / MPF500T-STD | 14,802 4LUTs | 50 uSRAM, 16 LSRAM | 150 | 300
RTG4 / RT4G150-STD | 15,984 4LUTs | 41 RAM64x18, 6 RAM1Kx18 | 70 | 140
SmartFusion2 / M2S150-STD | 16,197 4LUTs | 41 RAM64x18, 6 RAM1Kx18 | 130 | 260
Note that these sample implementation figures do not represent the highest speed or smallest area possible for the core. Please contact CAST to discuss silicon resource utilization and performance for your target technology.
Engineered by Beyond Semiconductor.
Features List
Scalable, ultra-high-performance, 4K/8K-capable JPEG Encoder
- Requires significantly lower power and fewer silicon resources than any equally fast hardware video encoder for HEVC/H.265, AVC/H.264, DSC, or JPEG2000.
- Consumes much less power than any equivalent software or software-hardware encoder.
Standards Support
- ISO/IEC 10918-1 Baseline and Extended Sequential DCT modes
- Single-frame JPEG images and Motion JPEG payloads
- 8-bit and 12-bit color samples
- Up to four color components; any image size up to 64k x 64k
- All scan configurations and all JPEG formats; APP, COM, and restart markers
- Programmable Huffman and quantization tables
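For readers inspecting streams that use the markers listed above, the following sketch walks the header segments of a JPEG byte stream and names them using the marker codes defined in ISO/IEC 10918-1. It is a simplified illustration, not part of the core's deliverables: the function names are hypothetical, fill bytes before markers are not handled, and scanning stops at the first SOS marker because entropy-coded data follows it.

```python
import struct

# Marker codes from ISO/IEC 10918-1 (the byte that follows 0xFF).
MARKER_NAMES = {
    0xC0: "SOF0 (Baseline DCT)",
    0xC1: "SOF1 (Extended Sequential DCT)",
    0xC4: "DHT",
    0xD8: "SOI",
    0xD9: "EOI",
    0xDA: "SOS",
    0xDB: "DQT",
    0xDD: "DRI",
    0xFE: "COM",
}


def marker_name(code):
    """Map a marker code to a readable name, covering APPn and RSTn ranges."""
    if 0xE0 <= code <= 0xEF:
        return f"APP{code - 0xE0}"
    if 0xD0 <= code <= 0xD7:
        return f"RST{code - 0xD0}"
    return MARKER_NAMES.get(code, f"0x{code:02X}")


def list_header_segments(data):
    """Return (name, payload_length) for each segment up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    segments = [("SOI", 0)]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected marker at offset {pos}")
        code = data[pos + 1]
        # The 16-bit big-endian length field counts its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((marker_name(code), length - 2))
        if code == 0xDA:  # entropy-coded data follows SOS; stop here
            break
        pos += 2 + length
    return segments


if __name__ == "__main__":
    # Hand-built fragment for illustration only: SOI, a COM segment, a DRI segment.
    demo = b"\xff\xd8" + b"\xff\xfe\x00\x06test" + b"\xff\xdd\x00\x04\x00\x08"
    print(list_header_segments(demo))
```

A Baseline stream would show an SOF0 segment and an Extended Sequential stream an SOF1 segment, so a scan like this is a quick way to confirm which mode a given file uses.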
Rate Control Options
- Image: Limits the size of each individual frame
- Video: Regulates bit rate over a number of input frames.
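The difference between the two rate control styles can be sketched with simple budget arithmetic. The code below is only an illustration of the two constraints as described in the list above, not the core's rate control algorithm: the function names, the sliding-window formulation, and the example figures are all assumptions.

```python
def frame_budget_bytes(bit_rate_bps, fps):
    """Average compressed bytes available per frame at a target bit rate."""
    return bit_rate_bps / fps / 8


def image_mode_ok(frame_sizes, max_frame_bytes):
    """Image-style limit: every individual frame must fit under the cap."""
    return all(size <= max_frame_bytes for size in frame_sizes)


def video_mode_ok(frame_sizes, bit_rate_bps, fps, window):
    """Video-style limit: each run of `window` consecutive frames must fit the
    budget for that run, so single frames may exceed the per-frame average."""
    budget = frame_budget_bytes(bit_rate_bps, fps) * window
    return all(
        sum(frame_sizes[i:i + window]) <= budget
        for i in range(len(frame_sizes) - window + 1)
    )


if __name__ == "__main__":
    # Assumed example: 200 Mbit/s target at 60 fps, compressed frame sizes in bytes.
    sizes = [350_000, 480_000, 340_000, 470_000]
    cap = frame_budget_bytes(200_000_000, 60)
    print(image_mode_ok(sizes, cap))                       # per-frame cap
    print(video_mode_ok(sizes, 200_000_000, 60, window=2)) # budget over 2 frames
```

With these assumed numbers, the per-frame cap is violated by the larger frames, while the two-frame budget is met because smaller neighboring frames offset them. This is the practical trade-off between limiting each frame and regulating the rate over several frames.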
Interfaces
- AXI Streaming pixel and compressed stream interfaces
- APB Control/Status interface
Performance
- Synthesis-time configurable scalable throughput
- Up to 32 samples per clock cycle
- Supports UHD (4K/8K) video and/or ultra-high frame rates
Ease of Integration
- Automatic program-once/encode-many operation
- Simple, dedicated timestamps interface
- Bit-accurate software model generates test vectors, expected results, and core programming values
- Optional Raster-to-Block Conversion with AXI or standard memory interface to the line buffer
Resources
See the JPEG entry at Wikipedia.
See the Motion JPEG entry at Wikipedia.