AES-P
Programmable Advanced Encryption Standard Engine

The AES-P encryption IP core implements hardware Rijndael encoding and decoding in compliance with the NIST Advanced Encryption Standard. It processes 128-bit blocks, and is programmable for 128-, 192-, and 256-bit key lengths. 

Two architectural versions are available to suit system requirements. The Standard version (AES-P-S) is more compact, using a 32-bit datapath and requiring 44/52/60 clock cycles for each data block (128/192/256-bit cipher key, respectively). The Fast version (AES-P-F) achieves higher throughput, using a 128-bit datapath and requiring 11/13/15 clock cycles for each data block.  It can be programmed to use any of the following cipher modes: CBC, CTR, ECB, and OFB. 

The core works with a pre-expanded key, or with optional key expansion logic. 

The AES-P core is a fully synchronous design and has been evaluated in a variety of technologies, and is available optimized for ASICs or FPGAs.  

An AES encryption operation transforms a 128-bit block into a block of the same size. The encryption key can be chosen among three different sizes: 128, 192 or 256 bits. The key is expanded during cryptographic operations.  

The AES algorithm consists of a series of steps repeated a number of times (rounds). The number of rounds depends on the size of the key and the data block. The intermediate cipher result is known as state.

Number of rounds as a function of key size.
  KSIZE = 00 KSIZE = 01 KSIZE = 10
Rounds 10 12 14

Initially, the incoming data and the key are added together in the AddRoundKey module. The result is stored in the State Storage area. 

The state information is then retrieved and the ByteSub, Shiftrow, MixColumn and AddRoundKey functions are performed on it in the specified order. At the end of each round, the new state is stored in the State Storage area. These operations are repeated according to the number of rounds.  

The final round is anomalous as the MixColumn step is skipped. The cipher is output after the final round. 

Supported modes 

This AES-P core supports both encryption and decryption in ECB, CBC, CFB, OFB and CTR modes. ECB stands for Electronic CodeBook mode. This is the basic AES algorithm as described in the FIPS 197 documentation. 

CBC stands for Cipher Block Chaining mode. Chaining adds a feedback mechanism to a block cipher. The result of the previous encryption operation is XORed with the incoming data. An initialization vector IV is used for the first iteration. Decryption reverses encryption operations. 

The figure below shows the data flow during encryption (left) and decryption (right) in CBC mode.

Encryption and decryption data flow in CBC mode
Encryption and decryption data flow in CBC mode

CFB stands for Cipher FeedBack mode. In this mode, the output of an encryption operation is fed back to the input of the AES core. An initialization vector IV is used for the first iteration. 

Input data is encrypted by XORing it with the output of the encryption module. Decryption reverses encryption operations. 

The figure below shows the block diagram of the AES in CFB mode.

AES CFB core block diagram
AES CFB core block diagram

OFB stands for Output FeedBack mode. In this mode, the output of an encryption operation is fed back to the input of the AES core. An initialization vector IV is used for the first iteration. 

Input data is encrypted by XORing it with the output of the encryption module. Decryption reverses encryption operations. 

The figure below shows the block diagram of the AES in OFB mode.

AES OFB mode block diagram
AES OFB mode block diagram

CTR stands for Counter mode. In this mode, the output of counter is input of the AES core. An initialization vector IV is used to initialize the counter. 

Input data is encrypted by XORing it with the output of the encryption module. Decryption reverses encryption operations. 

The figure below shows the block diagram of the AES in CTR mode.

AES CTR mode block diagram
AES CTR mode block diagram

Key Expansion 

The AES algorithm requires an expanded key for encryption or decryption. The KEXP AES key expander core is available as an AES-P core option. 

During encryption, the key expander can produce the expanded key on the fly while the AES-P core is consuming it. For decryption, though, the key must be pre-expanded and stored in an appropriate memory before being used by the AES-P core. This is because the core uses the expanded key backwards during decryption.  

In some cases a key expander is not required. This might be the case when the key does not need to be changed (and so it can be stored in its expanded form) or when the key does not change very often (and thus it can be expanded more slowly in software). 

The core has been verified through extensive synthesis, place and route and simulation runs. It has also been embedded in several products, and is proven both in ASIC and FPGA technologies. 
 

Support 

The core as delivered is warranted against defects for ninety days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available. 

Deliverables 

The core is available in ASIC (RTL) or FPGA (netlist) forms, and includes everything required for successful implementation. The ASIC version includes 

  •     HDL RTL source 
  •     Sophisticated HDL Testbench (self checking) 
  •     C Model & test vector generator 
  •     Simulation script, vectors & expected results 
  •     Synthesis script 
  •     User documentation 

The AES-P can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The following are sample ASIC pre-layout results reported from synthesis with a silicon vendor design kit under typical conditions, with all core I/Os assumed to be routed on-chip. The provided figures do not represent the higher speed or smaller area for the core. Please contact CAST to get characterization data for your target configuration and technology.

AES-P Standard Core ASIC Implementation Results

ASIC Technology

Number of eq. gates

Fmax (MHz)

Throughput (Gbps)

TSMC 16nm
9,149
500
1.455
TSMC 28nm HPM
9,564
500
1.455

TSMC 40nm G

12,231
500
1.455

Throughput for a 128-bit key size

AES-P Fast Core ASIC Implementation Results

ASIC Technology

Number of eq. gates

Fmax (MHz)

Throughput (Gbps)

TSMC 16nm
27,598
500
5.818
TSMC 28nm HPM
28,313
500
5.818

TSMC 40nm G

37,075
500
5.818

Throughput for a 128-bit key size

The AES-P can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The following are sample Intel® results with all core I/Os assumed to be routed on-chip. The provided figures do not represent the higher speed or smaller area for the core. Please contact CAST to get characterization data for your target configuration and technology.

AES-P Standard Core Intel Implementation Results

Family Logic Memory Freq.
(MHz)
Throughput
(Mbps)
Cyclone V (-7) 238 ALMs 8 RAM Block 160 465
Stratix V (-1) 229 ALMs 4 RAM Block 380 1,105
MAX 10 (-7) 546 LEs 16 M9K 140 145

Throughput for a 128-bit key size

AES-P Fast Core Intel Implementation Results

Family Logic Memory Freq.
(MHz)
Throughput
(Mbps)
Cyclone V (-7) 1,040 ALMs 32 RAM Block 110 1,280
Stratix V (-1) 1,014 ALMs 16 RAM Block 290 3,375
MAX 10 (-7)\ 1,248 LEs 64 M9K 80 683

Throughput for a 128-bit key size

The AES-P can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The following are sample AMD® results with all core I/Os assumed to be routed on-chip. The provided figures do not represent the higher speed or smaller area for the core. Please contact CAST to get characterization data for your target configuration and technology.

AES-P Standard Core AMD Implementation Results

Family (speed grade) LUTs BRAMs Freq.
(MHz)
Throughput
(Mbps)
Zynq-7000 (-3) 354 2 250 727
Kintex-7 (-3) 354 2 326 945
Virtex-7 (-3) 355 2 300 873
Kintex UltraScale (-3) 351 2 425 1,236
Kintex UltraScale+ (-3) 353 2 550 1,600
Versal (-2) 383 2 425 1,236

Throughput for a 128-bit key size

AES-P Fast Core AMD Implementation Results

Family LUTs BRAMs Freq
(MHz)
Throughput
(Mbps)
Zynq—7000 (-3) 998 8 175 2,036
Kintex-7 (-3) 1,007 8 250 2,909
Virtex-7 (-3) 1,011 8 250 2,909
Kintex UltraScale (-3) 1,029 8 350 4,073
Kintex UltraScale+ (-3) 1,033 8 475 5,527
Versal (-2) 1098 8 325 3,782

Throughput for a 128-bit key size

Related Content

Features List

  • Encrypts and decrypts using the AES Rijndael Block Cipher Algorithm
  • Satisfies Federal Information Processing Standard (FIPS) Publication 197 from the US National Institute of Standards and Technology (NIST)
  • Processes 128-bit data in 32-bit blocks
  • Employs user-programmable key size of 128, 192 or 256 bits
  • Two architectural versions:
    • Standard is more compact: 32-bit data path size Processes each 128-bit data block in 44/52/60 clock cycles for 128/192/256-bit cipher keys, respectively
    • Fast yields higher transmission rates: 128-bit data path Processes each 128-bit block in 11/13/15 clock cycles for 128/192/256-bit cipher keys, respectively
  • Supports Cipher Block Chaining (CBC), Cipher Feedback (CFB), Counter (CTR), Electronic Codebook (ECB), and Output Feedback (OFB) modes
  • Works with a pre-expended key or can integrate the optional key expansion function
  • NIST-Validated
  • Simple, fully synchronous, reusable design
  • Available as fully functional and synthesizable VHDL or Verilog, or as a netlist for popular programmable devices
  • Complete deliverables include test benches, C model and test vector generator

Resources

AES on Wikipedia

NIST: Approved Block Ciphers

FIPS 197, Advanced Encryption Standard (AES): download PDF

AES test suite: The Advanced Encryption Standard Algorithm Validation Suite (AESAVS): download PDF

Let's talk about your project and our IP solutions

Request Info

This core implements encryption functions and as such it is subject to export control regulations. Export to your country may or may not require a special export license. Please contact CAST to determine what applies in your specific case.