Silicon IP Cores
BA22-CE
32-bit Cache-Enabled Embedded Processor
The royalty-free BA22-CE is a 32-bit processor for deeply embedded applications that use off-chip instruction and data memories and that may need to run a real-time operating system (RTOS). This processor core is extremely competitive in terms of high performance and low power consumption, and has best-in-class code density.
The core has instruction and data caches, and an AMBA® AHB™, AXI4™, or Wishbone system bus interface. It is also equipped with dedicated Quick Memory (QMEM) interfaces to tightly coupled memories, which offer fast and deterministic access to code and data, and can be used for inter-core communication in a multi-core architecture. Its base version includes 16 to 32 general-purpose registers (GPRs), a tick timer (TTimer), a programmable interrupt controller (PIC), an advanced power management unit (PMU), and optionally a debug unit (DBGU). The core’s processing capabilities can be enhanced further with the optional hardware Multiply-Accumulate (MAC), IEEE 754-compliant floating-point, and DSP instruction acceleration units. Its interrupt response time can also be optimized with the addition of a Vectored Interrupt Controller (VIC).
The BA22-CE supports the variable instruction length BA2 instruction set, benefits from its extreme code density, and is binary compatible with other members of the BA2x processor family. Programming is facilitated with the included C/C++ tool chain, Eclipse IDE, architectural simulator, and ported C libraries. Advanced debugging capabilities and off-the-shelf development boards can further ease software development.
In its minimum configuration, the BA22-CE core synthesizes to approximately 30k eq. gates (22,000 sq. um in a 28nm technology, excluding the SRAMs required for the caches). The processor core can be clocked at over 800MHz in 28nm and 16nm processes, and its performance is rated at 2.93 CoreMarks/MHz.
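As a rough illustration of what the per-MHz rating implies, the sketch below multiplies the quoted 2.93 CoreMarks/MHz by a clock frequency. It assumes the score scales linearly with clock speed, which in practice holds only while the caches or tightly coupled memories keep up with the core; the function name and the sample frequencies are illustrative, not part of any CAST tool.

```python
# Back-of-the-envelope CoreMark estimate from the per-MHz rating.
# Assumption: linear scaling with clock frequency (no memory stalls).

COREMARK_PER_MHZ = 2.93  # rating quoted for the BA22-CE


def estimated_coremark(clock_mhz, per_mhz=COREMARK_PER_MHZ):
    """Return the estimated CoreMark score at the given clock in MHz."""
    if clock_mhz < 0:
        raise ValueError("clock frequency must be non-negative")
    return per_mhz * clock_mhz


if __name__ == "__main__":
    # Sample frequencies chosen for illustration only.
    for mhz in (100, 400, 800):
        print(f"{mhz} MHz -> about {estimated_coremark(mhz):.0f} CoreMarks")
```

At the 800MHz figure quoted for 28nm and 16nm processes, this estimate comes to roughly 2,344 CoreMarks.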
Additional microcontroller peripherals may be ordered for pre-integration and delivery with the core, individually or in a complete platform. IP Integration Services are also available to help integrate the processor with memory controllers, image compression, or other CAST IP cores.
Part of the royalty-free BA2x family, the BA22-CE processor core has been designed for easy reuse and integration, has been rigorously verified, and is production proven.
The BA2 instruction set provides extreme code density without compromising performance, ease of use, or scalability. It features:
- A linear, 32-bit address space
- Variable length instructions: 16, 24, 32, or 48 bits
- Simple memory addressing modes
- 16 to 32 general-purpose registers
- Efficient flow-control, arithmetic, and load/store instructions
- Floating point and DSP extensions
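To make the effect of variable-length encoding concrete, the sketch below totals the code size of an instruction mix using the four BA2 instruction lengths listed above and compares it with the same number of instructions at a fixed 32 bits each. The instruction mix is invented purely for illustration and is not measured BA2 data; a real fixed-width instruction set would also need a different number of instructions for the same program, so the comparison is only indicative.

```python
# Code-size arithmetic for a variable-length instruction encoding.
# Assumption: the mix passed in is hypothetical, not a BA2 measurement.

VALID_LENGTHS_BITS = (16, 24, 32, 48)  # instruction lengths listed for BA2


def code_size_bytes(mix):
    """mix maps instruction length in bits to the number of instructions."""
    total = 0
    for bits, count in mix.items():
        if bits not in VALID_LENGTHS_BITS:
            raise ValueError(f"unsupported instruction length: {bits} bits")
        total += count * bits // 8
    return total


def fixed_width_size_bytes(mix, width_bits=32):
    """Size of the same instruction count at one fixed width."""
    return sum(mix.values()) * width_bits // 8


if __name__ == "__main__":
    hypothetical_mix = {16: 500, 24: 300, 32: 150, 48: 50}  # made-up numbers
    variable = code_size_bytes(hypothetical_mix)
    fixed = fixed_width_size_bytes(hypothetical_mix)
    print(f"variable-length: {variable} bytes, fixed 32-bit: {fixed} bytes")
```

For this made-up mix of 1,000 instructions, the variable-length encoding takes 2,800 bytes against 4,000 bytes at a fixed 32 bits.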
The core is delivered with BeyondStudio™, a complete Integrated Development Environment (IDE) for Windows and Linux under Eclipse. BeyondStudio includes a highly featured source code editor, supports graphical source-level debugging and GUI-based configuration, and can be extended with a collection of available or custom plug-ins.
The IDE integrates an instruction-level simulator and a GNU cross-compiling toolchain. The GNU Compiler Collection (GCC) includes front ends for C, C++, Objective-C, Fortran, Java, and Ada; libraries for these languages (e.g., libstdc++ and libgcj) are provided. The toolchain also includes the GNU Binutils collection of binary tools and the GNU Project Debugger (GDB).
Extensive library support enables easy application development for Linux and Android. Finally, hardware targets can be interfaced with the cost-effective Beyond Debug Key, which, in addition to standard JTAG (IEEE 1149.1 and IEEE 1149.7), also supports the proprietary One Wire Debug and Two Wire Debug protocols.
Support and Services
The core as delivered is warranted against defects for 90 days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available.
IP Integration Services are also available to help minimize time to market for BA22-based systems. The processor core can be delivered pre-integrated with typical microcontroller peripherals such as UARTs, timers, and serial communication cores, or with memory controllers and interconnect IP cores. Contact CAST Sales for details.
Deliverables
The core is available for ASICs in synthesizable Verilog source code or for FPGAs in optimized netlists. It includes everything required for successful implementation: extensive documentation, a testbench, a sample SoC design, sample synthesis and simulation scripts, and the BeyondStudio™ Eclipse-based software development IDE for Windows and Linux.
Reference designs on FPGA boards are also available; contact CAST Sales for information.
The BA22-CE can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The following are sample ASIC pre-layout results reported from synthesis with a silicon vendor design kit under typical conditions, with all core I/Os assumed to be routed on-chip. The numbers are for the core implemented with 4-way associative cache memories. They do not represent the highest speed or smallest area achievable for the core, and the area figures do not include the cache RAMs. Please contact CAST for characterization data for your target configuration and technology.
ASIC Technology | Freq. (MHz) | Area (µm²) | Logic Eq. Gates
TSMC 130nm (wl30, typ) | 50 | 241195.4 | 28,419
TSMC 130nm (wl30, typ) | 150 | 286847.0 | 33,798
TSMC 130nm (wl30, typ) | 225 | 431501.1 | 50,842
TSMC 90nm (wl30, typ) | 50 | 126612.9 | 25,634
TSMC 90nm (wl30, typ) | 150 | 146659.0 | 29,692
TSMC 90nm (wl30, typ) | 200 | 165317.8 | 33,470
TSMC 65nm (wl30, typ) | 50 | 75978.4 | 18,994
TSMC 65nm (wl30, typ) | 200 | 100748.4 | 25,187
TSMC 65nm (wl30, typ) | 300 | 144512.0 | 36,128
Area, power, and speed depend on configuration, optimizations, process, and libraries. Furthermore, power consumption depends on power management, software, and memory configuration.
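The relationship between the area and equivalent-gate columns of the ASIC results can be checked with a short calculation. The sketch below copies the figures from the ASIC results above and computes the area per equivalent gate for each row, plus the area growth between the lowest and highest listed frequency in each technology. The function and variable names are illustrative, and reading the near-constant ratio as a single reference gate area per library is an inference from the numbers, not a statement from CAST.

```python
# Area-per-gate check on the ASIC synthesis results listed above.
# Each row is (frequency in MHz, area in square microns, equivalent gates).

ASIC_RESULTS = {
    "TSMC 130nm": [(50, 241195.4, 28419), (150, 286847.0, 33798), (225, 431501.1, 50842)],
    "TSMC 90nm": [(50, 126612.9, 25634), (150, 146659.0, 29692), (200, 165317.8, 33470)],
    "TSMC 65nm": [(50, 75978.4, 18994), (200, 100748.4, 25187), (300, 144512.0, 36128)],
}


def area_per_gate(results=ASIC_RESULTS):
    """Square microns per equivalent gate, for every row of every technology."""
    return {
        tech: [area / gates for _, area, gates in rows]
        for tech, rows in results.items()
    }


def area_growth(results=ASIC_RESULTS):
    """Area at the highest listed frequency relative to the lowest one."""
    return {tech: rows[-1][1] / rows[0][1] for tech, rows in results.items()}


if __name__ == "__main__":
    for tech, ratios in area_per_gate().items():
        formatted = ", ".join(f"{r:.3f}" for r in ratios)
        print(f"{tech}: {formatted} um^2 per gate")
    for tech, growth in area_growth().items():
        print(f"{tech}: area grows {growth:.2f}x from lowest to highest frequency")
```

The ratio stays close to 8.49 µm² per gate at 130nm, 4.94 at 90nm, and 4.00 at 65nm across all listed frequencies, so the extra area at higher clock targets appears as proportionally more equivalent gates rather than as a change in gate size.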
The BA22-CE can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The following results, reported by Altera tools, assume 32 GPRs, a 4kx64 RAM connected to the IQEM bus, an 8kx32 RAM connected to the DQEM bus, an 8 Kbyte 2-way associative instruction cache, a 2 Kbyte 2-way associative data cache, that all clocks are driven by a common source, and that all core I/Os are routed off-chip. The figures do not represent the highest speed or smallest area achievable for the core. Area, power, and speed depend on core configuration and tool optimizations. Furthermore, power consumption depends on power management, software, and memory configuration. Please contact CAST for characterization data for your target configuration and technology.
Family / Device | Logic Area | Freq. (MHz) | Memory*
Cyclone IV-E EP4CE75F29C6 | 8,788 LEs | 58 | 84 M9Ks
Cyclone V** 5CGXFC7D6F31C6 | 4,696 ALUTs | 90 | 84 M10Ks
Stratix IV EP4SE820H35C3 | 5,437 ALUTs | 131 | 82 M9Ks
Stratix V 5SGXMA7H1F35C1 | 5,203 ALUTs | 153 | 50 M20Ks
* Memory required for the implementation of QMEMs and caches, not the CPU.
** Cyclone V implementation does not include the debug unit.
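The memory column is dominated by the QMEM and cache configuration assumed for these results. The sketch below adds up the data storage implied by that configuration: a 4kx64 RAM, an 8kx32 RAM, an 8 Kbyte instruction cache, and a 2 Kbyte data cache. It assumes "k" means 1,024 words and "Kbyte" means 1,024 bytes, and it deliberately ignores cache tag and status bits, whose size is not given here; all names are illustrative.

```python
# Data-storage footprint of the memory configuration assumed for the FPGA results.
# Assumptions: 1k = 1024 words, 1 Kbyte = 1024 bytes, cache tag/status bits ignored.

K = 1024


def ram_bytes(words, width_bits):
    """Capacity in bytes of a RAM with the given depth and word width."""
    return words * width_bits // 8


def memory_footprint():
    """Return the per-memory data capacity in bytes, plus the total."""
    sizes = {
        "instruction QMEM (4k x 64)": ram_bytes(4 * K, 64),
        "data QMEM (8k x 32)": ram_bytes(8 * K, 32),
        "instruction cache": 8 * K,
        "data cache": 2 * K,
    }
    sizes["total"] = sum(sizes.values())
    return sizes


if __name__ == "__main__":
    for name, size in memory_footprint().items():
        print(f"{name}: {size} bytes ({size // K} KiB)")
```

Under these assumptions each QMEM holds 32 KiB and the total data storage is 74 KiB, most of which is tightly coupled memory rather than cache.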
The BA22-CE can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The following results, reported by AMD tools, assume 32 GPRs, a 4kx64 RAM connected to the IQEM bus, an 8kx32 RAM connected to the DQEM bus, an 8 Kbyte 2-way associative instruction cache, a 2 Kbyte 2-way associative data cache, that all clocks are driven by a common source, and that all core I/Os are routed off-chip. The figures do not represent the highest speed or smallest area achievable for the core. Area, power, and speed depend on core configuration and tool optimizations. Furthermore, power consumption depends on power management, software, and memory configuration. Please contact CAST for characterization data for your target configuration and technology.
Family / Device | Logic (Slices) | Freq. (MHz) | Memory* (BRAM)
Spartan-6 XC6SLX150T-3 | 1,988 | 91 | 38
Virtex-6 XC6SLX130T-3 | 2,336 | 128 | 40
* Memory required for the implementation of QMEMs and caches, not the CPU.
Engineered by Beyond Semiconductor.
Features List
High Performance 32-bit CPU
- 2.93 CoreMarks/MHz
- Single-cycle instruction execution on most instructions
- Fast and precise internal interrupt response
- Hardware Multiply Unit
- Optional hardware divide, multiply-accumulate, DSP instructions acceleration, and floating-point units
Low Power Consumption
- Industry-leading code density minimizes instruction memory area & power
- Small silicon footprint (from 22,000 sq. um in 28 nm, or 30K eq. gates) for lower leakage and dynamic power
Fast & Flexible Memory Access
- Harvard-style, separate instruction and data caches
- Tightly coupled Quick Memory for fast and deterministic access to code and/or data
- Optional Memory Protection Unit
Efficient Power Management
- Dynamic clock gating and power shut-off of unused units
- Software- and hardware-controlled clock frequency
- Wake-up on tick timer or external interrupt
Advanced Debug Capability
- Non-intrusive debug/trace for both CPU and system
- Complex chained watchpoint and breakpoint conditions
- Standard JTAG and proprietary Two-Wire Debug interfaces
Integrated Peripherals
- Base configuration includes a 32-bit-wide tick timer and a programmable interrupt controller
- Optionally pre-integrated with AMBA bus infrastructure, DMAs, GPIOs, UARTs, timers, SPI, I2C, memory controllers, and other IP cores from CAST
Easy Software Development
- BeyondStudio IDE for Windows & Linux
- C/C++ compiler, debugger, linker, assembler, and utilities
- Architectural simulator
- Ported libraries and RTOS