Fast Inline Cipher Engine, AES-XTS/GCM, SM4-XTS/GCM, DPA

Overview

The ICE-IP-338 (EIP-338) Inline Cipher Engine is a scalable, high-performance, multi-stream inline cryptographic engine that offers XTS and GCM modes of operation on bulk data for the AES and, optionally, SM4 algorithms. Its flexible data path scales from 50 Gbps to 2 Tbps, providing a tailored engine with minimal area for your application.

The flexible interface makes it possible to perform processing for many different applications and protocols, including inline memory encryption, inline disk encryption, MACsec, IPsec and OTN security. The multi-stream architecture allows interleaved data processing for many independent data streams simultaneously. Switching between streams can be done every clock cycle without loss of performance. Data is processed without flow control and with fixed latency, dependent on the static configuration selected.
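
As a rough illustration of the interleaving described above, the following Python sketch models a word stream in which every word carries a stream identifier, and per-stream context is looked up and updated one word at a time. The names (`Word`, `StreamContext`, `process`) and the placeholder transformation are hypothetical and are not part of the ICE-IP-338 interface; no real cipher is applied.

```python
from dataclasses import dataclass


@dataclass
class StreamContext:
    """Hypothetical per-stream state kept between words."""
    words_seen: int = 0


@dataclass(frozen=True)
class Word:
    stream_id: int  # sideband: which stream this word belongs to
    data: bytes     # one bus word (16 bytes for a 128-bit data path)


def process(words, contexts):
    """Handle one word per 'clock', switching stream context on every word.

    The transformation is a placeholder (XOR with the per-stream word index),
    chosen only so that the output depends on per-stream state.
    """
    out = []
    for w in words:
        ctx = contexts.setdefault(w.stream_id, StreamContext())
        mask = ctx.words_seen & 0xFF
        out.append(Word(w.stream_id, bytes(b ^ mask for b in w.data)))
        ctx.words_seen += 1
    return out


def per_stream(words, stream_id):
    """Extract the data words of a single stream, in order."""
    return [w.data for w in words if w.stream_id == stream_id]


# Two streams, interleaved word by word.
a = [Word(0, bytes([i]) * 16) for i in range(3)]
b = [Word(1, bytes([0x80 + i]) * 16) for i in range(3)]
interleaved = [w for pair in zip(a, b) for w in pair]

mixed = process(interleaved, {})
alone_a = process(a, {})
alone_b = process(b, {})
```

In this model each stream's output is the same whether or not it shares the data path with another stream, which is the property that allows words from independent streams to be interleaved freely.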

The ICE-IP-338 data path can be scaled to widths that are multiples of 128 bits, allowing a tradeoff between area and performance that best fits the target application. Configuration options allow support for CipherText Stealing (CTS), the GCM mode, the SM4 algorithm, and Datapath Integrity logic to be included or excluded. The cryptographic AES and SM4 primitives can be provided with or without DPA (differential power analysis) side-channel countermeasures.
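
To make the area/performance tradeoff concrete, the sketch below estimates raw throughput as data-path width times clock frequency, assuming one full, aligned word is accepted every clock. The function names and the 1 GHz example clock are illustrative assumptions, not vendor-specified operating points; real throughput also depends on data alignment and, for XTS, on the configured number of tweak encryption and CTS cores.

```python
import math

BLOCK_BITS = 128  # data-path width is configured in multiples of 128 bits


def raw_throughput_gbps(width_bits: int, clock_ghz: float) -> float:
    """Upper-bound throughput in Gbps for one aligned word per clock."""
    if width_bits <= 0 or width_bits % BLOCK_BITS:
        raise ValueError("width must be a positive multiple of 128 bits")
    return width_bits * clock_ghz


def min_width_bits(target_gbps: float, clock_ghz: float) -> int:
    """Smallest multiple of 128 bits that reaches the target at this clock."""
    blocks = math.ceil(target_gbps / (BLOCK_BITS * clock_ghz))
    return max(1, blocks) * BLOCK_BITS


if __name__ == "__main__":
    for target in (50, 400, 2000):  # Gbps, assumed example targets
        width = min_width_bits(target, clock_ghz=1.0)
        print(target, "Gbps ->", width, "bits at an assumed 1 GHz clock")
```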

On-chip SRAM external to the ICE-IP-338 is used to store the key database as well as various precomputes and state information for each of the streams the engine is processing in interleaved fashion.

Key Features

  • Performance and Configuration
    • One input word per clock without any backpressure
    • Design can switch stream, algorithm, mode, key and/or direction every clock cycle
    • GCM: throughput is solely determined by the data width, data alignment and clock frequency
    • XTS: block processing rate may be limited by the number of configured tweak encryption & CTS cores; a configuration allowing 1 block/clock is available
    • Design achieves up to 2 GHz in 7nm technologies (DPA: 1 GHz)
    • Datapath integrity configuration option
  • Low Latency with Zero Variation
    • Low-latency processing with fixed latency per static pipeline configuration
    • Pipeline can be statically configured to reduce latency in cases where certain modes or algorithms are not in use
  • Cryptographic Processing
    • Algorithms: AES or AES and SM4
    • Modes of operation: XTS or XTS and GCM
    • Supported key sizes: 128 and 256 bits
    • Regular or DPA protected implementation
    • Bi-directional design: direction is selected on a per-key (for XTS) or per-packet (for GCM) basis
    • Uni-directional XTS design option available for reduced area
    • Tag output
    • External 96-bit IV generation that allows supporting various use cases
  • Packet Interface
    • Push-bus time-sliced interface without handshake
    • Each data word may belong to a different stream
    • Sideband signals for control and processing status
    • Configurable bus width, depending on desired throughput in multiples of 128-bit units
  • Control Plane Interface
    • Key loading interface can easily be mapped to a 32-bit wide host interface
    • Key set separate from stream state; allows for many parallel streams sharing a limited set of keys to reduce storage requirements
  • External Memory Interface
    • Interfaces to buffer data and control information
    • Interfaces are for 1R/1W memory with 2-cycle read latency to allow inserting ECC logic
    • ECC uncorrectable status input
    • Some memories have per-word chip selection for efficient power usage
  • Clocking
    • Single-clock synchronous design with a number of switchable clock domains for efficient power usage
  • Compliance
    • FIPS 197, IEEE-P1619/D16
    • NIST CAVP (to be used in a FIPS 140-3 compliant product)
    • NIST SP800-38A: AES-CTR
    • NIST SP800-38D: AES-GMAC, AES-GCM
    • NIST SP800-38E: AES-XTS
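
For the XTS mode listed above, IEEE 1619 derives the tweak for each successive 128-bit block of a data unit by multiplying the encrypted initial tweak by α in GF(2^128). The sketch below shows that standard update step in Python as background only; it is not a description of the ICE-IP-338 internals, and the function names are illustrative. The initial encryption of the tweak (one block-cipher operation per data unit) is omitted, and the byte ordering follows the IEEE 1619 convention.

```python
MASK_128 = (1 << 128) - 1
REDUCTION = 0x87  # x^128 = x^7 + x^2 + x + 1 in the XTS field


def xts_next_tweak(tweak: bytes) -> bytes:
    """Multiply a 16-byte tweak by alpha (x), with byte 0 least significant."""
    if len(tweak) != 16:
        raise ValueError("tweak must be 16 bytes")
    t = int.from_bytes(tweak, "little") << 1
    if t >> 128:  # a bit was shifted out of the top: reduce
        t = (t & MASK_128) ^ REDUCTION
    return t.to_bytes(16, "little")


def xts_tweaks(encrypted_tweak: bytes, n_blocks: int):
    """Yield the tweak for each of n_blocks consecutive 128-bit blocks."""
    t = encrypted_tweak
    for _ in range(n_blocks):
        yield t
        t = xts_next_tweak(t)
```

Only the first tweak of a data unit needs a block-cipher operation; the remaining ones follow from this shift-and-reduce step.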

Benefits

  • The ICE-IP-338 is a data-processing engine and contains input/output data interfaces and interfaces intended for supplying key material that is stored in the engine’s local SRAM.
  • Before cryptographic processing can start, the Host CPU transfers the key material, together with the algorithm and mode of operation to use, to one of the key slots in the engine. Key material can be shared between multiple streams and many blocks while the key remains available in local SRAM.
  • The Tweak (for XTS) or IV (for GCM) is provided prior to or at the same time as the first data word, together with a reference to the key slot and, for GCM, the direction of processing. After processing, the ICE-IP-338 outputs the result data and, in GCM mode, the authentication tag together with the last output data word.
  • The external system is responsible for the following items:
    • Per-block Tweak or IV generation
    • Key lifetime management, to ensure that the key is refreshed when the current key expires
    • XTS decrypt key generation in case of an engine configuration without Decrypt Key generator
    • Reacting to processing errors reported by the ICE-IP-338
  • Separate IP cores can be provided to assist with Tweak or Decrypt Key generation. 
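
Since IV generation and key lifetime management are left to the external system, a host-side helper might look like the sketch below. It builds a 96-bit GCM IV from a fixed field and an incrementing counter, in the spirit of the deterministic construction in NIST SP 800-38D, and signals when the key should be refreshed. The class name, the 32/64-bit field split, and the `max_invocations` limit are all assumptions made for illustration; they are not taken from the ICE-IP-338 documentation.

```python
class KeyExpired(Exception):
    """Raised when the IV budget for the current key is used up."""


class GcmIvGenerator:
    """Hypothetical 96-bit IV source: 4-byte fixed field + 8-byte counter."""

    def __init__(self, fixed_field: bytes, max_invocations: int):
        if len(fixed_field) != 4:
            raise ValueError("fixed field must be 4 bytes in this sketch")
        if not 0 < max_invocations <= 1 << 64:
            raise ValueError("max_invocations must fit the 64-bit counter")
        self._fixed = fixed_field
        self._max = max_invocations
        self._count = 0

    def next_iv(self) -> bytes:
        """Return a fresh 12-byte IV, never repeating under the same key."""
        if self._count >= self._max:
            raise KeyExpired("refresh the key before issuing more IVs")
        iv = self._fixed + self._count.to_bytes(8, "big")
        self._count += 1
        return iv

    def rekey(self) -> None:
        """Call after a fresh key has been loaded; the counter restarts."""
        self._count = 0
```

The point of the design is that an IV is never reused with the same key: when the budget is exhausted the caller has to load a new key before calling `rekey()` and continuing.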

Block Diagram

Fast Inline Cipher Engine, AES-XTS/GCM, SM4-XTS/GCM, DPA Block Diagram

Deliverables

  • Packages
    • RTL IP
    • Driver Development Kit
  • Complete Documentation
    • Hardware integration guide
    • Hardware integration manual
  • Tools and Scripts
    • Verilog for synthesis and simulation
    • All scripts and support files needed for standard EDA tool flows
  • Integration Support
    • Complete verification test bench
    • Comprehensive set of test vectors

Technical Specifications

  • Foundry, Node: Any
  • Maturity: In production
  • Availability: Now
  • TSMC Silicon Proven: 7nm, 16nm, 28nm, 40nm G