Using peripheral DMA boosts networked 32-bit MCU security and bandwidth
By Dany Nativel, Jacko Wilbrink and Tim Morin, Embedded.com
Sep 9 2005 (10:00 AM)
Increasingly, embedded control systems are likely to be connected to local area networks that have dozens or even thousands of nodes. These include controller area networks (CAN) with up to one hundred nodes, ZigBee wireless control networks with thousands of interlinked nodes, and even Ethernet networks with theoretically unlimited nodes.
These localized embedded networks are, in turn, increasingly connected to the other networks and the outside world via corporate intranets, extranets and, most significantly, the Internet.
Networking MCUs Mandates Data Security
The challenge facing MCU chip designers and the developers who build systems using them is that the connected devices must be able to transfer and verify large amounts of data over multiple network protocols while providing a high degree of security, especially in wirelessly connected embedded MCU designs.
Controlling a geographically dispersed array of embedded systems over a public network, such as the Internet, radically increases the need for security because it opens up access to those systems. You would not want an outsider hacking your building security or HVAC systems. And you really wouldn’t want anyone to shut down the power grid, or release all the water from a dam on short notice, or open a valve on a gas pipeline. Therefore, access to embedded networks must be controlled and data must be protected by using encryption algorithms such as Advanced Encryption Standard (AES), Data Encryption Standard (DES) or Triple Data Encryption Standard (TDES).
Selecting an encryption algorithm depends on how much processing power is available, how quickly the encryption needs to happen, and how secure or “uncrackable” the data needs to be. Any encryption code can be “cracked” if enough computing power is thrown at it. More complex encryption algorithms are harder to crack, but they also require more computation. Thus, a balance must be struck between the level of security and the required throughput of the application.
Encryption is computationally intensive, frequently requiring dedicated external processors. Simpler encryption algorithms with small keys and/or small data streams can be executed in software. However, if the data rates are high and/or the data has stringent security requirements, more complex algorithms and longer keys are necessary.
At 50 MHz, an ARM7-based 32-bit MCU can execute software AES encryption at 4.3 Mbps. Not only is this too slow for many applications, but the ARM7 is also unable to execute any of its control functions while it is encrypting or decrypting data. The ARM7 essentially becomes a software co-processor. Large volumes of data make a software implementation in the ARM7 unacceptable. More rigorous algorithms make it impossible.
The most practical solution is to embed an encryption engine directly on the microcontroller that can execute AES and Triple DES independently. Here again bandwidth is a primary issue. Embedding an encryption engine on Atmel’s SAM7X controllers increases AES encryption throughput to 20 Mbps, DES to 12.8 Mbps and Triple DES to 11.2 Mbps.
Although these encryption rates are substantially faster than software implementations, they may not be sufficient for many high data rate Ethernet applications. Higher data rates can be achieved by augmenting the MCU hardware with a peripheral DMA controller (PDC). Use of an enhanced PDC increases AES encryption/decryption throughput to 80 Mbps, sufficient for high bandwidth data transfers. DES bandwidth more than doubles to 32.8 Mbps and TDES nearly doubles to 20 Mbps.
Boosting secure ARM MCU operations with PDC functionality
Even without factoring in the high level of security required in most connected MCU applications, vendors who have begun to offer ARM7 MCUs with a variety of network interfaces (CAN, Ethernet and/or USB, TWI, SPI, and USART) are finding that there is more to networking such devices than just adding an interface and a protocol stack.
Putting a 10/100 Ethernet MAC or CAN or USB on an ARM7 is not sufficient to network embedded control. The processor must be able to move the data around at the required rate. When you consider that the data rate for full speed USB 2.0 is 12 Mbps, the CAN data rate is 1 Mbps, Ethernet is 100 Mbps, and SPI and USART peripherals can run at 25 Mbps, it becomes quite clear that the issue of data transfer must be dealt with in any extensively connected embedded control system.
The core processor has to be augmented so it can handle the huge volumes of data that are likely to pass through it. The ARM7 core in and of itself may not be up to the task (see Table 1 below). The CPU must directly handle all data transfers one-byte-at-a-time. At 50 MHz, a 2 megabits per second (Mbps) data transfer eats up 55 percent of the ARM7’s resources; at 4 Mbps all the processor’s resources are dedicated to data transfers. There are no cycles left to execute its real-time control application.
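The load figures above imply a per-byte cost that is easy to back out. The following is a rough sketch only: the function names are hypothetical, and the second function assumes that CPU load scales linearly with the data rate, which the article does not state. With the quoted figures (50 MHz, 2 Mbps, 55 percent), the cost works out to 110 cycles per transferred byte.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles the CPU spends per transferred byte, derived from the clock
 * frequency, the link rate in bits per second, and the percentage of
 * CPU time the transfer consumes. Hypothetical helper, not vendor code. */
uint32_t cycles_per_byte(uint32_t cpu_hz, uint32_t link_bps, uint32_t load_percent)
{
    assert(link_bps >= 8u);
    uint32_t bytes_per_sec = link_bps / 8u;
    uint64_t busy_cycles = (uint64_t)cpu_hz * load_percent / 100u;
    return (uint32_t)(busy_cycles / bytes_per_sec);
}

/* CPU load in percent for a given per-byte cost, ASSUMING the load
 * grows linearly with the data rate. A result of 100 or more means
 * the CPU cannot sustain the transfer. */
uint32_t cpu_load_percent(uint32_t cpu_hz, uint32_t link_bps, uint32_t cycles_byte)
{
    assert(cpu_hz > 0u);
    uint64_t busy_cycles = (uint64_t)(link_bps / 8u) * cycles_byte;
    return (uint32_t)(busy_cycles * 100u / cpu_hz);
}
```

Under that linear assumption, doubling the rate to 4 Mbps pushes the computed load past 100 percent, which is consistent with the statement that no cycles remain for the control application.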
At the same time, streaming encryption must support the data rate of the transferring peripheral. Therefore, encrypting a data stream for a high speed SPI or USART transfer requires encryption bandwidth that approaches 25 Mbps. Streaming Ethernet encryption must approach 100 Mbps. In software, the ARM7 can do AES encryption at only 4.3 Mbps, but that is all it can do. It becomes a dedicated encryption software processor. By adding an encryption engine to an ARM7TDMI-based MCU, such as Atmel’s SAM7X, streaming encryption can accelerate to 20 Mbps for AES, 12.8 Mbps for DES, and 11.2 Mbps for TDES.
The hardware encryption engine also simplifies the user interface while offering the various complex modes defined in the AES and TDES specifications. Basically, the message to encrypt or decrypt is passed to the AES or TDES engine through a set of dedicated registers. The encryption key is then placed into another set of registers, and finally the encryption/decryption process is initiated using a special configuration register. Depending on the operation, plaintext or encrypted data can then be found in a set of output data registers.
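The register sequence just described (load the message, load the key, start the engine, read the result) can be sketched against a mock register block. Everything here is an assumption made for illustration: the struct layout, the bit names, and the function names are hypothetical and are not the SAM7X register map, and the mock "engine" simply XORs data with the key. It is not AES or TDES; it only stands in for the hardware so the sequence can run on a host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_WORDS 4u   /* one 128-bit block */
#define KEY_WORDS   4u   /* one 128-bit key */
#define CTRL_START  1u   /* hypothetical "start" bit */
#define STAT_READY  1u   /* hypothetical "data ready" bit */

/* Hypothetical register block. On real hardware this would be a
 * volatile memory-mapped structure at a fixed address. */
typedef struct {
    uint32_t data_in[BLOCK_WORDS];
    uint32_t key[KEY_WORDS];
    uint32_t control;
    uint32_t status;
    uint32_t data_out[BLOCK_WORDS];
} mock_crypto_regs_t;

/* Stand-in for the hardware: XOR with the key. NOT a real cipher. */
void mock_engine_step(mock_crypto_regs_t *r)
{
    if (r->control & CTRL_START) {
        for (uint32_t i = 0; i < BLOCK_WORDS; i++)
            r->data_out[i] = r->data_in[i] ^ r->key[i];
        r->control &= ~CTRL_START;
        r->status |= STAT_READY;
    }
}

/* Process one block using the four steps from the text. */
void crypt_block(mock_crypto_regs_t *r, const uint32_t key[KEY_WORDS],
                 const uint32_t in[BLOCK_WORDS], uint32_t out[BLOCK_WORDS])
{
    assert(r && key && in && out);
    memcpy(r->data_in, in, sizeof r->data_in);    /* 1. message registers */
    memcpy(r->key, key, sizeof r->key);           /* 2. key registers */
    r->status = 0;
    r->control |= CTRL_START;                     /* 3. start the engine */
    while (!(r->status & STAT_READY))
        mock_engine_step(r);   /* real code would poll or take an interrupt */
    memcpy(out, r->data_out, sizeof r->data_out); /* 4. output registers */
}
```

Note that in this form the CPU still copies every block in and out of the registers itself, which is the per-block work a DMA controller can take over.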
Automatic DMA block transfer boosts MCU encryption
The entire operation can be simplified even further using the automatic block transfer mechanism provided by a peripheral DMA controller. Besides boosting the encryption speed, it allows the end user to encrypt or decrypt data in blocks of bytes instead of single bytes. Basically, the AES and TDES engine embeds dedicated peripheral DMA registers that contain the address of the source data buffer, the number of transfers or encryption/decryption operations (up to 64K transfers) and, finally, the address of the output data buffer. The defined block is processed in the background without any CPU intervention. A new dual-pointer mode is now available on the ARM7TDMI-based SAM7X that removes the limit of 64K transfers on the peripheral DMA by automatically switching buffers when one is empty.
The peripheral DMA controller operates independently of the processor, eliminating interrupt overhead and radically reducing the number of CPU clock cycles required for a data transfer. Each peripheral in the architecture has two dedicated PDC channels, one each for receiving and transmitting data.
The user interface of a PDC channel is integrated in the memory space of each peripheral, and contains a 32-bit memory pointer register, a 16-bit transfer count register, a 32-bit register for the next memory pointer, and a 16-bit register for the next transfer count. Multiple contiguous blocks of data from more than one peripheral can be transferred using the PDC, thereby removing the burden of moving data from the processor and sustaining high-speed data transfers on any peripheral.
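The four registers and the automatic buffer switch can be modeled in a few lines of host-side C. This is a simulation sketch under stated assumptions, not the device's register map: the type and function names are hypothetical, the "memory pointer" is treated as an offset into a simulated memory array, and the reload rule (when the current count reaches zero and a next count is pending, the next pointer and count become current) is one reading of the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one PDC receive channel. */
typedef struct {
    uint32_t ptr;         /* 32-bit memory pointer (offset in this model) */
    uint16_t count;       /* 16-bit transfer count */
    uint32_t next_ptr;    /* 32-bit next memory pointer */
    uint16_t next_count;  /* 16-bit next transfer count */
} pdc_channel_t;

/* Move one received byte into simulated memory, as the DMA hardware
 * would do without CPU involvement. Returns 1 if the byte was stored,
 * 0 if the channel has no buffer left. */
int pdc_receive_byte(pdc_channel_t *ch, uint8_t *mem, size_t mem_size, uint8_t byte)
{
    assert(ch && mem);
    if (ch->count == 0)
        return 0;                      /* channel idle: nothing programmed */
    assert(ch->ptr < mem_size);
    mem[ch->ptr++] = byte;
    ch->count--;
    if (ch->count == 0 && ch->next_count != 0) {
        /* Current buffer exhausted: switch to the queued buffer. */
        ch->ptr = ch->next_ptr;
        ch->count = ch->next_count;
        ch->next_count = 0;            /* software may now queue another */
    }
    return 1;
}
```

Because software can refill the next-pointer pair while the current buffer is still draining, a stream longer than one 16-bit count can continue without a gap in this model.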
When encryption is required for secure operation, an enhanced PDC lets the MCU easily handle the additional data movement load that encryption imposes. By off-loading from the CPU the task of transferring data between the peripherals, the memories and the encryption engine, the PDC quadruples AES encryption bandwidth to 80 Mbps, and raises DES to 32.8 Mbps and TDES to 20 Mbps (see Table 2 below).
The combination of an on-chip encryption engine and a peripheral DMA controller results in streaming AES encryption bandwidth that is nearly 20 times greater than can be achieved using software encryption alone. Additionally, it frees up the processor to execute its embedded control functions.
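The gains can be checked directly from the throughput figures quoted in this article. The sketch below is only arithmetic on those numbers; the constant and function names are hypothetical, and rates are held in tenths of Mbps so that integer math suffices.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput figures quoted in the article, in tenths of Mbps. */
enum {
    AES_SOFTWARE = 43,    /*  4.3 Mbps, software on a 50 MHz ARM7 */
    AES_ENGINE   = 200,   /* 20.0 Mbps, on-chip engine alone */
    DES_ENGINE   = 128,   /* 12.8 Mbps */
    TDES_ENGINE  = 112,   /* 11.2 Mbps */
    AES_PDC      = 800,   /* 80.0 Mbps, engine plus PDC */
    DES_PDC      = 328,   /* 32.8 Mbps */
    TDES_PDC     = 200    /* 20.0 Mbps */
};

/* Ratio after/before expressed in tenths (186 means 18.6x),
 * rounded to the nearest tenth. */
uint32_t speedup_x10(uint32_t after, uint32_t before)
{
    assert(before > 0u);
    return (after * 100u / before + 5u) / 10u;
}
```

With these inputs, `speedup_x10(AES_PDC, AES_SOFTWARE)` gives 186, or about 18.6 times the software rate, and the PDC's contribution over the engine alone comes to 4.0 times for AES, about 2.6 times for DES and about 1.8 times for TDES.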
Dany Nativel is ARM Technical Marketing Manager, Jacko Wilbrink is ARM Marketing Manager, and Tim Morin is North American ARM Business Development, at Atmel Corp. in San Jose, Calif.