Atmel claims 20-fold ARM7TDMI MCU speed boost with DMA addition

By Bernard Cole, Embedded.com
Aug 17 2005 (15:09 PM)

San Jose, Calif. - Atmel Corp. has begun revamping its 32-bit ARM7TDMI-based microcontroller devices with a new peripheral DMA controller (PDC) architecture that it believes will significantly improve performance in networked and secure application environments. In September, it will roll out the first devices in its Smart ARM Microcontroller (SAM) family with the new PDC enhancement.

In benchmarks that the company has run on devices operating in secure wired or wireless environments, a 50 MHz ARM7-based MCU running AES encryption in software is capable of only 4.3 Mb/s of bandwidth, with no room left over for essential MCU operations. When encryption is done in hardware, throughput increases to 20 Mb/s. With the new peripheral DMA controller enhancements, the data rate increases to 80 Mb/s.

“The key to the performance enhancement in a secure, networked MCU environment lies not in a more powerful CPU, or a faster encryption engine,” said Tim Morin, Atmel’s director of business development for the 32-bit ARM MCU product line, “but in the way MCUs directly access and manage data streams to and from device peripherals and memory.”

He said that the problem that his company has had - in common with most licensees of the popular and low-cost ARM7TDMI core - is that, in the effort to come up with a design that appealed to the broadest range of developers, ARM did not incorporate any mechanisms for direct memory access operations. And simply adapting one of the many alternative DMA structures commonly in use in RISC architectures, said Morin, does not work, because traditional DMA structures transfer data only between memories, not between peripherals and memories.

The PDC architecture that the company’s engineers have come up with, said Morin, is elegantly simple and adds only 4,000 gates to the existing core architecture plus 2,500 gates per channel. “Moreover, it gives developers very precise control of the flow of data to and from the peripheral devices in their design,” he said.

In the new architectural enhancement (see figure below), most peripherals on an MCU have two dedicated PDC channels. Each channel is integrated into the memory space of its peripheral and consists of a 32-bit memory pointer register, a 16-bit transfer count register, a 32-bit register for the next memory pointer and a 16-bit register for the next transfer count, allowing transfers of up to 64k bytes of data. The PDCs transfer data between on-chip serial peripherals such as the UART, USART, SSC, SPI and MCI and the on-chip and off-chip memories.
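As a rough illustration of the register set described above, one PDC channel could be modeled in C as follows. The type and field names (pdc_channel_t, pdc_pair_t, ptr, cnt, next_ptr, next_cnt) are hypothetical and chosen for readability; the actual register names, offsets and packing in Atmel's devices may differ.

```c
#include <stdint.h>

/* Hypothetical model of one PDC channel's register set, following the
 * widths given in the article. Names and ordering are assumptions, not
 * Atmel's documented register map. */
typedef struct {
    uint32_t ptr;       /* current memory pointer (32 bit) */
    uint16_t cnt;       /* current transfer count (16 bit) */
    uint32_t next_ptr;  /* next memory pointer (32 bit) */
    uint16_t next_cnt;  /* next transfer count (16 bit) */
} pdc_channel_t;

/* Each peripheral has two channels: one to receive, one to transmit. */
typedef struct {
    pdc_channel_t rx;
    pdc_channel_t tx;
} pdc_pair_t;

/* Largest value a 16-bit transfer count register can hold. */
enum { PDC_MAX_COUNT = UINT16_MAX };
```

The 16-bit width of the count fields is what bounds a single buffer at roughly 64k transfers.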

“Using the PDC avoids processor intervention and removes the processor interrupt-handling overhead,” said Morin. “This significantly reduces the number of clock cycles required for a data transfer and as a result improves the MCU performance and makes it more power efficient.”

The PDC channels are implemented in pairs, with each pair dedicated to a particular peripheral. One channel in the pair serves the receive direction and the other the transmit direction of that peripheral device.

“What is important in embedded MCU applications linked to a network is the ability to deal with both communications and control chores and give the programmer as much control as possible over those operations,” he said, “providing him or her the ability to service the peripherals at any data rate they want and still have as much as 95 percent or better throughput available for other operations.”

Also important, he said, was the ability to provide seamless and endless transmission between memory and the peripheral devices with no interruptions. “You don’t want a transmission counter to expire, direct the processor to do something else while it resends and then come back to the stack,” said Morin. “That is a no-no in a deterministic control environment.”

To deal with both requirements and ensure that the DMA transfers operate continuously, at speed, the PDC was configured so that when one counter expires, the PDC loads the next counter into the current register, generates an interrupt and updates the next counter.

“One counter-register set is used for the initial transfer and the other set for the next transfer, allowing the programmer to ping pong the DMA count and simulate an endless DMA transfer of data to the peripherals without interruption,” said Morin.
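The reload behavior Morin describes can be sketched as a small software model. This is a simulation under stated assumptions, not Atmel's hardware logic: the names pdc_model_t, pdc_model_step and pdc_model_refill are hypothetical, the interrupt is reduced to a flag, and the model assumes the "next" registers are cleared once they have been promoted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of one PDC channel; all names are illustrative. */
typedef struct {
    uint32_t ptr, next_ptr;  /* current / next memory pointer */
    uint16_t cnt, next_cnt;  /* current / next transfer count */
    bool     irq_pending;    /* set when the current count expires */
} pdc_model_t;

/* Model one transfer cycle of `unit` bytes (1, 2 or 4).
 * Returns false when no transfer is queued (channel idle). */
static bool pdc_model_step(pdc_model_t *ch, uint32_t unit)
{
    if (ch->cnt == 0)
        return false;

    ch->ptr += unit;  /* advance through the current buffer */
    ch->cnt--;

    if (ch->cnt == 0) {
        /* Current buffer finished: promote the "next" set and
         * signal software so it can queue another buffer. */
        ch->ptr = ch->next_ptr;
        ch->cnt = ch->next_cnt;
        ch->next_ptr = 0;
        ch->next_cnt = 0;
        ch->irq_pending = true;
    }
    return true;
}

/* What an interrupt handler would do in this model: queue the next
 * buffer and acknowledge the interrupt flag. */
static void pdc_model_refill(pdc_model_t *ch, uint32_t buf, uint16_t count)
{
    ch->next_ptr = buf;
    ch->next_cnt = count;
    ch->irq_pending = false;
}
```

As long as the handler calls pdc_model_refill before the promoted buffer drains, the channel alternates between two buffers and never goes idle, which is the ping-pong effect described in the quote.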

Unlike traditional DMA structures, he said, PDC transfers are not measured in terms of 8, 16 or 32 bits, or of bytes, halfwords or words. “Rather, the counter transfers ‘cycles,’ over a 32-bit bus,” he said. The PDC defines whether the transfer cycle is byte, word or halfword.

“If the peripheral is programmed to transfer 8-bit data, the PDC can handle 64k bytes of data; if 16 bit, twice as much; and if 32 bit, four times as much. This gives the programmer considerable leeway over the DMA transfer characteristics.”

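The scaling Morin describes follows directly from counting transfer cycles rather than bytes. A minimal sketch of the arithmetic, assuming the 16-bit counter allows at most 65,535 cycles per buffer (the article rounds this to 64k); pdc_max_bytes and PDC_MAX_CYCLES are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ceiling of a 16-bit count register. */
#define PDC_MAX_CYCLES 65535u

/* Upper bound on the bytes one buffer can move for a given transfer
 * width in bytes (1 = byte, 2 = halfword, 4 = word). */
static uint32_t pdc_max_bytes(uint32_t width_bytes)
{
    assert(width_bytes == 1 || width_bytes == 2 || width_bytes == 4);
    return PDC_MAX_CYCLES * width_bytes;
}
```

Under that assumption, byte transfers top out just under 64 KB per buffer, halfword transfers just under 128 KB and word transfers just under 256 KB.
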
To simplify programming of the PDC, the programming structures have been embedded into each peripheral device; that is, the register and counter locations are in the peripheral control map.

“It is just another part of the programming task necessary to communicate with the peripheral, and from a software programmer’s point of view easily understood,” said Morin. “What the programmer sees in the peripheral control map is a list of registers and pointers: addresses and counts for the current and next counter and a control register to enable or disable it.”

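From the programmer's side, setting up a transfer might then look like the sketch below. Everything here is an assumption made for illustration: the layout of periph_pdc_regs_t, the 32-bit register slots, the PDC_CTRL_ENABLE and PDC_CTRL_DISABLE bit positions and the pdc_start helper are not taken from Atmel documentation.

```c
#include <stdint.h>

/* Hypothetical slice of a peripheral's control map holding the PDC
 * registers for one direction. Layout and bits are assumed. */
typedef struct {
    volatile uint32_t ptr;       /* current buffer address */
    volatile uint32_t cnt;       /* current count (low 16 bits used) */
    volatile uint32_t next_ptr;  /* next buffer address */
    volatile uint32_t next_cnt;  /* next count (low 16 bits used) */
    volatile uint32_t ctrl;      /* enable/disable control */
} periph_pdc_regs_t;

#define PDC_CTRL_ENABLE  (1u << 0)  /* assumed bit position */
#define PDC_CTRL_DISABLE (1u << 1)  /* assumed bit position */

/* Program the current and next buffers, then enable the channel. */
static void pdc_start(periph_pdc_regs_t *r,
                      uint32_t buf_a, uint16_t count_a,
                      uint32_t buf_b, uint16_t count_b)
{
    r->ctrl     = PDC_CTRL_DISABLE;  /* keep channel quiet during setup */
    r->ptr      = buf_a;
    r->cnt      = count_a;
    r->next_ptr = buf_b;
    r->next_cnt = count_b;
    r->ctrl     = PDC_CTRL_ENABLE;   /* start transferring */
}
```

On real hardware the structure pointer would refer to the peripheral's base address; in a host-side test it can simply point at an ordinary variable.
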
Morin said the DMA architectural modifications being made will have a substantial impact in many wired and wireless embedded controller applications where encryption is necessary. There, data must be fed continuously to the peripheral devices at the rate they need to perform the necessary control operations, while at the same time servicing the encryption blocks, which are both compute and memory intensive.

“If your application does not require real time or deterministic operation and you are not concerned about the ability to feed data to your peripherals, this modification is not important,” he said. “But that is a small minority of most advanced MCU applications.”
