EtherCAT Master Software Stack Performance
When using EtherCAT technology as a fieldbus, performance often plays a decisive role, but what is really meant by performance? Most often, performance is equated with speed. In the case of an EtherCAT network, this usually means a short cycle time, around 1 ms (a 1 kHz update rate) or less, to achieve fast control cycles. However, good performance can also mean handling a large amount of data, or being able to operate many devices from one controller.
In an EtherCAT network, these performance considerations come down to the EtherCAT master. EtherCAT master software should therefore meet all of these requirements:
- Support short cycle times for fast device update rates
- Support a large amount of cyclic process data
- Be able to handle many EtherCAT devices
Furthermore, all of this must be achieved with the lowest possible load on the controller. A high-performance EtherCAT network deployment should also make no compromises in terms of functionality, error checking, diagnostic options, and reliability in the event of problems.
For this to be successful, the EtherCAT master software must be designed for the most efficient possible use of the computing time. Some important design characteristics to achieve this are:
- Include high performance and real-time capable Ethernet drivers (link layer) for a direct interaction with the Ethernet controller from the master software
- Have no dependency on the operating system in the cyclic processing component
- Support operation without interrupts
- Have no internal tasks
- Support time slicing for the processing of non-critical, non-cyclic tasks over several cycles
- Limit acyclic (mailbox) data communication traffic
- Use C macros and optimizing compilers
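The time-slicing item in the list above can be illustrated with a small model. This is a minimal sketch in Python, not EC-Master code: the class `TimeSlicedWorker`, its `max_items_per_cycle` budget, and the helper `simulate` are hypothetical names chosen for illustration. It shows how non-critical work can be drained over several cycles so that no single cycle exceeds a fixed budget.

```python
from collections import deque


class TimeSlicedWorker:
    """Runs queued non-critical work a few items at a time, one slice per cycle."""

    def __init__(self, max_items_per_cycle):
        # Hypothetical per-cycle budget; a real stack would expose its own setting.
        self.max_items_per_cycle = max_items_per_cycle
        self.pending = deque()

    def submit(self, work_item):
        # work_item is any zero-argument callable, e.g. one state machine step.
        self.pending.append(work_item)

    def run_slice(self):
        """Call once per cycle; returns the number of items executed."""
        executed = 0
        while self.pending and executed < self.max_items_per_cycle:
            self.pending.popleft()()
            executed += 1
        return executed


def simulate(num_items, max_items_per_cycle):
    """Return how many items ran in each cycle until the queue was empty."""
    worker = TimeSlicedWorker(max_items_per_cycle)
    done = []
    for i in range(num_items):
        worker.submit(lambda i=i: done.append(i))
    per_cycle = []
    while worker.pending:
        per_cycle.append(worker.run_slice())
    return per_cycle


print(simulate(10, 4))  # prints [4, 4, 2]
```

The same budget idea applies to limiting acyclic traffic: the worst-case work per cycle is bounded by the budget, at the cost of a longer completion time for the non-critical work.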
In addition to reducing the average computing time consumption, the peak load (maximum computing time consumption or maximum bus utilization) of the controller is also a critical measure. An EtherCAT master software must therefore provide and manage a number of parameters (settings) so that this peak load is reduced. The goal is always that sufficient computing power is available for the actual application that runs above the master, and that the specified timing is always adhered to.
Today, EtherCAT is used in a wide variety of applications. Controller hardware ranges from small embedded ARM processors like the Cortex-M4, to powerful multi-core ARM processors like the Cortex-A57, to high-end industrial PC and server processors like Intel Core i5, i7, and Xeon. An EtherCAT master can be implemented on all of these systems, but depending on the application, the maximum number of slaves, the maximum size of the cyclic data, and the shortest possible cycle time may all be very different. When designing an EtherCAT system, the EtherCAT master is rarely the deciding factor in selecting the processor; rather, the application that processes the data dictates the decision.
Therefore, the following factors and variables determine the achievable performance and influence the selection of the required controller hardware:
- Number and type of slave devices
- Size of the cyclic process data
- Required cycle time
- Required EtherCAT master features or capabilities (Distributed Clocks, Hot Connect, Redundancy, etc.)
- Necessary computing power for the application
As we will demonstrate below, the acontis EtherCAT master software, EC-Master, follows all of the design considerations above, manages the numerous system variables, and typically requires only 10-20% of the available CPU time.
To support the selection of the control hardware, or to assess what is possible with existing hardware with regard to EtherCAT, one can use existing performance values or take new measurements. It is important that, in the critical cyclic processing area of the application, the computing time consumption of all process paths that the EtherCAT master software runs through is measured correctly and precisely. In recent years, acontis has carried out a large number of performance measurements on different systems with different operating systems and the same reference network configuration. This data can be used for a rough assessment of the achievable performance on a given processor.
However, the most reliable values are of course obtained with a live measurement on the real hardware running the desired operating system with the actual desired network configuration. These measurements do not require any special know-how or additional equipment, and can be carried out very easily with the example applications included in the delivery of the acontis EC-Master software: EcMasterDemo and EcMasterDemoDc. Within these demo applications, the execution times (minimum, maximum, and average) of the individual master job functions, along with the cycle time, are calculated and saved to the log file (or printed to the console).
Built-in Measurement Functions in the Example Application
In the acontis EtherCAT master software, the application is integrated with the master in the cyclic part by synchronously calling certain functions, each of which fulfills a specific task. These functions, sometimes called jobs, are called from a high-priority task to control the network timing. In many instances, this high-priority task already exists within the customer application, so the functions can simply be called from this existing task. Because these jobs are called within the context of the application, no interaction with other tasks is required. The computing time consumption of the master stack can thus be determined very simply and accurately by measuring the computing time consumption of these functions.
The functions are:
- Process Inputs
- Write Outputs
- Master Administration
- Send Acyclic Datagrams/Commands
At the beginning of a cycle, the newly received data (inputs) are first updated. This is done by evaluating the previously received EtherCAT frames when calling the "Process Inputs" job function. The application then takes this newly received data and calculates the data (outputs) that should be sent to the network. These new output data are then sent out when calling the "Write Outputs" job function. With the help of Direct Memory Access (DMA), the frame is transported from memory to the Ethernet controller without loading the CPU and is then sent over the physical network. The frame passes through all EtherCAT devices on the network and is automatically received on returning to the master without the need for an interrupt. The master state machine and the state machines of the individual slave devices are then executed when calling the "Master Administration" job function.
During the initial start-up process, all slave devices must be transferred from the INIT state to the OPERATIONAL state in a series of sequential steps. The state machines are also required during regular operation to handle acyclic communications, such as downloading a parameter via the CAN application protocol over EtherCAT (CoE) mailbox protocol. These acyclic mailbox communications require another frame with slave-specific commands for reading from and writing to the slave. This acyclic frame is sent using the "Send Acyclic Datagrams/Commands" job function. It is important that the master is able to limit this acyclic data traffic; otherwise, the network or the CPU could become overloaded.
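The order of the job calls described above can be modeled as follows. This is a minimal sketch in Python that only records the call sequence; `FakeMaster`, its method names, and the `max_acyc_per_cycle` limit are hypothetical stand-ins for illustration and are not the EC-Master API.

```python
from collections import deque


class FakeMaster:
    """Hypothetical stand-in for a master stack; it only records call order."""

    def __init__(self, max_acyc_per_cycle=1):
        self.trace = []
        self.inputs = 0
        self.outputs = 0
        self.acyc_queue = deque()
        self.max_acyc_per_cycle = max_acyc_per_cycle

    def process_inputs(self):
        # Evaluate the frames received since the previous cycle.
        self.trace.append("process_inputs")
        return self.inputs

    def write_outputs(self, outputs):
        # Hand the new outputs over for the cyclic frame.
        self.trace.append("write_outputs")
        self.outputs = outputs

    def master_administration(self):
        # Advance the master and slave state machines by one step.
        self.trace.append("master_administration")

    def send_acyclic(self):
        # Send at most max_acyc_per_cycle queued mailbox commands.
        self.trace.append("send_acyclic")
        sent = 0
        while self.acyc_queue and sent < self.max_acyc_per_cycle:
            self.acyc_queue.popleft()
            sent += 1
        return sent


def run_cycle(master, app):
    """One cycle: inputs, application logic, outputs, administration, acyclic."""
    inputs = master.process_inputs()
    outputs = app(inputs)
    master.write_outputs(outputs)
    master.master_administration()
    return master.send_acyclic()
```

Calling `run_cycle(master, app)` once per timer tick from the high-priority task keeps all master work inside the application's own context. Any mailbox commands beyond the per-cycle limit stay queued for later cycles.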
Performance Measurements Using the Example Application
The acontis EC-Master software has a built-in performance measurement capability within the included example application. The performance measurement can be enabled with a command line parameter of the example application (-perf). When enabled, the example application measures the execution times of the job functions that are called within the cyclic part of the application, as well as the total computing time consumed by the cyclic task itself. The example application uses the included APIs ecatPerfMeasStart() and ecatPerfMeasEnd() for high-precision time measurements.
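The start/end measurement pattern can be sketched generically. The following Python class is a hypothetical analog written for illustration; it does not reproduce the signatures or internals of ecatPerfMeasStart() and ecatPerfMeasEnd(). It brackets a code section with `start()` and `end()` and keeps running minimum, average, and maximum values.

```python
import time


class PerfMeas:
    """Accumulates min/avg/max execution time of a bracketed code section."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.total_ns = 0
        self.min_ns = None
        self.max_ns = 0
        self._t0 = None

    def start(self):
        # Take a monotonic, high-resolution timestamp.
        self._t0 = time.perf_counter_ns()

    def end(self):
        # Record the time elapsed since the matching start().
        self.record(time.perf_counter_ns() - self._t0)

    def record(self, elapsed_ns):
        self.count += 1
        self.total_ns += elapsed_ns
        if self.min_ns is None or elapsed_ns < self.min_ns:
            self.min_ns = elapsed_ns
        if elapsed_ns > self.max_ns:
            self.max_ns = elapsed_ns

    def summary_usec(self):
        """Return (min, avg, max) in microseconds; needs at least one sample."""
        avg_ns = self.total_ns / self.count
        return (self.min_ns / 1000, avg_ns / 1000, self.max_ns / 1000)
```

In a cyclic task, one such object per job function would be started immediately before the job call and ended immediately after it, so that only the job's own run time is counted.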
The resulting measurement values are recorded every few seconds to the log file, and printed to the console in the following format:
PerfMsmt 'Cycle Time ' (min/avg/max) [usec]: 948.3/1000.0/1053.5
PerfMsmt 'Task Duration (JOB_Total + App)' (min/avg/max) [usec]: 7.4/ 12.2/ 77.0
PerfMsmt 'JOB_Total ' (min/avg/max) [usec]: 7.0/ 11.4/ 67.5
PerfMsmt 'JOB_ProcessAllRxFrames ' (min/avg/max) [usec]: 1.3/ 3.4/ 46.2
PerfMsmt 'JOB_SendAllCycFrames ' (min/avg/max) [usec]: 3.0/ 3.9/ 41.5
PerfMsmt 'JOB_MasterTimer ' (min/avg/max) [usec]: 0.4/ 1.5/ 37.9
PerfMsmt 'JOB_SendAcycFrames ' (min/avg/max) [usec]: 1.5/ 2.4/ 36.9
PerfMsmt 'myAppWorkPd ' (min/avg/max) [usec]: 0.0/ 0.2/ 27.9
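For further analysis, log lines in this format can be extracted with a short script. This is a minimal sketch in Python that assumes the line layout is exactly as shown above; the function name `parse_perf_lines` is a hypothetical helper and not part of the EC-Master delivery.

```python
import re

# Matches: PerfMsmt '<name>' (min/avg/max) [usec]: <min>/<avg>/<max>
LINE_RE = re.compile(
    r"PerfMsmt\s+'(?P<name>[^']*)'\s+\(min/avg/max\)\s+\[usec\]:\s*"
    r"(?P<min>[\d.]+)\s*/\s*(?P<avg>[\d.]+)\s*/\s*(?P<max>[\d.]+)"
)


def parse_perf_lines(text):
    """Return {measurement name: (min, avg, max)} with values in microseconds."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            name = match.group("name").strip()  # drop the padding spaces
            results[name] = (
                float(match.group("min")),
                float(match.group("avg")),
                float(match.group("max")),
            )
    return results


sample = """\
PerfMsmt 'Cycle Time ' (min/avg/max) [usec]: 948.3/1000.0/1053.5
PerfMsmt 'JOB_Total ' (min/avg/max) [usec]: 7.0/ 11.4/ 67.5
"""
print(parse_perf_lines(sample)["JOB_Total"])  # prints (7.0, 11.4, 67.5)
```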
The following measurements were performed with 16, 32, and 64 slaves on different controllers with different cycle times. The percentage load on the CPU caused by the EtherCAT master (EC-Master) is calculated as the ratio of the cumulative runtime of the job functions to the overall cycle time.
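As a worked example of this ratio, the sketch below applies it to the JOB_Total and Cycle Time values from the sample log output shown earlier. The function name `cpu_load_percent` is hypothetical. Relating the maximum JOB_Total to the nominal cycle time as a "peak load" is an assumption made here for illustration, and the sample log is not stated to come from any of the controllers listed below.

```python
def cpu_load_percent(job_total_usec, cycle_time_usec):
    """CPU load of the master as a percentage of the cycle time."""
    if cycle_time_usec <= 0:
        raise ValueError("cycle time must be positive")
    return 100.0 * job_total_usec / cycle_time_usec


# Values taken from the sample log: JOB_Total avg 11.4 us, max 67.5 us,
# average cycle time 1000.0 us.
avg_load = cpu_load_percent(11.4, 1000.0)
peak_load = cpu_load_percent(67.5, 1000.0)  # assumption: max job time per cycle

print(f"average load: {avg_load:.2f} %")  # average load: 1.14 %
print(f"peak load: {peak_load:.2f} %")    # peak load: 6.75 %
```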
Texas Instruments AM3359, ARM Cortex-A8, 32-Bit, 600 MHz
NXP i.MX 8, ARM Cortex-A72, 64-Bit, 1000 MHz
Intel Atom D510, 64-Bit, 1600 MHz