ATCA Newsletter

Applying Multicores to Networking Applications

By Srini Addepalli and Subhashini Venkataramanan, Freescale Semiconductor

The new mantra for increasing processing power is to put multiple cores on a chip. Processors with 2, 4, or 8 cores are already widely available from major manufacturers. However, software developers are used to working with single-core devices. A major challenge today is migrating legacy software to multicores and developing new software for them.

Software developers working on networking applications face two key challenges in migrating to multicores:

  1. Achieving a near-linear increase in performance with more cores
  2. Minimizing development efforts and code complexity

Developers have a choice of programming models, including AMP (Asymmetric Multiprocessing), SMP (Symmetric Multiprocessing), and combinations of the two.

In AMP, functionality is divided across cores with each one running a specific subfunction. Many networking applications cannot use AMP for the following reasons:

- Non-deterministic network usage patterns lead to inefficient core utilization
- Pipelining issues increase packet latency
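
The two AMP drawbacks listed above can be illustrated with a rough back-of-the-envelope model. The sketch below assumes one core per pipeline stage and a fixed per-packet cost for each stage. The function names, stage costs, and handoff cost are hypothetical values chosen for illustration, not measurements from any product.

```python
def pipeline_throughput_pps(stage_costs_us):
    """Packets per second for a pipeline with one core per stage.

    The slowest stage bounds the rate of the whole pipeline.
    """
    return 1e6 / max(stage_costs_us)


def pipeline_utilization(stage_costs_us):
    """Average fraction of time the cores are busy at full pipeline rate."""
    bottleneck = max(stage_costs_us)
    return sum(stage_costs_us) / (bottleneck * len(stage_costs_us))


def pipeline_latency_us(stage_costs_us, handoff_us):
    """Per-packet latency: all stage costs plus one handoff between stages."""
    return sum(stage_costs_us) + handoff_us * (len(stage_costs_us) - 1)


# Hypothetical traffic mix in which the second stage is the expensive one.
stages = [2.0, 6.0, 2.0, 2.0]   # microseconds per packet, per stage (assumed)

print(pipeline_throughput_pps(stages))    # roughly 166,667 packets per second
print(pipeline_utilization(stages))       # 0.5: half of the core time is idle
print(pipeline_latency_us(stages, 1.0))   # 15.0 us with an assumed 1 us handoff
```

If the traffic mix changes so that a different stage becomes the expensive one, the bottleneck moves with it, which is why a fixed assignment of functions to cores is hard to balance.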

The SMP model is a better fit. Here each core processes all functions for a given packet, reducing latency. However, developers must deal with problems resulting from simultaneous execution of an application on multiple cores. Issues include:

- Loss of integrity in data structures such as linked lists or binary trees
- Failure to protect state variables that one packet must update and the next one must check

The common approach is to introduce locks to prevent two cores from working on the same data simultaneously. However, locks in the packet path reduce performance due to contention. That is, cores have to wait for others to complete their updates. Most networking applications update state variables with every packet, and contention is therefore usually very high.
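
A minimal sketch of this lock-based approach follows, using Python threads to stand in for cores. The `LockedFlowTable` class and `run_workers` helper are hypothetical names introduced for illustration. The point is that every packet, on every worker, passes through the same lock.

```python
import threading


class LockedFlowTable:
    """Per-flow packet counters shared by all workers, guarded by one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._packets = {}

    def update(self, flow_key):
        # Every packet takes the lock, so workers serialize at this point.
        with self._lock:
            self._packets[flow_key] = self._packets.get(flow_key, 0) + 1

    def count(self, flow_key):
        with self._lock:
            return self._packets.get(flow_key, 0)


def run_workers(table, packets, num_workers):
    """Split the packet list across worker threads and process it."""

    def worker(chunk):
        for flow_key in chunk:
            table.update(flow_key)

    threads = [
        threading.Thread(target=worker, args=(packets[i::num_workers],))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


table = LockedFlowTable()
run_workers(table, ["flow-a", "flow-b"] * 500, num_workers=4)
print(table.count("flow-a"), table.count("flow-b"))  # 500 500
```

The counters stay correct, but only because the workers take turns. The more cores there are, the more time each one spends waiting at `update`.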

A flow or session is a common element tracked by networking applications. Examples include VLAN ID based flows and 5-tuple (Source IP, Destination IP, Protocol, Source Port, and Destination Port) based flows. Applications maintain state variables on a per-flow or per-session basis, and this can be used to avoid locks for state variable updates: keep each flow/session tied to a single core at any point in time, so that only different flows/sessions are ever processed on different cores simultaneously. State variable updates are then possible without loss of integrity and without locks, keeping the programming model simple. This technique, called session parallelization, hides multicore complexity, reduces development time, makes code easier to maintain, and realizes a linear increase in application performance with more cores.
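
One possible way to tie a flow to a core is to hash its 5-tuple and use the result to pick the core. The sketch below assumes that static hashing scheme purely for illustration; the article does not specify how the binding is made, and `FiveTuple`, `core_for_flow`, `CoreWorker`, and `dispatch` are hypothetical names. Each worker owns its own flow table and updates it without any lock.

```python
import zlib
from collections import namedtuple

FiveTuple = namedtuple("FiveTuple", "src_ip dst_ip protocol src_port dst_port")


def core_for_flow(flow, num_cores):
    """Map a flow to a core index; the same flow always maps to the same core."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_cores


class CoreWorker:
    """State owned by exactly one core, so updates need no lock."""

    def __init__(self):
        self.flow_state = {}

    def process(self, flow, payload_len):
        state = self.flow_state.setdefault(flow, {"packets": 0, "bytes": 0})
        state["packets"] += 1
        state["bytes"] += payload_len


def dispatch(packets, workers):
    """Hand each packet to the worker that owns its flow."""
    for flow, payload_len in packets:
        workers[core_for_flow(flow, len(workers))].process(flow, payload_len)


workers = [CoreWorker() for _ in range(4)]
web = FiveTuple("10.0.0.1", "10.0.0.9", "tcp", 40000, 80)
dns = FiveTuple("10.0.0.2", "10.0.0.9", "udp", 40001, 53)
dispatch([(web, 100), (dns, 60), (web, 200)], workers)

owner = workers[core_for_flow(web, len(workers))]
print(owner.flow_state[web])  # {'packets': 2, 'bytes': 300}
```

The per-packet code in `process` reads like single-core code, which is the property the technique relies on. In this sketch the workers are called in sequence; on real hardware each one would run on its own core, with the dispatch step typically handled before packets reach the application.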

For example, Freescale’s VortiQa software uses session parallelization in an SMP model. It has been applied successfully to vertical markets such as service provider equipment, enterprise security appliances, and small business or residential gateways.

Multicore processors represent a new challenge for networking software developers. They must create models that take full advantage of available processing power without increasing development time or software complexity. Session parallelization allows developers to create multicore software that resembles traditional single-core software, yet provides a linear increase in throughput as the number of cores increases. This method is being applied successfully in many networking applications.

Srini Addepalli is Chief Software Architect and Fellow and Subhashini Venkataramanan is Architect at Freescale Semiconductor. You can reach them at saddepalli@freescale.com and subha@freescale.com, respectively.