In Pursuit of Power

It is not news that power design for modern systems is hard. The escalating demands of advanced chips (huge bursts of current, multidecade operating ranges, fast transients, and digital mode controls) have turned supplying power at the point-of-load (PoL) from an exercise in arithmetic into an adventure in high-bandwidth mixed-signal design. Looking in the opposite direction, pressure for greater plant-level efficiency is pushing high DC voltages, 48V and more, from the bottom of the rack or the back of the chassis closer and closer to the CPUs and SoCs. Caught in the middle, power designers must somehow produce a mixed-signal network and not a train wreck.

The Supply Side

The challenges start out with the bulk DC regulators. In aircraft, 28 VDC has long been the de-facto standard. In hybrid and electric vehicles, several hundred VDC may be available at the battery. Telco or server racks may be distributing anything from the traditional 12 VDC to 48V.

Normal practice says you step these high voltages down for distribution to individual circuit boards. But if the system is large or efficiency is a vital concern, multiple layers of buck regulators may not be the best choice. Efficiency dictates that you push the high DC voltage as deep into the system as you can.

Some designers talk about powering PoL regulators directly from 48V. In a recent presentation, Google claimed their 48V rack architecture reduced distribution losses by a factor of 16 compared to 12V racks. The Google approach fed 48V directly to the PoL regulators handling big loads like CPUs or DRAM arrays, while stepping down the bulk voltage to 12V to supply more complex requirements with specialty regulators.
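A factor of 16 is what simple resistive-loss arithmetic predicts: for the same delivered power, quadrupling the bus voltage cuts the current to one quarter, and I²R loss falls with the square of the current. The sketch below works through that arithmetic. It assumes the same distribution-path resistance in both cases, and the load power and resistance values are hypothetical numbers chosen only for illustration, not figures from the Google presentation.

```python
def distribution_loss_w(load_power_w: float, bus_voltage_v: float,
                        path_resistance_ohm: float) -> float:
    """Resistive (I^2 * R) loss in the distribution path for a given load power."""
    current_a = load_power_w / bus_voltage_v
    return current_a ** 2 * path_resistance_ohm


# Hypothetical rack load and path resistance, for illustration only.
LOAD_W = 12_000.0
PATH_OHM = 0.001

loss_12v = distribution_loss_w(LOAD_W, 12.0, PATH_OHM)
loss_48v = distribution_loss_w(LOAD_W, 48.0, PATH_OHM)
ratio = loss_12v / loss_48v

print(f"12 V loss: {loss_12v:.1f} W, 48 V loss: {loss_48v:.1f} W, "
      f"ratio: {ratio:.1f}")
```

The ratio depends only on the square of the voltage ratio, so it holds for any load power and path resistance as long as both are the same in the two cases.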

Figure 1. Some system designers are feeding 48V directly to PoL regulators in data centers and other such large systems.

But there are counter-arguments to this idea. There is a huge existing infrastructure for 12V systems based on silicon MOSFET switches and commodity, or at least inexpensive, regulators. Arguably, single-stage 48V PoL regulation requires GaN switch transistors, which can be significantly more expensive. “It can come down to a question of capital expenditure versus operating expenditure,” observes Intel PSG power division manager Mark Davidson. “You may get more efficiency in really massive deployments with 48V PoLs, but stepping the high voltage down to 5-8V lets you use lower-cost PoLs.”

Distributing the lower voltage also gives you access to a wider range of PoLs with special capabilities. And with the increasingly sophisticated demands of system-level chips—the CPUs, FPGAs, GPUs, and SoCs that will take much of the power—having that flexibility may turn out to be mandatory.

What the Chips Want

This part of our story begins with a purely physical phenomenon: the end of threshold-voltage scaling. Below about 90 nm, the ability of process designers to reduce threshold voltages, parasitic capacitances, and interconnect series resistances as transistor dimensions fall began to slip away. Thus ended Dennard Scaling—the ability to keep power density from increasing as transistors get smaller. And that meant we were soon producing chips that could destroy themselves by simply operating at full speed.

This situation led to modern power management. At first it was just clock scaling: if the chip started to get too hot, or you didn’t need the full performance, you could turn down the clock frequency. But, particularly as static power became as important an issue as dynamic power, designers needed to reduce voltages as well as frequencies; hence dynamic voltage-frequency scaling (DVFS) and outright power gating. By reducing voltage, these techniques gave designers leverage over both dynamic and static power consumption.

But there is always an issue or two. In most systems, there is significant latency involved in slowing down or, especially, powering down functional blocks, particularly if they hold significant amounts of internal state. A chip needs to monitor its expected workload to find opportunities for power savings, and to track die temperature to recognize when self-preservation is in order. Often, a chip must depend on external clues about what is coming in the workload as well.

Then the power-management circuitry must orchestrate a complex dance. For example, you may have to freeze a block, isolate it from surrounding blocks, save its state, alter the supply voltage and frequency, wait for stability, possibly reinitialize the block (with restored state if the block has been powered down), and reconnect it to the system. All this could take milliseconds, so guessing wrong about how much time will elapse before the block is needed again could actually cost energy as well as harm system performance. And during this whole sequence, voltage regulators and clock sources must respond promptly to detailed commands from the power manager.
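The point about guessing wrong can be made concrete with a break-even calculation. The sketch below is a simplified model, not a description of any particular chip's power manager: it lumps the whole save, ramp, wait, and restore sequence into a single overhead energy and ignores the time the transition itself takes. The function names and all numeric values are hypothetical.

```python
def break_even_idle_s(p_idle_w: float, p_gated_w: float,
                      e_overhead_j: float) -> float:
    """Idle time at which power-gating a block starts to pay off.

    e_overhead_j lumps together the energy spent saving and restoring state,
    ramping the rail, and waiting for stability (a hypothetical input).
    """
    saved_power_w = p_idle_w - p_gated_w
    if saved_power_w <= 0:
        raise ValueError("gated power must be lower than idle power")
    return e_overhead_j / saved_power_w


def net_energy_saved_j(t_idle_s: float, p_idle_w: float, p_gated_w: float,
                       e_overhead_j: float) -> float:
    """Energy saved by gating for t_idle_s; negative means gating cost energy."""
    return (p_idle_w - p_gated_w) * t_idle_s - e_overhead_j


# Hypothetical block: 2 W when idle but powered, 10 mW when gated,
# 4 mJ spent on each power-down/power-up round trip.
P_IDLE_W = 2.0
P_GATED_W = 0.01
E_OVERHEAD_J = 0.004

t_be = break_even_idle_s(P_IDLE_W, P_GATED_W, E_OVERHEAD_J)
print(f"break-even idle time: {t_be * 1e3:.2f} ms")

for t_idle in (0.001, 0.010):
    saved = net_energy_saved_j(t_idle, P_IDLE_W, P_GATED_W, E_OVERHEAD_J)
    verdict = "saves" if saved > 0 else "costs"
    print(f"idle {t_idle * 1e3:.0f} ms: gating {verdict} "
          f"{abs(saved) * 1e3:.2f} mJ")
```

With these assumed numbers, an idle period shorter than the break-even time makes gating a net energy loss, which is why the quality of the workload prediction matters.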

Nor is this sort of dynamic dance the only complication facing the modern supply network. There is the matter of sequencing. Clearly, different blocks within a system-level chip will, at various times, require different voltages. Especially during chip power-up and power-down, the sequences and rates at which these supplies ramp up and down are often critical to avoid latch-up at the interfaces between blocks, or even physical damage to circuitry. So PoL regulators must be able to respond immediately and accurately to commands calling for specific voltage ramps.

Even when there is no request to change the voltage, the load can change quite dramatically, putting yet another burden on the PoLs. According to Intel circuit researcher James Tschanz, “Aggressive power gating may suddenly take a rail from tens of watts to milliwatts.” Within a modern CPU or SoC, signals from operating systems or even from instruction dispatch units may gate power to large functional blocks like vector processors or whole CPU cores. So the PoLs must be able to track huge current swings without exceeding voltage error or noise specifications. In some cases the dynamic range of these loads may be greater than the efficient range, or even the entire certified operating range, of the regulator. Designers may need to switch dynamically between buck regulators, which are generally more efficient at high currents, and low-noise low-drop-out (LDO) regulators, which can be more efficient at low currents.
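A rough first-order model shows why the two regulator types trade places across the load range. In the sketch below, the LDO draws its load current plus a small quiescent current from the input, so its efficiency can never exceed Vout/Vin, while the buck regulator is modeled with a fixed switching-and-control loss plus a conduction loss that grows with the square of the current. These are textbook approximations, and every component value is a hypothetical placeholder rather than data for a real part.

```python
def ldo_efficiency(i_load_a: float, v_in: float, v_out: float,
                   i_q_a: float) -> float:
    """LDO: input current is the load current plus the quiescent current."""
    return (v_out * i_load_a) / (v_in * (i_load_a + i_q_a))


def buck_efficiency(i_load_a: float, v_out: float, p_fixed_w: float,
                    r_cond_ohm: float) -> float:
    """Buck: fixed switching/control loss plus I^2 * R conduction loss."""
    p_out = v_out * i_load_a
    p_loss = p_fixed_w + i_load_a ** 2 * r_cond_ohm
    return p_out / (p_out + p_loss)


# Hypothetical rail and regulator parameters, for illustration only.
V_IN, V_OUT = 1.2, 1.0
LDO_IQ_A = 50e-6        # LDO quiescent current
BUCK_P_FIXED_W = 0.020  # buck loss that does not scale with load
BUCK_R_COND_OHM = 0.010  # lumped switch and inductor resistance


def pick_regulator(i_load_a: float) -> str:
    """Return whichever modeled regulator is more efficient at this load."""
    ldo = ldo_efficiency(i_load_a, V_IN, V_OUT, LDO_IQ_A)
    buck = buck_efficiency(i_load_a, V_OUT, BUCK_P_FIXED_W, BUCK_R_COND_OHM)
    return "ldo" if ldo > buck else "buck"


for i_load in (0.001, 0.010, 1.0, 5.0):
    print(f"{i_load:7.3f} A -> {pick_regulator(i_load)}")
```

In this toy model the buck regulator's fixed loss dominates at milliamp loads, so the LDO wins there, while at several amps the LDO's dropout loss makes the buck the better choice. A real design would also weigh noise, transient response, and the cost of switching between the two.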

There are also some loads in these big chips, and on the boards around them, that have special needs. Non-volatile memory chips may require higher voltage for programming. DRAM banks may have their own power-management strategies, and consume significant power. FPGAs can have huge inrush currents and complex sequencing requirements. On the big FPGAs, CPUs, and SoCs, analog circuits and SRAM blocks may have unusually stringent noise requirements.

Further, with the growing use of PCI Express* (PCIe*), multi-gigabit Ethernet, and even faster serial links, transceivers have emerged as a particularly demanding category of circuits with respect to supply noise. The supply noise specs on high-speed serial transceivers are often beyond the reach of all but a few switching regulators, so power designers will often use linear PoL regulators on these particular rails.

The picture we are drawing here is not charming. A system-level chip may be surrounded by a whole committee of PoL devices: high-current, multi-mode, multi-voltage switching regulators, inexpensive fixed-function switchers, high-efficiency or LDO regulators, a bulk regulator, and a microcontroller unit (MCU) or FPGA to provide control, sequencing, and telemetry.

This problem is getting out of hand. It requires too much information: for instance, details of transient requirements on some of the big chips’ DC rails may not be publicly available, and neither may the transient responses of some regulators. The design has implications for board placement and routing in the already-critical neighborhood of the big chips. And try answering a simple-sounding question like “will my 25G Ethernet lanes stay up when a CPU core drops from turbo mode to power-down?” That could take a mixed-signal simulation for which many design teams would lack the tools, the models, and perhaps the expertise. Too often, the way teams get from initial design to something that meets in-field reliability goals is a mass deployment of big capacitors.

Toward a Solution

The problem has become serious enough that vendors from several directions are working on it: system-level chip designers, PoL regulator designers, and start-ups with entirely new ideas. All are trying to ease the load on the system design team.

One of the most ambitious efforts is to make the problem go away by absorbing all the PoLs into the big chip. Tschanz points to Intel’s Fully Integrated Voltage Regulator (FIVR) program, which has put both switching and linear regulators on some CPU dice. “We can put switching regulators near the blocks they are supplying, with the inductors integrated into the package,” Tschanz explains. “And we can put LDOs right near the on-die memories and PLLs.”

The design effort for Intel is non-trivial, requiring a high-voltage mixed-signal circuit design in a digital process and integration of inductors into a cost-constrained package—among other issues. But the result is that customers only have to provide a relatively uncritical 1.8V rail, and don’t have to understand the chip’s power-management machinations.

Regulator vendors are also pitching in. Newer designs include PoLs with multiple output voltages and digital interfaces to allow two-way communications between PoLs and power controllers. Close attention to transient behavior is helping. And at least one vendor is using much higher than normal switching frequencies on its switching regulators in order to meet stringent noise requirements.

One of the more innovative approaches comes from a startup, AnDAPT—in which, by way of full disclosure, Intel has an investment. Company CEO Kapil Shankar says that when a design problem starts presenting too many complex uncertainties, the industry’s response is usually programmability—either moving the problem into software or into programmable logic.

“In this case, an MCU has not been a good answer,” Shankar argues. “Especially with several concurrent real-time tasks, an MCU can’t guarantee the deterministic latency today’s power management requires.” And sometimes, the things you need to change are in the design of the analog paths, not in the digital control functions, so software by itself can’t help.

AnDAPT’s answer is unique: a field-programmable mixed-signal chip they call an Adaptive Multi-Rail Power Platform. If you’ve been around for a while, the first image to come to your mind may well be a graveyard full of promising programmable-analog chip start-ups. But the AnDAPT chip is arguably something different (Figure 2).

Figure 2. The AnDAPT Power Platform provides a set of building blocks for a complete PoL solution.

Rather than a uniform array of analog elements (an intuitively rich architecture that has, over the years, proved singularly resistant to actual application), the AnDAPT architecture starts with two SRAM-programmable fabrics, one analog and one digital. The former is essentially a configurable fabric of analog signal paths, and the latter a conventional look-up table and register FPGA fabric. But the heart of the chip design is not the fabrics; it is a collection of configurable special-function blocks. There are several kinds of these: Power Blocks, Sensor Blocks, Compensator RAMs, and Timers. Rather than asking customers to build up functions from transistors, passives, gates, and registers, AnDAPT will provide function templates that configure these larger blocks to perform specific functions.

For example, there are templates to configure a Power Block as any of a variety of switching regulators, LDO regulators, current protection circuits, or current-sensing DAC/comparators. Other functions are available as well. A Sensor Block can be an error digitizer, comparator, instrumentation amp, or reference DAC. Compensation RAMs are used as look-up tables in conjunction with arithmetic functions in the FPGA fabric to implement transfer functions for digital control loops. The FPGA fabric also implements state machines for control, sequencing, and interface functions, and the analog fabric provides analog interconnect.
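To illustrate the general idea of pairing look-up tables with simple arithmetic in a digital control loop, here is a minimal sketch of a table-driven proportional-integral compensator. It is a generic technique and not AnDAPT's actual implementation: the quantized error code indexes precomputed gain tables, so the loop needs only table reads, one addition per term, and clamping. The class name, code ranges, and gains are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical 4-bit signed error code from an error digitizer.
ERR_MIN, ERR_MAX = -8, 7


def build_table(gain: float) -> List[int]:
    """Precompute round(gain * e) for every possible error code."""
    return [round(gain * e) for e in range(ERR_MIN, ERR_MAX + 1)]


@dataclass
class LutPiCompensator:
    """PI compensator whose multiplications are replaced by table look-ups."""
    kp_table: List[int]
    ki_table: List[int]
    duty_min: int = 0
    duty_max: int = 1023    # hypothetical 10-bit duty-cycle command
    integrator: int = 512   # start at mid-scale

    def _clamp(self, value: int) -> int:
        return max(self.duty_min, min(self.duty_max, value))

    def step(self, error_code: int) -> int:
        """Consume one quantized error sample and return the duty command."""
        e = max(ERR_MIN, min(ERR_MAX, error_code))   # saturate the input
        idx = e - ERR_MIN                            # table address
        # Integral term: accumulate, clamped to avoid wind-up.
        self.integrator = self._clamp(self.integrator + self.ki_table[idx])
        # Proportional term: added to the integrator, then clamped.
        return self._clamp(self.integrator + self.kp_table[idx])


comp = LutPiCompensator(kp_table=build_table(8), ki_table=build_table(1))
for err in (0, 3, 100):
    print(f"error {err:4d} -> duty {comp.step(err)}")
```

Changing the loop's transfer function in this scheme means rewriting table contents rather than redesigning logic, which is the appeal of a RAM-based compensator in a programmable device.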

You pick, configure, and wire up the blocks you need through a graphical user interface, much as you would pick discrete parts from a catalog. Then you can characterize the device you have designed with an integrated simulation tool.

Not every design would need AnDAPT’s level of flexibility, especially as CPU, FPGA, and SoC vendors bring power devices into their packages. But in this transition period it is vital that system architects and design managers realize the demands their choices—in bulk distribution, in big chips, even in memory and interconnect—are imposing on the design team.

 


Category: Power Solutions / Author: Ron Wilson
