
Series Editor

Serge Petiton

Energy-Efficient Computing and Data Centers

Luigi Brochard

Vinod Kamath

Julita Corbalán

Scott Holland

Walter Mittelbach

Michael Ott


Introduction

As shown by a recent news article in Nature (Jones 2018), data centers consume about 1% of total electricity demand, while information and communication technology (ICT) as a whole, including personal devices, mobile networks and TV, consumes about 10%. The article also shows that this demand will grow exponentially in the near future, leading in 2030, depending on the estimate, to between 3% and 8% of electricity demand for data centers and between 8% and 21% for ICT. An article in The Guardian (2017) shows a similar accelerating trend.

The energy consumed by a data center during a given period is the sum of the energy consumed by all the workloads that have been executed, plus the energy consumed when devices are idle, plus the energy lost converting the electricity from the power lines down to the IT devices, plus the energy needed to cool the IT devices. This book will tackle all four aspects of energy.

The energy consumed by a workload when running on a system is the integral of its power consumption over its execution time, from beginning to end. If the power consumed by the workload is constant, it is simply:

Energy = Power × Elapsed time

One trivial way to minimize the power of a workload while running on a system is to reduce the frequency of the processor. But this can be counterproductive since the elapsed time will very often increase to a point where the total energy is constant or has increased. From a return on investment perspective, it is also obvious that a system and a data center have to be used as much as possible and run as many workloads as possible. That is why energy efficiency, and not only power efficiency, is critical.
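The trade-off above can be made concrete with a small sketch. All numbers here are hypothetical: it assumes a compute-bound workload whose run time scales as 1/f, dynamic power scaling roughly as f³ (frequency times the square of a supply voltage that tracks frequency), plus a constant static power.

```python
# Illustrative sketch (not measured data): why lowering CPU frequency
# does not always lower energy. Hypothetical workload: 100 s at 2 GHz,
# 80 W dynamic power at 2 GHz, 40 W constant static power.

def energy_joules(freq_ghz, work_seconds_at_2ghz=100.0,
                  dyn_power_at_2ghz=80.0, static_power=40.0):
    """Energy = (dynamic + static power) * elapsed time."""
    elapsed = work_seconds_at_2ghz * (2.0 / freq_ghz)      # time grows as f drops
    dyn_power = dyn_power_at_2ghz * (freq_ghz / 2.0) ** 3  # dynamic power shrinks faster
    return (dyn_power + static_power) * elapsed

for f in (2.0, 1.5, 1.0):
    print(f"{f} GHz: {energy_joules(f):.0f} J")
```

Under these assumptions the energy minimum sits at an intermediate frequency: dropping from 2.0 to 1.5 GHz saves energy, but at 1.0 GHz the longer run time under constant static power costs more than the dynamic savings, illustrating why frequency reduction alone can be counterproductive.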

Existing data centers have two physical limits they cannot exceed: floor space and power supply. They also have an economic limit, which is their budget for operational costs.

To keep up as much as possible with Moore's law, the power consumption of IT devices is also on the rise, so that existing data centers face the dilemma of either keeping the same density of servers per rack (leading to a tremendous power and cooling challenge) or giving up on density. That is one reason why numerous new data centers are being built around the world by hyperscale, cloud and high-performance computing operators.

This increasing energy demand impacts not only the planet's ecosystem but also the ICT business, as growing electricity demand leads to ever higher energy costs, since the price of electricity itself is on the rise.

According to the U.S. Bureau of Labor Statistics, prices for electricity were 66% higher in 2018 than in 2000, which is 21% more than inflation over the same period (Official Data Foundation 2018). A similar trend is seen in Europe (Eurostat n.d.).

Figure I.1 shows the wide range of electricity prices in 2018 around the world.

Figure I.1. Price of electricity around the world in 2018, from Statista.com

These prices are average prices across the different locations in a country. For example, the average price of electricity in the United States is $0.13 per kWh, but it differs widely across the different states, with locations where it can be as low as $0.06 per kWh in Colorado. This variability of electricity price, together with the environmental conditions (temperature, humidity, etc.) of the location, plays a very important role in the power and cooling design of a data center.

Power usage effectiveness (PUE) is a measure of how efficiently a computer data center uses its power. PUE is the ratio of total energy used by a data center facility to the energy used by the IT computing equipment. The ideal value is 1.0.

[I.1] PUE = Total facility energy / IT equipment energy

In the following, we will use PUE as a measure of cooling efficiency, not taking power conversion losses into account. Legacy data centers usually have a PUE of about 2, and with a careful design it is possible to get to around 1.1. Table I.1 shows the number of years needed for the energy cost to power and cool a server to equal the server acquisition cost, for various PUE values and electricity prices.

Table I.1. Impact of PUE on price of electricity versus price of server

Electricity price | Number of years (PUE = 2) | Number of years (PUE = 1.1)
$0.06/kWh         | 12.7                      | 23.1
$0.21/kWh         | 3.6                       | 6.6
$0.33/kWh         | 2.3                       | 4.2

It shows how the energy costs can quickly equal the acquisition cost with a high PUE and a medium to high electricity price. With a low electricity price, even with a high PUE, this takes more than 10 years, and close to 20 with a low PUE. This illustrates how data center location, which determines both the achievable PUE (through the possible use of free cooling) and the electricity cost, shapes the design of the cooling infrastructure.
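The break-even horizons in Table I.1 can be reproduced with a short calculation. The server figures below are assumptions for illustration only (a $6,650 server drawing a constant 500 W); only the PUE values and electricity prices come from the table.

```python
# Rough reproduction of Table I.1. Hypothetical server: $6,650
# acquisition cost, constant 500 W draw, running 24/7.

HOURS_PER_YEAR = 8760

def breakeven_years(price_per_kwh, pue, server_cost=6650.0, power_kw=0.5):
    """Years until the electricity to power and cool the server
    equals its acquisition cost."""
    yearly_cost = power_kw * pue * HOURS_PER_YEAR * price_per_kwh
    return server_cost / yearly_cost

for price in (0.06, 0.21, 0.33):
    cells = [f"{breakeven_years(price, pue):.1f}" for pue in (2.0, 1.1)]
    print(f"${price:.2f}/kWh  PUE 2: {cells[0]} y   PUE 1.1: {cells[1]} y")
```

With these assumed server numbers the results match Table I.1 to within rounding, which shows that the table depends only on the ratio of server cost to server power, not on either figure alone.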

Although PUE is an important metric, it does not take into account how much power the IT device is consuming and its power efficiency.

IT energy effectiveness (ITUE) measures how well the energy delivered to the IT equipment is used, where VR, PSU and fan denote the energy consumed by the voltage regulators, power supply and fans. The ideal value is 1.0.

[I.2] ITUE = Total IT equipment energy / (Total IT equipment energy − VR − PSU − fan)

We will study all ITUE components not in isolation, but in relation to the type of workload the server is running since the server power consumption depends on the workload running on it.

But even with a low PUE and ITUE, the heat produced by the systems in the data center is still lost without any waste heat reuse.

Energy reuse effectiveness (ERE) measures how efficiently a data center reuses the energy dissipated by the computers. ERE is the ratio of the total energy used by the facility, minus the reused waste heat energy, divided by the energy used by the IT computing equipment.

[I.3] ERE = (Total facility energy − Reused energy) / IT equipment energy

An ideal ERE is 0.0. With no heat reuse, ERE = PUE, and we always have:

[I.4] 0 ≤ ERE ≤ PUE

Many articles have focused on the PUE of data centers, but very few have looked at the global equation combining ITUE, PUE and ERE.
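The three metrics defined above can be sketched together in a few lines. The energy figures below are hypothetical yearly readings in MWh, and the ITUE formulation follows the definition given with equation [I.2] (total IT energy over the energy left after VR, PSU and fan losses).

```python
# Minimal sketch of the three data center metrics, using hypothetical
# yearly energy readings (all in MWh).

def pue(facility, it):
    return facility / it                       # equation [I.1]

def itue(it, vr, psu, fan):
    return it / (it - vr - psu - fan)          # equation [I.2]

def ere(facility, it, reused):
    return (facility - reused) / it            # equation [I.3]

facility, it = 1300.0, 1000.0                  # a data center with PUE 1.3
print(f"PUE  = {pue(facility, it):.2f}")
print(f"ITUE = {itue(it, vr=30.0, psu=60.0, fan=40.0):.2f}")
print(f"ERE  = {ere(facility, it, reused=400.0):.2f}")  # ERE <= PUE, per [I.4]
```

Note how reusing waste heat lowers ERE below PUE without changing PUE itself, which is why a data center judged only on PUE gets no credit for heat reuse.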

This is what we will address in this book.

Chapter 1 presents the different IT devices in a data center and the different components in a server from a power consumption perspective.

Chapter 2 presents the different cooling technologies in a server, their evolution and the trade-off between density and cooling.

Chapter 3 presents the different cooling technologies in the data center based on air cooling and liquid cooling, the ASHRAE standards and how waste heat can be reused in particular with adsorption chillers.

Chapter 4 presents the Xeon processor and NVIDIA accelerator evolution over the last 15 years in terms of performance and power consumption with reference to Moore’s and Dennard’s laws. It also presents the impact of microarchitecture evolutions and how the evolution in microprocessor design is power driven.

Chapter 5 analyzes the power, thermal and performance characteristics of a server when running different types of workloads, in relation to the types of instructions executed. It also compares these metrics across server cooling technologies.

Chapter 6 presents hardware and software to measure, model, predict and control power and performance in a server and in a data center in order to optimize their energy consumption.

Chapter 7 analyzes the PUE, ERE and total cost of ownership (TCO) of existing and new data centers with different cooling designs, and the impact of power consumption, electricity price, free cooling and waste heat reuse. Through two data center examples, it also highlights how a careful data center design, free cooling and waste heat reuse can reduce PUE, ERE and TCO and save up to 50% of energy. It concludes by showing how renewable energy can be used to provide the remaining energy needed by the data center.

Acknowledgments

This book is the result of 12 years of experience at IBM and Lenovo designing energy-efficient servers and data centers in collaboration with high-performance computing customers around the world. I would like to mention and thank in particular the Leibniz Supercomputing Center in Munich (LRZ) for the work we have done since 2011 to design and deliver the most energy-efficient solutions. This book would not have been possible without the help of colleagues at Lenovo Data Center Group, in the application team (Eric Michel, Peter Mayes, Christoph Pospiech), the power management team (Robert Wolford) and the data center team (Jerrod Buterbaugh), and at Intel (Andrey Semin), NVIDIA (François Courteille), IBM (Ingmar Meijer, Vadim Elisseev) and LRZ (Herbert Huber). Special thanks also to Peter Mayes for his rereading.