Original Link: https://www.anandtech.com/show/2694
The Best Server CPUs Compared, Part 1
by Johan De Gelas on December 22, 2008 10:00 PM EST
Posted in IT Computing
The past several months have seen both Intel and AMD introducing interesting updates to their CPU lines. Intel started with the E-stepping of the Xeon. Even at 3GHz, the four cores of the Xeon 5450 need 80W at the most, and if speed is all you care about a 120W 5470 is available at 3.33GHz. The big news came of course from AMD. The "only native x86 quad-core" is finally shining bright thanks to a very successful transition to 45nm immersion lithography as you can read here. The result is a faster and larger 6MB L3 cache, higher clock speeds, and lower memory latency. AMD's quad-core is finally ready to be a Xeon killer.
So it was time for a new server CPU shoot-out, as server buyers are confronted with quickly growing server CPU pricelists. Talking about pricelists, is someone at marketing taking revenge on a strict math teacher who made him/her suffer a few years ago? How else can you explain that the Xeon 5470 is faster than the 5472, and that the Xeon 5472 and 5450 are running at the same clock speed? The deranged Intel (and to a lesser degree AMD) numbering system now forces you to read through spec sheets the size of a phone book just to get an idea of what you are getting. Or you could use a full-blown search engine to understand what exactly you can or will buy. The marketing departments are happy though: besides the technical white papers you need to read to build a server, reading white papers to simply buy a CPU is now necessary too. Market segmentation and creative numbering…a slightly insane combination.
Anyway, if you are an investor trying to understand how the different offerings compare, or you are out to buy a new server and are asking yourself what CPU should be in there, this article will help guide you through the newest offerings of Intel and AMD. In addition, as the Xeon 55xx - based on the Nehalem architecture - is not far off, we will also try it to see what this CPU will bring to the table. This article is different from the previous ones, as we have changed the collection of benchmarks we use to evaluate server CPUs. Read on, and find out why we feel this is a better and more realistic approach.
Breaking out of the benchmark prison
When I first started working on this article, I immediately started to run several of our "standard" benchmarks: CINEBENCH, Fritz Chess, etc. As I started to think about our "normal" benchmark suite, I quickly realized that this article would become imprisoned by its own benchmarks. It is nice to have a mix of exotic and easy to run benchmarks, but is it wise to build an article and its analysis around such an approach? How well does this reflect the real world? If you are actually buying a server or are trying to understand how competitive AMD products are with Intel's, such a benchmark mix will probably only confuse you when you want to understand what decisions to make. For example, it is very tempting to run a lot of rendering and rarely used benchmarks as they are either easy to run or easy to find, but that gives a completely distorted view of how the different products compare. Of course, running more benchmarks is always better, but if we want to give you a good insight into how these server CPUs compare, there are two ways to do it: the "micro architecture" approach and the "buyer's market" approach.
With the micro architecture approach, you try to understand how well a CPU deals with branch/SSE/Integer/Floating Point/Memory intensive code. Once you have analyzed this, you can deduce how a particular piece of software will probably behave. It is the approach we have taken in AMD's 3rd generation Opteron versus Intel's 45nm Xeon: a closer look. It is a lot of fun to write these types of articles, but it only allows those who have profiled their code to understand how well the CPU will do with their own code.
The second approach is the "buyer's market" approach. Before we dive into new Xeons and Opterons, we should ask ourselves "why are people buying these server CPUs"? Luckily, IDC reports[1] answer these questions. Even though you have to take the results below with a grain of salt, they give us a rough idea of what these CPUs are used for.
IT infrastructure servers like firewalls, domain controllers, and e-mail/file/print servers are the most common reasons why servers are bought. However, file and print servers, domain controllers, and firewalls are rarely limited by CPU power. So we have the luxury of ignoring them: the CPU decision is a lot less important in these kinds of servers. The same is true for software development servers: most of them are for testing purposes and are underutilized. Mail servers (probably 10% out of the 32-37%) are more interesting, but currently we have no really good benchmark comparisons available, since Microsoft's Exchange benchmark was unfortunately retired. We are currently investigating which e-mail benchmark should be added to our benchmarking suite. However, it seems that most mail server benchmarking boils down to storage testing. This subject is to be continued, and suggestions are welcome.
Collaborative servers really deserve more attention too as they comprise 14 to 18% of the market. We hope to show you some benchmarks on them later. Developing new server benchmarks takes time unfortunately.
ERP and heavy OLTP databases are good for up to 17% of the shipments and this market is even more important if you look at the revenue. That is why we discuss the SAP benchmarks published elsewhere, even though they are not run by us. We'll add Oracle Swingbench in this article to make sure this category of software is well represented. You can also check Jason's and Ross' AMD Shanghai review for detailed MS SQL Server benchmarking. With Oracle, MS SQL Server and SAP, which together dominate this part of the server market, we have covered this part well.
Reporting and OLAP databases, also called decision support databases, will be represented by our MySQL benchmark. Last but not least, we'll add the MCS eFMS web server test -- an ultra real world test -- to our benchmark suite to make sure the "heavy web" applications are covered too. It is not perfect, but this way we cover the actual market a lot better than before.
Secondly, we have to look at virtualization. According to IDC reports, 35% of the servers bought in 2007 were bought to be virtualized. IDC expects this number to climb up to 52% in 2008 [2]. Unfortunately, as soon as we upgraded the BIOS of our quad socket platform to support the latest Opteron, it would not allow us to install ESX nor let us enable power management. That is why we had to postpone our server review for a few weeks and that is why we split it into two parts. For now, we will look at the VMmark submissions to get an idea how the CPUs compare when it comes to virtualization.
In a nutshell, we're moving towards a new way of comparing server CPUs. We combine the more reliable industry standard benchmarks (SAP, VMmark) with our own benchmarks and try to give you a benchmark mix that comes closer to what the servers are actually bought for. That should allow you to get an overview that is as fair as possible. Performance/watt is still missing in this first part, but a first look is already available in the Shanghai review.
Benchmark Configuration
Here is the list of the different configurations. All servers have been flashed to the latest BIOS at this moment.
Inside our lab
Quad Xeon Server 1: Supermicro SC818TQ-1000 1U Chassis
2x - 4 x Intel Xeon E7330 at 2.4GHz and X7350 at 2.93GHz
Supermicro X7QCE board
16GB (8 x 2GB) Samsung M395T5750EZ4 667MHz 5-5-5-18
NIC: Dual Intel PRO/1000 Server NIC
PSU: Supermicro 1000W w/PFC (Model PWS-1K01-1R)
Dual Xeon Server 2: Intel "Stoakley platform" server
Supermicro X7DWE+/X7DWN+ board
1 or 2 Xeon E5472 at 3GHz, Xeon L5430 at 2.66GHz, and Xeon X5470 at 3.33GHz
16GB (8x 2GB) Samsung M395T5750EZ4 667MHz 5-5-5-18
NIC: Dual Intel PRO/1000 Server NIC
PSU: Ablecom PWS-702A-1R 700W
Dual and Quad Opteron Server: Supermicro SC828TQ-R1200LPB 2U Chassis
1, 2 or 4 AMD Opteron 8356 series at 2.3GHz
1, 2 or 4 AMD Opteron 8384 series at 2.7GHz
Supermicro H8QMi-2+ board
16GB (8x 2GB) Avant AVF7256R61E6800FB-MTEP 800MHz 6-5-5-18
NIC: Dual Intel PRO/1000 Server NIC
PSU: Supermicro 1200W w/PFC (Model PWS-1K22-1R)
Client Configuration: Intel Core 2 Quad Q6600
Foxconn P35AX-S
4GB (2x 2GB) Kingston 667MHz DDR-2
NIC: Intel Pro/1000
ERP and OLTP Benchmark 1: SAP S&D
The SAP S&D (sales and distribution, 2-tier internet configuration) benchmark is an extremely interesting benchmark as it is a real world client-server application. We decided to look at SAP's benchmark database. The results below are two tier benchmarks, so the database and the underlying OS can make a big difference. Unless we keep those parameters the same, we cannot compare the results. The results below were all run on Windows 2003 Enterprise Edition with the MS SQL Server 2005 database (both 64-bit). Every "two tier Sales & Distribution" benchmark was performed on SAP's "ERP release 2005".
In our previous server oriented article, we summed up a rough profile of SAP S&D:
- Very parallel resulting in excellent scaling
- Low to medium IPC, mostly due to "branchy" code
- Not really limited by memory bandwidth
- Likes large caches
- Sensitive to sync ("cache coherency") latency
There are no quad socket results for the latest 45nm AMD parts, but we can still get a pretty good idea where it would land. The "Barcelona" Opteron scales from 10520 SAPS (2 CPUs) to 17650 (4 CPUs), or an improvement of about 68%. The quad Opteron 8384 will probably scale a bit better, so we speculate it will probably attain a score of about 23000 to 24000 SAPS. It won't beat the best Intel score (Dunnington), but it will come close enough and offer an excellent performance/watt ratio. If you are wondering about the phenomenal Xeon X5570 scores, we discussed them here.
Also interesting is that the dual 2.7GHz "Shanghai" is about 31% faster than the dual 2.3GHz "Barcelona", while the clock speed advantage is only 17%. It clearly shows that the larger L3 cache pays off here. Now let's look at some more exotic setups: octal socket or similar systems.
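The speculation about the quad Opteron 8384 is simple arithmetic, and a minimal sketch makes the steps explicit. Two things in it are assumptions rather than published results: the dual "Shanghai" score is derived from the "about 31% faster" figure, and the factor of 1.74 is a hypothetical stand-in for "a bit better" scaling.

```python
# Published SAPS scores quoted in the text for the "Barcelona" Opteron.
barcelona_dual_saps = 10520
barcelona_quad_saps = 17650

# Scaling factor when going from two to four sockets.
barcelona_scaling = barcelona_quad_saps / barcelona_dual_saps

# Assumption: dual "Shanghai" score derived from "about 31% faster".
shanghai_dual_saps = barcelona_dual_saps * 1.31

# Assumption: "a bit better" scaling modelled as a factor of up to 1.74.
ASSUMED_BETTER_SCALING = 1.74

low_estimate = shanghai_dual_saps * barcelona_scaling
high_estimate = shanghai_dual_saps * ASSUMED_BETTER_SCALING

print(f"Barcelona 2 -> 4 socket gain: {(barcelona_scaling - 1) * 100:.0f}%")
print(f"Estimated quad Shanghai: {low_estimate:.0f} to {high_estimate:.0f} SAPS")
```

With the same scaling as "Barcelona" the estimate lands just above 23000 SAPS, and the assumed better scaling pushes it towards 24000.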
This overview is hardly relevant if you are deciding which x86 server to buy, but it is a feast for those following the complete server market -- or those of us who are interested in different CPU architectures. There is a battle raging between three different philosophies: the thread machine-gun SUN UltraSPARC T2, the massive speed demon IBM POWER6, and the cost effective x86 architectures. The UltraSPARC T2 machine only has four sockets, but each socket contains eight cores, and each core is a fine-grained multithreaded, in-order "Gatling gun" that cycles between eight threads. That means one quad socket machine keeps up to 256 threads alive (4 sockets x 8 cores x 8 threads). The POWER6 machine contains eight CPUs, but each CPU is only a dual-core CPU. However, each POWER6 CPU is a deeply pipelined, wide superscalar architecture running at 4.7GHz, backed up with massive caches (4MB L2, 32MB L3). The very wide superscalar architecture is used more efficiently thanks to Simultaneous Multi-Threading (SMT).
Each T2 with eight "mini cores" needs 95W compared to 130W for two massive POWER6 cores, so the SUN Server needs 4 x 95W for the CPUs, while the IBM server needs 8 x 130W. These differences could be smaller percentagewise when you look at how much power each server system will need, but when it comes to performance/watt, it will be hard to beat the T2 here. The latest octal Opteron server should come close (8 x 75W) as it does not use FB-DIMMs while the UltraSPARC T2 does. However, we are speculating here; let's get back to our own benchmarking.
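The thread counts and CPU power budgets above follow directly from the socket, core, and thread numbers. The sketch below only tallies them; the `ServerCpus` class is a hypothetical helper, and the two-threads-per-core figure for POWER6 and the quad-core, single-threaded Opteron layout are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerCpus:
    name: str
    sockets: int
    cores_per_socket: int
    threads_per_core: int
    watts_per_socket: int

    def hardware_threads(self) -> int:
        # Threads the CPUs can keep alive at once.
        return self.sockets * self.cores_per_socket * self.threads_per_core

    def cpu_watts(self) -> int:
        # Power budget of the CPUs only, not of the whole server.
        return self.sockets * self.watts_per_socket


# Figures quoted in the text, except where marked as an assumption.
systems = [
    ServerCpus("SUN UltraSPARC T2", 4, 8, 8, 95),
    # Assumption: two threads per core for POWER6 SMT.
    ServerCpus("IBM POWER6", 8, 2, 2, 130),
    # Assumption: quad-core Opterons with one thread per core.
    ServerCpus("Octal Opteron", 8, 4, 1, 75),
]

for s in systems:
    print(f"{s.name}: {s.hardware_threads()} threads, {s.cpu_watts()}W for the CPUs")
```

Memory, chipset, and disks are left out on purpose, which is why the differences shrink once you look at complete systems.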
ERP & OLTP Benchmark 2: Oracle Charbench (32-bit Windows 2003 EE)
- Memory: Custom, SGA size =1536MB, PGA size = 512MB.
- Sizing: Processes = 300.
- Connection Mode: Dedicated Server Mode
Although Oracle Swingbench is freely available, it has not been easy to get repeatable benchmarks. Oracle tunes itself constantly, and as Swingbench is somewhat similar to TPC-C, it requires a heavy disk subsystem to avoid being completely disk I/O bottlenecked. That is why we test with only one CPU: the combination of a superfast Intel SLC SSD taking care of the logs and a six disk RAID-0 set for the data is good enough to ensure that one quad-core CPU can reach up to 80% CPU load. We also repeat the test four times and only report the average of the last three tests to measure with "warm caches".
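The "warm caches" rule is easy to express in code. This is a minimal sketch of the averaging step only, not our actual test harness; the function name and the transaction figures are hypothetical.

```python
from statistics import mean


def warm_average(run_scores, warmup_runs=1):
    """Average the runs that follow the warm-up runs."""
    if len(run_scores) <= warmup_runs:
        raise ValueError("need at least one run after the warm-up")
    return mean(run_scores[warmup_runs:])


# Hypothetical scores for four consecutive runs; the first run starts
# with cold caches and is discarded.
example_runs = [8200, 9100, 9150, 9050]
print(warm_average(example_runs))
```

Dropping the first run keeps a one-off cold start from dragging down the reported number.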
Here is a look in our lab. First, we plug the Intel X25-E Extreme SATA drive into the 3.5" drive case of our Colfax/Supermicro servers. This disk contains all the logs. This setup looks like this:
An SSD drive completely lost in a 3.5 inch hard disk cage.
The data is stored on a Promise J300s that is connected by a 12Gbit/s Infiniband to an Adaptec 5805 RAID controller, which is plugged into our servers. The Promise J300s contains a RAID-0 set of six 15000RPM Seagate SAS 300GB disks (one of the fastest hard disks you can get). The Adaptec 5805 is a pretty fast RAID card, as it is equipped with a dual-core Intel IOP 348 at 1.2GHz and 512MB of DDR2. This RAID controller won't quickly become a bottleneck. Below you see the fast Infiniband x4 wideport SAS connection from the Promise DAS to our servers.
The only way to get the necessary disk spindles for our 1U servers…
Only then can we get results that are not I/O bound but CPU bound… unless we use more than one CPU.
The 45nm quad-core Opteron gets relatively close to the mighty Xeon 7460, which has a huge L3. It is remarkable how even the 3.33GHz Xeon is not able to defeat the latest Opteron.
Decision Support Benchmark: 64-bit MySQL (Linux SUSE SLES 10 SP2 64-bit)
Decision support databases are completely different from the I/O dominated OLTP databases. Large select statements have to go through almost the entire database, so the CPU and especially the memory subsystem get a lot more work to do. We test with an e-commerce site's database on MySQL 5.1.23 using the INNODB database engine. You can find out more about our MySQL testing here. We tested with only one CPU as MySQL has trouble scaling above four cores.
Please note that these results cannot be compared with our earlier MySQL results, as both the MySQL version and the my.cnf config file are different. The Opteron 8384 "Shanghai" simply annihilates the competition. Our VTune profiling on one of the Xeons shows us that MySQL is very sensitive to the latency of the cache and memory subsystem, so the latest Opteron has a great advantage compared to the older 65nm generation, as latency has been reduced in the L3 cache and memory controller. This results in a 35% boost, much more than the 17% clock speed advantage.
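To separate the clock speed effect from the cache and memory improvements, you can divide the speedup by the clock ratio. The sketch below assumes, optimistically, that performance scales linearly with clock speed, so the remainder is a rough lower bound for the architectural gain; `per_clock_gain` is a hypothetical helper.

```python
def per_clock_gain(speedup, new_clock_ghz, old_clock_ghz):
    """Return the part of a speedup that clock speed alone does not explain."""
    clock_ratio = new_clock_ghz / old_clock_ghz
    return speedup / clock_ratio


# 35% faster at 2.7GHz versus 2.3GHz, as quoted above.
gain = per_clock_gain(1.35, 2.7, 2.3)
print(f"Clock ratio: {2.7 / 2.3:.2f}, per-clock gain: {(gain - 1) * 100:.0f}%")
```

Under that assumption roughly 15% of the improvement comes from something other than the higher clock.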
MCS eFMS (Windows 2003 32-bit EE)
One of the very interesting and processing intensive applications that we encountered was the modular MCS Enterprise Facility Management Software (MCS eFMS), developed by MCS. The objective of eFMS is to integrate the management of space usage (buildings), assets and equipment (such as furniture, beamers etc.), cabling infrastructure, and other areas while keeping track of costs. MCS eFMS stores all information in a central Oracle database.
MCS eFMS integrates space management, room reservations and much more.
What makes the application interesting to us as IT researchers is the integration of three key technologies: a web-based frontend that integrates CAD drawings and gets its information from a rather complex, ERP-like Oracle database; building overview trees of all rooms available and their reservations in a certain building; and drilling down using the CAD drawing to get more detail. MCS eFMS is one of the most demanding web applications we have encountered so far. MCS eFMS uses the following software:
- Microsoft IIS 6.0 (Windows 2003 Server Standard Edition R2)
- PHP 4.4.0
- FastCGI
- Oracle 9.2
MCS eFMS is used daily by large international companies such as Siemens, Ernst & Young, and Startpeople, which makes testing this application even more attractive. We used the specially developed APUS (Application Unique Stress-testing) software, developed by our own lab to analyze the logs we got from MCS and turn these logs into a real stress test. The result is a benchmark that closely models the way users access the web servers around the world. First we test with only one CPU.
This is the real thing. No team of compiler writers has been losing sleep rearranging a few pieces of assembler so that the out of order scheduling can happen as smoothly as possible. Like all business logic code, it is a huge pile of complex layers, one upon the other. It is almost impossible to add some extremely clever "benchmark only boosts" in this one.
Intel takes the top spots, but it needs a 120W 3.3GHz CPU to overpower the newest AMD CPU. Typically there are about 6 to 10 frontend servers (the eFMS website) for one backend (the Oracle server), so while the backend server is all about the highest CPU and I/O performance, the frontend is best served with "midrange power" CPUs.
The Opteron improves its performance by 49% and the Xeon by 53% if you add a second quad-core. This benchmark has an error margin of about 3%, so it seems that both CPUs scale more or less the same.
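A rough way to check that claim is to ask how far each scaling figure could move if both runs behind it were off by the error margin. The sketch below assumes the 3% applies to each individual run and takes the worst case in both directions; that interpretation and the `scaling_range` helper are assumptions made for illustration.

```python
def scaling_range(gain_percent, margin_percent=3.0):
    """Worst-case range of a two-run scaling ratio given a per-run error margin."""
    ratio = 1 + gain_percent / 100
    m = margin_percent / 100
    low = ratio * (1 - m) / (1 + m)
    high = ratio * (1 + m) / (1 - m)
    return low, high


opteron = scaling_range(49)
xeon = scaling_range(53)

# The two results are indistinguishable if their ranges overlap.
overlap = opteron[1] >= xeon[0] and xeon[1] >= opteron[0]
print(f"Opteron: {opteron[0]:.2f}x to {opteron[1]:.2f}x")
print(f"Xeon:    {xeon[0]:.2f}x to {xeon[1]:.2f}x")
print(f"Ranges overlap: {overlap}")
```

The ranges overlap comfortably, which is consistent with treating the 49% and 53% figures as equivalent.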
HPC
Several of the HPC benchmarks are way too expensive for us to test, and contrary to virtualization, web servers, and databases, we have little expertise in our lab to perform and fully understand these benchmarks. Nevertheless, we can get an impression from AMD's and Intel's own benchmarking. Two applications, LS-DYNA (crash simulation) and Fluent (fluid dynamics) from Ansys seem to dominate the benchmark scene.
Back in 2007, the AMD 8350 was running at a paltry 2GHz, but it was already capable of keeping up with the best Intel CPUs. Intel seems to have improved its LS-DYNA scores a little bit. The introduction of the Xeon 5450 means that we get the same performance at a much lower TDP, but the Intel 3GHz CPU is no match for AMD's latest at 2.7GHz, as it still needs 30% more time to perform the complex crash simulation.
At the website of the Fluent benchmark we find a wealth of benchmarking info. The "Shanghai" numbers are not published there yet, so the numbers below are a combination of already published numbers (Intel) and AMD's own benchmarking numbers.
AMD's best quad-core is in a neck and neck race in the first benchmark, but the other benches show a significant (24%) to supreme (66%) advantage for the latest AMD chip. The newest AMD quad-core clearly strengthens the position of AMD in the HPC market.
Other (Windows 2003 64-bit)
Render servers are only a small part of the server market. We show you the typical render tests we have performed so many times before.
This is one of the applications where the Xeons can still flex their muscles without being slowed down by the platform. The new AMD Shanghai Opteron does much better than its older brother, but Xeons remain the best choice: a dual Xeon 3.3GHz offers 81% to 88% of the rendering power of the quad Opteron at a much lower price point.
Virtualization (ESX 3.5 Update 2/3)
As we discussed on the first page of this article, virtualization will be implemented on about half of the servers bought this year. Virtualization is thus the killer application and the most important benchmark available. Since we have not been able to carry out our own virtualization benchmarking (due to the BIOS issues described earlier), we turn to VMware's VMmark. It is a relatively reliable benchmark as the number of "exotic tricks hardly used in the real world" (see SPECjbb) is very limited.
VMware VMmark is a benchmark of consolidation. Several virtual machines performing different tasks are consolidated together and called a tile. A VMmark tile consists of:
- MS Exchange VM
- Java App VM
- An Idle VM
- Apache web server VM
- MySQL database VM
- A SAMBA fileserver VM
The first three run on a Windows 2003 guest OS, the last three on SUSE SLES 10.
To understand why this benchmark is so important just look at the number of tiles that a certain machine can support:
The difference between the Opteron 8360 SE and the 8384 "Shanghai" is only 200MHz and 4MB L3 cache. However, this small difference allows you to run 18 (!) extra virtual machines. Granted, it may require that you install more memory, but adding memory is cheaper than buying a server with more CPU sockets or adding yet another server. Roughly calculated you could say that the new quad-core Opteron allows you to consolidate 27% more virtual servers on one physical machine, which is a significant cost saving.
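The VM and percentage figures follow from the tile counts. A minimal sketch, using the 14 versus 11 tiles quoted in the market analysis later in this article and the six VMs per tile listed above; `consolidation_gain` is a hypothetical helper.

```python
VMS_PER_TILE = 6  # the six VMs listed in the tile description above


def consolidation_gain(new_tiles, old_tiles):
    """Return (extra VMs, percentage more VMs) for a higher tile count."""
    extra_vms = (new_tiles - old_tiles) * VMS_PER_TILE
    percent_more = (new_tiles / old_tiles - 1) * 100
    return extra_vms, percent_more


extra_vms, percent_more = consolidation_gain(14, 11)
print(f"{extra_vms} extra VMs, {percent_more:.0f}% more per physical machine")
```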
Of course, the number of tiles that a physical server can accommodate provides only a coarse-grain performance measure, but an important one. This is one of the few times where a higher benchmark score directly translates to a cost reduction. Performance per VM is of course also very interesting. VMware explains how they translate the performance of each different workload in different tiles into one overall score:
After a benchmark run, the workload metrics for each tile are computed and aggregated into a score for that tile. This aggregation is performed by first normalizing the different performance metrics such as MB/second and database commits/second with respect to a reference system. Then, a geometric mean of the normalized scores is computed as the final score for the tile. The resulting per-tile scores are then summed to create the final metric.
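The aggregation VMware describes (normalize, take the geometric mean per tile, sum the tiles) can be sketched in a few lines. This is an illustration of the described steps only, not VMware's harness: the workload names, the reference values, and both function names are hypothetical.

```python
import math


def tile_score(measured, reference):
    """Geometric mean of workload metrics normalized to a reference system."""
    normalized = [measured[name] / reference[name] for name in reference]
    return math.prod(normalized) ** (1 / len(normalized))


def vmmark_style_score(tiles, reference):
    """Sum of the per-tile scores."""
    return sum(tile_score(tile, reference) for tile in tiles)


# Hypothetical reference metrics (e.g. MB/second, commits/second).
reference = {"mail": 100.0, "java": 200.0, "web": 50.0, "db": 400.0, "file": 10.0}

# One tile matching the reference, one tile twice as fast on every workload.
tiles = [
    dict(reference),
    {name: 2 * value for name, value in reference.items()},
]

total = vmmark_style_score(tiles, reference)
print(f"Total score: {total:.2f}")
```

The geometric mean keeps a single fast workload from dominating a tile, while summing the tiles rewards a machine for running more of them.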
Let us take a look:
Thanks to the lower world switch times, higher clock speed, and larger cache, the new "Shanghai" Opteron 8384 improves the already impressive scores of the AMD "Barcelona" 8356 by almost 43%. The only Intel that comes somewhat close is the hex-core behemoth known as the Xeon X7460, which needs a lot more power. IBM is capable of performing a tiny bit better than Dell thanks to its custom high performance chipset.
It is clear that the Xeon 7350 is at the end of its life: it offers a little more than 2/3 of the performance of the best Opteron while using a lot more power. Even the latest improved stepping, the Xeon X5470 at 3.33GHz, cannot keep up with the new Opteron quad-core. The reason is simple: as the number of VMs increases, so do the bandwidth requirements and the number of world switches. That is exactly where the Opteron is far superior. It is game over here for Intel… until the Xeon 5570 2.93GHz arrives in March.
Pricing
You don't buy a server CPU of course; in most cases you buy a complete server. Still, the impact of the top of the line server CPUs on the total server price is still relatively high, so it does not hurt to compare on price too.
Server CPU Pricing
Intel CPU | Price | AMD CPU | Price |
Xeon X7460 2.66 GHz (6 core, 16 MB) | $2729 | | |
Xeon E7450 2.4 GHz (6 core, 12 MB) | $2301 | Opteron 8384 2.7 GHz | $2149 |
Xeon E7440 2.4 GHz (12 MB) | $1980 | Opteron 8382 2.6 GHz | $1865 |
Xeon E7430 2.13 GHz (12 MB) | $1391 | Opteron 8380 2.5 GHz | $1514 |
Xeon E7420 2.13 GHz (8 MB) | $1177 | Opteron 8378 2.4 GHz | $1165 |
 | | Opteron 8350 2.0 GHz | $873 |
Xeon L7455 2.13 GHz (6 core, 12 MB) | $2729 | | |
Xeon L7445 2.13 GHz (12 MB) | $1980 | Opteron 8347 HE 1.9 GHz | $873 |
The Opteron 8384 is clearly aimed at the six-core Xeon E7450 at 2.4GHz.
Server CPU Pricing
Intel CPU | Price | AMD CPU | Price |
Xeon X5470 3.33 GHz (120W) | $1386 | | |
 | | Opteron 2384 2.7 GHz (75W) | $989 |
Xeon E5450 3.0 GHz (80W) | $915 | Opteron 2382 2.6 GHz (75W) | $873 |
Xeon E5440 2.83 GHz (80W) | $690 | Opteron 2380 2.5 GHz (75W) | $698 |
Xeon E5430 2.66 GHz (80W) | $455 | Opteron 2378 2.4 GHz (75W) | $523 |
Xeon E5420 2.5 GHz (80W) | $316 | Opteron 2376 2.3 GHz (75W) | $377 |
Xeon E5410 2.33 GHz (80W) | $256 | | |
Xeon L5430 2.66 GHz (50W) | $562 | Opteron 8350 HE 2 GHz (55W) | $316 |
Meanwhile, the Opteron 2384 targets the 3GHz Xeon E5450.
The Opteron Killer?
Will the Xeon 5570 -- a Nehalem based Xeon at 2.93GHz -- completely change the market? Since it is still about three months away, we decided to perform our server tests on the Nehalem test kit. This gives Intel a small advantage as it can use DDR3-1066 unbuffered RAM, while the Xeons will use buffered DDR3 (note: no FB-DIMMs!). However, it is quite possible the Xeon X5550 (2.66GHz) and Xeon X5570 (2.93GHz) will get 1333MHz buffered DDR3, which means that the numbers below will be very close to the truth.
As you can see, the boost from Hyper-Threading ranges from nothing to about 12%. It looks like the newest Xeon will be about 36% faster in MySQL, 26% on our MCS website, and 22% faster on Oracle. That is quite impressive, but this is only a preview and we are only showing you single processor results. Take them with a grain of salt, but it looks like the newest Xeon will smash things up.
Market Analysis
Let us take a quick look at the complete market to see how the most interesting CPUs from Intel and AMD compare. In the first column, you see the market. In the second column is the percentage of server shipments to this market. Some markets, like ERP, OLTP, and OLAP, generate more revenue for server manufacturers; however, since we have no recent numbers on this, we'll just mention it. We compare the Opteron "Shanghai" 2.7GHz with the Xeon "Harpertown" 3GHz as they have similar pricing and power dissipation. The green zones of the market are the ones we have a decent benchmark for and which are won by AMD, the blue ones are the Intel zones, and the red parts are - for now - unknown.
AMD "Shanghai" Opteron 2.7 GHz versus Xeon "Harpertown" 3 GHz
Market | Importance | First bench | Second bench | Benchmarks/remarks |
ERP, OLTP | 10-14% | 21% | 5% | SAP, Oracle |
Reporting, OLAP | 10-17% | 27% | | MySQL |
Collaborative | 14-18% | N/a | | |
Software Dev. | 7% | N/a | | |
e-mail, DC, file/print | 32-37% | N/a | | Not really a "CPU loving" market |
Web | 10-14% | 2% | | MCS eFMS |
HPC | 4-6% | 28% | -3% to 66% | LS-DYNA, Fluent |
Other | 2%? | -18% | -15% | 3DSMax, Cinebench |
Virtualization | 33-50% | 34% | | VMmark |
Yes, our benchmarks do not cover the whole market. However, keep in mind that for a large percentage of the "infrastructure" servers, the CPU is not really an important factor for the buying decision. We are convinced that once we have set up a good "collaborative benchmark" we will cover most of the server market where the CPU performance makes a difference.
What do we learn from this overview? The new quad-core Opteron 2384 or 8384 is a success. It's a late success, but it can keep its most important competitor at a tangible distance in ERP, OLAP, and HPC. For ERP, OLTP, and OLAP, we are pretty sure our benchmarks give a good view. SAP, Oracle, and MySQL are very popular applications each in their own field, and the SQL Server results of our "AMD Opteron Shanghai" review show more or less the same picture. In these markets, it will be hard to find benchmarks that contradict our findings.
The HPC market is a lot more diverse, and since we have a limited knowledge of this market, we are sure that there are examples that show the complete opposite picture of the benchmarks we have shown here. Still, the Ansys benchmarks are good representatives of a decent part of this market.
The benchmark that really convinces us that currently the Opteron has the advantage is VMmark. Being able to consolidate 27% (14 vs. 11 tiles) to 33% (8 vs. 6 tiles) more virtual machines translates immediately into considerable cost savings. Those 27 to 33% more VMs do not result in a performance hit, as the total consolidated performance rises 34% and more. Considering that most IT investments in these uncertain times will be aimed at cutting costs, that is a huge plus for AMD.
If you skipped to this page immediately, you can find our "market analysis" on the previous page.
Looking at the Server CPUs from the point of view of the market was surprising and refreshing. The whole problem with running every benchmark you can get your hands on is that it just gets confusing. Sure we can have 10 more benchmarks that can be categorized under "other", but if either the Xeon or Opteron wins them, would that give you a better view of the market? That is why we decided to focus on finally getting that Oracle and MCS benchmark right. That is also why we rely on the more reliable industry standard benchmarks to make our analysis complete.
Right now, it is clear that the latest AMD Opteron is in the lead. We are really at a pivotal moment in time. No matter how good the current Xeon "Harpertown" and "Dunnington" architectures are, they lose too many battles due to the platform they are running on. The FSB architecture is singing its swan song. Only a small part of the market, namely:
- The ERP people who don't care about power, but who need the highest performance at any cost
- The HPC people who have extremely intensive code which does not work on sparse matrices
- The people who render
…can ignore the shortcomings of the FSB-based platform.
For most other applications, the AMD platform is simply better in price/performance and performance/watt (see our previous Shanghai review). It won't last long though, as the performance that the new Nehalem architecture has shown in OLTP, ERP, and OLAP is simply amazing. Moreover, there is little doubt that the dual Xeon 5570 with 34GB/s of bandwidth (dual Opteron is 20-21GB/s) will shine in HPC too. AMD servers can use the HyperTransport 3.0 and higher clock speeds to counter this, but that is for a later article….
A big thanks to Tijl Deneut for assisting me with the hundreds of benchmarks we ran in the past month.
References
[1] Choosing the Right Hardware for Server Virtualization (IDC Paper sponsored by: Intel), Ken Cayton, April 2008
[2] IDC's European Server virtualization forecast, July 2008