ERP & OLTP Benchmark 2: Oracle Charbench (32-bit Windows 2003 EE)

We are happy to announce our first Oracle benchmark. The Oracle 10g database has a somewhat mythical status in our Belgian lab. We have been steadily improving our MySQL tuning, which raised throughput on a 2.3GHz Barcelona from 500-600 queries per second to an impressive 1600 queries per second in the decision support benchmark (see further). However, when we tried Oracle 10g with exactly the same database and without any tuning, it produced a staggering 6000 queries per second, and we could not even get the CPU load above 40%! So we can't say we fully understand how Oracle works.
 
We used the following optimizations:
 
  • Memory: Custom, SGA size = 1536MB, PGA size = 512MB.
  • Sizing: Processes = 300.
  • Connection Mode: Dedicated Server Mode.
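
For readers who want to reproduce these settings outside the configuration wizard, here is a minimal sketch of how they might map onto Oracle initialization parameters. The values come from the list above. The parameter names (`sga_target`, `pga_aggregate_target`, `processes`, `shared_servers`) are an assumption about how those wizard fields translate, and `render_pfile` is a hypothetical helper, not part of any Oracle tool.

```python
# Values are taken from the settings list above.
# ASSUMPTION: the mapping to these parameter names is ours, not the article's.
SETTINGS = {
    "sga_target": "1536M",            # "SGA size = 1536MB"
    "pga_aggregate_target": "512M",   # "PGA size = 512MB"
    "processes": "300",               # "Processes = 300"
    "shared_servers": "0",            # assumed equivalent of Dedicated Server Mode
}


def render_pfile(settings):
    """Render the settings as name=value lines, one parameter per line."""
    return "\n".join(f"{name}={value}" for name, value in settings.items())


print(render_pfile(SETTINGS))
```

Treat the result as a starting point to compare against the parameters of a real instance, not as a verified configuration.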

Although Oracle Swingbench is freely available, it has not been easy to get repeatable results out of it. Oracle tunes itself constantly, and as Swingbench is somewhat similar to TPC-C, it requires a heavy disk subsystem to avoid being completely bottlenecked by disk I/O. That is why we test with only one CPU: the combination of a superfast Intel SLC SSD taking care of the logs and a six-disk RAID-0 set for the data is good enough to ensure that one quad-core CPU can reach up to 80% CPU load. We also repeat each test four times and report only the average of the last three runs, so that we measure with "warm caches".
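
The warm-cache averaging described above can be sketched as follows. This is only an illustration: `warm_cache_average` is a hypothetical helper, and the numbers in the example are placeholders, not measured results.

```python
from statistics import mean


def warm_cache_average(run_results, warmup_runs=1):
    """Average the runs that follow the warm-up runs.

    With four runs and warmup_runs=1, the first run is discarded
    and the remaining three are averaged.
    """
    if len(run_results) <= warmup_runs:
        raise ValueError("need at least one run after the warm-up runs")
    return mean(run_results[warmup_runs:])


# Placeholder numbers for illustration only, not benchmark results.
example_runs = [100, 120, 130, 140]
print(warm_cache_average(example_runs))
```

Discarding the first run keeps the cold-cache start from dragging down the reported number.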

Here is a look inside our lab. First, we plug the Intel X25-E Extreme SATA drive into the 3.5" drive cage of our Colfax/Supermicro servers. This disk holds all the logs. The setup looks like this:


An SSD drive completely lost in a 3.5 inch hard disk cage.

The data is stored on a Promise J300s that is connected by a 12Gbit/s Infiniband x4 wide-port SAS link to an Adaptec 5805 RAID controller, which is plugged into our servers. The Promise J300s contains a RAID-0 set of six 15000RPM Seagate SAS 300GB disks (among the fastest hard disks you can get). The Adaptec 5805 is a pretty fast RAID card, as it is equipped with a dual-core Intel IOP 348 at 1.2GHz and 512MB of DDR2. This RAID controller won't quickly become a bottleneck. Below you see the fast Infiniband x4 wide-port SAS connection from the Promise DAS to our servers.
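
As a back-of-the-envelope check on the 12Gbit/s figure, the sketch below works out the link bandwidth. The four lanes come from the "x4" in the text. The 3 Gbit/s rate per lane and the 8b/10b line encoding are assumptions about this SAS generation that the article does not state, and protocol overhead is ignored.

```python
LANES = 4                      # "x4" wide port, from the text
LANE_RATE_MBIT = 3000          # ASSUMPTION: 3 Gbit/s per SAS lane
DATA_BITS, LINE_BITS = 8, 10   # ASSUMPTION: 8b/10b line encoding


def raw_link_mbit(lanes=LANES, lane_rate_mbit=LANE_RATE_MBIT):
    """Raw signalling rate of the wide port in Mbit/s."""
    return lanes * lane_rate_mbit


def usable_mbyte_per_s(lanes=LANES, lane_rate_mbit=LANE_RATE_MBIT):
    """Payload bandwidth in MB/s after removing the line-encoding bits."""
    payload_mbit = raw_link_mbit(lanes, lane_rate_mbit) * DATA_BITS // LINE_BITS
    return payload_mbit // 8   # 8 bits per byte


print(raw_link_mbit())         # 12000 Mbit/s, the 12Gbit/s quoted above
print(usable_mbyte_per_s())    # 1200 MB/s before protocol overhead
```

Under these assumptions, the cable gives the RAID set an upper bound of roughly 1.2 GB/s of payload bandwidth.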

 

The only way to get the necessary disk spindles for our 1U servers…

Only then can we get results that are not I/O bound but CPU bound… unless we use more than one CPU.

Oracle Charbench

The 45nm quad-core Opteron gets relatively close to the mighty Xeon 7460, which has a huge L3 cache. It is remarkable that even the 3.33GHz Xeon is not able to defeat the latest Opteron.


29 Comments


  • zpdixon42 - Wednesday, December 24, 2008

    DDR2-1067: oh, you are right. I was thinking of Deneb.

    Yes performance/dollar depends on the application you are running, so what I am suggesting more precisely is that you compute some perf/$ metric for every benchmark you run. And even if the CPU price is less negligible compared to the rest of the server components, it is always interesting to look both at absolute perf and perf/$ rather than just absolute perf.

  • denka - Wednesday, December 24, 2008

    32-bit? 1.5GB SGA? This is really ridiculous. Your tests should be bottlenecked by I/O.
  • JohanAnandtech - Wednesday, December 24, 2008

    I forgot to mention that the database created is slightly larger than 1 GB. And we wouldn't be able to get >80% CPU load if we were bottlenecked by I/O.
  • denka - Wednesday, December 24, 2008

    You are right, this is a smallish database. By the way, when you report CPU utilization, would you take IOWait separate from CPU used? If taken together (which was not clear) it is possible to get 100% CPU utilization out of which 90% will be IOWait :)
  • denka - Wednesday, December 24, 2008

    Not to be negative: excellent article, by the way
  • mkruer - Tuesday, December 23, 2008

    If/when AMD does release the Istanbul (K10.5 6-core), the Nehalem will again be relegated to second place for most HPC.
  • Exar3342 - Wednesday, December 24, 2008

    Yeah, by that time we will have 8-core Sandy Bridge 32nm chips from Intel...
  • Amiga500 - Tuesday, December 23, 2008

    I guess the key battleground will be Shanghai versus Nehalem in the virtualised server space...

    AMD need their optimisations to shine through.


    It's entirely understandable that you could not conduct virtualisation tests on the Nehalem platform, but unfortunate from the point of view that it may decide whether Shanghai is a success or failure over its life as a whole. As always, time is the great enemy! :-)
  • JohanAnandtech - Tuesday, December 23, 2008

    "you could not conduct virtualisation tests on the Nehalem platform"

    Yes. At the moment we have only 3 GB of DDR3-1066. So that would make pretty poor virtualization benches indeed.

    "unfortunate from the point of view that it may decide whether Shanghai is a success or failure"

    Personally, I think this might still be one of Shanghai's strong points. Virtualization is about memory bandwidth, cache size and TLBs. Shanghai can't beat Nehalem's bandwidth, but when it comes to TLB size it can make up a bit.
  • VooDooAddict - Tuesday, December 23, 2008

    With the VMware benchmark, it is really just a measure of the CPU / memory. Unless you are running applications with very small datasets where everything fits into RAM, the primary bottleneck I've run into is the storage system. I find it much better to focus your hardware funds on the storage system and use the company-standard hardware for the server platform.

    This isn't to say the bench isn't useful. Just wanted to let people know not to base your VMware buildout solely on those numbers.
