Processor Performance: Now Dual-Core Flavored
Apple uses what’s referred to as a system-on-a-chip (SoC) in its mobile devices like the iPad and iPhone. In this particular implementation the SoC includes the processor core (or cores), graphics processing, and RAM in a package-on-package. Because those components sit next to each other in the same package, data transfers are achieved more efficiently. Moreover, less PCB space is consumed, since more functionality lives in one on-board component.
The influence of an SoC isn't all positive, however. A heavily integrated IC still has specific physical and thermal constraints, so the SoC's constituent subsystems aren't as potent as they might be if they were discrete.
Intel's Sandy Bridge architecture is a good example. The company improved platform performance while simultaneously trimming power versus its previous-generation design. However, keeping processing, memory control, cache, and graphics in the same 95 W thermal window required concessions. The HD Graphics engine is perhaps the clearest indicator that Intel was working with a very specific transistor budget. Though the company's engineers created an engine deemed "good enough" for many desktop workloads, discrete graphics cards like AMD's Radeon HD 6970 and Nvidia's GeForce GTX 580 demonstrate how much more headroom there is when a design isn't bound by the constraints of a more integrated solution.
Specification | Apple A4 (iPad) | Apple A5 (iPad 2) |
---|---|---|
Processor | 1 GHz ARM Cortex-A8 (single-core) | 1 GHz ARM Cortex-A9 (dual-core) |
Memory | 256 MB LP-DDR (single-channel?) | 512 MB LP-DDR2 (dual-channel) |
Graphics | PowerVR SGX535 (single-core) | PowerVR SGX545MP2 (dual-core) |
L1 Cache (Instruction/Data) | 32 KB / 32 KB | 32 KB / 32 KB |
L2 Cache | 640 KB | 1 MB |
The iPad 2 features Apple's newest SoC, the A5, which is completely different from the A4 in the original iPad. Let's start with what changes in the CPU.
Feature | ARM Cortex-A8 | ARM Cortex-A9 |
---|---|---|
Package Size | 198.8 mm² | 238.8 mm² |
Issue Width | dual-issue | multi-issue |
Out-of-Order Execution | No | Yes |
Execution Pipeline Depth | 13 stages | 8 stages |
FPU | VFPv3 | VFPv3 |
Processing Power | 2.0 DMIPS/MHz/Core | 2.5 DMIPS/MHz/Core |
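To put the per-core figures in the table into perspective, here is a minimal sketch that turns DMIPS/MHz/core into a rough theoretical throughput for each SoC. The 1 GHz clocks and core counts come from the A4/A5 comparison earlier on this page; the function name `peak_dmips` is made up for illustration, and perfect scaling across both cores is an assumption rather than a measurement.

```python
# Rough theoretical Dhrystone throughput from the table figures.
# Assumption: throughput scales perfectly with clock speed and core count.

def peak_dmips(dmips_per_mhz: float, clock_mhz: int, cores: int) -> float:
    """DMIPS/MHz/core multiplied by clock (MHz) and number of cores."""
    return dmips_per_mhz * clock_mhz * cores

a4 = peak_dmips(2.0, 1000, 1)  # single-core Cortex-A8 at 1 GHz
a5 = peak_dmips(2.5, 1000, 2)  # dual-core Cortex-A9 at 1 GHz

print(f"A4: {a4:.0f} DMIPS")
print(f"A5: {a5:.0f} DMIPS")
print(f"Ratio: {a5 / a4:.1f}x")
```

On paper, that works out to a best-case ceiling for fully threaded code; single-threaded work only sees the per-core improvement.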
Instead of the iPad’s ARM Cortex-A8, the iPad 2 uses a dual-core ARM Cortex-A9 with a total of 1 MB L2 cache. At the architectural level, the major difference is out-of-order execution. This is regarded as a higher-performance approach than in-order execution, which executes instructions in the order they appear. An out-of-order design schedules instructions based on the availability of input data, thereby preventing the pipeline from sitting idle as data is retrieved.
If you want to draw a summertime analogy, consider the process of preparing a glass of ice water. You could choose to put the ice in the cup before you get the water, or you might fill the cup with water before getting the ice. The quicker order depends on where you are in relation to the refrigerator and the faucet. Out-of-order execution pipelines operate similarly.
The problem is that out-of-order execution requires extra die space in order to rearrange all those operations, which means that you're using more transistors and increasing energy consumption. That's one reason why Intel's small, power-efficient Atom architecture employs in-order execution. The benefit, however, is improved performance, as fewer CPU cycles are wasted. The fact that Apple moved to out-of-order execution is indicative of its emphasis on augmenting the iPad 2's performance.
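The difference between the two scheduling approaches can be sketched with a toy model. This is not a model of the Cortex-A8 or Cortex-A9: it assumes a single-issue machine with an unlimited scheduling window, and the instruction names and latencies are invented purely for illustration. The in-order variant may only issue the oldest waiting instruction, while the out-of-order variant issues the oldest instruction whose inputs are ready.

```python
from dataclasses import dataclass, field

@dataclass
class Instr:
    name: str
    latency: int                      # cycles until the result is available
    deps: list = field(default_factory=list)

def run(program, out_of_order):
    """Return the cycle at which the last result becomes available."""
    finish = {}                       # name -> cycle when its result is ready
    pending = list(program)
    cycle = 0
    while pending:
        # In-order issue may only look at the oldest pending instruction.
        candidates = pending if out_of_order else pending[:1]
        for ins in candidates:
            ready = all(d in finish and finish[d] <= cycle for d in ins.deps)
            if ready:
                finish[ins.name] = cycle + ins.latency
                pending.remove(ins)
                break                 # at most one instruction per cycle
        cycle += 1
    return max(finish.values())

# Hypothetical program: two slow loads, each followed by a dependent use,
# plus one independent instruction at the end.
program = [
    Instr("load_a", 4),
    Instr("use_a", 1, ["load_a"]),
    Instr("load_b", 4),
    Instr("use_b", 1, ["load_b"]),
    Instr("indep", 1),
]

print("in-order cycles:    ", run(program, out_of_order=False))
print("out-of-order cycles:", run(program, out_of_order=True))
```

In this toy case the in-order machine stalls behind each load, while the out-of-order machine overlaps the second load and the independent instruction with the first load's latency. The extra bookkeeping needed to find ready instructions is the part that costs die space and power in real hardware.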
According to analysis done by Chipworks, Apple also couples its dual-core ARM Cortex-A9 with 512 MB of LP-DDR2 (low-power DDR2). The original iPad only used 256 MB of LP-DDR. So, not only do we have twice as much memory, we also have it delivered through a more modern memory technology (LP-DDR2 versus LP-DDR).
CPU Performance
Geekbench is a synthetic benchmark similar to SiSoftware's Sandra, and it's one of the few benchmarks available for iOS. The best part about Geekbench, however, is that it's offered on multiple platforms. That means we can use it to make apples-to-apples comparisons against low-power x86-based devices like netbooks.
Geekbench v2 (Score in Points, Higher is Better) | Apple iPad | Apple iPad 2 | Dell Mini 1012 (Atom N450) |
---|---|---|---|
Overall | 456 | 746 | 917 |
Integer | 365 | 681 | 910 |
Floating Point | 458 | 909 | 762 |
Memory | 678 | 787 | 1105 |
Stream | 325 | 323 | 1112 |
Single-threaded floating point and integer performance is much stronger on the iPad 2 than its predecessor. On average, performance nearly doubles.
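For readers who want the generational change as ratios, the following sketch divides each iPad 2 section score by the original iPad's score. The numbers are copied from the summary table above; the variable names are arbitrary.

```python
# Geekbench v2 section scores from the summary table: (iPad, iPad 2)
scores = {
    "Overall": (456, 746),
    "Integer": (365, 681),
    "Floating Point": (458, 909),
    "Memory": (678, 787),
    "Stream": (325, 323),
}

# Ratio above 1.0 means the iPad 2 scores higher than the original iPad.
speedup = {name: ipad2 / ipad for name, (ipad, ipad2) in scores.items()}

for name, ratio in speedup.items():
    print(f"{name:15s} {ratio:.2f}x")
```

The integer and floating point sections land close to 2x, while the memory and stream sections move far less.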
The Cortex-A9 demonstrates a large lead in single-threaded scenarios due to its updated execution pipeline. However, threaded floating point performance sees an even larger boost, as the architecture's advantages are multiplied by the increased parallelism enabled by a second core. I should point out, though, that this doesn’t necessarily translate into better real-world performance. Most apps have a greater tendency to rely on integer performance. That's the case whether you're talking about iTunes on the desktop or on the iPad.
From an architectural standpoint, we've come a long way since the original iPad debuted. But tablets still fall well short of netbook-class performance. Intel's old Atom N450 still manages to outclass even Apple's latest hardware.
Geekbench v2 (detailed results) | Apple iPad | Apple iPad 2 | Dell Mini 1012 |
---|---|---|---|
Integer Section | |||
Blowfish (single-threaded scalar) | 13.6 MB/s | 13.2 MB/s | 26.2 MB/s |
Blowfish (multi-threaded scalar) | 14.3 MB/s | 26.0 MB/s | 41.5 MB/s |
Text Compress (single-threaded scalar) | 1.25 MB/s | 1.49 MB/s | 2.49 MB/s |
Text Compress (multi-threaded scalar) | 1.20 MB/s | 2.79 MB/s | 3.60 MB/s |
Text Decompress (single-threaded scalar) | 1.13 MB/s | 2.07 MB/s | 3.22 MB/s |
Text Decompress (multi-threaded scalar) | 1.09 MB/s | 3.24 MB/s | 4.86 MB/s |
Image Compress (single-threaded scalar) | 3.26 Mpixels/s | 3.77 Mpixels/s | 6.00 Mpixels/s |
Image Compress (multi-threaded scalar) | 3.38 Mpixels/s | 7.42 Mpixels/s | 8.81 Mpixels/s |
Image Decompress (single-threaded scalar) | 6.12 Mpixels/s | 6.66 Mpixels/s | 9.98 Mpixels/s |
Image Decompress (multi-threaded scalar) | 6.04 Mpixels/s | 12.8 Mpixels/s | 15.0 Mpixels/s |
Lua (single-threaded scalar) | 173.5 Knodes/s | 272.6 Knodes/s | 340.4 Knodes/s |
Lua (multi-threaded scalar) | 172.9 Knodes/s | 535.0 Knodes/s | 488.4 Knodes/s |
Floating Point Section | |||
Mandelbrot (single-threaded scalar) | 79.9 MFLOPS | 278.8 MFLOPS | 339.6 MFLOPS |
Mandelbrot (multi-threaded scalar) | 79.4 MFLOPS | 549.0 MFLOPS | 613.2 MFLOPS |
Dot Product (single-threaded scalar) | 247.5 MFLOPS | 221.3 MFLOPS | 204.9 MFLOPS |
Dot Product (multi-threaded scalar) | 246.2 MFLOPS | 435.5 MFLOPS | 361.5 MFLOPS |
LU Decomposition (single-threaded scalar) | 50.5 MFLOPS | 207.3 MFLOPS | 309.7 MFLOPS |
LU Decomposition (multi-threaded scalar) | 54.7 MFLOPS | 403.4 MFLOPS | 534.0 MFLOPS |
Primality Test (single-threaded scalar) | 71.4 MFLOPS | 176.6 MFLOPS | 126.7 MFLOPS |
Primality Test (multi-threaded scalar) | 69.2 MFLOPS | 316.8 MFLOPS | 194.5 MFLOPS |
Sharpen Image (single-threaded scalar) | 1.51 Mpixels/s | 1.68 Mpixels/s | 482.1 Kpixels/s |
Sharpen Image (multi-threaded scalar) | 1.52 Mpixels/s | 3.32 Mpixels/s | 858.9 Kpixels/s |
Blur Image (single-threaded scalar) | 762.2 Kpixels/s | 664.4 Kpixels/s | 535.6 Kpixels/s |
Blur Image (multi-threaded scalar) | 762.0 Kpixels/s | 1.31 Mpixels/s | 941.5 Kpixels/s |
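One way to see the second core at work is to divide each multi-threaded result by its single-threaded counterpart. The sketch below does that for a handful of rows copied from the detailed table above; the selection of rows and the helper name `scaling` are arbitrary choices for illustration.

```python
# (single-threaded, multi-threaded) results copied from the detailed table
results = {
    "Blowfish (MB/s)":      {"iPad": (13.6, 14.3),   "iPad 2": (13.2, 26.0)},
    "Lua (Knodes/s)":       {"iPad": (173.5, 172.9), "iPad 2": (272.6, 535.0)},
    "Mandelbrot (MFLOPS)":  {"iPad": (79.9, 79.4),   "iPad 2": (278.8, 549.0)},
    "Dot Product (MFLOPS)": {"iPad": (247.5, 246.2), "iPad 2": (221.3, 435.5)},
}

def scaling(single: float, multi: float) -> float:
    """Multi-threaded result relative to single-threaded (1.0 = no gain)."""
    return multi / single

for test, devices in results.items():
    for device, (single, multi) in devices.items():
        print(f"{test:22s} {device:7s} {scaling(single, multi):.2f}x")
```

The single-core iPad stays near 1.0x in these rows, while the iPad 2 comes close to 2x, which is what you would expect from a second core on workloads that thread well.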
The write sequential and stdlib write memory tests in Geekbench confirm better RAM performance, but it's difficult to separate how much of this is due to memory technology and how much is attributable to the processor. At the end of the day, it really doesn't matter; what does is that throughput goes up.
Intel's Atom N450 still manages to remain top dog, despite its 64-bit single-channel interface. The Atom only falls behind in the stdlib allocate and write tests. However, the N450's 1.97 GB/s score in read sequential is about 6x higher than what we see in the iPad 2.
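As a quick check on the "about 6x" figure, the sketch below divides the Atom's read sequential result by the iPad 2's. Whether Geekbench reports GB as 1000 MB or 1024 MB is not stated here, so both conversions are shown as assumptions.

```python
# Read Sequential results from the memory table
atom_gb_per_s = 1.97      # Dell Mini 1012 (Atom N450)
ipad2_mb_per_s = 342.2    # Apple iPad 2

# Assumption: the GB unit could be decimal (1000 MB) or binary (1024 MB).
ratio_decimal = atom_gb_per_s * 1000 / ipad2_mb_per_s
ratio_binary = atom_gb_per_s * 1024 / ipad2_mb_per_s

print(f"1 GB = 1000 MB: {ratio_decimal:.2f}x")
print(f"1 GB = 1024 MB: {ratio_binary:.2f}x")
```

Either way, the ratio rounds to roughly six.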
Geekbench v2 (detailed results) | Apple iPad | Apple iPad 2 | Dell Mini 1012 |
---|---|---|---|
Memory Score | |||
Read Sequential (single-threaded scalar) | 306 MB/s | 342.2 MB/s | 1.97 GB/s |
Write Sequential (single-threaded scalar) | 849.1 MB/s | 1.02 GB/s | 1.32 GB/s |
Stdlib Allocate (single-threaded scalar) | 1.99 Mallocs/s | 1.83 Mallocs/s | 1.25 Mallocs/s |
Stdlib Write (single-threaded scalar) | 1.28 GB/s | 2.57 GB/s | 1.34 GB/s |
Stdlib Copy (single-threaded scalar) | 830.4 MB/s | 474.8 MB/s | 1.03 GB/s |
Stream Score | |||
Stream Copy (single-threaded scalar) | 465.5 MB/s | 449.9 MB/s | 1.18 GB/s |
Stream Scale (single-threaded scalar) | 320.5 MB/s | 372.5 MB/s | 1.08 GB/s |
Stream Add (single-threaded scalar) | 655.9 MB/s | 606.3 MB/s | 1.41 GB/s |
Stream Triad (single-threaded scalar) | 427.4 MB/s | 426.6 MB/s | 1.11 GB/s |