AnandTech's Journal
 

Saturday, September 22nd, 2012

    3:31a
    The iPhone 5 Performance Preview

    This morning we finally got our hands on Apple's iPhone 5. While we're eager to get started on battery life testing, that'll happen late tonight, after a full day's worth of use and a recharge cycle. In the meantime, we went straight to work on performance testing. As we've mentioned before, the A6 SoC uses a pair of Apple's own CPU cores that implement the ARMv7 ISA. These aren't vanilla Cortex A9s or Cortex A15s, but rather something of Apple's own design. For graphics, Apple integrated a PowerVR SGX543MP3 running at higher clocks than the dual-core 543MP2 in the A5. The result is compute performance similar to the A5X in Apple's 3rd generation iPad, but with a smaller overall die area. The A6 has a narrower memory interface than the A5X (64 bits vs. 128 bits), but that makes sense given the much lower display resolution (0.7MP vs. 3.1MP).

    As always, our performance analysis starts with the CPU. Although we originally thought the A6 ran its two CPU cores at 1GHz, it looks like max clocks range between 800MHz and 1.2GHz depending on load. Geekbench reports the clock speed at launch, and the value varied with CPU load: with an app download running in the background it reported 1.2GHz, and with everything quiet in the background it reported 800MHz. This isn't anything new, as dynamic voltage/frequency scaling is present in all modern smartphones, but we now have a better idea of the A6's range.

    The other thing I noticed is that without an active network I'm able to get another ~10% performance boost over the standard on-network results. Take the BrowserMark results below, for example: the first two runs are from before the iPhone 5 was active on AT&T's network, while the latter two are from after I'd migrated my account over. The same was true for SunSpider performance; I saw numbers around 810ms before I registered the device with AT&T.

    Overall, the performance of the A6 CPU cores seems to be very good. The iPhone 4S numbers below are updated to iOS 6.0 so you can get an idea of performance improvement.

    BrowserMark

    SunSpider Javascript Benchmark 0.9.1 - Stock Browser

    As we mentioned in our earlier post, SunSpider is a small enough benchmark that it really acts as a cache test. The memory interface on the A6 seems tangibly better than in any previous ARM-based design, and the advantage here even outpaces Intel's own Medfield SoC.

    I also ran some data using Google's V8 and Octane benchmarks, both bigger JavaScript tests than SunSpider. I had an AT&T HTC One X with me while in New York today (up here for meetings this week) and included its results in the charts below. Note that the default HTC web browser won't run the full Octane suite so I used Chrome there. I didn't use Chrome for the V8 test because it produced lower numbers than the stock browser for some reason.

    Google V8 Benchmark - Version 7

    Google Octane Benchmark v1

    Here we see huge gains over the iPhone 4S, but much closer performance to the One X. In Google's V8 benchmark the two phones are effectively identical, while Octane gives the iPhone 5 a 30% lead.

    These are still narrowly focused tests; we'll be doing more holistic browser testing over the coming days. Finally, we have Geekbench 2 comparing the iPhone 5 and 4S:

    Geekbench 2 Performance
    Geekbench 2 Overall Scores   Apple iPhone 4S   Apple iPhone 5
    Geekbench Score              628               1640
    Integer                      545               1252
    Floating Point               737               2101
    Memory                       747               1862
    Stream                       299               946

    Apple claimed a 2x CPU performance advantage over the iPhone 4S during the iPhone 5 launch event. How does that claim match up with our numbers? Pretty well, actually:

    This is hardly the most comprehensive list of CPU benchmarks, but on average we're seeing the iPhone 5 deliver 2.13x the scores of the iPhone 4S. We'll be running more application-level tests over the coming days, so stay tuned.

    A6 GPU Performance: Nearly Identical to the iPad 3

    Before we got a die shot of Apple's A6 we had good information pointing to a three core PowerVR SGX 543MP3 in the new design. As a recap, Imagination Technologies' PowerVR SGX543 GPU core features four USSE2 pipes, each with a 4-way vector ALU that can crank out 4 multiply-adds per clock. That works out to 16 MADs per clock per core, or 32 FLOPs. Imagination lets customers stick multiple 543 cores together, which scales compute performance linearly. The A5 featured a two core design running at approximately 200MHz based on our latest information, while the A5X in the 3rd generation iPad featured a four core design at the same 200MHz clock speed.
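
    To put the arithmetic in one place, here's a minimal Python sketch of the peak-FLOPS math described above (a MAD counts as 2 floating point operations; core count and clock speed are the knobs Apple turns):

        # Peak shader ALU throughput for PowerVR SGX543MPn designs.
        # Each 543 core has 4 USSE2 pipes, each with a 4-wide vector MAD unit,
        # and a multiply-add counts as 2 floating point operations.
        def sgx543_peak_gflops(cores, clock_mhz):
            mads_per_clock = cores * 4 * 4  # cores x pipes x vector width
            return mads_per_clock * 2 * clock_mhz / 1000.0

        print(sgx543_peak_gflops(2, 200))  # A5, 543MP2 @ 200MHz  -> 12.8
        print(sgx543_peak_gflops(4, 200))  # A5X, 543MP4 @ 200MHz -> 25.6
        print(sgx543_peak_gflops(3, 200))  # 543MP3 @ 200MHz      -> 19.2
        print(sgx543_peak_gflops(3, 266))  # ~25.5; Apple's shipped figure implies roughly this clock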

    The A6, on the other hand, features a three core PowerVR SGX 543MP3 running at higher clock speeds, keeping die size in check while still delivering on Apple's 2x GPU performance claim. The raw specs are below:

    Mobile SoC GPU Comparison
    GPU                  SIMD Name   # of SIMDs   MADs per SIMD   Total MADs   GFLOPS @ 200MHz   GFLOPS As Shipped (Apple/ASUS)
    Adreno 225           -           8            4               32           12.8              -
    PowerVR SGX 540      USSE        4            2               8            3.2               -
    PowerVR SGX 543MP2   USSE2       8            4               32           12.8              12.8
    PowerVR SGX 543MP3   USSE2       12           4               48           19.2              25.5
    PowerVR SGX 543MP4   USSE2       16           4               64           25.6              25.6
    Mali-400 MP4         Core        4 + 1        4 / 2           18           7.2               -
    Tegra 3              Core        12           1               12           4.8               12

    The result is peak theoretical GPU performance that's nearly identical to the A5X in the 3rd generation iPad. The main difference is memory bandwidth: the A5X features a 128-bit wide memory interface, while the A6 retains the same 64-bit wide interface as the standard A5. In memory bandwidth limited situations the A5X will still be quicker, but it's quite likely we won't see that happen at the iPhone 5's native resolution.
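
    As a rough sketch of why interface width matters, peak theoretical bandwidth is just bus width times data rate. The LPDDR2-1066 rate below is an illustrative assumption, not a confirmed A5X/A6 figure:

        # Peak theoretical DRAM bandwidth: bus width (bits) x data rate (MT/s).
        def peak_bandwidth_gbps(bus_bits, data_rate_mtps):
            return bus_bits / 8 * data_rate_mtps / 1000.0  # GB/s

        # At the same assumed data rate, the A5X's 128-bit interface has
        # exactly twice the A6's peak bandwidth:
        print(peak_bandwidth_gbps(64, 1066))   # ~8.5 GB/s  (64-bit, A6-style)
        print(peak_bandwidth_gbps(128, 1066))  # ~17.1 GB/s (128-bit, A5X-style)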

    We ran through the full GLBenchmark 2.5 suite to get a good idea of GPU performance. Note that the 3rd gen iPad results are still on iOS 5.1 so there's a chance you'll see some numbers change as we move to iOS 6.

    We'll start out with the raw theoretical numbers beginning with fill rate:

    GLBenchmark 2.5 - Fill Test

    The iPhone 5 nips at the heels of the 3rd generation iPad here, at 1.65GTexels/s. The performance advantage over the iPhone 4S is more than double, and even the Galaxy S 3 can't come close.

    GLBenchmark 2.5 - Fill Test (Offscreen 1080p)

    Triangle throughput is similarly strong:

    GLBenchmark 2.5 - Triangle Texture Test

    At its native resolution the iPhone 5 is actually faster than the new iPad, but normalize for resolution using GLBenchmark's offscreen mode and the A5X and A6 look identical:

    GLBenchmark 2.5 - Triangle Texture Test (Offscreen 1080p)

    The iPhone 5 does very well in the fragment lit texture test; once again, when you take into account the much lower resolution of the 5's display, performance is significantly better than on the iPad:

    GLBenchmark 2.5 - Triangle Texture Test - Fragment Lit

    GLBenchmark 2.5 - Triangle Texture Test - Fragment Lit (Offscreen 1080p)

    GLBenchmark 2.5 - Triangle Texture Test - Vertex Lit

    GLBenchmark 2.5 - Triangle Texture Test - Vertex Lit (Offscreen 1080p)

    The next set of results are the gameplay simulation tests, which attempt to give you an idea of what game performance based on Kishonti's engine would look like. These tests tend to be compute monsters, so they'll make a great stress test for the iPhone 5's new GPU:

    GLBenchmark 2.5 - Egypt HD

    Egypt HD was the great equalizer when we first met it, but the iPhone 5 does very well here. The biggest surprise however is just how well the Qualcomm Snapdragon S4 Pro with Adreno 320 GPU does by comparison. LG's Optimus G, a device Brian flew to Seoul, South Korea to benchmark, is hot on the heels of the new iPhone.

    GLBenchmark 2.5 - Egypt HD (Offscreen 1080p)

    When we run everything at 1080p the iPhone 5 looks a lot like the new iPad, delivering about 2x the performance of the Galaxy S 3. Here, LG's Optimus G actually outperforms the iPhone 5! It looks like Qualcomm's Adreno 320 is quite competent in a phone.

    GLBenchmark 2.5 - Egypt Classic

    The Egypt Classic tests are much lighter workloads and are likely a good indication of the performance you can expect from many of the games available on the App Store today. At its native resolution, the iPhone 5 has no problem hitting the 60 fps vsync limit.

    GLBenchmark 2.5 - Egypt Classic (Offscreen 1080p)

    Remove vsync, render at 1080p, and you see what these GPUs can really do. Here the iPhone 5 pulls ahead of the Adreno 320 based LG Optimus G, and even slightly ahead of the new iPad.

    Once again, looking at GLBenchmark's on-screen and offscreen Egypt tests we can get a good idea of how the iPhone 5 measures up to Apple's claims of 2x the GPU performance of the iPhone 4S:

    Removing the clearly vsync limited result from the on-screen Egypt Classic test, the iPhone 5 performs at about 2.26x the speed of the 4S. Include that result and you're still looking at a 1.95x average. As we've seen in the past, these gains typically translate not into dramatically higher frame rates, but into games with better visual quality.

    Final Words

    We still have a lot of work ahead of us, including evaluating the power profile of the new A6 SoC. Stay tuned for more data in our full review of the iPhone 5.

    11:02p
    Additional Details on Micron’s DDR3L-RS, DDR4-RS, and Other Memory

    Earlier this week we posted a short write-up about Micron’s new DDR3L-RS memory. We didn’t have a lot of technical detail to go on at the time, but Micron offered us a chance to chat with them on the phone and we were able to get more information about DDR3L-RS as well as their other memory products. Memory is something many of us take for granted in our PCs and other computing devices, but there’s a lot more going on in the market than you might expect.

    If you need the least expensive memory possible, DDR3 is currently the way to go. On the other hand, if you’re making a mobile device, finding memory that uses less power, even if it costs more, might be the best option. Naturally, there are plenty of other options that fall somewhere between those extremes; Micron provided us with the following chart showing where the various memory types fall in terms of price vs. power requirements.

    Starting at the top with LPDDR3 and other LPDDR products, their specialized nature is what gives them both their low power usage and their higher cost: consider how most tablets and smartphones ship with only 1GB of LPDDR or less right now. This is chiefly due to the complexity of the designs; for example, the memory might be integrated into an SoC or placed in a PoP package. The result is that while LPDDR offers the best power characteristics, its volume is much lower, as it’s generally not used in high-volume markets like laptops and PCs with 8GB or more of RAM. We haven’t seen any laptops that use LPDDR so far (at least, none that I’m aware of), but Intel reportedly has LPDDR3 support on their Ultrabook roadmaps, which would allow for improved battery life as well as smaller/thinner designs.

    At the other end of the spectrum we have DDR3, a commodity memory where low price is generally the primary consideration. These devices are mass produced, so economies of scale, along with less aggressive targets (e.g. 1.5V and DDR3-1600 speeds), allow them to reach lower price points. Right now, for example, you can find a kit of 8GB DDR3-1600 CL9 SO-DIMMs for around $40 (and under $35 for DDR3-1333 and/or CL11).

    One step up from DDR3 in terms of power efficiency is DDR3L, which targets 1.35V instead of 1.5V. Power is the product of voltage and current (P = V * I), and reducing the voltage typically reduces the current the chip requires as well, so power draw falls faster than the voltage does. Getting chips that will run at a lower voltage is mainly a matter of binning, along with improvements in process technology, so costs are very similar to regular DDR3. Sticking with the previous example, the same DDR3-1600 CL9 kits cost about 10% more for 1.35V DDR3L. Note that most DDR3L laptop kits will also run fine at 1.5V, but if you want to run at 1.35V you’ll generally need a laptop specifically designed for the lower voltage (Apple’s MacBook Pros, for instance, use 1.35V CL11 memory).
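
    To put a rough number on that: if current scales more or less in proportion to voltage, power behaves like V squared, making the drop from 1.5V to 1.35V worth close to 20%. A quick sketch (the quadratic scaling is a simplifying assumption; actual savings depend on the part):

        # Rough DDR3 -> DDR3L power estimate, assuming I scales with V
        # so that P = V * I behaves like P ~ V^2.
        v_ddr3, v_ddr3l = 1.50, 1.35
        relative_power = (v_ddr3l / v_ddr3) ** 2
        print(f"DDR3L at 1.35V draws ~{relative_power:.0%} of DDR3's power")  # ~81%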

    Straddling the line between LPDDR3 and DDR3L, we have Micron’s new DDR3L-RS memory. The RS suffix stands for “Reduced Standby”, and through a process of binning along with a few extra features, Micron is able to cut standby power use for a system by around 25%. DDR3L-RS is also backwards compatible with the DDR3 standard, so there’s no change necessary at the controller level—all the extra work happens in the memory devices. Micron couldn’t discuss specific prices of their various memory types, but they did suggest that at the component level DDR3L-RS should cost around 20% more than DDR3L. In terms of power efficiency, Micron provided the following information showing their expected power savings:

    One of Micron’s key features for reducing standby power is TCSR: Temperature Controlled Self Refresh. Most systems are specified to run the RAM at up to 85C when active, but in sleep mode temperatures drop substantially, opening the door to additional power savings. In the case of Micron’s DDR3L-RS, once the temperature drops to 45C or below, the devices can refresh the RAM less frequently and thereby reduce power draw. It’s also important to remember that DDR3L-RS won’t perform any better than DDR3L in active use; as the name implies, its benefits come only when the memory/system is in standby.
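
    As an illustration of how TCSR saves power: JEDEC DDR3 devices are normally refreshed across a 64ms window at case temperatures up to 85C, and refresh power is roughly proportional to how often refreshes occur. The 2x interval extension below is an assumption for illustration, not Micron’s published figure:

        # Illustrative sketch of temperature-controlled self refresh (TCSR).
        # DDR3 normally refreshes every row within a 64 ms window at up to 85C;
        # cooler cells retain data longer, so the window can be stretched.
        BASE_REFRESH_WINDOW_MS = 64

        def refresh_window_ms(temp_c, cool_threshold_c=45, extension=2):
            # extension=2 is a hypothetical factor, not Micron's spec
            return BASE_REFRESH_WINDOW_MS * (extension if temp_c <= cool_threshold_c else 1)

        print(refresh_window_ms(85))  # 64  -> full-rate refresh
        print(refresh_window_ms(40))  # 128 -> half as many refresh operations in cool standby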

    We should note that while we’re talking about Micron’s specific memory, DDR3L-RS, it is expected that the other major memory manufacturers (e.g. Hynix, Elpida, Samsung, etc.) will have similar RAM technologies, though the specifics of how they save power may vary among the suppliers.

    What about future memory technologies like DDR4? Micron also discussed some of their upcoming DDR4-based designs with us, and as with the switch from DDR2 to DDR3, the change from DDR3 to DDR4 will require new memory controllers and will not be backwards compatible. One of the biggest changes with DDR4 is that the standard voltage drops from 1.5V (DDR3) down to 1.2V, enabling power savings beyond even DDR3L. Micron will also have DDR4-RS memory available, and we’ll likely see products (e.g. some tablets) start to use it as soon as late 2012/early 2013. While Intel hasn’t officially made any statements regarding Haswell’s memory technology, the emphasis on reducing power use would make that an ideal time for Intel to switch from a DDR3 controller to DDR4; we should know more in the coming months.

    Wrapping up, there’s obviously no single “silver bullet” memory technology that works best for all markets. Paying a price premium for DDR3L or DDR3L-RS on a desktop just to save a couple watts doesn’t really make sense, while on laptops, and Ultrabooks in particular, the power savings could definitely be worthwhile. Like other DRAM manufacturers, Micron aims to offer a broad selection of DRAM devices, so customers can choose based on cost, form factor, power, etc. and find the best balance of features and pricing for their product. Margins on memory products have become razor thin over the years, so anything that helps companies like Micron improve their bottom line is something they will pursue; currently, the ultrathin computing push (tablets, Ultrabooks, sleekbooks, etc.) is really driving improvements in memory technology.


