AnandTech's Journal
 
Sunday, February 5th, 2017

    Time Event
    6:00p
    NVIDIA Announces Quadro P4000, P2000, P1000, P600, & P400 - Finishing the Quadro Pascal Refresh

    Alongside today’s big announcement of the GP100-powered Quadro GP100, NVIDIA is also announcing a sizable refresh to the rest of the Quadro family at SOLIDWORKS World. Counting the Quadro GP100, NVIDIA is announcing six new Quadro Pascal cards today, joining the two existing Pascal cards and completing the Quadro family’s Pascal refresh.

    As we’ve already covered the Pascal Quadro family feature set in some detail with the Quadro P6000 launch, I won’t go over it in great depth here. But at a high level, Pascal and the cards based on it bring with them a few core improvements over previous-generation Maxwell and Kepler cards. These include much greater performance thanks to the smaller 16nm manufacturing process and resulting wider GPUs, single-port support for 5K@60Hz monitors thanks to DisplayPort 1.4, and Simultaneous Multi-Projection (SMP) for VR.

    Moving on to the products then, the five cards covered here are being launched to replace the M-series and K-series cards in the same segments. Over the years NVIDIA has worked out a distinct price/feature/TDP structure for their Quadro lineup, with this latest generation of cards meant to slot right into that.

    NVIDIA Quadro Specification Comparison (4000/2000)
                          Quadro P4000   Quadro P2000   Quadro M4000   Quadro M2000
    CUDA Cores            1792           1024           1664           768
    Boost Clock           ~1480MHz       ~1470MHz       800MHz         1180MHz
    FP32 TFLOPS           5.3 TFLOPS     3.0 TFLOPS     2.66 TFLOPS    1.81 TFLOPS
    Memory Bus Width      256-bit        160-bit        256-bit        128-bit
    VRAM                  8GB            5GB            8GB            4GB
    FP64                  1/32           1/32           1/32           1/32
    TDP                   105W           75W            120W           75W
    GPU                   GP104          GP106          GM204          GM206
    Architecture          Pascal         Pascal         Maxwell 2      Maxwell 2
    Size                  Single-Slot    Single-Slot    Single-Slot    Single-Slot
    DisplayPort Outputs   4              4              4              4

    At the mid-range of the market are the Quadro P4000 and P2000. As the more powerful of the two, the Quadro P4000 is based on a cut-down GP104 GPU, the same GPU used in the Quadro P5000. However, unlike the P5000, the P4000 has been cut down to size as a single-slot card. On paper it should deliver right around 2x the performance of the M4000, and it can do so with a TDP of 105W, 15W lower than its predecessor. It is also the lowest-tier card that NVIDIA still classifies as VR ready; below the P4000 they don’t recommend their cards for VR development.
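
    As a quick sanity check on the paper numbers, the FP32 figures in the table can be reproduced from core count and boost clock. The following is a minimal Python sketch, assuming the usual convention of 2 FLOPs (one fused multiply-add) per CUDA core per clock and the approximate boost clocks listed above; the function name fp32_tflops is my own and not anything from NVIDIA.

```python
def fp32_tflops(cuda_cores: int, boost_mhz: float) -> float:
    """Peak FP32 throughput in TFLOPS.

    Assumption: each CUDA core retires one fused multiply-add
    (counted as 2 FLOPs) per clock at the listed boost clock.
    """
    return 2 * cuda_cores * boost_mhz * 1e6 / 1e12


# (CUDA cores, boost clock in MHz), taken from the table above
cards = {
    "Quadro P4000": (1792, 1480),
    "Quadro M4000": (1664, 800),
}

p4000 = fp32_tflops(*cards["Quadro P4000"])
m4000 = fp32_tflops(*cards["Quadro M4000"])
print(f"P4000: {p4000:.2f} TFLOPS, M4000: {m4000:.2f} TFLOPS, "
      f"ratio: {p4000 / m4000:.2f}x")
```

    The same helper applies to the other cards covered here, with the caveat that the boost clocks are approximate, so results should only be expected to match the quoted figures to within rounding.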

    Meanwhile replacing the Quadro M2000 is the Quadro P2000. This card is in the same single-slot form factor as the P4000, but it drops down in power and performance, being built off of a GP106 GPU. Performance should be around 66% faster than its predecessor with the same 75W TDP. Surprisingly, NVIDIA didn’t opt to go with a fully-enabled 192-bit memory bus on this card; instead only 5 channels (160 bits) are enabled, which is also why it offers the more unusual memory capacity of 5GB.
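
    The 160-bit and 5GB pairing falls out of simple channel arithmetic. Here is a rough sketch of that relationship, assuming each memory channel is 32 bits wide (consistent with 5 channels giving 160 bits) and carries 1GB of memory (my inference from the 5GB total, not a published figure); the constant and function names are hypothetical.

```python
CHANNEL_WIDTH_BITS = 32  # assumption: 160 bits / 5 channels
GB_PER_CHANNEL = 1       # assumption: inferred from 5GB across 5 channels


def memory_config(enabled_channels: int) -> tuple:
    """Return (bus width in bits, capacity in GB) for the enabled channels."""
    return (enabled_channels * CHANNEL_WIDTH_BITS,
            enabled_channels * GB_PER_CHANNEL)


print(memory_config(5))  # the Quadro P2000 as configured
print(memory_config(6))  # a hypothetical fully enabled 192-bit GP106 card
```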

    Finally, both cards come with 4 full-size DisplayPort 1.4 connectors.

    NVIDIA Quadro Specification Comparison (1xxx)
                          Quadro P1000   Quadro K1200
    CUDA Cores            640            512
    Boost Clock           ~1400MHz       954MHz
    FP32 TFLOPS           1.8 TFLOPS     0.98 TFLOPS
    Memory Bus Width      128-bit        128-bit
    VRAM                  4GB            4GB
    TDP                   47W            45W
    GPU                   GP107          GM107
    Architecture          Pascal         Maxwell 1
    Size                  Low-Profile    Low-Profile
    DisplayPort Outputs   4              4

    Still farther down the refreshed Quadro lineup we have NVIDIA’s low-profile cards, which cover the rest of the mid-range market and the entry-level market. All based around the GP107 GPU, these cards also serve to reintegrate the Quadro lineup into a single family; the cards these new Quadro products replace were sold as part of the Quadro K-series, and featured a mix of Kepler and Maxwell 1 GPUs.

    Starting things off here we have the Quadro P1000. Replacing the GM107-based K1200, the P1000 is the most powerful of the low-profile Quadros. It uses a cut-down GP107 GPU clocked at around 1.4GHz. Overall performance on paper should be around 84% faster than the outgoing K1200, while TDP has drifted up just slightly from 45W to 47W. Meanwhile due to its size, NVIDIA has shifted to mini-DisplayPort connectors here. Interestingly, I’m told that these are latching ports, which isn’t standard for mini-DisplayPort but offers a more robust connection as a result.

    NVIDIA Quadro Specification Comparison (6xx/4xx)
                          Quadro P600    Quadro P400    Quadro K620    Quadro K420
    CUDA Cores            384            256            384            192
    Boost Clock           ~1430MHz       ~1170MHz       1000MHz        780MHz
    FP32 TFLOPS           1.1 TFLOPS     0.6 TFLOPS     0.76 TFLOPS    0.3 TFLOPS
    Memory Bus Width      128-bit?       128-bit?       128-bit        128-bit
    VRAM                  2GB            2GB            2GB            1GB
    TDP                   40W            30W            45W            41W
    GPU                   GP107          GP107          GM107          GK107
    Architecture          Pascal         Pascal         Maxwell 1      Kepler
    Size                  Low-Profile    Low-Profile    Low-Profile    Low-Profile
    DisplayPort Outputs   4x Mini        3x Mini        1x (+1 DVI)    1x (+1 DVI)

    Rounding out the rest of the pack are the Quadro P600 and P400. The Quadro P600 is essentially a lower-performance version of the P1000. It drops down to 1.1 TFLOPS – around 45% faster than its predecessor – and retains the 4 mini-DisplayPort connectors. TDP on this part is 40W, down from 45W for the Quadro K620. Also of note here, the low-end Quadros are finally getting an upgrade to GDDR5, versus the DDR3 found on their K-series counterparts.

    Finally, bringing up the rear is the Quadro P400. This replaces the K420, which was the last Kepler-based part in the Quadro lineup. Performance here should more than double, considering the vast improvements in both architecture and clockspeeds. Meanwhile the P400’s TDP is 30W, versus 41W for the K420. Note that relative to the P600, the P400 loses a mini-DisplayPort, though this is still up from the 2 total ports on the K420.

    Wrapping things up, as with the Quadro GP100, the rest of these new Quadro cards will start to arrive in March. NVIDIA hasn’t published formal prices for these cards, but we’re told that prices should be similar to the last-generation Quadro cards that they replace.

    6:01p
    NVIDIA Announces Quadro GP100 - Big Pascal Comes to Workstations

    Kicking off on this Sunday afternoon is CAD & CAE software developer Dassault Systèmes’ annual trade show, the aptly named SOLIDWORKS World. One of the major yearly gatherings for workstation hardware and software vendors, it’s often used as a backdrop for announcing new products. And this year NVIDIA is doing just that with a literal Big Pascal product launch for workstations.

    The last time we checked in on NVIDIA’s Quadro workstation video card group, they had just launched the Quadro P6000. Based on a fully enabled version of NVIDIA’s GP102 GPU, the P6000 was the first high-end Quadro card to be released based on the Pascal generation. This is a notable distinction, as NVIDIA’s GPU production strategy has changed since the days of Kepler and Maxwell. No longer does NVIDIA’s biggest GPU pull triple-duty across consumer, workstations, and servers. Instead the server (and broader compute market) is large enough to justify going all-in on a compute-centric GPU. This resulted in Big Pascal coming to life as the unique GP100, while NVIDIA’s graphical workhorse was the smaller and more conventional (but still very powerful) GP102.

    Because of this split in NVIDIA GPU designs, it wasn’t clear where this new compute-centric GPU would fit in across NVIDIA’s product lines. It’s the backbone of Tesla server cards, of course, and meanwhile it’s very unlikely to show up in consumer GeForce products. But what about the Quadro market, which in previous generations has catered to both graphics and compute users at the high-end (if only because of the mixed-use nature of previous generation GPUs)? The answer, as it turns out, is that Big Pascal has a place in the Quadro family after all. And that’s an interesting place at the top that NVIDIA calls the Quadro GP100.

    Based on NVIDIA’s GP100 GPU, Quadro GP100 defies a simple explanation due in large part to GP100’s unique place in NVIDIA’s Pascal GPU family. Quadro GP100 is on one hand a return to form for NVIDIA’s Quadro lineup. It’s the jack of all trades card that does everything – graphics and compute – including features that the Tesla cards don’t offer, a job previously fulfilled by cards like the Quadro K6000. On the other hand, it’s not necessarily NVIDIA’s most powerful workstation card: on paper its FP32/graphics performance is lower than Quadro P6000’s. So where does Quadro GP100 fit into the big picture?

    The long and short of it is that the Quadro GP100 is meant to be a Tesla/GP100 card for workstations, but with even more functionality. While NVIDIA offers PCIe Tesla P100 cards, those cards only feature passive cooling and are designed for servers; the lack of active cooling means you can’t put them in (conventional) workstations. The Quadro GP100 on the other hand is a traditional, actively cooled fan & shroud card, like the rest of the Quadro lineup. And NVIDIA doesn’t stop there, enabling graphics functionality that isn’t on the Tesla cards. The fact that NVIDIA isn’t even giving it a P-series name – rather naming it after the GPU underneath – is a good hint of where NVIDIA is going.

    NVIDIA Quadro Specification Comparison
                            GP100           P6000           M6000           K6000
    CUDA Cores              3584            3840            3072            2880
    Texture Units           224             240             192             240
    ROPs                    128?            96              96              48
    Boost Clock             ~1430MHz        ~1560MHz        ~1140MHz        N/A
    Memory Clock            1.4Gbps HBM2    9Gbps GDDR5X    6.6Gbps GDDR5   6Gbps GDDR5
    Memory Bus Width        4096-bit        384-bit         384-bit         384-bit
    VRAM                    16GB            24GB            24GB            12GB
    ECC                     Full            Partial         Partial         Full
    FP64                    1/2 FP32        1/32 FP32       1/32 FP32       1/3 FP32
    TDP                     235W            250W            250W            225W
    GPU                     GP100           GP102           GM200           GK110
    Architecture            Pascal          Pascal          Maxwell 2       Kepler
    Manufacturing Process   TSMC 16nm       TSMC 16nm       TSMC 28nm       TSMC 28nm
    Launch Date             March 2017      October 2016    03/22/2016      07/23/2013

    The Quadro GP100 then is being pitched at an interesting mix of users. For compute users who need a workstation-suitable GP100 card, the Quadro GP100 is meant to be their card. It offers all of GP100’s core functionality, including ECC memory, half-speed FP64, and double-speed (packed) FP16 instructions. As an added kicker, the Quadro GP100 introduces a new NVLink connector for PCIe cards. This allows for a pair of Quadro cards to be linked up in a 2-way NVLink configuration, bringing NVLink’s memory access and low latency data transfer benefits to PCIe cards. Notably, this isn’t available on the Tesla PCIe cards.
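
    To put those precision ratios in concrete terms, here is a rough sketch that scales each card's rated FP32 throughput by its rate relative to FP32. The 10.3 and 12 TFLOPS inputs are the FP32 ratings quoted in this article, and the ratios come from the table and the paragraph above; the helper name is hypothetical and the outputs are paper figures, not measurements.

```python
def scaled_tflops(fp32_tflops: float, rate_vs_fp32: float) -> float:
    """Scale a rated FP32 figure by a throughput rate relative to FP32."""
    return fp32_tflops * rate_vs_fp32


gp100_fp64 = scaled_tflops(10.3, 1 / 2)    # half-speed FP64
gp100_fp16 = scaled_tflops(10.3, 2.0)      # double-speed packed FP16
p6000_fp64 = scaled_tflops(12.0, 1 / 32)   # 1/32 rate FP64

print(f"Quadro GP100 FP64: {gp100_fp64:.2f} TFLOPS, FP16: {gp100_fp16:.1f} TFLOPS")
print(f"Quadro P6000 FP64: {p6000_fp64:.3f} TFLOPS")
```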

    As NVIDIA discusses it, they sound rather confident that Quadro GP100 will sell well to compute users, and for good reason. The Tesla P100 cards have been a hit with neural network programmers, and now researchers have a card suitable for dropping into a workstation to develop against.

    Meanwhile the second market for the Quadro GP100 is the traditional high-end CAD/CAE market. For those more specialized users who need a workstation card with fast FP64 performance and ECC memory for maximum accuracy and reliability, the Quadro GP100 is the first Quadro card since the K6000 to offer that functionality. Arguably this is a bit of a niche, since most CAD users don’t need that kind of reliability, but for those who do, for complex engineering simulations and the like, it’s critical (not to mention a lucrative market for NVIDIA). Serving this market also makes the Quadro GP100 unique in that it’s the only GP100 card with its graphical functionality turned on.

    However, when it comes to those graphical workloads, the line between the Quadro GP100 and P6000 gets a lot murkier. The Quadro P6000 is rated for 12 TFLOPS FP32, versus GP100’s 10.3 TFLOPS, and similarly the Quadro GP100 offers only around 86% of the P6000’s texture throughput. Paper specs aren’t everything, of course, but in pure SM throughput-bound scenarios the P6000 should be the faster card. This is the advantage of the more compact (and manufacturable) GP102 versus the massive GP100.
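
    The 86% texture figure can be reproduced from the texture unit counts and boost clocks in the table. This is a minimal sketch, assuming one texel per texture unit per clock at the approximate boost clocks listed; the function name is my own.

```python
def texel_rate_gtexels(texture_units: int, boost_mhz: float) -> float:
    """Peak texture rate in GTexels/s.

    Assumption: one texel per texture unit per clock at boost.
    """
    return texture_units * boost_mhz / 1000


gp100_tex = texel_rate_gtexels(224, 1430)  # Quadro GP100
p6000_tex = texel_rate_gtexels(240, 1560)  # Quadro P6000

print(f"GP100: {gp100_tex:.1f} GT/s, P6000: {p6000_tex:.1f} GT/s, "
      f"GP100 relative to P6000: {gp100_tex / p6000_tex:.0%}")
```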

    The one wildcard here is the HBM2 memory interface and associated ROPs. NVIDIA is specifically touting the Quadro GP100 as offering their fastest rendering performance, and depending on the scenario that can very well be the case. With 720GB/sec of memory bandwidth – thanks to 4 HBM2 stacks clocked at 1.4Gbps each – the Quadro GP100 has 66% more memory bandwidth than the Quadro P6000’s mere 432GB/sec. Coupled with what’s almost certainly a ROP count advantage – NVIDIA still hasn’t disclosed GP100’s ROP count, but based on what we know of GP102, 128 ROPs is a safe bet – Quadro GP100’s pure pixel pushing power should be greater than even the P6000’s by around 22%. Given that CAD/CAE can be very pixel-bound, this should be a tangible benefit for some Quadro customers.
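
    Both of those percentages can be worked out from the table. The sketch below assumes peak bandwidth is bus width times per-pin data rate, and peak pixel rate is one pixel per ROP per clock; the 128 ROP count for GP100 is the unconfirmed estimate discussed above, and the function names are hypothetical.

```python
def bandwidth_gb_per_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s (bus width x data rate, 8 bits per byte)."""
    return bus_width_bits * gbps_per_pin / 8


def pixel_rate_gpixels(rops: int, boost_mhz: float) -> float:
    """Peak pixel fill rate in GPixels/s, assuming one pixel per ROP per clock."""
    return rops * boost_mhz / 1000


gp100_bw = bandwidth_gb_per_s(4096, 1.4)  # HBM2; quoted above as roughly 720GB/sec
p6000_bw = bandwidth_gb_per_s(384, 9.0)   # GDDR5X
gp100_px = pixel_rate_gpixels(128, 1430)  # 128 ROPs is an estimate, not confirmed
p6000_px = pixel_rate_gpixels(96, 1560)

print(f"Bandwidth: {gp100_bw:.1f} vs {p6000_bw:.1f} GB/s "
      f"({gp100_bw / p6000_bw - 1:.0%} more)")
print(f"Pixel rate: {gp100_px:.1f} vs {p6000_px:.1f} GP/s "
      f"({gp100_px / p6000_px - 1:.0%} more)")
```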

    The one drawback though is memory capacity. While the Quadro P6000 offers 24GB of VRAM due to the greater practical capacity of GDDR5X, like all GP100 products the Quadro GP100 tops out at 16GB of HBM2. This means that for very large dataset users, a single Quadro GP100 offers a good deal less memory than what they can get out of the P6000. It’s worth noting that NVIDIA is touting NVLink as helping out with memory crunch issues; however, I suspect that’s rooted in compute more than graphics.

    Moving on then, outside of the GPU underneath, the Quadro GP100 packs the typical Quadro family hardware features. This includes 4 DisplayPort 1.4 ports and a single DVI port for display outputs, and NVIDIA is classifying it as VR Ready as well. Meanwhile towards the rear of the card are the Quadro Sync and Stereo connectors for synchronized refresh and quad-buffered stereo respectively.

    Wrapping things up, as with the rest of the Quadro cards being launched today, NVIDIA is expecting the Quadro GP100 to ship in March. Pricing has yet to be determined, but as the Quadro GP100 is the jack-of-all-trades GP100 card, I’m told that pricing will be slightly above the Quadro P6000, which would put it somewhere north of $5,000.
