Tag Archives: nvidia

CES 2014 report: robots, smart home, wearables, bendy TV, tablets, health gadgets, tubes and horns

CES in Las Vegas is an amazing event, partly through sheer scale. It is the largest trade show in Vegas, America’s trade show city. Apparently it was also the largest CES ever: two million square feet of exhibition space, 3,200 exhibitors, 150,000 industry attendees, of whom 35,000 were from outside the USA.

It follows that CES is beyond the ability of any one person to see in its entirety. Further, it is far from an even representation of the consumer tech industry. Notable absentees include Apple, Google and Microsoft – though Microsoft for one booked a rather large space in the Venetian hotel which was used for private meetings.  The primary purpose of CES, as another journalist explained to me, is for Asian companies to do deals with US and international buyers. The success of WowWee’s stand for app-controllable MiP robots, for example, probably determines how many of the things you will see in the shops in the 2014/15 winter season.

The kingmakers at CES are the people going round with badges marked Buyer. The press events are a side-show.

CES is also among the world’s biggest trade shows for consumer audio and high-end audio, which is a bonus for me as I have an interest in such things.

Now some observations. First, a reminder that CEA (the organisation behind CES) kicked off the event with a somewhat downbeat presentation showing that global consumer tech spending is essentially flat. Smartphones and tablets are growing, but prices are falling, and most other categories are contracting. Converged devices are reducing overall spend. Once you had a camera, a phone and a music player; now the phone does all three.

Second, if there is one dominant presence at CES, it is Samsung. Press counted themselves lucky even to get into the press conference. A showy presentation convinced us that we really want not only UHD (4K UHD is 3840 x 2160 resolution) video, but also a curved screen, for a more immersive experience; or even the best of both worlds, an 85” bendable UHD TV which transforms from flat to curved.

We already knew that 4K video will go mainstream, but there is more uncertainty about the future connected home. Samsung had a lot to say about this too, unveiling its Smart Home service. A Smart Home Protocol (SHP) will connect devices and home appliances, and an app will let you manage them. Home View will let you view your home remotely. Third parties will be invited to participate. More on the Smart Home is here.

The technology is there; but there are several stumbling blocks. One is political. Will Apple want to participate in Samsung’s Smart Home? Will Google? Will Microsoft? What about competitors making home appliances? The answer is that nobody will want to cede control of the Smart Home specifications to Samsung, so it can only succeed through sheer muscle, or by making some alliances.

The other question is around value for money. If you are buying a fridge freezer, how high on your list of requirements is SHP compatibility? How much extra will you spend? If the answer is that old-fashioned attributes like capacity, reliability and running cost are all more important, then the Smart Home cannot happen until there are agreed standards and a low cost of implementation. It will come, but not necessarily from Samsung.

Samsung did not say that much about its mobile devices. No Galaxy S5 yet; maybe at Mobile World Congress next month. It did announce the Galaxy Note Pro and Galaxy Tab Pro series in three sizes; the “Pro” designation intrigues me as it suggests the intention that these be business devices, part of the “death of the PC” theme which was also present at CES.

Samsung did not need to say much about mobile because it knows it is winning. Huawei proudly announced that it is 3rd in smartphones after Samsung and Apple, with a … 4.8% market share, which says all you need to know.

That said, Huawei made a rather good presentation, showing off its forthcoming AscendMate2 4G smartphone, with 6.1” display, long battery life (more than double that of iPhone 5S is claimed, with more than 2 days in normal use), 5MP front camera for selfies, 13MP rear camera, full specs here. No price yet, but expect it to be competitive.

Sony also had a good CES, with indications that PlayStation 4 is besting Xbox One in the early days of the next-gen console wars, and a stylish stand reminding us that Sony knows how to design good-looking kit. Sony’s theme was 4K becoming more affordable, with its FDR-AX100 camcorder offering 4K support in a device no larger than most camcorders; unfortunately the sample video we saw did not look particularly good.

Sony also showed the Xperia Z1 compact smartphone, which went down well, and teased us with an introduction for Sony SmartWear wearable entertainment and “life log” capture. We saw the unremarkable “core” gadget which will capture the data but await more details.

Another Sony theme was high resolution audio, on which I am writing a detailed piece (not just about Sony) to follow.

As for Microsoft Windows, it was mostly lost behind a sea of Android and other devices, though I will note that Lenovo impressed with its new range of Windows 8 tablets and hybrids – like the 8” Thinkpad with Windows 8.1 Pro and full HD 1920×1200 display – more details here.

There is an optional USB 3.0 dock for the Thinkpad 8 but I commented to the Lenovo folk that the device really needs a keyboard cover. I mentioned this again at the Kensington stand during the Mobile Focus Digital Experience event, and they told me they would go over and have a look then and there; so if a nice Kensington keyboard cover appears for the Thinkpad 8 you have me to thank.

Whereas Lenovo strikes me as a company which is striving to get the best from Windows 8, I was less impressed by the Asus press event, mainly because I doubt the Windows/Android dual boot concept will take off. Asus showed the TD300 Transformer Book Duet which runs both. I understand why OEMs are trying to bolt together the main business operating system with the most popular tablet OS, but I dislike dual boot systems, and if the Windows 8 dual personality with Metro and desktop is difficult, then a Windows/Android hybrid is more so. I’d guess there is more future in Android emulation on Windows. Run Android apps in a window? Asus did also announce its own 8” Windows 8.1 tablet, but did not think it worth attention in its CES press conference.

Wearables were a theme at CES, especially in the health area, and there was a substantial iHealth section to browse around.

I am not sure where this is going, but it seems to me inevitable that self-monitoring of how well or badly our bodies are functioning will become commonplace. The result will be fodder for hypochondriacs, but I think there will be real benefits too, in terms of motivation for exercise and healthy diets, and better warning and reaction for critical problems like heart attacks. The worry is that all that data will somehow find its way to Google or health insurance companies, raising premiums for those who need it most. As to which of the many companies jostling for position in this space will survive, that is another matter.

What else? It is a matter of where to stop. I was impressed by NVidia’s demo rig showing three 4K displays driven by a GTX-equipped PC; my snap absolutely does not capture the impact of the driving game being shown.

I was also impressed by NVidia’s ability to befuddle the press at its launch of the Tegra K1 chipset, leaving some to confuse its 192 CUDA cores with CPU cores. Having said that, the CUDA support does mean you can use those cores for general-purpose programming, and I see huge potential in this for more powerful image processing on the device, for example. Tegra 4 on the Surface 2 is an excellent experience, and I hope Microsoft follows up with a K1 model in due course, even though that looks doubtful.
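
To make the general-purpose point concrete, here is the sort of thing I mean – a trivial, hypothetical CUDA kernel (nothing K1-specific) that converts an RGBA image to greyscale, one thread per pixel. Work of this shape could run across the K1’s 192 CUDA cores instead of tying up the CPU.

    // Hypothetical illustration only: a minimal CUDA kernel converting an RGBA
    // image to greyscale, one thread per pixel, using standard luminance weights.
    __global__ void rgbaToGrey(const uchar4 *in, unsigned char *out, int width, int height)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x < width && y < height) {
            uchar4 p = in[y * width + x];
            out[y * width + x] = (unsigned char)(0.299f * p.x + 0.587f * p.y + 0.114f * p.z);
        }
    }

    // Launched with a 2D grid, for example:
    //   dim3 block(16, 16);
    //   dim3 grid((width + 15) / 16, (height + 15) / 16);
    //   rgbaToGrey<<<grid, block>>>(d_in, d_out, width, height);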

There were of course many intriguing devices on show at CES, on some of which I will report over at the Gadget Writing blog, and much wild and wonderful high-end audio.

On audio I will note this. Bang & Olufsen showed a stylish home system, largely wireless, but the sound was disappointing (it also struck me as significant that Android or iOS is required to use it). The audiophiles over in the Venetian tower may have loopy ideas, but they had the best sounds.

CES can do retro as well as next gen; the last pinball machine manufacturer displayed at Digital Experience, while vinyl, tubes and horns were on display over in the tower.

China’s Tianhe-2 Supercomputer takes top ranking, a win for Intel vs Nvidia

The International Supercomputing Conference (ISC) is under way in Leipzig, and one of the announcements is that China’s Tianhe-2 is now the world’s fastest supercomputer according to the Top 500 list.

This has some personal interest for me, as I visited its predecessor Tianhe-1A in December 2011, on a press briefing organised by NVidia which was, I guess, on a diplomatic mission to promote Tesla, the GPU accelerator boards used in Tianhe-1A (which was itself the world’s fastest supercomputer for a period).

It appears that the mission failed, insofar as Tianhe-2 uses Intel Phi accelerator boards rather than Nvidia Tesla.

Tianhe-2 has 16,000 nodes, each with two Intel Xeon IvyBridge processors and three Xeon Phi processors for a combined total of 3,120,000 computing cores.

says the press release. Previously, the world’s fastest was the US Titan, which does use NVidia GPUs.

Nvidia has reason to worry. Tesla boards are present on 39 of the top 500, whereas Xeon Phi is only on 11, but it has not been out for long and is growing fast. A newly published paper shows Xeon Phi besting Tesla on sparse matrix-vector multiplication:

we demonstrate that our implementation is 3.52x and 1.32x faster, respectively, than the best available implementations on dual Intel® Xeon® Processor E5-2680 and the NVIDIA Tesla K20X architecture.

In addition, Intel has just announced the successor to Xeon Phi, codenamed Knight’s Landing. Knight’s Landing can function as the host CPU as well as an accelerator board, and has integrated on-package memory to reduce data transfer bottlenecks.

Nvidia does not agree that Xeon Phi is faster:

The Tesla K20X is about 50% faster in Linpack performance, and in terms of real application performance we’re seeing from 2x to 5x faster performance using K20X versus Xeon Phi accelerator.

says the company’s Roy Kim, Tesla product manager. The truth I suspect is that it depends on the type of workload and I would welcome more detail on this.

It is also worth noting that Tianhe-2 does not better Titan on power/performance ratio.

  • Tianhe-2: 3,120,000 cores, 1,024,000 GB Memory, Linpack perf 33,862.7 TFlop/s, Power 17,808 kW.
  • Titan: 560,640 cores, 710,144 GB Memory, Linpack perf 17,590 TFlop/s, Power 8,209 kW.
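
Working those published figures through: 33,862.7 TFlop/s from 17,808 kW is roughly 1.9 GFlop/s per watt for Tianhe-2, against roughly 2.1 GFlop/s per watt (17,590 TFlop/s from 8,209 kW) for Titan.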

NVIDIA’s Visual Computing Appliance: high-end virtual graphics power on tap

NVIDIA CEO Jen-Hsun Huang has announced the Grid Visual Computing Appliance (VCA). Install one of these, and users anywhere on the network can run graphically-demanding applications on their Mac, PC or tablet. The Grid VCA is based on remote graphics technology announced at last year’s GPU Technology Conference. This year’s event is currently under way in San Jose.

The Grid VCA is a 4U rack-mounted server.

Inside are up to two Xeon CPUs, each supporting 16 threads, and up to eight Grid GPU boards, each containing two Kepler GPUs with 4GB of GPU memory apiece. There is up to 384GB of system RAM.

There is a built-in hypervisor (I am not sure which hypervisor NVIDIA is using) which supports 16 virtual machines and therefore up to 16 concurrent users.

NVIDIA supplies a Grid client for Mac, Windows or Android (no mention of Apple iOS).

During the announcement, NVIDIA demonstrated a Mac running several simultaneous Grid sessions. The virtual machines were running Windows with applications including Autodesk 3D Studio Max and Adobe Premiere. This looks like a great way to run Windows on a Mac.

The Grid VCA is currently in beta, and when available will cost from $24,900 plus $2,400/yr software licenses. It looks as if the software licenses are priced at $300 per concurrent user, since the price doubles to $4,800/yr for the box which supports 16 concurrent users.

Businesses will need to do the arithmetic and see if this makes sense for them. Conceptually it strikes me as excellent, enabling one centralised GPU server to provide high-end graphics to anyone on the network, subject to the concurrent user limitation. It also enables graphically demanding Windows-only applications to run well on Macs.

The Grid VCA is part of the NVIDIA GRID Enterprise Ecosystem, which the company says is supported by partners including Citrix, Dell, Cisco, Microsoft, VMWare, IBM and HP.

Big GPU news at NVIDIA tech conference including first Tegra with CUDA

NVIDIA CEO Jen-Hsun Huang made a number of announcements at the GPU Technology Conference (GTC) keynote yesterday, including an updated roadmap for both desktop and mobile GPUs.

Although the focus of the GTC is on high-performance computing using Tesla GPU accelerator boards, Huang’s announcements were not limited to that area but also covered the company’s progress on mobile and on the desktop. Huang opened by mentioning the recently released GeForce Titan graphics processor which has 2,600 CUDA cores, and which starts from under £700 so is within reach of serious gamers as well as developers who can make use of it for general-purpose computing. CUDA enables use of the GPU for massively parallel general-purpose computing. NVIDIA is having problems keeping up with demand, said Huang.

There are now 430 million CUDA-capable GPUs out there, said Huang, with CUDA used in 50 supercomputers and taught in 640 university courses.

He also mentioned last week’s announcement of the Swiss Piz Daint supercomputer which will include Tesla K20X GPU accelerators and will be operational in early 2014.

But what is coming next? Here is the latest GPU roadmap:

Kepler is the current GPU architecture, which introduced dynamic parallelism, the ability for the GPU to generate work without transitioning back to the CPU.
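
For the curious, here is a minimal sketch (my own, for illustration) of what dynamic parallelism looks like in code: the parent kernel discovers how much work there is and launches a child kernel itself, with no round trip to the CPU. It needs a compute capability 3.5 GPU and relocatable device code (nvcc -arch=sm_35 -rdc=true).

    // Sketch of dynamic parallelism: a kernel launching another kernel.
    __global__ void childKernel(float *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= 2.0f;
    }

    __global__ void parentKernel(float *data, const int *workCounts)
    {
        // Each parent thread inspects work discovered on the device...
        int n = workCounts[threadIdx.x];
        if (n > 0) {
            // ...and launches a child grid directly from the GPU,
            // without transitioning back to the CPU.
            childKernel<<<(n + 255) / 256, 256>>>(data, n);
        }
    }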

Coming next is Maxwell, which has unified virtual memory. The GPU can see the CPU memory, and the CPU can see the GPU memory, making programming easier. I am not sure how this impacts performance, but note that it is unified virtual memory, so the task of copying data between host and device still exists under the covers.
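
For a feel of what this means for the programmer, here is a rough sketch using CUDA’s managed-memory API (cudaMallocManaged, which arrived with CUDA 6): a single allocation is visible to both host and device, and the runtime does the copying for you.

    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void addOne(float *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] += 1.0f;
    }

    int main()
    {
        const int n = 1 << 20;
        float *data = NULL;
        cudaMallocManaged((void **)&data, n * sizeof(float)); // one pointer, visible to CPU and GPU
        for (int i = 0; i < n; i++) data[i] = 0.0f;           // initialised on the CPU
        addOne<<<(n + 255) / 256, 256>>>(data, n);            // used directly by the GPU
        cudaDeviceSynchronize();                              // the copying still happens, just not in your code
        printf("%f\n", data[0]);                              // read back on the CPU with no cudaMemcpy
        cudaFree(data);
        return 0;
    }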

After Maxwell comes Volta, which focuses on increasing memory bandwidth and reducing latency. Volta includes a stack of DRAM on the same silicon substrate as the GPU, which Huang said enables 1TB per second of memory bandwidth.

What about mobile? NVIDIA is aware of the growth in devices of all kinds. 2.5bn high definition displays are sold each year, said Huang, and this will double again by 2015. These displays are mostly not for PCs, but on smartphones or embedded devices.

Here is the roadmap for Tegra, NVIDIA’s system-on-a-chip (SoC).

Tegra 4, which I saw in preview at last month’s Mobile World Congress in Barcelona, includes a software-defined modem and computational camera, able to track moving objects while keeping them in focus.

Next is Tegra Logan. This is the first Tegra to include CUDA cores, so you can use it for general-purpose computing. It is based on the Kepler GPU and supports full CUDA 5 computing as well as OpenGL 4.3. Logan will be previewed this year and will be in production in early 2014.

After Logan comes Parker. This will be based on the Maxwell GPU (see above) and NVIDIA’s own Denver (ARM-based) CPU. It will include FinFET multigate transistors.

According to Huang, Tegra performance will increase by 100 times over 5 years. Today’s Surface RT (which runs Tegra 3) may be sluggish, but Windows RT will run fine on these future SoCs. Of course Intel is not standing still either.

Finally, Huang announced the Grid Visual Computing Appliance, which I will be covering shortly in another post.

Intel Xeon Phi shines vs NVidia GPU accelerators in Ohio State University tests

Which is better for massively parallel computing, a GPU accelerator board from NVidia, or Intel’s new Xeon Phi? On the eve of NVidia’s GPU Technology Conference comes a paper which Intel will enjoy. Erik Saule, Kamer Kaya, and Umit V. Catalyurek from the Ohio State University have issued a paper with performance comparisons between Xeon Phi, NVIDIA Tesla C2050 and NVIDIA Tesla K20. The K20 has 2,496 CUDA cores, versus a mere 61 processor cores on the Xeon Phi, yet on the particular calculations under test the researchers got generally better performance from Xeon Phi.

In the case of sparse-matrix vector multiplication (SpMV):

For GPU architectures, the K20 card is typically faster than the C2050 card. It performs better for 18 of the 22 instances. It obtains between 4.9 and 13.2GFlop/s and the highest performance on 9 of the instances. Xeon Phi reaches the highest performance on 12 of the instances and it is the only architecture which can obtain more than 15GFlop/s.

and in the case of sparse-matrix matrix multiplication (SpMM):

The K20 GPU is often more than twice faster than C2050, which is much better compared with their relative performances in SpMV. The Xeon Phi coprocessor gets the best performance in 14 instances where this number is 5 and 3 for the CPU and GPU configurations, respectively. Intel Xeon Phi is the only architecture which achieves more than 100GFlop/s.

Note that this is a limited test, and that the authors note that SpMV computation is known to be a difficult case for GPU computing:

the irregularity and sparsity of SpMV-like kernels create several problems for these architectures.

They also note that memory latency is the biggest factor slowing performance:

At last, for most instances, the SpMV kernel appears to be memory latency bound rather than memory bandwidth bound

It is difficult to compare like with like. The Xeon Phi implementation uses OpenMP, whereas the GPU implementation uses CuSparse. I would also be interested to know whether as much effort was made to optimise for the GPU as for the Xeon Phi.
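
For context, the textbook CUDA approach to CSR sparse matrix-vector multiplication is one thread per row, as in the sketch below (this is not what CuSparse does; its kernels are far more sophisticated). It does show where the irregularity comes from: the indirect, data-dependent loads of x through the column index array jump around in memory, which is why the kernel tends to be latency bound rather than bandwidth bound.

    // Naive one-thread-per-row SpMV over a CSR matrix (illustrative only).
    __global__ void spmvCsr(int rows, const int *rowPtr, const int *colIdx,
                            const double *val, const double *x, double *y)
    {
        int row = blockIdx.x * blockDim.x + threadIdx.x;
        if (row < rows) {
            double sum = 0.0;
            for (int j = rowPtr[row]; j < rowPtr[row + 1]; j++)
                sum += val[j] * x[colIdx[j]];   // irregular loads of x, hard to coalesce
            y[row] = sum;
        }
    }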

Still, this is a real-world test that, if nothing else, demonstrates that in the right circumstances the smaller number of cores in a Xeon Phi do not prevent it comparing favourably against a GPU accelerator:

When compared with cutting-edge processors and accelerators, its SpMV, and especially SpMM, performance are superior thanks to its wide registers and vectorization capabilities. We believe that Xeon Phi will gain more interest in HPC community in the near future.

Images of Eurora, the world’s greenest supercomputer

Yesterday I was in Bologna for the press launch of Eurora at Cineca, a non-profit consortium of universities and other public bodies. The claim is that Eurora is the world’s greenest supercomputer.

Eurora is a prototype deployment of Aurora Tigon, made by Eurotech. It is a hybrid supercomputer, with 128 CPUs supplemented by 128 NVidia Kepler K20 GPUs.

What makes it green? Being new helps, as processor efficiency improves with every release; “green-ness” is measured in floating point operations per watt, and Eurora does 3,150 MFlop/s per watt.

There is more though. Eurotech is a believer in water cooling, which is more efficient than air. Further, it is easier to do something useful with hot water than with hot air, such as using it to generate energy.

Other factors include underclocking slightly, and supplying 48 volt DC power in order to avoid power conversion steps.

Eurora is composed of 64 nodes. Each node has a board with 2 Intel Xeon E5-2687W CPUs, an Altera Stratix V FPGA (Field Programmable Gate Array), an SSD drive, and RAM soldered to the board; apparently soldering the RAM is more efficient than using DIMMs.

Here is the FPGA:

and one of the Intel-confidential CPUs:

On top of this board goes a water-cooled metal block. This presses against the CPU and other components for efficient heat exchange. There is no fan.

Then on top of that go the K20 GPU accelerator boards. The design means that these can be changed for Intel Xeon Phi accelerator boards. Eurotech is neutral in the NVidia vs Intel accelerator wars.

Here you can see where the water enters and leaves the heatsink. When you plug a node into the rack, you connect it to the plumbing as well as the electrics.

Here are 8 nodes in a rack.

Under the floor is a whole lot more plumbing. This is inside the Aurora cabinet where pipes and wires rise from the floor.

Here is a look under the floor outside the cabinet.

At the corner of the room is a sort of pump room that pumps the water, monitors the system, adds chemicals to prevent algae from growing, and no doubt does a few other things.

The press was asked NOT to operate this big red switch:

I am not sure whether the switch we were not meant to operate is the upper red button, or the lower red lever. To be on the safe side, I left them both as-is.

So here is a thought. Apparently Eurora is 15 times more energy-efficient than a typical desktop. If the mobile revolution continues and we all use tablets, which also tend to be relatively energy-efficient, could we save substantial energy by using the cloud when we need more grunt (whether processing or video) than a tablet can provide?

NVIDIA Tegra 4 chipset: faster performance, longer battery life

NVIDIA has announced the Tegra 4 chipset, which combines an NVIDIA GPU with a quad-core ARM Cortex-A15 CPU.

According to ARM, the Cortex-A15 delivers around twice the performance of the Cortex-A9, used in Tegra 3, and is able to address up to 1TB of RAM.

The Tegra 4 GPU has 72 cores, compared to 12 cores on Tegra 3.

In addition, NVIDIA is including what it calls “Computational Photography Architecture” which uses both CPU and GPU to improve photographic capability.

The part of the announcement that most caught my eye though is the claim of “up to 45 percent less power than its predecessor, Tegra 3, in common use cases”.

Tegra 4 will enable high-performance smartphones, but I am more interested in what this and other next-generation chipsets will offer for tablets. Microsoft’s Surface RT would be more compelling with Tegra 4, rather than its current Tegra 3, since it suffers from poor performance in some cases (Excel, for example) and longer battery life would do no harm either.

There will be even less reason to want a laptop.

NVIDIA’s newly announced Project SHIELD gaming portable also uses a Tegra 4 chipset.

Exascale computing: you could do it today if you could supply the power, says Nvidia

Nvidia’s Bill Dally has posted about the company’s progress towards exascale computing, boosted by a $12.4 million grant from the U.S. Department of Energy. He mentions that it would be possible to build an exascale supercomputer today, if you could supply enough power:

Exascale systems will perform a quintillion floating point calculations per second (that’s a billion billion), making them 1,000 times faster than a one petaflop supercomputer. The world’s fastest computer today is about 16 petaflops.

One of the great challenges in developing such systems is in making them energy efficient. Theoretically, an exascale system could be built with x86 processors today, but it would require as much as 2 gigawatts of power — the entire output of the Hoover Dam. The GPUs in an exascale system built with NVIDIA Kepler K20 processors would consume about 150 megawatts. The DOE’s goal is to facilitate the development of exascale systems that consume less than 20 megawatts by the end of the decade.

If the industry succeeds in driving down supercomputer power consumption to one fortieth of what it is today, I guess it also follows that tablets like the one on which I am typing now will benefit from much greater power efficiency. This stuff matters, and not just in the HPC (High Performance Computing) market.

Programming NVIDIA GPUs and Intel MIC with directives: OpenACC vs OpenMP

Last month I was at Intel’s software conference learning about Many Integrated Core (MIC), the company’s forthcoming accelerator card for HPC (High Performance Computing). This month I am in San Jose for NVIDIA’s GPU Technology Conference learning about the latest developments in NVIDIA’s platform for accelerated massively parallel computing using GPU cards and the CUDA architecture. The approaches taken by NVIDIA and Intel have much in common – focus on power efficiency, many cores, accelerator boards with independent memory space controlled by the CPU – but also major differences. Intel’s boards have familiar x86 processors, whereas NVIDIA’s have GPUs, which require developers to learn CUDA C or an equivalent such as OpenCL.

In order to simplify this, NVIDIA and partners Cray, CAPS and PGI announced OpenACC last year, a set of directives which when added to C/C++ code instruct the compiler to run code parallelised on the GPU, or potentially on other accelerators such as Intel MIC. The OpenACC folk have stated from the outset their hope and intention that OpenACC will converge with OpenMP, an existing standard for directives enabling shared memory parallelisation. OpenMP is not suitable for accelerators since these have their own memory space.
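
To show the difference in effort, here is my own sketch (not taken from the OpenACC spec or any vendor sample) of the same y = a*x + y loop handled both ways. With OpenACC you annotate the existing loop and let a supporting compiler (PGI, Cray, CAPS) generate the GPU code and the data transfers; that version is shown in comments below, since it goes through those compilers rather than nvcc. With CUDA you write a kernel and manage the device memory yourself.

    // 1. The OpenACC way: annotate the existing loop and leave the rest to the compiler.
    //
    //    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    //    for (int i = 0; i < n; i++)
    //        y[i] = a * x[i] + y[i];
    //
    // 2. The CUDA way: write a kernel and handle device memory explicitly.
    #include <cuda_runtime.h>

    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    void saxpyOnGpu(int n, float a, const float *x, float *y)
    {
        float *dx, *dy;
        cudaMalloc((void **)&dx, n * sizeof(float));
        cudaMalloc((void **)&dy, n * sizeof(float));
        cudaMemcpy(dx, x, n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(dy, y, n * sizeof(float), cudaMemcpyHostToDevice);
        saxpy<<<(n + 255) / 256, 256>>>(n, a, dx, dy);
        cudaMemcpy(y, dy, n * sizeof(float), cudaMemcpyDeviceToHost);
        cudaFree(dx);
        cudaFree(dy);
    }

Multiply that by every hot loop in a large scientific code and the attraction of directives is obvious.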

One thing that puzzled me though: Intel clearly stated at last month’s event that it would support OpenMP (not OpenACC) on MIC, due to go into production at the end of this year or early next. How can this be?

I took the opportunity here at NVIDIA’s conference to ask Duncan Poole, who is NVIDIA’s Senior Manager for High Performance Computing and also the President of OpenACC, about what is happening with these two standards. How can Intel implement OpenMP on MIC, if it is not suitable for accelerators?

“I think OpenMP in the form that’s being discussed inside of the sub-committee is suitable. There’s some debate about some of the specific features that continues. Also, in the OpenMP committee they’re trying to address the concerns of TI and IBM so it’s a broader discussion than just the Intel architecture. So OpenMP will be useful on this class of processor. What we needed to do is not wait for it. That standard, if we’re lucky it will be draft at the end of this year, and maybe a year later will be ratified. We want to unify this developer base now,” Poole told me.

How similar will this adapted OpenMP be to what OpenACC is now?

“It’s got the potential to be quite close. The guy that drafted OpenACC is the head of that sub-committee. There’ll probably be changes in keywords, but there’s also some things being proposed now that were not conceived of. So there’s good debate going on, and I expect that we’ll benefit from it.

“Some of the features for example that are shared by Kepler and MIC with respect to nested parallelism are very useful. Nested parallelism did not exist at the time that we started this work. So there’ll be an evolution that will happen and probably a logical convergence over time.”

If OpenMP is not set to support accelerators until two years hence, what can Intel be doing with it?

“It will be a vendor implementation of a pre-release standard. Something like that,” said Poole, emphasising that he cannot speak for Intel. “To be complimentary to Intel, they have some good ideas and it’s a good debate right now.”

Incidentally, I also asked Intel about OpenACC last month, and was told that the company has no plans to implement it on its compilers. OpenMP is the standard it supports.

The topic is significant, in that if a standard set of directives is supported across both Intel and NVIDIA’s HPC platforms, developers can easily port code from one to the other. You can do this today with OpenCL, but converting an application to use OpenCL to enhance performance is a great deal more effort than adding directives.

The pros and cons of NVIDIA’s cloud GPU

Yesterday NVIDIA announced the Geforce GRID, a cloud GPU service, here at the GPU Technology Conference in San Jose.

The Geforce GRID is server-side software that takes advantage of new features in the “Kepler” wave of NVIDIA GPUs, such as GPU virtualisation, which enables the GPU to support multiple sessions, and an on-board encoder that lets the GPU render to an H.264 stream rather than to a display.

The result is a system that lets you play games on any device that supports H.264 video, provided you can also run a lightweight client to handle gaming input. Since the rendering is done on the server, you can play hardware-accelerated PC games on ARM tablets such as the Apple iPad or Samsung Galaxy Tab, or on a TV with a set-top box such as Apple TV, Google TV, or with a built-in client.

It is an impressive system, but what are the limitations, and how does it compare to the existing OnLive system which has been doing something similar for a few years? I attended a briefing with NVIDIA’s Phil Eisler, General Manager for Cloud Gaming & 3D Vision, and got a chance to put some questions.

The key problem is latency. Games become unplayable if there is too much lag between when you perform an action and when it registers on the screen. Here is NVIDIA’s slide:

This looks good: just 120-150ms latency. But note that cloud in the middle: 30ms is realistic if the servers are close by, but what if they are not? The demo here at GTC in yesterday’s keynote was done using servers that are around 10 miles away, but there will not be a GeForce GRID server within 10 miles of every user.

According to Eisler, the key thing is not so much the distance, as the number of hops the IP traffic passes through. The absolute distance is less important than being close to an Internet backbone.

The problem is real though, and existing cloud gaming providers like OnLive and Gaikai install servers close to major conurbations in order to address this. In other words, it pays to have many small GPU clouds dotted around, rather than a few large installations.

The implication is that hosting cloud gaming is expensive to set up, if you want to reach a large number of users, and that high quality coverage will always be limited, with city dwellers favoured over rural communities, for example. The actual breadth of coverage will depend on the hoster’s infrastructure, the user’s broadband provider, and so on.

It would make sense for broadband operators to partner with cloud gaming providers, or to become cloud gaming providers, since they are in the best position to optimise performance.

Another question: how much work is involved in porting a game to run on Geforce GRID? Not much, Eisler said; it is mainly a matter of tweaking the game’s control panel options for display and adapting the input to suit the service. He suggested 2-3 days to adapt a PC game.

What about the comparison with OnLive? Eisler let slip that OnLive does in fact use NVIDIA GPUs but would not be pressed further; NVIDIA has agreed not to make direct comparisons.

When might Geforce GRID come to Europe? Later this year or early next year, said Eisler.

Eisler was also asked about whether Geforce GRID will cannibalise sales of GPUs to gamers. He noted that while Geforce GRID latency now compares favourably with that of a games console, this is in part because the current consoles are now a relatively old generation, and a modern PC delivers around half the latency of a console. Nevertheless it could have an impact.

One of the benefits of the Geforce GRID is that you will, in a sense, get an upgraded GPU every time your provider upgrades its GPUs, at no direct cost to you.

I guess the real question is how the advent of cloud GPU gaming, if it takes off, will impact the gaming market as a whole. Casual gaming on iPhones, iPads and other smartphones has already eaten into sales of standalone games. Now you can play hardcore games on those same devices. If the centre of gaming gravity shifts further to the cloud, there is less incentive for gamers to invest in powerful GPUs on their own PCs.

Finally, note that the latency issues, while still important, matter less for the non-gaming cloud GPU applications, such as those targeted by NVIDIA VGX. Put another way, a virtual desktop accelerated by VGX could give acceptable performance over connections that are not good enough for Geforce GRID.