
      Nvidia unveils Blackwell B200, the “world’s most powerful chip” designed for AI

      news.movim.eu / ArsTechnica · Tuesday, 19 March - 15:27 · 1 minute

    The GB200 "superchip" covered with a fanciful blue explosion that suggests computational power bursting forth from within. The chip does not actually glow blue in reality.

    Enlarge / The GB200 "superchip" covered with a fanciful blue explosion that suggests computational power bursting forth from within. The chip does not actually glow blue in reality. (credit: Nvidia / Benj Edwards)

    On Monday, Nvidia unveiled the Blackwell B200 tensor core chip—the company's most powerful single-chip GPU, with 208 billion transistors—which Nvidia claims can reduce AI inference operating costs (such as running ChatGPT) and energy consumption by up to 25 times compared to the H100. The company also unveiled the GB200, a "superchip" that combines two B200 chips and a Grace CPU for even more performance.

    The news came as part of Nvidia's annual GTC conference, which is taking place this week at the San Jose Convention Center. Nvidia CEO Jensen Huang delivered the keynote Monday afternoon. "We need bigger GPUs," Huang said during his keynote. The Blackwell platform will allow the training of trillion-parameter AI models that will make today's generative AI models look rudimentary in comparison, he said. For reference, OpenAI's GPT-3, launched in 2020, included 175 billion parameters. Parameter count is a rough indicator of AI model complexity.
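
    For a sense of scale, here is a back-of-envelope illustration of ours (not a figure from the keynote): at 16-bit precision each parameter occupies two bytes, so the weights of a trillion-parameter model alone outgrow any single GPU's memory.

    ```python
    # Rough illustration (not from the article): why trillion-parameter models
    # demand "bigger GPUs." At FP16/BF16 precision each parameter takes 2 bytes,
    # and this ignores activations, KV caches, and optimizer state entirely.
    for name, params in [("GPT-3 (175B)", 175e9), ("1T-parameter model", 1e12)]:
        weight_gb = params * 2 / 1e9
        print(f"{name}: ~{weight_gb:,.0f} GB of weights")
    # GPT-3 (175B): ~350 GB; 1T-parameter model: ~2,000 GB -- far more than any
    # single GPU can hold, hence multi-chip packages like the GB200.
    ```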

    Nvidia named the Blackwell architecture after David Harold Blackwell, a mathematician who specialized in game theory and statistics and was the first Black scholar inducted into the National Academy of Sciences. The platform introduces six technologies for accelerated computing, including a second-generation Transformer Engine, fifth-generation NVLink, a RAS Engine, secure AI capabilities, and a decompression engine for accelerated database queries.



      With Blackwell, Nvidia improves a critical factor for the future of AI

      news.movim.eu / Numerama · Tuesday, 19 March - 14:13

    At its GTC conference, Nvidia lifted the veil on the Blackwell B200 chip, a new GPU it presents as a "superchip." With 208 billion transistors and lower power consumption, the Blackwell chip is the new weapon of choice for players in generative artificial intelligence.


      Review: AMD Radeon RX 7900 GRE GPU doesn’t quite earn its “7900” label

      news.movim.eu / ArsTechnica · Wednesday, 28 February - 12:00

    ASRock's take on AMD's Radeon RX 7900 GRE. (credit: Andrew Cunningham)

    In July 2023, AMD released a new GPU called the "Radeon RX 7900 GRE" in China. GRE stands for "Golden Rabbit Edition," a reference to the Chinese zodiac, and while the card was available outside of China in a handful of pre-built OEM systems, AMD didn't make it widely available at retail.

    That changes today—AMD is launching the RX 7900 GRE at US retail for a suggested starting price of $549. This throws it right into the middle of the busy upper-mid-range graphics card market, where it will compete with Nvidia's $549 RTX 4070 and the $599 RTX 4070 Super, as well as AMD's own $500 Radeon RX 7800 XT.

    We've run our typical set of GPU tests on the 7900 GRE to see how it stacks up to the cards AMD and Nvidia are already offering. Is it worth buying a new card relatively late in this GPU generation, when rumors point to next-gen GPUs from Nvidia, AMD, and Intel before the end of the year? Can the "Golden Rabbit Edition" still offer a good value, even though it's currently the year of the dragon?



      Nvidia’s “Chat With RTX” is a ChatGPT-style app that runs on your own GPU

      news.movim.eu / ArsTechnica · Thursday, 15 February - 16:54 · 2 minutes

    A promotional image. (credit: Nvidia)

    On Tuesday, Nvidia released Chat With RTX, a free personalized AI chatbot similar to ChatGPT that can run locally on a PC with an Nvidia RTX graphics card. It uses Mistral or Llama open-weights LLMs and can search through local files and answer questions about them.

    Chat With RTX works on Windows PCs equipped with Nvidia GeForce RTX 30- or 40-series GPUs with at least 8GB of VRAM. It uses a combination of retrieval-augmented generation (RAG), Nvidia TensorRT-LLM software, and RTX acceleration to enable generative AI capabilities directly on users' devices. This setup allows for conversations with the AI model using local files as a dataset.

    "Users can quickly, easily connect local files on a PC as a dataset to an open-source large language model like Mistral or Llama 2, enabling queries for quick, contextually relevant answers," writes Nvidia in a promotional blog post.

    Using Chat With RTX, users can talk about various subjects or ask the AI model to summarize or analyze data, similar to how one might interact with ChatGPT. In particular, the Mistral 7B model has built-in conditioning to avoid certain sensitive topics (like sex and violence, of course), but users could presumably somehow plug in an uncensored AI model and discuss forbidden topics without the paternalism inherent in the censored models.

    Also, the application supports a variety of file formats, including .TXT, .PDF, .DOCX, and .XML. Users can direct the tool to browse specific folders, which Chat With RTX then scans to answer queries quickly. It even allows for the incorporation of information from YouTube videos and playlists, offering a way to include external content in its database of knowledge (in the form of embeddings) without requiring an Internet connection to process queries.
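
    To make the retrieval-augmented generation mechanism concrete, here is a minimal sketch of the pattern (ours, not Nvidia's actual code): a toy bag-of-words embedding stands in for the neural embedding model and TensorRT-LLM inference the real app uses.

    ```python
    # Minimal RAG sketch (an illustration, not Nvidia's implementation). Local
    # document chunks are embedded once; each question retrieves the closest
    # chunk, which is prepended to the prompt handed to the local LLM.
    import math
    from collections import Counter

    def embed(text: str) -> Counter:
        # Toy bag-of-words "embedding"; the real app uses a neural embedding model.
        return Counter(text.lower().split())

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[w] * b[w] for w in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    # 1) Index: chunks scanned from local .TXT/.PDF/.DOCX files (plain strings here).
    chunks = [
        "The RTX 3060 ships with 12GB of VRAM.",
        "Chat With RTX requires at least 8GB of VRAM.",
    ]
    index = [(chunk, embed(chunk)) for chunk in chunks]

    # 2) Query: retrieve the best-matching chunk and stuff it into the prompt.
    question = "How much VRAM does Chat With RTX need?"
    best_chunk, _ = max(index, key=lambda item: cosine(item[1], embed(question)))
    prompt = f"Context: {best_chunk}\nQuestion: {question}\nAnswer:"
    print(prompt)  # in the real app, this goes to the local Mistral/Llama 2 model
    ```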

    Rough around the edges

    We downloaded and ran Chat With RTX to test it out. The download file is huge, at around 35 gigabytes, owing to the Mistral and Llama LLM weights files being included in the distribution. ("Weights" are the actual neural network files containing the values that represent data learned during the AI training process.) When installing, Chat With RTX downloads even more files, and it executes in a console window using Python with an interface that pops up in a web browser window.

    Several times during our tests on an RTX 3060 with 12GB of VRAM, Chat With RTX crashed. Like many open source LLM interfaces, Chat With RTX is a mess of layered dependencies, relying on Python, CUDA, TensorRT, and others. Nvidia hasn't cracked the code for making the installation sleek and non-brittle. It's a rough-around-the-edges solution that feels very much like an Nvidia skin over other local LLM interfaces (such as GPT4All). Even so, it's notable that this capability is officially coming directly from Nvidia.

    On the bright side (a massive bright side), local processing capability emphasizes user privacy, as sensitive data does not need to be transmitted to cloud-based services (as it does with ChatGPT). Using Mistral 7B feels slightly less capable than GPT-3.5 (the model behind the free version of ChatGPT), which is still remarkable for a local LLM running on a consumer GPU. It's not a true ChatGPT replacement yet, and it can't touch GPT-4 Turbo or Google Gemini Pro/Ultra in processing capability.

    Nvidia GPU owners can download Chat With RTX for free on the Nvidia website.



      Nvidia CEO calls for “Sovereign AI” as his firm overtakes Amazon in market value

      news.movim.eu / ArsTechnica · Tuesday, 13 February - 16:41

    The Nvidia logo on a blue background with an American flag. (credit: Nvidia / Benj Edwards)

    On Monday, Nvidia CEO Jensen Huang said that every country should control its own AI infrastructure so it can protect its culture, Reuters reports. He called this concept "Sovereign AI," which an Nvidia blog post defined as each country owning "the production of their own intelligence."

    Huang made the announcement in a discussion with UAE's Minister of AI, Omar Al Olama, during the World Governments Summit in Dubai. "It codifies your culture, your society’s intelligence, your common sense, your history—you own your own data," Huang told Al Olama.

    The World Governments Summit organization defines itself as "a global, neutral, non-profit organization dedicated to shaping the future of governments." Its annual event attracts over 4,000 delegates from 150 countries, according to Nvidia. It's hosted in the United Arab Emirates, a collection of absolute monarchies with no democratically elected institutions.



      Nvidia RTX 4080 Super review: All you need to know is that it’s cheaper than a 4080

      news.movim.eu / ArsTechnica · Wednesday, 31 January - 14:00 · 1 minute

    Nvidia's new RTX 4080 Super is technically faster than the regular 4080, but, by an order of magnitude, the most interesting thing about it is that, at its launch price of $999, it's $200 cheaper than the original 4080. I am going to write more after this sentence, but that's basically the review. You're welcome to keep reading, and I would appreciate it if you would, but truly there is only one number you need to know, and it is "$200."

    All three of these Super cards—the 4070 Super, the 4070 Ti Super, and now the 4080 Super—are mild correctives for a GPU generation that has been more expensive than its predecessors and also, in relative terms, less of a performance boost. The difference is that where the 4070 Super and 4070 Ti Super try to earn their existing price tags by boosting performance, the 4080 Super focuses on lowering its price to be more in line with where its competition is.

    Yes, it's marginally faster than the original 4080, but its best feature is a price drop from $1,199 to a still high, but more reasonable, $999. What it doesn't do is attempt to close the gap between the 4080 series and the 4090, a card that still significantly outruns any other consumer GPU that AMD or Nvidia offers. But if you have a big budget, want something that's still head-and-shoulders above the entire RTX 30-series, and don't want to deal with the 4090's currently inflated pricing, the 4080 Super is much more appealing than the regular 4080, even if it is basically the same GPU with a new name.



      Ryzen 8000G review: An integrated GPU that can beat a graphics card, for a price

      news.movim.eu / ArsTechnica · Monday, 29 January - 19:50

    The most interesting thing about AMD's Ryzen 7 8700G CPU is the Radeon 780M GPU that's attached to it. (credit: Andrew Cunningham)

    Put me on the short list of people who can get excited about the humble, much-derided integrated GPU.

    Yes, most of them are afterthoughts, designed for office desktops and laptops that will spend most of their lives rendering 2D images to a single monitor. But when integrated graphics push forward, it can open up possibilities for people who want to play games but can only afford a cheap desktop (or who have to make do with whatever their parents will pay for, which was the big limiter on my PC gaming experience as a kid).

    That, plus an unrelated but accordant interest in building small mini-ITX-based desktops, has kept me interested in AMD’s G-series Ryzen desktop chips (which it sometimes calls “APUs,” to distinguish them from the Ryzen CPUs). And the Ryzen 8000G chips are a big upgrade from the 5000G series that immediately preceded them (this makes sense, because as we all know the number 8 immediately follows the number 5).



      Review: Nvidia’s RTX 4070 Ti Super is better, but I still don’t know who it’s for

      news.movim.eu / ArsTechnica · Thursday, 25 January, 2024 - 12:30

    Of all of Nvidia's current-generation GPU launches, there hasn't been one that's been quite as weird as the case of the "GeForce RTX 4080 12GB."

    It was the third and slowest of the graphics cards Nvidia announced at the onset of the RTX 40-series, and at first blush it just sounded like a version of the second-fastest RTX 4080 but with less RAM. But spec sheets and Nvidia's own performance estimates showed that there was a deceptively huge performance gap between the two 4080 cards, enough that calling them both "4080" could have led to confusion and upset among buyers.

    Taking the hint, Nvidia reversed course, "unlaunching" the 4080 12GB because it was "not named right." This decision came late enough in the launch process that a whole bunch of existing packaging had to be trashed and new BIOSes with the new GPU name had to be flashed to the cards before they could be sold.



      Review: Radeon 7600 XT offers peace of mind via lots of RAM, remains a midrange GPU

      news.movim.eu / ArsTechnica · Wednesday, 24 January, 2024 - 14:00 · 1 minute

    We don't need a long intro for this one: AMD's new Radeon RX 7600 XT is almost exactly the same as last year's RX 7600 , but with a mild bump to the GPU's clock speed and 16GB of memory instead of 8GB. It also costs $329 instead of $269, the current MSRP (and current street price) for the regular RX 7600.

    It's a card with a pretty narrow target audience: people who are worried about buying a GPU with 8GB of memory, but who aren't worried enough about future-proofing or RAM requirements to buy a more powerful GPU. It's priced reasonably well, at least—$60 is a lot to pay for extra memory, but $329 was the MSRP for the Radeon RX 6600 back in 2021. If you want more memory in a current-generation card, you otherwise generally need to jump up into the $450 range (for the 12GB RX 7700 XT or the 16GB RTX 4060 Ti) or beyond.

                                           RX 7700 XT  RX 7600     RX 7600 XT  RX 6600     RX 6600 XT  RX 6650 XT  RX 6750 XT
    Compute units (stream processors)      54 (3,456)  32 (2,048)  32 (2,048)  28 (1,792)  32 (2,048)  32 (2,048)  40 (2,560)
    Boost clock                            2,544 MHz   2,600 MHz   2,760 MHz   2,490 MHz   2,589 MHz   2,635 MHz   2,600 MHz
    Memory bus width                       192-bit     128-bit     128-bit     128-bit     128-bit     128-bit     192-bit
    Memory clock                           2,250 MHz   2,250 MHz   2,250 MHz   1,750 MHz   2,000 MHz   2,190 MHz   2,250 MHz
    Memory size                            12GB GDDR6  8GB GDDR6   16GB GDDR6  8GB GDDR6   8GB GDDR6   8GB GDDR6   12GB GDDR6
    Total board power (TBP)                245 W       165 W       190 W       132 W       160 W       180 W       250 W
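
    One spec the table implies but does not list is memory bandwidth, which follows directly from memory clock and bus width. A quick sketch of the arithmetic (ours, not the review's), using GDDR6's eight data transfers per pin per listed memory-clock cycle:

    ```python
    # Deriving memory bandwidth from the table above (our illustration, not the
    # review's). GDDR6 transfers 8 bits per pin per listed memory-clock cycle,
    # so a 2,250 MHz memory clock yields an 18 Gbps effective per-pin data rate.
    def gddr6_bandwidth_gb_s(mem_clock_mhz: float, bus_width_bits: int) -> float:
        gbps_per_pin = mem_clock_mhz * 8 / 1000   # effective data rate per pin
        return gbps_per_pin * bus_width_bits / 8  # total gigabytes per second

    print(gddr6_bandwidth_gb_s(2250, 128))  # RX 7600 / RX 7600 XT: 288.0 GB/s
    print(gddr6_bandwidth_gb_s(2250, 192))  # RX 7700 XT: 432.0 GB/s
    ```

    Note what this means for the 7600 XT: since its bus width and memory clock match the regular 7600's, the extra 8GB adds capacity, not bandwidth.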

    The fact of the matter is that this is the same silicon we've already seen. The clock speed bumps do provide a small across-the-board performance uplift, and the impact of the extra RAM does become apparent in a few of our tests. But the card doesn't fundamentally alter the AMD-vs-Nvidia-vs-Intel dynamic in the $300-ish graphics card market, though it addresses a couple of the regular RX 7600's most glaring weaknesses.
