
      Nvidia’s “Chat With RTX” is a ChatGPT-style app that runs on your own GPU

      news.movim.eu / ArsTechnica · Thursday, 15 February - 16:54 · 2 minutes

A promotional image for Chat With RTX. (credit: Nvidia)

    On Tuesday, Nvidia released Chat With RTX, a free personalized AI chatbot similar to ChatGPT that can run locally on a PC with an Nvidia RTX graphics card. It uses Mistral or Llama open-weights LLMs and can search through local files and answer questions about them.

    Chat With RTX works on Windows PCs equipped with Nvidia GeForce RTX 30- or 40-series GPUs with at least 8GB of VRAM. It uses a combination of retrieval-augmented generation (RAG), Nvidia TensorRT-LLM software, and RTX acceleration to enable generative AI capabilities directly on users' devices. This setup allows for conversations with the AI model using local files as a dataset.
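
    For a rough sense of what that RAG step looks like, here's a minimal sketch: chunks of local files get embedded, the best matches for a query are retrieved, and they're pasted into the model's prompt. The toy bag-of-words embedding and every function name below are our own illustration, not Nvidia's actual pipeline (which pairs a neural embedding model with TensorRT-LLM inference).

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" -- a real pipeline uses a neural encoder.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank every chunk of local-file text by similarity to the query.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # Retrieved chunks become the context the LLM answers from -- the "RAG" step.
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

chunks = [
    "Meeting notes: the product launch is scheduled for March 12.",
    "The quarterly report shows revenue grew 12 percent year over year.",
]
print(build_prompt("When is the launch?", chunks))
```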

    "Users can quickly, easily connect local files on a PC as a dataset to an open-source large language model like Mistral or Llama 2, enabling queries for quick, contextually relevant answers," writes Nvidia in a promotional blog post.

    Using Chat With RTX, users can talk about various subjects or ask the AI model to summarize or analyze data, similar to how one might interact with ChatGPT. Notably, the Mistral 7B model has built-in conditioning to avoid certain sensitive topics (like sex and violence, of course), but users could presumably swap in an uncensored AI model and discuss forbidden topics without the paternalism inherent in the censored models.

    Also, the application supports a variety of file formats, including .TXT, .PDF, .DOCX, and .XML. Users can direct the tool to browse specific folders, which Chat With RTX then scans to answer queries quickly. It even allows for the incorporation of information from YouTube videos and playlists, offering a way to include external content in its database of knowledge (in the form of embeddings) without requiring an Internet connection to process queries.
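
    To illustrate that folder-scanning step, here's a short sketch that gathers the supported file types so their text can be chunked and embedded. The extensions come from the list above; the folder path and function name are hypothetical.

```python
from pathlib import Path

# File formats the article says Chat With RTX accepts.
SUPPORTED = {".txt", ".pdf", ".docx", ".xml"}

def find_documents(folder: str) -> list[Path]:
    # Recursively collect every supported document under the folder.
    base = Path(folder)
    if not base.is_dir():
        return []
    return [
        p for p in base.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED
    ]

for doc in find_documents("./my_notes"):  # "./my_notes" is a hypothetical path
    print(doc)
```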

    Rough around the edges

    We downloaded and ran Chat With RTX to test it out. The download file is huge, at around 35 gigabytes, owing to the Mistral and Llama LLM weights files being included in the distribution. ("Weights" are the actual neural network files containing the values that represent data learned during the AI training process.) When installing, Chat With RTX downloads even more files, and it executes in a console window using Python with an interface that pops up in a web browser window.
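
    That console-plus-browser arrangement is a common pattern among local LLM tools: a Python process hosts the model and serves a chat page to your browser. Here's a minimal sketch of the idea; Gradio is our stand-in, not necessarily what Nvidia actually ships.

```python
import gradio as gr

def answer(message: str, history: list) -> str:
    # Stand-in for the real pipeline (retrieval plus TensorRT-LLM inference).
    return f"You asked: {message!r}"

# Launches a local web server and pops the chat UI open in a browser tab,
# while logs stay in the console window -- the behavior described above.
gr.ChatInterface(answer).launch(inbrowser=True)
```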

    Several times during our tests on an RTX 3060 with 12GB of VRAM, Chat With RTX crashed. Like many open source LLM interfaces, Chat With RTX is a mess of layered dependencies, relying on Python, CUDA, TensorRT, and others. Nvidia hasn't cracked the code for making the installation sleek and non-brittle. It's a rough-around-the-edges solution that feels very much like an Nvidia skin over other local LLM interfaces (such as GPT4All). Even so, it's notable that this capability is officially coming directly from Nvidia.

    On the bright side (a massive bright side), local processing emphasizes user privacy, as sensitive data never needs to be transmitted to cloud-based services (as it does with ChatGPT). Mistral 7B feels slightly less capable than ChatGPT-3.5 (the free version of ChatGPT), which is still remarkable for a local LLM running on a consumer GPU. It's not a true ChatGPT replacement yet, and it can't touch GPT-4 Turbo or Google Gemini Pro/Ultra in processing capability.

    Nvidia GPU owners can download Chat With RTX for free on the Nvidia website.



      RTX 4090 review: Spend at least $1,599 for Nvidia’s biggest bargain in years

      news.movim.eu / ArsTechnica · Tuesday, 11 October, 2022 - 13:00

    The Nvidia RTX 4090 founders edition. If you can't tell, those lines are drawn on, though the heft of this $1,599 product might convince you that they're a reflection of real-world motion blur upon opening this massive box. (credit: Sam Machkovech)

    The Nvidia RTX 4090 makes me laugh.

    Part of that is due to its size. When a standalone GPU is as large as a modern video gaming console—it's nearly identical in total volume to the Xbox Series S and more than double the size of a Nintendo Switch—it's hard not to laugh incredulously at the thing. None of Nvidia's highest-end "reference" GPUs, previously branded as "Titan" models, have ever been so massive, and things only get more ludicrous when you move beyond Nvidia's "Founders Edition" and check out AIB options from third-party partners. (We haven't tested any models other than the 4090 FE yet.)

    After figuring out how to safely mount and run power to the RTX 4090, however, the laughs become decidedly different. You're going to consistently laugh with, not at, the RTX 4090, either in joy or excited disbelief.



      We are currently testing the Nvidia RTX 4090—let us show you its heft

      news.movim.eu / ArsTechnica · Wednesday, 5 October, 2022 - 23:26 · 1 minute

    The Nvidia RTX 4090 founders edition. If you can't tell, those lines are drawn on, though the heft of this $1,599 product might convince you that they're a reflection of real-world motion blur upon opening this massive box. (credit: Sam Machkovech)

    It's a busy time in the Ars Technica GPU testing salt mines (not to be confused with the mining that GPUs used to be known for). After wrapping up our take on the Intel Arc A700 series, we went right back to testing a GPU that we've had for a few days now: the Nvidia RTX 4090.

    This beast of a GPU, provided by Nvidia to Ars Technica for review purposes, is priced well out of the average consumer range, even for a product category where the average price keeps creeping upward. Though we're not allowed to disclose anything about our testing as of press time, our upcoming coverage will reflect this GPU's $1,599-and-up reality. In the meantime, we thought an unboxing of Nvidia's "founders edition" of the 4090 would begin telling the story of exactly who this GPU might not be for.

    On paper, the Nvidia RTX 4090 is poised to blow past its Nvidia predecessors, with specs that handily surpass early 2022's overkill RTX 3090 Ti. The 4090 comes packed with approximately 50 percent more CUDA cores and between 25 and 33 percent higher counts in other significant categories, particularly cores dedicated to tensor and ray-tracing calculations (which are also updated to new specs for Nvidia's new 5 nm process). However, one spec remains identical to the 3090 and 3090 Ti: VRAM type and capacity (once again, 24GB of GDDR6X RAM).



      Nvidia’s Ada Lovelace GPU generation: $1,599 for RTX 4090, $899 and up for 4080

      news.movim.eu / ArsTechnica · Tuesday, 20 September, 2022 - 15:43 · 1 minute

    Time to bust out the checkbook again, GPU lovers. The RTX 4090 is here (and it's not alone). (credit: Nvidia)

    After weeks of teases, Nvidia's newest computer graphics cards, the "Ada Lovelace" generation of RTX 4000 GPUs, are here. Nvidia CEO Jensen Huang debuted two new models on Tuesday: the RTX 4090, which will start at a whopping $1,599, and the RTX 4080, which will launch in two configurations.

    The pricier card, slated to launch on October 12, occupies the same highest-end category as Nvidia's 2020 megaton RTX 3090 (previously designated by the company as its "Titan" product). The 4090's increase in physical size will demand three slots on your PC build of choice. The specs are indicative of a highest-end GPU: 16,384 CUDA cores (up from the 3090's 10,496) and a 2.52 GHz boost clock (up from 1.695 GHz on the 3090). Despite the improvements, the card still operates within the same 450 W power envelope as the 3090, and its RAM allocation remains at 24GB of GDDR6X memory.
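
    For the spec-curious, the generational uplift implied by those numbers works out as follows (a quick back-of-the-envelope check using only the figures quoted above):

```python
# Back-of-the-envelope uplift from the quoted 4090 vs. 3090 spec numbers.
cuda_4090, cuda_3090 = 16_384, 10_496
boost_4090, boost_3090 = 2.52, 1.695  # boost clocks in GHz

print(f"CUDA cores:  +{cuda_4090 / cuda_3090 - 1:.0%}")    # roughly +56%
print(f"Boost clock: +{boost_4090 / boost_3090 - 1:.0%}")  # roughly +49%
```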

    This jump in performance is fueled in part by Nvidia's long-rumored jump to TSMC's "4N" process, which is a new generation of 5 nm chips that provides a massive efficiency jump from the previous Ampere generation's 8 nm process.
