NVIDIA Launches New Catalog of GPU-Accelerated Microservices for AI

NVIDIA unveiled a collection of enterprise-grade generative AI microservices that allow businesses to build and deploy custom applications on their own platforms while maintaining control of their intellectual property.

These cloud-native microservices, built on the NVIDIA CUDA platform, include NVIDIA NIM microservices for optimized performance of over two dozen popular AI models from NVIDIA and its partners. Additionally, NVIDIA-accelerated software development kits, libraries, and tools are now available as NVIDIA CUDA-X microservices for retrieval-augmented generation (RAG), guardrails, data processing, and more. NVIDIA also announced separate healthcare-focused NIM and CUDA-X microservices.

This curated selection of microservices extends NVIDIA’s full-stack computing platform. It establishes a standardized approach for running custom AI models optimized for NVIDIA’s massive installed base of CUDA-enabled GPUs across various environments, including clouds, data centers, workstations, and PCs.

Leading application, data, and cybersecurity platform providers, including Adobe, Cadence, CrowdStrike, Getty Images, SAP, ServiceNow, and Shutterstock, are among the first to leverage the new NVIDIA generative AI microservices available in NVIDIA AI Enterprise 5.0.

NIM Microservices for Faster Deployments

NIM microservices are pre-built containers powered by NVIDIA inference software, enabling developers to significantly reduce deployment times. They offer industry-standard APIs for various domains, allowing for the rapid development of AI applications using a company’s proprietary data hosted securely on its infrastructure. These applications can scale on demand, providing flexibility and performance for generative AI production on NVIDIA-accelerated computing platforms.
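To make the "industry-standard APIs" point concrete, here is a minimal sketch of calling a locally hosted NIM-style endpoint through an OpenAI-compatible chat-completions API. The base URL, port, and model identifier below are assumptions chosen for illustration; they are not taken from the announcement.

```python
# Minimal sketch: query a self-hosted, OpenAI-compatible inference endpoint.
# BASE_URL and MODEL are hypothetical values for illustration only.
import requests

BASE_URL = "http://localhost:8000/v1"      # assumed local endpoint
MODEL = "meta/llama3-8b-instruct"          # hypothetical model identifier

payload = {
    "model": MODEL,
    "messages": [
        {"role": "user", "content": "Summarize our Q3 support tickets."}
    ],
    "max_tokens": 256,
}

# Because the service runs on the company's own infrastructure, proprietary
# data included in the prompt never leaves that environment.
response = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```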

NIM microservices provide high-performance production AI containers for deploying models from NVIDIA and partners such as AI21, Adept, Cohere, Getty Images, and Shutterstock, as well as open models from Google, Hugging Face, Meta, Microsoft, Mistral AI, and Stability AI.

CUDA-X Microservices for Diverse Applications

CUDA-X microservices provide comprehensive building blocks for data preparation, customization, and training, accelerating AI development across industries.

Enterprises can leverage CUDA-X microservices, including NVIDIA Riva for customizable speech and translation AI, NVIDIA cuOpt for routing optimization, and NVIDIA Earth-2 for high-resolution climate and weather simulations, to expedite AI adoption.

NeMo Retriever microservices allow developers to connect their AI applications to business data (text, images, visualizations) to generate highly relevant responses. This RAG functionality enables enterprises to provide more data to chatbots, copilots, and generative AI productivity tools, enhancing accuracy and insights.
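As a toy sketch of the RAG pattern just described, the snippet below picks the business document most relevant to a question and folds it into the prompt sent to a generative model. The lexical score() function is a deliberately simple stand-in for a real retriever such as an embedding-based service; all names and sample documents here are illustrative, not NVIDIA APIs.

```python
# Toy RAG sketch: retrieve the most relevant document, then augment the prompt.
documents = [
    "FY24 travel policy: book economy class for flights under six hours.",
    "Expense reports must be filed within 30 days of the trip's end date.",
]

def score(query: str, doc: str) -> int:
    # Toy relevance measure: count shared lowercase words. Production
    # retrievers compare dense embeddings instead of raw tokens.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, k: int = 1) -> list[str]:
    ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
    return ranked[:k]

question = "How many days do I have to file expense reports?"
context = "\n".join(retrieve(question))

# The augmented prompt is what would be sent to a generation endpoint
# hosted on the company's own infrastructure.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```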

Additional NVIDIA NeMo microservices are forthcoming for custom model development, including NVIDIA NeMo Curator for building clean training and retrieval datasets, NVIDIA NeMo Customizer for fine-tuning LLMs with domain-specific data, NVIDIA NeMo Evaluator for analyzing AI model performance, and NVIDIA NeMo Guardrails for LLMs.

Ecosystem Collaboration on Generative AI Microservices

Beyond leading application providers, data, infrastructure, and compute platform providers within the NVIDIA ecosystem are collaborating on NVIDIA microservices to deliver generative AI to enterprises.

Top data platform providers like Box, Cloudera, Cohesity, DataStax, Dropbox, and NetApp are working with NVIDIA microservices to help customers optimize RAG pipelines and integrate their proprietary data into generative AI applications. Snowflake leverages NeMo Retriever to harness enterprise data for building AI applications.

Enterprises can deploy NVIDIA microservices included with NVIDIA AI Enterprise 5.0 on their preferred infrastructure, including leading cloud platforms (AWS, Google Cloud, Azure, Oracle Cloud Infrastructure) or over 400 NVIDIA-Certified Systems from Cisco, Dell Technologies, HPE, HP, Lenovo, and Supermicro.

Additionally, NVIDIA AI Enterprise microservices are coming to infrastructure software platforms like VMware Private AI Foundation with NVIDIA and Red Hat OpenShift. This collaboration will aid enterprises in seamlessly integrating generative AI capabilities into their applications with optimized security, compliance, and control features. Canonical is also adding Charmed Kubernetes support for NVIDIA microservices through NVIDIA AI Enterprise.

NVIDIA’s vast ecosystem of AI and MLOps partners is incorporating support for NVIDIA microservices through NVIDIA AI Enterprise. This includes Abridge, Anyscale, Dataiku, DataRobot, Glean, H2O.ai, Securiti AI, Scale AI, OctoAI, and Weights & Biases.

Vector search providers like Apache Lucene, DataStax, Faiss, Kinetica, Milvus, Redis, and Weaviate are collaborating with NVIDIA NeMo Retriever microservices to power responsive RAG capabilities for enterprises.
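To make the vector-search step concrete, here is a small, self-contained illustration using Faiss, one of the libraries named above. The vectors are random placeholders standing in for embeddings of enterprise documents and user queries; the dimensions and sizes are arbitrary assumptions.

```python
# Sketch of the nearest-neighbour search that backs a RAG retrieval step.
import faiss
import numpy as np

dim = 128                                                  # assumed embedding size
corpus = np.random.random((1000, dim)).astype("float32")   # "document" vectors
queries = np.random.random((5, dim)).astype("float32")     # "query" vectors

index = faiss.IndexFlatL2(dim)    # exact L2 nearest-neighbour index
index.add(corpus)                 # index the document vectors

distances, ids = index.search(queries, 3)   # top-3 matches for each query
print(ids)                                  # row i: ids of documents retrieved for query i
```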

Availability

Developers can experiment with NVIDIA microservices for free at ai.nvidia.com. Enterprises can deploy production-grade NIM microservices with NVIDIA AI Enterprise 5.0 running on NVIDIA-Certified Systems and leading cloud platforms.

Read the official announcement here – https://nvidianews.nvidia.com/news/generative-ai-microservices-for-developers

Read more trending news here – https://biztrendnews.com/
