Sunnyvale, CA – May 8, 2025 – Rafay Systems, a cloud-native and AI infrastructure orchestration and management company, announced the general availability of its Serverless Inference offering, a token-metered API for running open-source and privately trained or tuned LLMs. The company said many NVIDIA Cloud Providers (NCPs) and GPU Clouds are already leveraging the Rafay […]