AI Inference: NVIDIA Reports Blackwell Surpasses 1000 TPS/User Barrier with Llama 4 Maverick

NVIDIA said it has set a record for large language model inference, announcing that an NVIDIA DGX B200 node with eight Blackwell GPUs delivered more than 1,000 tokens ….
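For readers unfamiliar with the metric in the headline, the sketch below shows one common way "tokens per second per user" (TPS/user) is computed: generated tokens divided by wall-clock time and by the number of concurrent users. The function name and the example numbers are illustrative assumptions, not NVIDIA's benchmark methodology or data.

```python
# Illustrative TPS/user calculation (hypothetical numbers, not NVIDIA's benchmark).
def tps_per_user(total_tokens: int, elapsed_seconds: float, concurrent_users: int) -> float:
    """Average decode throughput each user observes, in tokens per second."""
    return total_tokens / elapsed_seconds / concurrent_users

# Example: a single-user run generating 10,000 tokens in 9.5 seconds
# works out to roughly 1,053 TPS/user.
print(f"{tps_per_user(10_000, 9.5, 1):.0f} TPS/user")
```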

Multiverse Says It Compresses Llama Models by 80%

Multiverse Computing today released two new AI models compressed by CompactifAI, Multiverse’s AI compressor: 80 percent compressed versions of Llama 3.1-8B and Llama 3.3-70B. Both models have 60 percent fewer parameters than the originals, 84 percent greater energy efficiency ….
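As a rough sanity check on the "60 percent fewer parameters" figure, the snippet below applies that reduction to the nominal parameter counts implied by the model names (8B and 70B). These are back-of-the-envelope approximations, not figures from Multiverse's release.

```python
# Approximate compressed parameter counts, assuming nominal original sizes
# of 8B and 70B and the reported 60 percent parameter reduction.
ORIGINAL_PARAMS = {"Llama 3.1-8B": 8e9, "Llama 3.3-70B": 70e9}
REDUCTION = 0.60  # "60 percent fewer parameters"

for name, params in ORIGINAL_PARAMS.items():
    compressed = params * (1 - REDUCTION)
    print(f"{name}: ~{compressed / 1e9:.1f}B parameters after compression")
```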