Groq 30 Days to Starting With Large Customers

by Brian Wang

Groq is processing 30,000 input inference tokens and plans to assemble about 1,500 chips into an inference data center that will process 25 million inference tokens per second by the end of the year.
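As a rough check on those figures, the short Python sketch below divides the planned data center throughput by the chip count. It assumes the load is split evenly across all chips and ignores networking and scheduling overhead; the function name `tokens_per_chip` is made up for illustration and does not come from Groq.

```python
# Figures quoted in the article.
TARGET_TOKENS_PER_SECOND = 25_000_000  # planned data center throughput
CHIP_COUNT = 1_500                     # "about 1500 chips"


def tokens_per_chip(total_tokens_per_second: float, chips: int) -> float:
    """Average throughput each chip would need, assuming an even split."""
    if chips <= 0:
        raise ValueError("chips must be positive")
    return total_tokens_per_second / chips


per_chip = tokens_per_chip(TARGET_TOKENS_PER_SECOND, CHIP_COUNT)
print(f"{per_chip:,.0f} tokens per second per chip")
# prints "16,667 tokens per second per chip"
```

Under that even-split assumption, each chip would need to sustain roughly 16,700 inference tokens per second to reach the stated target.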

Groq uses fully synchronous SRAM. Nvidia uses HBM (high-bandwidth memory, which is stacked DRAM).
Nvidia announced that its H200 chip will process 24,000 inferences per second.

Groq says that its ASIC performs inference with 3 to 10 times the energy efficiency of the Nvidia chips.

The AI inference chips and the AI models are making huge leaps in progress. There will soon be new stateless models.