Data Centre Magazine June 2024 | Page 110

space across multiple systems. "MoE models allow much larger models by parallelising the infrastructure," he said. "But it requires ultra-fast communication between the parallel expert nodes."
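The routing idea behind MoE layers can be sketched in a few lines. This is a toy illustration, not any vendor's implementation: the gating function, expert functions, and token types here are all invented for the example. In a production system each expert lives on its own accelerator node, which is why the inter-node communication Dion mentions becomes the bottleneck.

```python
# Toy sketch of Mixture-of-Experts (MoE) routing. All names are
# illustrative; real MoE layers use a learned router (a softmax over
# expert logits) rather than a hash, and experts are neural sub-networks.

def gate(token, num_experts):
    """Pick one expert per token. A stand-in for a learned router."""
    return hash(token) % num_experts

def moe_layer(tokens, experts):
    """Route each token to a single expert. Only that expert's
    parameters are touched, so total model size can grow with the
    number of experts while per-token compute stays roughly flat."""
    outputs = []
    for tok in tokens:
        expert_id = gate(tok, len(experts))
        # In a distributed deployment this dispatch is an all-to-all
        # exchange between nodes -- the ultra-fast communication the
        # article refers to.
        outputs.append(experts[expert_id](tok))
    return outputs

# Two toy "experts": each is just a string transform here.
experts = [lambda t: t.upper(), lambda t: t.lower()]
print(moe_layer(["Hello", "World"], experts))
```

The key property the sketch shows is sparsity: each token activates one expert, so adding experts widens the model's capacity without widening the per-token compute path.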
This dramatic increase in model size and complexity is driving exponential growth in AI computing requirements, projected to surge from 4.3 petaflops today to nearly 800 petaflops within a couple of years. Moreover, AI inference – deploying trained models for real-world use cases – is expected to explode from just 5% of workloads today to more than 60% by 2026.
"After deploying the AI training clusters over the next few years, inference is what will truly change our daily lives," Dion says.
"My favourite quote in the last six-plus months comes from Nvidia's CEO Jensen Huang – 'AI is the defining technology of our time. By working with the most dynamic companies in the world, we will realise the promise of AI in every industry.' When you think about that quote, it really encompasses everything we do. Imagine AI in every industrial sector – education, healthcare, finance and so on. When you hear the excitement