Why AI resources must be distributed
Lessons from Singapore’s approach to water management
ADVANCEMENTS in artificial intelligence (AI) are heralding a new industrial revolution. Just as the first few industrial revolutions were propelled by steam, water, and electric power, the rise of the AI era will be characterised by the ubiquity of compute power that stores and processes data.
In fact, the ability to leverage the full potential of AI and ensure that it benefits everyone hinges on the availability and efficiency of compute power. Like water and electricity, compute power will eventually be expected to be available on demand, as a utility.
That is why the world would be well served by learning how Singapore ensures a robust and sustainable water supply for residents despite the lack of natural freshwater sources. It does so through a diversified strategy called the “Four National Taps”: collecting rainwater from local catchments, importing water from neighbouring Malaysia, producing high-grade recycled water, and desalinating seawater.
How is this relevant to compute for AI?
Well, just as it is unrealistic to rely solely on dams to store and distribute a country’s water supply, the AI era will require compute power supplied by a whole gamut of data centres, PCs and edge devices, akin to Singapore’s diversified strategy for securing water.
Three key reasons underscore why the future of AI compute must be distributed.
Economics, distance, and the rules of the land
Without a doubt, data centres are at the heart of the AI revolution. Generative AI (Gen AI) applications based on large language models (LLMs) require training on massive amounts of data. This, in turn, requires large-scale, high-performance compute infrastructure consisting of thousands of central processing units (CPUs), graphics processing units (GPUs), accelerators, and networking chips.
While data centres will continue to be critical, there are three reasons why we should look to diversify beyond them.
1. Economics: It can be expensive to run all AI workloads via data centres or the cloud. Whether an organisation owns or leases data centres or relies on cloud subscriptions, the investment, operations, and complexity can be out of reach for many.
2. Distance: Sending data back and forth between its source and data centres can slow things down, especially without sufficient throughput and low latency. This is not ideal for time-sensitive applications such as autonomous vehicles.
3. Rules of the land: An evolving regulatory environment means that, in some cases, organisations dealing with sensitive information may need their data to be stored within the country, on their own premises, or where the data is generated. This may not always be possible if an organisation relies solely on the cloud or data centres.
It is therefore necessary to spread AI-related compute across locations and devices, using different compute resources for different AI use cases. As compute becomes ever more powerful and efficient, why fine-tune a smaller language model in the data centre or cloud if you could do so right on your PC?
PCs are more important than ever
PCs are at a turning point with the arrival of AI-enabled PCs (or AI PCs). These machines combine a CPU, a GPU, and a neural processing unit (NPU), which means they can run and accelerate AI workloads locally. Just think of how a few lines of instruction in PowerPoint could help create a visually stunning presentation in seconds.
Some may say they can already do this via web browsers on a three-year-old laptop. That may be so, but older PCs take longer to process such tasks, consume more energy, and incur higher costs for sending data back and forth to a cloud-based application or data centre – all of which can be tricky when handling sensitive data you don’t want leaving the premises or even the country.
These issues are amplified in enterprise environments. More and more employees are using AI applications in their everyday work; more businesses need to train or fine-tune AI models with proprietary data; and much enterprise software, such as database management applications, is licensed by the number of cloud CPU cores used to power it.
AI PCs can run such AI workloads more efficiently, making better use of hardware resources. The resulting reductions in operational costs, together with gains in efficiency and productivity, could add up to significant business benefits over time.
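For the technically inclined, here is a minimal sketch of what targeting an AI PC’s NPU can look like, assuming an Intel-based machine with a recent release of the open-source OpenVINO toolkit and an NPU driver installed; the model file name is purely hypothetical:

```python
# A minimal sketch, assuming OpenVINO is installed (pip install openvino)
# on an AI PC whose NPU driver is present; "model.xml" is a hypothetical
# model already converted to OpenVINO's format.
from openvino import Core

core = Core()
# List the compute devices OpenVINO can see on this machine;
# on an AI PC this typically includes "CPU", "GPU" and "NPU".
print(core.available_devices)

model = core.read_model("model.xml")
# Compile the model for the NPU, so inference runs on-device
# rather than being sent to a cloud service or data centre.
compiled = core.compile_model(model, device_name="NPU")
# result = compiled(input_data)  # inference stays on the local machine
```

If no NPU is present, swapping in “CPU” or “GPU” as the device name lets the same code fall back to the machine’s other processors.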
An edge in the AI age
Data centres and AI PCs aside, it stands to reason that more AI will move to the “edge”.
The edge includes applications such as the Internet of Things (IoT), autonomous vehicles, and smart-city devices. To harness their full potential, though, these applications require data processing at the edge of the network, closer to where data is generated, rather than in centralised data centres.
The need for edge compute is paramount in the age of AI. It enables real-time processing, which is important when split-second decisions can impact safety, such as in industrial automation settings. Additionally, processing data locally lowers the volume sent to the cloud, thereby reducing network congestion, cutting data transfer costs, and improving security by minimising exposure during transmission. Last but not least, when Internet connectivity is down, edge compute ensures critical applications continue to function – vital for industries such as healthcare.
AI use cases such as these, which apply trained machine-learning models to make predictions or decisions based on new input data, are what we call “inferencing”. Unlike training, which often demands far heavier compute infrastructure, inferencing can be done at the edge on general-purpose compute servers, using familiar hardware, consuming less power, and offering the flexibility to handle different environments.
In fact, market research firm IDC has predicted that, by 2025, 75 per cent of enterprise-generated data globally will be created and processed not in traditional data centres or the cloud, but on the edge. It is important to recognise that not only will more AI and compute move to the edge, but the predominant workload there will be inferencing. Just think of how many people “build” weather models versus how many people “use” weather models. That is training versus inferencing, and the latter will account for the bulk of AI workloads in the future. Knowing this will help businesses prepare the right compute infrastructure down the road.
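For readers who want to see the distinction in miniature, here is a purely illustrative sketch in Python, with a toy linear model standing in for a real weather model: training fits the parameters once over a large historical dataset, while inferencing merely applies them to each new reading.

```python
import numpy as np

# "Training": fit a toy linear model once on lots of historical data.
# This is the compute-heavy step suited to a data centre.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 3))   # e.g. pressure, humidity, wind speed
true_weights = np.array([2.0, -1.0, 0.5])
y = X @ true_weights + rng.normal(scale=0.1, size=10_000)
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

# "Inferencing": apply the already-trained weights to one new reading.
# This step is cheap, runs anywhere (edge server, PC), and happens far
# more often than training.
new_reading = np.array([1.2, 0.3, -0.7])
forecast = new_reading @ weights
print(round(float(forecast), 2))
```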
The right tool for the right job
AI is complex, and its compute requirements vary vastly with the use case, shaped by factors such as user experience, operational considerations, costs, government regulations and more. To grow AI sustainably, we must think about the kind of infrastructure that makes the most sense to supply the world’s insatiable, and still growing, demand for compute.
According to IDC, AI infrastructure investments in Singapore are projected to grow at a compound annual growth rate of 14.8 per cent, reaching US$1.4 billion by the end of 2027. These investments will be in areas such as specialised AI processors, data storage and management, and network and cloud servers. All these will serve as a springboard for high-value and data-intensive applications down the road. We will likely see this trend play out in markets globally.
Going back to the utility analogy: just as a nation’s water supply requires not only dams but also reservoirs, treatment facilities, and more, so the supply of compute power needs a network of infrastructure of varying kinds.
The lessons from utility supplies, on efficiency, security, and sustainability, all point the same way: there is rarely a one-size-fits-all approach, and that is certainly true of compute power in the AI era.
The writer is vice-president of the Sales, Marketing & Communications Group and general manager for Asia-Pacific and Japan region at Intel