APAC Enterprises Move AI Infrastructure to the Edge Amid Rising Inference Costs

AI spending in the Asia Pacific region continues to increase, yet many companies still face challenges in realizing value from their AI projects. A significant factor behind this struggle is the infrastructure supporting AI systems. Most current setups are not designed to run inference at the speed or scale required for real-world applications. Industry studies reveal that many AI projects fail to meet their return on investment (ROI) targets despite heavy spending on generative AI tools, largely due to infrastructure limitations.

This gap highlights the critical role AI infrastructure plays in determining performance, cost, and the ability to scale AI deployments across the region. To address these challenges, Akamai has introduced Inference Cloud, developed in partnership with NVIDIA and powered by the latest Blackwell GPUs. The core idea is straightforward: since most AI applications require real-time decision-making, these decisions should be made close to the users rather than in distant data centers. Akamai asserts that this shift can help companies reduce costs, minimize delays, and better support AI services that depend on split-second responses.

Why APAC Enterprises Move AI to the Edge: Overcoming Infrastructure Challenges

Jay Jenkins, CTO of Cloud Computing at Akamai, explained to AI News why enterprises in the Asia Pacific region are rethinking their AI deployment strategies. He emphasized that the gap between AI experimentation and full-scale production is much wider than many organizations anticipate. Jenkins noted, “Many AI initiatives fail to deliver on expected business value because enterprises often underestimate the gap between experimentation and production.” Despite strong interest in generative AI, high infrastructure costs, latency issues, and difficulties in scaling models often hinder progress.

Currently, most companies rely on centralized cloud services and large GPU clusters. However, as AI usage grows, these setups become prohibitively expensive, especially in areas far from major cloud hubs. Latency becomes a significant problem when models require multiple inference steps over long distances. Jenkins stated, “AI is only as powerful as the infrastructure and architecture it runs on.” Latency can degrade user experience and reduce the business value AI projects aim to provide. Additional challenges include multi-cloud environments, complex data regulations, and increasing compliance demands, all of which slow the transition from pilot projects to production.

How Edge Infrastructure Enhances AI Performance and Reduces Costs

As AI adoption in Asia Pacific moves from pilot phases to real deployments in applications and services, the focus has shifted from training to inference. Jenkins pointed out that day-to-day inference now consumes most computing resources, not the occasional training cycles. With organizations deploying language, vision, and multimodal models across multiple markets, the demand for fast, reliable inference is growing faster than expected. This surge has made inference the main bottleneck in the region. Models must operate in various languages, comply with different regulations, and process data in real time, putting enormous pressure on centralized systems that were not designed for such responsiveness.

Moving inference closer to users, devices, or agents can significantly change the cost structure. Shortening the distance data travels allows models to respond more quickly and avoids the expense of routing large data volumes between central cloud hubs. Physical AI systems—such as robots, autonomous machines, and smart city tools—require decisions within milliseconds to function properly. When inference is handled remotely, these systems fail to perform as expected.
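To put the distance argument in concrete terms, a back-of-the-envelope estimate of propagation delay makes the point. The figures below are illustrative assumptions (light in optical fiber travels roughly 200 km per millisecond), not measurements from Akamai:

```python
# Illustrative sketch: estimate round-trip propagation delay for a chain
# of sequential inference calls at different distances. All numbers are
# assumptions for illustration, not measurements cited in the article.

FIBER_SPEED_KM_PER_MS = 200.0  # light in fiber covers ~200 km per millisecond

def round_trip_ms(distance_km: float, hops: int = 1) -> float:
    """Propagation delay alone for `hops` sequential request/response pairs."""
    one_way_ms = distance_km / FIBER_SPEED_KM_PER_MS
    return 2 * one_way_ms * hops

# A single call to a regional hub 4,000 km away vs. an edge site 50 km away:
print(round_trip_ms(4000))  # 40.0 ms per call
print(round_trip_ms(50))    # 0.5 ms per call

# An agentic workload chaining 5 sequential model calls pays the
# distance penalty on every hop:
print(round_trip_ms(4000, hops=5))  # 200.0 ms of pure propagation delay
print(round_trip_ms(50, hops=5))    # 2.5 ms
```

Propagation delay is only a floor; queuing, processing, and model latency sit on top of it, which is why multi-step inference chains over long distances fall out of a millisecond budget so quickly.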

Akamai’s analysis shows that enterprises in countries like India and Vietnam experience substantial cost savings when image-generation workloads are moved to the edge instead of centralized clouds. These savings come from better GPU utilization and lower egress fees.
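A back-of-the-envelope cost model shows how those two levers, utilization and egress, drive the difference. Every rate and utilization figure below is an invented assumption for illustration, not Akamai's data:

```python
# Hypothetical monthly cost model for an inference workload. All rates,
# volumes, and utilization figures are invented for illustration.

def monthly_cost(gpu_hour_rate: float, utilization: float,
                 egress_gb: float, egress_rate: float,
                 useful_hours: float = 730) -> float:
    """GPU hours are billed whether or not the GPU is busy, so lower
    utilization means more billed hours per useful hour of inference.
    Egress charges are added on top."""
    billed_hours = useful_hours / utilization
    return billed_hours * gpu_hour_rate + egress_gb * egress_rate

# Centralized hub: modest utilization, heavy cross-region egress.
central = monthly_cost(gpu_hour_rate=2.50, utilization=0.40,
                       egress_gb=50_000, egress_rate=0.09)

# Edge deployment: better utilization, most traffic stays local.
edge = monthly_cost(gpu_hour_rate=2.50, utilization=0.70,
                    egress_gb=5_000, egress_rate=0.05)

print(f"central: ${central:,.2f}/mo  edge: ${edge:,.2f}/mo")
```

Under these made-up inputs the edge deployment comes out several times cheaper, which is the shape of the saving the article describes, even though the exact magnitudes will vary by workload and market.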

Industries Leading the Shift to Edge-Based AI

The strongest early demand for edge inference comes from sectors where even minor delays can impact revenue, safety, or user engagement. Retail and e-commerce are among the first to adopt edge AI because slow experiences often lead shoppers to abandon their carts. Personalized recommendations, search functions, and multimodal shopping tools all benefit from local, fast inference.

The finance sector also sees latency as a critical factor affecting value. Jenkins explained that workloads such as fraud detection, payment approvals, and transaction scoring rely on rapid AI decision chains that must occur within milliseconds. Running inference closer to where data is generated helps financial firms accelerate processes and maintain compliance with data residency regulations.

The Growing Importance of Cloud and GPU Partnerships

As AI workloads expand, companies require infrastructure capable of keeping pace. Jenkins noted that this demand has driven closer collaboration between cloud providers and GPU manufacturers. Akamai’s partnership with NVIDIA exemplifies this trend, with GPUs, DPUs, and AI software deployed across thousands of edge locations.

The goal is to create an “AI delivery network” that distributes inference tasks across many sites rather than concentrating them in a few regions. This approach improves performance and supports compliance with local data regulations. Jenkins highlighted that nearly half of large APAC organizations struggle with varying data rules across markets, making local processing essential. Emerging partnerships are shaping the future of AI infrastructure in the region, especially for workloads requiring low-latency responses.

Security is integrated into these systems from the outset. Jenkins mentioned that zero-trust controls, data-aware routing, and protections against fraud and bots are becoming standard components of AI infrastructure.

Preparing for the Future of AI in Asia Pacific

Agentic AI systems, which make multiple sequential decisions, require infrastructure capable of operating at millisecond speeds. Jenkins acknowledged that the region’s diversity in connectivity, regulations, and technical readiness presents challenges but not insurmountable ones. AI workloads must be flexible enough to run where it makes the most sense. Research indicates that most enterprises in Asia Pacific already use public cloud services in production, but many expect to rely on edge services by 2027.

This transition will demand infrastructure that can keep data within country borders, route tasks to the nearest suitable location, and maintain functionality despite network instability. As inference moves to the edge, companies will need new operational approaches. Jenkins advised that organizations should prepare for a more distributed AI lifecycle, with models updated across multiple sites. This requires improved orchestration and strong visibility into performance, costs, and errors across both core and edge systems.

Data governance will span more jurisdictions as inference spreads across sites, yet local processing can actually simplify compliance. Since half of the region’s large enterprises already struggle with regulatory differences, placing inference closer to data sources, so that regulated data never leaves its home market, helps address these challenges.

Security will require increased focus. While distributing inference to the edge can enhance resilience, every site must be secured. Companies need to protect APIs and data pipelines, and defend against fraud and bot attacks. Jenkins noted that many financial institutions already rely on Akamai’s security controls in these areas.

In summary, as APAC enterprises move AI infrastructure to the edge, they can expect improved performance, reduced costs, and better compliance with local regulations. This shift is becoming essential for organizations aiming to scale AI effectively and deliver real business value in the region.


By Futurete

My name is Go Ka, and I’m the founder and editor of Future Technology X, a news platform focused on AI, cybersecurity, advanced computing, and future digital technologies. I track how artificial intelligence, software, and modern devices change industries and everyday life, and I turn complex tech topics into clear, accurate explanations for readers around the world.