AI innovation everywhere
Overview
Spanning the full spectrum of AI computing
AI applications are becoming critical across many markets. As the demand for AI chipsets grows, key development challenges arise for AI data centers, edge AI devices, and physical AI systems. These include:
- Optimizing AI compute performance.
- Balancing energy consumption vs other requirements.
- Managing limited memory and processing resources.
- Optimizing connectivity and communication protocols.
- Addressing data security and privacy concerns.
- Ensuring safety compliance for mission-critical applications.
Arteris network-on-chip (NoC) IP, system-on-chip (SoC) integration automation software, and semiconductor security assurance software are optimized to help customers achieve peak bandwidth, reduced latency, energy efficient compute, safety and security, and fast time to market for the full spectrum of AI computing.
From the AI data center to smart edge devices and physical AI systems, Arteris helps architects maximize performance and efficiency, whether on single-die chiplets or multi-die SoCs, while managing the spatial distribution of AI workloads, as demonstrated by the vast range of silicon-proven designs.
AI data center
AI computing infrastructure in the data center or cloud is designed for massive data computation, movement, and storage to support the needs of AI workloads.
Natural language processing (NLP) models and large language models (LLMs) such as GPT-5 require massive computing to train on enormous datasets, up to 10T and above, often across thousands of GPUs working in parallel.
Batch inference, particularly for applications like non-interactive research, can tolerate longer latencies (the time it takes to go from inquiry to results) and is an excellent fit for AI infrastructure computing.
Key considerations:
- Performance, specifically throughput (PFLOPs) and bandwidth (TB/s).
- Scale up and scale out, spanning thousands of nodes with UALink, NVLink, Ultra Ethernet, and so on.
- Latency between CPU clusters and HBM or DDR memory, with JEDEC HBM4 supporting up to ~2 TB/s per stack.
- Total cost of ownership (TCO), often strongly related to megawatts (MW) and up to gigawatts (GW) of energy consumption. Modern AI racks (e.g., GB200 NVL72) draw ~120–132 kW per rack, and hyperscale campuses are planning hundreds of MW to multi-GW sites.
- XPU support across GPUs, TPUs, DPUs, IPUs, AI accelerators, and high-end CPUs, plus the underlying data movement across these and high-bandwidth memory (e.g., HBM4).
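To put the power and bandwidth figures above in proportion, here is a minimal back-of-the-envelope sketch. It uses the ~120–132 kW per rack and ~2 TB/s per HBM4 stack figures from the list. The 100 MW site budget and the 8-stack accelerator are hypothetical values chosen for illustration, and the rack estimate ignores cooling and power-delivery overhead.

```python
import math


def racks_for_budget(site_power_mw: float, rack_power_kw: float) -> int:
    """Whole racks whose IT load fits in a site power budget.

    Assumption: cooling and power-conversion overhead are ignored.
    """
    return math.floor(site_power_mw * 1000 / rack_power_kw)


def aggregate_hbm_bandwidth_tbps(stacks: int, per_stack_tbps: float = 2.0) -> float:
    """Peak memory bandwidth if every stack delivers its per-stack maximum."""
    return stacks * per_stack_tbps


# Hypothetical 100 MW site, at both ends of the per-rack power range.
racks_at_132_kw = racks_for_budget(100, 132)
racks_at_120_kw = racks_for_budget(100, 120)

# Hypothetical accelerator with 8 HBM4 stacks.
peak_bandwidth_tbps = aggregate_hbm_bandwidth_tbps(8)

print(f"racks per 100 MW: {racks_at_132_kw} to {racks_at_120_kw}")
print(f"peak HBM bandwidth: {peak_bandwidth_tbps} TB/s")
```

Even under these simplified assumptions, a 12 kW difference per rack changes how many racks fit in the same power envelope, which is why energy per unit of data moved feeds directly into TCO.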
Edge AI
AI at the edge means that data is processed closer to its source, on devices like IoT sensors, smartphones, smart TVs, hearing aids, Wi-Fi routers, other consumer electronics, and local gateways. AI runs locally and delivers immediate benefits through inference, without taking control away from the human or the physical world. These devices use AI compute to boost their value, so AI semiconductor considerations are an add-on: all other design characteristics still need to be preserved, from energy efficiency to form factor and the cost considerations that drive area constraints.
Key considerations:
- Ultra-low latency allows real-time decisions without cloud access.
- Energy efficiency must support inference in the milliwatt (mW) to watt range, often on rechargeable batteries.
- Form factors and cost trade-offs are associated with endpoint devices.
- Trade-offs in performance and bandwidth should accommodate form-factor and cost requirements.
- Security and data privacy concerns.
- XPU support across NPUs, edge TPUs, MCUs, embedded FPGAs, and embedded GPUs.
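The energy-efficiency consideration above can be made concrete with a simple duty-cycle estimate. This is a minimal sketch, not a device model: the 50 mW active power, 1 mW idle power, 10% duty cycle, and 1000 mWh battery are all hypothetical values chosen for illustration.

```python
def average_power_mw(active_mw: float, idle_mw: float, duty_cycle: float) -> float:
    """Time-weighted average power for a device active duty_cycle of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_mw + (1.0 - duty_cycle) * idle_mw


def battery_life_hours(capacity_mwh: float, avg_power_mw: float) -> float:
    """Runtime on one charge, ignoring battery aging and conversion losses."""
    return capacity_mwh / avg_power_mw


# Hypothetical edge device: inference runs 10% of the time.
avg = average_power_mw(active_mw=50.0, idle_mw=1.0, duty_cycle=0.1)
hours = battery_life_hours(capacity_mwh=1000.0, avg_power_mw=avg)

print(f"average power: {avg:.1f} mW, battery life: {hours:.0f} h")
```

In this model, active inference accounts for most of the average power even at a low duty cycle, so reducing the energy spent per inference extends battery life more than trimming idle power does.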
Physical AI (Closed-Loop Intelligence)
Physical AI is a class of AI systems characterized by closed-loop interaction with the physical world. These systems integrate perception, decision-making, and actuation under hard real-time and safety constraints. Physical AI goes beyond AI data centers that think at scale, and beyond what inference-centric edge AI does locally, by interacting with the physical world, with all the associated consequences and implications for the underlying electronics.
These systems require deterministic, worst-case bounded latency, sustained execution, and built-in safety and security, rather than just best-effort performance, to meet the rigorous demands of interacting with the physical world in robotics, autonomous vehicles, smart drones, and advanced industrial automation.
Key considerations:
- Deterministic, worst-case bounded latency rather than best-effort p99 latency.
- Continuous, sustained workloads instead of burst inference.
- Control-loop-centric dataflows dominate over feed-forward pipelines.
- Safety-critical operation requiring fault isolation and containment.
- System-wide security assurance for regulatory compliance.
- Guaranteed delivery paths to actuators and control processors.
- XPU support across embedded GPUs, NPUs, CPUs, MCUs and FPGAs.
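The difference between best-effort p99 latency and worst-case bounded latency in the list above can be illustrated with a short sketch. The latency samples and the 100 µs deadline are hypothetical. Note that the maximum of a measured sample is only the worst case observed so far, while a hard real-time guarantee needs a bound established by design and analysis.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_hard_deadline(samples, deadline_us):
    """True only if every observed sample is within the deadline."""
    return max(samples) <= deadline_us


# Hypothetical control-loop latencies in microseconds: 199 fast, 1 slow outlier.
latencies_us = [40] * 199 + [900]
deadline_us = 100

p99 = percentile(latencies_us, 99)
print(f"p99 = {p99} us, within deadline: {p99 <= deadline_us}")
print(f"max = {max(latencies_us)} us, "
      f"within deadline: {meets_hard_deadline(latencies_us, deadline_us)}")
```

Here the p99 figure looks comfortable while a single outlier misses the deadline, which is the case a closed-loop system acting on the physical world cannot tolerate.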
Advantages
Scale your SoC and reduce complexity
Performance and bandwidth
Increase chiplet and multi-die SoC bandwidth with HBM and multichannel memory support, multicast/broadcast writes, VCLink™ virtual channels, and source synchronous communications.
Scalability
Create highly scalable ring, mesh, and torus topologies with highly efficient approaches, unlike black box compilers. SoC architects can edit generated topologies and optimize each individual network router. Arteris also supports scale-up protocols like UALink and is backed by a broad ecosystem.
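As a rough illustration of why the choice among ring, mesh, and torus matters, the sketch below compares worst-case hop count (diameter) and bidirectional link count using standard graph formulas. It is a generic textbook comparison, not a model of any Arteris tool or generated topology, and the 16-node size is a hypothetical example.

```python
def ring(n):
    """Bidirectional ring of n >= 3 nodes: (diameter in hops, link count)."""
    return n // 2, n


def mesh(rows, cols):
    """2D mesh without wraparound links: (diameter in hops, link count)."""
    diameter = (rows - 1) + (cols - 1)
    links = rows * (cols - 1) + cols * (rows - 1)
    return diameter, links


def torus(rows, cols):
    """2D torus with wraparound links, rows and cols >= 3: (diameter, link count)."""
    diameter = rows // 2 + cols // 2
    links = 2 * rows * cols
    return diameter, links


# Hypothetical 16-node network in each topology.
for name, (diameter, links) in {
    "ring": ring(16),
    "4x4 mesh": mesh(4, 4),
    "4x4 torus": torus(4, 4),
}.items():
    print(f"{name}: diameter {diameter} hops, {links} links")
```

Fewer worst-case hops come at the cost of more links and therefore more wires, which is the trade-off an architect tunes when editing a generated topology.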
Energy efficiency, low power
Fewer wires and fewer gates consume less power, while breaking communication paths into smaller segments allows powering only active segments. A simple internal protocol enables aggressive clock gating, supporting lower TCO for data centers and increased edge battery life.
Innovations
Technology advances
AI chiplets and multi-die designs
The insatiable drive toward AI-driven computing in semiconductor devices is propelling adoption of multi-die systems and chiplet-based designs. These designs promise to help satisfy the escalating demands of today’s complex, high-performance computing and AI workloads.
Arteris accelerates AI-driven silicon innovation with its expanded multi-die solution, which delivers flexible design scalability, differentiated AI performance, alignment with evolving industry standards, silicon-proven chiplets, and broad ecosystem support.
Network-on-chip AI tiling accelerates semiconductor designs for AI applications
AI tiling is an emerging trend in chiplet and SoC design that uses proven, robust network-on-chip (NoC) IP to facilitate scaling, condense design time, speed testing, and reduce design risk.
It empowers SoC architects to quickly create modular, scalable AI designs, enabling faster integration, verification, and optimization across non-coherent and coherent data traffic.
AI tiling advantages – available with Arteris FlexGen®, FlexNoC® and Ncore™ NoC IPs:
- Scalable performance
- Power and area reduction
- Dynamic reuse and productivity gains
- Supports Arm, RISC-V, x86, and mixed architectures
Physical awareness
The majority of modern AI hardware relies on advanced nodes, from 5 nm and below on the edge to 3 nm, 2 nm, and increasingly angstrom-scale nodes in data center designs. In such designs, particularly with the massive underlying AI data movement, wires and physical closure, including timing closure, become increasingly challenging, often resulting in missed silicon schedules, redesigns, or both.
Arteris provides physically aware NoCs, enabling SoC architecture teams, logic designers and integrators to account for physical constraint management across power, performance and area (PPA) early in the design cycle. This results in 5x faster physical convergence over manual refinements, with fewer iterations from the back-end physical design team. The resulting physically optimized NoC IPs are ready for output to physical synthesis and place and route for implementation without further redesign of the overall chiplet and multi-die SoC.
Arteris NoC Tiling Innovation
Addresses AI Use Cases Across Markets
Edge computing inference applications: focus on modularity and hierarchical design, easier routing and layout, plus area and power efficiency
Artificial Intelligence / Machine Learning
Partners
Andes Technology is a founding and premier member of RISC-V International and a leading supplier of high-performance/low-power RISC-V processor IP. Andes Technology and Arteris partner to advance innovation for RISC-V based SoC designs for AI, 5G, networking, mobile, storage, AIoT, and space applications. The Andes QiLai RISC-V platform is a development board with a QiLai SoC featuring Andes RISC-V processor IP along with Arteris FlexNoC interconnect IP for on-chip connectivity.
In March 2024, Arteris delivered on its previously announced collaboration with Arm with an emulation-based validation system for Armv9 and CHI-E-based designs. The system is intended to speed up innovation in automotive electronics for autonomous driving, advanced driver-assistance systems (ADAS), cockpit and infotainment, vision, radar and lidar, body and chassis control, zonal controllers, and other automotive applications. Arteris aligned its roadmap with Arm to enable designers to get to market faster with an optimized and pre-validated high-bandwidth, low-latency Ncore cache coherent interconnect IP for Arm’s Automotive Enhanced (AE) compute portfolio. The partnership helps customers realize SoCs with high performance and power efficiency for safety-critical tasks while reducing project schedules and costs. It offers mutual customers a greater choice of safe, integrated, and optimized automotive solutions, with seamless integration and optimized flows that shorten time to market and support ISO 26262 systems at the highest automotive safety integrity levels (ASIL).
The Damo Wujian Alliance, spearheaded by Damo Academy (an affiliate of Alibaba Group), is an ecosystem alliance driving the adoption and development of the RISC-V instruction-set architecture. The coalition focuses on high-performance System-on-Chip (SoC) designs, particularly in edge AI computing. As part of the alliance, Arteris plays a pivotal role by enabling the integration of Damo Academy’s / T-Head’s Xuantie RISC-V processor IP cores with its Ncore cache coherent network-on-chip (NoC) system IP, resulting in efficient data transport architectures within cores and between chips, enabling cutting-edge applications in AI, machine learning, and more.
Fraunhofer IESE is one of 76 institutes and research units of the Fraunhofer-Gesellschaft. Together they have a major impact on shaping applied research in Europe and contribute to Germany’s competitiveness in international markets. Our partnership with Fraunhofer enables early architecture analysis of DRAM performance effects on network-on-chip performance through connections between Ncore and FlexNoC SystemC simulation and the DRAMSys DRAM modeling and simulation framework.
Arteris and SiFive have partnered to accelerate the development of edge AI SoCs for consumer electronics and industrial applications. The partnership combines SiFive’s multi-core RISC-V processor IP and Arteris’ Ncore cache coherent interconnect IP, providing high performance and power efficiency with reduced project schedules and integration costs. The collaboration has led to the development of the SiFive 22G1 X280 Customer Reference Platform, incorporating SiFive X280 processor IP and Arteris Ncore cache coherent interconnect IP on the AMD Virtex UltraScale+ FPGA VCU118 Evaluation Kit.
Semidynamics is a provider of fully customizable RISC-V processor IP and specializes in high-bandwidth, high-performance cores with Vector Units, Tensor Units, and Gazzillion, targeted at machine learning and AI applications. Our collaboration enhances the flexibility and configurable interoperability of processor IP with system IP, aiming to deliver integrated and optimized solutions focused on accelerating artificial intelligence, machine learning, and high-performance computing (HPC) applications.
Products
Products for AI and machine learning
Resources
Unlock insights into AI security challenges & protection strategies. Dive deep into the dual role of AI in cybersecurity.
- Inside Chips Podcast: Data Movement in the AI Age with Charlie Janac
- SemiWiki: How NoC Tiling Capability is Changing the Game for AI Development with Andy Nightingale
- EE Journal: Managing the Massive Data Throughput: AI-Based Designs and The Value of NoC Tiling
- EE Journal: The Network-on-Chip Pioneer: How Arteris Enabling SoC Developers to Create Physically Valid NoCs Faster
- Electronic Design: All About NoCs
- SemiWiki: A Broad View of Design Architectures and the Role of the NoC with Arteris’ Michal Siwinski
- EE Journal: The Freedom to Innovate: Arteris and the Rise of RISC-V
- Building Better IP with RTL Architect NoC IP Physical Exploration
- Efficient Scaling of AI Accelerators Using NoC Tiling
- Is the Missing Safety Ingredient in Automotive AI Traceability?
- Lessons Learned Integrating AI/ML Accelerators into Complex ISO 26262 Compliant Systems-on-Chip
- The Role of Networks-on-Chips Enabling AI/ML Silicon and Systems
- Tiled Approach to System Scaling
- Smart NoC Automation: Accelerating AI-Ready SoC Design in the Era of Chiplet
- Magillem Registers – Automate the Hardware/Software Interface for Fast Chip Design
- Accelerating Timing Closure for Networks-on-Chip (NoCs) using Physical Awareness
- Optimizing Data Transport Architectures in RISC-V SoCs for AI/ML Applications
- Cache Coherency in Heterogeneous Systems
- Integration Challenges for RISC-V Designs
- Promises and Pitfalls of SoC Restructuring
- Scaling Performance in AI Systems