Natalia Vassilieva

Natalia Vassilieva is Vice President and Field CTO at Cerebras Systems, leading customer engineering. Her expertise spans hardware and software, with a strong understanding of how algorithms run efficiently on different architectures. Prior to joining Cerebras, Natalia led the Software and AI group at Hewlett Packard Labs and served as the Head of HP Labs Russia. She has directed research in LLM training, NLP, computer vision, and information retrieval. Earlier in her career, she served as an Associate Professor at St. Petersburg State University and as a lecturer at the Computer Science Center in St. Petersburg, Russia.

Abstract: Mapping Intelligence to Silicon

Modern AI spans an increasingly diverse set of model architectures and computational workflows. Different types of models, such as large language models, multimodal systems, diffusion models, and world models, exhibit distinct computational characteristics. Different workflows, such as training, reinforcement learning, and inference, also place different demands on compute infrastructure. Despite this diversity, today’s AI infrastructure, from hardware to core frameworks and libraries, remains largely homogeneous. To maintain computational efficiency, we impose uniformity of primitives: we force the diverse mathematical structures of different models into rigid, dense matrix multiplications (GEMM).

This talk presents a first-principles analysis of the relationship between modern AI workloads and hardware topologies. We examine how different model architectures and computational workflows result in distinct computational patterns, and how those patterns translate into requirements for compute, memory, and communication. We compare alternative hardware approaches, from HBM-centric accelerators to ultra-high-bandwidth SRAM-based architectures, and explore the trade-offs they make across diverse workloads.

The central argument is that no single hardware architecture can efficiently serve the full spectrum of modern AI. As workloads diversify, forcing them onto uniform compute platforms leads to increasing inefficiency, cost, and wasted resources. The next era of AI progress will depend on tighter co-design between algorithms and hardware, enabling increasingly heterogeneous computing systems that align computational structure with the hardware best suited to execute it. Ironically, AI itself may accelerate this transition by reducing the cost of developing specialized hardware and software, making heterogeneity not only technically desirable but also economically practical.