The AI community co-designs applications and algorithms to make progress in AI, and those algorithms in turn shape the co-design of processor architecture. Over the past few years, as processors have become more tailored to AI workloads, we have seen orders-of-magnitude improvements in computational efficiency, and we expect further gains from co-optimizing hardware architectures and AI algorithms. In this talk, I'll discuss how NVIDIA has been co-designing its processors for AI, including the GPU as well as our recently open-sourced Deep Learning Accelerator. I'll cover lower-precision arithmetic, memory subsystems, and processor architectures built to support AI, and how we incorporate insights from AI practice into our processors to improve them for the workloads of the future. GPUs have always been processors co-designed for important workloads; this talk will discuss how we think about building them.
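As a rough illustration of the lower-precision arithmetic mentioned above (a generic sketch of a common accelerator pattern, not NVIDIA's actual design), reduced-precision storage is often paired with higher-precision accumulation:

```python
import numpy as np

# Hypothetical sketch: store operands in half precision to cut memory
# traffic, but accumulate the matrix product in a wider format to limit
# rounding error. This mirrors a common mixed-precision pattern; it is
# not a description of any specific NVIDIA hardware.
rng = np.random.default_rng(0)
a = rng.standard_normal((256, 256)).astype(np.float16)  # half-precision weights
b = rng.standard_normal((256, 256)).astype(np.float16)  # half-precision activations

# Accumulate in float32 rather than float16.
c_mixed = a.astype(np.float32) @ b.astype(np.float32)

# Double-precision reference on the same fp16 inputs.
c_ref = a.astype(np.float64) @ b.astype(np.float64)

# Half-precision storage halves memory traffic relative to float32...
print("fp16 bytes per matrix:", a.nbytes)  # 256*256*2 = 131072
# ...while the wider accumulator keeps the result close to the reference.
print("max accumulation error:", np.max(np.abs(c_mixed - c_ref)))
```

The memory saving is what makes lower precision attractive for bandwidth-bound AI workloads; the wider accumulator is what keeps it numerically viable for training and inference.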