Description
In recent years, advances in machine learning have taken speech recognition performance to a level where speech as a human-machine interface is becoming a reality. While current architectures and systems rely on the cloud to meet their large memory and compute requirements, privacy concerns are pushing speech recognition from the cloud to the edge. In this session, we present an overview of speech recognition implementations based on the earlier Hidden Markov Model (HMM) approaches and compare and contrast them with Deep Neural Network (DNN) based implementations. We discuss feed-forward neural networks as well as Recurrent Neural Networks (RNNs), and show how the networks used for speech recognition differ from the DNNs used in vision applications. An SoC (System on Chip) designer has to take several parameters into consideration to meet the target cost and power goals. Factors such as vocabulary size, number of microphones, and usage scenario (near-field versus far-field input) have to be traded off for end applications that range from speech-controlled thermostats and voice-controlled automobiles to smart speakers that let you order on the web.
Tags