Neuchips Unveiling Gen AI Inferencing Accelerators at CES 2024


January 3, 2024

Summary: Neuchips will showcase their Raptor Gen AI accelerator chip and Evo PCIe accelerator card LLM solutions at CES 2024, offering significant cost savings and improved performance for natural language processing applications.

  • Neuchips' Raptor chip offers significant cost savings compared to existing solutions for large language model inference.
  • The combination of Raptor and Evo provides an optimized stack for enterprises to easily access leading LLMs.
  • Evo's ultra-low power consumption and scalability make it a cost-effective and future-proof solution for AI workloads.


Neuchips, a leading provider of AI application-specific integrated circuit (ASIC) solutions, is set to showcase its Raptor Gen AI accelerator chip and Evo PCIe accelerator card LLM solutions at CES 2024. The Raptor chip, previously known as N3000, targets large language model (LLM) inference deployments and, according to the company, offers significant cost savings compared to existing solutions.

“We are excited to introduce our Raptor chip and Evo card to the industry at CES 2024,” said Ken Lau, CEO of Neuchips. “Neuchips’ solutions represent a major advancement in price to performance ratio for natural language processing. With Neuchips, organizations of all sizes can now leverage the power of LLMs for a wide range of AI applications.”

By combining Raptor and Evo, Neuchips provides an optimized stack that makes leading LLMs easily accessible for enterprises. These AI solutions not only reduce hardware costs but also minimize electricity usage, resulting in lower total cost of ownership.

At CES 2024, Neuchips will demonstrate the capabilities of Raptor and Evo by accelerating the Whisper speech-recognition model and the Llama language model in a Personal AI Assistant application. The demonstration is intended to show how LLM inferencing can serve real business needs.

Enterprises interested in experiencing Neuchips’ breakthrough performance can visit booth 62700 to enroll in a free trial program. Furthermore, technical sessions will be conducted to showcase how Raptor and Evo can significantly reduce deployment costs for speech-to-text applications.

The Raptor chip delivers up to 200 tera operations per second (TOPS) per chip. Its performance on AI inferencing operations such as matrix multiply, vector operations, and embedding table lookup makes it well suited to Gen AI and transformer-based AI models. Neuchips attributes this throughput to its patented compression and efficiency optimizations designed specifically for neural networks.
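As a rough illustration of the three operation classes named above, the sketch below implements each in plain Python. It is not Neuchips' implementation or API. The function names, the tiny table, and the choice of ReLU as the vector operation are all assumptions made for the example.

```python
# Illustrative only: plain-Python versions of the three operation classes
# mentioned in the text. All names, sizes, and values are hypothetical.

def embedding_lookup(table, token_ids):
    """Gather one row of the embedding table per token id."""
    return [table[i] for i in token_ids]

def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def vector_relu(v):
    """Element-wise vector operation (ReLU picked as an example)."""
    return [max(0.0, x) for x in v]

# A 3-entry embedding table with 2-dimensional vectors.
table = [[1.0, 0.0], [0.0, 1.0], [-1.0, 2.0]]
# A 2 x 2 projection matrix.
weights = [[2.0, -1.0], [0.5, 3.0]]

embedded = embedding_lookup(table, [2, 0])           # rows 2 and 0 of the table
projected = matmul(embedded, weights)                # dense projection
activated = [vector_relu(row) for row in projected]  # element-wise step
```

A transformer inference pass chains these same kinds of steps at far larger sizes, which is why an accelerator that handles all three well is relevant to LLM workloads.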

Complementing the Raptor chip is Neuchips’ Evo acceleration card, which boasts ultra-low power consumption. By combining an eight-lane PCIe Gen 5 interface with 32 GB of LPDDR5 memory, Evo achieves 64 GB/s of host I/O bandwidth and 1.6 Tbps of memory bandwidth at just 55 watts per card.
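The bandwidth figures quoted above can be sanity-checked with a little arithmetic. The sketch below assumes the standard PCIe Gen 5 signaling rate of 32 GT/s per lane with 128b/130b encoding, and assumes that the 64 GB/s figure counts both transfer directions. Neither assumption is stated in the announcement.

```python
# Back-of-the-envelope check of the Evo figures quoted in the text.
# Assumptions (not from the announcement): PCIe Gen 5 runs at 32 GT/s per
# lane with 128b/130b encoding, and the host I/O figure is bidirectional.

GT_PER_S_PER_LANE = 32   # PCIe Gen 5 raw signaling rate per lane
LANES = 8                # eight-lane link, as stated in the text
ENCODING = 128 / 130     # 128b/130b line-encoding efficiency

# Raw link rate, both directions combined, ignoring encoding overhead.
raw_bidirectional_gbs = 2 * GT_PER_S_PER_LANE * LANES / 8

# Usable rate once encoding overhead is subtracted.
per_direction_gbs = GT_PER_S_PER_LANE * LANES * ENCODING / 8
bidirectional_gbs = 2 * per_direction_gbs

# Memory bandwidth: convert terabits per second to gigabytes per second.
memory_tbps = 1.6
memory_gbs = memory_tbps * 1000 / 8

print(f"PCIe raw, both directions: {raw_bidirectional_gbs:.0f} GB/s")
print(f"PCIe after encoding:       {bidirectional_gbs:.1f} GB/s")
print(f"LPDDR5 memory bandwidth:   {memory_gbs:.0f} GB/s")
```

Under these assumptions the raw bidirectional link rate matches the quoted 64 GB/s, the usable rate is slightly lower once encoding overhead is counted, and 1.6 Tbps corresponds to 200 GB/s of memory bandwidth.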

Evo also offers 100% scalability, allowing customers to increase performance linearly by adding more chips. This modular design ensures investment protection for future AI workloads. Additionally, Neuchips plans to launch Viper, a half-height half-length (HHHL) form factor product, in the second half of 2024, providing even greater deployment flexibility and bringing data center-class AI acceleration in a compact design.


Event Info

About CES: CES, the Consumer Electronics Show, is an annual event held in Las Vegas, Nevada, organized by the Consumer Technology Association (CTA). With a history dating back to 1967, it has become the world's premier platform for unveiling and exploring the latest innovations in consumer electronics and technology. Drawing exhibitors ranging from industry titans to startups across diverse sectors, including automotive, health and wellness, robotics, gaming, and artificial intelligence, CES transforms Las Vegas into a global tech hub, offering a glimpse into the future of technology through a wide array of showcases, from startup-focused Eureka Park to cutting-edge automotive and health tech exhibitions.

CES Website: http://www.ces.tech/
CES LinkedIn: https://www.linkedin.com/showcase/ceslasvegas/


Technology Explained


LLM: A Large Language Model (LLM) is an artificial intelligence system, typically based on the transformer architecture (as in models such as GPT-3.5), designed to comprehend and produce human-like text. LLMs handle a wide range of natural language understanding and generation tasks, including answering questions, generating creative content, and delivering context-aware responses to textual inputs. These models are trained on vast datasets to capture the nuances of language, which makes them useful for applications like chatbots, content generation, and language translation.


LPDDR5: LPDDR5 (Low Power Double Data Rate 5) is a generation of the LPDDR memory standard, a type of dynamic random-access memory (DRAM) designed to be more power efficient than its predecessors. It is used in many modern laptops, tablets, and smartphones to provide faster performance and longer battery life. It also appears in some gaming PCs and workstations, and in some servers and data center hardware where power-efficient memory bandwidth matters.


PCIe: PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard for connecting components such as graphics cards, sound cards, and network cards to a motherboard. It is the most widely used expansion interface in the computer industry today and is found in both desktop and laptop computers. A PCIe link is built from 1 to 16 lanes, and each new generation has roughly doubled the per-lane transfer rate, so it delivers far more bandwidth than the older PCI standard. PCIe is also used in a variety of other applications, such as storage, networking, and communications.


