Meet Groq, a young company developing what sounds like Nvidia's nightmare: a chip called the LPU (Language Processing Unit), designed specifically for large language models (LLMs), which lets them run at speeds not currently seen from the chatbots on the market.
Groq was founded in 2016 by Jonathan Ross, who worked on the original team at Google that developed the company's TPU (Tensor Processing Unit) for cloud AI workloads. Alongside the hardware, the company also built software to demonstrate its chip's capabilities. The paper that introduced the chip, co-authored by more than 20 writers including Ross, puts heavy emphasis on the software side. Before turning to hardware development, Groq's engineers built a compiler that takes machine-learning models and converts them into a form that can run on simpler processors, such as the company's own chip, which differs from both the GPUs we know and the CPUs inside our computers. This approach, the company claims, lets models run faster than they do on the hardware available on the market today.
On the hardware side, Groq's LPU is built differently from the hugely popular graphics processors at the heart of the server farms running today's chatbots, such as Nvidia's (which rode this demand to another record-breaking quarter). Unlike graphics processors, which rely on high-bandwidth memory (HBM), or the processors in our personal computers, which use DRAM (Dynamic Random Access Memory), Groq uses SRAM (Static Random Access Memory). The first two memory types must be refreshed constantly to retain their contents, an operation that adds latency and makes access times variable, while SRAM holds data statically for as long as power is supplied. Because SRAM access is both faster and fully predictable, Groq can schedule the steps for locating data in advance and reach that data at much higher speed than its competitors.
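To see why memory speed matters so much here, consider a rough back-of-envelope calculation: during autoregressive decoding, a model must stream essentially all of its weights from memory for every token it generates, so memory bandwidth caps token throughput. The sketch below illustrates this; the bandwidth and model-size figures are illustrative assumptions, not vendor specifications.

```python
# Back-of-envelope: why memory speed bounds LLM inference.
# Autoregressive decoding streams (roughly) every model weight
# from memory once per generated token, so peak tokens/sec is
# approximately memory_bandwidth / model_size_in_bytes.
# All figures below are illustrative assumptions, not vendor specs.

def peak_tokens_per_second(bandwidth_gb_s: float,
                           params_billion: float,
                           bytes_per_param: float = 2.0) -> float:
    """Upper bound on decode throughput for a memory-bound model."""
    model_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# A hypothetical 70B-parameter model stored in 16-bit weights:
for name, bw in [("HBM-class GPU (~3,300 GB/s)", 3_300),
                 ("on-chip SRAM (~80,000 GB/s)", 80_000)]:
    print(f"{name}: ~{peak_tokens_per_second(bw, 70):.0f} tokens/s ceiling")
```

Under these assumed numbers, the same model's throughput ceiling jumps from roughly 24 tokens per second to several hundred, which is the kind of gap Groq's demos appear to exploit.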
What makes this product especially well suited to large language models is that they operate the same way, as predetermined sequences of computation that map directly onto Groq's architecture, on both the software side and the hardware side. And the results speak for themselves. In an impressive demo, shown in the video below this paragraph, CEO Jonathan Ross lets CNN presenter Becky Anderson talk to an LLM-based voice chatbot, and the conversation between the two is so fast and fluid that Anderson stops at one point and wonders whether the answers were entered into the model in advance. In other words, the familiar delay between entering a prompt and receiving the output disappears. Note, too, that the output passes through a text-to-speech (TTS) stage, which renders the answer in a voice-assistant voice.
Groq's chips handle inference, one of the most important workloads for large language models (LLMs): not training the model, but running an already trained model to draw conclusions and predict missing data from the input it is given. The longer and more complex the prompt, the more inference performance is needed to keep the model from taking forever to produce a result. Nvidia, too, has begun to emphasize the inference performance of its chips.
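For a concrete sense of what "inference performance" means in practice, here is a minimal sketch that times a single chat completion and computes tokens per second. It assumes an OpenAI-compatible chat-completions endpoint of the kind Groq exposes; the base URL and model name are assumptions for illustration, so check the provider's documentation for current values.

```python
# A minimal sketch of measuring LLM inference speed, assuming an
# OpenAI-compatible chat endpoint (Groq exposes one). The endpoint
# URL and model id below are assumptions, not guaranteed values.
import time
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # assumed Groq endpoint
    api_key="YOUR_API_KEY",
)

start = time.perf_counter()
response = client.chat.completions.create(
    model="llama3-70b-8192",  # assumed model id
    messages=[{"role": "user",
               "content": "Explain LLM inference in one paragraph."}],
)
elapsed = time.perf_counter() - start

# Tokens generated divided by wall-clock time gives decode throughput.
completion_tokens = response.usage.completion_tokens
print(f"{completion_tokens} tokens in {elapsed:.2f}s "
      f"(~{completion_tokens / elapsed:.0f} tokens/s)")
```

The same measurement run against different providers is what benchmarks of Groq's demos rely on: the model and prompt stay fixed, and only the serving hardware changes.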