In collaboration with Tsinghua University, DeepSeek is developing new methods intended to make large language models (LLMs) deliver better results more efficiently. According to the researchers, DeepSeek has developed a technique that combines generative reward modelling (GRM) with self-principled critique tuning.

The company announced that the new DeepSeek-GRM models outperformed existing methods and “achieved competitive performance” with strong public reward models. Reward modelling is the process of scoring an LLM's outputs against human preferences, producing a signal that steers the model toward responses people actually want. The company also intends to open-source its GRM models.
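To make the idea concrete, here is a minimal sketch of what distinguishes a generative reward model from a conventional scalar one: instead of emitting a bare number, the model writes out judging principles and a critique, and the score is parsed from that text. The `generate` stub, prompt wording, and score format below are illustrative assumptions, not DeepSeek's actual method or API.

```python
import re

# Hypothetical stand-in for a call to any instruction-tuned LLM.
# A real system would query a model here; we return a canned critique
# so the sketch runs end to end.
def generate(prompt: str) -> str:
    return ("Principle: answers should be factually correct and concise.\n"
            "Critique: the response is accurate but slightly verbose.\n"
            "Score: 7/10")

def generative_reward(question: str, answer: str) -> float:
    """Score an answer by asking the model to state principles and
    write a critique first, then parsing a numeric score from that text."""
    prompt = (f"Question: {question}\nAnswer: {answer}\n"
              "State principles for judging this answer, critique it, "
              "then give a score on a final line formatted 'Score: N/10'.")
    critique = generate(prompt)
    match = re.search(r"Score:\s*(\d+(?:\.\d+)?)/10", critique)
    return float(match.group(1)) / 10 if match else 0.0

if __name__ == "__main__":
    score = generative_reward("What is 2 + 2?", "2 + 2 equals 4.")
    print(f"reward = {score:.2f}")  # -> reward = 0.70
```

Because the critique is ordinary text, more of it can be generated and aggregated at inference time, which is the kind of property that makes generative reward models attractive compared with a fixed scalar head.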

Reuters has reported that DeepSeek could release its R2 model this month, though the China-based start-up has not confirmed this.

Founded in 2023 by Liang Wenfeng, DeepSeek rocked the AI community with its cost-efficient R1 model. Last month, the start-up released an upgraded V3 model, DeepSeek-V3-0324, which it says offers “enhanced reasoning capabilities, optimised front-end web development and upgraded Chinese writing proficiency”.

In February, DeepSeek open-sourced five of its code repositories, allowing developers and reviewers to contribute to its software development. The start-up says it envisions “sincere progress with full transparency”.

With AI technology evolving rapidly and cut-throat competition among AI companies such as OpenAI and Anthropic, time will tell who captures the larger share of the market. Success will depend on speed, accuracy, diversity, and access to real-time information. Most importantly, the performance and price of the models will determine global trust in, and acceptance of, AI chatbots.