The Creators of ‘Sam Altman Leads the Committee That Oversees Sam Altman’s Actions’ Have Unveiled That GPT-4 Is Overseeing GPT-4

“GPT-4 for catching GPT-4’s mistakes.” This is how OpenAI president Greg Brockman presented the company’s latest proposal for improving its flagship model at programming. The new approach relies on CriticGPT, a model based on GPT-4 and specifically designed to detect errors in ChatGPT’s code output.

The Microsoft-backed company claims that CriticGPT is very effective at helping people detect errors in ChatGPT's responses. According to internal tests, people who reviewed ChatGPT code with CriticGPT's help outperformed those who worked alone 60% of the time. The company says the model is now ready to move to the next stage.

A New Tool for Reinforcement Learning

When training models like GPT-4, developers use reinforcement learning from human feedback (RLHF). Broadly speaking, this machine learning technique uses feedback from human reviewers, known as AI trainers, to improve the model's accuracy on specific tasks.
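To make the idea of learning from human feedback more concrete, here is a minimal sketch of one common formulation: a trainer picks the better of two responses, and a reward model is penalized when it scores the rejected response higher. The article does not say which loss OpenAI uses, so the pairwise logistic loss and the name `preference_loss` are assumptions chosen for illustration.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise logistic loss, -log(sigmoid(chosen - rejected)).

    Illustrative only: small when the response the trainer preferred
    scores higher than the rejected one, large when the order is reversed.
    """
    margin = reward_chosen - reward_rejected
    # log(1 + exp(-margin)), split by sign so exp() never overflows
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical reward scores for two responses to the same prompt.
agrees_with_trainer = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_trainer = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
```

In this sketch, `disagrees_with_trainer` is larger than `agrees_with_trainer`, which is the signal that would push a reward model toward the trainers' judgments.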

OpenAI will deploy models similar to CriticGPT to assist trainers in identifying subtle errors often produced by GPT-4 through ChatGPT. “This is a step towards being able to evaluate outputs from advanced AI systems that can be difficult for people to rate without better tools,” the company states in a recent blog post.

How does CriticGPT work? The model writes “critiques” of ChatGPT answers, pointing out what it believes are errors. While these critiques may not always be accurate, they can help human trainers identify issues that would otherwise go unnoticed. As such, OpenAI describes them as “helpful” to the RLHF process.
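The workflow described above, in which a critic proposes issues and a human trainer decides which ones are real, can be sketched roughly as follows. This is not OpenAI's implementation: `Critique`, `Review`, `review_with_critic`, and the `trainer_accepts` callback are hypothetical names, and the sample critiques are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Critique:
    quote: str    # snippet of the answer the critic is flagging
    comment: str  # the critic's explanation of the suspected problem


@dataclass
class Review:
    answer: str
    confirmed: List[Critique] = field(default_factory=list)
    rejected: List[Critique] = field(default_factory=list)


def review_with_critic(
    answer: str,
    critiques: List[Critique],
    trainer_accepts: Callable[[Critique], bool],
) -> Review:
    """Let a human trainer confirm or reject each critique (hypothetical flow)."""
    review = Review(answer=answer)
    for critique in critiques:
        # A critique that quotes text missing from the answer cannot be
        # checked against it, so it is rejected without asking the trainer.
        if critique.quote not in answer:
            review.rejected.append(critique)
        elif trainer_accepts(critique):
            review.confirmed.append(critique)
        else:
            review.rejected.append(critique)
    return review


# Made-up example: one plausible critique and one that quotes nonexistent code.
answer = "def mean(xs):\n    return sum(xs) / len(xs)"
critiques = [
    Critique("sum(xs) / len(xs)", "Raises ZeroDivisionError when xs is empty."),
    Critique("import numpy", "Unused import."),
]
review = review_with_critic(answer, critiques, trainer_accepts=lambda c: True)
```

The human stays the final judge in this sketch: the critic only narrows down where to look, which matches the article's point that critiques are not always accurate.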

CriticGPT, based on GPT-4, was itself trained with reinforcement learning from human feedback. Interestingly, test results indicate that it could help ChatGPT, which is also based on GPT-4, improve at programming tasks. It’s important to note that some studies have found a significant percentage of incorrect responses from the model in this area.

ChatGPT has a lot of room for improvement in the programming field.

The company is working to ensure the chatbot provides better responses.
