The Creators of ‘Sam Altman Leads the Committee That Oversees Sam Altman’s Actions’ Have Unveiled That GPT-4 Is Overseeing GPT-4

  • ChatGPT has a lot of room for improvement in the programming field.

  • The company is working on ensuring the chatbot provides better responses.

“GPT-4 for catching GPT-4’s mistakes.” This is how OpenAI president Greg Brockman presented the company’s latest proposal for improving its flagship model at programming. The new approach relies on CriticGPT, a model based on GPT-4 and specifically designed to detect errors in ChatGPT’s code output.

The Microsoft-backed company claims that CriticGPT has proven very effective at helping people detect errors in ChatGPT's responses. According to internal tests, people who reviewed ChatGPT code with help from CriticGPT outperformed those who worked alone 60% of the time. The company says the model is now ready to move to the next stage.
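A figure of this kind is a win rate over head-to-head comparisons between assisted and unassisted reviewers, not a 60% jump in some score. The following is a minimal Python sketch of how such a rate is computed. The `outcomes` list and the `win_rate` helper are made up for illustration and are not OpenAI's data or code.

```python
def win_rate(outcomes):
    """Fraction of comparisons won by the assisted reviewer.

    `outcomes` is a list of booleans: True when the CriticGPT-assisted
    review was judged better than the unassisted one.
    """
    if not outcomes:
        raise ValueError("need at least one comparison")
    return sum(outcomes) / len(outcomes)


# Hypothetical data: 10 head-to-head comparisons, 6 won with assistance.
outcomes = [True, True, False, True, False, True, True, False, True, False]
print(f"{win_rate(outcomes):.0%}")  # 60%
```

A 60% win rate still means the unassisted reviewer did at least as well in the remaining 40% of comparisons.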

A New Tool for Reinforcement Learning

When training models like GPT-4, developers use reinforcement learning from human feedback (RLHF). Broadly speaking, this machine learning technique uses feedback from human reviewers, known as AI trainers, to improve the model's accuracy on specific tasks.
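In a common form of RLHF, trainers compare two responses and a reward model learns to score the preferred one higher. The Python sketch below illustrates the pairwise loss typically used for that step. It is a generic illustration rather than OpenAI's implementation, and the function name and example scores are assumptions.

```python
import math


def preference_loss(score_preferred, score_rejected):
    """Pairwise (Bradley-Terry style) loss: -log(sigmoid(difference)).

    The loss is small when the reward model scores the response the
    trainer preferred above the one the trainer rejected.
    """
    diff = score_preferred - score_rejected
    # -log(sigmoid(d)) is mathematically equal to log(1 + exp(-d)).
    return math.log1p(math.exp(-diff))


# Hypothetical reward-model scores for two responses to the same prompt.
agrees_with_trainer = preference_loss(2.0, 0.5)     # preferred response scored higher
disagrees_with_trainer = preference_loss(0.5, 2.0)  # preferred response scored lower
print(agrees_with_trainer < disagrees_with_trainer)  # True
```

Minimizing this loss over many trainer comparisons pushes the reward model toward the trainers' judgments, which is why errors the trainers miss can carry over into the model.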

OpenAI will deploy models similar to CriticGPT to assist trainers in identifying subtle errors often produced by GPT-4 through ChatGPT. “This is a step towards being able to evaluate outputs from advanced AI systems that can be difficult for people to rate without better tools,” the company states in a recent blog post.

How does CriticGPT work? As you can see in the image below, the model provides “critiques” on ChatGPT answers. While these critiques may not always be accurate, they can help human trainers identify previously unnoticed issues. As such, OpenAI describes them as “helpful” to the RLHF process.

ChatGPT

CriticGPT, itself based on GPT-4, was also trained with reinforcement learning from human feedback. Test results suggest that it could help ChatGPT, which is also based on GPT-4, improve at programming tasks. That matters because some studies have found that a significant percentage of the model's responses in this area are incorrect.

The company also aims to improve the safety of its models following the dissolution of its Superalignment team. To that end, it has established a committee led by Altman. One of the committee’s tasks is to present recommendations to the board of directors, chaired by Bret Taylor, of a company whose CEO is Altman himself.

Image | OpenAI | Milad Fakurian | Village Global
