
ChatGPT is an AI-driven natural language processing tool that can handle tasks such as chatting, text translation, copywriting, poetry composition, code editing, and problem solving, and it ranks among the fastest-growing consumer applications in history by user scale.
OpenAI is an AI research and deployment company whose mission is to ensure that artificial general intelligence (AGI) – highly autonomous systems that outperform humans at most economically valuable work – benefits all of humanity. OpenAI will attempt to directly build safe and beneficial AGI, but will also consider its mission accomplished if its work helps others achieve that outcome. OpenAI has trained a model called ChatGPT, which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response.
ChatGPT sometimes writes answers that sound plausible but are incorrect or nonsensical. Fixing this issue is challenging because: (1) during RL training, there is currently no source of truth; (2) training the model to be more cautious causes it to decline questions that it could answer correctly; and (3) supervised training misleads the model, because the ideal answer depends on what the model knows rather than what the human demonstrator knows.
ChatGPT is sensitive to tweaks in the wording of the input and to repeated attempts at the same prompt. For example, given one phrasing of a question, the model may claim not to know the answer, but given a slight rephrasing, it can answer correctly. Ideally, when the user provides an ambiguous query, the model would ask clarifying questions. Instead, the current model typically guesses the user's intent.
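This phrasing sensitivity can be probed systematically by sending several paraphrases of the same question and comparing the answers. The sketch below illustrates the idea; the `ask` function is a hypothetical stand-in for a real chat-model call, here faked as a model that only recognizes one exact phrasing.

```python
# Minimal sketch of probing a model's prompt sensitivity. `ask` is a
# hypothetical placeholder for a real model API call, not an actual API.

def ask(prompt: str) -> str:
    # Fake model that only "knows" one exact phrasing of the question,
    # mimicking the behavior described above.
    known = {"What year did Apollo 11 land on the Moon?": "1969"}
    return known.get(prompt, "I don't know.")

def probe(paraphrases):
    """Send each paraphrase and collect the answers for comparison."""
    return {p: ask(p) for p in paraphrases}

answers = probe([
    "What year did Apollo 11 land on the Moon?",
    "In which year did Apollo 11 touch down on the lunar surface?",
])
for prompt, answer in answers.items():
    print(f"{prompt!r} -> {answer!r}")
```

If the answers disagree across paraphrases, the model's knowledge is phrasing-dependent rather than robust, which is exactly the failure mode described above.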
Today's ChatGPT research release is the latest step in OpenAI's iterative deployment of increasingly safe and useful AI systems. Many lessons from the deployment of earlier models informed the safety mitigations for this release, including a substantial reduction in harmful and untruthful outputs achieved through reinforcement learning from human feedback (RLHF).
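A core ingredient of RLHF is a reward model trained on human preferences: given two candidate responses, it should score the human-preferred one higher. A common training objective is the pairwise loss -log sigmoid(r_chosen - r_rejected). The sketch below shows that loss on plain float scores; a real system would produce the scores with a neural network, which is omitted here.

```python
# Minimal sketch of the pairwise reward-model loss used in RLHF.
# Scores are plain floats here; in practice they come from a learned model.
import math

def pairwise_loss(r_chosen: float, r_rejected: float) -> float:
    """-log sigmoid(r_chosen - r_rejected): small when the preferred
    response outscores the rejected one, large when it does not."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# The loss shrinks as the reward gap favours the human-preferred response.
print(pairwise_loss(2.0, 0.0))  # small: reward model agrees with the label
print(pairwise_loss(0.0, 2.0))  # large: reward model disagrees
```

Minimizing this loss pushes the reward model toward human judgments; the chat model is then fine-tuned with reinforcement learning to maximize that learned reward, which is how harmful and untruthful outputs are reduced.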