GPT-3
Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model that uses deep learning to produce human-like text. It is the third-generation language prediction model in the GPT-n series created by OpenAI, a for-profit San Francisco-based artificial intelligence research laboratory.[2] GPT-3's full version has a capacity of 175 billion machine learning parameters, over two orders of magnitude more than its predecessor, GPT-2.[1]:14 GPT-3, which was introduced in May 2020 and was in beta testing as of July 2020,[3] is part of a trend in natural language processing (NLP) systems toward pre-trained language representations.[1] Prior to the release of GPT-3, the largest language model was Microsoft's Turing NLG, introduced in February 2020, with a capacity of 17 billion parameters, less than 10 percent of GPT-3's.[4]
Original author(s) | OpenAI[1] |
---|---|
Initial release | June 11, 2020 (beta) |
Website | openai |
The quality of the text generated by GPT-3 is so high that it is difficult to distinguish from that written by a human, which has both benefits and risks.[4] Thirty-one OpenAI researchers and engineers presented the original May 28, 2020 paper introducing GPT-3. In their paper, they warned of GPT-3's potential dangers and called for research to mitigate risk.[1]:34 David Chalmers, an Australian philosopher, described GPT-3 as "one of the most interesting and important AI systems ever produced."[5]
Background
According to The Economist, improved algorithms, powerful computers, and an increase in digitized data have fueled a revolution in machine learning, with new techniques in the 2010s resulting in "rapid improvements in tasks" including manipulating language.[6] Software models are trained to learn by using thousands or millions of examples in a "structure ... loosely based on the neural architecture of the brain".[6] One architecture used in natural language processing (NLP) is the Transformer, a deep learning neural network first introduced in 2017.[7] GPT-n models are based on this Transformer architecture. There are a number of NLP systems capable of processing, mining, organizing, connecting, and contrasting text, as well as understanding and answering questions.[8]
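The central operation of the Transformer cited above is scaled dot-product attention. The sketch below is a minimal illustration of that single operation in Python with NumPy, not anything taken from GPT-3 itself; the sequence length, head size, and random inputs are toy assumptions for demonstration only.

```python
# Minimal sketch of scaled dot-product attention, the core Transformer operation:
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
# Shapes and random inputs below are toy assumptions, not GPT-3 values.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over key positions
    return weights @ V                                 # weighted sum of values

rng = np.random.default_rng(0)
seq_len, d_k = 4, 8                                    # toy sequence length and head size
Q, K, V = (rng.standard_normal((seq_len, d_k)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)     # -> (4, 8)
```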
On June 11, 2018, OpenAI researchers and engineers posted their original paper on generative language models, artificial intelligence systems that could be pre-trained on an enormous and diverse corpus of text, in a process they called generative pre-training (GP).[9] The authors described how language understanding performance in natural language processing (NLP) was improved in GPT-n through a process of "generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task." This eliminated the need for human supervision and for time-intensive hand-labeling.[9]
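The quoted recipe, generative pre-training on unlabeled text followed by discriminative fine-tuning on a labeled task, can be illustrated with a deliberately tiny toy. The Python/PyTorch sketch below is a hypothetical stand-in only: the averaged-embedding "encoder", vocabulary, sentences, and hyperparameters are all invented for illustration and bear no relation to the actual GPT models or data.

```python
# Toy illustration of the two-phase recipe (NOT the GPT architecture or data):
# Phase 1 pre-trains a shared encoder generatively on unlabeled text (next-word
# prediction); Phase 2 fine-tunes the same encoder discriminatively on a labeled task.
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab = {w: i for i, w in enumerate("the movie food was very great dull".split())}

def encode(text):
    return torch.tensor([vocab[w] for w in text.split()])

embed = nn.Embedding(len(vocab), 16)   # shared "encoder" (an embedding stands in for a Transformer)
lm_head = nn.Linear(16, len(vocab))    # generative head: predicts the next word
clf_head = nn.Linear(16, 2)            # discriminative head: predicts a task label

# Phase 1: generative pre-training on unlabeled text.
unlabeled = ["the movie was very dull", "the food was very great", "the movie was great"]
opt = torch.optim.Adam(list(embed.parameters()) + list(lm_head.parameters()), lr=0.1)
for _ in range(100):
    for sent in unlabeled:
        ids = encode(sent)
        # Mean embedding of each prefix serves as the (toy) context representation.
        context = embed(ids[:-1]).cumsum(0) / torch.arange(1, len(ids)).unsqueeze(1)
        loss = nn.functional.cross_entropy(lm_head(context), ids[1:])
        opt.zero_grad(); loss.backward(); opt.step()

# Phase 2: discriminative fine-tuning on a small labeled task, reusing the pre-trained encoder.
labeled = [("the movie was great", 1), ("the food was dull", 0)]
opt = torch.optim.Adam(list(embed.parameters()) + list(clf_head.parameters()), lr=0.1)
for _ in range(100):
    for text, label in labeled:
        feature = embed(encode(text)).mean(0)
        loss = nn.functional.cross_entropy(clf_head(feature).unsqueeze(0), torch.tensor([label]))
        opt.zero_grad(); loss.backward(); opt.step()

# Classify a sentence not seen during fine-tuning (output is illustrative only).
print(clf_head(embed(encode("the movie was dull")).mean(0)).argmax().item())
```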
In February 2020, Microsoft introduced its Turing Natural Language Generation (T-NLG), which was then the "largest language model ever published at 17 billion parameters."[10] It performed better than any other language model at a variety of tasks, including summarizing texts and answering questions.[10]
Capabilities
A May 28, 2020 arXiv preprint by a group of 31 engineers and researchers at OpenAI described the development of GPT-3, a third-generation "state-of-the-art language model".[1][4] The team increased the capacity of GPT-3 by over two orders of magnitude from that of its predecessor, GPT-2, making GPT-3 the largest non-sparse language model to date.[1]:14[2] GPT-3's higher number of parameters grants it a higher level of accuracy relative to previous versions with smaller capacity.[11] GPT-3's capacity is ten times larger than that of Microsoft's Turing NLG.[4]
Sixty percent of the weighted pre-training dataset for GPT-3 comes from a filtered version of Common Crawl consisting of 410 billion byte-pair-encoded tokens.[1]:9 Other sources are 19 billion tokens from WebText2 representing 22% of the weighted total, 12 billion tokens from Books1 representing 8%, 55 billion tokens from Books2 representing 8%, and 3 billion tokens from Wikipedia representing 3%.[1]:9 GPT-3 was trained on hundreds of billions of words and is capable of coding in CSS, JSX, and Python, among other languages.[3] Since GPT-3's training data was all-encompassing, it does not require further training for distinct language tasks.[3]
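The figures quoted above can be laid out as data to show that the sampling weights are not proportional to raw dataset size; for example, the filtered Common Crawl portion supplies roughly 82% of the raw tokens but only 60% of the training mix. The short Python snippet below only re-presents the numbers reported in the paper; the variable names are arbitrary.

```python
# The training-mix figures reported above, arranged as data. Token counts are in
# billions of byte-pair-encoded tokens; "weight" is the share of the weighted training mix.
datasets = {
    "Common Crawl (filtered)": (410, 0.60),
    "WebText2":                (19, 0.22),
    "Books1":                  (12, 0.08),
    "Books2":                  (55, 0.08),
    "Wikipedia":               (3, 0.03),
}
total_tokens = sum(tokens for tokens, _ in datasets.values())
for name, (tokens, weight) in datasets.items():
    raw_share = tokens / total_tokens
    print(f"{name:<24} {tokens:>4}B tokens  {raw_share:6.1%} of raw tokens  {weight:4.0%} of training mix")
```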
On June 11, 2020, OpenAI announced that users could request access to its user-friendly GPT-3 API—a "machine learning toolset"—to help OpenAI "explore the strengths and limits" of this new technology.[12][13] The invitation described how this API had a general-purpose "text in, text out" interface that can complete almost "any English language task", instead of the usual single use-case.[12] According to one user, who had access to a private early release of the OpenAI GPT-3 API, GPT-3 was "eerily good" at writing "amazingly coherent text" with only a few simple prompts.[14]
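As an illustration of the "text in, text out" interface described in the invitation, the sketch below uses the openai Python client from the 2020 beta period; the engine name, prompt, and sampling parameters are illustrative assumptions, and an approved API key from the access-request process is required.

```python
# Hedged sketch of the "text in, text out" interface, using the openai Python client
# from the 2020 beta period. The engine name, prompt, and sampling parameters are
# illustrative assumptions; access required an approved API key.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; keys were granted via the access-request process

response = openai.Completion.create(
    engine="davinci",            # assumed engine name; text in ...
    prompt="Write a short, upbeat sentence about summer weather:",
    max_tokens=64,
    temperature=0.7,
)
print(response["choices"][0]["text"])  # ... text out
```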
Because GPT-3 can "generate news articles which human evaluators have difficulty distinguishing from articles written by humans,"[4] GPT-3 has the "potential to advance both the beneficial and harmful applications of language models."[1]:34 In their May 28, 2020 paper, the researchers described in detail the potential "harmful effects of GPT-3"[4] which include "misinformation, spam, phishing, abuse of legal and governmental processes, fraudulent academic essay writing and social engineering pretexting".[1] The authors draw attention to these dangers to call for research on risk mitigation.[1]:34
Reviews
In his July 29, 2020 review in The New York Times, Farhad Manjoo said that GPT-3—which can generate computer code and poetry, as well as prose—is not just "amazing", "spooky", and "humbling", but also "more than a little terrifying".[15]
Daily Nous presented a series of articles by nine philosophers on GPT-3.[16] Australian philosopher David Chalmers described GPT-3 as "one of the most interesting and important AI systems ever produced".[5]
A review in Wired said that GPT-3 was "provoking chills across Silicon Valley".[17]
An article in Towards Data Science stated that GPT-3 was trained on hundreds of billions of words and is capable of coding in CSS, JSX, Python, and other languages.[3]
The National Law Review said that GPT-3 is an "impressive step in the larger process", with OpenAI and others finding "useful applications for all of this power" while continuing to "work toward a more general intelligence".[18]
References
- Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda; Agarwal, Sandhini; Herbert-Voss, Ariel; Krueger, Gretchen; Henighan, Tom; Child, Rewon; Ramesh, Aditya; Ziegler, Daniel M.; Wu, Jeffrey; Winter, Clemens; Hesse, Christopher; Chen, Mark; Sigler, Eric; Litwin, Mateusz; Gray, Scott; Chess, Benjamin; Clark, Jack; Berner, Christopher; McCandlish, Sam; Radford, Alec; Sutskever, Ilya; Amodei, Dario (July 22, 2020). "Language Models are Few-Shot Learners". arXiv:2005.14165.
- Shead, Sam (July 23, 2020). "Why everyone is talking about the A.I. text generator released by an Elon Musk-backed lab". CNBC. Retrieved July 31, 2020. Four preprints were released between May 28 and July 22, 2020.
- Bussler, Frederik (July 21, 2020). "Will GPT-3 Kill Coding?". Towards Data Science. Retrieved August 1, 2020.
- Sagar, Ram (June 3, 2020). "OpenAI Releases GPT-3, The Largest Model So Far". Analytics India Magazine. Retrieved July 31, 2020.
- Chalmers, David (July 30, 2020). Weinberg, Justin (ed.). "GPT-3 and General Intelligence". Daily Nous. Philosophers On GPT-3 (updated with replies by GPT-3). Retrieved August 4, 2020.
- "An understanding of AI's limitations is starting to sink in". The Economist. June 11, 2020. ISSN 0013-0613. Retrieved July 31, 2020.
- Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Lukasz; Polosukhin, Illia (June 12, 2017). "Attention Is All You Need". arXiv:1706.03762 [cs.CL].
- "Natural Language Processing". Retrieved July 31, 2020.
- Radford, Alec; Narasimhan, Karthik; Salimans, Tim; Sutskever, Ilya (June 11, 2018). "Improving Language Understanding by Generative Pre-Training" (PDF). p. 12. Retrieved July 31, 2020.
- Sterling, Bruce (February 13, 2020). "Web Semantics: Microsoft Project Turing introduces Turing Natural Language Generation (T-NLG)". Wired. ISSN 1059-1028. Retrieved July 31, 2020.
- Ray, Tiernan (June 1, 2020). "OpenAI's gigantic GPT-3 hints at the limits of language models for AI". ZDNet. Retrieved July 31, 2020.
- "OpenAI API". OpenAI. June 11, 2020.
- "TechCrunch – Startup and Technology News". TechCrunch. June 11, 2020. Retrieved July 31, 2020.
If you’ve ever wanted to try out OpenAI’s vaunted machine learning toolset, it just got a lot easier. The company has released an API that lets developers call its AI tools in on “virtually any English language task.”
- Arram (July 9, 2020). "GPT-3: An AI that's eerily good at writing almost anything". Arram Sabeti. Retrieved July 31, 2020.
- Manjoo, Farhad (July 29, 2020). "How Do You Know a Human Wrote This?". The New York Times. ISSN 0362-4331. Retrieved August 4, 2020.
- Weinberg, Justin, ed. (July 30, 2020). "Philosophers On GPT-3 (updated with replies by GPT-3)". Daily Nous. Retrieved July 31, 2020.
- Simonite, Tom (July 22, 2020). "Did a Person Write This Headline, or a Machine?". Wired. ISSN 1059-1028. Retrieved July 31, 2020.
- Claypoole, Theodore (July 30, 2020). "New AI Tool GPT-3 Ascends to New Peaks, But Proves How Far We Still Need to Travel". The National Law Review. Retrieved August 4, 2020.
External links
- Video: OpenAI GPT-3 - Good At Almost Everything! (Two Minute Papers)
- Video: GPT3: An Even Bigger Language Model (Computerphile)
- Video: GPT-3 vs Human Brain (Lex Fridman)