Language Models are Few-Shot Learners

NeurIPS 2020 Open Access

Tom Brown, Benjamin Mann, Nick Ryder, et al.

doi: 10.48550/arxiv.2005.14165

Citations: 21,483

Highly Influential citations: 4,812

Published: 2020

Venue: NeurIPS 2020

TL;DR: Shows that GPT-3, a 175B-parameter language model, can perform few-shot learning across a wide range of NLP tasks without gradient updates or fine-tuning.

Abstract

GPT-3 demonstrates that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.

Cite this paper

BibTeX

@inproceedings{brown2020language,
  title = {Language Models are Few-Shot Learners},
  author = {Tom Brown and Benjamin Mann and Nick Ryder and others},
  year = {2020},
  booktitle = {NeurIPS 2020},
  doi = {10.48550/arxiv.2005.14165}
}