OpenAI’s GPT-3 algorithm now produces billions of words per day

When OpenAI released its huge natural language algorithm GPT-3 last year, jaws dropped. Coders and developers with special access to an early API quickly discovered new (and unexpected) things GPT-3 could do with nothing but a prompt. It wrote passable poetry, produced decent code, computed simple sums, and, with a few edits, wrote news articles.

All this, it seems, was just the beginning. In a recent blog post update, OpenAI said tens of thousands of developers are now building applications on the GPT-3 platform.

More than 300 applications (and counting) use GPT-3, and the algorithm generates 4.5 billion words per day for them.

That is, of course, a lot of words. But to put the quantity in context, let's try a little back-of-the-napkin math.

The coming stream of algorithmic content

Every month, users publish about 70 million posts on WordPress, which is, by some measures, the dominant content management system online.

Suppose an average post is 800 words long — a guess on my part, though neither especially long nor short — then people are churning out about 56 billion words a month, or 1.8 billion words a day, on WordPress.

If our assumed average word count is in the ballpark, GPT-3 is producing more than twice the daily word count of WordPress posts. Even if you raise the average to 2,000 words per post (which seems high to me), the two are roughly equal.
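The back-of-the-napkin estimate above can be reproduced in a few lines. Note the 800-word average is the article's own guess, and the 30-day month is a simplifying assumption:

```python
# Reproduce the article's back-of-the-napkin comparison.
wp_posts_per_month = 70_000_000   # WordPress posts published per month
avg_words_per_post = 800          # assumed average post length (a guess)
gpt3_words_per_day = 4.5e9        # figure reported by OpenAI

wp_words_per_month = wp_posts_per_month * avg_words_per_post  # 56 billion
wp_words_per_day = wp_words_per_month / 30                    # ~1.87 billion

ratio = gpt3_words_per_day / wp_words_per_day
print(f"WordPress: ~{wp_words_per_day / 1e9:.1f} billion words/day")
print(f"GPT-3 produces ~{ratio:.1f}x the daily WordPress word count")
```

At 2,000 words per post the same calculation gives roughly 4.7 billion WordPress words a day, which is why the two come out roughly equal under that assumption.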

Not every word GPT-3 generates is a word someone reads, and its output isn't necessarily blog posts (more on applications below). Either way, just nine months in, GPT-3's output hints at a looming torrent of algorithmic content.

GPT-3 offers a variety of applications

So, how exactly are all those words being used? Just as the initial burst of activity suggested, developers are building a range of applications around GPT-3.

Viable, for example, surfaces themes in customer feedback — such as surveys, reviews, and help-desk tickets — and provides short summaries for businesses looking to improve their services. Fable Studio brings virtual characters to life in interactive stories with GPT-3-generated dialogue. And Algolia uses GPT-3 to power an advanced search tool.

Instead of writing code, developers use "prompt programming," giving GPT-3 a few examples of the kind of output they hope to produce. More advanced users can fine-tune the algorithm by providing data sets of examples, or even human feedback.
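To make "prompt programming" concrete, here is a minimal sketch of a few-shot prompt. The sentiment-labeling task, example reviews, and `build_prompt` helper are all illustrative inventions, not from OpenAI's documentation; the idea is simply that labeled examples, rather than code, tell the model what to do:

```python
# Few-shot "prompt programming": the examples demonstrate the task;
# no code describes how to perform it. All examples are made up.
examples = [
    ("I loved this film!", "positive"),
    ("Total waste of two hours.", "negative"),
]

def build_prompt(pairs, query):
    """Assemble labeled examples followed by the new, unlabeled input."""
    lines = [f"Review: {text}\nSentiment: {label}\n" for text, label in pairs]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

prompt = build_prompt(examples, "An instant classic.")
print(prompt)
# This prompt string would then be sent to the model (e.g., via OpenAI's
# API), which continues the pattern by completing the final label.
```

The model's job is just to continue the text, so the quality of the examples largely determines the quality of the output — which is why curating a handful of good demonstrations is the core skill here.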

In this way, GPT-3 (and similar algorithms) may accelerate the adoption of machine learning in natural language processing (NLP). Whereas the learning curve for working with machine learning algorithms was previously steep, OpenAI says many in the GPT-3 developer community have no background in AI or programming.

"It's almost this new interface for working with computers," Greg Brockman, OpenAI's chief technology officer and co-founder, told Nature in an article earlier this month.

A walled garden for AI

OpenAI licensed GPT-3 to Microsoft — which invested $1 billion in OpenAI in exchange for such partnerships — but hasn't released the code publicly.

The company argues that revenue from its machine learning products helps fund its larger mission. In addition, by strictly gating access through an API, it says it can control how the technology is used.

One concern, for example, is that advanced natural language algorithms like GPT-3 could supercharge online disinformation. Another is that large-scale algorithms carry built-in bias, and that it takes great care and attention to limit its effects.

At the peak of the initial frenzy, OpenAI CEO Sam Altman tweeted, "The GPT-3 hype is way too much. It's impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes."

Deep learning algorithms lack common sense and contextual awareness. And GPT-3, given the right prompting, will readily parrot the online ugliness that was part of its training data.

To address these issues, OpenAI vets developers and applications before granting access to GPT-3. It has also set guidelines for developers, is working on tools to identify and mitigate bias, and requires that processes and people be in place to monitor applications for bad behavior.

Whether these precautions will be adequate as access to GPT-3 scales remains to be seen.

Researchers would like to give algorithms some common sense, an understanding of cause and effect, and moral judgment. "What we have today is essentially a mouth without a brain," Yejin Choi, a computer scientist at the University of Washington and the Allen Institute for AI, told Nature.

As long as those traits remain out of reach, researchers and GPT-3's human handlers will have to work hard to ensure the benefits outweigh the risks.

Alt-AI: Open Source Alternatives to GPT-3

Not everyone agrees with the walled garden approach.

Eleuther, a project aimed at building an open source competitor to GPT-3, released its latest model, GPT-Neo, last week. The project uses OpenAI's published papers on GPT-3 as a starting point for its algorithms and trains them on distributed computing resources donated by cloud computing company CoreWeave and by Google.

They also created a carefully curated training dataset called the Pile. Eleuther co-founder Connor Leahy told Wired the project has "put a lot of effort into compiling this data set, making sure it is well-filtered and diverse, and documenting its shortcomings and biases."

GPT-Neo's performance doesn't yet match GPT-3's, but it is on par with GPT-3's least advanced version, according to Wired. Meanwhile, other open source projects are also in the works.

"There is a tremendous amount of excitement right now for open source NLP and for producing useful models outside of the big tech companies," said Alexander Rush, a professor of computer science at Cornell University. "There is something like an NLP space race going on."

The risks of open source remain: once the code is out in the wild, there's no taking it back and no controlling how it's used.

But Rush argues that developing algorithms in the open allows researchers outside big companies to probe them, warts and all, and to solve problems.

The new command line

Open source or not, GPT-3 will not be alone for long. Google Brain, for example, recently announced its own major natural language model, weighing in at 1.6 trillion parameters.

In a recent TechCrunch article, Oren Etzioni, CEO of the Allen Institute for AI, and venture capitalist Matt McIlwain wrote that they expect GPT-3 and other large-scale natural language algorithms to come to become more accessible and more affordable.

And notably, they see "prompt programming" as a major shift.

Text, Etzioni and McIlwain wrote, could increasingly become the new command line, a universal translator of sorts that lets the "codeless" harness machine learning and bring new ideas to life: "We think it will empower a whole new generation of creators, with trillions of parameters at their fingertips, in a completely low-code/no-code way."

It seems machines will have a lot to say. It's now our job to make sure the conversation is meaningful.

Image credit: Emil Widlund / Unsplash
