GPT-3: we are at the very beginning of a new app ecosystem

The most impressive thing about OpenAI’s natural language processing (NLP) model, GPT-3, is its sheer size. With 175 billion parameters (the weighted connections between words), the transformer-based model blows its 1.5-billion-parameter predecessor, GPT-2, out of the water. That scale lets the model generate surprisingly human text after being fed only a few examples of the task you want it to do.
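
To make “only a few examples” concrete, here is a minimal sketch of few-shot prompting against OpenAI’s completion API as it existed around GPT-3’s launch. The engine name, prompt, and parameters are illustrative assumptions, not details from this article.

```python
# Minimal few-shot prompting sketch (illustrative; engine name and
# parameters are assumptions, not part of the article).
import openai  # pip install openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# The "training" is just a handful of in-prompt examples.
prompt = (
    "Convert each sentence into a polite customer-support reply.\n\n"
    "Sentence: The package never arrived.\n"
    "Reply: I'm sorry your package hasn't arrived yet. Let me track it for you right away.\n\n"
    "Sentence: The app keeps crashing.\n"
    "Reply: I apologize for the trouble. Could you share your device model so we can investigate?\n\n"
    "Sentence: I was charged twice.\n"
    "Reply:"
)

response = openai.Completion.create(
    engine="davinci",   # base GPT-3 engine at the time
    prompt=prompt,
    max_tokens=60,
    temperature=0.7,
    stop=["\n\n"],
)
print(response.choices[0].text.strip())
```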

Its release in 2020 dominated the headlines, and people clamored for a spot on the waiting list to access the API offered through OpenAI’s cloud service. Now, months later, as more users gain access to the API (myself included), interesting applications and use cases are appearing every day. Debuild.co, for example, has published demos in which you can build an application by giving the program simple instructions in plain English.

Despite the hype, questions remain as to whether GPT-3 will be the foundation on which an NLP application ecosystem rests, or whether newer, stronger NLP models will knock it off its throne. As businesses begin to propose and design NLP applications, here is what they need to know about GPT-3 and its potential ecosystem.

GPT-3 and the NLP arms race

As I have described in the past, there are really two approaches to training an NLP model: generalized and non-generalized.

A non-generalized approach has specific pre-training objectives that correspond to a known use case. Essentially, these models go deep on a smaller, more focused dataset rather than going broad on a massive one. An example is Google’s PEGASUS model, which was built specifically for text summarization. PEGASUS is pre-trained on a dataset closely aligned with its end goal and is then fine-tuned on text summarization datasets to deliver state-of-the-art results (a sketch of what that looks like in practice follows below). The advantage of the non-generalized approach is that it can dramatically increase accuracy on specific tasks. However, it is also significantly less flexible than a generalized model and still requires many training examples before it becomes accurate.
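
For contrast with the few-shot sketch above, here is what using a task-specific model like PEGASUS typically looks like with the Hugging Face transformers library. The checkpoint name and example text are illustrative assumptions.

```python
# Sketch: summarization with a task-specific PEGASUS checkpoint.
# Checkpoint name and example text are illustrative assumptions.
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model_name = "google/pegasus-xsum"  # PEGASUS fine-tuned on the XSum summarization dataset
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)

document = (
    "Quarterly revenue rose 12 percent as the company expanded into new "
    "markets, while operating costs grew only 3 percent, lifting margins "
    "to their highest level in five years."
)

inputs = tokenizer(document, truncation=True, padding="longest", return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=40, num_beams=4)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
```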

A generalized approach, on the other hand, goes broad. This is GPT-3’s 175 billion parameters at work, trained on essentially the entire internet. It allows GPT-3 to handle basically any NLP task with just a handful of examples, although its accuracy is not always ideal. In fact, the OpenAI team itself emphasizes the limits of generalized pre-training and even admits that GPT-3 has “notable weaknesses in text synthesis.”

OpenAI has taken a bigger-is-better approach to these accuracy issues, with each version of the model increasing the parameter count by an order of magnitude. Competitors have taken note. Google researchers recently released a paper describing a Switch Transformer NLP model with 1.6 trillion parameters. That is a simply ridiculous number, and it suggests we may see an arms race in generalized models. Although these are the two largest generalized models, Microsoft has Turing-NLG at 17 billion parameters and may also want to join the race. Considering it reportedly cost OpenAI almost $12 million to train GPT-3, such an arms race could get expensive.

Promising GPT-3 applications

The flexibility of GPT-3 is what makes it attractive from the point of view of an application ecosystem. You can use it to do just about anything you can imagine with language. Predictably, startups have begun exploring how to use GPT-3 to power next-generation NLP applications. Here is a list of interesting GPT-3 products compiled by Alex Schmitt at Cherry Ventures.

Many of these applications are consumer-oriented, such as the “Love Letter Generator”, but there are also more technical ones such as the “HTML Generator”. As businesses consider how and where to incorporate GPT-3 into their business processes, some of the most promising early use cases are in healthcare, finance, and video conferencing.

For businesses in healthcare, financial services, and insurance, there is an acute need to streamline research. Data in these fields is growing exponentially, and it is becoming impossible to stay on top of one’s field in the face of that growth. NLP applications built on GPT-3 can scrape the latest reports, papers, results, and so on, and contextually summarize the key findings to save researchers time.

And as video conferencing and telehealth became increasingly important during the pandemic, we have seen rising demand for NLP tools that can be applied to virtual meetings. What GPT-3 offers is the ability not only to transcribe and take notes on an individual meeting, but also to generate “too long; didn’t read” (TL;DR) summaries.
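
One common way to elicit such a summary from GPT-3 is simply to append a “TL;DR:” cue to the transcript and let the model complete it. A minimal sketch, with an invented transcript and illustrative engine name and parameters:

```python
# Sketch: "TL;DR" meeting summary via GPT-3's completion API.
# Transcript, engine name, and parameters are illustrative assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

transcript = (
    "Alice: We need the onboarding flow redesigned before the Q3 launch.\n"
    "Bob: Design can have mockups by Friday if engineering freezes the API.\n"
    "Carol: Agreed. Let's freeze the API today and review mockups Monday.\n"
)

response = openai.Completion.create(
    engine="davinci",
    prompt=transcript + "\nTL;DR:",  # the cue that elicits a summary
    max_tokens=60,
    temperature=0.3,
)
print(response.choices[0].text.strip())
```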

How enterprises and startups can build a moat

Despite these promising use cases, the biggest inhibitor of a GPT-3 application ecosystem is how easily a copycat can replicate the performance of any application built with GPT-3’s API.

Everyone who uses GPT-3’s API is working with the same NLP model pre-trained on the same data, so the only differentiator is the fine-tuning data an organization uses to specialize its use case. The more fine-tuning data you have, the more differentiated and defensible your application becomes.
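
In practice, that proprietary fine-tuning data is typically just a set of prompt/completion pairs. Here is a minimal sketch of what such a dataset might look like in the JSONL layout OpenAI’s fine-tuning tooling expects; the records themselves are invented, and availability of fine-tuning for GPT-3 depended on OpenAI’s rollout at the time.

```python
# Sketch: writing proprietary prompt/completion pairs to JSONL for fine-tuning.
# The example records are invented; the layout follows OpenAI's documented
# prompt/completion format for fine-tuning data.
import json

examples = [
    {"prompt": "Claim: water damage to kitchen ceiling ->",
     "completion": " Category: property / Severity: medium / Next step: field visit"},
    {"prompt": "Claim: rear-end collision, no injuries ->",
     "completion": " Category: auto / Severity: low / Next step: desk review"},
]

with open("fine_tune_data.jsonl", "w") as f:
    for record in examples:
        f.write(json.dumps(record) + "\n")

# The resulting file would then be uploaded to the fine-tuning service,
# e.g. via OpenAI's CLI: openai api fine_tunes.create -t fine_tune_data.jsonl
```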

What does this mean? Larger organizations with more users or more data than their competitors will be best positioned to take advantage of GPT-3’s promise. GPT-3 will not give rise to disruptive startups; rather, it will let enterprises and large organizations optimize their offerings on the strength of their existing advantage.

What does this mean for enterprises and startups moving forward?

Applications built on GPT-3’s API are just beginning to scratch the surface of potential use cases, so we have yet to see an ecosystem develop beyond interesting proofs of concept. How such an ecosystem would be monetized and mature is another open question.

Because differentiation in this context comes down to fine-tuning data, I expect companies to rely on GPT-3’s generalization for certain NLP tasks while sticking with non-generalized models like PEGASUS for more specialized NLP tasks.

As parameter counts grow exponentially among the major NLP players, we may see users shift between ecosystems depending on which model currently holds the lead.

Regardless of whether a GPT-3 application ecosystem matures or is displaced by another NLP model, businesses should be excited about the relative ease with which it is becoming possible to create highly articulate NLP applications. They should investigate use cases and consider how they can leverage their position in the market to quickly build value-adds for their customers and their own business processes.

Dattaraj Rao is Innovation and R&D Architect at Persistent Systems and author of the book Keras to Kubernetes: The Journey of a Machine Learning Model to Production. At Persistent Systems, he heads the AI Research Lab. He holds 11 patents in machine learning and computer vision.
