New Lexica model and GPT-4 rumors
Plus a wee GPT refresher course
Here's your daily briefing:
Using Midjourney v4 to generate fictional synthesizers:
I like synths so here are some fictional synths made with AI synths 🎹
1. The CRANIUM3000
86B FM operators / 666 LFOs - its built-in speakers aren't the best, use the L/R outputs 🔊
@midjourney v4 + a quick generative sequenced track
— fabians.eth (@fabianstelzer)
11:50 AM • Nov 8, 2022
They say history is written by the victors. Are we nearing the point where it'll be more accurate to say "history is written by those with the best deepfakes"?
Researchers at Cornell University have recently brought the live avatar technology to the megapixel resolution
They also progressed the "cross-driving synthesis" challenge i.e. producing an animated image from a significantly different source image
#innovation #ai #tech
— Pascal Bornet (@pascal_bornet)
7:30 AM • Nov 7, 2022
Thought-provoking thread, with evidence, about how text-to-image generators could inadvertently perpetuate harmful stereotypes:
Text-to-image generation models (like Stable Diffusion and DALLE) are being used to generate millions of images a day.
We show that these models perpetuate and amplify dangerous stereotypes related to race, gender, crime, poverty, and more (arxiv.org/abs/2211.03759)
A thread🧵
— Federico Bianchi (@federicobianchy)
6:21 PM • Nov 8, 2022
Daniel Eckler dropped another AI mega thread. Worth a read over a cup of coffee to get a "lay of the land" with regard to AI and its current use cases:
ANOTHER AI MEGA THREAD 🧵
— Daniel Eckler ✦ (@daniel_eckler)
2:42 PM • Nov 7, 2022
Samples from the new Lexica model are 🤯:
Here are a few samples from the latest Lexica model. Will be live for everyone to play with in a few days.
To beta test it, just reply here with a prompt.
— Sharif Shameem (@sharifshameem)
3:00 PM • Nov 7, 2022
Some test prompts solicited from Twitter users:



If you're interested in learning more about machine learning (ML), this thread from Sanju Sinha is a great place to start:
A lot of Machine Learning (ML) I learned during my Ph.D. was from youtube. I didn't have a guide to do this effectively and thus here it is:
A complete guide to studying ML from youtube: 13 best and most recent ML courses available on YouTube. 👩🏫🧵⤵️
— Sanju Sinha (@Sanjusinha7)
9:45 PM • Nov 7, 2022
And if you still find yourself occasionally confusing machine learning, deep learning, and artificial intelligence in general, save this image:


We read these tweets from Robert Scoble...
Disruption is coming.
GPT-4 is better than anyone expects.
And it is one of several such AIs that will ship next year.
— Robert Scoble (@Scobleizer)
6:39 AM • Nov 8, 2022
GPT-4 will take over all information and social needs.
Audience is about to move somewhere else.
— Robert Scoble (@Scobleizer)
9:28 AM • Nov 8, 2022
...and got to wondering about GPT-4.
When is it coming out? How will it compare to GPT-3? Will it save the world?
Those questions led us to do a little reading, mostly this piece from DataCamp:
We'll share with you what we gathered below, but first, a refresher.
What is GPT?
(All quotes below are from the aforementioned DataCamp piece, unless otherwise indicated.)
Generative Pre-trained Transformer (GPT) is a text-generation deep learning model trained on data available on the internet. It is used for question answering, text summarization, machine translation, classification, code generation, and conversational AI.
There are endless applications for GPT models, and you can even fine-tune them on specific data to get even better results. By using transformers, you save on compute, time, and other resources.
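To make that concrete, here's a minimal sketch of a GPT-3 text-generation call using OpenAI's Python client as it worked at the time of writing; the model name, prompt, and settings are illustrative placeholders rather than recommendations:

```python
# Minimal sketch of a GPT-3 completion call (OpenAI Python client, late 2022).
# Assumes OPENAI_API_KEY is set in the environment; the model name is illustrative.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="text-davinci-002",  # one of the GPT-3 models available at the time
    prompt="Summarize in one sentence: GPT is a transformer-based language model...",
    max_tokens=60,
    temperature=0.7,
)

print(response.choices[0].text.strip())
```

The basic loop is just "send a prompt, get back text"; the "fine-tune them on specific data" part above refers to the client's separate fine-tuning endpoints, which train a custom model on your own prompt/completion examples.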
What came before GPT?
Before GPT-1, most Natural Language Processing (NLP) models were trained for particular tasks like classification or translation, and they all used supervised learning. That approach comes with two issues: a lack of annotated data and a failure to generalize across tasks.

How do GPT-1, GPT-2, and GPT-3 compare to each other?
GPT-1 (117M parameters): its paper (Improving Language Understanding by Generative Pre-Training) was published in 2018. It proposed a generative language model trained on unlabeled data and then fine-tuned on specific downstream tasks such as classification and sentiment analysis.
GPT-2 (1.5B parameters): its paper (Language Models are Unsupervised Multitask Learners) was published in 2019. It was trained on a larger dataset with more parameters to build an even more powerful language model, and it uses task conditioning, zero-shot learning, and zero-shot task transfer to improve performance.

GPT-3 (175B parameters): its paper (Language Models are Few-Shot Learners) was published in 2020. The model has over 100 times more parameters than GPT-2 and was trained on an even larger dataset to achieve good results on downstream tasks. It surprised the world with human-like story writing, SQL queries and Python scripts, language translation, and summarization, achieving state-of-the-art results through in-context learning in few-shot, one-shot, and zero-shot settings.
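To make "zero-shot" versus "few-shot" concrete, here's a rough illustration of the difference at the prompt level; the reviews and labels below are made up for the example:

```python
# Illustrative sketch: the same sentiment task phrased zero-shot vs. few-shot.
# "In-context learning" means the model picks the task up from examples placed
# inside the prompt itself, with no weight updates.

zero_shot_prompt = (
    "Classify the sentiment of this review as Positive or Negative.\n"
    "Review: The soundtrack was forgettable and the plot dragged.\n"
    "Sentiment:"
)

few_shot_prompt = (
    "Review: I loved every minute of it.\n"
    "Sentiment: Positive\n"
    "Review: The acting was wooden and the pacing was awful.\n"
    "Sentiment: Negative\n"
    "Review: The soundtrack was forgettable and the plot dragged.\n"
    "Sentiment:"
)

# Either string would be sent as the prompt in a completion call like the one above.
```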
What's new in GPT-4?
Even though GPT-4 is highly anticipated, there's still relatively little confirmed information about it. Sam Altman, CEO of OpenAI, did a Q&A last year where he gave some hints about their plans for GPT-4. Most of the "what to expect" rumors and predictions we share below stem from that session.
Model Size
According to Altman, GPT-4 won't be much bigger than GPT-3. So we can assume it will have around 175B-280B parameters, similar to DeepMind's language model Gopher.
Megatron-Turing NLG, at 530B parameters, is three times larger than GPT-3, yet it did not surpass GPT-3 in performance, and smaller models that came after it reached higher performance levels. In simple terms, a larger size does not guarantee higher performance.
Altman said they are focusing on making smaller models perform better. Large language models require huge datasets, massive computing resources, and complex implementations; even deploying them is cost-ineffective for many companies.
Optimality (Getting the best out of a given model)
Large models are mostly under-optimized. Training them is expensive, and companies have to trade off accuracy against cost. GPT-3, for example, was trained only once despite errors in the run; due to unaffordable costs, researchers could not perform hyperparameter optimization.
Microsoft and OpenAI have since shown that GPT-3 could be improved by training it with optimal hyperparameters: in their findings, a 6.7B GPT-3 model with tuned hyperparameters improved performance as much as a 13B GPT-3 model.
They also discovered a new parameterization (μP) under which the best hyperparameters for a smaller model are also the best for a larger model with the same architecture, letting researchers optimize large models at a fraction of the cost.
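As a rough sketch of what that buys you in practice: with μP-style transfer you tune hyperparameters on a small proxy model and reuse them on the big one. The toy model, widths, and learning-rate grid below are hypothetical stand-ins just so the flow runs end to end; real μP also rescales initializations and per-layer learning rates.

```python
# Conceptual sketch of hyperparameter transfer in the spirit of μP (muTransfer).
# The "model" is a toy MLP whose width we vary; everything here is illustrative.
import torch

def build_model(width: int) -> torch.nn.Module:
    # Hypothetical stand-in for "same architecture, different width".
    return torch.nn.Sequential(
        torch.nn.Linear(16, width), torch.nn.ReLU(), torch.nn.Linear(width, 1)
    )

def train(model, lr, steps):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        x = torch.randn(32, 16)
        loss = (model(x) - x.sum(dim=1, keepdim=True)).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

def validation_loss(model):
    x = torch.randn(256, 16)
    return (model(x) - x.sum(dim=1, keepdim=True)).pow(2).mean().item()

SMALL_WIDTH, LARGE_WIDTH = 64, 1024  # cheap proxy vs. (much) wider target

# 1. Tune the learning rate on the small proxy model, which is cheap to retrain.
best_lr, best_loss = None, float("inf")
for lr in [3e-4, 1e-3, 3e-3]:
    model = build_model(SMALL_WIDTH)
    train(model, lr=lr, steps=200)
    loss = validation_loss(model)
    if loss < best_loss:
        best_lr, best_loss = lr, loss

# 2. Under μP, that learning rate is (approximately) optimal for the wider model
#    too, so the expensive model only needs to be trained once.
big_model = build_model(LARGE_WIDTH)
train(big_model, lr=best_lr, steps=200)
print(f"transferred lr={best_lr}, large-model val loss={validation_loss(big_model):.3f}")
```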
Multimodal?
During the Q&A, Altman said that GPT-4 won't be multimodal like DALL-E; it will be a text-only model.
Why? Good multimodal models are much harder to build than language-only or vision-only ones: combining textual and visual information is a challenging task, and a multimodal GPT-4 would have to outperform both GPT-3 and DALL-E 2.
So we shouldn't expect anything fancy on that front in GPT-4.
Alignment issues
GPT-4 should be more aligned than GPT-3. OpenAI has been wrestling with AI alignment: it wants language models to follow our intentions and adhere to our values.
It took a first step by training InstructGPT, a GPT-3 model trained with human feedback to follow instructions. Human judges rated its outputs as better than GPT-3's, even though it doesn't necessarily beat GPT-3 on language benchmarks.
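For a rough sense of what "trained with human feedback" means mechanically, here's a minimal sketch of the pairwise preference loss used to train a reward model in an InstructGPT-style pipeline. The reward values below are dummy numbers, and the full recipe also involves supervised fine-tuning and a reinforcement-learning step against the trained reward model.

```python
# Minimal sketch of the reward-model objective behind InstructGPT-style training:
# for each prompt, a human picks a preferred ("chosen") response over a
# less-preferred ("rejected") one, and the loss pushes the chosen reward higher.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch of comparisons
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Dummy scalar rewards for a batch of 4 human comparisons (illustrative numbers).
reward_chosen = torch.tensor([1.2, 0.3, 0.8, 2.0])
reward_rejected = torch.tensor([0.4, 0.5, -0.1, 1.1])

print(preference_loss(reward_chosen, reward_rejected))  # smaller when chosen >> rejected

# The language model is then fine-tuned (e.g., with PPO) to produce outputs that
# the trained reward model scores highly.
```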
WEN RELEASE
The GPT-4 release date is still unconfirmed, and we can assume the company is focusing more on other technologies like text-to-image and speech recognition. So it might arrive next month or next year; we can't be sure. What we can be sure of is that the next version will address the problems of the previous one and deliver better results.
In conclusion:
GPT-4 will be a text-only large language model with better performance at a similar size to GPT-3. It will also be more aligned with human commands and values.
You might hear conflicting news about GPT-4 having 100 trillion parameters or focusing only on code generation, but it is all speculation at this point. There is a lot we don't know, and OpenAI has not revealed anything concrete about the launch date, model architecture, size, or dataset.
Like GPT-3, GPT-4 will be used for a variety of language applications such as code generation, text summarization, language translation, classification, chatbots, and grammar correction. The new version should be more secure, less biased, more accurate, more aligned, and more cost-efficient and robust.

"a sunlit indoor lounge area with a pool with clear water and another pool with translucent pastel pink water, next to a big window, digital art"
