Have AI Brain to Image Systems Arrived? Not quite.

Bryan Cheong
Mar 08, 2023


Our thoughts have always been the starting point for our creations - from building machines to writing stories. But what if our thoughts could become tangible before our very eyes? This notion - once far-fetched - has started to feel within reach with the advent of text-to-image AI generation.

On Friday, March 3rd, a white paper went viral on Twitter that some thought might signal the advent of thought-to-image systems. Wonder, awe, and fear ensued as people pondered the notion of our minds being read and visualized by AI.

deiniolb 🐻👉 aisuite.io (@danberridge), Mar 3, 2023:
"I'm speechless. Not peer-reviewed yet but a submitted paper. The 'presented images' were shown to a group of humans. The 'reconstructed images' were the result of an fMRI output to Stable Diffusion. In other words, #stablediffusion literally read people's minds. Source 👇"

While the technology to create images directly from brain activity has proven possible over recent decades - with this paper being the latest research on the subject - it isn’t a thought-to-image system, as some hoped and others feared.

Peyman 𝕄𝕀𝕃𝔸ℕ𝔽𝔸ℝ (@docmilanfar), Mar 4, 2023:
"So much overhyping, it makes your head spin. Brainwaves to images is something that’s been around since at least 2008. To somehow suggest that hooking up stable diffusion means that it is 'literally reading people’s minds' is absolutely ridiculous. scientificamerican.com/article/transl…"
(quoting @danberridge's tweet above)

To create a machine that can draw what you're thinking of, three steps are needed.

Step One is a matter of how your brain works: the brain activity of someone imagining a scene differs from the visual-cortex activity of someone actually seeing it, so there must be some correlation between the two that a functional magnetic resonance imaging (fMRI) scan can pick out. Step Two requires mapping the brain activity captured in an fMRI scan to an encoded image or video representation; for this to work, there must be a consistent, perceptible difference between the brain activity of someone looking at an elephant and that of someone looking at a car. Step Three is reconstructing, or drawing, images from the encoded representation obtained in Step Two.
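
To make the division of labor concrete, here is the pipeline sketched as Python stubs. Every function body is a stand-in, and all names and shapes are purely illustrative - none of this comes from the papers discussed.

```python
import numpy as np

def scan_brain(stimulus: str) -> np.ndarray:
    """Step One: record visual-cortex activity with fMRI while the
    subject views (or, harder, merely imagines) the stimulus."""
    return np.random.randn(5000)              # stand-in voxel pattern

def encode_activity(voxels: np.ndarray) -> np.ndarray:
    """Step Two: map the voxel pattern to an encoded representation in
    which 'elephant' activity and 'car' activity stay distinguishable."""
    return voxels[:512]                        # stand-in for a learned mapping

def reconstruct_image(latent: np.ndarray) -> np.ndarray:
    """Step Three: render the representation as pixels, e.g. with a
    diffusion model; here just a blank placeholder image."""
    return np.zeros((512, 512, 3))

image = reconstruct_image(encode_activity(scan_brain("an elephant")))
```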

By perfecting these three steps, we may be able to generate tangible images from our thoughts in the future.

Very recent advances in generative AI, particularly diffusion models, can now do Step Three very well. For this reason, the paper "High-resolution image reconstruction with latent diffusion models from human brain activity" by Yu Takagi and Shinji Nishimoto, released in December 2022, generated a lot of online fascination over how vivid the images an AI drew directly from brain activity could be.
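
For a sense of how accessible Step Three alone has become, here is a minimal text-to-image sketch using Hugging Face's diffusers library. This is an off-the-shelf generation call, not the paper's brain-decoding setup; the checkpoint and prompt are illustrative choices.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a public Stable Diffusion checkpoint (illustrative choice)
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# One text prompt in, one vivid image out
image = pipe("an elephant walking through a savanna").images[0]
image.save("step_three_demo.png")
```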

What Stable Diffusion and other rapidly improving image-generation AI models provide is a new way to interpret this brain activity and actually visualize it. Previous approaches, before we had the computationally cheap Stable Diffusion, produced fuzzy forms and images that did not look meaningful, such as the reconstructions in "Deep image reconstruction from human brain activity" from 2019.

This is a powerful example of how our new, open-source AI models can combine synergistically with older technologies to produce astonishing results. There has been no significant change in fMRI technology for reading brain activity; we simply have much better tools for mapping that activity into latent representations that Stable Diffusion can turn into meaningful images.
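
As a rough illustration of what "mapping activity into latent representations" can mean in practice, here is a toy sketch assuming paired training data of fMRI voxel patterns and the Stable Diffusion VAE latents of the viewed images. A regularized linear regression is in the spirit of such decoding studies, but every number, shape, and name below is a synthetic stand-in, not the paper's actual method.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_train, n_voxels = 600, 3000
latent_dim = 4 * 64 * 64                   # flattened SD latent for a 512x512 image

X_train = rng.standard_normal((n_train, n_voxels))    # fMRI voxel patterns (stand-in)
Z_train = rng.standard_normal((n_train, latent_dim))  # VAE latents of viewed images (stand-in)

# Heavily regularized linear map: far more voxels than training scans
decoder = Ridge(alpha=1000.0)
decoder.fit(X_train, Z_train)

# A new scan is mapped into latent space; the result would be reshaped to
# (4, 64, 64) and rendered by Stable Diffusion's VAE decoder, or used to
# condition the diffusion process.
X_test = rng.standard_normal((1, n_voxels))
z_pred = decoder.predict(X_test).reshape(4, 64, 64)
```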

It’s easy to forget how recent and advanced the 2019 papers using self-supervised and GAN models already were. Back in 2011, a team at UC Berkeley created some of the earliest reconstructions of video clips from brain activity.

This was before generative AI, so their approach used a statistical maximum-likelihood method: they found the video clips that produced responses most similar to the detected brain activity and created an “average clip” across those similar clips. The resulting videos reconstructed many of the forms and colors, but were of course far less compelling than images generated by diffusion models. Generative AI has since removed the need for such crude methods.
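
To see how far from generative AI this was, here is a toy version of that matching-and-averaging idea on synthetic data. The candidate “clips” are single random frames and every quantity below is made up for illustration; the 2011 study's actual encoding models were far more sophisticated.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clips, n_voxels = 5000, 2000
clip_frames = rng.random((n_clips, 64, 64, 3))             # candidate clips, one frame each
predicted_resp = rng.standard_normal((n_clips, n_voxels))  # model-predicted fMRI per clip
measured = rng.standard_normal(n_voxels)                   # the actual scan

# Rank candidates by how well their predicted response matches the scan
scores = np.array([np.corrcoef(p, measured)[0, 1] for p in predicted_resp])
top_k = np.argsort(scores)[-100:]

# The "reconstruction" is a weighted average clip over the best matches
w = np.clip(scores[top_k], 0, None)
average_clip = np.tensordot(w / w.sum(), clip_frames[top_k], axes=1)
```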

So we have reasonable proofs of concept for Steps Two and Three, but how about Step One? Can we read the brain activity of imagined scenes and images, and not just things people are directly looking at with their own eyes?

This depends on the individual. According to the paper “The neural correlates of visual imagery vividness–An fMRI study and literature review”, published in 2018, visual imagery vividness is a subjective experience that varies across individuals and depends on several factors, such as attention, memory, emotion, motivation, and cognitive style. So some individuals, when merely imagining that they see things, activate visual-cortex patterns similar to those used in the brain-reading proofs of concept, but others do not. This makes sense on a fundamental level: people often overestimate how much they are able to visualize, and some people have aphantasia and cannot visualize at all.

Our brains are very good at tricking us into thinking that we are fully visualizing something even when we are not. Imagine a bicycle and picture it clearly in your head. Is the image clear? Now draw a bicycle on paper. You will likely find that you can’t really remember where the struts, supports, and chain of a bicycle actually go, despite believing you had a clear image of a bicycle just moments before. The brain is very good at filling in the gaps of what we can’t visualize by drawing our attention away from them. Luckily, models like Stable Diffusion and other generative models are also very good at filling in gaps and in-painting.
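
In case in-painting is unfamiliar, here is a hedged sketch of what it looks like with an off-the-shelf diffusion model via Hugging Face's diffusers. The checkpoint is a real public one, but the file names and prompt are hypothetical.

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("bicycle_sketch.png").convert("RGB")  # hypothetical input
mask = Image.open("missing_parts_mask.png").convert("L")      # white = regions to fill

result = pipe(
    prompt="a complete bicycle with chain, frame and spokes",
    image=init_image,
    mask_image=mask,
).images[0]
result.save("bicycle_completed.png")
```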

Someday, they may perform in-painting on images generated from reading our minds, allowing for a kind of camera for the imagination - doing with neurones what cameras do with photons. The advent of neurography isn’t here yet, but it’s hard to imagine it isn’t inevitable.


A guest post by Bryan Cheong, data scientist and machine learning engineer in San Francisco.