Window into the mind: A new fMRI brain decoding study promises new technologies, stirs up controversy
A neuroimaging research group from the University of California, Berkeley, recently astonished the world with images of video clips reconstructed from subjects' brain activity. The study, performed by the research group of Dr. Jack Gallant, was published online in the journal "Current Biology" on September 22, 2011. It has since received wide publicity in the technology blogosphere (e.g., "Gizmodo," "MIT Technology Review") as well as in the popular media, and remains the journal's most downloaded paper.
Below is an excerpt from a video posted by the research team on their YouTube channel, showing the reconstructed movie side by side with the real thing. The result? Moving desert animals and inkblots reconstructed from functional magnetic resonance imaging (fMRI) scans appear to be almost painted with thousands of averaged YouTube clips. This blurry, eerily impressionist and yet undoubtedly recognizable rendition of the real footage seems not only to offer a "glimpse into the brain" but also to comment on the power of user-generated content. (Incidentally, over the weekend, the video drew more than a million views, and a documentary is rumored to be in the works.)
So how did we suddenly find ourselves in the midst of a technological leap into "brain reading"?
Predicting and reconstructing visual stimuli from fMRI data has been a fairly active field for the past five years or so, with various groups succeeding in "brain decoding" simple patterns and images from the visual cortex (1) or even in predicting simple memorized visual content (2). So what makes this new study groundbreaking? The authors were able to model and predict dynamic, "movie-like" stimuli, which are vastly more representative of our everyday visual experience.
In order to achieve this, the researchers had to overcome a major hurdle: blood oxygen level-dependent (BOLD) signals measured via fMRI are excruciatingly slow. They evolve over seconds, far more slowly than both the underlying neuronal activity and the frame rate of the videos. As a result, it had not previously been possible to model brain activity evoked by rapid, dynamic visual content such as movies.
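To get a feel for just how sluggish the BOLD signal is, here is a toy sketch (not the authors' code) that convolves a brief burst of simulated neural activity with a canonical double-gamma hemodynamic response function, a standard textbook stand-in for the slow BOLD response; all parameters here are purely illustrative:

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(dt=1.0, duration=32.0):
    """Double-gamma hemodynamic response function sampled every dt seconds.
    Shape parameters are typical textbook values, not the study's filters."""
    t = np.arange(0, duration, dt)
    peak = gamma.pdf(t, a=6)          # positive lobe, peaking around 5 s
    undershoot = gamma.pdf(t, a=16)   # late undershoot, around 15 s
    hrf = peak - 0.35 * undershoot
    return hrf / np.abs(hrf).sum()

# A 1-second burst of "neural" activity in a 60 s run, sampled at 1 Hz.
neural = np.zeros(60)
neural[10] = 1.0

# The measurable BOLD signal is (roughly) the neural activity convolved
# with the slow hemodynamic response.
bold = np.convolve(neural, canonical_hrf())[:60]
print(f"neural burst at t = 10 s; BOLD peak at t = {bold.argmax()} s")
```

A one-second neural event ends up smeared into a response that peaks roughly five seconds later and lingers for tens of seconds, which is why frame-by-frame content cannot simply be read off the BOLD signal.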
The group overcame this limitation with a two-stage signal-processing model that used sophisticated filters to describe the fast brain response separately from the slow BOLD signal. First, a subject lay in the scanner watching several hours of movies from a training data set. Each movie frame was passed through a series of neuron-inspired "motion-energy" filters that modeled the fast brain response, and their output was fed into a second filtering stage that mimicked the slow BOLD signal. The combined model was then fitted to the brain recordings. This was done for each voxel, the three-dimensional pixel of the fMRI image. The outcome was a voxel-by-voxel description of the brain activity elicited by any type of video. "We built a model for each voxel that describes how shape and motion information in the movie is mapped into brain activity," said Shinji Nishimoto, a post-doctoral researcher who headed the study.
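In spirit, the fitting step resembles a regularized linear regression from stimulus features to each voxel's time course. The sketch below is a heavily simplified illustration with made-up shapes and random placeholders standing in for the motion-energy features; it is not the authors' implementation:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_timepoints, n_features, n_voxels = 600, 2000, 50

# Stage 1 (assumed done upstream): every movie frame has been passed
# through a bank of motion-energy filters, giving one feature vector
# per fMRI time point. Random numbers stand in for those features here.
features = rng.standard_normal((n_timepoints, n_features))

# Stage 2: approximate the sluggish BOLD response by stacking delayed
# copies of the features (here 2-4 time steps), letting the regression
# learn each voxel's own hemodynamic lag. (np.roll wraps around at the
# edges, which is fine for a toy example.)
delays = [2, 3, 4]
lagged = np.hstack([np.roll(features, d, axis=0) for d in delays])

# Measured responses: one time course per voxel (random placeholders).
bold = rng.standard_normal((n_timepoints, n_voxels))

# Fit one regularized linear model per voxel; each voxel's weights
# describe how shape and motion information maps into its activity.
models = [Ridge(alpha=100.0).fit(lagged, bold[:, v]) for v in range(n_voxels)]
```

Fitting a separate model per voxel is what makes the description "voxel-by-voxel": every voxel ends up with its own set of weights over the stimulus features.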
When the subjects later watched a new set of "test" movies in the scanner, researchers viewing their brain activity could identify which video segment was being seen with an astonishing 75 percent accuracy.
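Conceptually, this identification step asks which candidate clip's model-predicted brain activity best matches the measured activity. A minimal sketch of that idea, with invented names and shapes, might look like this:

```python
import numpy as np

def identify_clip(measured_bold, predicted_bold_per_clip):
    """Return the index of the candidate clip whose model-predicted BOLD
    pattern best correlates with the measured one.

    measured_bold:           array of shape (time, voxels)
    predicted_bold_per_clip: list of arrays with the same shape
    """
    scores = [
        np.corrcoef(measured_bold.ravel(), predicted.ravel())[0, 1]
        for predicted in predicted_bold_per_clip
    ]
    return int(np.argmax(scores))
```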
This raised the question of whether the model could be used not just to predict but to reconstruct unknown visual content. To this end, the researchers fed 18 million seconds of random YouTube videos into the encoding model and generated a huge library of the potential brain states each of those videos would evoke. The subject was then shown another unknown set of "test" movies, and the resulting BOLD signal was compared against the potential BOLD signals in the library. The video frames corresponding to the 100 library videos whose predicted BOLD signals were most similar to the measured one were averaged to produce the final reconstructed image.
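The reconstruction step can be sketched in a few lines. This is a simplified illustration with hypothetical names and shapes, using a plain dot product as the similarity score and a plain average of the top matches, rather than the study's actual procedure:

```python
import numpy as np

def reconstruct_frame(measured_bold, library_bold, library_frames, top_k=100):
    """Average the frames of the library clips whose predicted BOLD
    patterns are closest to the measured pattern.

    measured_bold:  (n_voxels,) measured pattern at one time point
    library_bold:   (n_clips, n_voxels) model-predicted patterns
    library_frames: (n_clips, height, width) the corresponding frames
    """
    similarity = library_bold @ measured_bold       # one score per clip
    best = np.argsort(similarity)[-top_k:]          # top-k closest clips
    return library_frames[best].mean(axis=0)        # blurry average frame
```

Averaging a hundred roughly matching clips is exactly what gives the reconstructions their blurry, impressionist look.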
Wow . . . what now?
While the technique is still quite primitive, it has opened the door to plenty of speculation about its impact on technology and culture, from the researchers themselves and the media alike.
- Diagnoses and prosthetics — Key potential applications of brain decoding devices are in diagnosing conditions such as stroke and in brain-computer interfaces that would enable paralyzed individuals to interact with a computer or drive a prosthetic arm.
- Digitally storing subjective experience — Being able one day to record and store visual information, decode dreams and perhaps reconstruct memories has been the stuff of science fiction since long before this study. Some day, it just may become a major industry.
- Ethical and privacy issues — The technique has only taken its baby steps, yet it is already surrounded by ethical debate and controversy. Among the concerns is its potential use by the military or the legal system as an interrogation tool. Other major questions concern privacy. On their website, the researchers address the privacy and accessibility issues that may arise for neurally obtained information, comparing them to the current battle over the availability of genetic information.
While we all wait a few decades for our dream decoders, in the nearer future we can expect to see the technique take its first steps toward commercialization. It is sure to inspire the development of faster, more efficient decoding and encoding algorithms, and perhaps spur efforts to develop portable fMRI devices or alternative methods of brain recording.