Meet DALL-E, the A.I. That Draws Anything at Your Command

New technology that blends language and images could serve graphic artists – and speed disinformation campaigns.

The New York Times,


What's inside?

Is OpenAI too dangerous for the internet?



Editorial Rating

8

Qualities

  • Scientific
  • Eye Opening
  • Overview

Recommendation

Microsoft has invested a billion dollars in OpenAI, the artificial intelligence lab that created DALL-E, a program that combines image and text analysis to generate realistic images from verbal commands. This is the latest advance in neural networks – systems that filter vast amounts of data to identify and classify images. While the application could be a boon to graphic designers and developers of digital assistants, it is also potentially dangerous in the wrong hands: the internet is already rife with disinformation and “deep fakes,” which DALL-E could exacerbate. It isn’t yet on the market and is available only to researchers – for now.

Take-Aways

  • OpenAI, an artificial intelligence lab backed by a billion-dollar investment from Microsoft, created DALL-E.
  • DALL-E generates images on command via neural networks, which access and filter vast amounts of data.
  • DALL-E could be used to create “deep fakes” and is currently only available to researchers.

Summary

OpenAI, an artificial intelligence lab backed by a billion-dollar investment from Microsoft, created DALL-E.

Alex Nichol, a researcher at OpenAI, typed the command “a teapot in the shape of an avocado” into DALL-E, new software that uses AI to generate images from a few simple descriptive words. The system immediately created several realistic images combining “teapot” and “avocado.” As DALL-E’s capabilities improve, such technology could become very useful to graphic designers. Currently, DALL-E (a name honoring the Pixar film WALL-E and the surrealist painter Salvador Dalí) isn’t perfect: when Nichol asked it to put the Eiffel Tower on the moon, for instance, it generated a picture of the moon in the sky above the tower.

“You could use it for good things, but certainly you could use it for all sorts of other crazy, worrying applications, and that includes deep fakes.” (Subbarao Kambhampati, computer science professor at Arizona State University)

But some of DALL-E’s images are startlingly photorealistic, raising concerns that the system could manufacture pictures convincing enough to be mistaken for the real thing. The internet is already full of “deep fakes” and disinformation. How can OpenAI ensure that DALL-E isn’t abused by bad actors?

DALL-E generates images on command via neural networks, which access and filter vast amounts of data.

As recently as five years ago, AI labs all over the world were developing technology for identifying and even generating images of people and things. Later, they did the same with language. Now, experts are combining the technologies, and DALL-E is “notable” in that it can interpret language to generate an image.

“DALL-E looks for patterns as it analyzes millions of digital images as well as text captions that describe what each image depicts. In this way, it learns to recognize the links between the images and the words.”

OpenAI built DALL-E from neural networks, systems that “loosely resemble” the human brain and that “learn” to identify and classify information by analyzing vast amounts of data. From a language command, the system derives features that describe the requested image – a teddy bear, for instance, or a trumpet. A second neural network, a “diffusion model,” then generates the pixels of the final image. Researchers are now expanding the system to include audio. Such systems could improve search engines, digital assistants and tools for graphic artists.
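The two-stage pipeline described here – a text encoder that turns a prompt into numerical features, followed by a diffusion model that refines random noise toward those features – can be illustrated with a toy numerical sketch. Everything below (the function names, the hashing “encoder,” the update rule and its constants) is invented for illustration; it is not OpenAI’s actual system, only a minimal picture of how conditioned diffusion proceeds.

```python
import zlib

import numpy as np


def embed_prompt(prompt: str, dim: int = 16) -> np.ndarray:
    """Toy stand-in for a learned text encoder: hash each word into a
    deterministic pseudo-random vector and average the results."""
    words = prompt.lower().split()
    vec = np.zeros(dim)
    for word in words:
        rng = np.random.default_rng(zlib.crc32(word.encode()))
        vec += rng.standard_normal(dim)
    return vec / max(len(words), 1)


def toy_diffusion(cond: np.ndarray, steps: int = 50, seed: int = 1) -> np.ndarray:
    """Toy stand-in for a diffusion model: start from pure noise and
    repeatedly nudge the sample toward the conditioning vector while
    injecting progressively smaller amounts of fresh noise."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(cond.shape)  # step 0: pure noise
    for t in range(steps):
        noise_scale = 1.0 - (t + 1) / steps  # noise fades to zero
        x = x + 0.2 * (cond - x)  # denoising "nudge" toward the prompt features
        x = x + 0.1 * noise_scale * rng.standard_normal(cond.shape)
    return x


features = embed_prompt("a teapot in the shape of an avocado")
sample = toy_diffusion(features)
```

A real system replaces both toys with trained networks operating over millions of pixels, but the shape of the loop is the same: noise in, gradual refinement toward the text-conditioned target.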

DALL-E could be used to create “deep fakes” and is currently only available to researchers.

Because DALL-E could be abused by people who want to spread disinformation, it hasn’t been released to the public. It shows some of the flaws that plague other AI-driven applications, such as ethnic and gender bias, and it could be used to churn out pornography and hate speech. The technology could also produce “deep fakes,” spreading disinformation much as happened during the 2016 US presidential campaign. Once anything online can be fake, nothing is certifiably real – which would undermine users’ confidence in the internet as a place to find reliable, accurate information.

“People need to know that the images they see may not be real.” (Boris Dayma, independent AI researcher)

DALL-E is “not a product,” according to its developers; rather, it is a test of AI’s capabilities, which will need to include safeguards against abuse. Currently, DALL-E is on a “tight leash”: only vetted researchers can use it, and not on their own, and every image it generates carries a watermark. But developers in other countries are working on the same technology, and it may be only a matter of time before such tools spread across the internet.

About the Author

Cade Metz is a technology correspondent with The New York Times. He covers artificial intelligence, driverless cars, robotics, virtual reality and other emerging technologies. He is the author of Genius Makers: The Mavericks Who Brought AI to Google, Facebook, and the World.

This document is restricted to personal use only.
