{"id":151211,"date":"2021-01-10T14:00:59","date_gmt":"2021-01-10T11:00:59","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/heres-how-openais-magical-dall-e-image-generator-works\/"},"modified":"2021-01-10T14:00:59","modified_gmt":"2021-01-10T11:00:59","slug":"heres-how-openais-magical-dall-e-image-generator-works","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/heres-how-openais-magical-dall-e-image-generator-works\/","title":{"rendered":"#Here\u2019s how OpenAI\u2019s magical DALL-E image generator works"},"content":{"rendered":"<div>\n                                It seems like every few months, someone publishes a machine learning paper or demo that makes my jaw drop. This month, it\u2019s OpenAI\u2019s new image-generating model,<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/openai.com\/blog\/dall-e\/\">DALL\u00b7E<\/a>.<\/p>\n<p>This behemoth 12-billion-parameter neural network takes a text caption (e.g. 
\u201can armchair in the shape of an avocado\u201d) and generates images to match it:<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" title=\"Generated images of avocado chairs\" alt=\"Generated images of avocado chairs\" width=\"1432\" height=\"1428\" class=\" lazy\" src=\"https:\/\/daleonai.com\/images\/screen-shot-2021-01-06-at-1.37.37-pm.png\" data-lazy=\"true\"\/><figcaption>From https:\/\/openai.com\/blog\/dall-e\/<\/figcaption><\/figure>\n<p>I think its pictures are pretty inspiring (I\u2019d buy one of those avocado chairs), but what\u2019s even more impressive is DALL\u00b7E\u2019s ability to understand and render concepts of space, time, and even logic (more on that in a second).<\/p>\n<p>In this post, I\u2019ll give you a quick overview of what DALL\u00b7E can do, how it works, how it fits in with recent trends in ML, and why it\u2019s significant. 
Away we go!<\/p>\n<h2 id=\"what-is-dalle-and-what-can-it-do\"><span class=\"ez-toc-section\" id=\"What_is_DALL%C2%B7E_and_what_can_it_do\"><\/span>What is DALL\u00b7E and what can it do?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>In July, DALL\u00b7E\u2019s creator, the company OpenAI, released a similarly huge model called GPT-3 that wowed the world with<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/daleonai.com\/gpt3-explained-fast\">its ability to generate human-like text<\/a>, including op-eds, poems, sonnets, and even computer code. DALL\u00b7E is a natural extension of GPT-3 that parses text prompts and then responds not with words but with pictures. In one example from OpenAI\u2019s blog, the model renders images from the prompt \u201ca living room with two white armchairs and a painting of the colosseum. The painting is mounted above a modern fireplace\u201d:<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" title=\"DALLE generated images\" alt=\"DALLE generated images\" width=\"1424\" height=\"1428\" class=\" lazy\" src=\"https:\/\/daleonai.com\/images\/screen-shot-2021-01-06-at-2.39.07-pm.png\" data-lazy=\"true\"\/><figcaption>From https:\/\/openai.com\/blog\/dall-e\/.<\/figcaption><\/figure>\n<p>Pretty slick, right? You can probably already see how this might be useful for designers. Notice that DALL\u00b7E can generate a large set of images from a prompt. The pictures are then ranked by a second OpenAI model, called<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/openai.com\/blog\/clip\/\">CLIP<\/a>, that tries to determine which pictures best match the caption.<\/p>\n<h2 id=\"how-was-dalle-built\"><span class=\"ez-toc-section\" id=\"How_was_DALL%C2%B7E_built\"><\/span>How was DALL\u00b7E built?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Unfortunately, we don\u2019t have a ton of details on this yet because OpenAI has yet to publish a full paper. But at its core, DALL\u00b7E uses the same new neural network architecture that\u2019s responsible for tons of recent advances in ML: the<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1706.03762\">Transformer<\/a>. Transformers, introduced in 2017, are an easy-to-parallelize type of neural network that can be scaled up and trained on huge datasets. 
They\u2019ve been particularly revolutionary in natural language processing (they\u2019re the basis of models like BERT, T5, GPT-3, and others), improving the quality of<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/blog.google\/products\/search\/search-language-understanding-bert\/\">Google Search<\/a><span>\u00a0<\/span>results and translation, and even helping in<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/daleonai.com\/how-alphafold-works\">predicting the structures of proteins<\/a>.<\/p>\n<p>Most of these big language models are trained on enormous text datasets (like all of Wikipedia or<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/commoncrawl.org\/\">crawls of the web<\/a>). What makes DALL\u00b7E unique, though, is that it was trained on sequences that were a combination of words and pixels. 
We don\u2019t yet know what the dataset was (it probably contained images and captions), but it was almost certainly massive.<\/p>\n<h2 id=\"how-smart-is-dalle\"><span class=\"ez-toc-section\" id=\"How_%E2%80%9Csmart%E2%80%9D_is_DALL%C2%B7E\"><\/span>How \u201csmart\u201d is DALL\u00b7E?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>While these results are impressive, whenever we train a model on a huge dataset, the skeptical machine learning engineer is right to ask whether the results are high-quality merely because they\u2019ve been copied or memorized from the source material.<\/p>\n<p>To prove DALL\u00b7E isn\u2019t just regurgitating images, the OpenAI authors forced it to render some pretty unusual prompts:<\/p>\n<p>\u201cA professional high quality illustration of a giraffe turtle chimera.\u201d<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" alt=\"\" width=\"1436\" height=\"1140\" class=\" lazy\" src=\"https:\/\/daleonai.com\/images\/screen-shot-2021-01-06-at-1.39.04-pm.png\" data-lazy=\"true\"\/><figcaption>From https:\/\/openai.com\/blog\/dall-e\/.<\/figcaption><\/figure>\n<p>\u201cA snail made of a harp.\u201d<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" alt=\"\" width=\"1438\" height=\"1434\" class=\" lazy\" src=\"https:\/\/daleonai.com\/images\/screen-shot-2021-01-06-at-1.39.12-pm.png\" data-lazy=\"true\"\/><figcaption>From https:\/\/openai.com\/blog\/dall-e\/<\/figcaption><\/figure>\n<p>It\u2019s hard to imagine the model came across many giraffe-turtle hybrids in its training data set, which makes the results more impressive.<\/p>\n<p>What\u2019s more, these weird prompts hint at something even more fascinating about DALL\u00b7E: its ability to perform \u201czero-shot visual reasoning.\u201d<\/p>\n<h2 id=\"zero-shot-visual-reasoning\"><span class=\"ez-toc-section\" id=\"Zero-Shot_Visual_Reasoning\"><\/span>Zero-Shot Visual Reasoning<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Typically, in machine learning, we train models by giving them thousands or millions of examples of tasks we want them to perform and hope they pick up on the pattern.<\/p>\n<p>To train a model that identifies dog 
breeds, for example, we might show a neural network thousands of pictures of dogs labeled by breed and then test its ability to tag new pictures of dogs. It\u2019s a task with limited scope that seems almost quaint compared to OpenAI\u2019s latest feats.<\/p>\n<p>Zero-shot learning, on the other hand, is the ability of models to perform tasks that they weren\u2019t specifically trained to do. For example, DALL\u00b7E was trained to generate images from captions. But with the right text prompt, it can also transform images into sketches:<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" alt=\"\" width=\"1436\" height=\"1438\" class=\" lazy\" src=\"https:\/\/daleonai.com\/images\/screen-shot-2021-01-06-at-1.41.02-pm.png\" data-lazy=\"true\"\/><figcaption>Results from the prompt, \u201cthe exact same cat on the top as a sketch on the bottom\u201d. 
From https:\/\/openai.com\/blog\/dall-e\/<\/figcaption><\/figure>\n<p>DALL\u00b7E can also render custom text on street signs:<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" alt=\"\" width=\"1174\" height=\"1172\" class=\" lazy\" src=\"https:\/\/daleonai.com\/images\/screen-shot-2021-01-06-at-2.51.53-pm.png\" data-lazy=\"true\"\/><figcaption>Results from the prompt\u00a0\u201cA store front that has the word \u2018openai\u2019 written on it.\u201d From https:\/\/openai.com\/blog\/dall-e\/.<\/figcaption><\/figure>\n<p>In this way, DALL\u00b7E can act almost like a Photoshop filter, even though it wasn\u2019t specifically designed to behave this way.<\/p>\n<p>The model even shows an \u201cunderstanding\u201d of visual concepts (e.g. \u201cmacroscopic\u201d or \u201ccross-section\u201d pictures), places (e.g. 
\u201ca photo of the food of china\u201d), and time (\u201ca photo of alamo square, san francisco, from a street at night\u201d; \u201ca photo of a phone from the 20s\u201d). For example, here\u2019s what it spit out in response to the prompt \u201ca photo of the food of china\u201d:<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" alt=\"\" width=\"1444\" height=\"860\" class=\" lazy\" src=\"https:\/\/daleonai.com\/images\/screen-shot-2021-01-06-at-1.42.22-pm.png\" data-lazy=\"true\"\/><figcaption>\u201ca photo of the food of china\u201d from https:\/\/openai.com\/blog\/dall-e\/.<\/figcaption><\/figure>\n<p>In other words, DALL\u00b7E can do more than just paint a pretty picture for a caption; it can also, in a sense, answer questions visually.<\/p>\n<p>To test DALL\u00b7E\u2019s visual reasoning ability, the authors had it take a visual IQ test. 
In the examples below, the model had to complete the lower right corner of the grid, following the test\u2019s hidden pattern.<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" alt=\"\" width=\"1464\" height=\"896\" class=\" lazy\" src=\"https:\/\/daleonai.com\/images\/screen-shot-2021-01-07-at-1.22.26-pm.png\" data-lazy=\"true\"\/><figcaption>A screenshot of the visual IQ test OpenAI used to test DALL\u00b7E\u00a0from https:\/\/openai.com\/blog\/dall-e\/.<\/figcaption><\/figure>\n<p>\u201cDALL\u00b7E is often able to solve matrices that involve continuing simple patterns or basic geometric reasoning,\u201d write the authors, but it did better at some problems than others. 
When the puzzles\u2019 colors were inverted, DALL\u00b7E did worse, \u201csuggesting its capabilities may be brittle in unexpected ways.\u201d<\/p>\n<h2 id=\"what-does-it-mean\"><span class=\"ez-toc-section\" id=\"What_does_it_mean\"><\/span>What does it mean?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>What strikes me the most about DALL\u00b7E is its ability to perform surprisingly well on so many different tasks, ones the authors didn\u2019t even anticipate:<\/p>\n<p>\u201cWe find that DALL\u00b7E [\u2026] is able to perform several kinds of image-to-image translation tasks when prompted in the right\u00a0way.<\/p>\n<p>We did not anticipate that this capability would emerge, and made no modifications to the neural network or training procedure to encourage it.\u201d<\/p>\n<p>It\u2019s amazing, but not wholly unexpected; DALL\u00b7E and GPT-3 are two examples of a greater theme in deep learning: that extraordinarily big neural networks trained on unlabeled internet data (an example of \u201cself-supervised learning\u201d) can be highly versatile, able to do lots of things they weren\u2019t specifically designed for.<\/p>\n<p>Of course, don\u2019t mistake this for general intelligence. It\u2019s<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/lacker.io\/ai\/2020\/07\/06\/giving-gpt-3-a-turing-test.html\">not hard<\/a><span>\u00a0<\/span>to trick these types of models into looking pretty dumb. We\u2019ll know more when they\u2019re openly accessible and we can start playing around with them. 
But that doesn\u2019t mean I can\u2019t be excited in the meantime.<\/p>\n<p><i><span style=\"font-weight: 400;\">This <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/daleonai.com\/dalle-5-mins\">article<\/a> was written by <\/span><\/i><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/daleonai.com\/\"><i><span style=\"font-weight: 400;\">Dale Markowitz<\/span><\/i><\/a><i><span style=\"font-weight: 400;\">, an Applied AI Engineer at Google based in Austin, Texas, where she works on applying machine learning to new fields and industries. She also likes solving her own life problems with AI, and talks about it on YouTube.<\/span><\/i><\/p>\n<p class=\"c-post-pubDate\">\n                                    Published January 10, 2021 \u2014 11:00 UTC\n                                <\/p>\n<\/p><\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/neural\/2021\/01\/10\/heres-how-openais-magical-dall-e-generates-images-from-text-syndication\/\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>It seems like every few months, someone publishes a machine learning paper or demo that makes my jaw drop. This month, it\u2019s OpenAI\u2019s new image-generating model,\u00a0DALL\u00b7E. This behemoth 12-billion-parameter neural network takes a text caption (e.g. 
\u201can armchair in the shape of an avocado\u201d) and generates images&#8230;<\/p>\n","protected":false},"author":1,"featured_media":151212,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/01\/1-copy-13.jpg&signature=26c3b1942685e4f52d0c49fc5f48536f","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-151211","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/151211","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=151211"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/151211\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/151212"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=151211"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=151211"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=151211"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}