{"id":432670,"date":"2022-04-16T14:00:50","date_gmt":"2022-04-16T11:00:50","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/dall-e-2-shows-the-power-of-generative-deep-learning-but-raises-dispute-over-ai-practices\/"},"modified":"2022-04-16T14:00:50","modified_gmt":"2022-04-16T11:00:50","slug":"dall-e-2-shows-the-power-of-generative-deep-learning-but-raises-dispute-over-ai-practices","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/dall-e-2-shows-the-power-of-generative-deep-learning-but-raises-dispute-over-ai-practices\/","title":{"rendered":"#DALL-E 2 shows the power of generative deep learning, but raises dispute over AI practices"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_85 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<label for=\"ez-toc-cssicon-toggle-item-6a3ad7298a044\" class=\"ez-toc-cssicon-toggle-label\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #dd3333;color:#dd3333\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #dd3333;color:#dd3333\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input type=\"checkbox\"  id=\"ez-toc-cssicon-toggle-item-6a3ad7298a044\" checked 
aria-label=\"Toggle\" \/><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-1'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/buradabiliyorum.com\/en\/dall-e-2-shows-the-power-of-generative-deep-learning-but-raises-dispute-over-ai-practices\/#%E2%80%9CDALL-E_2_shows_the_power_of_generative_deep_learning_but_raises_dispute_over_AI_practices%E2%80%9D\" >&#8220;DALL-E 2 shows the power of generative deep learning, but raises dispute over AI practices&#8221;<\/a><ul class='ez-toc-list-level-2' ><li class='ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/buradabiliyorum.com\/en\/dall-e-2-shows-the-power-of-generative-deep-learning-but-raises-dispute-over-ai-practices\/#The_beauty_of_DALL-E_2\" >The beauty of DALL-E 2<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/buradabiliyorum.com\/en\/dall-e-2-shows-the-power-of-generative-deep-learning-but-raises-dispute-over-ai-practices\/#The_science_behind_DALL-E_2\" >The science behind DALL-E 2<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/buradabiliyorum.com\/en\/dall-e-2-shows-the-power-of-generative-deep-learning-but-raises-dispute-over-ai-practices\/#Disputes_over_deep_learning_and_AI_research\" >Disputes over deep learning and AI research<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/buradabiliyorum.com\/en\/dall-e-2-shows-the-power-of-generative-deep-learning-but-raises-dispute-over-ai-practices\/#The_business_case_for_DALL-E_2\" >The business case for DALL-E 2<\/a><\/li><\/ul><\/li><\/ul><\/nav><\/div>\n<h1><span class=\"ez-toc-section\" id=\"%E2%80%9CDALL-E_2_shows_the_power_of_generative_deep_learning_but_raises_dispute_over_AI_practices%E2%80%9D\"><\/span>&#8220;DALL-E 2 shows the power of generative deep learning, 
but raises dispute over AI practices&#8221;<span class=\"ez-toc-section-end\"><\/span><\/h1>\n<div>\n                            <em>This article is part of our\u00a0coverage of the latest in\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/tag\/ai-research-papers\/\">AI research<\/a>.<\/em><\/p>\n<p>Artificial intelligence research lab OpenAI made headlines again, this time with DALL-E 2, a machine learning model that can generate stunning images from text descriptions. DALL-E 2 builds on the success of its predecessor DALL-E and improves the quality and resolution of the output images thanks to advanced deep learning techniques.<\/p>\n<p>The announcement of DALL-E 2 was accompanied by a social media campaign by OpenAI\u2019s engineers and its CEO, Sam Altman, who shared wonderful photos created by the generative machine learning model on Twitter.<\/p>\n<p>DALL-E 2 shows how far the AI research community has come toward harnessing the power of deep learning and addressing some of its limits. It also offers a glimpse of how generative deep learning models might finally unlock new creative applications for everyone to use. 
At the same time, it reminds us of some of the obstacles that remain in AI research and disputes that need to be settled.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"The_beauty_of_DALL-E_2\"><\/span>The beauty of DALL-E 2<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Like other milestone OpenAI announcements, DALL-E 2 comes with a<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/cdn.openai.com\/papers\/dall-e-2.pdf\">detailed paper<\/a><span>\u00a0<\/span>and an<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/openai.com\/dall-e-2\/\">interactive blog post<\/a><span>\u00a0<\/span>that shows how the machine learning model works. There\u2019s also a video that provides an overview of what the <a href=\"https:\/\/buradabiliyorum.com\/en\/category\/technology\/\" data-internallinksmanager029f6b8e52c=\"4\" title=\"Technology\" target=\"_blank\" rel=\"noopener\">technology<\/a> is capable of doing and what its limitations are.<\/p>\n<p><iframe loading=\"lazy\" title=\"YouTube video player\" srcdoc=\"&lt;style&gt;*{padding:0;margin:0;overflow:hidden}html,body{background:#000;height:100%}img{position:absolute;top:0;left:0;width:100%;height:100%;object-fit:cover;transition:opacity .1s cubic-bezier(0.4,0,1,1)}a:hover img+img{opacity:1!important}&lt;\/style&gt;&lt;a href=\" https:=\"\" src=\"https:\/\/img.youtube.com\/vi\/qTgPSKKjfVg\/hqdefault.jpg\" style=\"top: 50%;left:50%;width:68px;height:48px;transform:translate3d(-50%,-50%,0)\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p>\u00a0<\/p>\n<p>DALL-E 2 is a \u201cgenerative model,\u201d a special branch of machine learning that creates complex output instead of performing prediction or classification tasks on input data. 
You provide DALL-E 2 with a text description, and it generates an image that fits the description.<\/p>\n<p>Generative models are a hot area of research that received much attention with the introduction of<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2018\/05\/28\/generative-adversarial-networks-artificial-intelligence-ian-goodfellow\/\">generative adversarial networks<\/a><span>\u00a0<\/span>(GAN) in 2014. The field has seen tremendous improvements in recent years, and generative models have been used for a vast variety of tasks, including creating artificial faces,<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/09\/04\/what-is-deepfake\/\">deepfakes<\/a>, synthesized voices, and more.<\/p>\n<p>However, what sets DALL-E 2 apart from other generative models is its capability to maintain semantic consistency in the images it creates.<\/p>\n<p>For example, the following images (from the DALL-E 2 blog post) are generated from the description \u201cAn astronaut riding a horse.\u201d One of the descriptions ends with \u201cas a pencil drawing\u201d and the other \u201cin photorealistic style.\u201d<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-1384782\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.11.png\" alt=\"DALL-E 2\" width=\"886\" height=\"474\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.11.png 690w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.11-280x150.png 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.11-252x135.png 252w, 
https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.11-505x270.png 505w\" sizes=\"auto, (max-width: 886px) 100vw, 886px\"\/><\/figure>\n<\/div>\n<p>\u00a0<\/p>\n<p>The model remains consistent in drawing the astronaut sitting on the back of the horse and holding his\/her hands in front. This kind of consistency shows itself in most examples OpenAI has shared.<\/p>\n<p>The following examples (also from OpenAI\u2019s website) show another feature of DALL-E 2, which is to generate variations of an input image. Here, instead of providing DALL-E 2 with a text description, you provide it with an image, and it tries to generate other forms of the same image. Here, DALL-E maintains the relations between the elements in the image, including the girl, the laptop, the headphones, the cat, the city lights in the background, and the night sky with moon and clouds.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-1384784\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.27.png\" alt=\"DALL-E 2\" width=\"904\" height=\"480\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.27.png 691w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.27-280x149.png 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.27-254x135.png 254w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.27-508x270.png 508w\" sizes=\"auto, (max-width: 904px) 100vw, 904px\"\/><\/figure>\n<\/div>\n<p>\u00a0<\/p>\n<p>Other examples suggest that DALL-E 2 seems to understand depth and dimensionality, a great challenge for algorithms that process 2D images.<\/p>\n<p>Even if the examples 
on OpenAI\u2019s website were cherry-picked, they are impressive. And the examples shared on Twitter show that DALL-E 2 seems to have found a way to represent and reproduce the relationships between the elements that appear in an image, even when it is \u201cdreaming up\u201d something for the first time.<\/p>\n<blockquote class=\"twitter-tweet\">\n<p dir=\"ltr\" lang=\"en\">\u201ca raccoon astronaut with the cosmos reflecting on the glass of his helmet dreaming of the stars\u201d<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/twitter.com\/OpenAI?ref_src=twsrc%5Etfw\">@OpenAI<\/a> DALL-E 2 <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/t.co\/HkGDtVlOWX\">pic.twitter.com\/HkGDtVlOWX<\/a><\/p>\n<p>\u2014 Andrew Mayne (@AndrewMayne) <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/twitter.com\/AndrewMayne\/status\/1511827454536474626?ref_src=twsrc%5Etfw\">April 6, 2022<\/a><\/p>\n<\/blockquote>\n<figure class=\"wp-block-embed aligncenter is-type-rich is-provider-twitter wp-block-embed-twitter\">\n<\/figure>\n<p>In fact, to prove how good DALL-E 2 is, Altman took to Twitter and asked users to suggest prompts to feed to the generative model. The results (see the thread below) are fascinating.<\/p>\n<blockquote class=\"twitter-tweet\">\n<p dir=\"ltr\" lang=\"und\"><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/t.co\/VCA4WP6bkG\">pic.twitter.com\/VCA4WP6bkG<\/a><\/p>\n<p>\u2014 Sam Altman (@sama) <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/twitter.com\/sama\/status\/1511732024028983298?ref_src=twsrc%5Etfw\">April 6, 2022<\/a><\/p>\n<\/blockquote>\n<h2><span class=\"ez-toc-section\" id=\"The_science_behind_DALL-E_2\"><\/span>The science behind DALL-E 2<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>DALL-E 2 takes advantage of CLIP and diffusion models, two advanced deep learning techniques created in the past few years. 
But at its heart, it shares the same concept as all other<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/01\/28\/deep-learning-explainer\/\">deep neural networks<\/a>: representation learning.<\/p>\n<div class=\"wp-block-columns\">\nConsider an image classification model. The neural network transforms pixel colors into a set of numbers that represent the image\u2019s features. This vector is sometimes also called the \u201cembedding\u201d of the input. Those features are then mapped to the output layer, which contains a probability score for each class of image that the model is supposed to detect. During training, the neural network tries to learn the best feature representations that discriminate between the classes.\n<\/div>\n<p>Ideally, the machine learning model should be able to learn latent features that remain consistent across different lighting conditions, angles, and background environments. But in practice, deep learning models often learn the wrong representations. For example, a neural network might think that green pixels are a feature of the \u201csheep\u201d class because all the images of sheep it has seen during training contain a lot of grass. Another model that has been trained on pictures of bats taken during the night might consider darkness a feature of all bat pictures and misclassify pictures of bats taken during the day. Other models might become sensitive to objects being centered in the image and placed in front of a certain type of background.<\/p>\n<p>Learning the wrong representations is partly why neural networks are brittle, sensitive to changes in the environment, and poor at generalizing beyond their training data. 
It is also why neural networks trained for one application need to be<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/06\/10\/what-is-transfer-learning\/\">finetuned for other applications<\/a> \u2014 the features of the final layers of the neural network are usually very task-specific and can\u2019t generalize to other applications.<\/p>\n<p>In theory, you could create a huge training dataset that contains all kinds of variations of data that the neural network should be able to handle. But creating and labeling such a dataset would require immense human effort and is practically impossible.<\/p>\n<p>This is the problem that<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/openai.com\/blog\/clip\/\">Contrastive Language-Image Pre-training<\/a><span>\u00a0<\/span>(CLIP) addresses. CLIP trains two neural networks in parallel on images and their captions. One of the networks learns the visual representations in the image and the other learns the representations of the corresponding text. 
During training, the two networks adjust their parameters so that an image and its matching description produce similar embeddings, while mismatched pairs produce dissimilar ones.<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-1384786 js-lazy\" alt=\"DALL-E 2\" width=\"689\" height=\"428\" sizes=\"auto, (max-width: 689px) 100vw, 689px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41.png\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41.png 689w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41-280x174.png 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41-217x135.png 217w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41-435x270.png 435w\"\/><figcaption>Contrastive Language-Image Pre-training (CLIP)<\/figcaption><noscript><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-1384786\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41.png\" alt=\"DALL-E 2\" width=\"689\" height=\"428\" 
srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41.png 689w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41-280x174.png 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41-217x135.png 217w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.41-435x270.png 435w\"\/><\/noscript><\/figure>\n<p>One of the main benefits of CLIP is that it does not need its training data to be labeled for a specific application. It can be trained on the huge number of images and loose descriptions that can be found on the web. Additionally, without the rigid boundaries of classic categories, CLIP can learn more flexible representations and generalize to a wide variety of tasks. For example, if one image is described as \u201ca boy hugging a puppy\u201d and another is described as \u201ca boy riding a pony,\u201d the model will be able to learn a more robust representation of what a \u201cboy\u201d is and how it relates to other elements in images.<\/p>\n<p>CLIP has already proven to be very useful for<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/08\/12\/what-is-one-shot-learning\/\">zero-shot and few-shot learning<\/a>, where a machine learning model performs tasks it was not explicitly trained for, given no examples or only a handful of them.<\/p>\n<p>The other machine learning technique used in DALL-E 2 is \u201cdiffusion,\u201d a kind of generative model that learns to create images by gradually noising and denoising its training examples.<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/benanne.github.io\/2022\/01\/31\/diffusion.html\">Diffusion models are like autoencoders<\/a>, which transform input data into an embedding representation and then reproduce the original data from the 
embedding information.<\/p>\n<p>DALL-E 2 trains a CLIP model on images and captions. It then uses the CLIP model to train the diffusion model. Basically, the diffusion model uses the CLIP model to generate the embeddings for the text prompt and its corresponding image. It then tries to generate the image that corresponds to the text.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><figcaption>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-1384787 js-lazy\" alt=\"DALL-E 2\" width=\"911\" height=\"568\" sizes=\"auto, (max-width: 911px) 100vw, 911px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51.png\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51.png 690w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51-280x174.png 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51-217x135.png 217w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51-433x270.png 433w\"\/><figcaption>DALL-E 2 architecture<\/figcaption><noscript><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-1384787\" 
src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51.png\" alt=\"DALL-E 2\" width=\"911\" height=\"568\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51.png 690w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51-280x174.png 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51-217x135.png 217w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Screen-Shot-2022-04-15-at-15.33.51-433x270.png 433w\"\/><\/noscript><\/figure>\n<\/figcaption><\/figure>\n<\/div>\n<h2><span class=\"ez-toc-section\" id=\"Disputes_over_deep_learning_and_AI_research\"><\/span>Disputes over deep learning and AI research<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>For the moment, DALL-E 2 will only be made available to a limited number of users who have signed up for the waitlist. Since the release of<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/02\/25\/openai-artificial-intelligence-fake-news\/\">GPT-2<\/a>, OpenAI has been reluctant to release its AI models to the public. GPT-3, its most advanced language model, is only available<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/08\/17\/openai-gpt-3-commercial-ai\/\">through an API interface<\/a>. 
There\u2019s no access to the actual code and parameters of the model.<\/p>\n<p>OpenAI\u2019s policy of not releasing its models to the public has not sat well with the AI community and has attracted criticism from some renowned figures in the field.<\/p>\n<blockquote class=\"twitter-tweet\">\n<p dir=\"ltr\" lang=\"en\">The evolution of API for running cutting edge AI:<br \/>\u2013 run it on your own machine<br \/>\u2013 run it in the cloud<br \/>\u2013 apply pay for and query an api endpoint<br \/>\u2013 pretty please ask one of the authors to run it for you on Twitter<br \/>\ud83e\udd72<\/p>\n<p>\u2014 Andrej Karpathy (@karpathy) <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/twitter.com\/karpathy\/status\/1512117132716355590?ref_src=twsrc%5Etfw\">April 7, 2022<\/a><\/p>\n<\/blockquote>\n<p>DALL-E 2 has also resurfaced some of the longtime disagreements over the preferred approach toward<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/05\/13\/what-is-artificial-general-intelligence-agi\/\">artificial general intelligence<\/a>. 
OpenAI\u2019s latest innovation has certainly proven that with the right architecture and inductive biases, you can still squeeze more out of neural networks.<\/p>\n<p>Proponents of pure deep learning approaches jumped on the opportunity to slight their critics, taking aim at a recent essay by cognitive scientist Gary Marcus titled \u201c<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/nautil.us\/deep-learning-is-hitting-a-wall-14467\/\">Deep Learning is Hitting a Wall<\/a>.\u201d Marcus endorses a<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/03\/04\/gary-marcus-hybrid-ai\/\">hybrid approach<\/a><span>\u00a0<\/span>that combines neural networks with symbolic systems.<\/p>\n<blockquote class=\"twitter-tweet\">\n<p dir=\"ltr\" lang=\"en\">What I slightly\u2026respect?\u2026is the willingness to continue to double down in the face of increasing evidence over years and years, and to create such a public record of it. <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/t.co\/r3xbGctWeY\">https:\/\/t.co\/r3xbGctWeY<\/a><\/p>\n<p>\u2014 Sam Altman (@sama) <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/twitter.com\/sama\/status\/1512470178038124544?ref_src=twsrc%5Etfw\">April 8, 2022<\/a><\/p>\n<\/blockquote>\n<p>Based on the examples that have been shared by the OpenAI team, DALL-E 2 seems to manifest some of the commonsense capabilities that have<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/07\/26\/ai-visual-reasoning-agent-dataset\/\">so long been missing in deep learning<\/a><span>\u00a0<\/span>systems. But it remains to be seen how deep this commonsense and semantic stability goes, and how DALL-E 2 and its successors will deal with more complex concepts such as compositionality.<\/p>\n<p>The DALL-E 2 paper mentions some of the limitations of the model in generating text and complex scenes. 
Responding to the many tweets directed his way, Marcus pointed out that the DALL-E 2 paper in fact proves some of the points he has been making in his papers and essays.<\/p>\n<blockquote class=\"twitter-tweet\">\n<p dir=\"ltr\" lang=\"en\">Compositionality *is* the wall.<\/p>\n<p>Even \u201cred cube\u201d and \u201cblue cube\u201d on their own are represented unreliably; not one of ten images correctly captures the full phrasal description.<\/p>\n<p>The images are beautiful, but no match for the precision of language. <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/t.co\/uvoXUtETwi\">https:\/\/t.co\/uvoXUtETwi<\/a><\/p>\n<p>\u2014 Gary Marcus \ud83c\uddfa\ud83c\udde6 (@GaryMarcus) <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/twitter.com\/GaryMarcus\/status\/1512647983317151747?ref_src=twsrc%5Etfw\">April 9, 2022<\/a><\/p>\n<\/blockquote>\n<p>Some scientists have pointed out that despite the fascinating results of DALL-E 2, some of the key challenges of artificial intelligence remain unsolved. 
Melanie Mitchell, Professor of Complexity at the Santa Fe Institute and author of\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/01\/13\/melanie-mitchell-ai-guide-for-thinking-humans\/\"><em>Artificial Intelligence: A Guide For Thinking Humans<\/em><\/a>, raised some important questions in a Twitter thread.<\/p>\n<p>Mitchell referred to<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"http:\/\/www.foundalis.com\/res\/bps\/bpidx.htm\">Bongard problems<\/a>, a set of challenges that test the understanding of concepts such as sameness, adjacency, numerosity, concavity\/convexity, and closedness\/openness.<\/p>\n<blockquote class=\"twitter-tweet\">\n<p dir=\"ltr\" lang=\"en\">Very impressive\u2014indeed, awe-inspiring\u2014AI demos this last week, e.g., from OpenAI (image generation) and Google (text generation).<\/p>\n<p>These demos seem to convince many people that current AI is getting closer and closer to human-level intelligence. \ud83e\uddf5<\/p>\n<p>(1\/8)<\/p>\n<p>\u2014 Melanie Mitchell (@MelMitchell1) <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/twitter.com\/MelMitchell1\/status\/1512455146248224770?ref_src=twsrc%5Etfw\">April 8, 2022<\/a><\/p>\n<\/blockquote>\n<p>\u201cWe humans can solve these visual puzzles due to our core knowledge of basic concepts and our abilities of flexible abstraction and analogy,\u201d Mitchell tweeted. \u201cIf such an AI system were created, I would be convinced that the field is making real progress on human-level intelligence. 
Until then, I will admire the impressive products of machine learning and big data, but will not mistake them for progress toward general intelligence.\u201d<\/p>\n<h2><span class=\"ez-toc-section\" id=\"The_business_case_for_DALL-E_2\"><\/span>The business case for DALL-E 2<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Since switching from non-profit to a \u201ccapped profit\u201d structure, OpenAI has been trying to<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/05\/31\/microsoft-gpt-3-and-the-future-of-openai\/\">find the balance<\/a><span>\u00a0<\/span>between scientific research and product development. The company\u2019s strategic partnership with Microsoft has given it solid channels to monetize some of its technologies, including<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/11\/05\/microsoft-azure-openai-service-gpt-3\/\">GPT-3<\/a><span>\u00a0<\/span>and<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/08\/13\/openai-codex-api\/\">Codex<\/a>.<\/p>\n<p>In a<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/blog.samaltman.com\/dall-star-e-2\">blog<span>\u00a0<\/span><\/a>post, Altman suggested a possible DALL-E 2 product launch in the summer. Many analysts are already suggesting applications for DALL-E 2, such as creating graphics for articles (I could certainly use some for mine) and doing basic edits on images. DALL-E 2 will enable more people to express their creativity without the need for special skills with tools.<\/p>\n<p>Altman suggests that advances in AI are taking us toward \u201ca world in which good ideas are the limit for what we can do, not specific skills.\u201d<\/p>\n<p>In any case, the more interesting applications of DALL-E will surface as more and more users tinker with it. 
For example,<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/07\/05\/openai-github-gpt-3-copilot\/\">the idea for Copilot and Codex<\/a><span>\u00a0<\/span>emerged as users started using GPT-3 to generate source code for software.<\/p>\n<p>If OpenAI releases a paid API service a la GPT-3, then more and more people will be able to build apps with DALL-E 2 or integrate the technology into existing applications. But<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/01\/25\/gpt-3-startups-businesses\/\">as was the case with GPT-3<\/a>, building a business model around a potential DALL-E 2 product will have its own unique challenges. A lot of it will depend on the costs of training and running DALL-E 2, the details of which have not been published yet.<\/p>\n<p>And as the exclusive license holder to GPT-3\u2019s technology,<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/09\/24\/microsoft-openai-gpt-3-license\/\">Microsoft will be the main winner<\/a><span>\u00a0<\/span>of any innovation built on top of DALL-E 2 because it will be able to do it faster and cheaper. 
Like GPT-3, DALL-E 2 is a reminder that as the AI community continues to gravitate toward creating<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/11\/25\/ai-research-neural-networks-compute-costs\/\">larger neural networks trained on ever-larger training datasets<\/a>, power will continue to be consolidated in a few very wealthy companies that have the financial and technical resources needed for AI research.<\/p>\n<p><em>This article was originally published by Ben Dickson on<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/\">TechTalks<\/a>, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech, and what we need to look out for. You can read the original article<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2022\/04\/11\/openai-dall-e-2\/\">here<\/a>.<\/em>\n                        <\/div>\n<p><script async src=\"\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/dall-e-2-shows-power-of-generative-deep-learning-but-raises-dispute-over-ai-practices\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;DALL-E 2 shows the power of generative deep learning, but raises dispute over AI practices&#8221; This article is part of our\u00a0coverage of the latest in\u00a0AI research. Artificial intelligence research lab OpenAI made headlines again, this time with DALL-E 2, a machine learning model that can generate stunning images from text descriptions. 
DALL-E 2 builds on&#8230;<\/p>\n","protected":false},"author":1,"featured_media":432671,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/04\/Untitled-design-84-2.jpg&signature=3681cc767a6d909013fa0af8c6b3a6e3","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-432670","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/432670","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=432670"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/432670\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/432671"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=432670"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=432670"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=432670"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}