{"id":301506,"date":"2021-07-17T11:00:26","date_gmt":"2021-07-17T08:00:26","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/dont-mistake-openai-codex-for-a-programmer\/"},"modified":"2021-07-17T11:00:26","modified_gmt":"2021-07-17T08:00:26","slug":"dont-mistake-openai-codex-for-a-programmer","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/dont-mistake-openai-codex-for-a-programmer\/","title":{"rendered":"Don\u2019t mistake OpenAI Codex for a programmer"},"content":{"rendered":"<p>&#8220;<strong>Don\u2019t mistake OpenAI Codex for a programmer<\/strong>&#8221;<\/p>\n<div>In a new paper, researchers at OpenAI have revealed details about Codex, a deep learning model that generates software source code. Codex powers Copilot, an \u201c<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/07\/05\/openai-github-gpt-3-copilot\/\">AI pair programmer<\/a>\u201d tool developed jointly by OpenAI and GitHub. Copilot is currently available in beta test mode to a limited number of users.<\/p>\n<p>The<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/2107.03374\">paper<\/a><span>\u00a0<\/span>is a fascinating read that explains the process through which the scientists at OpenAI managed to repurpose their flagship language model GPT-3 to create Codex. 
But more importantly, the paper also sheds much-needed light on how far you can trust deep learning in programming.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"The_%E2%80%9Cno_free_lunch%E2%80%9D_theorem\"><\/span>The \u201cno free lunch\u201d theorem<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Codex is a descendant of<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/08\/17\/openai-gpt-3-commercial-ai\/\">GPT-3<\/a>, a massive deep learning language model released last year. The complexity of<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/01\/28\/deep-learning-explainer\/\">deep learning models<\/a><span>\u00a0<\/span>is often measured by the number of parameters they have. In general, a model\u2019s learning capacity increases with the number of parameters. GPT-3 came with 175 billion parameters, more than two orders of magnitude larger than its predecessor,<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/02\/25\/openai-artificial-intelligence-fake-news\/\">GPT-2<\/a><span>\u00a0<\/span>(1.5 billion parameters). GPT-3 was trained on more than 600 gigabytes of data, more than 50 times the size of GPT-2\u2019s training dataset.<\/p>\n<p>Aside from the huge increase in size, the main innovation of GPT-3 was \u201c<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/08\/12\/what-is-one-shot-learning\/\">few-shot learning<\/a>,\u201d the capability to perform tasks it wasn\u2019t trained for. 
The<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/2005.14165v1\">paper that introduced GPT-3<\/a><span>\u00a0<\/span>was titled \u201cLanguage Models are Few-Shot Learners\u201d and stated: \u201cHere we show<span>\u00a0<\/span><strong>that scaling up language models greatly improves task-agnostic, few-shot performance<\/strong><span>\u00a0<\/span>[emphasis mine], sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.\u201d<\/p>\n<p>Basically, the premise was that a large-enough model trained on a large corpus of text can match or outperform several models that are specialized for specific tasks.<\/p>\n<p>But according to the new paper by OpenAI, none of the various versions of GPT-3 were able to solve any of the coding problems used to evaluate Codex. To be fair, there were no coding samples in GPT-3\u2019s training dataset, so we can\u2019t expect it to be able to code. But the OpenAI scientists also tested GPT-J, a 6-billion-parameter model trained on<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/2101.00027\">The Pile<\/a>, an 800-gigabyte dataset that includes 95 gigabytes of GitHub and 32 gigabytes of StackExchange data. GPT-J solved 11.4 percent of the coding problems. Codex, a 12-billion-parameter version of GPT-3 fine-tuned on 159 gigabytes of code examples from GitHub, solved 28.8 percent of the problems. 
A separate version of Codex, called Codex-S, which was fine-tuned through supervised learning boosted the performance to 37.7 percent (other GPT and Codex models are trained through<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/02\/10\/unsupervised-learning-vs-supervised-learning\/\">unsupervised learning<\/a>).<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-1360951 js-lazy\" alt=\"Codex can solve a large number of coding challenges. A version of the model finetuned with supervised learning (Codex-S) further improves performance.\" width=\"644\" height=\"420\" sizes=\"auto, (max-width: 644px) 100vw, 644px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1.jpeg\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1.jpeg 644w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1-280x183.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1-414x270.jpeg 414w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1-207x135.jpeg 207w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/thenextweb.com\/news\/#\" data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2021%2F07%2F17%2Fdont-mistake-openai-codex-for-a-programmer-syndication%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: Codex can solve a large number of coding challenges. A version of the model fine-tuned with supervised learning (Codex-S) further improves performance.\" data-title=\"Share Codex can solve a large number of coding challenges. A version of the model fine-tuned with supervised learning (Codex-S) further improves performance. 
on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share Codex can solve a large number of coding challenges. A version of the model fine-tuned with supervised learning (Codex-S) further improves performance. on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>Codex can solve a large number of coding challenges. A version of the model fine-tuned with supervised learning (Codex-S) further improves performance.<\/figcaption><noscript><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-1360951\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1.jpeg\" alt=\"Codex can solve a large number of coding challenges. A version of the model finetuned with supervised learning (Codex-S) further improves performance.\" width=\"644\" height=\"420\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1.jpeg 644w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1-280x183.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1-414x270.jpeg 414w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD1-1-207x135.jpeg 207w\"\/><\/noscript><\/figure>\n<p>Codex proves that machine learning is still ruled by the \u201c<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/en.wikipedia.org\/wiki\/No_free_lunch_theorem\">no free lunch<\/a>\u201d theorem (NFL), which means that generalization comes at the cost of performance. In other words, machine learning models are more accurate when they are designed to solve one specific problem. 
On the other hand, when their problem domain is broadened, their performance decreases.<\/p>\n<p>Codex can perform one specialized task (transforming function descriptions and signatures into source code) with high accuracy at the cost of poor<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2018\/02\/20\/ai-machine-learning-nlg-nlp\/\">natural language processing<\/a><span>\u00a0<\/span>capabilities. On the other hand, GPT-3 is a general language model that can generate decent text about a lot of topics (including complicated programming concepts) but can\u2019t write a single line of code.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Size_vs_cost\"><\/span>Size vs cost<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<div class=\"wp-block-image\">\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1360952 js-lazy\" alt=\"Coins\" width=\"696\" height=\"464\" sizes=\"auto, (max-width: 696px) 100vw, 696px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1.jpeg\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1-280x187.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1-405x270.jpeg 405w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1-203x135.jpeg 203w\"\/><noscript><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1360952\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1.jpeg\" alt=\"Coins\" width=\"696\" height=\"464\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1-280x187.jpeg 280w, 
https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1-405x270.jpeg 405w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD2-1-203x135.jpeg 203w\"\/><\/noscript><\/figure>\n<\/div>\n<p>The experiments of OpenAI\u2019s researchers show that the performance of Codex improved as they increased the size of the machine learning model. At 300 million parameters, Codex solved 13.2 percent of the evaluation problems against the 28.8 percent performance of the 12-billion-parameter model.<\/p>\n<p>But the full version of GPT-3 is 175 billion parameters, a full order of magnitude larger than the one used to create Codex. Wouldn\u2019t training the larger model on the Codex training data yield better results?<\/p>\n<p>One probable reason for stopping at 12 billion could be the dataset size. A larger Codex model would need a larger dataset. Training it on the 159-gigabyte corpus would probably cause overfitting, where the model becomes very good at memorizing and rehearsing its training examples and very bad at dealing with novel situations. Gathering and maintaining larger datasets is an expensive and time-consuming process.<\/p>\n<p>An equally vexing problem would be the cost of Codex. Aside from a scientific experiment, Codex was supposed to become the backbone of a future product that can turn in profits for a research lab that is<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/05\/31\/microsoft-gpt-3-and-the-future-of-openai\/\">quasi-owned<\/a><span>\u00a0<\/span>by a commercial entity. 
As I\u2019ve discussed before, the costs of training and running the 175-billion-parameter GPT-3 model would make it very hard to develop<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/09\/21\/gpt-3-economy-business-model\/\">a profitable business model<\/a><span>\u00a0<\/span>around it.<\/p>\n<p>However, a smaller but fine-tuned version of GPT-3 would be much more manageable in terms of profits and losses.<\/p>\n<p>Finally, as OpenAI\u2019s experiments show, Codex\u2019s performance scales roughly logarithmically with model size. This means that performance gains shrink as you increase the size of the model. Therefore, the added costs of gathering data and of training and running the larger model might not be worth the small performance boost.<\/p>\n<p>And note that code generation is a very lucrative market. Given the high hourly rates of programmers, even saving a few hours\u2019 worth of coding time per month would be enough to cover the subscription fees of Codex. 
In other domains where labor is less expensive, automating tasks with large language models will be more challenging from a profit and loss perspective.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Generating_vs_understanding_code\"><\/span>Generating vs understanding code<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<div class=\"wp-block-image\">\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-1360953 js-lazy\" alt=\"Github Copilot\" width=\"696\" height=\"392\" sizes=\"auto, (max-width: 696px) 100vw, 696px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1.jpeg\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1-280x158.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1-479x270.jpeg 479w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1-240x135.jpeg 240w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/thenextweb.com\/news\/#\" data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2021%2F07%2F17%2Fdont-mistake-openai-codex-for-a-programmer-syndication%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: GitHub Copilot.\" data-title=\"Share GitHub Copilot. on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share GitHub Copilot. 
on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>GitHub Copilot.<\/figcaption><noscript><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-1360953\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1.jpeg\" alt=\"Github Copilot\" width=\"696\" height=\"392\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1-280x158.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1-479x270.jpeg 479w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD3-1-240x135.jpeg 240w\"\/><\/noscript><\/figure>\n<p>One thing to keep in mind is that, no matter how fascinating Codex\u2019s output is, the deep learning model does not understand programming. Like all other\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/07\/12\/linguistics-for-the-age-of-ai\/\">deep learning\u2013based language models<\/a>, Codex is capturing statistical correlations between code fragments.<\/p>\n<p>In their paper, the OpenAI scientists acknowledge that Codex \u201cis not sample efficient to train\u201d and that \u201ceven seasoned developers do not encounter anywhere near this amount of code over their careers.\u201d<\/p>\n<p>They further add that \u201ca strong student who completes an introductory computer science course is expected to be able to solve a larger fraction of problems than Codex-12B.\u201d<\/p>\n<p>Here\u2019s an interesting excerpt from the paper: \u201cWe sample tokens from Codex until we encounter one of the following stop sequences: \u2018\\nclass\u2019, \u2018\\ndef\u2019, \u2018\\n#\u2019, \u2018\\nif\u2019, or \u2018\\nprint\u2019, since 
the model will continue generating additional functions or statements otherwise.\u201d<\/p>\n<p>This means that Codex will mindlessly continue to generate code even if it has already finished the block that addresses the problem stated in the prompt.<\/p>\n<p>This is a scheme that works well when you want to solve simple problems that recur time and again. But when you zoom out and try to write a large program that tackles a problem that must be solved in multiple steps, the limits of Codex become evident.<\/p>\n<p>OpenAI\u2019s scientists found that as the number of components in the function description increased, the model\u2019s performance decreased exponentially.<\/p>\n<p>\u201cThis behavior is uncharacteristic of a human programmer, who should be able to correctly implement a program for a chain of arbitrary length if they can do so for a chain of length two,\u201d the researchers write in their paper.<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-1360954 js-lazy\" alt=\"OpenAI\u2019s Codex fails at coding problems that require the synthesis of multiple components.\" width=\"657\" height=\"420\" sizes=\"auto, (max-width: 657px) 100vw, 657px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1.jpeg\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1.jpeg 657w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1-280x179.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1-422x270.jpeg 422w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1-211x135.jpeg 211w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/thenextweb.com\/news\/#\" 
data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2021%2F07%2F17%2Fdont-mistake-openai-codex-for-a-programmer-syndication%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: OpenAI\u2019s Codex fails at coding problems that require the synthesis of multiple components.\" data-title=\"Share OpenAI\u2019s Codex fails at coding problems that require the synthesis of multiple components. on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share OpenAI\u2019s Codex fails at coding problems that require the synthesis of multiple components. on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>OpenAI\u2019s Codex fails at coding problems that require the synthesis of multiple components.<\/figcaption><noscript><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-1360954\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1.jpeg\" alt=\"OpenAI\u2019s Codex fails at coding problems that require the synthesis of multiple components.\" width=\"657\" height=\"420\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1.jpeg 657w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1-280x179.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1-422x270.jpeg 422w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD4-1-211x135.jpeg 211w\"\/><\/noscript><\/figure>\n<p>Further exposing Codex\u2019s lack of understanding of program structure and code is the fact that it \u201ccan recommend syntactically incorrect or undefined code, and can invoke functions, variables, and attributes that are undefined or outside the scope of the codebase,\u201d according to the paper. 
Practically, this means that in some cases, the machine learning model will stitch together different pieces of code it has previously seen, even if they don\u2019t fit together.<\/p>\n<p>In their paper, the researchers also discuss \u201cmisalignment\u201d issues in Codex, where the model is capable of solving a specific problem but doesn\u2019t do so. Codex uses the contents of the file you\u2019re working on as context to generate its output. If your code contains subtle bugs (which is quite normal if you\u2019re a human programmer), Codex may \u201cdeliberately\u201d suggest code that superficially appears good but is incorrect, the researchers warn.<\/p>\n<p>Misalignment is an interesting phenomenon that needs further study. But OpenAI\u2019s experiments further show that \u201cmisalignment would likely persist and even get worse if data, parameters, and training time were scaled up,\u201d which might be another reason for keeping the model\u2019s size at 12 billion parameters.<\/p>\n<p>The paper also talks extensively about the possibility for Codex to produce deprecated and vulnerable code (which is worthy of a separate article, so I didn\u2019t discuss it here).<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-1360955 js-lazy\" alt=\"OpenAI\u2019s Codex might make deliberate mistakes if the context of the prompt contains subtle mistakes\" width=\"696\" height=\"399\" sizes=\"auto, (max-width: 696px) 100vw, 696px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5.jpeg\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5-280x161.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5-471x270.jpeg 471w, 
https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5-235x135.jpeg 235w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/thenextweb.com\/news\/#\" data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2021%2F07%2F17%2Fdont-mistake-openai-codex-for-a-programmer-syndication%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: OpenAI\u2019s Codex might make deliberate mistakes if the context of the prompt contains subtle mistakes.\" data-title=\"Share OpenAI\u2019s Codex might make deliberate mistakes if the context of the prompt contains subtle mistakes. on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share OpenAI\u2019s Codex might make deliberate mistakes if the context of the prompt contains subtle mistakes. on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>OpenAI\u2019s Codex might make deliberate mistakes if the context of the prompt contains subtle mistakes.<\/figcaption><noscript><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-1360955\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5.jpeg\" alt=\"OpenAI\u2019s Codex might make deliberate mistakes if the context of the prompt contains subtle mistakes\" width=\"696\" height=\"399\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5-280x161.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5-471x270.jpeg 471w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BD5-235x135.jpeg 235w\"\/><\/noscript><\/figure>\n<h2><span class=\"ez-toc-section\" id=\"Responsible_use_and_reporting_of_AI\"><\/span>Responsible use and reporting of AI<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><a rel=\"nofollow 
noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/07\/05\/openai-github-gpt-3-copilot\/\">As I said after the release of Copilot<\/a>, \u201cAI Pair Programmer,\u201d the term used on GitHub\u2019s webpage for Copilot, is inaccurate.<\/p>\n<p>Codex is not a programmer. And it\u2019s also not going to take your job (if you\u2019re a programmer). Coding is just part of what programmers do. OpenAI\u2019s scientists observe that in its current state Codex \u201cmay somewhat reduce the cost of producing software by increasing programmer productivity,\u201d but it won\u2019t replace the other tasks that software developers regularly do, such as \u201cconferring with colleagues, writing design specifications, and upgrading existing software stacks.\u201d<\/p>\n<p>Mistaking Codex for a programmer can also lead to \u201cover-reliance,\u201d where a programmer blindly approves any code generated by the model without revising it. Given the obvious and subtle mistakes Codex can make, overlooking this threat can entail quality and security risks. \u201cHuman oversight and vigilance is required for safe use of code generation systems like Codex,\u201d OpenAI\u2019s researchers warn in their paper.<\/p>\n<p>Overall, the reaction of the programmer community shows that Codex is a very useful tool with a possibly huge impact on the future of the software industry. At the same time, given the hype surrounding the release of Copilot, it is important to understand its unwanted implications. 
In this regard, it is worth commending the folks at OpenAI for responsibly studying, documenting, and reporting the limits and threats of Codex.<\/p>\n<\/div>\n<p><i><span>This article was originally published by Ben Dickson on\u00a0<\/span><\/i><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/\"><i><span>TechTalks<\/span><\/i><\/a><i><span>, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech, and what we need to look out for. You can read the original article <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/07\/15\/openai-codex-ai-programming\/\">here<\/a>.<\/span><\/i><\/p>\n<\/div>\n
<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/dont-mistake-openai-codex-for-a-programmer-syndication\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;Don\u2019t mistake OpenAI Codex for a programmer&#8221; In a new paper, researchers at OpenAI have revealed details about Codex, a deep learning model that generates software source code. Codex powers Copilot, an \u201cAI pair programmer\u201d tool developed jointly by OpenAI and GitHub. 
Copilot is currently available in beta test mode to a limited number of&#8230;<\/p>\n","protected":false},"author":1,"featured_media":301507,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/07\/BDHed1.jpg&signature=23f73d04f4dabeaafce68bb39236eaba","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-301506","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/301506","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=301506"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/301506\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/301507"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=301506"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=301506"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=301506"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}