{"id":643056,"date":"2024-10-31T12:00:24","date_gmt":"2024-10-31T09:00:24","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/can-openais-strawberry-program-deceive-humans-2\/"},"modified":"2024-10-31T12:00:24","modified_gmt":"2024-10-31T09:00:24","slug":"can-openais-strawberry-program-deceive-humans-2","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/can-openais-strawberry-program-deceive-humans-2\/","title":{"rendered":"#Can OpenAI\u2019s Strawberry program deceive humans?"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_84 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<label for=\"ez-toc-cssicon-toggle-item-6a29a5da22a9f\" class=\"ez-toc-cssicon-toggle-label\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #dd3333;color:#dd3333\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #dd3333;color:#dd3333\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input type=\"checkbox\"  id=\"ez-toc-cssicon-toggle-item-6a29a5da22a9f\" checked aria-label=\"Toggle\" \/><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link 
ez-toc-heading-1\" href=\"https:\/\/buradabiliyorum.com\/en\/can-openais-strawberry-program-deceive-humans-2\/#True_intentions\" >True intentions<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/buradabiliyorum.com\/en\/can-openais-strawberry-program-deceive-humans-2\/#Powers_of_persuasion\" >Powers of persuasion<\/a><\/li><\/ul><\/nav><\/div>\n<div>\nOpenAI, the company that made ChatGPT, has launched a new artificial intelligence (AI) system called <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/openai.com\/index\/learning-to-reason-with-llms\/\">Strawberry<\/a>. It is designed not just to provide quick responses to questions, like ChatGPT, but to think or \u201creason\u201d.<\/p>\n<p>This raises several major concerns. If Strawberry really is capable of some form of reasoning, could this AI system cheat and deceive humans?<\/p>\n<p>OpenAI can program the AI in ways that mitigate its ability to manipulate humans. But <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/cdn.openai.com\/o1-system-card.pdf\">the company\u2019s own evaluations<\/a> rate it as a \u201cmedium risk\u201d for its ability to assist experts in the \u201coperational planning of reproducing a known biological threat\u201d \u2013 in other words, a biological weapon. It was also rated as a medium risk for its ability to persuade humans to change their thinking.<\/p>\n<p>It remains to be seen how such a system might be used by those with bad intentions, such as con artists or hackers. Nevertheless, OpenAI\u2019s evaluation states that medium-risk systems can be released for wider use \u2013 a position I believe is misguided.<\/p>\n<p>Strawberry is not one AI \u201cmodel\u201d, or program, but several \u2013 known collectively as o1. 
These models <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.theverge.com\/2024\/9\/12\/24242439\/openai-o1-model-reasoning-strawberry-chatgpt\">are intended to<\/a> answer complex questions and solve intricate maths problems. They are also capable of writing computer code \u2013 to help you make your own website or app, for example.<\/p>\n<p>An apparent ability to reason might come as a surprise to some, since this is generally considered a precursor to judgment and decision making \u2013 something that has often seemed a distant goal for AI. So, on the surface at least, it would seem to move artificial intelligence a step closer to human-like intelligence.<\/p>\n<p>When things look too good to be true, there\u2019s often a catch. Well, these new AI models are designed to maximise their goals. What does this mean in practice? 
To achieve its desired objective, the path or the strategy chosen by AI may <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.vox.com\/future-perfect\/371827\/openai-chatgpt-artificial-intelligence-ai-risk-strawberry\">not necessarily be fair<\/a>, or align with human values.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"True_intentions\"><\/span>True intentions<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>For example, if you were to play chess against Strawberry, in theory, could its reasoning allow it to <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.thestack.technology\/openais-unripe-strawberry-model-hacked-its-testing-infrastructure\/\">hack the scoring system<\/a> rather than figure out the best strategies for winning the game?<\/p>\n<p>The AI might also be able to lie to humans about its true intentions and capabilities, which would pose a serious safety concern if it were to be deployed widely. 
For example, if the AI knew it was infected with malware, could it \u201cchoose\u201d to <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.vox.com\/future-perfect\/371827\/openai-chatgpt-artificial-intelligence-ai-risk-strawberry\">conceal this fact<\/a> in the knowledge that a human operator might opt to disable the whole system if they knew?<\/p>\n<figure class=\"align-center \"><img decoding=\"async\" src=\"https:\/\/images.theconversation.com\/files\/621408\/original\/file-20240924-18-2b8gp0.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;fit=clip\" sizes=\"(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px\" srcset=\"https:\/\/images.theconversation.com\/files\/621408\/original\/file-20240924-18-2b8gp0.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=600&amp;h=400&amp;fit=crop&amp;dpr=1 600w, https:\/\/images.theconversation.com\/files\/621408\/original\/file-20240924-18-2b8gp0.jpg?ixlib=rb-4.1.0&amp;q=30&amp;auto=format&amp;w=600&amp;h=400&amp;fit=crop&amp;dpr=2 1200w, https:\/\/images.theconversation.com\/files\/621408\/original\/file-20240924-18-2b8gp0.jpg?ixlib=rb-4.1.0&amp;q=15&amp;auto=format&amp;w=600&amp;h=400&amp;fit=crop&amp;dpr=3 1800w, https:\/\/images.theconversation.com\/files\/621408\/original\/file-20240924-18-2b8gp0.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;h=503&amp;fit=crop&amp;dpr=1 754w, https:\/\/images.theconversation.com\/files\/621408\/original\/file-20240924-18-2b8gp0.jpg?ixlib=rb-4.1.0&amp;q=30&amp;auto=format&amp;w=754&amp;h=503&amp;fit=crop&amp;dpr=2 1508w, https:\/\/images.theconversation.com\/files\/621408\/original\/file-20240924-18-2b8gp0.jpg?ixlib=rb-4.1.0&amp;q=15&amp;auto=format&amp;w=754&amp;h=503&amp;fit=crop&amp;dpr=3 2262w\" alt=\"AI chatbot icons\"\/><figcaption><span class=\"caption\">Strawberry goes a step beyond the capabilities of AI chatbots.<\/span><br \/><span class=\"attribution\"><a rel=\"nofollow noopener\" target=\"_blank\" 
class=\"source\" href=\"https:\/\/www.shutterstock.com\/image-photo\/shanghaichinafeb-2024-google-gemini-openai-chatgpt-2426619081\">Robert Way \/ Shutterstock<\/a><\/span><\/figcaption><\/figure>\n<p>These would be classic examples of unethical AI behaviour, where cheating or deceiving is acceptable if it leads to a desired goal. It would also be quicker for the AI, as it wouldn\u2019t have to waste any time figuring out the next best move. It may not necessarily be morally correct, however.<\/p>\n<p>This leads to a rather interesting yet worrying discussion. What level of reasoning is Strawberry capable of and what could its unintended consequences be? A powerful AI system that\u2019s capable of cheating humans could pose serious ethical, legal and financial risks to us.<\/p>\n<p>Such risks become grave in critical situations, such as designing weapons of mass destruction. OpenAI rates its own Strawberry models as \u201cmedium risk\u201d for their potential to assist scientists in developing <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.nato.int\/cps\/en\/natohq\/official_texts_197768.htm\">chemical, biological, radiological and nuclear weapons<\/a>.<\/p>\n<p>OpenAI <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/cdn.openai.com\/o1-system-card.pdf\">says<\/a>: \u201cOur evaluations found that o1-preview and o1-mini can help experts with the operational planning of reproducing a known biological threat.\u201d But it goes on to say that experts already have significant expertise in these areas, so the risk would be limited in practice. 
It adds: \u201cThe models do not enable non-experts to create biological threats, because creating such a threat requires hands-on laboratory skills that the models cannot replace.\u201d<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Powers_of_persuasion\"><\/span>Powers of persuasion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>OpenAI\u2019s evaluation of Strawberry also investigated the risk that it could persuade humans to change their beliefs. The new o1 models were found to be more persuasive and more manipulative than ChatGPT.<\/p>\n<p>OpenAI also tested a mitigation system that was able to reduce the manipulative capabilities of the AI system. Overall, Strawberry was labelled a <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/cdn.openai.com\/o1-system-card.pdf\">medium risk for \u201cpersuasion\u201d<\/a> in OpenAI\u2019s tests.<\/p>\n<p>Strawberry was rated low risk for its ability to operate autonomously and on cybersecurity.<\/p>\n<p>OpenAI\u2019s policy states that \u201cmedium risk\u201d models can be released for wide use. In my view, this underestimates the threat. The deployment of such models could be catastrophic, especially if bad actors manipulate the technology for their own pursuits.<\/p>\n<p>This calls for strong checks and balances that will only be possible through AI regulation and legal frameworks, such as penalising incorrect risk assessments and the misuse of AI.<\/p>\n<p>The UK government stressed the need for \u201csafety, security and robustness\u201d in its 2023 AI white paper, but that\u2019s not nearly enough. There is an urgent need to prioritise human safety and devise rigorous scrutiny protocols for AI models such as Strawberry.<!-- Below is The Conversation's page counter tag. Please DO NOT REMOVE. 
--><img loading=\"lazy\" decoding=\"async\" style=\"border: none !important; box-shadow: none !important; margin: 0 !important; max-height: 1px !important; max-width: 1px !important; min-height: 1px !important; min-width: 1px !important; opacity: 0 !important; outline: none !important; padding: 0 !important;\" alt=\"The Conversation\" width=\"1\" height=\"1\" class=\"js-lazy\" src=\"https:\/\/counter.theconversation.com\/content\/239748\/count.gif?distributor=republish-lightbox-basic\"\/><!-- End of code. If you don't see any code above, please get new code from the Advanced tab after you click the republish button. The page counter does not collect any personal data. More info: https:\/\/theconversation.com\/republishing-guidelines --><\/p>\n<p><em><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/theconversation.com\/profiles\/shweta-singh-1289019\">Shweta Singh<\/a>, Assistant Professor, Information Systems and Management, <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/theconversation.com\/institutions\/warwick-business-school-university-of-warwick-2650\">Warwick Business School, University of Warwick<\/a><\/em><\/p>\n<p><em>This article is republished from <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/theconversation.com\">The Conversation<\/a> under a Creative Commons license. 
Read the <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/theconversation.com\/openais-strawberry-program-is-reportedly-capable-of-reasoning-it-might-be-able-to-deceive-humans-239748\">original article<\/a>.<\/em>\n<\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/can-openais-strawberry-program-deceive-humans\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>OpenAI, the company that made ChatGPT, has launched a new artificial intelligence (AI) system called Strawberry. It is designed not just to provide quick responses to questions, like ChatGPT, but to think or \u201creason\u201d. This raises several major concerns. 
If Strawberry really is capable of some form of reasoning, could this AI system cheat and&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-643056","post","type-post","status-publish","format-standard","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/643056","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=643056"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/643056\/revisions"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=643056"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=643056"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=643056"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}