{"id":492689,"date":"2022-09-13T16:00:20","date_gmt":"2022-09-13T13:00:20","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/what-does-europes-approach-to-ai-mean-for-gpt-and-dall-e\/"},"modified":"2022-09-13T16:00:20","modified_gmt":"2022-09-13T13:00:20","slug":"what-does-europes-approach-to-ai-mean-for-gpt-and-dall-e","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/what-does-europes-approach-to-ai-mean-for-gpt-and-dall-e\/","title":{"rendered":"What does Europe&#8217;s approach to AI mean for GPT and DALL-E?"},"content":{"rendered":"<h1><span class=\"ez-toc-section\" id=\"%E2%80%9CWhat_does_Europes_approach_to_AI_mean_for_GPT_and_DALL-E%E2%80%9D\"><\/span>&#8220;What does Europe&#8217;s approach to AI mean for GPT and DALL-E?&#8221;<span class=\"ez-toc-section-end\"><\/span><\/h1>\n<div id=\"article-main-content\">\n                            <span style=\"font-weight: 400;\">The global AI explosion 
has supercharged the need for a common sense, human-centered methodology for dealing with data privacy and ownership. Leading the way is Europe\u2019s General Data Protection Regulation (GDPR), but there\u2019s more than just personally identifiable information (PII) at stake in the modern market.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">What about the data we generate as content and art? It\u2019s certainly not legal to copy someone else\u2019s work and then present it as your own. But there are AI systems that attempt to <\/span><i><span style=\"font-weight: 400;\">scrape<\/span><\/i><span style=\"font-weight: 400;\"> as much human-generated content from the web as possible in order to generate content that\u2019s similar.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Can GDPR or any other EU-centered policies protect this kind of content? 
As it turns out, like most things in the machine learning world, it depends on the data.\u00a0<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Privacy_vs_ownership\"><\/span><span style=\"font-weight: 400;\">Privacy vs ownership<\/span><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">GDPR\u2019s primary purpose is to protect European citizens from harmful actions and consequences related to the misuse, abuse, or exploitation of their private information. It\u2019s not much use to citizens (or organizations) when it comes to protecting intellectual property (IP).\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Unfortunately, the policies and regulations put in place to protect IP are, to the best of our knowledge, not equipped to cover data scraping and anonymization. 
That makes it difficult to understand exactly where the regulations apply when it comes to scraping the web for content.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">These techniques, and the data they obtain, are used to create massive databases for use in training large AI models such as OpenAI\u2019s GPT-3 and DALL-E 2 systems.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The only way to teach an AI to imitate humans is to expose it to human-generated data. And the more data you shove into an AI system, the more robust its output tends to be.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">It works like this: imagine you draw a picture of a flower and post it to an online forum for artists. Using scraping techniques, a tech outfit sucks up your image along with billions of others so it can create a massive dataset of artwork. 
The next time someone asks the AI to generate an image of a \u201cflower,\u201d there\u2019s a greater-than-zero possibility that your work will feature in the AI\u2019s interpretation of the prompt.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Whether such use would be ethical remains an open question.\u00a0<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Public_data_versus_PII\"><\/span><span style=\"font-weight: 400;\">Public data versus PII<\/span><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">While the GDPR\u2019s regulatory oversight could be described as far-reaching when it comes to protecting private information and giving Europeans the<\/span> <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/gdpr.eu\/right-to-be-forgotten\/\"><i><span style=\"font-weight: 400;\">right to erasure<\/span><\/i><\/a><span style=\"font-weight: 400;\">, it seemingly does very little to protect content from scraping. However, that doesn\u2019t mean GDPR and other EU regulations are entirely feckless in this regard.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Individuals and organizations have to follow very specific rules for scraping PII, lest they fall afoul of the law \u2014 something that can become quite costly.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As an example, it\u2019s becoming nigh impossible for Clearview AI, a company that builds facial recognition databases for government use by <\/span><i><span style=\"font-weight: 400;\">scraping<\/span><\/i><span style=\"font-weight: 400;\"> social media data, to conduct business in Europe. 
EU watchdogs from at least seven nations have either issued hefty fines already or recommended fines over the company\u2019s refusal to comply with GDPR and similar regulations.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">At the opposite end of the spectrum, companies such as Google, OpenAI, and Meta employ similar <\/span><i><span style=\"font-weight: 400;\">data scraping <\/span><\/i><span style=\"font-weight: 400;\">practices, either directly or via the purchase or use of scraped datasets, for many of their AI models without any repercussions. And, while big tech has faced its fair share of fines in Europe, very few of the infractions have involved data scraping.\u00a0<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Why_not_ban_scraping\"><\/span><span style=\"font-weight: 400;\">Why not ban scraping?\u00a0<\/span><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Scraping, on the surface, might seem like a practice with too much potential for misuse not to ban outright. 
However, for many organizations that rely on scraping, the data being obtained isn\u2019t necessarily \u201ccontent\u201d or \u201cPII,\u201d but information that can serve the public.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We reached out to the UK\u2019s agency for handling data privacy, the<\/span> <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/ico.org.uk\/\"><span style=\"font-weight: 400;\">Information Commissioner\u2019s Office<\/span><\/a><span style=\"font-weight: 400;\"> (ICO), to find out how they regulated scraping techniques and internet-scale datasets and to understand why it was so important not to over-regulate.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">A spokesperson for the ICO told TNW:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The use of publicly available information can bring many benefits, from research to developing new products, services and innovations \u2014 including in the AI space. However, where this information is personal data, it\u2019s important to understand that data protection law applies. 
This is the case whether the techniques used to collect the data involve scraping or anything else.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In other words, it\u2019s more about the kind of data being used than how it\u2019s gathered.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Whether you copy and paste images from Facebook profiles or use machine learning to scrape the web for labeled images, you\u2019re likely to run afoul of GDPR and other European privacy regulations if you build a facial recognition engine without consent from the people whose faces are in its database.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">But it\u2019s generally acceptable to scrape the internet for massive amounts of data as long as you either<\/span> <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/ico.org.uk\/media\/for-organisations\/documents\/2013559\/big-data-ai-ml-and-data-protection.pdf\"><span style=\"font-weight: 400;\">anonymize it<\/span><\/a><span style=\"font-weight: 400;\"> or ensure that there is no PII in the dataset.<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Further_gray_areas\"><\/span><span style=\"font-weight: 400;\">Further gray areas<\/span><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">However, even within the allowed use cases, there still exist some gray areas that do concern private information.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">GPT-2 and GPT-3, for example, are<\/span> <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/ai.googleblog.com\/2020\/12\/privacy-considerations-in-large.html\"><span style=\"font-weight: 400;\">known to occasionally output PII<\/span><\/a><span style=\"font-weight: 400;\"> in the form of addresses, phone numbers, and other information that\u2019s apparently baked into their corpus via large-scale training datasets.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Here, where it\u2019s evident that the 
company behind GPT-2 and GPT-3 is taking steps to mitigate this, GDPR and similar regulations are doing their job.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Simply put, we can either choose not to train large AI models or allow the companies training them the opportunity to explore edge cases and attempt to mitigate concerns.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">What might be needed is a GDUR, a General Data Use Regulation, something that could give clear guidelines on how human-generated content can legally be used in large datasets.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">At a minimum, it seems like it\u2019s worth having a conversation about whether European citizens should have as much right to have the content they create removed from datasets as their selfies and profile pics.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For now, in the UK and throughout the rest of Europe, it seems the right to erasure only extends to our PII. Anything we put online is likely to end up in some AI\u2019s training dataset.\u00a0 <\/span>\n                        <\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/what-does-europes-approach-data-privacy-mean-for-gpt-and-dall-e\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;What does Europe&#8217;s approach to AI mean for GPT and DALL-E?&#8221; The global AI explosion has supercharged the need for a common sense, human-centered methodology for dealing with data privacy and ownership. 
Leading the way is Europe\u2019s General Data Protection Regulation (GDPR), but there\u2019s more than just personally identifiable information (PII) at stake in the&#8230;<\/p>\n","protected":false},"author":1,"featured_media":492690,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2013\/12\/data-lock-encryption-security-e1661182810510.jpg&signature=1065917807450555e362c220bb570e5c","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-492689","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/492689","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=492689"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/492689\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/492690"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=492689"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=492689"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=492689"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}