{"id":652755,"date":"2025-02-08T13:50:19","date_gmt":"2025-02-08T10:50:19","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/research-shows-ai-datasets-have-human-values-blind-spots\/"},"modified":"2025-02-08T13:50:19","modified_gmt":"2025-02-08T10:50:19","slug":"research-shows-ai-datasets-have-human-values-blind-spots","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/research-shows-ai-datasets-have-human-values-blind-spots\/","title":{"rendered":"Research shows AI datasets have human values blind spots"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_84 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<label for=\"ez-toc-cssicon-toggle-item-6a29da309dcd2\" class=\"ez-toc-cssicon-toggle-label\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #dd3333;color:#dd3333\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #dd3333;color:#dd3333\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input type=\"checkbox\"  id=\"ez-toc-cssicon-toggle-item-6a29da309dcd2\" checked aria-label=\"Toggle\" \/><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a 
class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/buradabiliyorum.com\/en\/research-shows-ai-datasets-have-human-values-blind-spots\/#Why_it_matters\" >Why it matters<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/buradabiliyorum.com\/en\/research-shows-ai-datasets-have-human-values-blind-spots\/#What_other_research_is_being_done\" >What other research is being done<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/buradabiliyorum.com\/en\/research-shows-ai-datasets-have-human-values-blind-spots\/#Whats_next\" >What\u2019s next<\/a><\/li><\/ul><\/nav><\/div>\n<div>\n<p><a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/www.smart-laboratory.org\/\" target=\"_blank\" rel=\"nofollow noopener\">My colleagues and I<\/a> at Purdue University have uncovered a significant imbalance in the human values embedded in AI systems. The systems were predominantly oriented toward information and utility values and less toward prosocial, well-being and civic values.<\/p>\n<p>At the heart of many AI systems lie vast collections of images, text and other forms of data used to train models. Although these datasets are meticulously curated, they sometimes contain unethical or prohibited content.<\/p>\n<p>To ensure AI systems do not use harmful content when responding to users, researchers introduced a method called <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/doi.org\/10.48550\/arXiv.2204.05862\" target=\"_blank\" rel=\"nofollow noopener\">reinforcement learning from human feedback<\/a>. 
Researchers use highly curated datasets of human preferences to shape the behavior of AI systems to be helpful and honest.<\/p>\n<p>In our study, <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/neurips.cc\/virtual\/2024\/poster\/97583\" target=\"_blank\" rel=\"nofollow noopener\">we examined<\/a> three open-source training datasets used by leading U.S. AI companies. We constructed a taxonomy of human values through a literature review spanning moral philosophy, value theory, and science, technology and society studies. The values are well-being and peace; information seeking; justice, human rights and animal rights; duty and accountability; wisdom and knowledge; civility and tolerance; and empathy and helpfulness. We used the taxonomy to manually annotate a dataset, and then used the annotations to train an AI language model.<\/p>\n<p>Our model allowed us to examine the AI companies\u2019 datasets. We found that these datasets contained several examples that train AI systems to be helpful and honest when users ask questions like \u201cHow do I book a flight?\u201d The datasets contained very few examples of how to answer questions about topics related to empathy, justice and human rights. 
Overall, wisdom and knowledge and information seeking were the two most common values, while justice, human rights and animal rights was the least common value.<\/p>\n<figure class=\"align-center zoomable\"><a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/images.theconversation.com\/files\/646919\/original\/file-20250204-15-t93inq.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=1000&amp;fit=clip\" target=\"_blank\" rel=\"nofollow noopener\"><img decoding=\"async\" sizes=\"(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px\" alt=\"a chart with three boxes on the left and four on the right\" class=\"js-lazy\" src=\"https:\/\/images.theconversation.com\/files\/646919\/original\/file-20250204-15-t93inq.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;fit=clip\" srcset=\"https:\/\/images.theconversation.com\/files\/646919\/original\/file-20250204-15-t93inq.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=600&amp;h=397&amp;fit=crop&amp;dpr=1 600w, https:\/\/images.theconversation.com\/files\/646919\/original\/file-20250204-15-t93inq.jpg?ixlib=rb-4.1.0&amp;q=30&amp;auto=format&amp;w=600&amp;h=397&amp;fit=crop&amp;dpr=2 1200w, https:\/\/images.theconversation.com\/files\/646919\/original\/file-20250204-15-t93inq.jpg?ixlib=rb-4.1.0&amp;q=15&amp;auto=format&amp;w=600&amp;h=397&amp;fit=crop&amp;dpr=3 1800w, https:\/\/images.theconversation.com\/files\/646919\/original\/file-20250204-15-t93inq.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;h=499&amp;fit=crop&amp;dpr=1 754w, https:\/\/images.theconversation.com\/files\/646919\/original\/file-20250204-15-t93inq.jpg?ixlib=rb-4.1.0&amp;q=30&amp;auto=format&amp;w=754&amp;h=499&amp;fit=crop&amp;dpr=2 1508w, https:\/\/images.theconversation.com\/files\/646919\/original\/file-20250204-15-t93inq.jpg?ixlib=rb-4.1.0&amp;q=15&amp;auto=format&amp;w=754&amp;h=499&amp;fit=crop&amp;dpr=3 2262w\"\/><\/a><figcaption><span class=\"caption\">The researchers started by creating a taxonomy of human values.<\/span><br \/><span class=\"attribution\"><a rel=\"nofollow\" target=\"_blank\" class=\"source\" href=\"https:\/\/neurips.cc\/virtual\/2024\/poster\/97583\" target=\"_blank\" rel=\"nofollow noopener\">Obi et al<\/a>, <a rel=\"nofollow\" target=\"_blank\" class=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nd\/4.0\/\" target=\"_blank\" rel=\"nofollow noopener\">CC BY-ND<\/a><\/span><\/figcaption><\/figure>\n<h2><span class=\"ez-toc-section\" id=\"Why_it_matters\"><\/span>Why it matters<span 
class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The imbalance of human values in datasets used to train AI could have significant implications for how AI systems interact with people and approach complex social issues. As AI becomes more integrated into sectors such as <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/natlawreview.com\/article\/what-expect-2025-ai-legal-tech-and-regulation-65-expert-predictions\" target=\"_blank\" rel=\"nofollow noopener\">law<\/a>, <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/bipartisanpolicy.org\/explainer\/ai-in-health-care-five-key-developments\/\" target=\"_blank\" rel=\"nofollow noopener\">health care<\/a> and <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/www.cbc.ca\/news\/business\/meta-ai-generated-characters-future-social-media-1.7424641\" target=\"_blank\" rel=\"nofollow noopener\">social media<\/a>, it\u2019s important that these systems reflect a balanced spectrum of collective values to ethically serve people\u2019s needs.<\/p>\n<p>This research also comes at a crucial time for government and policymakers as society grapples with questions about <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/theconversation.com\/regulating-ai-3-experts-explain-why-its-difficult-to-do-and-important-to-get-right-198868\" target=\"_blank\" rel=\"nofollow noopener\">AI governance and ethics<\/a>. Understanding the values embedded in AI systems is important for ensuring that they serve humanity\u2019s best interests.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"What_other_research_is_being_done\"><\/span>What other research is being done<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Many researchers are working to align AI systems with human values. 
The introduction of reinforcement learning from human feedback <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/doi.org\/10.48550\/arXiv.2204.05862\" target=\"_blank\" rel=\"nofollow noopener\">was groundbreaking<\/a> because it provided a way to guide AI behavior toward being helpful and truthful.<\/p>\n<p>Various companies are developing techniques to prevent harmful behaviors in AI systems. However, our group was the first to introduce a systematic way to analyze and understand what values were actually being embedded in these systems through these datasets.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Whats_next\"><\/span>What\u2019s next<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>By making the values embedded in these systems visible, we aim to help AI companies create more balanced datasets that better reflect the values of the communities they serve. The companies can use our technique to find out where they are not doing well and then improve the diversity of their AI training data.<\/p>\n<p>The companies we studied might no longer use those versions of their datasets, but they can still benefit from our process to ensure that their systems align with societal values and norms moving forward.<!-- Below is The Conversation's page counter tag. Please DO NOT REMOVE. --><img loading=\"lazy\" decoding=\"async\" style=\"border: none !important; box-shadow: none !important; margin: 0 !important; max-height: 1px !important; max-width: 1px !important; min-height: 1px !important; min-width: 1px !important; opacity: 0 !important; outline: none !important; padding: 0 !important;\" alt=\"The Conversation\" width=\"1\" height=\"1\" class=\"js-lazy\" src=\"https:\/\/counter.theconversation.com\/content\/246479\/count.gif?distributor=republish-lightbox-basic\"\/><!-- End of code. If you don't see any code above, please get new code from the Advanced tab after you click the republish button. The page counter does not collect any personal data. 
More info: https:\/\/theconversation.com\/republishing-guidelines --><img loading=\"lazy\" decoding=\"async\" style=\"border: none !important; box-shadow: none !important; margin: 0 !important; max-height: 1px !important; max-width: 1px !important; min-height: 1px !important; min-width: 1px !important; opacity: 0 !important; outline: none !important; padding: 0 !important;\" src=\"https:\/\/counter.theconversation.com\/content\/246479\/count.gif?distributor=republish-lightbox-basic\" alt=\"The Conversation\" width=\"1\" height=\"1\" class=\"\" srcset=\"\"\/><\/p>\n<p><em><a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/theconversation.com\/profiles\/ike-obi-2285164\" target=\"_blank\" rel=\"nofollow noopener\">Ike Obi<\/a>, Ph.D. student in Computer and Information Technology, <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/theconversation.com\/institutions\/purdue-university-1827\" target=\"_blank\" rel=\"nofollow noopener\">Purdue University<\/a><\/em><\/p>\n<p><em>This article is republished from <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/theconversation.com\" target=\"_blank\" rel=\"nofollow noopener\">The Conversation<\/a> under a Creative Commons license. Read the <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/theconversation.com\/ai-datasets-have-human-values-blind-spots-new-research-246479\" target=\"_blank\" rel=\"nofollow noopener\">original article<\/a>.<\/em><\/p>\n<\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/ai-datasets-human-values-blind-spots-new-research\" target=\"_blank\" >Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>My colleagues and I at Purdue University have uncovered a significant imbalance in the human values embedded in AI systems. The systems were predominantly oriented toward information and utility values and less toward prosocial, well-being and civic values. 
At the heart of many AI systems lie vast collections of images, text and other forms of&#8230;<\/p>\n","protected":false},"author":1,"featured_media":652756,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/tnw-blurple?filter_last=1&fit=1280%2C640&url=https%3A%2F%2Fcdn0.tnwcdn.com%2Fwp-content%2Fblogs.dir%2F1%2Ffiles%2F2024%2F11%2FUntitled-design.jpg&signature=161d28274a53b54b7903cf422bbf7e8e","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-652755","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/652755","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=652755"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/652755\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/652756"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=652755"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=652755"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=652755"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}