{"id":285623,"date":"2021-06-28T16:08:48","date_gmt":"2021-06-28T13:08:48","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/reinforcement-learning-could-be-the-link-between-ai-and-human-level-intelligence\/"},"modified":"2021-06-28T16:08:48","modified_gmt":"2021-06-28T13:08:48","slug":"reinforcement-learning-could-be-the-link-between-ai-and-human-level-intelligence","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/reinforcement-learning-could-be-the-link-between-ai-and-human-level-intelligence\/","title":{"rendered":"#Reinforcement learning could be the link between AI and human-level intelligence"},"content":{"rendered":"<p>&#8220;<strong>#Reinforcement learning could be the link between AI and human-level intelligence<\/strong>&#8221;<\/p>\n<div>Last week, I wrote an analysis of \u201cReward Is Enough,\u201d a paper by scientists at DeepMind. As the title suggests, the researchers hypothesize that the right reward is<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/06\/07\/deepmind-artificial-intelligence-reward-maximization\/\">all you need to create the abilities associated with intelligence<\/a>, such as perception, motor functions, and language.<\/p>\n<p>This is in contrast with AI systems that try to replicate<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/04\/09\/what-is-narrow-artificial-intelligence-ani\/\">specific functions of natural intelligence<\/a><span>\u00a0<\/span>such as classifying images, navigating physical environments, or completing sentences.<\/p>\n<p>The researchers go as far as suggesting that with a well-defined reward, a complex environment, and the right reinforcement learning algorithm, we will be able to reach artificial general intelligence, the kind of problem-solving and cognitive abilities found in humans and, to a lesser degree, in animals.<\/p>\n<p>The article and the paper triggered a heated debate on social media, with reactions ranging from full support of the idea to outright rejection. Of course, both sides make valid claims. But the truth lies somewhere in the middle. Natural evolution is proof that the reward hypothesis is scientifically valid. But implementing the pure reward approach to reach human-level intelligence has some very hefty requirements.<\/p>\n<p>In this post, I\u2019ll try to clarify in simple terms where the line between theory and practice stands.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Natural_selection\"><\/span>Natural selection<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1358438 js-lazy\" alt=\"\" width=\"696\" height=\"392\" sizes=\"auto, (max-width: 696px) 100vw, 696px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1.jpeg\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1-280x158.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1-479x270.jpeg 479w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1-240x135.jpeg 240w\"\/><figcaption>Credit: <a rel=\"nofollow 
noopener\" target=\"_blank\" href=\"https:\/\/www.pexels.com\/photo\/selective-photo-of-gray-shark-726478\/\">George Desipris<\/a><\/figcaption><noscript><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1358438\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1.jpeg\" alt=\"\" width=\"696\" height=\"392\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1-280x158.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1-479x270.jpeg 479w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvoI1-240x135.jpeg 240w\"\/><\/noscript><\/figure>\n<p>In their paper, the DeepMind scientists present the following hypothesis: \u201cIntelligence, and its associated abilities, can be understood as subserving the maximisation of reward by an agent acting in its environment.\u201d<\/p>\n<p>Scientific evidence supports this claim.<\/p>\n<p>Humans and animals owe their intelligence to a very simple law: natural selection. I\u2019m not an expert on the topic, but I suggest reading<span>\u00a0<\/span><em>The Blind Watchmaker<\/em><span>\u00a0<\/span>by biologist Richard Dawkins, which provides a very accessible account of how evolution has led to all forms of life and intelligence on our planet.<\/p>\n<p>In a nutshell, nature gives preference to lifeforms that are better fit to survive in their environments. Those that can withstand challenges posed by the environment (weather, scarcity of food, etc.) and other lifeforms (predators, viruses, etc.) will survive, reproduce, and pass on their genes to the next generation. Those that cannot are eliminated.<\/p>\n<p>According to Dawkins, \u201cIn nature, the usual selecting agent is direct, stark and simple. It is the grim reaper. 
Of course, the<span>\u00a0<\/span><em>reasons<\/em><span>\u00a0<\/span>for survival are anything but simple \u2014 that is why natural selection can build up animals and plants of such formidable complexity. But there is something very crude and simple about death itself. And nonrandom death is all it takes to select phenotypes, and hence the genes that they contain, in nature.\u201d<\/p>\n<p>But how do different lifeforms emerge? Every newly born organism inherits the genes of its parent(s). But unlike in the digital world, copying in organic life is not exact. Therefore, offspring often undergo mutations, small changes to their genes that can have a huge impact across generations. These mutations can have a simple effect, such as a small change in muscle texture or skin color. But they can also become the core for developing new organs (e.g., lungs, kidneys, eyes), or shedding old ones (e.g., tail, gills).<\/p>\n<p>If these mutations help improve the chances of the organism\u2019s survival (e.g., better camouflage or greater speed), they will be preserved and passed on to future generations, where further mutations might reinforce them. For example, the first organism that developed the ability to parse light information had an enormous advantage over all the others that didn\u2019t, even though its ability to see was not comparable to that of animals and humans today. This advantage enabled it to better survive and reproduce. As its descendants reproduced, those whose mutations improved their sight outmatched and outlived their peers. Through thousands (or millions) of generations, these changes resulted in a complex organ such as the eye.<\/p>\n<p>The simple mechanisms of mutation and natural selection have been enough to give rise to all the different lifeforms that we see on Earth, from bacteria to plants, fish, birds, amphibians, and mammals.<\/p>\n<p>The same self-reinforcing mechanism has also created the brain and its associated wonders. 
In her book<span>\u00a0<\/span><em><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/09\/28\/ai-conscience-patricia-churchland\/\">Conscience: The Origin of Moral Intuition<\/a><\/em>, scientist Patricia Churchland explores how natural selection led to the development of the cortex, the main part of the brain that gives mammals the ability to learn from their environment. The evolution of the cortex has enabled mammals to develop social behavior and learn to live in herds, prides, troops, and tribes. In humans, the evolution of the cortex has given rise to complex cognitive faculties, the capacity to develop rich languages, and the ability to establish social norms.<\/p>\n<p>Therefore, if you consider survival as the ultimate reward, the main hypothesis that DeepMind\u2019s scientists make is scientifically sound. However, when it comes to implementing this rule, things get very complicated.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Reinforcement_learning_and_artificial_general_intelligence\"><\/span>Reinforcement learning and artificial general intelligence<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1358439 js-lazy\" alt=\"\" width=\"696\" height=\"392\" sizes=\"auto, (max-width: 696px) 100vw, 696px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence.jpeg\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence-280x158.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence-479x270.jpeg 479w, 
https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence-240x135.jpeg 240w\"\/><noscript><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1358439\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence.jpeg\" alt=\"\" width=\"696\" height=\"392\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence.jpeg 696w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence-280x158.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence-479x270.jpeg 479w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/Reinforcement-learning-artificial-intelligence-240x135.jpeg 240w\"\/><\/noscript><\/figure>\n<p>In their paper, DeepMind\u2019s scientists make the claim that the reward hypothesis can be implemented with<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/05\/28\/what-is-reinforcement-learning\/\">reinforcement learning algorithms<\/a>, a branch of AI in which an agent gradually develops its behavior by interacting with its environment. A reinforcement learning agent starts by making random actions. Based on how those actions align with the goals it is trying to achieve, the agent receives rewards. Across many episodes, the agent learns to develop sequences of actions that maximize its reward in its environment.<\/p>\n<p>According to the DeepMind scientists, \u201cA sufficiently powerful and general reinforcement learning agent may ultimately give rise to intelligence and its associated abilities. 
In other words, if an agent can continually adjust its behavior so as to improve its cumulative reward, then any abilities that are repeatedly demanded by its environment must ultimately be produced in the agent\u2019s behavior.\u201d<\/p>\n<p>In an<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/venturebeat.com\/2021\/01\/02\/leading-computer-scientists-debate-the-next-steps-for-ai-in-2021\/\">online debate in December<\/a>, computer scientist Richard Sutton, one of the paper\u2019s co-authors, said, \u201cReinforcement learning is the first computational theory of intelligence\u2026 In reinforcement learning, the goal is to maximize an arbitrary reward signal.\u201d<\/p>\n<p>DeepMind has considerable experience to back up this claim. They have already developed reinforcement learning agents that can<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/01\/02\/humanizing-ai-deep-learning-alphazero\/\">outmatch humans<\/a><span>\u00a0<\/span>in Go, chess, Atari, StarCraft, and other games. They have also developed reinforcement learning models to make progress in<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.nature.com\/articles\/d41586-020-03348-4\">some of the most complex problems of science<\/a>.<\/p>\n<p>The scientists further wrote in their paper, \u201cAccording to our hypothesis, general intelligence can instead be understood as, and implemented by, maximizing<span>\u00a0<\/span><strong>a singular reward in a single, complex environment<span>\u00a0<\/span><\/strong>[emphasis mine].\u201d<\/p>\n<p>This is where the hypothesis separates from practice. 
The keyword here is \u201ccomplex.\u201d The environments that DeepMind (and its quasi-rival<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/08\/17\/openai-gpt-3-commercial-ai\/\">OpenAI<\/a>) has so far explored with reinforcement learning are not nearly as complex as the physical world. And they still required the financial backing and vast computational resources of<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2020\/12\/21\/deepminds-annual-report-why-its-hard-to-run-a-commercial-ai-lab\/\">very wealthy tech companies<\/a>. In some cases, they still had to dumb down the environments to speed up the training of their reinforcement learning models and cut costs. In others, they had to redesign the reward to make sure the RL agents did not get stuck in the wrong local optimum.<\/p>\n<p>(It is worth noting that the scientists do acknowledge in their paper that they can\u2019t offer \u201ctheoretical guarantee on the sample efficiency of reinforcement learning agents.\u201d)<\/p>\n<p>Now, imagine what it would take to use reinforcement learning to replicate evolution and reach human-level intelligence. First, you would need a simulation of the world. But at what level would you simulate the world? My guess is that anything short of quantum scale would be inaccurate. And we don\u2019t have a fraction of the compute power needed to create quantum-scale simulations of the world.<\/p>\n<p>Let\u2019s say we did have the compute power to create such a simulation. We could start around 4 billion years ago, when the first lifeforms emerged. We would then need an exact representation of the initial state of Earth and its environment at that time. 
And we still don\u2019t have a definite theory on that.<\/p>\n<p>An alternative would be to create a shortcut and start from, say, 8 million years ago, when our primate ancestors still lived on Earth. This would cut down the training time, but we would have a much more complex initial state to start from. At that time, there were millions of different lifeforms on Earth, and they were closely interrelated. They evolved together. Taking any of them out of the equation could have a huge impact on the course of the simulation.<\/p>\n<p>Therefore, you basically have two key problems: compute power and initial state. The further you go back in time, the more compute power you\u2019ll need to run the simulation. On the other hand, the further you move forward, the more complex your initial state will be. And evolution has created all sorts of intelligent and non-intelligent lifeforms, so making sure that we could reproduce the exact steps that led to human intelligence without any guidance and only through reward is a hard bet.<\/p>\n<p>Many will say that you don\u2019t need an exact simulation of the world and that you only need to approximate the problem space in which your reinforcement learning agent will operate.<\/p>\n<p>For example, in their paper, the scientists mention the example of a house-cleaning robot: \u201cIn order for a kitchen robot to maximize cleanliness, it must presumably have abilities of perception (to differentiate clean and dirty utensils), knowledge (to understand utensils), motor control (to manipulate utensils), memory (to recall locations of utensils), language (to predict future mess from dialogue), and social intelligence (to encourage young children to make less mess). A behavior that maximises cleanliness must therefore yield all these abilities in service of that singular goal.\u201d<\/p>\n<p>This statement is true, but downplays the complexities of the environment. Kitchens were created by humans. 
For instance, the shape of drawer handles, doorknobs, floors, cupboards, walls, tables, and everything you see in a kitchen has been optimized for the sensorimotor functions of humans. Therefore, a robot that would want to work in such an environment would need to develop sensorimotor skills that are similar to those of humans. You can create shortcuts, such as avoiding the complexities of bipedal walking or hands with fingers and joints. But then, there would be incongruities between the robot and the humans who will be using the kitchens. Many scenarios that would be easy for a human to handle (such as stepping over an overturned chair) would become prohibitive for the robot.<\/p>\n<p>Also, other skills, such as language, would require even more shared infrastructure between the robot and the humans who would share the environment. Intelligent agents must be able to develop abstract mental models of each other to cooperate or compete in a shared environment. Language omits many important details, such as sensory experience, goals, and needs. We fill in the gaps with our intuitive and conscious knowledge of our interlocutor\u2019s mental state. We might make wrong assumptions, but those are the exceptions, not the norm.<\/p>\n<p>And finally, developing a notion of \u201ccleanliness\u201d as a reward is very complicated because it is very tightly linked to human knowledge, life, and goals. For example, removing every piece of food from the kitchen would certainly make it cleaner, but would the humans using the kitchen be happy about it?<\/p>\n<p>A robot that has been optimized for \u201ccleanliness\u201d would have a hard time co-existing and cooperating with living beings that have been optimized for survival.<\/p>\n<p>Here, you can take shortcuts again by creating hierarchical goals, equipping the robot and its reinforcement learning models with prior knowledge, and using human feedback to steer it in the right direction. 
This would make it much easier for the robot to understand and interact with humans and human-designed environments. But then you would be cheating on the reward-only approach. And the mere fact that your robot agent starts with predesigned limbs and image-capturing and sound-emitting devices is itself the integration of prior knowledge.<\/p>\n<p>In theory, reward alone is enough for any kind of intelligence. But in practice, there\u2019s a trade-off between environment complexity, reward design, and agent design.<\/p>\n<p>In the future, we might be able to achieve a level of compute power that will make it possible to reach general intelligence through pure reward and reinforcement learning. But for the time being, what works are hybrid approaches that involve learning and complex engineering of rewards and AI agent architectures.<\/p>\n<p><i>This article was originally published by Ben Dickson on\u00a0<\/i><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/\"><i>TechTalks<\/i><\/a><i>, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech, and what we need to look out for. You can read the original article <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/06\/17\/evolution-rewards-artificial-intelligence\/\">here<\/a>.<\/i><\/p>\n<\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/reinforcement-learning-link-ai-general-level-intelligence-syndication\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;#Reinforcement learning could be the link between AI and human-level intelligence&#8221; Last week, I wrote an analysis of \u201cReward Is Enough,\u201d a paper by scientists at DeepMind. 
As the title suggests, the researchers hypothesize that the right reward is\u00a0all you need to create the abilities associated with intelligence, such as perception, motor functions, and language&#8230;.<\/p>\n","protected":false},"author":1,"featured_media":285624,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/06\/EvolutionHed1.jpg&signature=eb72505783d019cc290d3a2602b93fd4","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-285623","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/285623","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=285623"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/285623\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/285624"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=285623"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=285623"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=285623"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}