{"id":467208,"date":"2022-06-24T19:00:54","date_gmt":"2022-06-24T16:00:54","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/everything-you-need-to-know-about-model-free-and-model-based-reinforcement-learning\/"},"modified":"2022-06-24T19:00:54","modified_gmt":"2022-06-24T16:00:54","slug":"everything-you-need-to-know-about-model-free-and-model-based-reinforcement-learning","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/everything-you-need-to-know-about-model-free-and-model-based-reinforcement-learning\/","title":{"rendered":"Everything you need to know about model-free and model-based reinforcement learning"},"content":{"rendered":"<h1><span class=\"ez-toc-section\" id=\"%E2%80%9CEverything_you_need_to_know_about_model-free_and_model-based_reinforcement_learning%E2%80%9D\"><\/span>&#8220;Everything you need to know about model-free and model-based reinforcement learning&#8221;<span class=\"ez-toc-section-end\"><\/span><\/h1>\n<div id=\"article-main-content\">\n                            <p>Reinforcement learning is one of the exciting branches of artificial intelligence. 
It plays an important role in game-playing AI systems, modern robots,\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/06\/14\/google-reinforcement-learning-ai-chip-design\/\">chip-design systems<\/a>, and other applications.<\/p>\n<p>There are many different types of\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/09\/02\/deep-reinforcement-learning-explainer\/\">reinforcement learning algorithms<\/a>, but two main categories are \u201cmodel-based\u201d and \u201cmodel-free\u201d RL. They are both inspired by our understanding of learning in humans and animals.<\/p>\n<p>Nearly every book on reinforcement learning contains a chapter that explains the differences between model-free and model-based reinforcement learning. 
But seldom are the biological and evolutionary precedents discussed in books about reinforcement learning algorithms for computers.<\/p>\n<p>I found a very interesting explanation of model-free and model-based RL in\u00a0<em><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/11\/15\/birth-of-intelligence-book-review\/\">The Birth of Intelligence<\/a><\/em>, a book that explores the evolution of intelligence. 
In a conversation with\u00a0<em>TechTalks<\/em>, Daeyeol Lee, neuroscientist and author of\u00a0<em>The Birth of Intelligence<\/em>, discussed different modes of reinforcement learning in humans and animals, AI and natural intelligence, and future directions of research.<\/p>\n<p><em>American psychologist Edward Thorndike proposed the \u201claw of effect,\u201d which became the basis for model-free reinforcement learning<\/em><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/i0.wp.com\/bdtechtalks.com\/wp-content\/uploads\/2022\/06\/Edward-Thorndike.jpg?resize=696%2C435&amp;ssl=1\" alt=\"Edward Thorndike\"\/><\/p>\n<p>In the late nineteenth century, psychologist Edward Thorndike proposed the \u201c<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/en.wikipedia.org\/wiki\/Law_of_effect\">law of effect<\/a>,\u201d which states that actions with positive effects in a particular situation become more likely to occur again in that situation, and actions that produce negative effects become less likely to occur in the future.<\/p>\n<p>Thorndike explored the law of effect with an experiment in which he placed a cat inside a puzzle box and measured the time the cat took to escape. To escape, the cat had to manipulate a series of gadgets such as strings and levers. Thorndike observed that as the cat interacted with the puzzle box, it learned the behavioral responses that could help it escape. 
Over time, the cat became faster and faster at escaping the box. Thorndike concluded that the cat learned from the rewards and punishments that its actions produced.<\/p>\n<p>The law of effect later paved the way for\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/en.wikipedia.org\/wiki\/Behaviorism\">behaviorism<\/a>, a branch of psychology that tries to explain human and animal behavior in terms of stimuli and responses.<\/p>\n<p>The law of effect is also the basis for model-free reinforcement learning. In model-free reinforcement learning, an agent perceives the world, takes an action, and measures the reward. The agent usually starts by taking random actions and gradually favors those that are associated with higher rewards.<\/p>\n<p>\u201cYou basically look at the state of the world, a snapshot of what the world looks like, and then you take an action. Afterward, you increase or decrease the probability of taking the same action in the given situation depending on its outcome,\u201d Lee said. \u201cThat\u2019s basically what model-free reinforcement learning is. The simplest thing you can imagine.\u201d<\/p>\n<p>In model-free reinforcement learning, the agent has no explicit knowledge or model of the world. The RL agent must directly experience every outcome of each action through trial and error.<\/p>\n<p><em>American psychologist Edward C. 
Tolman proposed the idea of \u201clatent learning,\u201d which became the basis of model-based reinforcement learning<\/em><\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img class=\"aligncenter\" src=\"https:\/\/i0.wp.com\/bdtechtalks.com\/wp-content\/uploads\/2022\/06\/edward-c-tolman.jpg?resize=696%2C435&amp;ssl=1\" alt=\"Edward C. Tolman\"\/><\/figure>\n<p>Thorndike\u2019s law of effect was prevalent until the 1930s, when Edward Tolman, another psychologist, arrived at an important insight while studying how quickly rats could learn to navigate mazes. During his experiments, Tolman realized that animals could learn things about their environment without reinforcement.<\/p>\n<p>For example, when a rat is let loose in a maze, it will freely explore the tunnels and gradually learn the structure of the environment. If the same rat is later reintroduced to the same environment and is given a goal tied to a reinforcement signal, such as finding food or the exit, it can reach its goal much more quickly than animals that did not have the opportunity to explore the maze. Tolman called this \u201clatent learning.\u201d<\/p>\n<p>Latent learning enables animals and humans to develop a mental representation of their world, simulate hypothetical scenarios in their minds, and predict the outcomes. This is also the basis of model-based reinforcement learning.<\/p>\n<p>\u201cIn model-based reinforcement learning, you develop a model of the world. 
In terms of computer science, it\u2019s a transition probability, how the world goes from one state to another state depending on what kind of action you produce in it,\u201d Lee said. \u201cWhen you\u2019re in a given situation where you\u2019ve already learned the model of the environment previously, you\u2019ll do a mental simulation. You\u2019ll basically search through the model you\u2019ve acquired in your brain and try to see what kind of outcome would occur if you take a particular series of actions. And when you find the path of actions that will get you to the goal that you want, you\u2019ll start taking those actions physically.\u201d<\/p>\n<p>The main benefit of model-based reinforcement learning is that it reduces the need for the agent to learn every outcome through trial and error in its environment. For example, if you hear about an accident that has blocked the road you usually take to work, model-based RL will allow you to do a mental simulation of alternative routes and change your path. With model-free reinforcement learning, the new information would not be of any use to you. 
You would proceed as usual until you reached the accident scene, and only then would you begin updating your value function and exploring other actions.<\/p>\n<p>Model-based reinforcement learning has been especially successful in developing\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2018\/07\/02\/ai-plays-chess-go-poker-video-games\/\">AI systems that can master board games<\/a>\u00a0such as chess and Go, where the environment is deterministic.<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img class=\"aligncenter\" src=\"https:\/\/i0.wp.com\/bdtechtalks.com\/wp-content\/uploads\/2021\/11\/Birth-of-intelligence-AI.jpg?resize=696%2C392&amp;ssl=1\" alt=\"\"\/><\/figure>\n<p>In some cases, creating a decent model of the environment is either not possible or too difficult. And model-based reinforcement learning can potentially be very time-consuming, which can prove to be dangerous or even fatal in time-sensitive situations.<\/p>\n<p>\u201cComputationally, model-based reinforcement learning is a lot more elaborate. You have to acquire the model, do the mental simulation, and you have to find the trajectory in your neural processes and then take the action,\u201d Lee said.<\/p>\n<p>Lee added, however, that model-based reinforcement learning does not necessarily have to be more complicated than model-free RL.<\/p>\n<p>\u201cWhat determines the complexity of model-free RL is all the possible combinations of stimulus set and action set,\u201d he said. \u201cAs you have more and more states of the world or sensor representation, the pairs that you\u2019re going to have to learn between states and actions are going to increase. 
Therefore, even though the idea is simple, if there are many states and those states are mapped to different actions, you\u2019ll need a lot of memory.\u201d<\/p>\n<p>By contrast, in model-based reinforcement learning, the complexity depends on the model you build. If the environment is really complicated but can be captured by a relatively simple model that can be acquired quickly, then the simulation will be much simpler and more cost-efficient.<\/p>\n<p>\u201cAnd if the environment tends to change relatively frequently, then rather than trying to relearn the stimulus-action pair associations whenever the world changes, you can have a much more efficient outcome if you\u2019re using model-based reinforcement learning,\u201d Lee said.<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" src=\"https:\/\/i0.wp.com\/bdtechtalks.com\/wp-content\/uploads\/2022\/06\/daeyeol-lee.jpg?resize=410%2C512&amp;ssl=1\" alt=\"Daeyeol Lee\" width=\"410\" height=\"512\"\/><figcaption>Daeyeol Lee, professor of neuroscience at Johns Hopkins School of Medicine<\/figcaption><\/figure>\n<p>Basically, neither model-based nor model-free reinforcement learning is a perfect solution. And wherever you see a reinforcement learning system tackling a complicated problem, there\u2019s a good chance that it is using both model-based and model-free RL, and possibly more forms of learning.<\/p>\n<p>Research in neuroscience shows that humans and animals have multiple forms of learning, and the brain constantly switches between these modes depending on how certain it is about each of them at any given moment.<\/p>\n<p>\u201cIf the model-free RL is working really well and it is accurately predicting the reward all the time, that means there\u2019s less uncertainty with model-free and you\u2019re going to use it more,\u201d Lee said. \u201cAnd on the contrary, if you have a really accurate model of the world and you can do the mental simulations of what\u2019s going to happen every moment of time, then you\u2019re more likely to use model-based RL.\u201d<\/p>\n<p>In recent years, there has been growing interest in creating AI systems that combine multiple modes of reinforcement learning. Recent\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2022\/04\/04\/reinforcement-learning-td-mpc\/\">research by scientists at UC San Diego<\/a>\u00a0shows that combining model-free and model-based reinforcement learning achieves superior performance in control tasks.<\/p>\n<p>\u201cIf you look at a complicated algorithm like AlphaGo, it has elements of both model-free and model-based RL,\u201d Lee said. \u201cIt learns the state values based on board configurations, and that is basically model-free RL, because you\u2019re trying values depending on where all the stones are. 
But it also does forward search, which is model-based.\u201d<\/p>\n<p>But despite remarkable achievements, progress in reinforcement learning is still slow. As soon as RL models are faced with complex and unpredictable environments, their performance starts to degrade. For example, creating a\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/04\/17\/openai-five-neural-networks-dota-2\/\">reinforcement learning system that played Dota 2<\/a>\u00a0at championship level required tens of thousands of hours of training, a feat that is physically impossible for humans. Other tasks such as\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/10\/21\/openai-rubiks-cube-reinforcement-learning\/\">robotic hand manipulation<\/a>\u00a0also require huge amounts of training and trial and error.<\/p>\n<p>Part of the reason reinforcement learning still struggles with efficiency is the remaining gap in our knowledge of learning in humans and animals. And humans and animals rely on much more than just model-free and model-based reinforcement learning, Lee believes.<\/p>\n<p>\u201cI think our brain is a pandemonium of learning algorithms that have evolved to handle many different situations,\u201d he said.<\/p>\n<p>In addition to constantly switching between these modes of learning, the brain manages to maintain and update them all the time, even when they are not actively involved in decision-making.<\/p>\n<p>\u201cWhen you have multiple learning algorithms, they become useless if you turn some of them off. Even if you\u2019re relying on one algorithm\u2014say model-free RL\u2014the other algorithms must continue to run. I still have to update my world model rather than keep it frozen because if I don\u2019t, several hours later, when I realize that I need to switch to the model-based RL, it will be obsolete,\u201d Lee said.<\/p>\n<p>Some interesting work in AI research shows how this might work. 
A\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2022\/01\/24\/ai-thinking-fast-and-slow\/\">recent technique<\/a>\u00a0inspired by psychologist Daniel Kahneman\u2019s System 1 and System 2 thinking shows that maintaining different learning modules and updating them in parallel helps improve the efficiency and accuracy of AI systems.<\/p>\n<p>Another thing that we still have to figure out is how to apply the right inductive biases in our AI systems to make sure they learn the right things in a cost-efficient way. Billions of years of evolution have provided humans and animals with the inductive biases needed to learn efficiently and with as little data as possible.<\/p>\n<p>\u201cThe information that we get from the environment is very sparse. And using that information, we have to generalize. The reason is that the brain has inductive biases and has biases that can generalize from a small set of examples. That is the product of evolution, and a lot of neuroscientists are getting more interested in this,\u201d Lee said.<\/p>\n<p>However, while inductive biases might be easy to understand for an object recognition task, they become a lot more complicated for abstract problems such as building social relationships.<\/p>\n<p>\u201cThe idea of inductive bias is quite universal and applies not just to perception and object recognition but to all kinds of problems that an intelligent being has to deal with,\u201d Lee said. 
\u201cAnd I think that is in a way orthogonal to the model-based and model-free distinction because it\u2019s about how to build an efficient model of the complex structure based on a few observations. There\u2019s a lot more that we need to understand.\u201d\n                        <\/p><\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/everything-you-need-to-know-about-model-free-and-model-based-reinforcement-learning\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;Everything you need to know about model-free and model-based reinforcement learning&#8221; Reinforcement learning is one of the exciting branches of artificial intelligence. 
It plays an important role in game-playing AI systems, modern robots,\u00a0chip-design systems, and other applications. There are many different types of\u00a0reinforcement learning algorithms, but two main categories are \u201cmodel-based\u201d and \u201cmodel-free\u201d RL. They&#8230;<\/p>\n","protected":false},"author":1,"featured_media":467209,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/06\/Model-free-reinforcement-learning-hed.jpg&signature=f17cf502cd5c0a56428d0d9336650ba5","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-467208","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/467208","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=467208"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/467208\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/467209"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=467208"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=467208"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=4672
08"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}