{"id":451168,"date":"2022-05-22T17:00:35","date_gmt":"2022-05-22T14:00:35","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/new-deep-learning-technique-paves-path-to-pizza-making-robots\/"},"modified":"2022-05-22T17:00:35","modified_gmt":"2022-05-22T14:00:35","slug":"new-deep-learning-technique-paves-path-to-pizza-making-robots","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/new-deep-learning-technique-paves-path-to-pizza-making-robots\/","title":{"rendered":"#New deep learning technique paves path to pizza-making robots"},"content":{"rendered":"<h1><span class=\"ez-toc-section\" 
id=\"%E2%80%9CNew_deep_learning_technique_paves_path_to_pizza-making_robots%E2%80%9D\"><\/span>&#8220;New deep learning technique paves path to pizza-making robots&#8221;<span class=\"ez-toc-section-end\"><\/span><\/h1>\n<div>\n                            <em>This article is part of our coverage of the latest in <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/tag\/ai-research-papers\/\">AI research<\/a>.<\/em><\/p>\n<p>For humans, working with deformable objects is not significantly more difficult than handling rigid objects. We learn naturally to shape them, fold them, and manipulate them in different ways and still recognize them.<\/p>\n<p>But for robots and artificial intelligence systems, manipulating deformable objects presents a huge challenge. Consider the series of steps that a robot must take to shape a ball of dough into a pizza crust. It must keep track of the dough as it changes shape, and at the same time, it must choose the right tool for each step of the work. 
These tasks are challenging for current AI systems, which are more stable when handling rigid objects because their states are more predictable.<\/p>\n<p>Now, a new deep learning technique developed by researchers at MIT, Carnegie Mellon University, and the University of California at San Diego shows promise for making robotics systems more stable in handling deformable objects. Called <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/xingyu-lin.github.io\/diffskill\/\">DiffSkill<\/a>, the technique uses deep neural networks to learn simple skills and a planning module for combining the skills to solve tasks that require multiple steps and tools.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Handling_deformable_objects_with_reinforcement_learning_and_deep_learning\"><\/span>Handling deformable objects with reinforcement learning and deep learning<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>If an AI system wants to handle an object, it has to be able to detect and define its state and predict how it will look in the future. This is a problem that has been largely solved for rigid objects. 
With a good set of training examples, a <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2021\/01\/28\/deep-learning-explainer\/\">deep neural network<\/a> will be able to detect a rigid object from different angles. However, when it comes to deformable objects, the space of possible states becomes much more complicated.<\/p>\n<p>\u201cFor rigid objects, we can describe its state with six numbers: Three numbers for its XYZ coordinates and another three numbers for its orientation,\u201d Xingyu Lin, Ph.D. student at CMU and lead author of the DiffSkill paper, told TechTalks.<\/p>\n<p>\u201cHowever, deformable bodies, such as the dough or fabrics, have infinite degrees of freedom, making it much more difficult to describe their states precisely. Furthermore, the ways they deform are also harder to model in a mathematical way compared to rigid bodies.\u201d<\/p>\n<p>The development of differentiable physics simulators enabled the application of gradient-based methods to solve deformable object manipulation tasks. This is in contrast to the traditional <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/05\/28\/what-is-reinforcement-learning\/\">reinforcement learning<\/a> approach that tries to learn the dynamics of the environment and objects through pure trial-and-error interactions.<\/p>\n<p>DiffSkill was inspired by <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/openreview.net\/forum?id=xCcdBRQEDW\">PlasticineLab<\/a>, a differentiable physics simulator that was presented at the ICLR conference in 2021. 
PlasticineLab showed that differentiable simulators can help short-horizon tasks.<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-featured_img wp-image-1387238 js-lazy\" alt=\"PlasticineLab is a differentiable physics-based simulator for deformable objects. It is suitable for training gradient-based models.\" width=\"796\" height=\"353\" sizes=\"auto, (max-width: 796px) 100vw, 796px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-796x353.webp\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-796x353.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-280x124.webp 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-270x120.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-540x240.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab.webp 1392w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"#\" data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2022%2F05%2F22%2Fnew-deep-learning-technique-paves-path-to-pizza-making-robots%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: PlasticineLab is a differentiable physics-based simulator for deformable objects. It is suitable for training gradient-based models.\" data-title=\"Share PlasticineLab is a differentiable physics-based simulator for deformable objects. It is suitable for training gradient-based models. on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share PlasticineLab is a differentiable physics-based simulator for deformable objects. It is suitable for training gradient-based models. 
on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>PlasticineLab is a differentiable physics-based simulator for deformable objects. It is suitable for training gradient-based models.<\/figcaption><noscript><img decoding=\"async\" loading=\"lazy\" class=\"size-featured_img wp-image-1387238\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-796x353.webp\" alt=\"PlasticineLab is a differentiable physics-based simulator for deformable objects. It is suitable for training gradient-based models.\" width=\"796\" height=\"353\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-796x353.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-280x124.webp 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-270x120.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab-540x240.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/PlasticineLab.webp 1392w\"\/><\/noscript><\/figure>\n<p>But differentiable simulators still struggle with long-horizon problems that require multiple steps and the use of different tools. AI systems based on differentiable simulators also require the agent to know the full simulation state and relevant physical parameters of the environment. 
This is especially limiting for real-world applications, where the agent usually perceives the world through visual and depth sensory data (RGB-D).<\/p>\n<p>\u201cWe started to ask if we can extract [the steps required to accomplish a task] as skills and also learn abstract notions about the skills so that we can chain them to solve more complex tasks,\u201d Lin said.<\/p>\n<p>DiffSkill is a framework where the AI agent learns skill abstractions using a differentiable physics model and composes them to accomplish complicated manipulation tasks.<\/p>\n<p>Lin\u2019s past work focused on using reinforcement learning for the manipulation of deformable objects such as cloth, ropes, and liquids. For DiffSkill, he chose dough manipulation because of the challenges it poses.<\/p>\n<p>\u201cDough manipulation is particularly interesting because it cannot be easily performed with the robot gripper, but requires using different tools sequentially, something humans are good at but is not very common for robots to do,\u201d Lin said.<\/p>\n<p>Once trained, DiffSkill can successfully accomplish a set of dough manipulation tasks using only RGB-D input.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Learning_abstract_skills_with_neural_networks\"><\/span>Learning abstract skills with neural networks<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-featured_img wp-image-1387239 js-lazy\" alt=\"DiffSkill trains a neural network to predict the feasibility of a goal state from the initial state and parameters obtained from a differentiable physics simulator.\" width=\"796\" height=\"281\" sizes=\"auto, (max-width: 796px) 100vw, 796px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-796x281.webp\" 
srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-796x281.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-280x99.webp 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-270x95.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-540x190.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction.webp 1392w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"#\" data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2022%2F05%2F22%2Fnew-deep-learning-technique-paves-path-to-pizza-making-robots%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: DiffSkill trains a neural network to predict the feasibility of a goal state from the initial state and parameters obtained from a differentiable physics simulator.\" data-title=\"Share DiffSkill trains a neural network to predict the feasibility of a goal state from the initial state and parameters obtained from a differentiable physics simulator. on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share DiffSkill trains a neural network to predict the feasibility of a goal state from the initial state and parameters obtained from a differentiable physics simulator. 
on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>DiffSkill trains a neural network to predict the feasibility of a goal state from the initial state and parameters obtained from a differentiable physics simulator.<\/figcaption><noscript><img decoding=\"async\" loading=\"lazy\" class=\"size-featured_img wp-image-1387239\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-796x281.webp\" alt=\"DiffSkill trains a neural network to predict the feasibility of a goal state from the initial state and parameters obtained from a differentiable physics simulator.\" width=\"796\" height=\"281\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-796x281.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-280x99.webp 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-270x95.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction-540x190.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-neural-skill-abstraction.webp 1392w\"\/><\/noscript><\/figure>\n<p>DiffSkill is composed of two key components: a \u201cneural skill abstractor\u201d that uses <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2019\/08\/05\/what-is-artificial-neural-network-ann\/\">neural networks<\/a> to learn individual skills and a \u201cplanner\u201d that composes the skills to solve long-horizon tasks.<\/p>\n<p>DiffSkill uses a differentiable physics simulator to generate training examples for the skill abstractor. 
These samples show how to achieve a short-horizon goal with a single tool, such as using a roller to spread the dough or a spatula to displace the dough.<\/p>\n<p>These examples are presented to the skill abstractor as RGB-D videos. Given an image observation, the skill abstractor must predict whether the desired goal is feasible or not. The model learns and tunes its parameters by comparing its prediction with the actual outcome of the physics simulator.<\/p>\n<blockquote class=\"twitter-tweet\" data-width=\"500\" data-dnt=\"true\">\n<p lang=\"en\" dir=\"ltr\">Robotic manipulation of deformable objects like dough requires long-horizon reasoning over the use of different tools. Our method DiffSkill utilizes a differentiable simulator to learn and compose skills for these challenging tasks. <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/twitter.com\/hashtag\/ICLR2022?src=hash&amp;ref_src=twsrc%5Etfw\">#ICLR2022<\/a> <br \/>Website: <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/t.co\/1JFDUxfIyC\">https:\/\/t.co\/1JFDUxfIyC<\/a> <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/t.co\/rNRJ1XskGB\">pic.twitter.com\/rNRJ1XskGB<\/a><\/p>\n<p>\u2014 Xingyu Lin (@Xingyu2017) <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/twitter.com\/Xingyu2017\/status\/1519442478130253827?ref_src=twsrc%5Etfw\">April 27, 2022<\/a><\/p>\n<\/blockquote>\n<p>At the same time, DiffSkill trains a variational autoencoder (VAE) to learn a latent-space representation of the examples generated by the physics simulator. The VAE encodes images in a lower-dimension space that preserves important features and discards information that is not relevant to the task. 
By mapping the high-dimensional image space into the latent space, the VAE plays an important role in enabling DiffSkill to plan over long horizons and predict outcomes by observing sensory data.<\/p>\n<p>One of the important challenges of training the VAE is making sure it learns the right features and generalizes to the real world, where the composition of visual data differs from that generated by the physics simulator. For example, the color of the rolling pin or the table is not relevant to the task, but the position and angle of the roller and the location of the dough are.<\/p>\n<p>Currently, the researchers are using a technique called \u201cdomain randomization,\u201d which randomizes the irrelevant properties of the training environment such as background and lighting, and keeps the important features such as the position and orientation of tools. This makes the VAE more stable when applied to the real world.<\/p>\n<p>\u201cDoing this is not easy, as we need to cover all possible variations that are different between the simulation and the real world [known as the sim2real gap],\u201d Lin said. \u201cA better way is to use a 3D point cloud as representation of the scene, which is much easier to transfer from simulation to the real world. 
In fact, we are working on a follow-up project using point cloud as input.\u201d<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Planning_long-horizon_deformable_object_tasks\"><\/span>Planning long-horizon deformable object tasks<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-featured_img wp-image-1387240 js-lazy\" alt=\"DiffSkill uses a planner module to evaluate different combinations and sequences of skills that can accomplish the target goal.\" width=\"796\" height=\"219\" sizes=\"auto, (max-width: 796px) 100vw, 796px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-796x219.webp\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-796x219.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-280x77.webp 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-270x74.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-540x149.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner.webp 1392w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"#\" data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2022%2F05%2F22%2Fnew-deep-learning-technique-paves-path-to-pizza-making-robots%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: DiffSkill uses a planner module to evaluate different combinations and sequences of skills that can accomplish the target goal.\" data-title=\"Share DiffSkill uses a planner module to evaluate different combinations and sequences of skills that can accomplish the target goal. 
on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share DiffSkill uses a planner module to evaluate different combinations and sequences of skills that can accomplish the target goal. on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>DiffSkill uses a planner module to evaluate different combinations and sequences of skills that can accomplish the target goal.<\/figcaption><noscript><img decoding=\"async\" loading=\"lazy\" class=\"size-featured_img wp-image-1387240\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-796x219.webp\" alt=\"DiffSkill uses a planner module to evaluate different combinations and sequences of skills that can accomplish the target goal.\" width=\"796\" height=\"219\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-796x219.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-280x77.webp 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-270x74.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner-540x149.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-planner.webp 1392w\"\/><\/noscript><\/figure>\n<p>Once the skill abstractor is trained, DiffSkill uses the planner module to solve long-horizon tasks. The planner must determine the number and sequence of skills needed to go from the initial state to the destination.<\/p>\n<p>This planner iterates over possible combinations of skills and the intermediate outcomes they yield. The variational autoencoder comes in handy here. 
Instead of predicting full image outcomes, DiffSkill uses the VAE to predict the latent-space outcome of intermediate steps toward the final goal.<\/p>\n<p>The combination of abstract skills and latent-space representations makes it much more computationally efficient to draw a trajectory from the initial state to the goal. In fact, the researchers didn\u2019t need to optimize the search function and used an exhaustive search of all combinations.<\/p>\n<p>\u201cThe computation is not too much since we are planning over the skills and the horizon is not very long,\u201d Lin said. \u201cThis exhaustive search eliminates the need for designing a sketch for the planner and might lead to novel solutions not considered by the designer in a more general way, although we did not observe this in the limited tasks we tried. Furthermore, more sophisticated search techniques could be applied as well.\u201d<\/p>\n<p>According to the DiffSkill paper, \u201coptimization can be done efficiently in around 10 seconds for each skill combination on a single NVIDIA 2080Ti GPU.\u201d<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Preparing_the_pizza_dough_with_DiffSkill\"><\/span>Preparing the pizza dough with DiffSkill<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-featured_img wp-image-1387241 js-lazy\" alt=\"pizza\" width=\"796\" height=\"281\" sizes=\"auto, (max-width: 796px) 100vw, 796px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-796x281.webp\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-796x281.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-280x99.webp 280w, 
https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-270x95.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-540x191.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation.webp 1392w\"\/><noscript><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-featured_img wp-image-1387241\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-796x281.webp\" alt=\"pizza\" width=\"796\" height=\"281\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-796x281.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-280x99.webp 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-270x95.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation-540x191.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-pizza-dough-preparation.webp 1392w\"\/><\/noscript><\/figure>\n<p>The researchers tested the performance of DiffSkill against several baseline methods that have been applied to deformable objects, including two model-free reinforcement learning algorithms and a trajectory optimizer that only uses the physics simulator.<\/p>\n<p>The models were tested on several tasks that require multiple steps and tools. For example, in one of the tasks, the AI agent must lift the dough with a spatula, place it on a cutting board, and spread it with a roller.<\/p>\n<p>The results show that DiffSkill is significantly better than other techniques at solving long-horizon, multiple-tool tasks using only sensory information. 
The experiments show that when well trained, DiffSkill\u2019s planner can find good intermediate states between the initial and goal states and find decent sequences of skills to solve tasks.<\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img decoding=\"async\" loading=\"lazy\" class=\"size-featured_img wp-image-1387242 js-lazy\" alt=\"DiffSkill\u2019s planner can predict intermediate steps with impressive accuracy.\" width=\"796\" height=\"335\" sizes=\"auto, (max-width: 796px) 100vw, 796px\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-796x335.webp\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-796x335.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-280x118.webp 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-270x114.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-540x227.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions.webp 1392w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"#\" data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2022%2F05%2F22%2Fnew-deep-learning-technique-paves-path-to-pizza-making-robots%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: DiffSkill\u2019s planner can predict intermediate steps with impressive accuracy.\" data-title=\"Share DiffSkill\u2019s planner can predict intermediate steps with impressive accuracy. 
on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share DiffSkill\u2019s planner can predict intermediate steps with impressive accuracy. on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>DiffSkill\u2019s planner can predict intermediate steps with impressive accuracy.<\/figcaption><noscript><img decoding=\"async\" loading=\"lazy\" class=\"size-featured_img wp-image-1387242\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-796x335.webp\" alt=\"DiffSkill\u2019s planner can predict intermediate steps with impressive accuracy.\" width=\"796\" height=\"335\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-796x335.webp 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-280x118.webp 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-270x114.webp 270w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions-540x227.webp 540w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/DiffSkill-intermediate-steps-predictions.webp 1392w\"\/><\/noscript><\/figure>\n<p>\u201cOne takeaway is that a set of skills can provide very important temporal abstraction, allowing us to reason over long-horizon,\u201d Lin said. \u201cThis is also similar to how human approaches different tasks: thinking at different temporal abstractions instead of thinking what to do at every next second.\u201d<\/p>\n<p>However, there are also limits to DiffSkill\u2019s capacity. For example, when performing one of the tasks that required three-stage planning, DiffSkill\u2019s performance degrades significantly (though it is still better than other techniques). 
Lin also mentioned that in some cases, the feasibility predictor produces false positives. The researchers believe that learning a better latent space can help solve this problem.<\/p>\n<p>The researchers are also exploring other directions to improve DiffSkill, including a more efficient planner algorithm that can be used for longer horizon tasks.<\/p>\n<p>Lin hopes that one day, he can use DiffSkill on real pizza-making robots. \u201cWe are still far from this. Various challenges emerge from control, sim2real transfer, and safety. But we are now more confident at trying some long-horizon tasks,\u201d he said.<\/p>\n<p><em>This article was originally published by Ben Dickson on<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/\">TechTalks<\/a>, a publication that examines trends in <a href=\"https:\/\/buradabiliyorum.com\/en\/category\/technology\/\" data-internallinksmanager029f6b8e52c=\"4\" title=\"Technology\" target=\"_blank\" rel=\"noopener\">technology<\/a>, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech, and what we need to look out for. You can read the original article\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/bdtechtalks.com\/2022\/03\/28\/datarobot-no-code-ai\/\">here<\/a>.<\/em>\n                        <\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/new-deep-learning-technique-paves-path-to-pizza-making-robots\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;New deep learning technique paves path to pizza-making robots&#8221; This article is part of our coverage of the latest in AI research. For humans, working with deformable objects is not significantly more difficult than handling rigid objects. 
We learn naturally to shape them, fold them, and manipulate them in different ways and still recognize them&#8230;.<\/p>\n","protected":false},"author":1,"featured_media":451169,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/05\/RobotPizza.jpg&signature=6ac11f10998502e3a24f329632f83cba","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-451168","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/451168","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=451168"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/451168\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/451169"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=451168"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=451168"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=451168"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}