{"id":646881,"date":"2024-12-15T16:40:19","date_gmt":"2024-12-15T13:40:19","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/what-are-ai-world-models-and-why-do-they-matter\/"},"modified":"2024-12-15T16:40:19","modified_gmt":"2024-12-15T13:40:19","slug":"what-are-ai-world-models-and-why-do-they-matter","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/what-are-ai-world-models-and-why-do-they-matter\/","title":{"rendered":"What are AI &#8216;world models,&#8217; and why do they matter?"},"content":{"rendered":"<div>\n<p id=\"speakable-summary\" class=\"wp-block-paragraph\">World models, also known as world simulators, are being touted by some as the next big thing in AI. <\/p>\n<p class=\"wp-block-paragraph\">AI pioneer Fei-Fei Li\u2019s World Labs has raised $230 million to build \u201clarge world models,\u201d and DeepMind hired one of the creators of OpenAI\u2019s video generator, Sora, to work on \u201cworld simulators.\u201d (Sora was released on Monday; here are some early impressions.) <\/p>\n<p class=\"wp-block-paragraph\">But what the heck <em>are <\/em>these things?<\/p>\n<p class=\"wp-block-paragraph\">World models take inspiration from the mental models of the world that humans develop naturally. Our brains take the abstract representations from our senses and form them into a more concrete understanding of the world around us, producing what we called \u201cmodels\u201d long before AI adopted the phrase. The predictions our brains make based on these models influence how we perceive the world.<\/p>\n<p class=\"wp-block-paragraph\">A <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/arxiv.org\/pdf\/1803.10122\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">paper<\/a> by AI researchers David Ha and J\u00fcrgen Schmidhuber gives the example of a baseball batter. Batters have milliseconds to decide how to swing their bat \u2014 shorter than the time it takes for visual signals to reach the brain. 
They\u2019re able to hit a 100-mile-per-hour fastball because they can instinctively predict where the ball will go, Ha and Schmidhuber say.<\/p>\n<p class=\"wp-block-paragraph\">\u201cFor professional players, this all happens subconsciously,\u201d the research duo writes. \u201cTheir muscles reflexively swing the bat at the right time and location in line with their internal models\u2019 predictions. They can quickly act on their predictions of the future without the need to consciously roll out possible future scenarios to form a plan.\u201d<\/p>\n<p class=\"wp-block-paragraph\">It\u2019s these subconscious reasoning aspects of world models that some believe are prerequisites for human-level intelligence.<\/p>\n<h2 class=\"wp-block-heading\" id=\"h-modeling-the-world\"><span class=\"ez-toc-section\" id=\"Modeling_the_world\"><\/span>Modeling the world<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p class=\"wp-block-paragraph\">While the concept has been around for decades, world models have gained popularity recently in part because of their promising applications in the field of generative video.<\/p>\n<p class=\"wp-block-paragraph\">Most, if not all, AI-generated videos veer into uncanny valley territory. Watch them long enough and something bizarre will happen, like limbs twisting and merging into each other.<\/p>\n<p class=\"wp-block-paragraph\">While a generative model trained on years of video might accurately predict that a basketball bounces, it doesn\u2019t actually have any idea why \u2014 just like language models don\u2019t really understand the concepts behind words and phrases. 
But a world model with even a basic grasp of why the basketball bounces like it does will be better at depicting that bounce.<\/p>\n<p class=\"wp-block-paragraph\">To enable this kind of insight, world models are trained on a range of data, including photos, audio, videos, and text, with the intent of creating internal representations of how the world works, and the ability to reason about the consequences of actions.<\/p>\n<figure class=\"wp-block-image aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"480\" src=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/06\/ezgif-4-3c4969c67b.gif?w=680\" alt=\"Runway Gen-3\" class=\"wp-image-2796335\"\/><figcaption class=\"wp-element-caption\"><span class=\"wp-element-caption__text\">A sample from AI startup Runway\u2019s Gen-3 video generation model. <\/span><span class=\"wp-block-image__credits\"><strong>Image Credits:<\/strong>Runway<\/span><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">\u201cA viewer expects that the world they\u2019re watching behaves in a similar way to their reality,\u201d Alex Mashrabov, Snap\u2019s former chief of AI and the CEO of Higgsfield, which is building generative models for video, said. \u201cIf a feather drops with the weight of an anvil or a bowling ball shoots up hundreds of feet into the air, it\u2019s jarring and takes the viewer out of the moment. With a strong world model, instead of a creator defining how each object is expected to move \u2014 which is tedious, cumbersome, and a poor use of time \u2014 the model will understand this.\u201d<\/p>\n<p class=\"wp-block-paragraph\">But better video generation is only the tip of the iceberg for world models. 
Researchers including Meta chief AI scientist Yann LeCun say the models could someday be used for sophisticated forecasting and planning in both the digital and physical realm.<\/p>\n<p class=\"wp-block-paragraph\">In a talk earlier this year, LeCun described how a world model could help achieve a desired goal through reasoning. A model with a base representation of a \u201cworld\u201d (e.g. a video of a dirty room), given an objective (a clean room), could come up with a sequence of actions to achieve that objective (deploy vacuums to sweep, clean the dishes, empty the trash) not because that\u2019s a pattern it has observed but because it knows at a deeper level how to go from dirty to clean.<\/p>\n<p class=\"wp-block-paragraph\">\u201cWe need machines that understand the world; [machines] that can remember things, that have intuition, have common sense \u2014 things that can reason and plan to the same level as humans,\u201d LeCun said. \u201cDespite what you might have heard from some of the most enthusiastic people, current AI systems are not capable of any of this.\u201d<\/p>\n<p class=\"wp-block-paragraph\">While LeCun estimates that we\u2019re at least a decade away from the world models he envisions, today\u2019s world models are showing promise as elementary physics simulators.<\/p>\n<figure class=\"wp-block-image aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"450\" src=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/02\/ezgif-7-32e8e27b50-1.gif?w=680\" alt=\"OpenAI Sora Minecraft\" class=\"wp-image-2666345\"\/><figcaption class=\"wp-element-caption\"><span class=\"wp-element-caption__text\">Sora controlling a player in Minecraft \u2014 and rendering the world. 
<\/span><span class=\"wp-block-image__credits\"><strong>Image Credits:<\/strong>OpenAI<\/span><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">OpenAI notes in a blog post that Sora, which it considers to be a world model, can simulate actions like a painter leaving brush strokes on a canvas. Models like Sora \u2014 and Sora itself \u2014 can also effectively <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1803.10122\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">simulate<\/a> <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/2005.12126\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">video<\/a> <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/arstechnica.com\/gadgets\/2024\/03\/googles-genie-model-creates-interactive-2d-worlds-from-a-single-image\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">games<\/a>. For example, Sora can render a Minecraft-like UI and game world.<\/p>\n<p class=\"wp-block-paragraph\">Future world models may be able to generate 3D worlds on demand for gaming, virtual photography, and more, World Labs co-founder Justin Johnson said on an <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/a16z.com\/podcast\/the-frontier-of-spatial-intelligence-with-fei-fei-li\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">episode<\/a> of the a16z podcast.<\/p>\n<p class=\"wp-block-paragraph\">\u201cWe already have the ability to create virtual, interactive worlds, but it costs hundreds and hundreds of millions of dollars and a ton of development time,\u201d Johnson said. 
\u201c[World models] will let you not just get an image or a clip out, but a fully simulated, vibrant, and interactive 3D world.\u201d<\/p>\n<h2 class=\"wp-block-heading\" id=\"h-high-hurdles\"><span class=\"ez-toc-section\" id=\"High_hurdles\"><\/span>High hurdles<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p class=\"wp-block-paragraph\">While the concept is enticing, many technical challenges stand in the way.<\/p>\n<p class=\"wp-block-paragraph\">Training and running world models requires massive compute power even compared to the amount currently used by generative models. While some of the latest language models can run on a modern smartphone, Sora (arguably an early world model) would require thousands of GPUs to train and run, especially if such models come into common use.<\/p>\n<p class=\"wp-block-paragraph\">World models, like all AI models, also hallucinate \u2014 and internalize biases in their training data. A world model trained largely on videos of sunny weather in European cities might struggle to comprehend or depict Korean cities in snowy conditions, for example, or simply do so incorrectly.<\/p>\n<p class=\"wp-block-paragraph\">A general lack of training data threatens to exacerbate these issues, says Mashrabov. <\/p>\n<p class=\"wp-block-paragraph\">\u201cWe have seen models being really limited with generations of people of a certain type or race,\u201d he said. 
\u201cTraining data for a world model must be broad enough to cover a diverse set of scenarios, but also highly specific to where the AI can deeply understand the nuances of those scenarios.\u201d<\/p>\n<p class=\"wp-block-paragraph\">In a recent <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/runwayml.com\/research\/introducing-general-world-models\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">post<\/a>, AI startup Runway\u2019s CEO, Crist\u00f3bal Valenzuela, says that data and engineering issues prevent today\u2019s models from accurately capturing the behavior of a world\u2019s inhabitants (e.g. humans and animals). \u201cModels will need to generate consistent maps of the environment,\u201d he said, \u201cand the ability to navigate and interact in those environments.\u201d<\/p>\n<figure class=\"wp-block-image aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"790\" height=\"442\" src=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/02\/ezgif-1-02294151d2.gif?w=680\" alt=\"OpenAI Sora\" class=\"wp-image-2666128\"\/><figcaption class=\"wp-element-caption\"><span class=\"wp-element-caption__text\">A Sora-generated video. <\/span><span class=\"wp-block-image__credits\"><strong>Image Credits:<\/strong>OpenAI<\/span><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">If all the major hurdles are overcome, though, Mashrabov believes that world models could \u201cmore robustly\u201d bridge AI with the real world \u2014 leading to breakthroughs not only in virtual world generation but robotics and AI decision-making.<\/p>\n<p class=\"wp-block-paragraph\">They could also spawn more capable robots. <\/p>\n<p class=\"wp-block-paragraph\">Robots today are limited in what they can do because they don\u2019t have an awareness of the world around them (or their own bodies). 
World models could give them that awareness, Mashrabov said \u2014 at least to a point.<\/p>\n<p class=\"wp-block-paragraph\">\u201cWith an advanced world model, an AI could develop a personal understanding of whatever scenario it\u2019s placed in,\u201d he said, \u201cand start to reason out possible solutions.\u201d<\/p>\n<\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/techcrunch.com\/2024\/12\/14\/what-are-ai-world-models-and-why-do-they-matter\/\" target=\"_blank\" >Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>World models, also known as world simulators, are being touted by some as the next big thing in AI. 
AI pioneer Fei-Fei Li\u2019s World Labs has raised $230 million to build \u201clarge world models,\u201d and DeepMind hired one of the creators of OpenAI\u2019s video generator, Sora, to work on \u201cworld simulators.\u201d (Sora was released on&#8230;<\/p>\n","protected":false},"author":1,"featured_media":646882,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/techcrunch.com\/wp-content\/uploads\/2021\/04\/GettyImages-1205638055.jpg?resize=1200,800","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[77337,130390,151410,153190],"class_list":["post-646881","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology","tag-ai","tag-artificial-intelligence-ai","tag-evergreens","tag-world-models"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/646881","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=646881"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/646881\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/646882"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=646881"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=646881"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=646881"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{re
l}","templated":true}]}}