{"id":203689,"date":"2021-03-16T21:02:13","date_gmt":"2021-03-16T18:02:13","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/solving-big-ais-big-energy-problem\/"},"modified":"2021-03-16T21:02:13","modified_gmt":"2021-03-16T18:02:13","slug":"solving-big-ais-big-energy-problem","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/solving-big-ais-big-energy-problem\/","title":{"rendered":"#Solving Big AI\u2019s Big Energy Problem"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_85 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<label for=\"ez-toc-cssicon-toggle-item-6a3a1c5e9f5dd\" class=\"ez-toc-cssicon-toggle-label\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #dd3333;color:#dd3333\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #dd3333;color:#dd3333\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input type=\"checkbox\"  id=\"ez-toc-cssicon-toggle-item-6a3a1c5e9f5dd\" checked aria-label=\"Toggle\" \/><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-1\" 
href=\"https:\/\/buradabiliyorum.com\/en\/solving-big-ais-big-energy-problem\/#Now_Computing_Power_Doubling_Every_34_Months\" >Now: Computing Power Doubling Every 3.4 Months<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/buradabiliyorum.com\/en\/solving-big-ais-big-energy-problem\/#The_Future_is_Getting_Small\" >The Future is Getting Small\u00a0<\/a><\/li><\/ul><\/nav><\/div>\n<p>&#8220;<strong>#Solving Big AI\u2019s Big Energy Problem<\/strong>&#8221;<br \/>\n<img decoding=\"async\" src=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2015\/05\/artificialintelligence-1200x673.jpg\" \/><\/p>\n<div>\n                                <p><span style=\"font-weight: 400;\">It seems that the more ground-breaking deep learning models are in AI, the more massive they get. This summer\u2019s most buzzed-about model for natural language processing, GPT-3, is a perfect example. To reach the accuracy and speed required to write like a human, the model <\/span> <span style=\"font-weight: 400;\">needed 175 billion parameters, 350 GB of memory and an estimated <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/venturebeat.com\/2020\/06\/01\/ai-machine-learning-openai-gpt-3-size-isnt-everything\/\"><span style=\"font-weight: 400;\">$12 million<\/span><\/a><span style=\"font-weight: 400;\"> to train (think of training as the \u201clearning\u201d phase). But, beyond cost alone, big AI models like this have a big energy problem.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">UMass Amherst <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/1906.02243\"><span style=\"font-weight: 400;\">researchers found<\/span><\/a><span style=\"font-weight: 400;\"> that the computing power needed to train a large AI model can produce over 600,000 pounds of CO2 emissions \u2013 roughly five times what a typical car emits over its lifespan! 
These models often consume even more energy when they run in real-world production settings (otherwise known as the inference phase). <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.forbes.com\/sites\/moorinsights\/2019\/05\/09\/google-cloud-doubles-down-on-nvidia-gpus-for-inference\/#e08756f67926\"><span style=\"font-weight: 400;\">NVIDIA estimates<\/span><\/a><span style=\"font-weight: 400;\"> that 80-90 percent of the cost of running a neural network model is incurred during inference, rather than training.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To make more progress in the AI field, popular opinion suggests we\u2019ll have to make a huge environmental tradeoff. But that\u2019s not the case. Big models can be shrunk down to size to run on an everyday workstation or server, without having to sacrifice accuracy and speed. First, though, let\u2019s look at why machine learning models got so big in the first place.<\/span><\/p>\n<h3><span class=\"ez-toc-section\" id=\"Now_Computing_Power_Doubling_Every_34_Months\"><\/span><strong>Now: Computing Power Doubling Every 3.4 Months<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">A little over a decade ago, researchers at Stanford University discovered that the processors that power the complex graphics in video games, called GPUs, could be <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/neuralmagic.com\/blog\/history-gpus\/\"><span style=\"font-weight: 400;\">used for deep learning<\/span><\/a><span style=\"font-weight: 400;\"> models. 
This discovery led to a race to create more and more powerful dedicated hardware for deep learning applications. In turn, the models data scientists created became bigger and bigger. The logic was that bigger models would lead to more accurate outcomes. The more powerful the hardware, the faster these models would run.\u00a0<\/span><\/p>\n<p><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/openai.com\/blog\/ai-and-compute\/\"><span style=\"font-weight: 400;\">Research from OpenAI<\/span><\/a><span style=\"font-weight: 400;\"> shows that this assumption has been widely adopted in the field. Between 2012 and 2018, the computing power used for deep learning models doubled every 3.4 months. That means that over a six-year period, the computing power used for AI grew a shocking 300,000x. As noted above, this power goes not only to training algorithms but also to running them in production settings. More recent <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/arxiv.org\/pdf\/2007.05558.pdf\"><span style=\"font-weight: 400;\">research from MIT<\/span><\/a><span style=\"font-weight: 400;\"> suggests that we may reach the upper limits of computing power sooner than we think.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">What\u2019s more, resource constraints have kept the use of deep learning algorithms limited to those who can afford it. When deep learning can be applied to everything from detecting cancerous cells in medical imaging to stopping hate speech online, we can\u2019t afford to limit access. 
Then again, we can\u2019t afford the environmental consequences of building ever bigger, more power-hungry models.<\/span><\/p>\n<h3><span class=\"ez-toc-section\" id=\"The_Future_is_Getting_Small\"><\/span><strong>The Future is Getting Small\u00a0<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Luckily, researchers have found a number of new ways to shrink deep learning models and repurpose training datasets via smarter algorithms. That way, big models can run in production settings with less power and still achieve the results each use case requires. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">These techniques have the potential to democratize machine learning for organizations that don\u2019t have millions of dollars to invest in training algorithms and moving them into production. This is especially important for \u201cedge\u201d use cases, where larger, specialized AI hardware is not physically practical. Think tiny devices like cameras, car dashboards, smartphones, and more.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Researchers are shrinking models by removing some of the unneeded connections in neural networks (<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/2004.14340\"><span style=\"font-weight: 400;\">pruning<\/span><\/a><span style=\"font-weight: 400;\">), or by performing some of their mathematical operations with lower-precision numbers that are cheaper to process (<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.qualcomm.com\/news\/onq\/2019\/03\/12\/heres-why-quantization-matters-ai\"><span style=\"font-weight: 400;\">quantization<\/span><\/a><span style=\"font-weight: 400;\">). These smaller, faster models can run almost anywhere, with accuracy and performance similar to those of their larger counterparts. That means we\u2019ll no longer need to race to the top of computing power, causing even more environmental damage. 
Making big models smaller and more efficient is the future of deep learning.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Another major issue is training big models over and over again on new datasets for different use cases. A technique called <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.amazon.science\/blog\/when-does-transfer-learning-work\"><span style=\"font-weight: 400;\">transfer learning<\/span><\/a><span style=\"font-weight: 400;\"> can help prevent this problem. Transfer learning uses pretrained models as a starting point. The model\u2019s knowledge can be \u201ctransferred\u201d to a new task using a limited dataset, without having to retrain the original model from scratch. This is a crucial step toward cutting down on the computing power, energy and money required to train new models.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The bottom line? Models can (and should) be shrunk whenever possible to use less computing power. And knowledge can be recycled and reused instead of starting the deep learning training process from scratch. Ultimately, finding ways to reduce model size and related computing power (without sacrificing performance or accuracy) will be the next great unlock for deep learning. That way, anyone will be able to run these applications in production at lower cost, without having to make a massive environmental tradeoff. 
Anything is possible when we think small about big AI \u2013 even the next application to help stop the devastating effects of climate change.<\/span><\/p>\n<p class=\"c-post-pubDate\">\n                                    Published March 16, 2021 \u2014 18:02 UTC\n                                <\/p>\n<\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/neural\/2021\/03\/16\/solving-big-ais-big-energy-problem\/\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;#Solving Big AI\u2019s Big Energy Problem&#8221; It seems that the more ground-breaking deep learning models are in AI, the more massive they 
get. This summer\u2019s most buzzed-about model for natural language processing, GPT-3, is a perfect example. To reach the levels of accuracy and speed to write like a human, the model needed 175 billion&#8230;<\/p>\n","protected":false},"author":1,"featured_media":203690,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2015\/05\/artificialintelligence.jpg&signature=7e1770f07441ac16dc6ee963fe61acc6","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-203689","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/203689","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=203689"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/203689\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/203690"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=203689"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=203689"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=203689"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}