{"id":657998,"date":"2025-03-20T11:07:39","date_gmt":"2025-03-20T08:07:39","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/pruna-ai-open-sources-its-ai-model-optimization-framework\/"},"modified":"2025-03-20T11:07:39","modified_gmt":"2025-03-20T08:07:39","slug":"pruna-ai-open-sources-its-ai-model-optimization-framework","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/pruna-ai-open-sources-its-ai-model-optimization-framework\/","title":{"rendered":"#Pruna AI open sources its AI model optimization framework"},"content":{"rendered":"<div>\n<p id=\"speakable-summary\" class=\"wp-block-paragraph\"><a rel=\"nofollow\" target=\"_blank\" rel=\"nofollow\" href=\"https:\/\/www.pruna.ai\/\">Pruna AI<\/a>, a European startup that has been working on compression algorithms for AI models, is making its optimization framework <a rel=\"nofollow\" target=\"_blank\" rel=\"nofollow\" href=\"https:\/\/github.com\/PrunaAI\/pruna\">open source<\/a> on Thursday.<\/p>\n<p class=\"wp-block-paragraph\">Pruna AI has been creating a framework that applies several efficiency methods, such as caching, pruning, quantization and distillation, to a given AI model.
<\/p>\n<p class=\"wp-block-paragraph\">\u201cWe also standardize saving and loading the compressed models, applying combinations of these compression methods, and also evaluating your compressed model after you compress it,\u201d Pruna AI co-founder and CTO John Rachwan told TechCrunch.<\/p>\n<p class=\"wp-block-paragraph\">In particular, Pruna AI\u2019s framework can evaluate whether compressing a model causes significant quality loss and measure the performance gains you get.<\/p>\n<p class=\"wp-block-paragraph\">\u201cIf I were to use a metaphor, we are similar to how Hugging Face standardized transformers and diffusers: how to call them, how to save them, load them, etc. We are doing the same, but for efficiency methods,\u201d he added.<\/p>\n<p class=\"wp-block-paragraph\">Big AI labs already use various compression methods. For instance, OpenAI has been relying on distillation to create faster versions of its flagship models.<\/p>\n<p class=\"wp-block-paragraph\">This is likely how OpenAI developed GPT-4 Turbo, a faster version of GPT-4. Similarly, the <a rel=\"nofollow\" target=\"_blank\" rel=\"nofollow\" href=\"https:\/\/huggingface.co\/black-forest-labs\/FLUX.1-schnell\">Flux.1-schnell<\/a> image generation model is a distilled version of the Flux.1 model from Black Forest Labs.<\/p>\n<p class=\"wp-block-paragraph\">Distillation is a technique used to extract knowledge from a large AI model with a \u201cteacher-student\u201d model. Developers send requests to a teacher model and record the outputs. Answers are sometimes compared with a dataset to see how accurate they are. These outputs are then used to train the student model, which learns to approximate the teacher\u2019s behavior.<\/p>\n<p class=\"wp-block-paragraph\">\u201cFor big companies, what they usually do is that they build this stuff in-house. And what you can find in the open source world is usually based on single methods. 
For example, let\u2019s say one quantization method for LLMs, or one caching method for diffusion models,\u201d Rachwan said. \u201cBut you cannot find a tool that aggregates all of them, makes them all easy to use and combine together. And this is the big value that Pruna is bringing right now.\u201d<\/p>\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"5201\" height=\"2772\" src=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg\" alt=\"\" class=\"wp-image-2983501\" srcset=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg 5201w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=150,80 150w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=300,160 300w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=768,409 768w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=680,362 680w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=1200,640 1200w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=1280,682 1280w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=430,229 430w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=720,384 720w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=900,480 900w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=800,426 800w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=1536,819 1536w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=2048,1092 2048w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=668,356 668w, 
https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=1158,617 1158w, https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI-Co-founders.jpg?resize=708,377 708w\" sizes=\"auto, (max-width: 5201px) 100vw, 5201px\"\/><figcaption class=\"wp-element-caption\"><span class=\"wp-element-caption__text\">Left to right: Rayan Nait Mazi, Bertrand Charpentier, John Rachwan, Stephan G\u00fcnnemann<\/span><span class=\"wp-block-image__credits\"><strong>Image Credits:<\/strong> Pruna AI<\/span><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">While Pruna AI supports any kind of model, from large language models to diffusion models, speech-to-text models and computer vision models, the company is focusing more specifically on image and video generation models right now.<\/p>\n<p class=\"wp-block-paragraph\">Some of Pruna AI\u2019s existing users include <a rel=\"nofollow\" target=\"_blank\" rel=\"nofollow\" href=\"https:\/\/www.scenario.com\/\">Scenario<\/a> and <a rel=\"nofollow\" target=\"_blank\" rel=\"nofollow\" href=\"https:\/\/www.photoroom.com\/\">PhotoRoom<\/a>. In addition to the open source edition, Pruna AI has an enterprise offering with advanced optimization features including an optimization agent.<\/p>\n<p class=\"wp-block-paragraph\">\u201cThe most exciting feature that we are releasing soon will be a compression agent,\u201d Rachwan said. \u201cBasically, you give it your model, you say: \u2018I want more speed but don\u2019t drop my accuracy by more than 2%.\u2019 And then, the agent will just do its magic. It will find the best combination for you, return it for you. You don\u2019t have to do anything as a developer.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Pruna AI charges by the hour for its pro version. 
\u201cIt\u2019s similar to how you would think of a GPU when you rent a GPU on AWS or any cloud service,\u201d Rachwan said.<\/p>\n<p class=\"wp-block-paragraph\">And if your model is a critical part of your AI infrastructure, you\u2019ll end up saving a lot of money on inference with the optimized model. For example, Pruna AI has made a Llama model eight times smaller without too much loss using its compression framework. Pruna AI hopes its customers will think about its compression framework as an investment that pays for itself.<\/p>\n<p class=\"wp-block-paragraph\">Pruna AI raised a $6.5 million seed funding round a few months ago. Investors in the startup include EQT Ventures, Daphni, Motier Ventures and Kima Ventures.<\/p>\n<\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/techcrunch.com\/2025\/03\/20\/pruna-ai-open-sources-its-ai-model-optimization-framework\/\" target=\"_blank\" >Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pruna AI, a European startup that has been working on compression algorithms for AI models, is making its optimization framework open source on Thursday. 
Pruna AI has been creating a framework that applies several efficiency methods, such as caching, pruning, quantization and distillation, to a given AI model. \u201cWe also standardize saving and loading the&#8230;<\/p>\n","protected":false},"author":1,"featured_media":657999,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/techcrunch.com\/wp-content\/uploads\/2025\/03\/Pruna-AI.png?resize=1200,608","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[77337,28340,32232,43905,70917],"class_list":["post-657998","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology","tag-ai","tag-europe","tag-france","tag-germany","tag-startups"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/657998","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=657998"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/657998\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/657999"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=657998"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=657998"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=657998"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}