{"id":690939,"date":"2025-09-19T21:30:44","date_gmt":"2025-09-19T18:30:44","guid":{"rendered":"https:\/\/buradabiliyorum.com\/en\/ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models\/"},"modified":"2025-09-19T21:30:44","modified_gmt":"2025-09-19T18:30:44","slug":"ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models\/","title":{"rendered":"AI progress stalls for SEO tasks despite wave of new models"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_85 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<label for=\"ez-toc-cssicon-toggle-item-6a360d196b861\" class=\"ez-toc-cssicon-toggle-label\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #dd3333;color:#dd3333\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #dd3333;color:#dd3333\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input type=\"checkbox\"  id=\"ez-toc-cssicon-toggle-item-6a360d196b861\" checked aria-label=\"Toggle\" \/><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 
ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/buradabiliyorum.com\/en\/ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models\/#Claude_still_leads_ChatGPT-5_rebounds_and_Gemini_rises_%E2%80%93_but_new_AI_models_show_limits_in_handling_real-world_SEO_tasks\" >Claude still leads, ChatGPT-5 rebounds, and Gemini rises \u2013\u00a0but new AI models show limits in handling real-world SEO tasks.<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/buradabiliyorum.com\/en\/ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models\/#The_AI_SEO_Benchmark\" >The AI SEO Benchmark<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/buradabiliyorum.com\/en\/ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models\/#Initial_findings\" >Initial findings<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/buradabiliyorum.com\/en\/ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models\/#A_new_wave_of_models\" >A new wave of models<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/buradabiliyorum.com\/en\/ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models\/#Has_AI_progress_slowed_down\" >Has AI progress slowed down?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/buradabiliyorum.com\/en\/ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models\/#Google_is_the_dark_horse\" >Google is the dark horse<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/buradabiliyorum.com\/en\/ai-progress-stalls-for-seo-tasks-despite-wave-of-new-models\/#Applying_the_Benchmark_Where_AI_stands_today\" >Applying the 
Benchmark: Where AI stands today<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"subhead\" itemprop=\"alternativeHeadline\"><span class=\"ez-toc-section\" id=\"Claude_still_leads_ChatGPT-5_rebounds_and_Gemini_rises_%E2%80%93_but_new_AI_models_show_limits_in_handling_real-world_SEO_tasks\"><\/span>Claude still leads, ChatGPT-5 rebounds, and Gemini rises \u2013\u00a0but new AI models show limits in handling real-world SEO tasks.<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><\/p>\n<div class=\"bialty-container\">\n<p>Recent AI model releases in the latter half of 2025 have not improved at performing SEO-related tasks.<\/p>\n<p><strong>TL;DR: What you need to know about the LLM benchmark<\/strong><\/p>\n<ul class=\"wp-block-list\">\n<li>Claude Opus 4.1 remains the best language model for performing SEO-related tasks like technical SEO, localization, SEO strategy, and on-page optimization.<\/li>\n<li>ChatGPT-5 has improved in our benchmark despite the public\u2019s negative reaction to its initial release.<\/li>\n<li>Copilot, which leverages GPT-5, is as performant as OpenAI\u2019s model. This is a major upgrade as it previously underperformed.<\/li>\n<li>Gemini 2.5 Pro is a strong third option. It has the most potential impact for SEOs and marketers due to the base product integration (Gmail, Sheets, Slides, Docs) and AI-focused modalities that push its utility even further (Opal, NotebookLM).<\/li>\n<\/ul>\n<h2 id=\"the-ai-seo-benchmark\" class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_AI_SEO_Benchmark\"><\/span>The AI SEO Benchmark<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>In April, Previsible launched the <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/previsible.io\/ai-seo-benchmark\/\" target=\"_blank\" rel=\"noopener\">AI SEO Benchmark<\/a>, a structured effort to evaluate how effectively large language models (LLMs) can perform real-world SEO tasks. 
This study focused on answering two core questions:<\/p>\n<ol class=\"wp-block-list\">\n<li>Can AI reliably perform SEO tasks at an expert level?<\/li>\n<li>As these models improve, will their utility change how marketers should resource for SEO and GEO tasks?<\/li>\n<\/ol>\n<p>To answer these, we curated a comprehensive set of questions across multiple SEO disciplines: content strategy, on-page optimization, link building, and technical SEO. These questions were developed by a team of seasoned SEO professionals with 10+ years of experience in their respective specialties.<\/p>\n<p>We then ran leading LLMs through this battery of questions, scoring their responses out of 100. This benchmarking approach mirrors how AI performance is tested in fields like software development, mathematical reasoning, and logic-based tasks.<\/p>\n<h3 class=\"wp-block-heading\" id=\"h-initial-findings\"><span class=\"ez-toc-section\" id=\"Initial_findings\"><\/span>Initial findings<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Our first benchmark in April delivered impressive, albeit unsurprising, results:<\/p>\n<ul class=\"wp-block-list\">\n<li>LLMs performed well across content-focused SEO tasks like keyword strategy and metadata creation.<\/li>\n<li>However, LLMs struggled with technical SEO, where precision and predictable thinking are critical.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\" id=\"h-a-new-wave-of-models\"><span class=\"ez-toc-section\" id=\"A_new_wave_of_models\"><\/span>A new wave of models<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Since then, the landscape has changed dramatically. Nearly every major AI provider has released a new model (with the notable exception of Meta\u2019s Llama).
With this influx of updated capabilities, we\u2019ve re-run the benchmark and refreshed the leaderboard.<\/p>\n<p>So how do the latest models stack up? And what does this mean for how SEO teams allocate time, tools, and talent?<\/p>\n<p>In the next installment, we\u2019ll share updated scores, performance breakdowns by SEO discipline, and implications for marketers.\u00a0<\/p>\n<p>A lot has changed since April, so let\u2019s take a look at the Leaderboard now that nearly all major AI firms have released new models (except for Llama).<\/p>\n<figure class=\"wp-block-image size-full\"><img fetchpriority=\"high\" decoding=\"async\" width=\"2048\" height=\"1482\" src=\"https:\/\/searchengineland.com\/wp-content\/seloads\/2025\/09\/llm-leaderboard-sept-10-2025-scaled.jpg\" alt=\"Llm Leaderboard Sept 10 2025 Scaled\" class=\"wp-image-462243\" srcset=\"https:\/\/searchengineland.com\/wp-content\/seloads\/2025\/09\/llm-leaderboard-sept-10-2025-scaled.jpg 2048w, https:\/\/searchengineland.com\/wp-content\/seloads\/2025\/09\/llm-leaderboard-sept-10-2025-768x556.jpg 768w, https:\/\/searchengineland.com\/wp-content\/seloads\/2025\/09\/llm-leaderboard-sept-10-2025-1536x1111.jpg 1536w\" sizes=\"(max-width: 2048px) 100vw, 2048px\"><\/figure>\n<p><a rel=\"nofollow\" target=\"_blank\"
href=\"https:\/\/previsible.io\/ai-seo-benchmark\/\"><\/a><a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/previsible.io\/ai-seo-benchmark\/\" target=\"_blank\" rel=\"noopener\">AI SEO Benchmark<\/a><\/p>\n<p>The benchmark has seen some movement but hasn\u2019t broken through the ceiling of what was possible in April.<\/p>\n<p>If you\u2019re not a trained SEO, I\u2019d be extremely cautious about trusting LLMs to perform SEO tasks. <\/p>\n<p>In researching this post, we reached out to the SEO community for examples of AI run amok.\u00a0<\/p>\n<p>Here are a few examples:<\/p>\n<ul class=\"wp-block-list\">\n<li>When I first started using AI for SEO, it found 404 errors for URLs that didn\u2019t exist, which AI claimed had backlinks. I presented these findings to the dev team and management as some sort of big \u201cwin.\u201d<\/li>\n<li>I needed to perform a rank drop analysis for a large site with a short turnaround time. I ran the analysis through ChatGPT and was impressed by the categorization and the insights. The team was excited and wanted a deep dive, further analysis, and a presentation of the findings. When I dug a little deeper, all of the underlying \u201canalysis\u201d turned out to be meaningfully off base, and I had to start over and looked foolish.<\/li>\n<li>LLMs do not comply with wordcounts; they don\u2019t even understand them, so I\u2019m led to believe. So, I ran a script that automated a couple thousand pages of HTML edits and the result was full paragraphs of content and essays in title tags (usual max characters 160!) that also cost way more than I wanted to pay for!<\/li>\n<\/ul>\n<p>These are anecdotal experiences, but they come from professional SEOs. 
If you\u2019re an executive who cares about search, you still need trained SEOs who can utilize LLMs properly.<\/p>\n<h2 id=\"has-ai-progress-slowed-down\" class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Has_AI_progress_slowed_down\"><\/span>Has AI progress slowed down?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>If you are not \u201cAGI-pilled,\u201d you\u2019ve probably noticed the moderate pace of change this year. There is disruption, but it is mostly impacting the <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/www.cnbc.com\/2025\/08\/18\/openai-sam-altman-warns-ai-market-is-in-a-bubble.html\" target=\"_blank\" rel=\"noopener\">hype bubble<\/a>, with ChatGPT-5 notably underperforming after its debut.<\/p>\n<p>That isn\u2019t surprising, given that <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/www.reuters.com\/technology\/artificial-intelligence\/openai-rivals-seek-new-path-smarter-ai-current-methods-hit-limitations-2024-11-11\/\" target=\"_blank\" rel=\"noopener\">Ilya Sutskever told Reuters<\/a> last year that \u201cscaling up pre-training\u2014the phase of training an AI model that uses a vast amount of unlabeled data to understand language patterns and structures\u2014has plateaued.\u201d<\/p>\n<p>AI will continue to progress. This benchmark focuses on current utility for businesses.<\/p>\n<p>If these tools aren\u2019t providing value or efficiency in our current workflows, what good are they? Google has been making gains in that area.<\/p>\n<h2 id=\"google-is-the-dark-horse\" class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Google_is_the_dark_horse\"><\/span>Google is the dark horse<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>A year ago, I had written off Google\u2019s early Gemini models. As an early user, I found them underwhelming and, frankly, unusable.
However, my perspective has completely shifted with the release of Gemini 2.5 Pro.<\/p>\n<p>Gemini 2.5 not only performs impressively in our benchmark, but it\u2019s also deeply integrated across the Google ecosystem. That\u2019s where its true advantage lies. <\/p>\n<p>I can now draft an email that automatically understands the context of documents I\u2019ve created in Google Drive, reference meetings from Calendar, or pull insights from Google Docs and Sheets, all within a single interface. That\u2019s a real, seamless utility that no other LLM currently offers at scale.<\/p>\n<p>While many LLMs struggle to build a sustainable moat, Google already has one: ubiquitous data integration. The ability to retrieve and act on relevant information across all Google products is a strategic advantage that\u2019s hard to replicate.<\/p>\n<p>Is it perfect? Not yet. However, if the pace of product improvement continues, Google could quietly become the most dominant player in applied AI.<\/p>\n<h2 id=\"applying-the-benchmark-where-ai-stands-today\" class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Applying_the_Benchmark_Where_AI_stands_today\"><\/span>Applying the Benchmark: Where AI stands today<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>We built this benchmark to be a living tool, something we\u2019ll continue to update as new models are released and capabilities evolve. 
So where do things stand as of September 2025?<\/p>\n<p><strong>Can AI reliably perform SEO tasks at an expert level?<\/strong><\/p>\n<p><strong>No.<\/strong> Despite major advancements in LLMs, most still lack expert-level execution, especially in areas requiring nuanced strategy, technical precision, or systems thinking.<\/p>\n<p><strong>Will model improvements change how marketers resource SEO and GEO functions?<\/strong><\/p>\n<p><strong>Not meaningfully.<\/strong> We\u2019re seeing incremental gains in speed and support for certain tasks, but not enough to warrant a full shift in team structure or investment strategy. The utility lies in efficiency gains, not automation at scale.<\/p>\n<p>In short, don\u2019t expect ChatGPT or Gemini to replace your SEO team. Expect them to enhance it when used wisely. <\/p>\n<p>AI still disappoints on complex tasks. But the gap is closing.<\/p>\n<p>Stay tuned to the benchmark. More importantly, start leveraging these tools before your competitors do. Early adoption isn\u2019t just a productivity boost \u2013\u00a0it\u2019s a strategic advantage.<\/p>\n<\/div>\n
<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/searchengineland.com\/ai-progress-stalls-seo-tasks-new-models-462238\" target=\"_blank\" >Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Claude still leads, ChatGPT-5 rebounds, and Gemini rises \u2013\u00a0but new AI models show limits in handling real-world SEO tasks. Recent AI model releases in the latter half of 2025 have not improved at performing SEO-related tasks.
TL;DR: What you need to know about the LLM benchmark Claude Opus 4.1 remains the best language model for&#8230;<\/p>\n","protected":false},"author":1,"featured_media":690940,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/searchengineland.com\/wp-content\/seloads\/2025\/09\/AI-SEO-progress-stalls.png","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-690939","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/690939","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=690939"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/690939\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/690940"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=690939"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=690939"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=690939"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}