{"id":507482,"date":"2022-11-07T11:58:58","date_gmt":"2022-11-07T08:58:58","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/large-language-models-like-gpt-3-arent-good-enough-for-pharma-and-finance\/"},"modified":"2022-11-07T11:58:58","modified_gmt":"2022-11-07T08:58:58","slug":"large-language-models-like-gpt-3-arent-good-enough-for-pharma-and-finance","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/large-language-models-like-gpt-3-arent-good-enough-for-pharma-and-finance\/","title":{"rendered":"Large language models like GPT-3 aren\u2019t good enough for pharma and finance"},"content":{"rendered":"<h1><span class=\"ez-toc-section\" id=\"%E2%80%9CLarge_language_models_like_GPT-3_arent_good_enough_for_pharma_and_finance%E2%80%9D\"><\/span>&#8220;Large
language models like GPT-3 aren\u2019t good enough for pharma and finance&#8221;<span class=\"ez-toc-section-end\"><\/span><\/h1>\n<div id=\"article-main-content\">\n                            <p><span style=\"font-weight: 400;\">Natural language processing<\/span><span style=\"font-weight: 400;\"> (NLP) is among the most exciting subsets of machine learning. It lets us talk to computers like they\u2019re people and vice versa. Siri, Google Translate, and the helpful chatbot on your bank\u2019s website are all powered by this kind of AI \u2014 but not all NLP systems are created equal.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In today\u2019s AI landscape, smaller, targeted models trained on essential data are often better for business endeavors. However, there are massive NLP systems capable of incredible feats of communication. Called \u2018<\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/hai.stanford.edu\/news\/examining-emergent-abilities-large-language-models\"><span style=\"font-weight: 400;\">large language models<\/span><\/a><span style=\"font-weight: 400;\">\u2019 (LLMs), these can answer plain-language queries and generate novel text. Unfortunately, they\u2019re mostly novelty acts unsuited for the kind of specialty work most professional organizations need from AI systems.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">OpenAI\u2019s <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/openai.com\/api\/\"><span style=\"font-weight: 400;\">GPT-3<\/span><\/a><span style=\"font-weight: 400;\">, one of the most popular LLMs, is a mighty feat of engineering. But it\u2019s also prone to outputting text that\u2019s subjective, inaccurate, or nonsensical. 
This makes these huge, popular models unfit for industries where accuracy is important.<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"A_lucrative_outlook\"><\/span><span style=\"font-weight: 400;\">A lucrative outlook<\/span><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">While there\u2019s no such thing as a sure bet in the world of STEM, the forecast for NLP technologies in Europe is bright and sunny for the foreseeable future. 
The global market for NLP is estimated at about $13.5 billion today, but experts believe the market in Europe alone will swell to <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.globenewswire.com\/en\/news-release\/2022\/08\/11\/2497065\/0\/en\/Natural-Language-Processing-Market-Size-is-projected-to-reach-USD-91-Billion-by-2030-growing-at-a-CAGR-of-27-Straits-Research.html\"><span style=\"font-weight: 400;\">more than $21 billion<\/span><\/a><span style=\"font-weight: 400;\"> by 2030.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This indicates a wide-open market for new startups to form alongside established industry actors, such as <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.dataiku.com\/company\/\"><span style=\"font-weight: 400;\">Dataiku<\/span><\/a><span style=\"font-weight: 400;\"> and <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.arria.com\/origins-overview\/\"><span style=\"font-weight: 400;\">Arria NLG<\/span><\/a><span style=\"font-weight: 400;\">. The former, Dataiku, was founded in Paris, but performed extremely well on the global funding stage and now has offices around the world. The latter, Arria NLG, is essentially a University of Aberdeen spinout that\u2019s expanded well beyond its Scottish origins. Both companies have found massive success with their natural language processing products by focusing on data-centric solutions that produce verifiable, accurate results for enterprise, pharma, and government services.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">One reason for the massive success of these particular companies is that it\u2019s extremely difficult to train and build AI models that are trustworthy. An LLM trained on a massive dataset, for example, will tend to output \u2018fake news\u2019 in the form of random statements. 
This is useful when you\u2019re looking for writing ideas or inspiration, but it\u2019s entirely untenable when accuracy and factual outputs are important.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">I spoke with Emmanuel Walckenaer, the CEO of one such company, <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/yseop.com\/\"><span style=\"font-weight: 400;\">Yseop<\/span><\/a><span style=\"font-weight: 400;\">. His Paris-based outfit is an AI startup that specializes in using NLP for natural language generation (NLG) in standardized industries such as pharma and finance. According to him, when it comes to building AI for these domains, there\u2019s no margin for error. \u201cIt has to be perfect,\u201d he told TNW.<\/span><\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" class=\"size-full wp-image-1391992 js-lazy\" alt=\"Yseop CEO Emmanuel Walckenaer\" width=\"1280\" height=\"720\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" https:=\"\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO.jpg 1280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-280x158.jpg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-240x135.jpg 240w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-480x270.jpg 480w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-796x448.jpg 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-1200x675.jpg 1200w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"#\" data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2022%2F11%2F07%2Flarge-language-models-like-gpt-3-arent-good-enough-for-pharma-finance%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: Yseop CEO Emmanuel Walckenaer\" 
data-title=\"Share Yseop CEO Emmanuel Walckenaer on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share Yseop CEO Emmanuel Walckenaer on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>Yseop CEO Emmanuel Walckenaer<\/figcaption><noscript><img loading=\"lazy\" class=\"size-full wp-image-1391992\" https:=\"\" alt=\"Yseop CEO Emmanuel Walckenaer\" width=\"1280\" height=\"720\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO.jpg 1280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-280x158.jpg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-240x135.jpg 240w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-480x270.jpg 480w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-796x448.jpg 796w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/yseopCEO-1200x675.jpg 1200w\"\/><\/noscript><\/figure>\n<h2><span class=\"ez-toc-section\" id=\"The_problem_with_LLMs\"><\/span><span style=\"font-weight: 400;\">The problem with LLMs<\/span><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">You\u2019d be hard-pressed to find a more popular topic among AI journalists in 2022 than LLMs such as GPT-3 and <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/blog.google\/technology\/ai\/lamda\/\"><span style=\"font-weight: 400;\">Google\u2019s LaMDA<\/span><\/a><span style=\"font-weight: 400;\">. For the first time in history, pundits can \u201ctalk\u201d to a machine and that makes for fun, compelling articles. 
Not to mention the fact that these models have gotten so good at imitating humans that <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.washingtonpost.com\/technology\/2022\/07\/22\/google-ai-lamda-blake-lemoine-fired\/\"><span style=\"font-weight: 400;\">some experts even think they\u2019re becoming sentient<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">While these systems are impressive, as mentioned above, they are usually <\/span><i><span style=\"font-weight: 400;\">completely untrustworthy<\/span><\/i><span style=\"font-weight: 400;\">. They\u2019re brittle, unreliable, and prone to making things up. In layperson\u2019s terms: they\u2019re dumb liars. This is because of the way they are trained.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">LLMs are amazing marriages of mathematics and linguistics. But, at their most basic, they\u2019re beholden to the data they\u2019re trained on. You can\u2019t train an AI on, for example, <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/imerit.net\/blog\/top-11-reddit-datasets-for-machine-learning-all-pbm\/\"><span style=\"font-weight: 400;\">a corpus of Reddit posts<\/span><\/a><span style=\"font-weight: 400;\">, and not expect it to produce some factual inconsistencies. As the old saying goes, you get out what you put in.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If, for example, you trained an LLM on a dataset full of cooking recipes, you could then develop a system capable of generating new recipes on demand. 
You might ask it to generate a novel recipe for something that isn\u2019t in its database \u2014 such as, perhaps, a gummy bear curry.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Just like a human chef would have to tap into their cooking background in order to figure out how to integrate gummy bears into something resembling a curry dish, the AI would attempt to throw together a new recipe based on the ones it had been trained on. If it had been trained on a database of curry recipes, there\u2019s a reasonable chance it\u2019d output something at least close to what a human might come up with given the same task.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">However, if the team training the AI used a giant dataset full of billions or trillions of internet files that have nothing to do with curry, there\u2019s no telling what the machine might spit out. It might give you a great recipe, it might output a random diatribe on NBA superstar Stephen Curry.\u00a0<\/span><\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" class=\"size-full wp-image-1391999 js-lazy\" alt=\"steph curry\" width=\"1024\" height=\"1145\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" https:=\"\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry.jpeg 1024w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry-188x210.jpeg 188w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry-121x135.jpeg 121w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry-241x270.jpeg 241w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry-796x890.jpeg 796w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"#\" 
data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2022%2F11%2F07%2Flarge-language-models-like-gpt-3-arent-good-enough-for-pharma-finance%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: While Steph Curry is an amazing basketball player, he might not be the curry you\u2019re looking for. Credit: Keith Allison\" data-title=\"Share While Steph Curry is an amazing basketball player, he might not be the curry you\u2019re looking for. Credit: Keith Allison on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share While Steph Curry is an amazing basketball player, he might not be the curry you\u2019re looking for. Credit: Keith Allison on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>While Steph Curry is an amazing basketball player, he might not be the curry you\u2019re looking for. Credit: <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.flickr.com\/photos\/27003603@N00\">Keith Allison<\/a><\/figcaption><noscript><img loading=\"lazy\" class=\"size-full wp-image-1391999\" https:=\"\" alt=\"steph curry\" width=\"1024\" height=\"1145\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry.jpeg 1024w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry-188x210.jpeg 188w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry-121x135.jpeg 121w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry-241x270.jpeg 241w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/steph-curry-796x890.jpeg 796w\"\/><\/noscript><\/figure>\n<p><span style=\"font-weight: 400;\">That\u2019s sort of the fun part about working with huge LLMs, you never quite know what you\u2019ll get when you query them. 
However, there\u2019s no room for that kind of uncertainty in medical, financial, or business intelligence reports.<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Reigning_in_human_knowledge_for_machine_use\"><\/span><span style=\"font-weight: 400;\">Reining in human knowledge for machine use<\/span><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">The companies developing AI solutions for standardized industries don\u2019t have the luxury of brute-force training giant models on the biggest databases around just to see what they\u2019re capable of. The output from their systems is typically submitted for review by governing authorities such as the US FDA and global financial regulators. For this reason, these organizations have to be very careful about what kind of data they train their models on.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Walckenaer told me that Yseop\u2019s first priority is ensuring the data they use to train their systems is both accurate and ethically sourced. This means using only the applicable data and anonymizing it to remove any personally identifiable information, so that no one\u2019s privacy is compromised.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Next, the company has to ensure its machine learning systems are free of bias, omission, and <\/span><i><span style=\"font-weight: 400;\">hallucination<\/span><\/i><span style=\"font-weight: 400;\">. 
Yes, you read that right: blackbox AI systems have <\/span><a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.wired.com\/story\/ai-has-a-hallucination-problem-thats-proving-tough-to-fix\/\"><span style=\"font-weight: 400;\">a tendency to hallucinate<\/span><\/a><span style=\"font-weight: 400;\">, and that\u2019s a huge problem if you\u2019re trying to output information that\u2019s 100% accurate.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To overcome the problem of hallucination, Yseop relies on having humans in the loop at every stage. The company\u2019s algorithms and neural networks are co-developed by math wizards, linguistics experts, and AI developers. Their databases consist of data sourced directly from the researchers and businesses being served by the product. And the majority of their offerings are conducted via SaaS and designed to \u201caugment\u201d human professionals \u2014 as opposed to replacing them.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">With humans involved at every stage, there are checks in place to ensure the AI doesn\u2019t take the data it\u2019s been given and \u201challucinate\u201d new, made-up information. This, for example, keeps the system from using real patient data as a template for outputting fake data about patients that don\u2019t exist.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The next problem devs need to overcome with language processing is omission. This happens when an AI model skips over pertinent or essential parts of its database when it outputs information.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Massive LLMs such as GPT-3 don\u2019t really suffer from the omission problem \u2014 you never know what to expect from these \u201canything goes\u201d systems anyway. 
But targeted models that are designed to help professionals and businesses sort through finite datasets are only useful if they can be \u201ccontainerized\u201d in such a way as to surface all of the relevant information.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The last major hurdle that huge LLMs usually fail to pass is bias. One of the most common forms of bias is <\/span><i><span style=\"font-weight: 400;\">technical<\/span><\/i><span style=\"font-weight: 400;\">. This occurs when systems are designed in such a way that the outputs they produce don\u2019t follow the scientific method.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">A prime example of technical bias would be teaching a machine to \u201cpredict\u201d a person\u2019s sexuality. Since there\u2019s no scientific basis for this kind of AI (see our article on why supposed \u201cgaydars\u201d are <\/span><span style=\"font-weight: 400;\">nothing but hogwash and snake oil<\/span><span style=\"font-weight: 400;\">), they\u2019re only able to produce made-up outputs by employing pure technical bias.<\/span><\/p>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" class=\"size-full wp-image-1392000 js-lazy\" alt=\"no gaydar\" width=\"1116\" height=\"628\" sizes=\"auto, (max-width: 1116px) 100vw, 1116px\" https:=\"\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar.jpeg 1116w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar-280x158.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar-240x135.jpeg 240w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar-480x270.jpeg 480w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar-796x448.jpeg 796w\"\/><figcaption><a rel=\"nofollow noopener\" target=\"_blank\" href=\"#\" 
data-url=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Feditorial.thenextweb.com%2Fneural%2F2022%2F11%2F07%2Flarge-language-models-like-gpt-3-arent-good-enough-for-pharma-finance%2F&amp;via=thenextweb&amp;related=thenextweb&amp;text=Check out this picture on: The only logical response when you hear about an AI-driven \u201cgaydar.\u201d\" data-title=\"Share The only logical response when you hear about an AI-driven \u201cgaydar.\u201d on Twitter\" data-width=\"685\" data-height=\"500\" class=\"post-image-share popitup\" title=\"Share The only logical response when you hear about an AI-driven \u201cgaydar.\u201d on Twitter\"><i class=\"icon icon--inline icon--twitter--dark\"\/><\/a>The only logical response when you hear about an AI-driven \u201cgaydar.\u201d<\/figcaption><noscript><img loading=\"lazy\" class=\"size-full wp-image-1392000\" https:=\"\" alt=\"no gaydar\" width=\"1116\" height=\"628\" srcset=\"https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar.jpeg 1116w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar-280x158.jpeg 280w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar-240x135.jpeg 240w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar-480x270.jpeg 480w, https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/no-gaydar-796x448.jpeg 796w\"\/><\/noscript><\/figure>\n<p><span style=\"font-weight: 400;\">Other common forms of bias that can creep into NLP and NLG models include human bias \u2014 this happens when humans improperly label data due to cultural or intentional misinterpretation \u2014 and institutional bias.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The last one can be a huge problem for organizations that rely on accurate data and outputs to make important decisions. 
In standardized industries such as pharma and finance, this kind of bias can produce poor outcomes for patients and contribute to financial ruin. Suffice it to say that bias is among the biggest problems in AI, and LLMs such as GPT-3 are, essentially, as biased as the databases they\u2019re trained on.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Though it can be difficult to eliminate bias outright, it can be mitigated by using only the highest quality, hand-checked data, and ensuring that the system\u2019s \u201cparameters\u201d \u2014 essentially, the virtual dials and knobs that allow developers to fine-tune an AI\u2019s outputs \u2014 are properly adjusted.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">GPT-3 and similar models are capable of mind-blowing feats of prose and, occasionally, they even fool some experts. But they\u2019re entirely unsuited for standardized industries where accuracy and accountability are paramount.<\/span><\/p>\n<h2><span class=\"ez-toc-section\" id=\"Why_use_AI_at_all\"><\/span><span style=\"font-weight: 400;\">Why use AI at all?<\/span><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><span style=\"font-weight: 400;\">It can start to seem like a bad idea to employ LLMs or NLP\/NLG at all when the stakes are high. In the pharmaceutical industry, for example, bias or omission could have a massive impact on the accuracy of clinical reports. And who wants to trust a machine that hallucinates with their financial future?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Luckily for all of us, companies such as Yseop don\u2019t use open-ended datasets full of unchecked information. 
Sure, you\u2019re unlikely to get Yseop\u2019s pharma models to write a song or produce a decent curry recipe (with their current datasets), but, because the data and parameters governing their outputs are carefully vetted, they can be trusted for the tasks they\u2019re built for.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">But that still raises the question: why use AI at all? We\u2019ve gotten by this far with non-automated software solutions.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Walckenaer told me there may soon be no other choice. According to him, the human workforce can\u2019t keep up \u2014 at least in the pharmaceutical industry.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u201cThe need for medical writers is going to triple in the next ten years,\u201d says Walckenaer, who also added that Yseop\u2019s systems can provide a 50% efficiency gain for applicable industries. That\u2019s a game-changer. And there\u2019s even good news for those who fear being displaced by machines. He assured us that Yseop\u2019s systems were meant to augment skilled human labor, not replace people.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In other standardized industries, such as finance or in the domain of business intelligence, NLP and NLG can help minimize or even eliminate human error. 
That might not be as exciting as using an LLM capable of pretending to chat with you as a famous historical figure or generating fake news at the push of a button, but it currently saves thousands of businesses around the world time and money.<\/span>\n                        <\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/large-language-models-like-gpt-3-arent-good-enough-for-pharma-finance\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;Large language models like GPT-3 aren\u2019t good enough for pharma and finance&#8221; Natural language processing (NLP) is among the most exciting subsets of machine learning. It lets us talk to computers like they\u2019re people and vice versa. 
Siri, Google Translate, and the helpful chat bot on your bank\u2019s website are all powered by this kind&#8230;<\/p>\n","protected":false},"author":1,"featured_media":507483,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2022\/11\/aigraphs.jpg&signature=7e3a1de5fd74a4c31d798f0168a4c8b1","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-507482","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/507482","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=507482"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/507482\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/507483"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=507482"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=507482"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=507482"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}