{"id":254012,"date":"2021-05-20T01:55:00","date_gmt":"2021-05-19T22:55:00","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/how-do-you-actually-use-regex-cloudsavvy-it\/"},"modified":"2021-05-20T01:55:00","modified_gmt":"2021-05-19T22:55:00","slug":"how-do-you-actually-use-regex-cloudsavvy-it","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/how-do-you-actually-use-regex-cloudsavvy-it\/","title":{"rendered":"#How Do You Actually Use Regex? \u2013 CloudSavvy IT"},"content":{"rendered":"<div id=\"article-content-area\">\n<p><img loading=\"lazy\" decoding=\"async\" class=\"type:primaryImage alignnone size-full wp-image-957\" data-pagespeed-lazy-src=\"https:\/\/www.cloudsavvyit.com\/p\/uploads\/2019\/06\/b0582a06.png?width=1200&amp;trim=1,1&amp;bg-color=000&amp;pad=1,1\" alt=\"\" width=\"700\" height=\"300\" src=\"\/pagespeed_static\/1.JiBnMqyl6S.gif\" onload=\"pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\" onerror=\"this.onerror=null;pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\"\/><\/p>\n<p>Regex, short for regular expression, is often used in programming languages for matching patterns in strings, find and replace, input validation, and reformatting text. Learning how to properly use Regex can make working with text much easier.<\/p>\n<h2 role=\"heading\" aria-level=\"2\"><span class=\"ez-toc-section\" id=\"Regex_Syntax_Explained\"><\/span>Regex Syntax, Explained<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Regex has a reputation for having horrendous syntax, but it\u2019s much easier to write than it is to read. 
For example, here is a general regex for <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/emailregex.com\/\">an\u00a0RFC 5322-compliant email validator<\/a>:<\/p>\n<pre>(?:[a-z0-9!#$%&amp;'*+\/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&amp;'*+\/=?^_`{|}~-]+)*|\"(?:[\\x01-\n\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")\n@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?\n:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-\n9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\n\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])<\/pre>\n<p>If it looks like someone smashed their face into the keyboard, you\u2019re not alone in thinking so. But under the hood, all of this mess is actually programming a <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/en.wikipedia.org\/wiki\/Finite-state_machine\">finite-state machine<\/a>. This machine runs for each character, chugging along and matching based on rules you\u2019ve set. Plenty of online tools will render railroad diagrams, showing how your Regex machine works. Here\u2019s that same Regex in visual form:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-929\" data-pagespeed-lazy-src=\"https:\/\/www.cloudsavvyit.com\/p\/uploads\/2019\/06\/c91fd21c.png?trim=1,1&amp;bg-color=000&amp;pad=1,1\" alt=\"\" width=\"700\" height=\"300\" src=\"\/pagespeed_static\/1.JiBnMqyl6S.gif\" onload=\"pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\" onerror=\"this.onerror=null;pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\"\/><\/p>\n<p>Still very confusing, but it\u2019s a lot more understandable. It\u2019s a machine with moving parts that have rules defining how it all fits together. 
You can see how someone assembled this; it\u2019s not just a big glob of text.<\/p>\n<h2 role=\"heading\" aria-level=\"2\"><span class=\"ez-toc-section\" id=\"First_Off_Use_a_Regex_Debugger\"><\/span>First Off: Use a Regex Debugger<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-937\" data-pagespeed-lazy-src=\"https:\/\/www.cloudsavvyit.com\/p\/uploads\/2019\/06\/02b9beb8.png?trim=1,1&amp;bg-color=000&amp;pad=1,1\" alt=\"\" width=\"700\" height=\"300\" src=\"\/pagespeed_static\/1.JiBnMqyl6S.gif\" onload=\"pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\" onerror=\"this.onerror=null;pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\"\/><\/p>\n<p>Before we begin, unless your Regex is particularly short or you\u2019re particularly proficient, you should use an online debugger when writing and testing it. It makes understanding the syntax much easier. We recommend\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/regex101.com\/\">Regex101<\/a>\u00a0and\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/regexr.com\/\">RegExr<\/a>, both of which offer testing and a built-in syntax reference.<\/p>\n<h2 role=\"heading\" aria-level=\"2\"><span class=\"ez-toc-section\" id=\"How_Does_Regex_Work\"><\/span>How Does Regex Work?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>For now, let\u2019s focus on something much simpler. 
This is a diagram from <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/jex.im\/regulex\/#!flags=&amp;re=%5E(a%7Cb)*%3F%24\">Regulex<\/a> for a very short (and definitely not\u00a0RFC 5322 compliant) email-matching Regex:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-922\" data-pagespeed-lazy-src=\"https:\/\/www.cloudsavvyit.com\/p\/uploads\/2019\/06\/bce5746f.png?trim=1,1&amp;bg-color=000&amp;pad=1,1\" alt=\"\" width=\"700\" height=\"200\" src=\"\/pagespeed_static\/1.JiBnMqyl6S.gif\" onload=\"pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\" onerror=\"this.onerror=null;pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\"\/><\/p>\n<p>The Regex engine starts at the left and travels down the lines, matching characters as it goes. Group #1 matches any character except a line break, and will continue to match characters until the next block finds a match. In this case, it stops when it reaches an\u00a0<code>@<\/code>\u00a0symbol, which means Group #1 captures the name of the email address and everything after matches the domain.<\/p>\n<p>The Regex that defines Group #1 in our email example is:<\/p>\n<pre>(.+)<\/pre>\n<p>The parentheses define a capture group, which tells the Regex engine to include the contents of this group\u2019s match in a special variable. When you run a Regex on a string, the default return is the entire match (in this case, the whole email). 
But it also returns each capture group, which makes this Regex useful for pulling names out of emails.<\/p>\n<p>The period is the symbol for \u201cAny Character Except Newline.\u201d This matches everything on a line, so if you passed this email Regex an address like:<\/p>\n<pre>%$#^&amp;%*#%$#^@gmail.com<\/pre>\n<p>It would match <code>%$#^&amp;%*#%$#^<\/code>\u00a0as the name, even though that\u2019s ludicrous.<\/p>\n<p>The plus (+) symbol is a control structure that means \u201cmatch the preceding character or group one or more times.\u201d It ensures that the whole name is matched, and not just the first character. This is what creates the loop found on the railroad diagram.<\/p>\n<p>The rest of the Regex is fairly simple to decipher:<\/p>\n<pre>(.+)@(.+\\..+)<\/pre>\n<p>The first group stops when it hits the <code>@<\/code>\u00a0symbol. The next group then starts, which again matches multiple characters until it reaches a period character.<\/p>\n<p>Because characters like periods, parentheses, and slashes are used as part of the syntax in Regex, anytime you want to match those characters you need to properly escape them with a backslash. In this example, to match the period we write <code>\\.<\/code>\u00a0and the parser treats it as one symbol meaning \u201cmatch a period.\u201d<\/p>\n<h2 role=\"heading\" aria-level=\"2\"><span class=\"ez-toc-section\" id=\"Character_Matching\"><\/span>Character Matching<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>If you have non-control characters in your Regex, the Regex engine will assume those characters will form a matching block. For example, the Regex:<\/p>\n<pre>he+llo<\/pre>\n<p>Will match the word \u201chello\u201d with one or more e\u2019s. Control characters need to be escaped with a backslash to be matched literally.<\/p>\n<p>Regex also has character classes, which act as shorthand for a set of characters. 
These can vary based on the Regex implementation, but these few are standard:<\/p>\n<ul>\n<li><code>.<\/code>\u00a0\u2013 matches anything except newline.<\/li>\n<li><code>\\w<\/code>\u00a0\u2013 matches any \u201cword\u201d character, including digits and underscores.<\/li>\n<li><code>\\d<\/code>\u00a0\u2013 matches digits.<\/li>\n<li><code>\\s<\/code>\u00a0\u2013 matches whitespace characters (i.e., space, tab, newline).<\/li>\n<\/ul>\n<p>The last three all have uppercase counterparts that invert their function. For example, <code>\\D<\/code>\u00a0matches anything that isn\u2019t a digit.<\/p>\n<p>Regex also has character-set matching. For example:<\/p>\n<pre>[abc]<\/pre>\n<p>Will match either <code>a<\/code>, <code>b<\/code>, or <code>c<\/code>. This acts as one block, and the square brackets are just control structures. Alternatively, you can specify a range of characters:<\/p>\n<pre>[a-c]<\/pre>\n<p>Or negate the set, which will match any character that isn\u2019t in the set:<\/p>\n<pre>[^a-c]<\/pre>\n<h2 role=\"heading\" aria-level=\"2\"><span class=\"ez-toc-section\" id=\"Quantifiers\"><\/span>Quantifiers<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Quantifiers are an important part of Regex. They let you match strings where you don\u2019t know the\u00a0<em>exact<\/em> format, but you have a pretty good idea.<\/p>\n<p>The <code>+<\/code>\u00a0operator from the email example is a quantifier, specifically the \u201cone or more\u201d quantifier. 
If we don\u2019t know how long a certain string is, but we know it\u2019s made up of alphanumeric characters (and isn\u2019t empty), we can write:<\/p>\n<pre>\\w+<\/pre>\n<p>In addition to <code>+<\/code>, there\u2019s also:<\/p>\n<ul>\n<li>The <code>*<\/code>\u00a0operator, which matches \u201czero or more.\u201d Essentially the same as <code>+<\/code>, except it also succeeds when there is nothing to match.<\/li>\n<li>The <code>?<\/code>\u00a0operator, which matches \u201czero or one.\u201d It has the effect of making a character optional; either it\u2019s there or it isn\u2019t, and it won\u2019t match more than once.<\/li>\n<li>Numerical quantifiers. These can be a single number like <code>{3}<\/code>, which means \u201cexactly 3 times,\u201d or a range like <code>{3,6}<\/code>. You can leave out the second number to make it unlimited. For example, <code>{3,}<\/code>\u00a0means \u201c3 or more times\u201d.\u00a0Oddly enough, in many engines you can\u2019t leave out the first number, so if you want \u201c3 or fewer times,\u201d you\u2019ll have to use a range like <code>{0,3}<\/code>.<\/li>\n<\/ul>\n<h2 role=\"heading\" aria-level=\"2\"><span class=\"ez-toc-section\" id=\"Greedy_and_Lazy_Quantifiers\"><\/span>Greedy and Lazy Quantifiers<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Under the hood, the <code>*<\/code>\u00a0and <code>+<\/code> operators are <em>greedy<\/em>. They match as much as possible, and then give back only what is needed for the next block to match. This can be a massive problem.<\/p>\n<p>Here\u2019s an example: say you\u2019re trying to match HTML, or anything else with closing brackets. Your input text is:<\/p>\n<pre>&lt;div&gt;Hello World&lt;\/div&gt;<\/pre>\n<p>And you want to match everything within the brackets. 
You may write something like:<\/p>\n<pre>&lt;.*&gt;<\/pre>\n<p>This is the right idea, but it fails for one crucial reason: the Regex engine matches \u201c<code>div&gt;Hello World&lt;\/div&gt;<\/code>\u201d for the sequence <code>.*<\/code>, and then backtracks until the next block matches, in this case, a closing bracket (<code>&gt;<\/code>). You would expect it to backtrack to only match \u201c<code>div<\/code>\u201d, and then repeat again to match the closing div. But the backtracker runs from the end of the string, and will stop on the final closing bracket, so the match spans the whole line, from the first opening bracket to the last closing one.<\/p>\n<p>The solution is to make our quantifier lazy, which means it will match as few characters as possible. Under the hood, a lazy quantifier starts with the shortest possible match and expands one character at a time until the next block matches, which can save a lot of backtracking on long inputs.<\/p>\n<p>Making a quantifier lazy is done by adding a question mark directly after the quantifier. This is a bit confusing because <code>?<\/code>\u00a0is already a quantifier (and is actually greedy by default). For our HTML example, the Regex is fixed with this simple addition:<\/p>\n<pre>&lt;.*?&gt;<\/pre>\n<p>The lazy operator can be tacked on to any quantifier, including <code>+?<\/code>, <code>{0,3}?<\/code>, and even <code>??<\/code>. The last one still matches zero or one characters, but it prefers zero and only expands to one if the rest of the pattern needs it.<\/p>\n<h2 role=\"heading\" aria-level=\"2\"><span class=\"ez-toc-section\" id=\"Grouping_and_Lookarounds\"><\/span>Grouping and Lookarounds<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Groups in Regex have a lot of purposes. At a basic level, they join together multiple tokens into one block. 
For example, you can create a group, then use a quantifier on the entire group:<\/p>\n<pre>ba(na)+<\/pre>\n<p>This groups the repeated \u201cna\u201d to match the phrase\u00a0<code>banana<\/code>, and <code>banananana<\/code>, and so on. Without the group, the Regex engine would just match the ending character over and over.<\/p>\n<p>This type of group with two simple parentheses is called a capture group, and its match is included in the output:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-947\" data-pagespeed-lazy-src=\"https:\/\/www.cloudsavvyit.com\/p\/uploads\/2019\/06\/02b9beb8-1.png?trim=1,1&amp;bg-color=000&amp;pad=1,1\" alt=\"\" width=\"700\" height=\"200\" src=\"\/pagespeed_static\/1.JiBnMqyl6S.gif\" onload=\"pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\" onerror=\"this.onerror=null;pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\"\/><\/p>\n<p>If you\u2019d like to avoid this, and simply group tokens together for execution reasons, you can use a non-capturing group:<\/p>\n<pre>ba(?:na)<\/pre>\n<p>The question mark (a reserved character) defines a non-standard group, and the following character defines what kind of group it is. Starting groups with a question mark is ideal, because otherwise if you wanted to match colons in a group, you\u2019d need to escape them for no good reason. But you\u00a0<em>always<\/em> have to escape question marks in Regex.<\/p>\n<p>You can also name your groups, for convenience, when working with the output:<\/p>\n<pre>(?'group')<\/pre>\n<p>You can reference these in your Regex, which makes them work similar to variables. You can reference non-named groups with the token <code>\\1<\/code>, but single-digit tokens only go up to <code>\\9<\/code>, after which it\u2019s easier to start naming groups. The syntax for referencing named groups is:<\/p>\n<pre>\\k{group}<\/pre>\n<p>This references the results of the named group, which can be dynamic. 
Essentially, it checks if the group occurs multiple times but doesn\u2019t care about the position. For example, this can be used to match all text between three identical words:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-948\" data-pagespeed-lazy-src=\"https:\/\/www.cloudsavvyit.com\/p\/uploads\/2019\/06\/bc6cff37.png?trim=1,1&amp;bg-color=000&amp;pad=1,1\" alt=\"\" width=\"700\" height=\"200\" src=\"\/pagespeed_static\/1.JiBnMqyl6S.gif\" onload=\"pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\" onerror=\"this.onerror=null;pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\"\/><\/p>\n<p>The group class is where you\u2019ll find most of Regex\u2019s control structure, including lookaheads. A lookahead requires that an expression matches, but doesn\u2019t include that expression in the result. In a way, it\u2019s similar to an if statement: the overall match fails if it returns false.<\/p>\n<p>The syntax for a positive lookahead is <code>(?=)<\/code>. Here\u2019s an example:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-950\" data-pagespeed-lazy-src=\"https:\/\/www.cloudsavvyit.com\/p\/uploads\/2019\/06\/bc6cff37-1.png?trim=1,1&amp;bg-color=000&amp;pad=1,1\" alt=\"\" width=\"350\" height=\"200\" src=\"\/pagespeed_static\/1.JiBnMqyl6S.gif\" onload=\"pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\" onerror=\"this.onerror=null;pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\"\/><\/p>\n<p>This matches the name part of an email address very cleanly, by stopping execution at the dividing <code>@<\/code>. 
Lookaheads don\u2019t consume any characters, so if you want to continue matching after a lookahead succeeds, you can still match the character used in the lookahead.<\/p>\n<p>In addition to positive lookaheads, there are also:<\/p>\n<ul>\n<li><code>(?!)<\/code>\u00a0\u2013 Negative lookaheads, which ensure an expression\u00a0<em>doesn\u2019t<\/em> match.<\/li>\n<li><code>(?&lt;=)<\/code>\u00a0\u2013 Positive lookbehinds, which are not supported everywhere due to some technical constraints. These are placed before the expression you want to match, and they must have a fixed width (i.e., no quantifiers except <code>{number}<\/code>). In this example, you could use\u00a0<code>(?&lt;=@)\\w+\\.\\w+<\/code>\u00a0to match the domain part of the email.<\/li>\n<li><code>(?&lt;!)<\/code>\u00a0\u2013 Negative lookbehinds, which are the same as positive lookbehinds, but negated.<\/li>\n<\/ul>\n<h2 role=\"heading\" aria-level=\"2\"><span class=\"ez-toc-section\" id=\"Differences_Between_Regex_Engines\"><\/span>Differences Between Regex Engines<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Not all Regex is created equal. Most Regex engines don\u2019t follow any specific standard, and some switch things up a bit to suit their language. Some features that work in one language may not work in another.<\/p>\n<p>For example, the versions of <code>sed<\/code>\u00a0compiled for macOS and FreeBSD do not support using <code>\\t<\/code>\u00a0to represent a tab character. You have to manually copy a tab character and paste it into the terminal to use a tab in command line <code>sed<\/code>.<\/p>\n<p>Most of this tutorial is compatible with PCRE, the default Regex engine used for PHP. 
But JavaScript\u2019s Regex engine is different: it doesn\u2019t support named capture groups with quotation marks (it wants angle brackets) and can\u2019t do recursion, among other things. Even PCRE isn\u2019t entirely compatible across its own versions, and it has <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/php.net\/manual\/en\/reference.pcre.pattern.differences.php\">many differences<\/a> from Perl regex.<\/p>\n<p>There are too many minor differences to list here, so you can use <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.regular-expressions.info\/refext.html\">this reference table<\/a> to compare the differences between multiple Regex engines. Also, Regex debuggers like <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/regex101.com\/\">Regex101<\/a> let you switch Regex engines, so make sure you\u2019re debugging using the correct engine.<\/p>\n<h2 role=\"heading\" aria-level=\"2\"><span class=\"ez-toc-section\" id=\"How_To_Run_Regex\"><\/span>How To Run Regex<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>We\u2019ve been discussing the matching portion of regular expressions, which is most of what makes up a Regex. But when you actually want to run your Regex, you\u2019ll need to form it into a full regular expression.<\/p>\n<p>This usually takes the format:<\/p>\n<pre>\/match\/g<\/pre>\n<p>Everything inside the forward slashes is our match. The\u00a0<code>g<\/code>\u00a0is a <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.regular-expressions.info\/refmodifiers.html\">mode modifier<\/a>. In this case, it tells the engine not to stop running after it finds the first match. 
For find and replace Regex, you\u2019ll often have to format it like:<\/p>\n<pre>\/find\/replace\/g<\/pre>\n<p>This replaces every match throughout the file. You can use capture group references when replacing, which makes Regex very good at formatting text. For example, this Regex will match any HTML tags and replace the standard brackets with square brackets:<\/p>\n<pre>\/&lt;(.+?)&gt;\/[\\1]\/g<\/pre>\n<p>When this runs, the engine will match <code>&lt;div&gt;<\/code>\u00a0and <code>&lt;\/div&gt;<\/code>, allowing you to replace this text (and this text only). As you can see, the inner HTML is unaffected:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-954\" data-pagespeed-lazy-src=\"https:\/\/www.cloudsavvyit.com\/p\/uploads\/2019\/06\/02b9beb8-2.png?trim=1,1&amp;bg-color=000&amp;pad=1,1\" alt=\"\" width=\"700\" height=\"300\" src=\"\/pagespeed_static\/1.JiBnMqyl6S.gif\" onload=\"pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\" onerror=\"this.onerror=null;pagespeed.lazyLoadImages.loadIfVisibleAndMaybeBeacon(this);\"\/><\/p>\n<p>This makes Regex very useful for finding and replacing text. The command line utility to do this is <code>sed<\/code>, which uses the basic format of:<\/p>\n<pre>sed 's\/find\/replace\/g' file &gt; newfile<\/pre>\n<p>This runs on a file, and outputs to STDOUT. You\u2019ll need to redirect the output to a new file (as shown here) to save the result; redirecting to the same file you\u2019re reading from would empty it before <code>sed<\/code> reads it. Many versions of <code>sed<\/code> also have an <code>-i<\/code> option for editing a file in place.<\/p>\n<p>Regex is also supported in many text editors, and can really speed up your workflow when doing batch operations. 
<a rel=\"nofollow noopener\" target=\"_blank\" href=\"http:\/\/vimregex.com\/\">Vim<\/a>, <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/flight-manual.atom.io\/using-atom\/sections\/find-and-replace\/\">Atom<\/a>, and <a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/redirect.viglink.com\/?u=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fprevious-versions%2F87a13zt0%28v%3Dvs.110%29&amp;key=204a528a336ede4177fff0d84a044482\">VS Code<\/a> all have Regex find and replace built in.<\/p>\n<p>Of course, Regex can also be used programmatically, and is built in to a lot of languages. The exact implementation will depend on the language, so you\u2019ll need to consult your language\u2019s documentation.<\/p>\n<p>For example, in JavaScript a regex can be created as a literal, or dynamically using the global RegExp object:<\/p>\n<pre>var re = new RegExp('abc')<\/pre>\n<p>This can be used directly by calling the <code>.exec()<\/code>\u00a0method of the newly created regex object, or by using the <code>.replace()<\/code>, <code>.match()<\/code>, and <code>.matchAll()<\/code>\u00a0methods on strings.<\/p>\n<\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/www.cloudsavvyit.com\/921\/how-do-you-actually-use-regex\/\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;#How Do You Actually Use Regex? \u2013 CloudSavvy IT&#8221; Regex, short for regular expression, is often used in programming languages for matching patterns in strings, find and replace, input validation, and reformatting text. Learning how to properly use Regex can make working with text much easier. 
Regex Syntax, Explained Regex has a reputation for having&#8230;<\/p>\n","protected":false},"author":1,"featured_media":254013,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/www.cloudsavvyit.com\/p\/uploads\/2019\/06\/b0582a06.png","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-254012","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/254012","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=254012"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/254012\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/254013"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=254012"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=254012"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=254012"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}