ES6 RegEx
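In JavaScript, regular expressions (RegEx) are patterns used to match character combinations in strings. ECMAScript 6 (ES6, also called ES2015) added the sticky (y) and Unicode (u) flags, and later editions (ES2018 and ES2022) added named capturing groups, the dotAll (s) flag, Unicode property escapes, and match indices (d). Together these make regular expressions more powerful and easier to work with.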
Key Features of ES6 Regular Expressions
Named Capturing Groups
ES2018 introduced named capturing groups, written (?<name>...), which let you assign names to capturing groups. This makes it easier to reference the captured values, especially in more complex patterns.
let regex = /(?<fName>\w+) (?<lName>\w+)/;
let s = "Rahul Kumar";
let match = s.match(regex);
console.log(match.groups.fName);
console.log(match.groups.lName);
Output
Rahul
Kumar
In the above example, we have a regular expression that captures the first and last name as fName and lName using named groups. We can then access these groups directly via match.groups.fName and match.groups.lName.
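Named groups can also be referenced inside a replacement string and inside the pattern itself. The following is a minimal sketch; the variable names swapped and repeatedWord are chosen here for illustration only.

```javascript
// Pattern with two named groups, as in the example above.
const nameRegex = /(?<fName>\w+) (?<lName>\w+)/;

// $<name> in a replacement string inserts the text captured by that group.
const swapped = "Rahul Kumar".replace(nameRegex, "$<lName>, $<fName>");
console.log(swapped); // Kumar, Rahul

// \k<name> is a backreference: it matches the same text the group captured.
const repeatedWord = /\b(?<word>\w+) \k<word>\b/;
console.log(repeatedWord.test("this is is a typo")); // true
console.log(repeatedWord.test("this is a test"));    // false
```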
The y (Sticky) Modifier
The y modifier makes the regular expression match only at the position stored in its lastIndex property, rather than searching forward from there. This is useful for matching substrings sequentially, for example when tokenizing input.
let regex = /hello/y;
let s = "hello world hello";
console.log(regex.exec(s));
console.log(regex.exec(s));
Output
[ 'hello', index: 0, input: 'hello world hello', groups: undefined ]
null
With the y modifier, a match is attempted only at the exact position given by lastIndex, which starts at 0 and moves to the end of each successful match. In the above code, the first call matches at index 0 and sets lastIndex to 5. The text at index 5 is " world", not "hello", so the second call to exec() returns null and resets lastIndex to 0.
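A common use of sticky matching is a small tokenizer that tries each pattern at the current position and never skips ahead. The sketch below assumes a toy grammar of integers, four arithmetic operators, and whitespace; the names tokenTypes and tokenize are hypothetical.

```javascript
// Hypothetical token patterns for illustration. Each one is sticky, so it
// can only match at the position stored in lastIndex.
const tokenTypes = [
  ["number", /\d+/y],
  ["operator", /[+\-*\/]/y],
  ["space", /\s+/y],
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [type, regex] of tokenTypes) {
      regex.lastIndex = pos;            // tell the sticky regex where to look
      const m = regex.exec(input);
      if (m) {
        if (type !== "space") tokens.push({ type, value: m[0] });
        pos = regex.lastIndex;          // exec() moved lastIndex past the match
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character at index ${pos}`);
    }
  }
  return tokens;
}

console.log(tokenize("12 + 34*5").map((t) => t.value)); // [ '12', '+', '34', '*', '5' ]
```

Without the y flag, each pattern would search forward from lastIndex and could silently skip over characters that belong to no token.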
The u (Unicode) Modifier
The u modifier allows regular expressions to properly handle Unicode characters, including characters outside the Basic Multilingual Plane (BMP), such as emoji, which JavaScript strings store as two UTF-16 code units.
let regex = /\u{1F600}/u;
let s = "😀";
console.log(regex.test(s));
Output
true
The u flag makes the engine treat the pattern and the string as sequences of Unicode code points rather than 16-bit code units, and it enables the \u{...} escape. Without it, \u{1F600} is not read as a code point escape, so this regular expression would not match the emoji.
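The difference between code units and code points is easiest to see with the dot. This short sketch compares the same patterns with and without the u flag; the variable name face is illustrative.

```javascript
const face = "😀"; // U+1F600, stored as two UTF-16 code units

console.log(face.length);        // 2
console.log(/^.$/.test(face));   // false: "." matches a single code unit
console.log(/^.$/u.test(face));  // true: "." matches the whole code point

// Without u, \u{1F600} is not treated as a code point escape.
console.log(/\u{1F600}/.test(face));  // false
console.log(/\u{1F600}/u.test(face)); // true
```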
The s (Dotall) Modifier
The s modifier, added in ES2018, allows the dot (.) in regular expressions to match line terminators as well. Before it, the usual workaround was a character class such as [\s\S].
let regex = /foo.bar/s;
let s = "foo\nbar";
console.log(regex.test(s));
Output
true
The s modifier enables the . character to match any character, including newline characters (\n). Without the s flag, . does not match newline characters, and the above test would return false.
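For comparison, the same input can be tested without the flag, with the flag, and with the older character-class workaround. This is a small sketch; the variable name text is illustrative.

```javascript
const text = "foo\nbar";

console.log(/foo.bar/.test(text));      // false: "." stops at the newline
console.log(/foo.bar/s.test(text));     // true
console.log(/foo[\s\S]bar/.test(text)); // true: workaround that needs no flag

// The dotAll property reports whether the s flag is set.
console.log(/foo.bar/s.dotAll);         // true
```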
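The d (Match Indices) Modifier
The d flag was standardized in ES2022 (not ES6) and is exposed through the hasIndices property. When it is set, each match result carries an indices array holding the start and end offsets of the full match and of every capturing group, including named groups. Current versions of the major browsers and Node.js support it.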
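In engines that implement ES2022, the d flag adds an indices array to each match result, giving the start and end offsets of the match and of each group. The sketch below assumes such an engine (for example a recent Node.js release); the pattern and input string are made up for illustration.

```javascript
const dateRegex = /(?<year>\d{4})-(?<month>\d{2})/d;
const m = dateRegex.exec("Released 2024-12");

console.log(m.indices[0]);           // [ 9, 16 ]  full match
console.log(m.indices.groups.year);  // [ 9, 13 ]
console.log(m.indices.groups.month); // [ 14, 16 ]
console.log(dateRegex.hasIndices);   // true
```

Each pair is a half-open range, so "Released 2024-12".slice(9, 13) yields "2024".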
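The n Modifier (Not Available in JavaScript)
JavaScript does not define an n flag; passing it to the RegExp constructor throws a SyntaxError. Named groups need no flag at all: any match against a pattern containing (?<name>...) exposes the captured values on the groups property of the result.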
Regular Expression Syntax Enhancements Since ES6
Apart from the modifiers, editions after ES6 brought several improvements to the syntax of regular expressions.
Unicode Property Escapes
ES2018 introduced the \p{Property=Value} syntax (and the negated form \P{...}) for matching characters based on their Unicode properties. It is only recognized when the u flag is set, and it allows for more powerful and specific character matching.
let regex = /\p{Script=Greek}/u;
let s = "αβγ";
console.log(regex.test(s));
Output
true
The regular expression matches any characters from the Greek script, making it easier to work with characters from specific languages or scripts.
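Combined with the g flag and a quantifier, a script property can pull runs of one script out of mixed text. This is a minimal sketch with a made-up input string; greekRuns and mixed are illustrative names.

```javascript
const greekRuns = /\p{Script=Greek}+/gu;
const mixed = "x = αβγ, y = δ";

console.log(mixed.match(greekRuns)); // [ 'αβγ', 'δ' ]

// \P{...} (capital P) negates the property.
console.log(/^\P{Script=Greek}+$/u.test("hello")); // true
console.log(/^\P{Script=Greek}+$/u.test("αβγ"));   // false
```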
Expanded Character Classes
Property escapes also accept Unicode general categories, such as \p{L} for letters and \p{N} for numbers. Like all property escapes they require the u flag, and they give more precise control over pattern matching when dealing with international text.
let regex = /\p{L}+/gu;
let s = "Hello 123";
console.log(s.match(regex));
Output
[ 'Hello' ]
The \p{L} matches any Unicode letter. Using g for global matching, we extract all letter sequences in the string.
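The benefit over ASCII ranges shows up as soon as the text is not English. The following sketch uses an illustrative string containing Cyrillic, Japanese, and digits.

```javascript
const sample = "Привет мир 東京 42";

console.log(sample.match(/\p{L}+/gu));   // [ 'Привет', 'мир', '東京' ]
console.log(sample.match(/\p{N}+/gu));   // [ '42' ]

// An ASCII-only class finds no letters in this string at all.
console.log(sample.match(/[a-zA-Z]+/g)); // null
```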
Practical Examples of ES6 Regular Expressions
Validating a Date Format (YYYY-MM-DD)
let regex = /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/;
let s = "2024-12-31";
let match = s.match(regex);
if (match) {
    console.log(match.groups.year);
    console.log(match.groups.month);
    console.log(match.groups.day);
}
Output
2024
12
31
The regular expression checks that the string has the YYYY-MM-DD shape and extracts the year, month, and day using named groups. It does not check ranges, so a string such as 2024-13-45 also matches.
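A pattern like this only checks the shape of the string, so a range check usually follows. One possible approach, sketched below, round-trips the captured parts through a Date object; isValidDate is a hypothetical helper, not a built-in.

```javascript
// Hypothetical helper: returns true only for real calendar dates.
function isValidDate(text) {
  const m = /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/.exec(text);
  if (!m) return false;

  const year = Number(m.groups.year);
  const month = Number(m.groups.month);
  const day = Number(m.groups.day);

  // Date.UTC rolls out-of-range values over (Feb 30 becomes Mar 1),
  // so comparing the parts after a round trip exposes invalid dates.
  // Limitation: Date.UTC maps years 0-99 to 1900-1999, so those are rejected.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}

console.log(isValidDate("2024-02-29")); // true  (leap year)
console.log(isValidDate("2023-02-29")); // false
console.log(isValidDate("2024-13-01")); // false
```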
Matching a Valid Email Address
let regex = /^(?<username>[a-zA-Z0-9._%+-]+)@(?<domain>[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})$/;
let mail = "user@example.com";
let match = mail.match(regex);
if (match) {
    console.log(match.groups.username);
    console.log(match.groups.domain);
}
Output
user
example.com
The regular expression matches a basic email format and extracts the username and domain using named groups.
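In practice the match is often wrapped in a function that handles the no-match case and returns the parts as a plain object. The sketch below reuses the pattern above; parseEmail is a hypothetical helper, and lowercasing the domain is an illustrative choice rather than a requirement.

```javascript
const emailRegex =
  /^(?<username>[a-zA-Z0-9._%+-]+)@(?<domain>[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})$/;

// Hypothetical helper: returns the parts, or null when the format does not match.
function parseEmail(text) {
  const m = emailRegex.exec(text);
  if (!m) return null;
  // Destructuring reads the named groups directly by name.
  const { username, domain } = m.groups;
  return { username, domain: domain.toLowerCase() };
}

console.log(parseEmail("user@Example.com")); // { username: 'user', domain: 'example.com' }
console.log(parseEmail("not-an-email"));     // null
```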
Extracting Emoji from a String
let regex = /\p{Emoji}/gu;
let s = "Hello 😀, how are you? 😊";
console.log(s.match(regex));
Output
[ '😀', '😊' ]
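The regex uses the Unicode property escape \p{Emoji} to match emoji characters in the string. Note that the Emoji property also covers the digits 0-9, # and *, because they can start keycap sequences, so text containing those characters produces extra matches.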
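One caveat with \p{Emoji} is that the property also includes the ASCII digits, # and *. When the text may contain those characters, \p{Extended_Pictographic} is one possible alternative, as this sketch with a made-up string shows.

```javascript
const msg = "Room 42 😀";

// \p{Emoji} also matches digits, because they can begin keycap sequences.
console.log(msg.match(/\p{Emoji}/gu));                 // [ '4', '2', '😀' ]

// \p{Extended_Pictographic} skips digits and keeps the pictograph.
console.log(msg.match(/\p{Extended_Pictographic}/gu)); // [ '😀' ]
```

Neither property groups multi-code-point emoji (such as flags or family sequences) into a single match; each matched element is one code point.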