Nick Scialli • January 01, 2021 • 🚀 3 minute read
Regular expressions are extremely powerful, but their syntax can be pretty opaque. Today we’ll use a regex in JavaScript to capture all content between two characters.
Example Problem Setup
Let’s say we have the following string:
“Hi there, my name is [name], I am [age] years old, and I work in the field of [profession].”
And we want to end up with the following array:
['name', 'age', 'profession'];
How can we do this?
Using the String .match Method
JavaScript strings have a match method, which takes a string or regular expression as an argument.
const str = 'Hi there, my name is [name], I am [age] years old, and I work in the field of [profession].';
const matches = str.match(/some regex here/);
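Before writing the real pattern, it may help to see what match hands back. The sketch below uses a throwaway pattern (/name/) and a shortened string; the variable names are just for illustration.

```javascript
const greeting = 'Hi there, my name is [name]';

// Without the g flag, match returns details about the first match only.
const firstHit = greeting.match(/name/);
console.log(firstHit[0]); // "name"
console.log(firstHit.index); // 13

// With the g flag, match returns a plain array of every matched substring.
const allHits = greeting.match(/name/g);
console.log(allHits); // ["name", "name"]

// If nothing matches, match returns null rather than an empty array,
// so a fallback avoids errors when the result is used as an array.
const noHits = greeting.match(/age/g) ?? [];
console.log(noHits); // []
```

The null result is easy to forget, which is why the last example falls back to an empty array.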
Capture Between the Brackets, Lazily
We want to capture between the brackets. Our first go at this might include a regex that looks like this: /\[.+?\]/g. If we use this, we get the following:
const str = 'Hi there, my name is [name], I am [age] years old, and I work in the field of [profession].';
const matches = str.match(/\[.+?\]/g);
console.log(matches);
// ["[name]", "[age]", "[profession]"]
So close! But we don’t want the brackets included in our final strings.
Before we work on eliminating them, let’s evaluate what we did in our current regular expression.
The outer part, / /g, says a couple of things: the forward slashes indicate that this is a regex, and the g indicates that it should be a global regex (i.e., we don’t want to stop at the first match).
\[ and \] mean that we want to match the opening and closing brackets. We have to use backslashes to escape them because brackets have another use in regex: they define character classes.
Finally, we have .+?. The . matches any character (other than a line break) and +? means one or more of them, as few as possible, so we capture characters only until we hit the next closing bracket. In other words, +? is lazy. You might also be familiar with plain +, which is greedy and would capture from the first opening bracket all the way up to the last closing bracket!
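To see the difference between lazy and greedy matching, and what the g flag contributes, here is a small sketch on a shortened version of the string. The variable names are only for illustration.

```javascript
const text = 'my name is [name], I am [age] years old';

// Lazy (+?): each match stops at the first closing bracket it can reach.
const lazy = text.match(/\[.+?\]/g);
console.log(lazy); // ["[name]", "[age]"]

// Greedy (+): the match runs on to the last closing bracket in the string.
const greedy = text.match(/\[.+\]/g);
console.log(greedy); // ["[name], I am [age]"]

// Dropping the g flag keeps the lazy behavior but stops after the first match.
const firstOnly = text.match(/\[.+?\]/);
console.log(firstOnly[0]); // "[name]"
```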
Removing the Brackets
To remove the brackets, we can use lookahead and lookbehind operators. Instead of matching the brackets, we look behind for the opening bracket and look ahead for the closing bracket, without actually including either one in our match.
The new regular expression with our lookahead and lookbehind operators is as follows:
const str = 'Hi there, my name is [name], I am [age] years old, and I work in the field of [profession].';
const matches = str.match(/(?<=\[).+?(?=\])/g);
console.log(matches);
// ["name", "age", "profession"]
Again, we look behind for the opening bracket with (?<=\[) and we look ahead for the closing bracket with (?=\]). Neither one consumes any characters, so the brackets stay out of the match.
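If you need the same trick for other pairs of characters, one option is to wrap the pattern in a small helper. This is only a sketch: extractBetween and escapeRegex are hypothetical names, not part of JavaScript, and the helper assumes the delimiters are not nested.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex,
// so a delimiter like "[" or "(" is treated as a literal character.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: return everything found between `open` and `close`,
// using the same lookbehind / lazy / lookahead pattern as above.
function extractBetween(input, open, close) {
  const pattern = new RegExp(
    `(?<=${escapeRegex(open)}).+?(?=${escapeRegex(close)})`,
    'g'
  );
  // match returns null when nothing is found, so fall back to an empty array.
  return input.match(pattern) ?? [];
}

console.log(extractBetween('my name is [name], I am [age]', '[', ']'));
// ["name", "age"]
console.log(extractBetween('roles: {admin} and {editor}', '{', '}'));
// ["admin", "editor"]
console.log(extractBetween('nothing to find here', '[', ']'));
// []
```

Escaping the delimiters matters for the same reason we wrote \[ and \] by hand: without it, a character like [ would be read as regex syntax rather than as a literal bracket.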
Nick Scialli is a software engineer at the U.S. Digital Service.