Regular expression to get a string between two strings where the last string can also be ‘end of string’

Problem

I want to extract a URL search parameter value using a regular expression in plain JavaScript. The parameter can appear at any position in the search query.

Is there a better approach than the regular expressions below?

const query1 = '?someBoolean=false&q=&location=&testParam=dummy_value&testParam2=dummy_value2&requiredParam=requiredValue';

const query2 = '?someBoolean=false&q=&location=&testParam=dummy_value&testParam2=dummy_value2&requiredParam=requiredValue&someMoreParam=dummy_value2';

let requiredParam = query1.match(/requiredParam=(.*?)$/) || [];
console.log('Using regex "/requiredParam=(.*?)$/"');
console.log(`For Query1: result = ${requiredParam[1]}`);

requiredParam = query2.match(/requiredParam=(.*?)&/) || [];
console.log('Using regex "/requiredParam=(.*?)&/"');
console.log(`For Query2: result = ${requiredParam[1]}`);


console.log('Combining both regex "/requiredParam=(.*?)(&|$)/"');
requiredParam = query1.match(/requiredParam=(.*?)(&|$)/) || [];
console.log(`For Query1: result = ${requiredParam[1]}`);
requiredParam = query2.match(/requiredParam=(.*?)(&|$)/) || [];
console.log(`For Query2: result = ${requiredParam[1]}`);

Edit:

My constraints:

  • Browser compatibility including IE 9
  • Using only vanilla javascript

Solution

URLSearchParams would make things significantly easier:

const query1 = '?someBoolean=false&q=&location=&testParam=dummy_value&testParam2=dummy_value2&requiredParam=requiredValue';
const query2 = '?someBoolean=false&q=&location=&testParam=dummy_value&testParam2=dummy_value2&requiredParam=requiredValue&someMoreParam=dummy_value2';

const params1 = new URLSearchParams(query1);
const params2 = new URLSearchParams(query2);

console.log(`For Query1: result = ${params1.get('requiredParam')}`);
console.log(`For Query2: result = ${params2.get('requiredParam')}`);

It’s supported natively in the vast majority of browsers, but not all (Internet Explorer, for one, lacks it). For the rest, a polyfill can fill the gap. It’s better not to re-invent the wheel when you don’t need to, and it’s good when you’re able to use a standard API (with examples, documentation, Stack Overflow answers about it, etc.).
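Given the IE 9 constraint, one option is to feature-detect URLSearchParams and fall back to manual parsing only where it is missing. What follows is a minimal sketch under that assumption, not a substitute for a full polyfill. The helper name getQueryParam is hypothetical, and the fallback returns only the first occurrence of a key.

```javascript
// Hypothetical helper: returns the value of `name` in `query`, or null if absent.
// Written in ES5 syntax (var, no template literals) so it also parses in old IE.
function getQueryParam(query, name) {
  if (typeof URLSearchParams === 'function') {
    return new URLSearchParams(query).get(name);
  }
  // Fallback: strip a leading '?', then scan the key=value pairs by hand.
  var pairs = query.replace(/^\?/, '').split('&');
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    var key = eq === -1 ? pairs[i] : pairs[i].slice(0, eq);
    if (decodeURIComponent(key) === name) {
      var raw = eq === -1 ? '' : pairs[i].slice(eq + 1);
      // '+' encodes a space in query strings.
      return decodeURIComponent(raw.replace(/\+/g, ' '));
    }
  }
  return null;
}

var query = '?someBoolean=false&q=&requiredParam=requiredValue&someMoreParam=dummy_value2';
console.log(getQueryParam(query, 'requiredParam')); // requiredValue
console.log(getQueryParam(query, 'q'));             // (empty string)
console.log(getQueryParam(query, 'missing'));       // null
```

Note that decodeURIComponent throws on malformed percent-encoding, so real input may need a try/catch around the fallback.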

As a side note – when using regular expressions, I’d recommend using capture groups only when necessary. If all you need to do is group some tokens together logically (like for a | alternation), non-capturing groups should be preferred. That is, if URLSearchParams didn’t exist, better to do (?:&|$) than (&|$). Reserve capturing groups for when you need to save and use the captured result somewhere – otherwise, non-capturing groups are more appropriate, less expensive, and require less cognitive overhead.
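To illustrate the difference, here is a small sketch comparing the two group types on a shortened query string (the string is made up for the example). The captured value is the same either way; the capturing version just carries an extra, unused entry in the match array.

```javascript
var query = '?a=1&requiredParam=requiredValue&someMoreParam=dummy_value2';

var capturing = query.match(/requiredParam=(.*?)(&|$)/);
var nonCapturing = query.match(/requiredParam=(.*?)(?:&|$)/);

console.log(capturing.length);    // 3: full match, the value, and the '&' delimiter
console.log(nonCapturing.length); // 2: full match and the value only
console.log(capturing[1] === nonCapturing[1]); // true
```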

If you had to go the regex route, another slight improvement would be to use a negated character class instead of lazy repetition. In the pattern, you have:

(.*?)(&|$)

Lazy repetition is slow; it forces the engine to advance one character at a time, then check the rest of the pattern for a match, and repeat until the match is found. Since you know that the capture group will not contain any &s, better to match anything but &s:

([^&]*)

Once you do that, you don’t even need the final (&|$) or (?:&|$) due to the greedy repetition.
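Put together, the negated-class version might look like the sketch below. The leading [?&] is an addition of mine, not part of the pattern discussed above: it assumes you want to avoid matching a longer key that merely ends in requiredParam. The query strings are shortened versions of the ones in the question.

```javascript
var query1 = '?someBoolean=false&q=&requiredParam=requiredValue';
var query2 = '?someBoolean=false&q=&requiredParam=requiredValue&someMoreParam=dummy_value2';

// [^&]* greedily consumes everything up to the next '&' or the end of the string,
// so no trailing (?:&|$) is needed.
// Assumption: [?&] anchors the key so that e.g. 'xrequiredParam=' is not matched.
var pattern = /[?&]requiredParam=([^&]*)/;

var result1 = (query1.match(pattern) || [])[1];
var result2 = (query2.match(pattern) || [])[1];
console.log(result1); // requiredValue
console.log(result2); // requiredValue
```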

Bug with variable declarations

The keywords const and let are only supported (partially) by IE 11.

I don’t have IE 9 but I do have IE 11 and set the Document mode to IE 9.

[Screenshot: IE emulation mode]

Running the first line in a sandbox on jsBin.com:

const query1 = '?someBoolean=false&q=&location=&testParam=dummy_value&testParam2=dummy_value2&requiredParam=requiredValue';

led to an error in the console:

[Screenshot: IE emulation error]

In order to properly support IE 9 users, use var instead of const and let.
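As a rough sketch of what that looks like for the question's snippet: template literals (the backtick strings) are also unsupported in every version of IE, so the example below swaps them for plain string concatenation as well. The message variable is introduced only for illustration.

```javascript
var query1 = '?someBoolean=false&q=&location=&testParam=dummy_value&testParam2=dummy_value2&requiredParam=requiredValue';

// var instead of const/let, and '+' concatenation instead of a template literal.
var requiredParam = query1.match(/requiredParam=([^&]*)/) || [];
var message = 'For Query1: result = ' + requiredParam[1];
console.log(message); // For Query1: result = requiredValue
```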
