Problem
I came across this question from LeetCode OJ where I am given a string containing parentheses and I need to find out if the string contains valid matching parentheses or not.
Given a string containing just the characters
'('
,')'
,'{'
,'}'
,'['
and']'
, determine if the input string is valid.The brackets must close in the correct order,
"()"
and"()[]{}"
are all valid but"(]"
and"([)]"
are not.
This is the code I wrote (it got accepted and passed all the test cases.)
/** Java program to check if input string contains matching parameters
or not.
* @author : Anish
* @date : 01-11-2017
*/
import java.util.Stack;
class Solution
{
public boolean isValid(String str)
{
Stack<Character> stack = new Stack<Character>();
char top = ' ';
int len = str.length();
for (int i = 0; i < len; i++)
{
char ch = str.charAt(i);
if (ch == '(')
stack.push(')');
else if (ch == '[')
stack.push(']');
else if (ch == '{')
stack.push('}');
else if (!stack.empty())
{
top = stack.peek();
if (top == ch)
stack.pop();
else
stack.push(ch);
}
else if (stack.empty())
{
stack.push(ch);
}
}
if (stack.empty())
{
return true;
}
return false;
}
}
ANALYSIS
Time Complexity : O(n)
Run Time : 9ms (on the leetcode OJ)
ISSUES
- Even though my time is O(n), yet my percentile is 70.4 which effectively means there are better and faster solutions out there. In that case, how do I optimize the above code to make it faster?
- I have been using the Stack API provided by Java to solve the problem. Rather, if I were to redesign my own stack API and then use it to solve the problem, would I get faster runtime and better performance?
- If the above point is true, then does it mean that in general, user designed APIs are faster than what is provided by the Java libraries? (assuming the user designed API is clean and bug free)
Solution
Usage of Stack
Don’t use a Stack
. This class is very old, it’s thread safe (which you don’t need) and achieves it by synchronizing the push
and pop
methods, which will have an impact in performance.
I would use ArrayDeque
. By default it’s initialized with size 16. To reduce number of re-allocations, allocate a larger size, perhaps ou can guess how many elements you’ll need based on the size of the string. For example:
Queue<Character> queue = new ArrayDeque<>(str.size()/2);
Boxing of char
Note that each time you push/pop from the queue, you are boxing and unboxing char (to Character). For ultimate performance, create a char[] queue
, but then you’ll have to manage resizes and pointers to the top element yourself.
Fail fast
Consider that the input is {(])}
. When finding ]
, your program can already know that the string is invalid and stop iterating. This will make the processing of invalid inputs potentially much faster.
So what about replace this:
else if (!stack.empty())
{
top = stack.peek();
if (top == ch)
stack.pop();
else
stack.push(ch);
}
else if (stack.empty())
{
stack.push(ch);
}
}
else if (stack.empty())
{
stack.push(ch);
}
by:
else if (!stack.empty())
{
top = stack.peek();
if (top == ch)
stack.pop();
else
return false; // quit
}
}
else
{
return false; //quit
}
I didn’t test and I might be missing something, but it looks that as soon as you have an char not matching the expected closing parenthesis, you can conclude that it’s invalid.
My suggestion
This has all the previous points applied and runs much faster:
public boolean isValid(String str) {
char[] stack = new char[str.length()];
int p = -1;
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
if (ch == '(') {
stack[++p] = ')';
} else if (ch == '[') {
stack[++p] = ']';
} else if (ch == '{') {
stack[++p] = '}';
} else if (p >= 0) {
char top = stack[p];
if (top == ch) {
p--;
} else {
return false;
}
} else {
return false;
}
}
return p < 0;
}
EDIT: Benchmarks
I’m adding the results of benchmarks done with JMH with some different solutions to show how performance is impacted.
I used one valid input: ([[{{([[{{[[[[]]]]}}]])}}]])
and one invalid input: ([[{{([[{{[[[[]]]]}}]]})}}]])
.
Parameters of the benchmark:
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Benchmark mode: Average time, time/op (nanoseconds)
Results:
Benchmark (input) Mode Cnt Score Error Units
Benchmark.finalSolution valid avgt 5 67.468 ± 1.411 ns/op
Benchmark.finalSolution invalid avgt 5 53.832 ± 1.543 ns/op
Benchmark.toCharArray valid avgt 5 241.692 ± 11.705 ns/op
Benchmark.toCharArray invalid avgt 5 261.955 ± 6.192 ns/op
Benchmark.original valid avgt 5 245.888 ± 3.329 ns/op
Benchmark.original invalid avgt 5 265.483 ± 3.312 ns/op
Benchmark.useArrayDeque valid avgt 5 100.597 ± 1.621 ns/op
Benchmark.useArrayDeque invalid avgt 5 102.695 ± 0.659 ns/op
Benchmark.useLinkedList valid avgt 5 180.679 ± 4.395 ns/op
Benchmark.useLinkedList invalid avgt 5 190.591 ± 4.010 ns/op
One thing that will definitely speed up your code is switching from a Stack<Character>
to a char[]
. Every time you push
and pop
a value from a Stack
, it’s boxed and unboxed to and from its primitive representation, and this is a somewhat expensive operation to do many times in a tight loop. I measured about a 5x speed-up in a pretty unscientific micro-benchmark.
Another smaller optimization you can do is check if the stack grows to over half the length of the input string. If this is ever the case, you know that the string is unbalanced, and you don’t have to keep checking.
Changes
-
Move the early declaration of
top
. Push it into the else-if block, where it’s used, to have it as close as possible to it’s usage. -
Always put braces around your blocks. Java code is compiled, so more braces don’t slow the code down, while preventing bugs.
-
If the stack is empty when you just read a closing paren, the string is unbalanced. The final else-if can be an early-return, which saves you a minuscle factor of time.
-
Instead of using an indexed for-loop and
charAt(i)
it would be cleaner (and more idiomatic) to use a for-each loop:for (char ch : str.toCharArray()) { if (ch == '(') { // ... }
I don’t know exactly how this impacts runtime, but it shouldn’t make a huge difference either way.
-
Instead of
if (condition) return true; else return false;
one can justreturn condition
. This applies for your code too:} return stack.empty(); }
you may use Deque which gives you the best performance always.
Deque<Character> stack = new LinkedList<Character>();
for pushing:
stack.offerFirst(temp);
for popping:
Charcter temp = stack.pollFirst();
All right, so after a lot of discussions and experiments, I finally came up with the code that gave me the best performance. But before I dwell on that, here are some things I would like to sum up (from all the answers and comments above) :
- Avoid wrapper class for primitive data types as the boxing/unboxing affects performance.
- The java.util.Stack library is old and uses synchronisation which again affects performance.
- If we can implement a stack with dynamic array, then it would lead to much better performance.
- Using a linked list for stack implementation may again affect performance.
- Keeping checks in between the code for early return always make the code more efficient and optimised (provided the checks are valid).
- Queue is a better choice than stack.
- LeetCode submissions give different runtimes for the same code.
- For each loop and indexed for loop give the same performance.
- For each loop is just better and easier to use.
- Keep the variables as close to their usage block as possible.
- Placement of braces doesn’t affect performance but may lead to more/less bugs depending on their usage and/or placement.
So, below is the link to the final code I wrote. It gave me 98% performance at best and 40% at worst.
Finally, thank you everyone for the in depth answers and the wealth of knowledge shared. I learned a lot.
P.S. Please point out the mistakes (if any) in my summarisation.