Verifying if a string can be created with the letters of another string

Posted on

Problem

This is basically based on the following question on stackoverflow which was based on a codewars assignment that can be found here

Problem statement

Complete the function scramble(str1, str2) that returns true if a portion of str1 characters can be rearranged to match str2, otherwise returns false.

Notes:

  • Only lower case letters will be used (a-z). No punctuation or digits will be included.
  • Performance needs to be considered

My solution

function scramble(str1, str2) {
  if (str1.length < str2.length) {
    return false;
  }
  const set = new Set([...str2]);
  return Array.from( set ).every( letter => str2.split(letter).length <= str1.split(letter).length );
}

As with the original question on SO, my question would be if there are more efficient ways of completing the assignment. I am mainly worried about the string.split, but I didn’t know of any other method that might be faster (I know I should substract 1 to get the actual letter count, but it doesn’t matter for this usecase)

Solution

Counting with str2.split(letter).length takes linear time on the size of the string, so it’s not efficient to repeat it for each distinct character in the input (even if that still results in a linear overall runtime, strictly speaking, since Unicode is limited in size). Here’s an accurate implementation of the solution Mike Brant described, since your rewrite that creates a map still does the slow count:

const getCounts = iterable => {
    const result = new Map();

    for (const x of iterable) {
        result.set(x, (result.get(x) ?? 0) + 1);
    }

    return result;
};

const scramble = (available, target) => {
    const remaining = getCounts(available);

    for (const c of target) {
        const count = remaining.get(c);

        if (!count) {
            return false;
        }

        remaining.set(c, count - 1);
    }

    return true;
};

An alternative solution with decent performance is to compare sorted versions of the strings:

const scrambleSorted = (available, target) => {
    let i = -1;

    for (const c of target) {
        i = available.indexOf(c, i + 1);

        if (i === -1) {
            return false;
        }
    }

    return true;
};

const scramble = (available, target) =>
    scrambleSorted([...available].sort(), [...target].sort());

(This would be the one with constant auxiliary space if you were allowed to mutate the input, but JavaScript doesn’t have mutable strings.)

Also, note that strings are iterables of codepoints, but string functions like split, which predate iterables, operate on UTF-16 code units. So anything that mixes iteration (e.g. new Set(str), [...s], Array.from(s), for (c of s)) won’t work with split-based counting.


Or, when the number of distinct characters is small and even contiguous, you can use an array:

const scramble = (available, target) => {
    if (target.length > available.length) {
        return false;
    }

    const remaining = Array(26).fill(0);

    for (let i = 0; i < available.length; i++) {
        const c = available.charCodeAt(i) - 97;
        remaining[c]++;
    }

    for (let i = 0; i < target.length; i++) {
        const c = target.charCodeAt(i) - 97;

        if (--remaining[c] === -1) {
            return false;
        }
    }

    return true;
};

Your logic seems hard to follow. Why spread str2 to array, which you then read into a Set, only to turn around and make the Set into an array, which you then iterate, comparing each letter’s count in each string by splitting them repeatedly?

You see how long and convoluted that sentence was? The last line of code there is kind of the same, dense and hard to follow.

The problem should be solved by counting characters for str1 into map keyed by each letter. You then iterate through str2 and decrement those counters. If any counter goes to -1 you have false condition. This approach means each letter in passed string only needs to be visited once making performance linear to the combined length of the strings in operational complexity. Your current solution has exponential complexity.

From a short review;

  • Whenever I deal with search algorithms, I like to use haystack and needle as variable names, they are more evocative then str1 and str2

  • There are no comments, I think you need to comment why you do const set = new Set([...str2]); The rest of the code is obvious once the reader figures that one out

  • For the counting part, there are really 3 strategies;

    • use String.split() Fast, uses memory though
    • use String.match() Fast, uses more memory though
    • use String.indexOf() Somehow slow nowadays, least memory

    So I think your approach is fine. More research here.

This is a counter proposal with the above in mind;

function scramble(haystack, needle) {
  if (haystack.length < needle.length) {
    return false;
  }
  //"Hello World" -> Set('H', 'e', 'l','o','w','r','d')
  const uniqueCharacters = new Set([...needle]);
  return Array.from(uniqueCharacters).every(c => occurrences(needle,c) <= occurrences(haystack, c));
}

function occurrences(haystack, needle){
  return haystack.split(needle).length - 1;
}

You don’t need to convert the string to an array before passing it to a set. The Set constructor can take any iterable as an argument. So, you can directly pass the string to get all the unique characters:

const set = new Set(str2)

In the code, you are converting a string to an array to a set to an array again. Even, inside the every callback you are doing split on both strings. Which means, you need to search through both strings set.size times, create an array and compare their length.

You can write it in such a way that str1 is looped only once and str2 is looped until it fails your condition.

  • You can create a counter function which maps each character with the number of times it occurs in a string. Call the function on str1
  • Loop through the second string and return false if
    • the counter doesn’t have that character.
    • the value becomes negative once you account for the current character.
  • If the loop is completed, the string can be scrambled.
function counter(str) {
  const counter = {};
  for (const c of str)
    counter[c] = counter[c] + 1 || 1;
  return counter
}

function scramble(str1, str2) {
  if (str1.length < str2.length)
    return false;

  const count = counter(str1);

  for (const c of str2) {
    if (!count.hasOwnProperty(c) || --count[c] < 0)
      return false
  }
  
  return true
}

console.log(
  scramble('rkqodlw', 'world'),
  scramble('cedewaraaossoqqyt', 'codewars'),
  scramble('katas', 'steak')
)

Well, the answer by Mike Brant indicated partly what was wrong with my original approach, as in iterating str2 potentially 2 times (if all characters would have been unique).

I still don’t think I have to map each letter by key for str1, in fact I see str1 as pretty much irrelevant, and only for verifying if a letter occurs at least as many times as it does in str2.

So, to make it less compact but only iterate str2 once, I went for the following solution

function scramble(str1, str2) {
  if (str1.length < str2.length) {
    return false;
  }
  const set = new Set();
  let c;
  for (let i = str2.length; i--;) {
    c = str2.charAt(i);
    if (set.has(c)) {
      continue;
    }
    if ( str2.split(c).length > str1.split(c).length) {
      return false;
    }
    set.add(c);
  }
  return true;
}

And if I make a map of str1 first, I went for the following solution

function createMap(str) {
  let result = {};
  let c;
  for (let i = str.length; i--;) {
    c = str.charAt(i);
    result[c] = (result[c] || 0) + 1;
  }
  return result;
}

function scramble(str1, str2) {
  if (str1.length < str2.length) {
    return false;
  }
  const set = new Set();
  const map = createMap(str1);
  let c;
  for (let i = str2.length; i--;) {
    c = str2.charAt(i);
    if (set.has(c)) {
      continue;
    }
    if (!map[c] || str2.split(c).length - 1 > map[c]) {
      return false;
    }
    set.add(c);
  }
  return true;
}

Both solutions are in the same ballpark when it comes to performance

Leave a Reply

Your email address will not be published. Required fields are marked *