Find common characters (LeetCode)

Question 1

The task is taken from leetcode

Given an array A of strings made only from lowercase letters, return a list of all characters that show up in all strings within the list (including duplicates). For example, if a character occurs 3 times in all strings but not 4 times, you need to include that character three times in the final answer.

You may return the answer in any order.

Example 1:

Input: ["bella","label","roller"]

Output: ["e","l","l"]

Example 2:

Input: ["cool","lock","cook"]

Output: ["c","o"]

Note:

1 <= A.length <= 100

1 <= A[i].length <= 100

A[i][j] is a lowercase letter

My imperative solution:

function commonChars(A) {
 const [first, ...rest] = A.sort((a,b) => -(a.length - b.length));
 const duplicates = [];
 [...first].forEach(e => {
 let isDuplicate = true;
 for (let x = 0, len = rest.length; x < len; x++) {
 let letters = rest[x];
 const i = letters.search(e);
 if (i !== -1) {
 letters = letters.slice(0, i) + letters.slice(i + 1);
 rest[x] = letters;
 } else {
 isDuplicate = false;
 }
 }
 if (isDuplicate) {
 duplicates.push(e);
 }
 });
 return duplicates;
}
const arr = ["cool","lock","cook"];
console.log(commonChars(arr));

My declarative solution

const commonChars2 = A => {
 const [first, ...rest] = A.sort((a,b) => b.length - a.length));
 return [...first].reduce((duplicates, e) => {
 let isDuplicate = true;
 for (let x = 0, len = rest.length; x < len; x++) {
 const letters = rest[x];
 const i = letters.search(e);
 if (i !== -1) {
 rest[x] = letters.slice(0, i) + letters.slice(i + 1);
 } else {
 isDuplicate = false;
 }
 }
 if (isDuplicate) { duplicates.push(e); }
 return duplicates;
 }, []);
};
const arr = ["cool","lock","cook"];
console.log(commonChars2(arr));

Is there also a good functional approach?

Question 2

Opportunities missed

You have essentially the same algorithm in both examples. When I first looked at the solutions the sort at the start was a worry. The logic is sound, to work from the smallest word as there will be no more characters to test then is contained in that word.

However you have the sort AoT and the result is you set first to be the longest word in the array.

Changing the sort to be the other direction works great improving the performance near 3-4 times if there is a much shorter word in the array. However there is no real benefit with the sort if all the words are of similar length.

Another missed optimization is breaking out of the inner loop when you set isDuplicates to false. Adding a break saves having to do any further iteration and gives another 10% performance. BTW the name duplicates would have just as much meaning.

String.search(arg) and `String.indexOf(arg)`

The real bottle neck is in the inner loop where you locate the current character e in the current word. letters.search(e)

It may look innocent but this is where knowing the language well come into play. String.search(arg) takes as the first argument a regExp but also works with just a string. The catch is that the string is first converted to a regExp before the search starts. This is a huge overhead.

If you use String.indexOf instead your code is almost twice as fast. I tried to write something that could beat it using Maps to reduce time complexity, and as such only starts to outperform when the word lengths get much longer 50+ characters.

Rewrite

So for small words your solution with slight modification is very fast.

Sort now from smallest to largest, Breaks from inner loop when no duplicates found.
Uses faster String.indexOf to do inner character search.

And some name changes..

function commonCharsNo(words) {
 const [first, ...rest] = words.sort((a, b) => (a.length - b.length));
 const duplicates = [];
 [...first].forEach(e => {
 let duplicate = true;
 for (let x = 0, len = rest.length; x < len; x++) {
 let word = rest[x];
 const i = word.indexOf(e);
 if (i !== -1) {
 rest[x] = word.slice(0, i) + word.slice(i + 1);
 } else {
 duplicate = false;
 break;
 }
 }
 if (duplicate) {
 duplicates.push(e);
 }
 });
 return duplicates;
}

Reduce complexity

The search (now indexOf) is where the code gets poor complexity \$O(m)\$. Its \$O(n.m)\$ where \$n\$ is the number of characters in the smallest word and \$m\$ is the mean number of characters per word.

You can reduce this to approx \$O(w.m)\$ where \$w\$ is the number of words and \$m\$ is the mean number of characters per word.

The improvement is small and only pays out for longer words.

function findCommonChars(words) {
 var chars;
 const result = [];
 words = [...words.sort((a,b)=>b.length - a.length)];
 const countChars = (word, res = new Map()) => {
 for (const char of word) {
 res.has(char) && (!chars || chars.has(char)) ? 
 res.get(char)[0]++ : res.set(char, [1]);
 }
 return res;
 }
 chars = countChars(words.pop());
 const charCounts = words.map(word => countChars(word));
 for (let [char, count] of chars.entries()) {
 for (const word of charCounts) {
 if (word.has(char)) { count = Math.min(count, word.get(char)[0]) }
 else {
 count = 0;
 break;
 }
 }
 while (count--) { result.push(char) }
 } 
 return result;
}

Question 3

Why sort the words at all? Why not just pick any of shortest as the pivot?

Question 4

@janos It reduces storage complexity and when there is a short word in the set, improves performance by a significant amount, as the pivot removes the overhead of mapping character not needed. If performance were critical then it would have to test at start hte best method to use, the OP's, the one I gave, and another variant more suited to high word counts

Blindman67 Blindman67 22.9k2 gold badges16 silver badges40 bronze badges · Accepted Answer · 2019-05-13 04:23:51Z

Opportunities missed

You have essentially the same algorithm in both examples. When I first looked at the solutions the sort at the start was a worry. The logic is sound, to work from the smallest word as there will be no more characters to test then is contained in that word.

However you have the sort AoT and the result is you set first to be the longest word in the array.

Changing the sort to be the other direction works great improving the performance near 3-4 times if there is a much shorter word in the array. However there is no real benefit with the sort if all the words are of similar length.

Another missed optimization is breaking out of the inner loop when you set isDuplicates to false. Adding a break saves having to do any further iteration and gives another 10% performance. BTW the name duplicates would have just as much meaning.

String.search(arg) and `String.indexOf(arg)`

The real bottle neck is in the inner loop where you locate the current character e in the current word. letters.search(e)

It may look innocent but this is where knowing the language well come into play. String.search(arg) takes as the first argument a regExp but also works with just a string. The catch is that the string is first converted to a regExp before the search starts. This is a huge overhead.

If you use String.indexOf instead your code is almost twice as fast. I tried to write something that could beat it using Maps to reduce time complexity, and as such only starts to outperform when the word lengths get much longer 50+ characters.

Rewrite

So for small words your solution with slight modification is very fast.

Sort now from smallest to largest, Breaks from inner loop when no duplicates found.
Uses faster String.indexOf to do inner character search.

And some name changes..

function commonCharsNo(words) {
 const [first, ...rest] = words.sort((a, b) => (a.length - b.length));
 const duplicates = [];
 [...first].forEach(e => {
 let duplicate = true;
 for (let x = 0, len = rest.length; x < len; x++) {
 let word = rest[x];
 const i = word.indexOf(e);
 if (i !== -1) {
 rest[x] = word.slice(0, i) + word.slice(i + 1);
 } else {
 duplicate = false;
 break;
 }
 }
 if (duplicate) {
 duplicates.push(e);
 }
 });
 return duplicates;
}

Reduce complexity

The search (now indexOf) is where the code gets poor complexity \$O(m)\$. Its \$O(n.m)\$ where \$n\$ is the number of characters in the smallest word and \$m\$ is the mean number of characters per word.

You can reduce this to approx \$O(w.m)\$ where \$w\$ is the number of words and \$m\$ is the mean number of characters per word.

The improvement is small and only pays out for longer words.

function findCommonChars(words) {
 var chars;
 const result = [];
 words = [...words.sort((a,b)=>b.length - a.length)];
 const countChars = (word, res = new Map()) => {
 for (const char of word) {
 res.has(char) && (!chars || chars.has(char)) ? 
 res.get(char)[0]++ : res.set(char, [1]);
 }
 return res;
 }
 chars = countChars(words.pop());
 const charCounts = words.map(word => countChars(word));
 for (let [char, count] of chars.entries()) {
 for (const word of charCounts) {
 if (word.has(char)) { count = Math.min(count, word.get(char)[0]) }
 else {
 count = 0;
 break;
 }
 }
 while (count--) { result.push(char) }
 } 
 return result;
}

Why sort the words at all? Why not just pick any of shortest as the pivot?
@janos It reduces storage complexity and when there is a short word in the set, improves performance by a significant amount, as the pivot removes the overhead of mapping character not needed. If performance were critical then it would have to test at start hte best method to use, the OP's, the one I gave, and another variant more suited to high word counts

Stack Exchange Network

Find common characters (LeetCode)

1 Answer 1

Opportunities missed

String.search(arg) and `String.indexOf(arg)`

Rewrite

Reduce complexity

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Find common characters (LeetCode)

1 Answer 1

Opportunities missed

String.search(arg) and String.indexOf(arg)

Rewrite

Reduce complexity

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

String.search(arg) and `String.indexOf(arg)`