Remove adjacent duplicate characters

Question 1

Introduction

This problem is essentially equivalent to this HackerRank problem.

The task is to remove all duplicate characters that are adjacent to each other in a given String of lowercase ASCII characters in range ['a','z'], until no adjacent characters are the same. If a String only contains adjacent duplicate characters, then return an empty String.

Examples:

baab => bb => "" (return an empty String)
aabcd => bcd

Implementation

The core implementation logic uses Stacks to do most of the work.

Translate the input String to a Stack made up of Characters
Create a new Stack that will contain the non-duplicate Characters which will be outputted
pop through the input Stack
- If the output Stack is empty, add the character
- If the first element of the output Stack is the same Character as the popped Character then
  - pop the output Stack because this indicates that the two adjacent Characters are duplicate
- Else
  - push the popped input Stack element to the output Stack

Feedback

I'm looking for general feedback as I'm still pretty green to Java. So if I have some serious design pattern issues, or if I'm missing something very obvious, I'd love to get feedback around that.

A couple thoughts on my mind:

Initially I tried implementing logic that iterated through Lists which worked but isn't as efficient as this implementation. One thing that I do like about this implementation is that it identifies the duplicate characters in one iteration, which you could achieve through Lists, I think, but the logic is more intuitive when using Stacks (I think).
Naming
- I think the name AdjacentDuplicateCharactersRemover is relatively appropriate, but what are good names for the underlying methods? I struggled with this for a while because they are both doing similar things (removing / filtering characters).

Actual Implementation

public class AdjacentDuplicateCharactersRemover {
 public static String removeAdjacentDuplicateCharacters(final String candidate) {
 if (candidate.length() < 2) {
 return candidate;
 }
 final Stack<Character> originalCharacters = new Stack<>();
 for (char c : candidate.toCharArray()) {
 originalCharacters.push(c);
 }
 final Stack<Character> filteredCharacters = AdjacentDuplicateCharactersRemover.filterAdjacentDuplicateCharacters(originalCharacters);
 final StringBuilder stringBuilder = new StringBuilder();
 while (!filteredCharacters.empty()) {
 stringBuilder.append(filteredCharacters.pop());
 }
 return stringBuilder.toString();
 }
 public static Stack<Character> filterAdjacentDuplicateCharacters(final Stack<Character> characters) {
 if (characters.size() < 2) {
 return characters;
 }
 final Stack<Character> filteredCharacters = new Stack<>();
 while (!characters.empty()) {
 final char poppedChar = characters.pop();
 if (filteredCharacters.empty() || poppedChar != filteredCharacters.peek()) {
 filteredCharacters.push(poppedChar);
 } else {
 filteredCharacters.pop();
 }
 }
 return filteredCharacters;
 }
}

Question 2

The naming of the class is OK, I think. It could be better named AdjacentSameCharacterPairsRemover. "Same" is shorter than "duplicate". "Pairs" corrects a misunderstanding that I initially had, which was that "aaa" → "" instead of "aaa" → "a". The name still wouldn't convey the idea that the elimination is applied repeatedly, but there's only so much information that you can fit into a name.

I don't think that the function name needs to be that long, if the class name is already descriptive. The parameter name, candidate, is peculiar. What office is that string running for, or what validation are you performing on it?

I don't recommend marking parameters and local variables as final — it just adds visual clutter. Functions should be short enough that you can easily keep track of all the variables in your head simultaneously.

You don't need the special case for if (candidate.length() < 2); the algorithm works just fine without it. The added complexity isn't worth the potential performance benefit.

I would work directly with the char[] and dispense with the Stack and StringBuilder. By doing so, you would eliminate the need for extra storage, the boxing of chars as Characters, the synchronized behaviour of java.util.Stack, and the semi-deprecated status of java.util.Stack. You also would't need to push all of the original characters onto the stack and then filter it, since the character array is the input. You wouldn't need to translate the Stack back to a StringBuilder to make the String. The code would be simplified so much that you could eliminate the helper function altogether.

public class AdjacentSameCharacterPairsRemover {
 // Suppress the default constructor
 private AdjacentSameCharacterPairsRemover() {}
 public static String process(String s) {
 char[] chars = s.toCharArray();
 int len = 0; // Length of result (initially empty)
 for (int i = 0; i < chars.length; i++) {
 if (len == 0 || chars[i] != chars[len - 1]) {
 chars[len++] = chars[i]; // Push one char onto "stack"
 } else {
 len--; // Pop one char from "stack"
 }
 }
 return new String(chars, 0, len);
 }
}

Question 3

What's the reason behind suppressing the default constructor? Is the idea here that if there's nothing to construct then don't allow anybody the ability to do so?

Question 4

Less clutter in the generated bytecode. Better indication of how the class is meant to be used. Harmless either way. Mainly a matter of taste.

Question 5

Problem Background

Essentially this problem is asking you to remove all palindromes of even length from the string as they are first encountered. So abccbaab becomes ab and NOT abcc even though both abccba and baab are even-length palindromes.

Naming

Unfortunately, I cannot think of a succinct name for what operation this function actually does. You could go with removeFirstEncounteredEvenLengthPalindromes but that is a bit much. I would just stick with removeEvenPalindromes and document the function with some commented examples.

Method

The good news is that you do have the right solution by using a stack. The bad news is that you are using some unnecessary intermediary data types for processing. Note that you do not need to put the data into an actual stack. Instead you can just use the output string as the stack. By eliminating these intermediary data types and processing steps, the code is much more readable.

Here is a char array solution:

public static String removeEvenPalindromes(final String input)
{
 char[] inputArray = input.toCharArray();
 int n = inputArray.length;
 int prevIndex = -1;
 for (int i = 0; i < n; i++)
 {
 if (prevIndex >= 0 && inputArray[i] == inputArray[prevIndex])
 {
 // pop a character from the stack
 prevIndex--;
 }
 else
 {
 // push a character onto the stack
 prevIndex++;
 inputArray[prevIndex] = inputArray[i];
 }
 }
 return new String(inputArray, 0, prevIndex+1);
}

And here is a StringBuilder solution:

public static String removeEvenPalindromes(final String input)
{
 StringBuilder builder = new StringBuilder();
 int inputLength = input.length();
 for (int i = 0; i < inputLength; i++)
 {
 char inputChar = input.charAt(i);
 int prevIndex = builder.length()-1;
 if (prevIndex >= 0 && inputChar == builder.charAt(prevIndex))
 {
 // pop a character from the stack
 builder.deleteCharAt(prevIndex);
 }
 else
 {
 // push a character onto the stack
 builder.append(inputChar);
 }
 }
 return builder.toString();
}

Question 6

This may be just me, but your function names seem a bit long. Generally long names are fine as long as they are absolutely needed to describe their functionality, but in this case you can probably knock off "Character" from all of those names and reduce "Adjacent" to just "Adj".

public class AdjDuplicatesRemover
public static String removeAdjDuplicates(final String candidate)
public static Stack<Character> filterAdjDuplicates(final Stack<Character> characters)

Question 7

maybe I'm not reading this correctly but how does this deal with the case like baab which should return an empty Stringbecause baab becomes bb which becomes "".

Question 8

Multiple return statements

Avoid multiple return statements in a method. It hinders you to apply refactorings like "extract method".

Parameter pass-through

Do not pass an object reference in and return it again if you in other cases create a new object to return.

The problem here is that the caller is not aware of this assumption. This can be a pitfall if the caller want to work with the result AND the parameter because in one case they will differ in object identity (size>= 2) on the other case it's the same object (==) (size <2).

Do not work on parameters

This belongs to the category of defensive programming. The caller may not be aware of that you change the passed in complex parameter "Stack" AND return an alternate object. A method should have only one expected type of outcome. The "final" key word does not hinder you to change contents of a stack.

If your parameter is a String then you have no side effects because of immutablitiy of String objects. But if you use "final" (as you do) you cannot reuse the reference.

Move to object scope

Although you have no object "state" you should consider this for future extensions.

Object scope enables you to test selective parts of your algorithm. With static methods you are only able to test the whole call stack.

Of course, if you want to test static methods you may mock descending static methods with PowerMockito.

Extract methods

To adress the SRP try to identify the responsibilities and extract them in separate methods.

Hard borders (pedantic)

This something I call "Using hard borders". This is because you need more assumptions than necessary.

Use "x <= 1" instead of "x < 2" because you introduced a hidden assumption: This work great with the Integer-Type.

The Code

public class AdjacentDuplicateCharactersRemover {
 public String removeAdjacentDuplicateCharacters(final String candidate) {
 String result = null;
 final Stack<Character> originalCharacters = createStackFrom(candidate);
 if (candidate.length() <= 1) {
 result = candidate;
 } else {
 final Stack<Character> filteredCharacters = filterAdjacentDuplicateCharacters(originalCharacters);
 result = toString(filteredCharacters);
 }
 return result;
 }
 private String toString(final Stack<Character> filteredCharacters) {
 final StringBuilder stringBuilder = new StringBuilder();
 while (!filteredCharacters.empty()) {
 stringBuilder.append(filteredCharacters.pop());
 }
 return stringBuilder.toString();
 }
 private Stack<Character> createStackFrom(final String candidate) {
 final Stack<Character> originalCharacters = new Stack<>();
 for (char c : candidate.toCharArray()) {
 originalCharacters.push(c);
 }
 return originalCharacters;
 }
 public Stack<Character> filterAdjacentDuplicateCharacters(final Stack<Character> characters) {
 final Stack<Character> filteredCharacters = new Stack<>();
 if (characters.size() <= 1) {
 filteredCharacters.addAll(characters);
 } else {
 filteredCharacters.addAll(filterDuplicates(characters));
 }
 return filteredCharacters;
 }
 private Stack<Character> filterDuplicates(final Stack<Character> characters) {
 final Stack<Character> workingCopy = createWorkingCopy(characters);
 Stack<Character> filteredCharacters = new Stack<>();
 while (!workingCopy.empty()) {
 final char poppedChar = workingCopy.pop();
 if (filteredCharacters.empty() || poppedChar != filteredCharacters.peek()) {
 filteredCharacters.push(poppedChar);
 } else {
 filteredCharacters.pop();
 }
 }
 return filteredCharacters;
 }
 private Stack<Character> createWorkingCopy(final Stack<Character> characters) {
 final Stack<Character> workingCopy = new Stack<>();
 workingCopy.addAll(characters);
 return workingCopy;
 }
}

200_success 200_success 146k22 gold badges190 silver badges479 bronze badges · Accepted Answer · 2016-07-03 05:18:24Z

The naming of the class is OK, I think. It could be better named AdjacentSameCharacterPairsRemover. "Same" is shorter than "duplicate". "Pairs" corrects a misunderstanding that I initially had, which was that "aaa" → "" instead of "aaa" → "a". The name still wouldn't convey the idea that the elimination is applied repeatedly, but there's only so much information that you can fit into a name.

I don't think that the function name needs to be that long, if the class name is already descriptive. The parameter name, candidate, is peculiar. What office is that string running for, or what validation are you performing on it?

I don't recommend marking parameters and local variables as final — it just adds visual clutter. Functions should be short enough that you can easily keep track of all the variables in your head simultaneously.

You don't need the special case for if (candidate.length() < 2); the algorithm works just fine without it. The added complexity isn't worth the potential performance benefit.

I would work directly with the char[] and dispense with the Stack and StringBuilder. By doing so, you would eliminate the need for extra storage, the boxing of chars as Characters, the synchronized behaviour of java.util.Stack, and the semi-deprecated status of java.util.Stack. You also would't need to push all of the original characters onto the stack and then filter it, since the character array is the input. You wouldn't need to translate the Stack back to a StringBuilder to make the String. The code would be simplified so much that you could eliminate the helper function altogether.

public class AdjacentSameCharacterPairsRemover {
 // Suppress the default constructor
 private AdjacentSameCharacterPairsRemover() {}
 public static String process(String s) {
 char[] chars = s.toCharArray();
 int len = 0; // Length of result (initially empty)
 for (int i = 0; i < chars.length; i++) {
 if (len == 0 || chars[i] != chars[len - 1]) {
 chars[len++] = chars[i]; // Push one char onto "stack"
 } else {
 len--; // Pop one char from "stack"
 }
 }
 return new String(chars, 0, len);
 }
}

What's the reason behind suppressing the default constructor? Is the idea here that if there's nothing to construct then don't allow anybody the ability to do so?
Less clutter in the generated bytecode. Better indication of how the class is meant to be used. Harmless either way. Mainly a matter of taste.

Stack Exchange Network

Remove adjacent duplicate characters

Introduction

Implementation

Feedback

Actual Implementation

4 Answers 4

Problem Background

Naming

Method

Multiple return statements

Parameter pass-through

Do not work on parameters

Move to object scope

Extract methods

Hard borders (pedantic)

The Code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Remove adjacent duplicate characters

Introduction

Implementation

Feedback

Actual Implementation

4 Answers 4

Problem Background

Naming

Method

Multiple return statements

Parameter pass-through

Do not work on parameters

Move to object scope

Extract methods

Hard borders (pedantic)

The Code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions