Return to Answer

replaced http://stackoverflow.com/ with https://stackoverflow.com/

edited May 23, 2017 at 12:41

My first impression:

The implementation looks long and complicated for something that sounds simple
You've provided some example cases to verify correctness, which is a very good thing

My first reaction is to embody the example test cases in proper unit tests, and then replace the implementation with a different approach, and make the tests I broke work again.

The unit tests, straightforward from your examples:

@Test
public void testCamelCased() {
 assertEquals(Arrays.asList("I", "Some", "Camel", "Cased", "String"), split("ISomeCamelCasedString"));
}
@Test
public void testSnakeCased() {
 assertEquals(Arrays.asList("UNDERSCORED", "STRING"), split("UNDERSCORED_STRING"));
}
@Test
public void testMixed() {
 assertEquals(Arrays.asList("camel", "Cased", "and", "UNDERSCORED"), split("camelCased_and_UNDERSCORED"));
}

Next, I was thinking that this can be simplified by splitting on some clever regex. I'm no regex wiz, but there are a lot of those on Stack Overflow, and I found a suitable thread about splitting camelCase to its words, here here. Adapting it to also split on underscores was relatively easy, though probably not perfect, resulting in this:

private static final String RE_CAMELCASE_OR_UNDERSCORE =
 "(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])|_";
public static List<String> split(String string) {
 List<String> words = new ArrayList<String>();
 for (String word : string.split(RE_CAMELCASE_OR_UNDERSCORE)) {
 if (!word.isEmpty()) {
 words.add(word);
 }
 }
 return words;
}

This is probably not perfect, but a lot shorter than the original, and easier to understand how it works. If somebody can figure out how to get RE_CAMELCASE_OR_UNDERSCORE so that it doesn't produce empty elements, then the method can be shortened to simply:

return Arrays.asList(string.split(RE_CAMELCASE_OR_UNDERSCORE));

PS: this_is_usually_called_snake_cased, not "underscored".

My first impression:

The implementation looks long and complicated for something that sounds simple
You've provided some example cases to verify correctness, which is a very good thing

My first reaction is to embody the example test cases in proper unit tests, and then replace the implementation with a different approach, and make the tests I broke work again.

The unit tests, straightforward from your examples:

@Test
public void testCamelCased() {
 assertEquals(Arrays.asList("I", "Some", "Camel", "Cased", "String"), split("ISomeCamelCasedString"));
}
@Test
public void testSnakeCased() {
 assertEquals(Arrays.asList("UNDERSCORED", "STRING"), split("UNDERSCORED_STRING"));
}
@Test
public void testMixed() {
 assertEquals(Arrays.asList("camel", "Cased", "and", "UNDERSCORED"), split("camelCased_and_UNDERSCORED"));
}

Next, I was thinking that this can be simplified by splitting on some clever regex. I'm no regex wiz, but there are a lot of those on Stack Overflow, and I found a suitable thread about splitting camelCase to its words, here. Adapting it to also split on underscores was relatively easy, though probably not perfect, resulting in this:

private static final String RE_CAMELCASE_OR_UNDERSCORE =
 "(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])|_";
public static List<String> split(String string) {
 List<String> words = new ArrayList<String>();
 for (String word : string.split(RE_CAMELCASE_OR_UNDERSCORE)) {
 if (!word.isEmpty()) {
 words.add(word);
 }
 }
 return words;
}

return Arrays.asList(string.split(RE_CAMELCASE_OR_UNDERSCORE));

PS: this_is_usually_called_snake_cased, not "underscored".

My first impression:

The implementation looks long and complicated for something that sounds simple
You've provided some example cases to verify correctness, which is a very good thing

My first reaction is to embody the example test cases in proper unit tests, and then replace the implementation with a different approach, and make the tests I broke work again.

The unit tests, straightforward from your examples:

@Test
public void testCamelCased() {
 assertEquals(Arrays.asList("I", "Some", "Camel", "Cased", "String"), split("ISomeCamelCasedString"));
}
@Test
public void testSnakeCased() {
 assertEquals(Arrays.asList("UNDERSCORED", "STRING"), split("UNDERSCORED_STRING"));
}
@Test
public void testMixed() {
 assertEquals(Arrays.asList("camel", "Cased", "and", "UNDERSCORED"), split("camelCased_and_UNDERSCORED"));
}

Next, I was thinking that this can be simplified by splitting on some clever regex. I'm no regex wiz, but there are a lot of those on Stack Overflow, and I found a suitable thread about splitting camelCase to its words, here. Adapting it to also split on underscores was relatively easy, though probably not perfect, resulting in this:

private static final String RE_CAMELCASE_OR_UNDERSCORE =
 "(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])|_";
public static List<String> split(String string) {
 List<String> words = new ArrayList<String>();
 for (String word : string.split(RE_CAMELCASE_OR_UNDERSCORE)) {
 if (!word.isEmpty()) {
 words.add(word);
 }
 }
 return words;
}

return Arrays.asList(string.split(RE_CAMELCASE_OR_UNDERSCORE));

PS: this_is_usually_called_snake_cased, not "underscored".

added 14 characters in body

Source Link

edited Sep 13, 2014 at 21:24

janos

edited Sep 13, 2014 at 21:24

janos

112.9k
15
154
396

My first impression:

The implementation looks long and complicated for something that seemssounds simple
We haveYou've provided some example cases to verify correctness, which is a very good thing

My first reaction is to embody the example test cases in proper unit tests, and then replace the implementation with a different approach, and make the tests I broke work again.

The unit tests, straightforward from your examples:

@Test
public void testCamelCased() {
 assertEquals(Arrays.asList("I", "Some", "Camel", "Cased", "String"), split("ISomeCamelCasedString"));
}
@Test
public void testSnakeCased() {
 assertEquals(Arrays.asList("UNDERSCORED", "STRING"), split("UNDERSCORED_STRING"));
}
@Test
public void testMixed() {
 assertEquals(Arrays.asList("camel", "Cased", "and", "UNDERSCORED"), split("camelCased_and_UNDERSCORED"));
}

Next, I was thinking that this can be simplified by splitting on some clever regex. I'm no regex wiz, but there are a lot of those on Stack Overflow, and I found a suitable thread about splitting camelCase to its words, here. Adapting it to also split on underscores was relatively easy, if perhapsthough probably not perfect, resulting in this:

private static final String RE_CAMELCASE_OR_UNDERSCORE =
 "(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])|_";
public static List<String> split(String string) {
 List<String> words = new ArrayList<String>();
 for (String word : string.split(RE_CAMELCASE_OR_UNDERSCORE)) {
 if (!word.isEmpty()) {
 words.add(word);
 }
 }
 return words;
}

return Arrays.asList(string.split(RE_CAMELCASE_OR_UNDERSCORE));

PS: this_is_usually_called_snake_cased, not "underscored".

My first impression:

The implementation looks long and complicated for something that seems simple
We have some example cases to verify correctness, which is a very good thing

My first reaction is to embody the example test cases in proper unit tests, and then replace the implementation with a different approach, and make the tests I broke work again.

The unit tests, straightforward from your examples:

@Test
public void testCamelCased() {
 assertEquals(Arrays.asList("I", "Some", "Camel", "Cased", "String"), split("ISomeCamelCasedString"));
}
@Test
public void testSnakeCased() {
 assertEquals(Arrays.asList("UNDERSCORED", "STRING"), split("UNDERSCORED_STRING"));
}
@Test
public void testMixed() {
 assertEquals(Arrays.asList("camel", "Cased", "and", "UNDERSCORED"), split("camelCased_and_UNDERSCORED"));
}

Next, I was thinking that this can be simplified by splitting on some clever regex. I'm no regex wiz, but there are a lot of those on Stack Overflow, and I found a suitable thread about splitting camelCase to its words, here. Adapting it to also split on underscores was relatively easy, if perhaps not perfect, resulting in this:

private static final String RE_CAMELCASE_OR_UNDERSCORE =
 "(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])|_";
public static List<String> split(String string) {
 List<String> words = new ArrayList<String>();
 for (String word : string.split(RE_CAMELCASE_OR_UNDERSCORE)) {
 if (!word.isEmpty()) {
 words.add(word);
 }
 }
 return words;
}

return Arrays.asList(string.split(RE_CAMELCASE_OR_UNDERSCORE));

PS: this_is_usually_called_snake_cased, not "underscored".

My first impression:

The implementation looks long and complicated for something that sounds simple
You've provided some example cases to verify correctness, which is a very good thing

My first reaction is to embody the example test cases in proper unit tests, and then replace the implementation with a different approach, and make the tests I broke work again.

The unit tests, straightforward from your examples:

@Test
public void testCamelCased() {
 assertEquals(Arrays.asList("I", "Some", "Camel", "Cased", "String"), split("ISomeCamelCasedString"));
}
@Test
public void testSnakeCased() {
 assertEquals(Arrays.asList("UNDERSCORED", "STRING"), split("UNDERSCORED_STRING"));
}
@Test
public void testMixed() {
 assertEquals(Arrays.asList("camel", "Cased", "and", "UNDERSCORED"), split("camelCased_and_UNDERSCORED"));
}

Next, I was thinking that this can be simplified by splitting on some clever regex. I'm no regex wiz, but there are a lot of those on Stack Overflow, and I found a suitable thread about splitting camelCase to its words, here. Adapting it to also split on underscores was relatively easy, though probably not perfect, resulting in this:

private static final String RE_CAMELCASE_OR_UNDERSCORE =
 "(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])|_";
public static List<String> split(String string) {
 List<String> words = new ArrayList<String>();
 for (String word : string.split(RE_CAMELCASE_OR_UNDERSCORE)) {
 if (!word.isEmpty()) {
 words.add(word);
 }
 }
 return words;
}

return Arrays.asList(string.split(RE_CAMELCASE_OR_UNDERSCORE));

PS: this_is_usually_called_snake_cased, not "underscored".

Source Link

answered Sep 10, 2014 at 12:35

janos

answered Sep 10, 2014 at 12:35

janos

112.9k
15
154
396

My first impression:

The implementation looks long and complicated for something that seems simple
We have some example cases to verify correctness, which is a very good thing

My first reaction is to embody the example test cases in proper unit tests, and then replace the implementation with a different approach, and make the tests I broke work again.

The unit tests, straightforward from your examples:

@Test
public void testCamelCased() {
 assertEquals(Arrays.asList("I", "Some", "Camel", "Cased", "String"), split("ISomeCamelCasedString"));
}
@Test
public void testSnakeCased() {
 assertEquals(Arrays.asList("UNDERSCORED", "STRING"), split("UNDERSCORED_STRING"));
}
@Test
public void testMixed() {
 assertEquals(Arrays.asList("camel", "Cased", "and", "UNDERSCORED"), split("camelCased_and_UNDERSCORED"));
}

Next, I was thinking that this can be simplified by splitting on some clever regex. I'm no regex wiz, but there are a lot of those on Stack Overflow, and I found a suitable thread about splitting camelCase to its words, here. Adapting it to also split on underscores was relatively easy, if perhaps not perfect, resulting in this:

private static final String RE_CAMELCASE_OR_UNDERSCORE =
 "(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])|_";
public static List<String> split(String string) {
 List<String> words = new ArrayList<String>();
 for (String word : string.split(RE_CAMELCASE_OR_UNDERSCORE)) {
 if (!word.isEmpty()) {
 words.add(word);
 }
 }
 return words;
}

return Arrays.asList(string.split(RE_CAMELCASE_OR_UNDERSCORE));

PS: this_is_usually_called_snake_cased, not "underscored".

lang-java