JavaScript program to count lines of Java code

Question 1

I've implemented a small program in JavaScript to count the number of lines of a string of java source code.

This was done for one of the Code Kata exercise: http://codekata.com/kata/kata13-counting-code-lines/

Overall, the goal of this exercise is summarised as follow:

The mixture of line-based things (single line comments, blank lines, and so on) with the stream-based block comments can make solutions slightly ugly. While coding your solution, consider the structure of your code, and see how well it fits the structure of the problem. As with most of these kata, consider coding multiple alternative implementations. Does what you learned on the first tries affect your approach to subsequent ones?

To be honest, I've submitted this for an interview and was told that this "lacked fundamentals by allowing errors in the code, there were issues in code construct and there were issues in choice of techniques".

Here's my simple app's source (also available in CodePen):

/**
 * Return the number of lines of code of a Java source code.
 * {@link http://codekata.com/kata/kata13-counting-code-lines/}
 * @param {string} Java source code
 * @return {number}
 */
function getJavaLineCount(source) {
 var count = 0;
 var patterns = {
 spaces: /^\s+/,
 singleLineComment: /^\/\/.*/,
 multiLineComment: /^\/\*((?!\*\/)[\s\S])*\*\//,
 code: /^.*\s*/
 };
 var readToken = function () {
 var m, type;
 if (m = source.match(patterns.spaces)) {
 type = 'spaces';
 } else if (m = source.match(patterns.singleLineComment)) {
 type = 'singleLineComment';
 } else if (m = source.match(patterns.multiLineComment)) {
 type = 'multiLineComment';
 } else if (m = source.match(patterns.code)) {
 type = 'code';
 } else {
 throw new Error("Unknown code pattern!");
 }
 var token = m[0];
 source = source.substr(token.length);
 //DEBUG
 console.log('Found token (%s): %s', type, JSON.stringify(token));
 return type;
 };
 while (source) {
 if (readToken() === 'code') {
 count++;
 }
 console.log('new source = |%s|', source);
 }
 //DEBUG
 console.log('LOC = ', count);
 return count;
}
$(function () {
		var calculateLOC = function() {
			$('#linesOfCode').text(getJavaLineCount(
				$('#source').val()
			));
		};
	
 $('#source').change(calculateLOC);
		calculateLOC();
});

textarea {
 width: 50%;
}

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="source" rows="15">
/*****
 * This is a test program with 5 lines of code
 * \/* no nesting allowed!
 //*****//***/// Slightly pathological comment ending...
public class Hello {
	@cool
	private String test = "some /* commented */ 'and' \" asd other\" " +
		"string";
 public static final void main(String [] args) { // gotta love Java
 // Say hello
 System./*wait*/out./*for*/println/*it*/("Hello/*");
 }
}
</textarea>
<p>
 <span id="linesOfCode">n/a</span> lines of java code.
</p>

To improve this, I could imagine writing a simple character parser to avoid using regexp, that's assuming that we consider regexps to be too slow in JS.

I know that if you provide it broken java code, it'll not work properly - but I assumed that this case would not be considered a normal scenario in this exercise.

Apart from this, I'm not sure what other kind of errors it might have.
Any suggestions?

Question 2

Sorry, don't really have the time to do a comprehensive review (maybe later I'll edit!) but for now:

Your code actually fails on some cases.

The way that you're finding whether a line counts as a LOC or not is too greedy. (Alternatively, your comment regexes are too strict.) You're reading in basically the whole line as a line of code. This fails for relatively simple cases:

a/*
b*/

should be one line of code, but you treat it as two, since your code regex absorbs the /*. In fact, I'd argue that the way that you choose to count lines of code is odd; it took me a few reads to see what you were doing and find that error. The more natural way perhaps would be to strip spacing and comments, then count non-empty lines.

That also has the advantage of being way faster: you're calling String#substr() a lot, which is pretty expensive.

Aside from that, I only have a few comments:

Keep spacing consistent. Choose either function() or function (); don't use both.
Be consistent in your choice of indentation character. This is easier if you enable showing whitespace in your editor.
Be consistent in string delimiters; don't switch between ' and " without a good reason.
Since you're using a bare object, you can iterate over it with a for..in. If you're really concerned that somebody will put properties into the prototype of Object (which, by the way, should NEVER EVER happen), then add the hasOwnProperty check:

 for (var tokenType in patterns) {
 // if (patterns.hasOwnProperty(tokenType)) ...
 if (m = source.match(patterns[tokenType])) {
 var token = m[0];
 source = source.substr(token.length);
 //DEBUG
 console.log('Found token (%s): %s', type, JSON.stringify(token));
 return type;
 }
 }
 throw new Error('Unknown code pattern!');

Question 3

Thanks a lot for the pointers. I've made a new version that I'll document here later: codepen.io/kayhadrin/pen/MagYOJ

Question 4

@Kayhadrin Alright! Make sure to read this thread on how to post follow-up questions. :-)

Question 5

Thanks for the info. I'm not going to open a new question for it though since I think it's ok for now.

Question 6

For the new version, I've addressed the issue for a java code that could be like a/*\nb*/. I've also used a for(...in) as suggested.

Question 7

The rest of the coding style issues like tabs and spaces, etc... are purely due to me coding in CodePen - not a real text editor by any account. ;-)

Schism SchismSchism 3,64517 silver badges31 bronze badges · Accepted Answer · 2015-08-26 15:44:40Z

Sorry, don't really have the time to do a comprehensive review (maybe later I'll edit!) but for now:

Your code actually fails on some cases.

The way that you're finding whether a line counts as a LOC or not is too greedy. (Alternatively, your comment regexes are too strict.) You're reading in basically the whole line as a line of code. This fails for relatively simple cases:

a/*
b*/

should be one line of code, but you treat it as two, since your code regex absorbs the /*. In fact, I'd argue that the way that you choose to count lines of code is odd; it took me a few reads to see what you were doing and find that error. The more natural way perhaps would be to strip spacing and comments, then count non-empty lines.

That also has the advantage of being way faster: you're calling String#substr() a lot, which is pretty expensive.

Aside from that, I only have a few comments:

Keep spacing consistent. Choose either function() or function (); don't use both.
Be consistent in your choice of indentation character. This is easier if you enable showing whitespace in your editor.
Be consistent in string delimiters; don't switch between ' and " without a good reason.
Since you're using a bare object, you can iterate over it with a for..in. If you're really concerned that somebody will put properties into the prototype of Object (which, by the way, should NEVER EVER happen), then add the hasOwnProperty check:

 for (var tokenType in patterns) {
 // if (patterns.hasOwnProperty(tokenType)) ...
 if (m = source.match(patterns[tokenType])) {
 var token = m[0];
 source = source.substr(token.length);
 //DEBUG
 console.log('Found token (%s): %s', type, JSON.stringify(token));
 return type;
 }
 }
 throw new Error('Unknown code pattern!');

Thanks a lot for the pointers. I've made a new version that I'll document here later: codepen.io/kayhadrin/pen/MagYOJ
@Kayhadrin Alright! Make sure to read this thread on how to post follow-up questions. :-)
Thanks for the info. I'm not going to open a new question for it though since I think it's ok for now.
For the new version, I've addressed the issue for a java code that could be like a/*\nb*/. I've also used a for(...in) as suggested.
The rest of the coding style issues like tabs and spaces, etc... are purely due to me coding in CodePen - not a real text editor by any account. ;-)

Stack Exchange Network

JavaScript program to count lines of Java code

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

JavaScript program to count lines of Java code

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions