Reasoning behind the syntax of octal notation in Java?

Question 1

Java has the following syntax for different bases:

int x1 = 0b0101; // binary
int x2 = 06;     // octal
int x3 = 0xff;   // hexadecimal

Is there a reason octal uses a bare leading 0 instead of a lettered prefix such as 0o, in line with 0b for binary and 0x for hexadecimal? Is there any documentation on why the designers went this route?

Question 2

Consistency with older languages. It's the same reason Perl, Python, JavaScript, C++, Clojure, etc. do it too.

Question 3

@MichaelT: make that the answer, I think. Maybe add a link to K&R and say "C, there".

Question 4

@Ӎσᶎ tad busy irl at the moment to do a good answer, and it likely goes back much further than C (that C did it for the same reason too). StackOverflow's got a reasonable answer stackoverflow.com/questions/11483216/…

Question 5

@MichaelT By the way, Python ditched that syntax in favor of OP's proposal (0o12345670) as one of the backwards-incompatible changes of 3.0.

Question 6

@Danny the key to searching well is knowing what you are looking for before searching. This isn't always practical if you don't know where to start from. The key wording is 'leading zero': searching Google for 'leading zero octal' gives that SO question. Another search I did (which didn't find the answer I was looking for) was 'octal literal algol', which took me to rosettacode.org/wiki/Literals/Integer, which is a most interesting read.

Question 7

Java syntax was designed to be close to that of C; see e.g. page 20 of James Gosling's keynote "How The JVM Spec Came To Be" from the JVM Languages Summit 2008 (20_Gosling_keynote.pdf):

C syntax to make developers comfortable

In turn, this is how octal constants are defined in the C language:

If an integer constant begins with 0x or 0X, it is hexadecimal. If it begins with the digit 0, it is octal. Otherwise, it is assumed to be decimal...

Given the above, it is natural that the Java language designers decided to use the same syntax as C.
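As a minimal sketch of how that C rule carries over to Java, the snippet below prints the values of the three literal forms and parses the same prefixes from strings. The class name OctalLiterals and the helper decode are made up for illustration; Integer.decode is the standard library method that applies the 0x and leading-0 rules to text (it does not understand 0b).

```java
class OctalLiterals {
    // Hypothetical helper: parses text using the same prefix rules as the
    // quoted C definition (0x/0X means hex, a leading 0 means octal).
    static int decode(String text) {
        return Integer.decode(text);
    }

    public static void main(String[] args) {
        int binary = 0b0101; // 5
        int octal = 010;     // 8, not 10: the leading zero switches to base 8
        int hex = 0xff;      // 255
        System.out.println(binary + " " + octal + " " + hex); // prints "5 8 255"

        System.out.println(decode("010"));  // prints 8
        System.out.println(decode("0x10")); // prints 16
        System.out.println(decode("10"));   // prints 10
    }
}
```

The practical pitfall is the middle case: padding a decimal constant with a leading zero silently changes its value.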

As pointed out in this comment, Stack Overflow has a reasonable answer here:

All modern languages import this convention from C, which imported it from B, which imported it from BCPL.

Except BCPL used #1234 for octal and #x1234 for hexadecimal. B has departed from this convention because # was an unary operator in B (integer to floating point conversion), so #1234 could not be used, and # as a base indicator was replaced with 0.

The designers of B tried to make the syntax very compact. I guess this is the reason they did not use a two-character prefix.

Question 8

You can even go back further. Since C was derived from B, you will find an explanation in Thompson's B Manual, section 4.1 (Primary Expressions): "An octal constant is the same as a decimal constant except that it begins with a zero. It is then interpreted in base 8. Note that 09 (base 8) is legal and equal to 011. A character constant is represented by ' followed by one or two characters (possibly escaped) followed by another '. It has an rvalue equal to the value of the characters packed and right adjusted."
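The manual's claim that 09 equals 011 follows if each digit is simply folded in as value * 8 + digit without checking that the digit is below 8: 0 * 8 + 9 = 9, and 011 is 1 * 8 + 1 = 9. The sketch below reproduces that arithmetic in Java. It is an assumption about how such a lenient reader could work, not a transcription of B's compiler, and the names LenientOctal and parse are hypothetical. Java itself rejects the literal 09 at compile time.

```java
class LenientOctal {
    // Hypothetical helper: accumulates value = value * 8 + digit and, unlike
    // Java or modern C, accepts the digits 8 and 9 as well.
    static int parse(String digits) {
        int value = 0;
        for (char c : digits.toCharArray()) {
            if (c < '0' || c > '9') {
                throw new NumberFormatException("not a digit: " + c);
            }
            value = value * 8 + (c - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parse("09"));                // prints 9
        System.out.println(parse("011"));               // prints 9
        System.out.println(Integer.toOctalString(9));   // prints 11
    }
}
```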

Question 9

@Jérôme yup, I intentionally didn't go deeper once the original problem was answered. As for the history of how C got that octal syntax, that would be a different question, one that would probably be better asked and answered separately (and BTW it already was).

Question 10

@Jérôme "Note that 09 (base 8) is legal": that makes no sense to me. Why would 09 be legal if the leading 0 is meant to mark octal constants?

Question 11

@Danny offering pragmatic convenience and small syntactic perks for programmers like legalizing 09 is probably what made C language win over pure, strict, logical and oh-so-friggin' scientific crap like Pascal and other Modulas

Question 12

No, C did not import octal from B. C was developed on the PDP-11, and the natural form of binary values on that machine was octal. Assembly language was full of it. Octal shows up again in Unix permissions. Nobody even considered hex in the DEC world at that time.
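The Unix permissions case shows why octal reads naturally there: each octal digit covers exactly three bits, one rwx triple. A small Java sketch, assuming the usual 9-bit owner/group/other layout; the names OctalPermissions and toSymbolic are made up for illustration.

```java
class OctalPermissions {
    // Hypothetical helper: renders the low 9 bits of a mode as rwxrwxrwx,
    // printing '-' for every bit that is not set.
    static String toSymbolic(int mode) {
        String flags = "rwx";
        StringBuilder out = new StringBuilder();
        for (int bit = 8; bit >= 0; bit--) {
            boolean set = ((mode >> bit) & 1) == 1;
            out.append(set ? flags.charAt((8 - bit) % 3) : '-');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSymbolic(0755)); // prints rwxr-xr-x
        System.out.println(toSymbolic(0644)); // prints rw-r--r--
    }
}
```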

gnat · Accepted Answer · 2013-12-18 21:35:58Z

