Lots of effort seems to be geared towards formalism in proving the correctness of programs. Yet when it comes to comments, there's usually little formal structure. Sure, some languages have ad hoc standards, where certain meta tags are used to extract documentation to an HTML file, etc., but these are not really part of the formal language definition, and there's no enforcement mechanism to speak of.
Wondering if the language researchers have put any thought into making comments a bit easier to maintain and digest?
Posted to Software-Eng by Chris Rathman on 3/1/02; 3:33:21 PM
It's one thing to enforce form (javadoc, style checkers). It's pretty much the same to provide tools (IDEs) that make it easier to follow a form than not. Only peer review (i.e., "Pride of Workmanship") can really provide that extra bit of encouragement to a comment author. Just because a chunk of XML is valid doesn't mean that the values are correct. At least code can have automated tests.
In any case, some software engineering efforts (um, sorry, no references handy) have focused on formalising requirements language, primarily for mathematical proofs of correctness of requirements and, as you mention, code. Perhaps those could be used for comments as well.
- comp.programming.literate has only the FAQ, no active participation.
- when I suggested to the author of the "syntax across languages" page http://merd.net/pixel/language-study/syntax-across-languages.html that his "comments" section should have *some* mention of literate programming, at least in the form that Haskell's ".lhs" files encourage, he decided that this was somehow outside of the realm of syntax, outside of the scope of his page.
- most programmers are too lazy and too focused on the immediate problem to write clear natural language descriptions of what they intend for their code to do.
My $0.02 says that Lisp's docstrings, `describe', `apropos', and so on are great and that other systems should use them too.
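For anyone who hasn't used them, here is a rough Python analogue of what docstrings plus introspection give you. The `describe` and `apropos` helpers below are hypothetical stand-ins written for this sketch (as is the `area` example function); they are not the Lisp functions and not part of Python's standard library. Only `inspect.getdoc` and `inspect.signature` are real library calls.

```python
import inspect


def area(width, height):
    """Return the area of a width-by-height rectangle."""
    return width * height


def describe(obj):
    """Rough analogue of Lisp's describe: signature plus docstring."""
    return f"{obj.__name__}{inspect.signature(obj)}: {inspect.getdoc(obj)}"


def apropos(keyword, namespace):
    """Rough analogue of Lisp's apropos: find callables by name or docstring."""
    keyword = keyword.lower()
    return sorted(
        name
        for name, obj in namespace.items()
        if callable(obj)
        and (keyword in name.lower()
             or keyword in (inspect.getdoc(obj) or "").lower())
    )


if __name__ == "__main__":
    print(describe(area))
    print(apropos("rectangle", {"area": area, "describe": describe}))
```

The point is that the documentation travels with the object at runtime, so tools can query it instead of scraping source files.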
Interesting topic, terrible article!
From a theoretical standpoint, there wasn't much to recommend the article. It was mostly a matter of pragmatics or common sense, things that most programmers know. Mostly the article was about implementation, in which LtU is not necessarily interested. The article was more a springboard for discussing how programming languages might aid the process of commenting code, so I really didn't discuss the article per se.
My $0.02 says that Lisp's docstrings, `describe', `apropos', and so on are great and that other systems should use them too.
I think that these comment structures represent an improvement over totally free-form comments, but I think they are still only a beginning. Oddly enough, many people have complained about how difficult it is to comment out Lisp code, so maybe the constructs represent attempts to overcome the inherent deficiencies of free-form comments in the language.
Personally, I think the solution to the problem comes from limiting each and every function to 5 to 25 lines of code (think Smalltalk). If one keeps a function/method to a minimum, then one does not need to comment each and every line. Indeed, I think that if one gives any comment to anything other than the overall level of the function, then one has not factored the function properly.
So to those who would say that one should write comments within the function pertaining to the exact specifics of the function, I would say that you have written functions that are way too complex. A function should do one thing, and it should do it well.
It's one thing to enforce form (javadoc, style checkers). It's pretty much the same to provide tools (IDEs) that make it easier to follow a form than not. Only peer review (i.e., "Pride of Workmanship") can really provide that extra bit of encouragement to a comment author.
One problem I see is that code is a document with many different consumers. On the one hand, we have the readers who simply want to know the external references (contracts) that are present within the code. On the other hand, we have the programmers who want to know how the code goes about fulfilling that contract, wishing to know whether the code is truthful or whether it is lying about that contract. In addition, there is the end user who just wants the dang software to work.
The point is that the code is read for different purposes by different people. The code may correctly convey information to the machine but its usefulness in the long term is determined by the META-information contained in the comments in the code that are neither compiled nor verified.
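As a partial counterexample to "neither compiled nor verified", here is a minimal sketch using Python's standard `doctest` module, which runs the examples embedded in a docstring and reports whether the comment is lying about the code. The `mean` function and the `check_docstring_examples` helper are made up for illustration; only the `doctest` classes are real library API. Note that this verifies the examples only, not the prose around them.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    >>> mean([1, 2, 3])
    2.0
    >>> mean([10])
    10.0
    """
    return sum(values) / len(values)


def check_docstring_examples(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = attempted = 0
    # Give the examples a namespace in which the function's own name resolves.
    globs = {func.__name__: func}
    for test in finder.find(func, name=func.__name__, globs=globs):
        result = runner.run(test)
        failed += result.failed
        attempted += result.attempted
    return failed, attempted


if __name__ == "__main__":
    print(check_docstring_examples(mean))
```

If someone later changes `mean` without updating its docstring, the check reports a failure, which is more than a free-form comment can offer.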
Worth mentioning is also the Object Constraint Language, an often neglected part of UML, which strives to be some form of language-independent Design by Contract. I guess all specification languages could wind up in special comment sections (provided they can be expressed in ASCII).
Also design by contract, despite its limitations, goes some way towards making what would be comments in other languages executable (and typed) constructs.
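To make that concrete, here is a minimal sketch of contract-style checks in Python. Python has no built-in design by contract, so the `contract` decorator, the `ContractViolation` exception, and the `square_root` example are all hypothetical names invented for this post; Eiffel-style contracts are richer than this (invariants, inheritance rules, and so on).

```python
import functools


class ContractViolation(AssertionError):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre=None, post=None):
    """Attach an optional precondition and postcondition to a function."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # What would be a "must be non-negative" comment becomes a check.
            if pre is not None and not pre(*args, **kwargs):
                raise ContractViolation(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            if post is not None and not post(result, *args, **kwargs):
                raise ContractViolation(f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorate


@contract(
    pre=lambda x: x >= 0,
    post=lambda result, x: abs(result * result - x) <= 1e-9 * max(1.0, x),
)
def square_root(x):
    """Return the non-negative square root of x."""
    return x ** 0.5


if __name__ == "__main__":
    print(square_root(9.0))
    try:
        square_root(-1.0)
    except ContractViolation as err:
        print(err)
```

The conditions are ordinary expressions, so they are checked for syntax and evaluated at run time, unlike the same statements written as prose.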
One language design issue concerns adding language constructs that cannot be checked or enforced by the translator. Many find such constructs problematic. This is one reason why some prefer comments over annotations that are part of the code but are not checked. Contracts are again a useful example, since contracts are used to specify properties that are only checked during runtime, as opposed to static properties specified using type systems. The problem becomes even more extreme when you consider properties that cannot be checked effectively even during runtime, like being deadlock free and other temporal properties. Should these be specified as code, or as comments? Personally, I prefer formal, executable, notations. I concede that this can be problematic, and implies adding to the complexity of the programming language.
One language design issue concerns adding language constructs that cannot be checked or enforced by the translator. Many find such constructs problematic. [...] Personally, I prefer formal, executable, notations. I concede that this can be problematic, and implies adding to the complexity of the programming language.
A bit of support for Ehud's point of view here: Alex Martelli writes voluminously, passionately, articulately, repeatedly, and convincingly (to me, anyway) about this subject in comp.lang.python ... a GooJa search for
compiler express group:comp.lang.python author:Martelli
led me to these paragraphs (each a part of a much longer posting):
July 2001
Subject: Re: not safe at all
http://groups.google.com/groups?selm=9ioonl0jgu%40enews4.newsguy.com
I'd like a language that just lets me state as little or as much as I know and want to express unambiguously -- then the compiler in turn can generate as little or as much static or dynamic typing as it knows how to generate and/or it's directed to generate by options or whatever, but meanwhile I HAVE expressed my design intent in my sources -- *NOT* in possibly-ambiguous comments, but in formal, unambiguous language that MAY, depending on the state of compiler technology, turn out to help direct the compiler to generate appropriate code.
August 2001
Subject: typing system vs. Java
http://groups.google.com/groups?selm=9ke6n409mj%40enews3.newsguy.com
When expressing design intentions in a formal language, I get a *CHANCE* that they will be checked, at least partially, or used for optimization purposes and other inferences. [...] [Expressing these *formally* makes them:]
- totally unambiguous [...]
- potentially checkable [...]
- potentially usable for inferences [...], to get higher-performing code
The first point always holds and it's a key one. Second and third are just possibilities (with current Python technology, for assert, the second does hold, the third one doesn't), but the point is: once the construct is in the language, and therefore in those application programs that might live for the next 20 or 40 years, compiler technology may improve and actually take advantage of what today are just 'potentials', aka possibilities.
November 2000
Subject: Re: Programming Habits in Python
http://groups.google.com/groups?selm=8vlia101jd9%40news2.newsguy.com
What I'd really like is a set of ways to _express_ every possible constraint/expectation that I may consider to be highly relevant for my application -- whether the compiler is able to check it statically, dynamically, or not at all (or, conversely, whether fruitful optimizations may be gained by the compiler relying on my assertions) are secondary issues... the key is that the language lets me STATE things, then, as technology develops, some of them may come in handy, and anyway in the meantime human readers and maintainers are better placed than if the assertions were in comments, in separate docs, or totally absent.
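A small sketch of the distinction Martelli is drawing, using Python's `assert` as he does. The two `average_speed` functions are invented for illustration. In the first the expectation lives only in a comment; in the second it is stated in the language, so it is unambiguous and (unless Python is run with `-O`, which strips assert statements) checked at run time.

```python
def average_speed_commented(distance_km, hours):
    # hours must be positive (nothing ever checks this sentence)
    return distance_km / hours


def average_speed(distance_km, hours):
    # The same expectation, stated formally rather than in prose.
    assert hours > 0, "hours must be positive"
    return distance_km / hours


if __name__ == "__main__":
    print(average_speed(100, 2))
    try:
        average_speed(100, -2)
    except AssertionError as err:
        print("rejected:", err)
    # The commented version silently returns a negative speed.
    print(average_speed_commented(100, -2))
```

This covers only his second point (checkable); nothing here gives the interpreter anything to use for inference or optimization.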