auto break case char const continue default do double else enum extern
float for goto if int long register return short signed sizeof static
struct switch typedef union unsigned void volatile while
[BK/DR,1988,p.192,A2.4]
;
[9899:1999]
additionally defines
inline restrict
and
the
[partially optional]
types
_Bool
_Complex
_Imaginary
[9899:2011]
additionally defines
_Alignas
_Alignof
_Atomic
_Generic
_Noreturn
_Static_assert
_Thread_local
(often used in macros rather than directly)
[
questions on:
auto (stack alignment),
const (orthogonality to volatile),
int (string to int),
sizeof (orthogonality),
struct,
typedef,
union,
unsigned (often not needed),
volatile (orthogonality to const),
/
keywords vs. reserved words
]
alignas alignof
and and_eq asm auto bitand bitor bool break case catch char
char16_t char32_t
class compl
const const_cast
constexpr
continue
decltype
default delete do double dynamic_cast else
enum explicit export extern false float for friend goto if inline int
long mutable namespace new
noexcept
not not_eq
nullptr
operator or or_eq private
protected public register reinterpret_cast return short signed sizeof
static
static_assert
static_cast struct switch template this
thread_local
throw true try typedef
typeid typename union unsigned using virtual void volatile wchar_t while
xor xor_eq
[ISO/IEC 14882:2011].
and and_eq bitand bitor bool compl const_cast dynamic_cast
explicit export false mutable namespace not not_eq or
or_eq reinterpret_cast static_cast true typeid typename using wchar_t
xor xor_eq
were introduced with
[BS,1997]
(p.794, A.2).
alignas alignof char16_t char32_t constexpr decltype noexcept nullptr
static_assert thread_local
were introduced with
[ISO/IEC 14882:2011].
[
questions on:
auto (stack alignment),
const (orthogonality to volatile),
int (string to int),
mutable (vs. const),
operator (precedences),
sizeof (orthogonality),
struct,
typedef,
typeid,
union,
unsigned (often not needed),
virtual,
volatile (orthogonality to const),
and/and_eq/bitand/bitor/compl/not/not_eq/or/or_eq/xor/xor_eq,
]
goto is not strictly a
keyword
[since it is only reserved but not used in that language;
however it is listed as a keyword
{to allow compilers to issue more informative diagnostics and
minimize Java/C++ porting problems}].
The reserved words in the C programming language are
[as mentioned]
its keywords;
additionally some identifiers must not / should not be used,
e.g. those beginning with the underscore character
[reserved for the C system/implementation/library
(and also used for keywords added later,
such as _Noreturn)].
[
C keywords
/
C++ keywords
/
Java keywords
/
Turtle keywords
/
coding guidelines about identifier names
/
coding guidelines about name space
/
scope and identifier naming
]
<% %> <: :> %: %:%:
meaning
{ } [ ] # ##)
and
trigraphs
(??= ??( ??< ??/ ??) ??> ??' ??! ??-
meaning
# [ { \ ] } ^ | ~)
support restricted character sets
(that are obviously missing one or more of the characters
#[\]^{|}~).
C++ supports
keywords
(and and_eq bitand bitor compl not or or_eq xor xor_eq not_eq
meaning
&& &= & | ~ ! || |= ^ ^= !=
{more consistently [but less conveniently]
and_eq would be bitand_eq,
or_eq bitor_eq,
xor bitxor,
and
xor_eq bitxor_eq})
that can replace
!&^|~.
[In C, available as macros via
iso646.h.]
However, one cannot do without the special characters
"%'()*+,-./:;<=>?_
{short of obfuscating practices}.
Some of the
ASCII characters
($@`)
are not used in C/C++ programming languages.
[
digraph-hello-world.cc
/
trigraph-hello-world.c
]
and &&
and_eq &=
bitand &
bitor |
compl ~
not !
not_eq !=
or ||
or_eq |=
xor ^
xor_eq ^=
[
digraphs
/
trigraphs
]
man atoi' (Unix manual command [see
question on unix man
too]) or what's appropriate on your system.
[Unfortunately atoi does not have a dedicated error return
value (0 is also a valid non-error return value);
for that reason, the better
strtol,
which features an
(optional) additional reference parameter and a parameter denoting
the input base, should be used.]
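[A minimal sketch of such strtol-based parsing (the helper name and error policy here are merely illustrative):
#include <errno.h>
#include <stdlib.h>
/* returns 0 on success, -1 on parse error or overflow */
static int parse_long(const char* s, long* out)
{
    char* end;
    errno = 0;
    long v = strtol(s, &end, 10); /* base 10; base 0 would also accept 0x.../0... prefixes */
    if (end == s || *end != '\0') {
        return -1; /* no digits at all, or trailing garbage */
    }
    if (errno == ERANGE) {
        return -1; /* value outside LONG_MIN..LONG_MAX */
    }
    *out = v;
    return 0;
}]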
[
integer conversion question
/
how to do it in C++
]
extern void qsort(void* base, size_t count, size_t width, int (*compare)(void const* a, void const* b));
is called with an array of data to sort [base],
the number of elements to sort [count],
the size of a single element in bytes
[width; an array element may be a large structure or just a pointer],
a compare function [compare];
count and width are often expressed in terms of sizeof.
A typical qsort compare function [helper function] looks like
[T being a data type, e.g. struct stat]:
static int cmp_func(void const* a, void const* b)
{
T const* aa = (T const*)a;
T const* bb = (T const*)b;
if (aa->x != bb->x) {
return (aa->x > bb->x ? 1 : -1);
}
if (aa->y != bb->y) {
return (aa->y > bb->y ? 1 : -1);
} ...
return 0;
}
all data accesses are const,
the helper function can be static
[i.e. not globally visible],
the parameters are passed as generic [void] pointers
and are cast to the target type [while remaining const],
the most significant data element is compared first,
then, if it's equivalent, the next one, etc;
zero return means 'equivalent'.
If there is an additional level of indirection
(as with an array of pointers to data),
T const* aa = *(T const**)a;,
etc is chosen.
Note that since
typical qsort implementations are not stable/preserving
[i.e. an existing order in groups of elements of equivalent sort key
may/will change],
the compare functions often compare up to a unique key
[e.g. when sorting directory entries by time or size:
sort by that and, if equal, by {unique} name].
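[A hedged sketch of such a sort-up-to-a-unique-key compare function; the entry structure and its field names are merely illustrative:
#include <string.h>
#include <time.h>
struct entry {
    time_t mtime;   /* primary sort key: modification time */
    char name[256]; /* unique tie-breaker */
};
static int cmp_entry(void const* a, void const* b)
{
    struct entry const* aa = (struct entry const*)a;
    struct entry const* bb = (struct entry const*)b;
    if (aa->mtime != bb->mtime) {
        return (aa->mtime > bb->mtime ? 1 : -1);
    }
    return strcmp(aa->name, bb->name); /* equal keys: fall back to the unique name */
}]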
[
what is quicksort
/
qsort's drawbacks
/
sort-up-to-a-unique-key pattern
]
strtok_r}),
implements only limited patterns
and the buffer to be parsed is modified.
Typically it's more convenient [for a parser] to return the whole
[parsing-] state to the caller and use an end-index instead of [nul]
string termination
[lrctpz.zip:ctpzut.cc:SetParser::enumerate
(C++ sample)
implements
such a parser pattern,
with the help of a (simple) StringRange helper class].
Java's
nio's
Buffer,
with its 'viewed buffer'
[layering]
abstraction,
implements
this useful pattern
and
multiple indices
['capacity', 'limit', 'position', 'mark']
too.
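[A minimal C sketch of that pattern (not the actual SetParser/StringRange code): the caller owns the scan position, and the buffer is never modified:
#include <stddef.h>
/* advance *pos past one comma-separated token; return its start, or NULL at the end */
static const char* next_token(const char* s, size_t len, size_t* pos, size_t* tok_len)
{
    if (*pos >= len) {
        return NULL;
    }
    size_t start = *pos;
    while (*pos < len && s[*pos] != ',') {
        (*pos)++;
    }
    *tok_len = *pos - start;
    if (*pos < len) {
        (*pos)++; /* skip the separator for the next call */
    }
    return s + start;
}]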
private, protected, public, friend]
that control scope based on class relationships.
[
coding guidelines' scope chapter
/
Java's access modifiers
]
typeid operator returns a class
type_info reference, which, as the name suggests, carries
data type information.
type_info provides a printable name [as string]
and comparison operators [to check for type equality and make
type information data sortable].
type_info is part of C++'s run time type information
(RTTI) support [as dynamic_cast also is].
RTTI support instantiates several helper data
structures per class [which are usually linked into the classes'
virtual
tables] and sometimes requires extra compiler options.
tvs
(trap on overflow set)
and e.g. the {esoteric}
taddcctv
(tagged add and modify integer condition codes and trap on overflow);
x86:
into
(interrupt on overflow)].
The carry flag and the sign flag aren't directly
accessible from the C language either.
10,
0x10 and 010).
[As number literals typically shouldn't be used in code anyway
(see
here),
this restriction isn't too severe.]
A C programmer typically isn't concerned with a machine-internal number
representation [two's complement,
bcd,
endianness,
etc]
either
(lower level libraries are).
The conversion from [external] string representation to
internal number representation should be accomplished by
strtol (which features an input base parameter and
an optional scan-end-position reference parameter [which can be
used for error checking and further scanning]; possible
overflow
is checked too [since ints,
longs, etc can only hold a limited number range]).
[Don't use scanf and atoi, for different reasons;
see
'atoi' question
too.]
Conversion from internal representation to a string
('printing') is usually done with sprintf (better,
if available: snprintf).
[s][n]printf unfortunately offers only the d,
x and o (decimal, hexadecimal and octal)
format specifiers, i.e. not all representable bases are
supported [unlike e.g. with Lisp, which offers the optional
:base and :radix parameters and the
~nR print-format; input can be
specified by #nr...
{e.g. #3r...}, #x...,
#o... or {binary!} #b...].
Numbers' string representations usually allow bases from
2 to 36 (inclusive) [using the 10 digits and 26 Latin letters].
[
coding guidelines' chapter on sign extension
/
how to convert to integer in C++
]
istream/ostream operator>>/operator<<
mechanisms.
E.g. to convert from string,
use
istrstream s(input);
int i;
s >> i;
if (s.tellg() == 0) ...
[scan position advancing used to check for parse errors].
Note that the ios class, from which istream
and ostream [and istrstream, etc] derive,
does not implement an input/output base,
but only some special bases [dec, hex, oct,
implemented as flags].
[
C conversion methods; additional background information
]
char
to a larger numerical type
[e.g. to an int;
see
here].
Often chars are signed and widening will
extend that sign, so [non-ASCII] characters above 0x7f will result in
negative ints [with leading 1-bits {ffffff
prefix}];
e.g. a MIME-quoted-printable encoding '=FF' in
this case may be interpreted as 'ÿ' [y with diaeresis; in iso-latin-1];
a URL-encoding '%FF' may lead to the same.
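[A minimal sketch of the issue and its usual fix (cast to unsigned char before widening):
#include <stdio.h>
int main(void)
{
    char c = '\xff';                       /* e.g. a decoded '=FF' / '%FF' byte */
    printf("%d\n", (int)c);                /* may print -1 where char is signed */
    printf("%d\n", (int)(unsigned char)c); /* always prints 255 */
    return 0;
}]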
1<<0, 1<<1, 1<<2, ...})
[runtime O(n),
where n is the bit size of an integer {e.g. 32}],
or by a binary search
[through an array of the powers of two]
[runtime O(lb(n)), {e.g. lb(32) == 5}].
Taking advantage of the usual
two's complement integer representation,
one may scan for the highest bit
[runtime O(n)],
or apply bit patterns
[that implement a specialized binary search
{a binary binary search?;-)}]
[e.g. on 32-bit platforms
bt(0xffff0000u);
bt(0xff00ff00u);
bt(0xf0f0f0f0u);
bt(0xccccccccu);
bt(0xaaaaaaaau);
where bt(m) is
i <<= 1;
if ((p = n & m) != 0) i |= 1, n = p;]
[runtime O(lb(n))].
Note that many processors implement opcodes to do that,
such as bit-scan, highest-populated-bit, etc;
these however are not necessarily faster than the solutions above.
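[A complete 32-bit variant of that bit-pattern approach (a hedged sketch; the result for n == 0 is left unspecified here):
static int highest_bit(unsigned int n) /* index 0..31 of the highest set bit */
{
    int i = 0;
    unsigned int p;
    if ((p = n & 0xffff0000u) != 0) i |= 16, n = p;
    if ((p = n & 0xff00ff00u) != 0) i |= 8, n = p;
    if ((p = n & 0xf0f0f0f0u) != 0) i |= 4, n = p;
    if ((p = n & 0xccccccccu) != 0) i |= 2, n = p;
    if ((p = n & 0xaaaaaaaau) != 0) i |= 1, n = p;
    return i;
}]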
[
ffff0000 and similar bit patterns
]
alloca shouldn't be used either
since it is not so
portable/robust
and tends to confuse debuggers.
[
stack based programming languages
/
stack alignment
/
pascal style stack clean-up
/
stack corruption
/
stack growing up or down
/
stacks in recursion
/
stack size in quicksort implementations
/
stack overflow with recursion
]
double, long long, long double, etc)
[this is of course especially the case for CISC machines,
which often don't really require data alignment
{though they run faster with it}],
so no additional steps need to be taken
besides subtracting the required number of bytes.
If not:
the stack frame may be aligned by rounding it down
[e.g. {x86} esp &= ~0xf]
(in the case of stack growing down);
this typically requires saving its old contents
[e.g. in {x86} ebp].
Global data is usually aligned at least as generously as stack data.
[This issue is handled by compilers,
not the C programmer,
but it is worth some consideration.]
longjmp can leave corrupted stacks
[if used inappropriately].
An
overflow
of a local buffer is the usual reason for stack corruption.
htonl function family
['host to network byte order';
htons, htonl, htonll,
ntohs, ntohl, ntohll].
hton/ntoh operations are cheap
and happen to be no-ops
[identity operations]
for one class of systems
[the
big endian
systems].
With the
little endian
systems,
the ntoh functions are identical to the corresponding hton function,
reversing the order of the bytes.
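[A minimal sketch of hton use when writing a 32-bit value into a wire buffer; the helper name is merely illustrative, and the reading side mirrors it with ntohl:
#include <arpa/inet.h> /* htonl/ntohl; header location may vary per system */
#include <stdint.h>
#include <string.h>
static void put_u32(unsigned char* buf, uint32_t value)
{
    uint32_t n = htonl(value); /* identity on big endian hosts, byte swap on little endian */
    memcpy(buf, &n, sizeof n); /* byte order on the wire is now well defined */
}]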
[
network byte order
/
"[htonl] statement has no effect" warning
]
htonl function family),
or use ASCII encoding
(i.e. use a more human readable
and system independent representation).
Use
the snippet
printf("%x\n", *(int const*)"abcdefghijklmnop");
to generate an endianness code,
e.g. 64636261 for little endian, 32 bit
[ASCII encoding].
[
network byte order
/
document normalization
]
gotos
{which only allow jumps within a function}
to change such states in fine-granular ways
{often
[the better abstractions than goto?]
break and continue
are sufficient}].
[Breaking out of nested loops in C/C++ requires gotos too,
unlike with
Java's
multi-break.]
I prefer early returns to block nesting;
often 'normal' code runs down from top to bottom of a function
and special cases and error cases
can be handled in if-blocks with early-returns.
If-statements in the form of
if (...) {...} else if (...) {...} else if (...) {...} ... else {...}
[tail nesting]
are conceptually not considered nested
and are not indented
[K&R2].
[Tail-nested ternary statements
can be formatted in a similar way,
see e.g.
ipow sample.]
Complicated nested if/else blocks
are usually better handled with a revised data model.
[
code nesting
/
strtok nesting
/
multiple returns
/
iterator pattern
]
p = calloc(n, m);
is semantically equivalent to
p = malloc(n * m);
if (p != NULL) memset(p, 0, n * m);,
typically meaning:
allocate an array of n data objects of size m bytes each
[m typically expressed in terms of sizeof],
with implicit initialization to 0.
In C/C++ 0 has
a well defined and useful meaning
for the primitive types
{and hence in C for the aggregate types;
C++'s object composition includes additional stuff,
like pointers to meta data [needed for polymorphism],
which must not be 0}.
n*m is
the over-all size.
It is a good practice
to initialize
data [buffers],
especially dynamic memory,
before use,
to make code more
deterministic,
and robust
(which is especially of help at bug finding time
['searching for bugs that show non-deterministic symptoms
is an unpleasant task at best']).
In symmetry with initialization of
stack based data,
I prefer the malloc/memset approach.
Dynamically allocated memory is returned [to the heap] with free
[however, the heap {the area which serves malloc et al.}
typically doesn't shrink
{use mmap based allocation for that}].
[
determinism
/
program robustness
/
stack vs. heap
]
's,//\(.*\),/*\1 */,',
but that doesn't consider // in 'problematic'
lexical contexts, such as in a string literal!
The better answer is to use a parser [more precisely: a lexical analyzer]
such as
ccomment
['Care has been taken of: strings, character literals, constructs
like '//.../*...*/' and performance.'].
__VA_ARGS__
meta macro],
which however supports
only simple things
[such as forwarding the argument list as a whole;
i.e. it isn't very flexible either;
e.g. parsing the variable argument list is actually not possible].
C++'s inline code and template code [possibly
together with function overloading] are acceptably flexible.
gets.
Use fgets instead.
Why: possible
buffer overflows
that cause
program robustness problems
and severe security risks.
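[A minimal fgets sketch; the helper name is merely illustrative:
#include <stdio.h>
#include <string.h>
/* read one line into line[0..size-1]; returns 0 on success, -1 on EOF/error */
static int read_line(char* line, int size)
{
    if (fgets(line, size, stdin) == NULL) {
        return -1;
    }
    line[strcspn(line, "\n")] = '\0'; /* strip the trailing newline, if any */
    return 0;
}]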
access or write.
struct _finddata_t [for directory traversal]
and the FAT file attributes [read-only, hidden, system, archive]
can be found there as well.
Some of the io.h API maps to the Posix API
[on Unix found in unistd.h],
other parts are additions [such as chsize].
io.h and IO.H are synonyms since win32 filesystems, such as FAT [FAT16,
FAT32] and NTFS, are case-insensitive.
Header files for the standard libraries are, like the C [and C++] development kits,
not included in standard win32 operating system installations;
they come with those development kits.
errno is stated as
'integer expression' (not 'integer variable') in
K&R2
[and typically implemented by means of a function returning a
pointer {to thread-local data}].
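[An illustrative sketch only (glibc-style naming used for illustration, not any particular libc's verbatim code) of how errno can be an expression rather than a plain variable:
extern int* __errno_location(void); /* returns the calling thread's errno slot */
#define errno (*__errno_location())]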
Sometimes, environ,
_argv or __argv,
etc,
are found,
but are proprietary and not
portable.
Whatever is external
[accessible] in libc but not defined in published header
files can be considered hidden
[but of course should not be used!].
[
question on reentrancy
]
void*,
can't be checked;
object-oriented programming languages use the safer
polymorphism
and inheritance [hierarchies] for that.
Older languages such as BASIC mostly didn't apply
static type safety, leading to
[possibly] late run-time errors.
Modern counter-examples to type safety are:
C's unsafe/unchecked casts,
C/C++'s use of integers as booleans,
Java's polymorphic Vector/Hashtable/etc.
StringTokenizer and tiny
details like Thread.getName missing in
J2ME/MIDP/CLDC].
You might expect code of good quality to be portable.
System specific code is often separated from the core
code [or from most code] to keep that portable.
[
portability types
/
typedefs
/
minimizing Java porting problems by adding C/C++ keywords
/
unportable 'hidden' variables
/
robustness
/
C++ to Java
/
C to Java
/
Java to C++
]
int&;
holder needed or slight redesign].
C++ things typically not needed include:
default parameter values
[overloading is sufficient],
static function variables
[not useful with multithreading or
reentrancy],
macros
[unsafe because of side effects],
conditional compilation
[separable into platform-specific jars].
[
C to Java
/
Java to C++
/
differences between Java and C++
]
i = j = 0;
f(g(), i++, ++j);
i = f(k = 2, h(g()), i);
sometimes it may be convenient
to have such language constructs.
[That however doesn't mean you have to or should use them.
Readability and debugging convenience will decrease
when using nested or chained expressions;
e.g. source code debugging doesn't show intermediate data,
sub-expressions can't be reassigned a value,
{at least not easily, without showing/using [cpu] registers},
etc].
Unless necessary,
C doesn't restrict [certain] expressions [e.g. assignments]
to a special statement case
[which would possibly make the language less usable
and the grammar more complex].
C's operators are defined for different data types
[e.g. math vs. pointer arithmetic];
sizeof is defined for
expressions [including variables]
and types.
[Counter-intuitively to some]
C's const and volatile keywords
are orthogonal
[i.e. don't exclude each other]:
const means 'can't be modified by the code'
[relevant for putting data in {shared} read-only segments
and for contracts on data access policies];
volatile means 'is subject to asynchronous change'
[relevant to optimizations done by compilers];
[9899:1999]
gives the example of a hardware clock
[mapped to a non-writable memory address],
declared using both qualifiers.
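[An illustrative declaration along those lines; the address and type here are assumptions:
#define HW_CLOCK (*(volatile const unsigned int*)0xFFFF0100u)
/* read-only for the program (const), asynchronously updated by hardware (volatile) */]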
Another example of orthogonality is Lisp's sequence operators,
which work not only on lists but also on arrays, strings, etc
[the concept of sequences here is
somewhat orthogonal to the concept of lists,
however this may be looked on as
polymorphism;
some consider Lisp not to be an orthogonal language
since often there's more than one construct
implementing a concept,
e.g. if/else is doable by
if,
when,
unless,
cond,
and,
or, ...
{question:
compact orthogonal set
or compact orthogonal working set?}].
[
orthogonality
/
C programming language overview
/
C programming language coding guidelines
]
abstract assert boolean break byte case catch char class const continue
default do double else enum extends final finally float for goto if
implements import instanceof int interface long native new package
private protected public return short static strictfp super switch
synchronized this throw throws transient try void volatile while
;
The keywords
const goto
are reserved, though not used.
The words
false null true
[technically] are considered literals.
strictfp
was introduced with
[JG/BJ/GS/GB,2000],
assert enum
were introduced with
[JG/BJ/GS/GB,2005].
[
Java programming language specifications
/
abstract / final / private / protected / public / static classes
/
interface (to separate declaration and definition)
/
synchronized features
/
use of volatile keyword
]
T* const t = ...)
is supported
by the final keyword
(due to its flow analysis, Java
also supports blank final fields and variables).
Constantness / immutability
of referenced objects / data
(C++: T const* t;)
isn't explicitly supported;
it is usually contractual
(e.g. String),
sometimes found in the type's name
(e.g. BigInteger vs. MutableBigInteger).
'Constness' is considered
rather logical than physical
with more abstract (object oriented) programming languages
(C++ as opposed to C);
C++ needs the additional mutable keyword
in such contexts.
Physical constantness
(constant, usually non-OO, data put into read-only sections)
isn't used with Java's heap object data.
goto statements
(i.e. some gotos wouldn't be valid
or would make the code flow check infeasible).
However, labeled continues and breaks are permitted
(since no valid initializations are skipped
when e.g. jumping out of a for loop).
interface
declaring the methods (functions) in question
and provide (separate) implementation(s) for it.
The implementations can be instantiated
via
reflection
((interface)Class.forName(String).newInstance())
to totally decouple code
(at compile-time);
often this is accomplished [indirectly] via
a JNDI (service locator) framework.
int unsignedValue; ...;
long value = unsignedValue & 0xffffffffL;
(similar masking is used in
BigRational code)].
Object
(i.e. Java's type hierarchy is single-rooted);
Object is known to be a super class
(e.g. of classes defined in the java.lang package
[and of course all additional classes]).
Nested classes were
introduced
in
[JG/BJ/GS/GB,2000].
private,
default/package/friendly,
protected,
and public;
each being a subset of the following,
i.e. the access modifiers are not
orthogonal
(protected access includes package access).
Private fields and methods are only accessible by the implementing class itself,
default/package/friendly (no keyword) by all code in the package,
protected additionally by derived classes' code,
and public generally.
Having protected imply default [package] access
makes the language simpler.
In a good object oriented approach
private would be all data fields
and helper methods,
protected only helper constructors
and some helper methods needed by derived classes,
and public the API
(that possibly implements a public interface).
synchronized keyword
does provide
guarding of critical sections and
memory synchronization.
It [usually]
doesn't trigger
task switches,
which may lead to thread starvation:
one or more threads don't get to run
[see
synchronized block based code sample
and
java.util.concurrent.locks based code sample
(both implementing [default] non-fair and fair monitor/lock use,
the first by explicit code)];
a higher thread priority doesn't help,
nor does [Thread.]yield;
in a robust approach
only a kind of wait/notify paradigm
or
Doug Lea's
fair locks
[now integrated into Java 1.5 [API],
as java.util.concurrent.locks]
will work;
these are however collaborative constructs
(and constitute some {possibly unwanted} level of dependency).
A JVM's monitors could be implemented for fair scheduling,
but [typically] aren't,
for performance reasons.
Note that a multi-processor system may defuse
thread starvation problems a bit
[buy-decent-hardware [for [Java] servers] pattern].
[
flush by synchronized
/
double check locking pattern
/
reentrancy
/
reentrancy problems on multi-processor systems
/
thread starvation sample
/
thread deadlock sample
]
String's, where however not even volatile matters),
or if there's one writer thread only,
updating one variable (that actually is updated atomically) only.
Use of
volatile
earlier may have indicated a degree of cluelessness.
Some of its uses seemed to be broken
just like the
double check pattern.
Further it introduces dependencies
[and therefore increases complexity, and decreases robustness;
e.g. HashMap's
transient volatile int modCount;
and its unguarded updates modCount++;
that are likely not atomic
(getfield modCount:I;
iconst_1;
iadd;
putfield modCount:I;)
assume that it doesn't matter to lose
one [concurrent] update,
as long as not all relevant updates are lost
(that aren't, since modCount is only counted up,
and only one bit of information {'changed'} is used)].
Consider also using types of the
(also non-blocking) AtomicReference family.
[
synchronized keyword
/
memory synchronization object
/
code robustness
]
java.util.Iterator is
a semi-semi-robust iterator;
it additionally includes
a remove method and
[better] takes care of the underlying data structure's
modification status
[one 'semi' here arises from the fact that
only remove functions
are considered but not insert functions,
the other from the fact that such operations are only allowed via
the iterator's methods
but not via the normal container methods;
asynchronous {multithreading} issues are
not considered {at all},
since they question the validity of returned elements
which are being processed;
omitting insertions makes
the iterator implementation simpler and
leaner
{and more
robust
in a different sense}
and avoids questions about iterator-after-insertion semantics;
removal during iteration is frequent
but insertion rare,
samples being
elements deregistering themselves].
[
race conditions with iterators
]
AbstractCollection implements
an unsynchronized method remove(Object)
with the help of an Iterator,
which may throw a ConcurrentModificationException,
e.g. in the case of the concrete class ArrayList
(which indirectly inherits from AbstractCollection,
and whose Javadoc informs on this behavior).
Safer is the similar Vector,
which implements a synchronized remove(Object).
Another alternative is the use of a wrapper collection
returned by Collections.synchronizedCollection(...).
Yet better is the use of a more appropriate data structure,
e.g. a Hashtable
when remove(Object) operations are frequent
[Hashtable is synchronized
and its remove doesn't need to iterate].
[
read manual pages to obtain robust code
]
Objects
[not compile-time type-safe, primitive types have to be boxed]
or by using (additional) holder type parameters
[as e.g. in CORBA implementations].
[
multiple return statements
/
porting C++ code to Java
]
forward, left, etc.
Samples of these
Java turtle graphics
can be found
here.
Insights into turtle geometry implementation may be gained from
com/lrdev/turtle/TurtleCore.java
[part of the
com.lrdev.turtle package;
includes wrapping, clipping and scaling].
back drawstate forward heading home left nowrap pendown penup right
setheading setx setxy sety towards wrap xcor ycor
/
bk fd lt pd pu rt seth
void draw(Turtle turtle, double distance, double distanceFactor, double omega, int depth) {
if (depth > 0) { /* recursion-end check */
turtle.forward(distance); /* trunk */
turtle.left(omega); /* orient towards left branch */
draw(turtle, distance * distanceFactor, distanceFactor, omega, depth - 1);
turtle.right(2 * omega); /* orient towards right branch */
draw(turtle, distance * distanceFactor, distanceFactor, omega, depth - 1);
turtle.left(omega); /* back to origin */
turtle.back(distance);
}
}),
and
fractaltree.java
(and
its
jar)
for a more complex solution
including GUI elements (sliders) to adjust parameters at runtime,
as well as coloring and different strokes (using Graphics2D).
[Java Swing types/components used:
JFrame (the application window),
JPanel (the paint window, and the sliders panel),
JSlider (parameter control),
JSplitPane,
BorderFactory/TitledBorder (parameter controls naming / value display),
ChangeListener (parameter change propagation, dependency considerations (enabling/disabling additional controls));
Graphics2D types used:
Graphics2D,
BasicStroke,
Color,
Rectangle;]
[
recursion
/
hilbert curve
]
BigRational.java can be found
here (javadoc)
[and
here (source)].
The big rational number type lets you do the arithmetic
operations addition, subtraction, multiplication and division
without overflow and without loss of precision.
[The data type typically grows {concerning memory use and calculation
overhead} with its size and precision.]
Note that logarithms, roots, etc cannot be handled precisely with a big
rational number type, since their results typically don't fall into the
rational number domain.
[
about the BigRational testcases
/
about BCDs
/
question on integer overflows
/
rounding
/
add rational numbers
/
sample/simple big rational calculator
]
BigRationals
you would either say
c = a.add(b); or c = b.add(a);
(addition is commutative [and
associative]).
Java
doesn't support operator overloading,
so expressions such as
c = a + b;
won't be accepted for other than primitive types
[with an exception of String,
so you could say
c = "+3/5" + "+7/11";
but that doesn't do the actual addition,
and relies on using an explicit sign
(i.e. a denormalized representation)].
Have a look at
BigRational.add(BigRational)
on how it's implemented by means of [big] integers
(representing numerator and denominator):
as an optimization, to avoid expensive operations
(gcd, multiplication) and object creation overhead,
the cases of one or both operands being zero are treated specially,
same denominator is treated specially,
one or both operands being an integer is treated specially;
all avoiding additional big integer multiplication(s).
The general solution is then
new BigRational(m_n.multiply(that.m_q).add(that.m_n.multiply(m_q)),
m_q.multiply(that.m_q));
(BigRational(BigInteger,BigInteger)
then normalizes numerator and denominator
to have
no common factors
[by means of
gcd]).
[
Java BigRational
/
add associativity
]
BigRational.valueOf(double).floatValue() can also be used
to convert binary64 (double) to binary32 (float);
however, a [[Java] language provided] simple cast
(float)double has the same effect,
with less overhead
[memory-wise and computationally;
being in essence no more than stripping excess bits
and a simple rounding operation
(provided the exponent fits into the smaller size)].
double [floating point number]:
if the big number contains more than 53 significant bits
(if representable as an IEEE 754 normal number,
else even less),
or represents a rational with a denominator other than a power of two,
or carries a binary exponent of more than 1023:
you can't.
With a conversion to an int [integer number]:
if the big number contains more than 31 significant bits,
or represents a [non-integer] rational:
you can't.
Easiest test
for precision loss
with
BigRational
to double
is:
NaNConsideringBigRational.valueOf(x.doubleValue()).equals(x)
(NaNConsideringBigRational.valueOf(double)
doesn't throw on Double.POSITIVE_INFINITY,
unlike BigRational).
[
rounding
]
%g0.
These are called synthetic instructions;
the list is the following
[including the opcode(s) they are translated to]:
bclr [andn],
bset [or],
btog [xor],
btst [andcc],
call [jmpl,call],
clr [or,st],
clrb [stb],
clrh [sth],
cmp [subcc],
dec [sub],
deccc [subcc],
inc [add],
inccc [addcc],
jmp [jmpl],
mov [or,rd,wr],
neg [sub],
not [xnor],
restore [restore],
ret [jmpl],
retl [jmpl],
save [save],
set [sethi;or],
tst [orcc].
See too the sources mentioned in the
above question.
printf].
Second:
see e.g.
here
[sample compiled from high level language]:
...; .s:; .ascii "hello, world\n\000"; ...;
sethi %hi(.s),%o0;
add %o0,%lo(.s),%o0;
call printf;
nop; ...;
Third:
the operations involved in calling a library function
[such as printf]
are very specific not only to the platform [CPU],
but also to the chosen high level language and the compiler settings
(see e.g. __fastcall pragma,
pascal stack cleanup mode,
volatile registers across function calls,
or
polymorphic C++ method calls).
[
convert [integer] to string [via s[n]printf]
]
esp and ebp
[which makes things less tricky if they are non-volatile],
as well as
esi, edi
and
ebx
are considered non-volatile
[i.e. more x86 registers are considered non-volatile than volatile].
See e.g.
x86 sample dump with comments.
xor eax,eax clears (zeroes out) the register;
its x86 [ia32] CISC opcode
(33 C0,
or 31 C0) uses 2 bytes,
whereas mov eax,0
(B8 00 00 00 00)
uses 5 bytes;
so the former is at least more space efficient.
The obfuscation property of such opcodes shouldn't hurt,
as [[optimizing] compiler generated] assembly code was
never intended to be [easily] readable.
__set_app_type, _except_handler3, etc,
as e.g. present as references in this
win32 x86 dump,
can be found in the
system specific C runtime libraries and sources
(sources are often available as part of the development system).
These functions are part of the C runtime code
and are implementation specific internals
(hence e.g. beginning with __ or _
[reserved identifiers
in
C]).
[The 'application type' that is set in the function in question
(e.g. console or windowing system)
is e.g. used to route diagnostic output.]
daa (decimal adjust after addition) implements
two-step bcd operations (e.g. adc/daa pairs) and
does typically two digits (one byte only) per round, since one
carry flag is required per digit [x86 implements 'the' carry
flag and an additional 'auxiliary' {'nibble'} carry flag].
[x86 processors also support an unpacked bcd format
(with which the more significant nibble is ignored on inputs, and so can
be fed e.g. with ASCII digits {\x30..\x39} directly).]
Precise base-10 representations can be implemented too by
scaled big-numbers (big-decimals) and big-rationals.
[
Java BigRational
]
strlen
or similar functions (strchr, memchr, etc) [usually due to
a software bug], a process will dump core on the first memory load,
probably after initializing registers with the above values;
the values may also be remnants of earlier function calls [such
functions are called frequently and unused registers are
not cleared].
[
more comments on an optimized C strlen C implementation
/
ffff0000 patterns
]
synchronized
modifier in Java, i.e. one synchronized method
can safely call another synchronized method on the same object.]
[
recursive lock C++ sample implementation
]
StringBufferStream implements an undo
mechanism
[mark;
as a stream extension;
see
javacppdiffs].
sqfig.zip:sqenc.c:mergeLineFilter
(C sample)
implements a ring buffer
[a.k.a. circular buffer; this sample is used in a parser too].
The ring buffer features a begin pointer (current read
position) and an end pointer (current write position), which
are updated in the respective operations, while considering buffer wraps
and increasing buffer size if necessary.
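[A hedged C sketch of such a ring buffer (not the actual sqenc.c code; fixed capacity, no resizing, names are illustrative):
#include <stddef.h>
#define RB_SIZE 1024
struct ringbuf {
    unsigned char data[RB_SIZE];
    size_t begin; /* current read position */
    size_t end;   /* current write position */
    size_t count; /* number of bytes currently stored */
};
static int rb_put(struct ringbuf* rb, unsigned char c)
{
    if (rb->count == RB_SIZE) {
        return -1; /* full (a growing variant would reallocate here) */
    }
    rb->data[rb->end] = c;
    rb->end = (rb->end + 1) % RB_SIZE; /* wrap around */
    rb->count++;
    return 0;
}
static int rb_get(struct ringbuf* rb)
{
    if (rb->count == 0) {
        return -1; /* empty */
    }
    unsigned char c = rb->data[rb->begin];
    rb->begin = (rb->begin + 1) % RB_SIZE; /* wrap around */
    rb->count--;
    return c;
}]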
java.util.concurrent.]
ArrayBlockingQueue (space optimized, bounded)
and LinkedBlockingQueue (time and concurrency optimized,
optionally bounded [implementing "a variant of the 'two lock queue'
algorithm: one lock gating entry to put/offer operations, the other
to take/poll operations; a count field maintained as an atomic
avoids the need to get both locks with put/offer/take/poll"]).
[Especially concurrent] queues
are used
for producer/consumer patterns
(to separate code [flows]
and make things asynchronous
[e.g. to avoid deadlocks
or make things more responsive])
and task [queue] / worker thread pool patterns
(e.g. to decouple network reads/writes
[and flow control] from processing
and to limit thread resources
[to make things more scalable]).
Queues are chosen to be bounded
to limit [memory] resource requirements
(to avoid out-of-memory scenarios
and system thrashing)
and/or implement flow control
(e.g. to make things fairer
and avoid worst-case response scenarios).
diff and version control tools] and are
sometimes more [human-]readable [as unfortunately e.g. seen in RTF
document format usages].
[
RTF file normalizator
/
diff output readability
/
normalized sorted data (sort-up-to-a-unique-key pattern)
]
diff output more
readable are
normalization
(either implicit {such as with the -b option}
or by filter {such as
rtfn}),
context (e.g. by -u option)
and [possibly] outlining.
Tools such as
xemacs/emacs'
ediff [coloring] package are convenient
by providing context and fine-diff.
com.lrdev.bn.BigRational.test()
implements
some 1114 (!)
testcases (for one
[fat!]
module!)
[only testing the public interfaces, leaving out function
aliases].
BigRational does not use the JUnit unit testing framework, for
code simplicity.
[
Java BigRational
]
unsigned int gcd(unsigned int a, unsigned int b) {
return (b == 0 ? a : gcd(b, a % b));
}.
The mathematical methods of deduction and [transformed]
induction can be implemented by recursion.
True recursion involves more than one recursive call
[such as the two calls in
qsort],
otherwise it's easy to transform it into an iteration
[provided proper data structures
{e.g. a stack},
actually any recursion can be transformed into an iteration];
the gcd sample above shows an even simpler tail-end recursion.
Indirect recursion involves a flow-graph that eventually lets a
function be called again [e.g. a calls b calls c calls a again].
Deep recursion requires some
stack
space being available,
however truly recursive algorithms often work
with recursion depth log2(n)
[and hence small stack requirements].
[
typical recursion applications
/
recursive gcd implementation
/
bit count by recursive pairwise addition
/
Hilbert curve recursive [fractal] drawing pattern
/
recursion to iteration
/
recursive locking
/
stack overflow with recursion
]
empty? predicate and butfirst list
operator].
Stronger recursion is used
in divide-and-conquer algorithms
such as the
quick-sort
sorting algorithm,
in tree traversal
such as in Unix' find tool
[that scans hierarchical directory structures],
backtracking,
permutation enumeration
and many algorithms that work on graphs.
Geometrical applications include fractal curve drawing
and tiling.
[
about recursion
/
recursive gcd
/
Hilbert fractal curve
/
fractal tree
/
polyomino tiling
/
Logo Foundation
]
int ipow(int i, unsigned int n) {
return
(n == 0 ? 1 :
(n % 2 == 0 ? ipow(i * i, n / 2) :
i * ipow(i, n - 1)));
}
that can be transformed to
int ipow(int i, unsigned int n) {
int j = 1;
while (n > 0) {
if (n % 2 == 0) {
i *= i;
n /= 2;
} else {
j *= i;
n--;
}
}
return j;
}
requires a new state j
because of the i*ipow() multiplication.
j*ipow(i,n) can be seen as an invariant [and 'result']
for all steps.
[Note that the sample codes do not assert !(n==0&&i==0).]
[
recursion
/
recursion applications
]
Turtle
code
layered above
TurtleCore
code;
other samples are
multi-level layered
implementation hierarchies
and filter classes,
e.g. Java's input/output streams].
C
is a quite lean language
that can be used to produce
extremely lean executables
[therefore it does not even include
run-time size information or run-time type information
with variables, arrays, etc
{applications have to take care of this information explicitly;
C's
buffer overflow
problem
partly arises from this fact}].
C++
is a fat language
that can however be used to produce
extremely lean executables
[just like
C,
but requires thorough language insights!].
[Additional terms:
lean methods,
lean classes,
compact software,
fat interfaces,
monolith vs. framework,
compact command set vs. compact working set.]
Lean software development
means either
development of lean software
(discussed above)
or lean development of
[normal]
software,
which is rather about [development] processes
than about design.
Interesting approaches of such a lean development are:
find defects early
[late finding is expensive;
might lead to:
constant redesign / constant refactoring],
'delayed commitment'
[early developing in the wrong direction is expensive],
iterative development
[small steps;
proofs of concepts, proofs of functionalities, proofs of needs].
[
C
/
C overview
/
orthogonal design
/
robust code
]
((a + b) + c)
and (a + (b + c)) yield different results
with a and b [in a 32bit model] being ints (32bit)
and c being a long long (64bit)
[C
sample, -std=c99],
all three initialized to [e.g.] INT_MAX:
the difference is whether only one or both additions
are done as long long (64bit) additions,
the first case leading to a [likely unintended]
intermediate integer overflow
(i.e. one bit is lost; later sign-extension issues arise).
[Note that an equivalent sample with shorts (16bit)
and an int (32bit) won't overflow, since (in C)
[also short] integer calculations are done as ints.]
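[A minimal C99 sketch of that mixed-width case; note the intermediate int overflow is formally undefined behavior, on common two's complement platforms it wraps as described above:
#include <limits.h>
#include <stdio.h>
int main(void)
{
    int a = INT_MAX, b = INT_MAX;
    long long c = INT_MAX;
    long long r1 = (a + b) + c; /* a + b overflows as an int addition first */
    long long r2 = a + (b + c); /* both additions done as long long */
    printf("%lld %lld\n", r1, r2); /* typically prints two different values */
    return 0;
}]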
Note that with integers, without widening,
(i.e. all three operands being e.g. ints),
overflow in [the usual] two's complement integer space
doesn't affect associativity:
e.g. (((INT_MAX-1)+(2))+(-1)) == ((INT_MAX-1)+((2)+(-1))) == INT_MAX.
Floating point number addition can overflow [too],
e.g. [on i686] with long doubles:
(LDBL_MAX*0.75) + (LDBL_MAX*0.5) + (LDBL_MAX*-0.375);
additionally rounding issues [due to limited precision]
can pop up, e.g. with 1.0 + (LDBL_EPSILON*0.5) + (LDBL_EPSILON*0.5).
[Note that on i686 doubles with equivalent expressions
based on DBL_MAX and DBL_EPSILON won't overflow nor differ due to
lost precision, since [in C] the whole statement (expression)
is calculated in the higher [CPU native] precision (long double).]
C++
allows operator overloading
and can't enforce [user provided] implementations
to be associative.
Java
supports the use of the add operator with Strings,
denoting string concatenation:
(("" + 2) + 3) signifies "23",
("" + (2 + 3)) signifies "5"
(two type coercions from int to String vs. one).
man is an abbreviation for
the Unix [operating system] [online] manual [pages] [help] system.
Online means that the man [shell] command
and the 'man pages' were/are found (if installed)
on UNIX (including Linux, Solaris, etc.) systems
in electronic form
[supplied additionally, to a printed version]
[installation sometimes was omitted to reduce disk space requirements].
In a pre-HTML/HTTP era the pages were served locally,
suited for text terminals
(i.e. with minimal layout;
based on the nroff package,
which allows formatting for terminals of different capabilities
{e.g. of displaying bold/inverted/etc.}
and different printers
{on some making words bolder by printing
e.g. U\bUN\bNI\bIX\bX})
and hyperlinks were preceded by
a 'see also' section
(towards the end of a man page).
Just type
'man man',
or 'man man | more'
to start.
Type
'apropos -?'
to search in the index of installed man pages.
Several web servers offer gateways to man pages;
scope, options {and details} of commands, system calls and library
functions vary from Unix to Unix;
be careful to recognize proprietary stuff,
to stay
portable.
Unix Man
is also known as a well known song seen from the Unix-point-of-view.
Annotations,
and to define a retention policy for these to keep
meta information beyond the compile process,
facilitating the inclusion of documentation/manuals
into the runtime system.
[
unix man (unix manual)
/
man man
]
ls -l (real size [in bytes])
vs. ls -s (allocated size [in blocks])
on the command layer
or by comparing stat system call's
size information
to the number of blocks,
on the system call (C language) layer.
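[A minimal C sketch of that system call layer check; the 512-byte st_blocks unit is an assumption that holds on most systems:
#include <sys/stat.h>
/* returns 1 if the file is likely sparse, 0 if not, -1 on error */
static int is_probably_sparse(const char* path)
{
    struct stat st;
    if (stat(path, &st) != 0) {
        return -1;
    }
    return ((long long)st.st_blocks * 512 < (long long)st.st_size);
}]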
--sparse=always option.
ls -s to get the allocated size in disk blocks
(vs. ls -l, the real size in bytes).
-DSPARSEFILE_LSEEK=lseek64 -D_LARGEFILE64_SOURCE, or -D_FILE_OFFSET_BITS=64
to avoid EOVERFLOW (75, Value too large for defined data type) in this case.
/export/home/x/
gets automounted to /home/x/
[/home/ designated to trigger automounts,
as e.g. specified in the according NIS+ tables],
thus allowing users to log on any system
in a networked environment
and finding their expected home directory.
If not used, automounted systems are eventually unmounted,
allowing least invasive maintenance on the NFS servers.
truss usage is given too, along with [small]
output samples
and truss output viewing strategies.
isatty
[check input/output for interactivity,
e.g. to check whether output buffering is appropriate
(or rather autoflush is appropriate):
it isn't in the interactive (tty) case,
involving some performance penalty]
eventually issuing an ioctl(...,TCGETA) call,
returning with errno set to ENOTTY.
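[A minimal sketch of such an interactivity check; the buffering policy chosen here is merely illustrative:
#include <stdio.h>
#include <unistd.h>
static void setup_output_buffering(void)
{
    if (isatty(fileno(stdout))) {
        setvbuf(stdout, NULL, _IOLBF, BUFSIZ); /* interactive: line buffered ('autoflush' per line) */
    } else {
        setvbuf(stdout, NULL, _IOFBF, BUFSIZ); /* redirected: fully buffered for throughput */
    }
}]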
Judging the severities of system call errors and
exceptions is often not an easy thing to do
in intermediate software layers such as middleware or
library code
[apparent when trying to figure out at what log level
such a condition should be traced].
[
how to trace a process
/
Unix process tracing patterns on system call errors
]
lsof tool
will tell you
which Unix process is writing to a file;
the
truss
tool will tell you
which files get opened by a process
and
what is written to a file.
lsof output, too,
and in the netstat -tunaWp output.
The
truss
tool lets you trace TCP communications
on the user level;
network capturing tools
(either on the network interface [Unix device]
or on inter-connecting hardware)
show you what's happening
on the TCP protocol layer
(including retransmits, timing, etc).
Tracing UDP is very similar.
ps,
using appropriate options
[e.g. ps -elf
or s in e.g.
ps -e -o pid,ppid,user,s,psr,nlwp,stime,time,args
{Solaris;
see ps
man page}].
Running state
(including online state
[a running process actually assigned to a processor])
is an 'active' process state,
in contrast to e.g.
the sleeping state
[e.g. waiting on user input, a timer, child process termination, etc.]
or the stopped state
(also known as traced state)
[suspended by {a shell's} job control or a debugger, etc.].
The top tool allows
filtering process information by [running] state
[use the i (idle) option;
for older versions
see this
stopped-processes-considered-idle patch
to filter stopped processes too;
however,
'stopped' is somehow orthogonal to 'sleeping',
Solaris' pflags tool
e.g. displays according processes as both
{PR_STOPPED|PR_ASLEEP}].
The uptime command
will show you the load level
[how many processes / threads are 'runnable']
over 1, 5, and 15 minutes
[as does the top tool].
[
how to trace a Unix process
]
sleep function it specifically means 'wait for a specified time'.
[
how to display Unix processes' state / how to show running processes
/
stat sleeping
]
floor(){return round(ROUND_FLOOR);},
ceiling(){return round(ROUND_CEILING);}
and
truncate(){return round(ROUND_DOWN);}.]
For a sample of an implementation of rounding,
see
Java BigRational
too.
unsigned int gcd(unsigned int a, unsigned int b) {
while (b != 0) {
unsigned int c = a % b;
a = b;
b = c;
}
return a;
},
a
[tail-end]
recursive gcd C implementation would be
unsigned int gcd(unsigned int a, unsigned int b) {
return (b == 0 ? a : gcd(b, a % b));
}.
Another
[[non-tail-end (due to the <<1 terms)]
recursive]
implementation,
that doesn't use any
[expensive]
division/remainder operation, is
unsigned int gcd(unsigned int a, unsigned int b) {
return
(b > a ? gcd(b, a) :
(b == 0 ? a :
((a & 1) == 0 && (b & 1) == 0 ? gcd(a >> 1, b >> 1) << 1 :
((a & 1) == 0 ? gcd(a >> 1, b) :
((b & 1) == 0 ? gcd(a, b >> 1) :
gcd(a - b, b))))));
}
(See
[The Art of Computer Programming,
Volume 2: Seminumerical Algorithms,
Donald E. Knuth,
Addison-Wesley,
3rd Ed. 1997,
ISBN 0-201-89684-2],
p. 338).
[The
iterative
analogue would be roughly
unsigned int gcd(unsigned int a, unsigned int b) {
int i = 0;
if (b == 0) {
return a;
} else if (a == 0) {
return b;
}
while ((a & 1) == 0 && (b & 1) == 0) {
a >>= 1;
b >>= 1;
i++;
}
if ((a & 1) == 0) {
aeven :
do {
a >>= 1;
} while ((a & 1) == 0);
} else if ((b & 1) == 0) {
beven :
do {
b >>= 1;
} while ((b & 1) == 0);
}
if (b == a) {
return a << i;
} else if (b > a) {
b -= a;
goto beven;
} else {
a -= b;
goto aeven;
}
}.]
A gcd is e.g. used in
memrot.c
[memory rotation, analogous to
C's
memmove,
doing without scratch space]
to check a data buffer's symmetry properties
[memrot itself is used in this
simple Ziv-Lempel (LZ77) encoding implementation
to allow multipart overlapping memory copying].
Another sample is
BigRational.java,
that uses
a gcd
for normalizing its rational number representation.
[
recursion
/
recursion to iteration
/
hp RPN gcd implementation
]
dc
[Unix'
desk calculator program]
e.g. by:
8k 5v1+2/ p q
(half of one plus root of five).
The solution comes from solving
the
quadratic equation
r:1=(r+1):r.
The golden ratio is e.g. used in visualizations, e.g.
here.
RCL 7;
typical
stack languages
usually have none).
The RPL models
[e.g.
hp28s,
hp50g]
do implement a stack language.
[
RPL stack functions
/
RPN stack functions
]
ParametricPlot3D[{(2 + Sin[2 Pi u]) Cos[2 Pi t],
(2 + Sin[2 Pi u]) Sin[2 Pi t],
Cos[2 Pi u]},
{t, 0, 1},
{u, 0, 1}],
see e.g.
m2wrl]
can be considered circle-like
(only small extensions around a circle-line;
{in this sample however the 'small' dimension u is not
chosen so insignificant, relative to the circle-line t}).
To calculate the points from the formula above [which form two
kinds of circles, see the black lines in
m2wrl-torus.gif],
just run both t and u from 0 to 1
(say, 0.0, 0.1, 0.2, ...).
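[A minimal C sketch of that point enumeration; the 0.1 step width is chosen arbitrarily:
#include <math.h>
#include <stdio.h>
int main(void)
{
    const double PI2 = 2.0 * 3.14159265358979323846;
    for (int it = 0; it <= 10; it++) {     /* t = 0.0, 0.1, ..., 1.0 */
        for (int iu = 0; iu <= 10; iu++) { /* u = 0.0, 0.1, ..., 1.0 */
            double t = it / 10.0, u = iu / 10.0;
            double x = (2.0 + sin(PI2 * u)) * cos(PI2 * t);
            double y = (2.0 + sin(PI2 * u)) * sin(PI2 * t);
            double z = cos(PI2 * u);
            printf("%g %g %g\n", x, y, z);
        }
    }
    return 0;
}]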
['glass'
{another VRML generating application using circles}
uses symmetry considerations to calculate only a fraction of
the circle points.]
Circles are not suited as finite VRML elements either;
usually triangles or quadrilaterals [often trapezoids,
as in the torus sample] are used.
/usr/X11/lib/X11/rgb.txt...
However, from
various sources
I have seen
values such as
153 51 102 [#993366]
(which is a
'browser safe palette'
pattern),
92 1 32 [#5c0120]
and
even
171 78 82 [#ab4e52].
[
Bordeaux [red] wine
]