Often, answers to questions asking for "programs" or talking about "programming languages" utilize things like sed
, awk
, ... in order to get around having to write an actual shell script.
Therefore, a question comes to my mind:
What qualifies as a programming language?
Sure, ultimately the OP can define this themselves. But what is a reasonable understanding of a "default" in case the question does not clarify this (and how should it be clarified)?
Do coreutils count as "languages" and if so, how to handle different sets of coreutils on different systems?
To give a concrete example for a questionable usage of the term programming language, see my answer here. It uses a rot13
binary as the interpreter. In case this is invalid, but "coreutils" are valid, how can we define the difference?
6 Answers 6
My previous answer was criticised for not drawing a line in a sand, so following some discussion on chat I propose a line.
Executive Summary
A purported programming language should be accepted as such if and only if it is capable of addition of natural numbers and primality testing of natural numbers.
More precise description
The language must:
- Support a representation of natural numbers and of tuples. (We're talking about languages rather than implementations, so we will leave to one side the issue of type widths).
- Be able either to transform inputs into outputs (transformational model) or to distinguish an "accepted" input from a "rejected" input (decision model).
- Be able to take two natural numbers and add them. In the transformational model, this means transforming an input tuple of two numbers into an output which correctly represents their sum. In the decision model this means deciding whether an input contains the representation of a tuple of three natural numbers such that the third is the sum of the first two.
- Be able to take a natural number and say whether or not it is a prime. In the transformational model this means transforming a natural number into the representation of
0
or1
according to whether it is a composite or a prime number. In the decision model it means accepting precisely those inputs which represent a prime.
Observations.
In roughly decreasing order of importance.
- This definition does not require a language to be Turing-complete (although it certainly permits any Turing-complete language). This is intentional. Turing-completeness is unnecessary for many problems on this site, and there are interesting non-Turing-complete languages and interesting languages whose status with respect to Turing-completeness is unknown.
- My original aim was to capture the intuition of accepting primitive-recursive languages. However, I considered that primality testing is a lot easier to explain and to demonstrate, while requiring enough power that it probably doesn't include any undesirable "language". The additional requirement to support addition is to exclude arguments about the
factor
tool qualifying. - Although some might object to the inclusion of the decision model, it corresponds more closely to the formal definitions of well-known complexity classes than the transformational model does.
- If you're using an obscure language, please add an entry to the list of installation and testing instructions so that other people can test your code.
- This definition allows regex flavours which include backreferences. I don't consider this a problem.
- This definition excludes HQ9+. I don't consider this a problem either, for two reasons:
- It was created as a joke rather than a language, and has ceased to be funny in the context of this site.
- I think that every interesting problem which HQ9+ can "solve" has already been asked, so I don't think this will exclude any interesting answers in the future.
-
3\$\begingroup\$ I like this answer a lot. Simple, clear rules that pretty much any language I intuitively consider a language passes while others don't. Also rules you don't need to really check for most languages because it's obvious, even for very specific languages (where turing-completeness would've been a problem). \$\endgroup\$Ingo Bürk– Ingo Bürk2014年08月22日 05:07:02 +00:00Commented Aug 22, 2014 at 5:07
-
18\$\begingroup\$ +1 your answer, I like it a lot, although it does pave the way for a new "language" HQ9AddP. It would mean I would win the following argument, though it did give us a good laugh at the time: codegolf.stackexchange.com/a/23003/15599 \$\endgroup\$Level River St– Level River St2014年08月23日 13:52:13 +00:00Commented Aug 23, 2014 at 13:52
-
1\$\begingroup\$ So... it needs to be able to solve at least those problems whose recursive complexities are of order less than ω? \$\endgroup\$SuperJedi224– SuperJedi2242015年08月18日 17:29:56 +00:00Commented Aug 18, 2015 at 17:29
-
2\$\begingroup\$ @SuperJedi224, I don't know what your definition of recursive complexity is (and any attempt to Google it just gets a tonne of basic pages about big-O notation), but I don't think so. It doesn't have to be Turing-complete. (It might even be the case that PRIMES is in LOGSPACE - I don't think anyone's proven or disproven that). \$\endgroup\$Peter Taylor– Peter Taylor2015年08月18日 18:44:03 +00:00Commented Aug 18, 2015 at 18:44
-
2\$\begingroup\$ A turing complete system can solve (in theory, in practice this would require insanely impractical amounts of time and memory) problems whose recursive complexities are on orders quite a bit higher than ω... (Knuth's Arrow Notation and the Ackermann Function, for example, both have recursive complexities of order ω, and Graham's Sequence has recursive complexity of order ω+1). ω<sub>1</sub><sup>CK</sup> is by definition the least order of recursive complexity that no computable model can solve, even given unbounded time and memory. \$\endgroup\$SuperJedi224– SuperJedi2242015年08月22日 12:15:34 +00:00Commented Aug 22, 2015 at 12:15
-
4\$\begingroup\$ @SuperJedi224, without a link to a good definition and description of "recursive complexity", you're wasting your time (and mine). \$\endgroup\$Peter Taylor– Peter Taylor2015年08月22日 12:43:27 +00:00Commented Aug 22, 2015 at 12:43
-
1\$\begingroup\$ What does supporting a representation of tuples mean exactly? For example, how does regex support tuples? \$\endgroup\$jimmy23013– jimmy230132015年10月12日 07:06:42 +00:00Commented Oct 12, 2015 at 7:06
-
4\$\begingroup\$ @jimmy23013, any mechanism which allows items to be grouped and ungrouped would do. For computational models which support arbitrarily large integers, a classic way of representing tuples is using powers of primes (e.g. you represent the pair
(a, b)
as2^a * 3^b
and extracta
andb
with division and modulus). For regexes with backreferences, one easy way to demonstrate that they meet the criteria is to use unary with digit1
for numbers and separate elements of a tuple with whitespace. Then a regex for addition would be^(1*) (1*) 1円2円$
\$\endgroup\$Peter Taylor– Peter Taylor2015年10月12日 08:04:01 +00:00Commented Oct 12, 2015 at 8:04 -
1\$\begingroup\$ That is, to support tuples in the input, but not necessarily anywhere else in the computation? \$\endgroup\$jimmy23013– jimmy230132015年10月12日 08:06:41 +00:00Commented Oct 12, 2015 at 8:06
-
1\$\begingroup\$ @jimmy23013, yes (although I'm curious to know what model you've found which supports them in the input but not elsewhere). \$\endgroup\$Peter Taylor– Peter Taylor2015年10月12日 08:15:46 +00:00Commented Oct 12, 2015 at 8:15
-
1\$\begingroup\$ @PeterTaylor Regex? It can only use the tuples in the input, but not generate new ones. \$\endgroup\$jimmy23013– jimmy230132015年10月12日 08:17:56 +00:00Commented Oct 12, 2015 at 8:17
-
4\$\begingroup\$ @PeterTaylor I think this is what SuperJedi is referencing: Ordinal Complexity of Recursive Definitions. The authors seem to have invented the terminology, given that the first instance of "complexity" is in quotes. \$\endgroup\$primo– primo2015年10月18日 19:22:59 +00:00Commented Oct 18, 2015 at 19:22
-
5\$\begingroup\$ I would recommend adding a deterministic requirement to this - a language must be capable of producing consistent output for a consistent combination of code and input, and those deterministic parts are the ones that must meet these requirements. This does not disallow RNG components - it just requires that the language has the capability to deterministically add, determine primality, and either follow the transformational or decision model. \$\endgroup\$user45941– user459412015年11月14日 18:17:27 +00:00Commented Nov 14, 2015 at 18:17
-
2\$\begingroup\$ Does this definition require the language to support unbounded memory? (This situation has come up with a language which requires the type width of every variable used to be specified, in unary, within the program itself.) It sort-of implies that it does, but on the other hand, that would exclude many more normal programming languages (e.g. in C, the mere existence of
size_t
places a hard bound on how large memory can be in any given program). \$\endgroup\$user62131– user621312017年01月05日 18:07:47 +00:00Commented Jan 5, 2017 at 18:07 -
2\$\begingroup\$ @ais523, I don't think this criterion really says much about memory capacity. It's implied that you have more than two bits of memory, but I certainly don't think that unbounded memory is implied. \$\endgroup\$Peter Taylor– Peter Taylor2017年01月06日 00:14:30 +00:00Commented Jan 6, 2017 at 0:14
We're asking the wrong question
We're having the XY problem. The question is really, "What formats should we allow in answers?" For this purpose, I think that markup languages and limited output languages should be treated the same as programming languages.
Having fewer features puts a language at a disadvantage. If a language does your challenge, yet cannot implement something like primality testing, that's more impressive, not less. It's silly to put a minimum on the functionality of the language. This is just making user-made languages cheated their way into being Turing complete, like with Bubblegum and CHIQRSX9+.
If a challenge is fixed-output and is dominated by answers that are basically the output string, that's a problem with the challenge. Little will change with a ban on "print-this" languages the output their text, since plenty of fully-fledged languages have really short ways to quote and print text.
-
7\$\begingroup\$ Re fixed-output questions, see also "downvote the question rather than post a protest answer consisting of the literal output". (There were some ridiculous extremes in the recent French flag question: I downvoted and flagged one answer which was a raw PNG file). I think the motivation of this question is insisting that people actually use a language, rather than claiming that they have a 0-byte answer with $unix-tool-which-does-exactly-what's-needed. But the site is PPCG, and answers which don't involve programming should be off-topic. \$\endgroup\$Peter Taylor– Peter Taylor2015年11月19日 10:35:36 +00:00Commented Nov 19, 2015 at 10:35
-
2\$\begingroup\$ @PeterTaylor That has happened before when we had the Super Mario graphical output challenges. I started a separate meta thread back then and the consensus seems to be that kolmogorov-complexity challenges don't require answers in programming language. :/ \$\endgroup\$2015年11月19日 12:32:14 +00:00Commented Nov 19, 2015 at 12:32
There may be some grey areas (regexes having been mentioned), but none of the cases explicitly named in the question falls into one.
Definitely languages
sed
Wikipedia says:
sed (stream editor) is a Unix utility that parses and transforms text, using a simple, compact programming language.
It has commands and is Turing-complete.
awk
Wikipedia says:
AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool.
It has variables, types, control flow... There is a book called The AWK Programming Language.
Definitely not a language
rot13
Forget Turing-completeness (equivalent to mu-recursive functions): rot13 doesn't even allow you to implement primitive-recursive functions. In fact, it doesn't allow you to implement anything: it implements a single bijective function (or, at best, 26 closely related bijective functions), and the only thing you can do is choose the input to that function.
(NB although you didn't mention it, I've seen it in some "answers", so: cat falls in this category as well for the same reasons as rot13. And since I've just seen a comment in which you suggest claiming it as a language, date is definitely not either).
-
7\$\begingroup\$ Sed is Turing-complete?!?! That's what I like about PP&CG, it jostles my certainties. Here's the most interesting page Google lead me to: A proof that Unix utility "sed" is Turing complete. \$\endgroup\$Scott Leadley– Scott Leadley2014年08月17日 15:28:50 +00:00Commented Aug 17, 2014 at 15:28
-
\$\begingroup\$ @Scott random application of string-rewriting rules (or s/// regex replacement) is Turing-complete. Check out the esolangs Thue and /// on the esolang wiki (can't provide link at the moment). \$\endgroup\$FireFly– FireFly2014年08月17日 16:04:28 +00:00Commented Aug 17, 2014 at 16:04
-
6\$\begingroup\$ None of this defines a difference, though. With these arguments I have trouble accepting HQ9+ as a programming language. It does nothing that really differs from rot13. We can call this gray area, but that doesn't get us anywhere. Why is one allowed but not the other? \$\endgroup\$Ingo Bürk– Ingo Bürk2014年08月17日 16:41:24 +00:00Commented Aug 17, 2014 at 16:41
-
2\$\begingroup\$ @IngoBürk, so do I. \$\endgroup\$Peter Taylor– Peter Taylor2014年08月17日 16:54:32 +00:00Commented Aug 17, 2014 at 16:54
Yes, awk(1) and bc(1) and dc(1) are all programming languages. One can do factorial in AWK.
Yes, sed(1) is a programming language. I solved "Bring out the inner llama of a sentence" in sed. My program is too long to win code golf, because sed has no numeric operations, but it does answer the question. Programs in sed can do input and output, and make conditional jumps. I made a while loop, starting with the label :w
and ending with the conditional jump /[%z]/bw
, which branches to w
if the line contains any %
or z
characters. With input and output and conditional jumps, one can write programs in sed.
No, ed(1) is not a programming language. I believe this because I never learned how to do conditional jumps in ed; it has no labels or branches. People can write scripts in ed, and ed can do some conditional logic, so perhaps one can disagree and claim that yes, ed is a programming language.
No, most other shell utilities are not programming languages. Commands like ls
or sort
or rot13
are not programming languages by themselves. If the question is rot13, I can't use rot13
, because I must not use a library function, or command, to do all the work. I can answer tr A-Za-z N-ZA-Mn-za-m
and count it as a shell script, 22 characters. As tr
is no programming language, I must count the call to tr
as part of my program.
Answers need not be portable. It is fine to write a shell script that works with GNU coreutils, but fails with BSD or Solaris. This is no worse than a PowerShell answer that only works with Windows! One platform is enough.
-
\$\begingroup\$ There's a misunderstanding about the rot13 abswer. Your tr program would not be a valid answer there, the goal was not to write a rot13 program. I wasn't using rot13 as the submission, I was using it as the "language". The actual submission was the input "A". \$\endgroup\$Ingo Bürk– Ingo Bürk2014年08月18日 04:37:17 +00:00Commented Aug 18, 2014 at 4:37
-
\$\begingroup\$ Another argument for
sed
is programming language: one can code a SedSokoban game in it. (Though I'm afraid "one" can be taken literally in this case...) \$\endgroup\$manatwork– manatwork2014年08月18日 09:06:30 +00:00Commented Aug 18, 2014 at 9:06
Aside from allowing programming languages as in Peter's definition, I think we should also allow declarative languages.
In computer science, declarative programming is a programming paradigm, a style of building the structure and elements of computer programs, that expresses the logic of a computation without describing its control flow.
Common declarative languages include those of database query languages (e.g., SQL, XQuery), regular expressions, logic programming, functional programming, and configuration management systems.
This means that we should also allow languages such as HTML (+SVG), CSS.
-
\$\begingroup\$ My definition already includes most declarative languages, including the combination of HTML+CSS3. It doesn't include HTML by itself, but I can't think of any question where you'd want to include HTML by itself. \$\endgroup\$Peter Taylor– Peter Taylor2015年01月01日 14:26:47 +00:00Commented Jan 1, 2015 at 14:26
-
\$\begingroup\$ @PeterTaylor I see. While I also cannot think of such a question, there are surely questions where you'd want to use SVG, which is not included by your definition, I think. So I edited my answer. \$\endgroup\$user3094403– user30944032015年01月01日 14:32:55 +00:00Commented Jan 1, 2015 at 14:32
-
\$\begingroup\$ @PeterTaylor I know this postdates your comment, but the "create a checkbox" challenge certainly qualifies - and HTML did participate. \$\endgroup\$John Dvorak– John Dvorak2017年03月17日 18:37:18 +00:00Commented Mar 17, 2017 at 18:37
Clearly a general purpose programming language -
Turing equivalence is sufficient to decide it's a programming language, whether well-defined (C, Python, Haskell, etc.) or ad-hoc (e.g., bash + Linux utilities, PowerShell + Windows utilities, etc.).
Useful exceptions -
- SQL
- state machine languages, including all flavors of regex without code injection
- Until I read the Wikipedia article on Turing completeness, I'd never heard of useful almost Turing-complete programming languages before. I, and I think most people reading PP&CG, would accept something like Charity as a programming language.
Clearly not a general purpose programming language -
No ability to construct a branch and/or a loop control structure. This definition is woefully incomplete, since the breadth of state manipulation is also an issue.
-
\$\begingroup\$ What about the countless languages that are suspected to be Turing complete, but have not been proven to be so? \$\endgroup\$Ingo Bürk– Ingo Bürk2014年08月17日 07:14:47 +00:00Commented Aug 17, 2014 at 7:14
-
2\$\begingroup\$ I personally think that regular expressions shouldn't even be ruled out. If the problem is trivial to solve with regular expression it probably wasn't interesting enough. In all other cases solving it with a regular expression is often just a skillful (if not more) than solving it with a "normal" language. Furthermore, at least PCRE is Turing complete, and other flavours (like .NET's and Ruby's) are similarly powerful. The only thing that's a bit problematic with regular expressions is to adhere to input/output specifications. But if that works, then people should be allowed to use regex. \$\endgroup\$Martin Ender– Martin Ender2014年08月17日 08:49:02 +00:00Commented Aug 17, 2014 at 8:49
-
\$\begingroup\$ I agree with @MartinBüttner. Turing-completeness is not a good criterium when it comes to regular expressions, neither is classical branching/looping. There are fantastic regexp answers that I don't think should be excluded. \$\endgroup\$Ingo Bürk– Ingo Bürk2014年08月17日 09:00:41 +00:00Commented Aug 17, 2014 at 9:00
-
\$\begingroup\$ I'm not a snob and would upvote an elegant or eloquent or jaw-dropping regex-only answer. I'd do the same for an answer that used any useful state machine language. \$\endgroup\$Scott Leadley– Scott Leadley2014年08月17日 14:25:25 +00:00Commented Aug 17, 2014 at 14:25
-
\$\begingroup\$ @MartinBüttner Being pedantic, I have it on "good authority" (meaning I haven't paid much attention to it, but it sounds right) that perl regular expressions, without code insertion, are not Turing-complete. If you know otherwise, I'd be grateful for a reference or two. \$\endgroup\$Scott Leadley– Scott Leadley2014年08月17日 14:42:04 +00:00Commented Aug 17, 2014 at 14:42
-
\$\begingroup\$ @ScottLeadley Turns out I can't find any conclusive information about it. But I can't find a prove for it either way. Apparently it can match context-sensitive languages (but my very short search hasn't turned up a proof for that either). \$\endgroup\$Martin Ender– Martin Ender2014年08月17日 14:47:49 +00:00Commented Aug 17, 2014 at 14:47
-
1\$\begingroup\$ ANSI SQL is in fact Turing Complete, but it's a tarpit when you need it to be Turing complete. \$\endgroup\$Joshua– Joshua2016年05月04日 15:55:16 +00:00Commented May 4, 2016 at 15:55
sed
/awk
case androt13
is that the former two take some file specifying a transformation (sed does so via the-f
parameter) whereas rot13 does not. Or is rot13 specified to ignore any excess arguments? In that case, an empty file could be considered a "rot13 program" if one wants to bend the rules. \$\endgroup\$awk ...
), when, to be fair, only the argument containing the logic should count. \$\endgroup\$sed
is a programming language and it's great for string manipulation golfing, codegolf.stackexchange.com/a/28655/16402 and codegolf.stackexchange.com/a/32526/16402 \$\endgroup\$