One big problem is this project has little relevance to the world outside arclanguage.org.
On the technical side, I'd suggest looking to see if anyone has made a Scheme to Common Lisp compiler, since that's an isomorphic problem. If one exists, that will give you a big start. If one doesn't exist, hmmm....
Offhand, I don't see how you're going to handle continuations or tail recursion. Note that many Scheme implementations punt on continuations (e.g. Kawa, Scheme 48, QScheme). Nil handling is also likely to be a pain.
If you're looking for a way to compile Arc, I think it would be much easier to use an existing Scheme compiler as the base, rather than using a Common Lisp compiler. Using a Common Lisp compiler seems to be just adding difficulty.
On the other hand, you have a lot of time budgeted to get the web server running. It seems to me that if you get Arc running, then srv.arc should just work and give you the web server for free.
I think interpreting Arc (as opposed to compiling) in Common Lisp using Peter Norvig's Scheme interpreter is doable, and I'm actually currently looking into that.
-----
> This project has little relevance to the world outside arclanguage.org.
I don't see how much of anything I do here affects the outside world.
> I'd suggest looking to see if anyone has made a Scheme to Common Lisp compiler.
That seems like a rather roundabout method of compiling Arc to CL. Should we always use existing cross language compilers whenever we want to port Arc to a new language?
Say we have a language X, which only has a CL to X compiler. Compiling is then a conversion from Arc to Scheme, then to CL and finally to X. Is it really a good thing to have all the middlemen? And say we have a language Y which only has a X to Y compiler....
> I don't see how you're going to handle continuations or tail recursion.
Continuations are a difficulty, but I think one of CL's libraries may do the trick. They might not be truly first-class, but if they work I don't intend on worrying about it until I've got everything else working.
I don't see how tail recursion would be a problem. As far as I know most CL implementations provide tail recursion. (pg's ANSI Common Lisp talks a lot about using tail recursion in CL, which is one reason why I am not too worried about it.)
> It seems to me that if you get Arc running, then srv.arc should just work and give you the web server for free.
I hope it works out that way. (I will be more than happy to move on to other things if that step goes quickly.)
-----
If you're implementing a Scheme-like system, I highly recommend Dybvig's "Three Implementation Models for Scheme" (http://www.cs.indiana.edu/~dyb/pubs/3imp.pdf). The three models are a heap model, a stack model, and implementation on a crazy string-based research processor; you'd probably want the stack model. Despite being a PhD thesis, it is very practical; it describes how the author implemented Chez Scheme and gives a pretty much full Scheme implementation running on a simple virtual machine. It handwaves about converting the virtual machine code to assembly, and gives some sample VAX code in an appendix.
I'll reiterate that you should look at how to make your project have a larger impact and relevance; what could you do (perhaps using your compiler as a base) that 1000 people would benefit from?
-----
I presume the paper you posted is meant for implementing Scheme from scratch? (So this would be an alternative to the previous option?) I'll look over it if I get the time.
Now perhaps I'll see if I can make a proposal out of this. (I might as well, assuming I have the time, since I am allowed more than one.) Although there is one issue: in order for such a proposal to get funded, I need someone from the Arc community to step up and mentor it.
-----
Edit: http://en.wikipedia.org/wiki/Chicken_%28Scheme_implementatio...
As for continuations, if you transform everything (!!) into cps style, then you get continuations for free ^^.
-----
git://github.com/pauek/arc-sbcl.git
I ended up creating a public copy of my own git repo not to mess with Anarki...As for your concerns, I think you shouldn't be too worried. Just check it out, there is ample room for improvement, especially in speed, which I understand is your primary objective. No network yet, no threads, continuations are implemented as chains of closures (I honestly don't know better than that)... it just loads plain arc.arc, and I haven't checked it thoroughly.
For me, this was a very nice opportunity to learn about the CPS transformation. Arc is perfect in this respect (only 4 special forms). I'm pretty sure that, as an exercise, it's full of bugs and misconceptions, because I love to learn by doing (and make lots of silly mistakes along the way)...
I'll be happy enough if it helps just a tiny bit...
-----
At the moment I have one major problem: LispNYC does not deem CL-Arc to a be a full summer's worth of work. So unless I can show them that CL-Arc really is a summer's worth of work, I'm left with either coming up with additional ideas which I can work on after CL-Arc, or writing an entirely new proposal on another topic.
I'm wondering if anyone has ideas for what I could work on after completing the basic CL-Arc. If the axioms and core language take a couple of weeks, and the libraries take a couple of weeks, and FFI (probably just a port of the current FFI on Anarki) takes two weeks, that leaves a lot of time in the summer. What can I do with this time?
One idea is I could port Arco to CL-Arc. Unfortunately Arco isn't complete and is undergoing rapid changes, and might be completely different by the time I get around to CL-Arc. This makes it even more difficult to write a definite proposal.
Any other ideas? suggestions?
Thanks.
-----
^^ Okay, okay, hahaha j/k.
How about porting the Anarki-specific extensions to the language?
1. defcall
2. FFI - probably take the longest
3. settable-fn (make it builtin, the Anarki version actually drops down to Scheme using $ for this).
And some things that Arc base lacks which is hard to hack around the scheme implementation for:
1. Error reporting. With LINE NUMBERS. And the ability to give macros access to the line numbers, too.
Some of the higher-voted ones here too:
-----
About a week, maybe less ^^. Defcall is reasonably trivial although it requires some rethinking. settable-fn is implementable using defcall (see nex-3's settable-fn2) but I'm personally dubious of such a style, and prefer my own.
> (But I'm a little dubious about working on something pg may have already done.)
We don't have evidence either way - pg hasn't commented on this (none that I've seen, anyway). Up to you to decide whether to act as if pg made it, or act as if pg didn't make it.
-----
In SBCL, I am consistently able to achieve results within 30% of C programs. Without type annotations the speed is still quite respectable.
If we ever want Arc to be usable in the wild for a diverse range of applications, we will need a compiler that can generate fast code. This is one way of getting there.
-----
"When I was a noob, I talked like a noob, I thought like a noob, I reasoned like a noob. When I became a hacker, I put noobish things behind me. Now we see but a poor implementation as in an alpha; then we shall see code to code. Now I code in part; then I shall code in full, even as I am full of code. Now these three remain: hubris, impatience, laziness. But the greatest of these is laziness."
-----
Yes, you can always optimize your Arc code to mitigate the problem. My point still stands though. If we want to be able to use Arc in production situations, we will eventually crave an industrial-strength Arc compiler.
This becomes even more obvious if you take seriously the notion of "A Hundred-Year Language".
-----
We should focus on improving what Arc does, rather than trying to morph it into the behemoth Swiss Army knife that is Common Lisp.
Not to discourage you from your own pursuits regarding this, but I think as a whole, the Arc community would be better off putting its focus elsewhere.
-----
I'm not trying to turn Arc into Common Lisp, I'm advocating making Arc fast. There's a difference.
CL was a behemoth because it was a union by committee of all the cruft in every popular Lisp dialect circa 1980. One of Arc's goals is to design from a clean slate. There is not a pressure to conform to bad or inconsistent decisions of the past. None of that has anything to do with performance.
The question becomes: Do we want to make a real general purpose language, or do we want Arc to become "yet another scripting language"?
This is a long term question. We may not need a compiler tomorrow, but I'm not ready to say for the long haul: "Arc is for exploratory programming and rapid prototyping. It will never scale. You shouldn't write ray tracers in it, or image processing, or games. It's fine for web apps -- unless they get allot of traffic."
-----
Glad I had Psyco there. If I hadn't had it, or at least Pyrex, I would have probably dropped Python because writing C extensions for it is quite painful. And that's also the reason why I never really used Ruby, despite its cool features.
And I don't want to write prototypes and say : "Hmm... My code is working now, let's write it in a serious language for the production version".
Look at Scheme anyway. We really can't say it's a language focused on speed or designed to crunch numbers. Well, look at Ikarus. Oh, yes, for number crunching, you might prefer Stalin. Even C looks slow when compared to Stalin.
Of course, this shouldn't be the main focus of the community, and I don't even think the language should be designed with speed in mind (well, a little of course, or else we would have dead slow numbers implementd with lists of symbols) but that it should be seriously taken into consideration.
-----
A misplaced focus on speed is bad, and you should get it working before you make it fast, but that doesn't mean speed is a non-issue. If a program is slow enough to cause annoyance, that is a problem, and it should be fixed. Languages have to pay extra attention to speed issues. If programs written in a language are slow because of the language, and not because the programs themselves are badly written, there's something wrong with the language.
Another thing that newbies don't get is that well-built languages are usually optimized so that the more obvious and more commonly-used approaches are actually faster than that tangled mass of "optimized" code you just wrote. Profiling profiling profiling. Don't just guess.
So, the points are: performance should be a secondary concern, but secondary is still pretty high up on the list, and optimization should be based on information gathered with a profiler, not what sounds efficient or inefficient. Sorry for rambling, I hope this post contributes something to something. I just have a tendency to spew everything I have to say about a topic all in one place every now and then, even if only some of it is relevant. I guess you could boil this post down to a "me too", but only if you boiled it a lot.
-----
-----
Well, actually, I'm not sure if implementing a still-designed language in the beta-release of a compiler is a long-term solution... :) Maybe in a few months / years this could be done ? But as for now, I think porting it to CL would be easier.
-----
-----
At the very least, if I write this CL-Arc compiler, I'll be able to write games in Arc using existing CL libraries for SDL.
-----
Okay, okay, I was actually planning to do something like that a long time ago, haha. But I haven't (yet) found any holes in the basic concept of modelling every function call (f x) as a macroexpansion on (__funcall f x), which expands to an (__asm) form with the function call. Then for functions themselves (fn (x) (f1 x) (f x)), transform the last function call to a tail call: (__asm_let x (esp 4) (__funcall f1 x) (__tail_funcall f x)). Stuff like that ^^.
-----
So quite frankly, I have a lot to learn before I'll be able to take on a project like that.
-----
Sure GC is nontrivial, but Boehm's GC is not bad at all. And if you really need continuations/tail recursion than make everything continuation passing style (you'll probably need to anyway). And I'm sure raymyers' treeparse can help in the reader department.
(IMO the difficulty here in itself is probably the assembly code you'll emit for a given piece of code, not reader/GC/conts/tailrec)
Remember, CL is by itself not so similar to Scheme that you can directly use its reader, as well as its execution model, in your final product. You'll write your own Arc reader in CL anyway (in CL 'arc == 'ARC, in arc 'arc != 'ARC), so you might as well (tada!) write it in Arc and compile it down to assembly. You'll need conts and no, you can't trust CL enough to handle tailrecs.
(Hmm. Maybe I shouldn't be advising you, maybe I should be doing this myself to steal your thunder ^^)
-----
(defun read-square (stream c) (declare (ignore c)) (let ((body (read-delimited-list #\] stream t))) `#'(lambda (_) ,body)))
(set-macro-character #\] #'(lambda (x y) (declare (ignore x y)) (values))) (set-macro-character #\[ #'read-square)
-----
Yes, the assembly part of it looks difficult to me. When I look at Arc or Lisp code I don't see any way to translate that to native code. Obviously has been done, I'm just not educated on such matters.
You have a good point about the reader, I should probably add that to my proposal. And writing it in Arc would be an interesting exercise.
And can't I trust CL about tail recursion? Most decent implementations do tail recursion, right? And can't I tell people to stay away from those that don't?
But suppose I can't trust CL to do tail recursion. What am I supposed to do about it?
-----
And neither is Linux completely implemented in C, and C compilers written in C are not completely implemented in C, because bits and pieces of the libraries they link their code to are written in assembly.
> Yes, the assembly part of it looks difficult to me. When I look at Arc or Lisp code I don't see any way to translate that to native code. Obviously has been done, I'm just not educated on such matters.
The Lambda the Lutimate papers are a good place to start if you're interested - they include some hand-written assembly code equivalents to Scheme/Lisp code, largely function calls and prefix/suffix. Given that the most basic axioms of Arc include (fn ...) and a function call syntax, this would be quite of interest.
> But suppose I can't trust CL to do tail recursion. What am I supposed to do about it?
Use 'prog and 'go? ^^ Lambda is the lutimate!!
-----
I found a good link / tutorial about how to compile a subset of Scheme to C language. The compiler is about 800 lines of Gambit Scheme (blank lines included) and even deals with tail-recursion and continuations ! (well, that's based on the lambda papers...)
No GC, but you can use Boehm and have one for free.
-----
The someone-Boehm GC (reportedly) works well with C - I'm reasonably sure that it will work well with assembly.
-----
Anyway, maybe the right way to do so is by destructuring a Scheme implementation ? Starting from a given implementation, you write your compiler from scratch, but use the facilities of the chosen implementation for the reader and the GC. Then, once it's working, you gradually remove the scaffolding by implementing these things by hand...
-----
-----