Planet Parrot

February 08, 2010

v
^
x

Andrew WhitworthGSoC Idea: Parrot + RTEMS

By background I'm an embedded systems guy. In college I spent years focusing on microcontroller systems and FPGA hardware. I'm not going to spend the whole post reminiscing about the "good-ole' days" when I still worked down on the metal, but I do want to express my particular fondness for this next GSoC project idea: The Parrot-on-RTEMS port.

Jonathan Leto has been spearheading the project so far, but it's become clear that there are some major deficiencies in Parrot's cross-compilation facilities to enable Parrot to build and run properly on a real-time OS like RTEMS. Also, there are other problems with Parrot's algorithms that prevent it from meeting strict real-time deadlines.

In a message to the mailing list tonight Johnathan put out an official-looking call for interested participants, and also advertised an extremely informational page on the RTEMS wiki.

I will let the RTEMS page discuss most of the good details about what is needed, so I will cut off this blog post here. But if you're an interested student and want more information about this project, head on over to the RTEMS wiki and get in touch with Jonathan Leto.

by Whiteknight (noreply@blogger.com) at February 08, 2010 20:13 UTC

February 07, 2010

v
^
x

Andrew WhitworthParrotTheory: Threading

Threading is one of those technologies, buzzwords, that is supposed to be the future of computer programming. It's one of those things that is supposed to be all things to all people: improved scalability and performance, better utilization of hardware, and it can even cure the blind. The idea is that threading will allow applications to scale and take advantage of new multicore processors that companies like Intel and AMD are creating. Like any tool, threading can be a huge help to certain people for certain classes of applications. However, threading can also bring with it a large number of problems that actually hamper program performance. To understand this dichotomy it's important to understand what threads are and how they operate.

In a very narrow sense, a thread is an executing context. On a hardware platform like x86 we can define an executing context by knowing the contents of the processor registers, the contents of the stack, and the sequence of machine code instructions that the program is following. The instruction pointer (IP) register in the processor points to the current machine code instruction, and after a command is executed the pointer is updated to point to the next instruction. By taking a snapshot of a context and saving it, we can essentially "freeze" the state. We could save the context to disk, and re-load it later to continue where we left off.

A single-processor CPU core is a very linear device. Machine code instructions are executed in order starting at a start address and following linearly through memory. The one exception to this, branches, allow control to jump to arbitrary addresses where linear control continues. When you look at a modern single-core desktop machine it appears that things are happening concurrently and multiple programs are executing at the same time. This is only a facade. The OS breaks execution into time slices where programs execute one at a time. Moving control from one process to another is called a context switch. Each process is allocated individual slices of time, and if the slices are made small enough and context switches happen often enough, it appears to the user that things are happening together in parallel. It's a play on the limitations of human perceptions.

Threads can be preemptive or cooperative. A cooperative threading system, by far the least common type, passes control to a thread until that thread explicitly relinquishes it. A cooperative system has a significant benefit: the program knows when it's execution will be paused and can therefore avoid much non-deterministic behavior. Alternatively and considerably more common is preemptive multhreading where the OS controls thread context switches. The thread executes without any knowledge of other threads and without any notion of cooperation. At a seemingly-random time the OS halts the thread, saves it's current execution context, puts the thread into a queue, and loads the next thread to execute. This system brings the benefit that programs can be written without any knowledge of theads and still be run on multiprocessing systems. Also, we have the benefit that no one program can "hog" system resources: the OS makes certain that all programs get fair opportunity to execute. On the other hand, preemptive multithreading brings a certain amount of non-deterministic behavior and creates whole classes of problems like memory corruption and deadlocking that do not exist otherwise.

In a preemptive multithreading system (which is the most common and the only type I will consider here), each thread is a structure in the OS. Threads are managed by a process called the thread scheduler. The thread scheduler is typically implemented as an interrupt handler attached to a hardware timer device. When the hardware timer device sends a trigger signal to the CPU, the scheduler is executed. Here is an example of what a scheduler looks like:

interrupt void ThreadScheduler() {
ExecutionContext lastctx = SaveCurrentContext();
ExecutionContext nextctx = GetNextContext();
Schedule(lastctx);
LoadContext(nextctx);
}

When the scheduler activates the current execution context, which consists of the contents of the processor registers and ID of the memory pages assigned to that thread, we load the necessary pages back into memory and continue execution from the point where it was last interrupted as if nothing has happened. Multiple threads can be executing in a single process, in which case they share the memory pages with executable code in them. However, each thread is going to have it's own stack, and likely it's own heap pages as well. This is how multiple threads in a single process are differentiated. The current context is enqueued and the next executing context is popped off the queue.

Because of the necessary operations of saving the current context, enqueuing the current context, dequeueing the next context and loading the next context, threading is inherently slower for linear operations than non-threaded systems are. Threads always carry a performance cost in terms of context switching and a memory cost in terms of maintaining separate stacks and context structures. In addition, the more threads we have, the less frequently each individual thread runs. To understand why, assume we have a system that switches threads 10 times per second. With only one thread, it runs 100% of the time. With 10 threads, each only gets one tenth of every second to operate. With 100 threads, each thread only gets a one tenth-of-a-second opportunity to execute every 10 seconds. These are not necessarily a large cost (in fact in many systems it can be negligible), but it does exist. In exchange we gain the ability to simplify and encapsulate separate tasks, create the illusion of concurrency, and (most importantly for graphical systems) limit the pauses the user experiences while the system is processing another task.

On multiprocessor or multicore systems, we also gain the benefit that threads can run on separate processors, truly in parallel. In this way, as the number of processor cores increases, so too can the performance of an application improve if it uses enough threads to fill those processors. In these situations, a program can maximize it's throughput if it has as many executing threads as there are available processor cores to run them on. Too many threads and we experience costly context switches. Too few threads and processor cores lay unused.

Context switches can only happen at instruction boundaries. This means that an individual machine code instruction is atomic: a context switch can happen between machine code instructions but cannot happen in the middle of an instruction. However, beyond this guarantee there is no way for the program to determine ahead of time where in the program execution these switches will happen. This creates nondeterminism which can cause bugs.

So what kinds of bugs can be created? Let's consider a structure with two integer values which are defined as a psychotic redundancy measure. The two are supposed to always be copies of each other, and if they are ever different the program will freak out and crash. Here's an access routine to change the values at once:

modify_data(my_struct* s, int newvalue) {
s->data1 = newvalue
s->data2 = newvalue;
}

This seems straight forward. Now consider the (admittedly contrived) case where two threads are calling this function with the same structure pointer, and a context switch happens between the two statements. Thread 1 updates the data1 to 1234 and a context switch happens. Thread 2 updates the data1 and data2 to 4567, followed by another switch. Now thread 1 updates data2 to 1234, and the two values are now not equal. The structure is left in an inconsistent state, the program freaks out, the plane crashes, and all the orphan children die in a fire.

To avoid inconsistencies like this we can introduce any number of concurrency lock primitives, such as mutexes, semaphores, spinlocks, critical sections, or whatever else people create for the purpose. It's beyond the scope of this blog to talk about all these things individually, so I might save the discussion for another blog post later. Regardless of the exact method we use, the code now looks like this:

modify_data(my_struct* s, int newvalue) {
lock (lock_object) {
s->data1 = newvalue;
s->data2 = newvalue;
}
}

And the two threads are somehow prevented from both entering the function at the same time. If one thread tries to enter the lock before the other thread has left it, we force an immediate context switch to let the first thread finish and exit before the second thread can continue.

So here's another good rule about threads: In addition to the cost overheads of switching contexts and managing threads, there are also costs involved in managing access to shared resources. All these costs combined can really make threading a performance drain instead of a performance boon. Plus, the need to properly restrict access to shared resources puts a strain on programmers and can increase the number of bugs in programs. Threads are but one tool in the toolbox of a skilled programmer, and should be used with care.

by Whiteknight (noreply@blogger.com) at February 07, 2010 09:57 UTC

February 06, 2010

v
^
x

Jonathan WorthingtonCatching up: two Rakudo Days from December

Today plenty happened in Rakudo land - in fact, it was my most active day's Rakudo hacking in quite a while. colomon++ also made some great commits, and between us a lot of things moved forward today. For my part, hashes and pairs are in much better shape.

I wrote before that I'd got some Rakudo days left to write up; there are two of them, but I'll cover them both in this post, since some of the work crossed the two of them anyway. Here's what I got up to between them.

  • Filled out attribute composition logic for role application. A good chunk of this was written in NQP - in fact, all of the role appliers are. :-) Along the way I brought roles up to speed with the attribute part of the meta-object protocol - I'd forgotten that when doing it for classes, though since we couldn't compose attributes at that time it wasn't so interesting anyway. The end result was that we could pass S14-role/attributes.t again.
  • The specification states that if in a role you do inheritance, then this is just passed on to the class that the role is eventually composed in to, and added to that class's parents. We never had any support for this in master; with a neat meta-model approach it became rather easier to get it in place in ng.
  • Got BUILD/CREATE fixed up a bit and added back support for "has $.answer = 42" style declarations, again through the new attribute sub-protocol.
  • Got us handling non-block where blocks again, and added Block.ACCEPTS back - in Perl 6.
  • We had various "helpers" to let us do some of the low-levelish stuff in PIR. This is mostly for the places where we need those things in place in order to be able to compile the rest of the built-ins that are written in Perl 6. However, a couple of these helpers knew too much about Parrot and too little about the meta-model, which abstracts it away. So, I re-wrote some of those in terms of the meta-model. Much cleaner.
  • Before we relied entirely on Parrot for our "do we do this role" checks. However, given the unfortunate semantic mis-match between Parrot's built-in role support and what we need for Perl 6 (I did try and influence things in a different direction back when we were doing Parrot's role support, but failed), I've been gradually working us towards not relying on those for Perl 6's role support. (In master, it felt to me like we have almost as much code working around the semantics of Parrot roles as we'll need to have to not use them.) Anyways, the divorce isn't quite complete yet, and it's not even a goal for the ng branch. However, I did make a notable step towards it by getting our .does checks implemented entirely in terms of the meta-model. In the long run, I'm hoping we may be able to write the entire role implementation in NQP, which helps with the even-longer-run dreams I have of Rakudo having additional backends. But that's for The Future. :-)
  • Cleaned up and re-enabled sigil based type checking in signatures.

Thanks to Vienna.pm for sponsoring me to hack on Rakudo, not only for these two days, but also throughout 2009!

by JonathanWorthington at February 06, 2010 22:45 UTC

v
^
x

Andrew WhitworthParrot-Data-Structures Benchmarks

When I first conceived of the Parrot-Data-Structures project, I envisioned a place where we could develop performance-optimized PMC types. A part of proving that we've improved performance is to provide benchmarks. So, this morning I went through and wrote up some naive benchmarks to compare several of my new PMC types against the venerable ResizablePMCArray. I didn't compare against the FixedPMCArray because the latter doesn't support push/pop/shift operations and I wouldn't be able to make direct algorithmic comparisons.

I've only put together one benchmark so far for stacks: We push 1,000,000 items onto the stack and them pop them all back off again in tight loops. This forces the stack to resize up to 1,000,000 items worth of storage. The times were small and I could have upped it to ten million items or more, but then we start to see more effects from caching pages to disk and lose insight into the application we are testing.


(FixedPMCStack) Time to push 1000000: 0.0456011295318604
(FixedPMCStack) Time to pop 1000000: 0.0357608795166016
(FixedPMCStack) Time to total 1000000: 0.0813620090484619

(ResizablePMCStack) Time to push 1000000: 0.0498058795928955
(ResizablePMCStack) Time to pop 1000000: 0.0467569828033447
(ResizablePMCStack) Time to total 1000000: 0.0965628623962402

(ResizablePMCStack2) Time to push 1000000: 0.0470800399780273
(ResizablePMCStack2) Time to pop 1000000: 0.0364069938659668
(ResizablePMCStack2) Time to total 1000000: 0.0834870338439941

(ResizablePMCArray) Time to push 1000000: 0.0531971454620361
(ResizablePMCArray) Time to pop 1000000: 0.0347628593444824
(ResizablePMCArray) Time to total 1000000: 0.0879600048065186


I've shown three of my types as they compare to Parrot's ResizablePMCArray type. FixedPMCStack is fixed-size, so it's allocated up-front and does not need to be resized on each push. ResizablePMCStack is a linked-list of mini-array chunks. Each chunk holds 16 pointers, so we can push 16 objects before a new allocation. ResizablePMCStack2 is an alternate stack implementation that I put together today. It uses a flat memory buffer but does geometric reallocations. Starting at 16 objects, every time we run out of space we allocate twice as much (16, 32, 64, etc). Finally is the ResizablePMCArray, which resizes the buffer on each push. This start size and the growth factor can be tuned, though I haven't made any efforts to do that.

FixedPMCStack obviously wins the competion since it only needs to malloc once and never needs to reallocate or free the memory. At this sample size the benefits are not huge over the other types. ResizablePMCArray2 eeks out a win over ResizablePMCArray for this sample size. However, at smaller samples such as 10,000 objects to push, ResizablePMCArray wins instead. I suspect we could tune the size of allocated chunks in ResizablePMCStack or the starting allocations and grow factors of ResizablePMCStack2 to alter these results and the threshold where the first type becomes less efficient than the latter type. As the data set gets larger, the geometric growth of RPS2 takes over and severely decreases the number of allocations that need to be made, while for the basic RPS type, the size and frequency of allocations is constant.

ResizablePMCArray performs well enough in these benchmarks. It does more allocations than any of my types, but uses a relatively efficient flat memory buffer to hold data, so it's not blown out of the runnings entirely.

Now for the FIFO Queue types. For these types, I used 100,000 items in a similar loop (push 100,000 items, pop them all). I didn't use 1,000,000 like I did in the stack tests above for a reason I will discuss below.

(FixedPMCQueue) Time to push 100000: 0.00739598274230957
(FixedPMCQueue) Time to shift 100000: 0.00774002075195312
(FixedPMCQueue) Time to total 100000: 0.0151360034942627

(ResizablePMCQueue) Time to push 100000: 0.0121519565582275
(ResizablePMCQueue) Time to shift 100000: 0.00611591339111328
(ResizablePMCQueue) Time to total 100000: 0.0182678699493408

(ResizablePMCArray) Time to push 100000: 0.00558805465698242
(ResizablePMCArray) Time to shift 100000: 5.47745704650879
(ResizablePMCArray) Time to total 100000: 5.48304510116577


FixedPMCQueue uses a pre-allocated memory buffer, which is a big saver and makes it the obvious winner in the category. However, since it's setup as a ring buffer each push/pop operation requires a few extra pointer checks that the other types don't need to go through. This is why the ResizablePMCArray wins among all types in push performance.

Shift performance is a different issue entirely. FixedPMCQueue again isn't the winner here but it is close. ResizablePMCQueue does just a little bit better here using it's efficient linked-list implementation. Even though each linked-list node needs to be free()'d, the implementation of free() in libc is pretty lightweight compared to the cost for malloc(). In fact, all things considered, it looks from the numbers above that free() is about twice as fast as malloc(), all other things being equal.

Shift is where the ResizablePMCArray type loses it's composure completely. FixedPMCQueue and ResizablePMCQueue both took about 1.5 one-hundredths of a second to complete the benchmark. ResizablePMCArray took about 360 times as much to perform the same operation. And the reason is that it took almost 5.5 seconds just to do 100,000 shift operations. And that's not even the worst of it. RPA.shift() is O(n2) complexity. When I tried to bump this benchmark up to one million items, the benchmark ran for over 10 minutes before I had to kill it with no end in sight. Both my queue types are time-linear, and bumping up to one million items took only 10 times longer for them. Why is ResizablePMCArray O(n2)? Because each shift operation requires a memmove, which loops over each item in the array and moves it to a new buffer. This is terrible and one of the reasons why I started the parrot-data-structures project in the first place.

I plan on adding a few more benchmarks. For instance, rapid-fire push/pop benchmarks that don't cause resizes might be interesting to isolate the per-primative operation cost without considering memory allocation costs. Your average program doesn't need to push one million items onto a stack or queue, of course. And with these benchmarks I'll be able to focus some optimization effort on these types to make them better.

The overall lesson to be learned from this post is this: For stack-like operations the ResizablePMCArray is a decent—though not perfect—tool. For queue-like operations on the other hand, ResizablePMCArray is lousy and should be avoided. At least, it should be avoided until we do some optimization effort to make it better.

by Whiteknight (noreply@blogger.com) at February 06, 2010 20:14 UTC

v
^
x

Jonathan WorthingtonThe importance of a break

Several days before Christmas, encouraged by my mum asking, "when you're going to start your Christmas break", I stopped working and hacking on stuff and started relaxing. Until then, I hadn't realized just how tired I was. I slept quite a few ten hour nights in the following week, and had an enjoyable Christmas break. I'd figured I'd maybe take a week or so's break, and then get straight back to things, but a week on I had no motivation or energy to dig in again whatsoever. So, I decided my break would go on through New Years. New Year's celebrations this year involved curry - something I certainly wouldn't mind it involving again.

Early January brought several days in Sweden, part of planning for an upcoming refactoring of my work/location - there's details on my personal blog, but the short version is that I've accepted a job at a Swedish startup and will be moving there in March. It's not full time, so I'll continue to have time for Perl 6 development. They know about and, happily, are supportive of my involvement in Perl 6 and my continued attendance of Perl conferences.

I spent a weekend in Prague on the way home. I did it by train rather than flying, which was enjoyable. It snowed almost my entire time in Prague, and I caught a cold in the following week, but it was kinda worth it to wander around this beautiful city. Didn't bother studying Czech at all, and sorta got by with speaking Slovak, though some folks heard me speak and immediately concluded English would be easier. :-) Somehow it kinda felt like I was back somewhere I belonged, even though I'd never been there before. I love central Europe, and excited as I am about Sweden, I know I'll miss this part of the world a lot.

Anyway, I eased back into some work in January, but mostly took it quite easy. The happy result is that, come February, I'm finding myself recharged and ready to dig back into things again. I got some nice commits done to Rakudo yesterday, and today I meant to, but instead participated on an interesting thread on p6l and did some other useful meta stuff (like this post). Tomorrow should have plenty of hacking time though, and I'm looking forward to it. I also have a couple of blog posts to do about Vienna.pm-funded Rakudo Days I did in December, but never got around to writing up; thankfully I did make notes on what I did on them. :-) My main focuses from here on will be on:

  • Continuing to get Rakudo's ng branch into shape - we'll make it master soon. A lot is missing, but things are going back fast and often very neatly. It's easy to focus on what it doesn't yet do that master does, but it has many things right that master does not - now including laziness!
  • Finishing up my signatures grant. I really, really want to do that within the next couple of weeks.

Anyway, that's what's been up with me. If you take away anything, it's that you may not realize how much you need a break from something until you take it, and if it's not the only thing putting food on the table, then it's probably better to take the needed amount of break and come back revitalized. I guess the other option is to dig back in regardless, but I suspect that's the path to burnout, something I'm quite keen to avoid.

More technical blabbering here soon. :-)

by JonathanWorthington at February 06, 2010 00:31 UTC

February 04, 2010

v
^
x

Andrew WhitworthIntroducing Parrot-Data-Structures

This week at #parrotsketch the question was raised as to why we had differentiated array-like PMC types at all. Why do we have separate ResizablePMCArrays from ResizableStringArrays and the like? For symmetry, why don't we also have separate PMCHash and IntegerHash types?

After the Array PMC was removed last week, plobsing mentioned that RPA had a complexity of O(n2) for shift/unshift operations while the Array PMC had a much better O(n) complexity for those same operations. I had a few ideas about it, but looking through the code I didn't see any easy ways to really fix the performance problems in RPA. At least, I couldn't think of a way to fix the performance of shift/unshift that wouldn't hurt some other metric. There's a certain trade-off in terms of performance we make when a data type becomes more flexible. Maybe it's good enough to just advertise the fact that certain operations, though available, are known to be sub-optimal.

These two points combined to inspire me to create a new project: Parrot-Data-Structures. Parrot-Data-Structures (PDS for short) will contain a collection of specialized datatypes that are less flexible than the standard ResizablePMCArray is, but which are designed to have particular beneficial properties in return. These beneficial properties may be optimized hot-paths for specific operations or high memory-efficiency for certain types of data sets.

I asked whether these types of PMCs belonged in core or whether I should create a new project. The consensus was that I should start a new project and if the structures were great enough we could consider merging them into trunk.

At the time of this writing I have prototyped three new PMC types:
  1. ResizablePMCStack. This type is a dynamically-sizable stack type that is optimized for push/pop performance. I've implemented the prototype as a linked list of mini-arrays, so we don't need to allocate storage on every push, and don't need to deallocate it on every pop. Because of it's architecture, RPS doesn't offer indexed element access, but it does provide a method to convert it to a normal ResizablePMCArray if indexed access is needed.
  2. FixedPMCStack: This is a stack PMC type that uses fixed-size preallocated storage. Push and pop are going to be faster than even RPS, and because it uses a flat memory buffer indexed element access should be possible at good speeds (though this is not currently implemented).
  3. FixedPMCQueue: This is a standard fixed-size first-in-first-out queue structure. It is optimized for push/shift performance. I've implemented the prototype as a ring buffer, so we can avoid costly memcpy and memmove operations while allowing the storage to slowly snake it's way around in memory. Because it's always moving around in memory, indexed element access is not possible at a reasonable speed, so I probably won't implement it. I have provided a to_array() method to convert it to a FixedPMCArray if random access is necessary.
Next on my immediate TODO list are:
  1. ResizablePMCQueue: I'm not sure how I will implement this one, it certainly can't be a ring buffer like the FPQ type is. It will be optimized for push/shift access like FPQ is, but might take a small hit when reallocations are necessary.
  2. SparsePMCArray: The sparse array works best on datasets where most elements in the set are some default value. Instead of allocating storage for all items, we only allocate storage for the interesting ones. This type is optimized for low memory use in sparse data sets. It's worth mentioning that the Array PMC, which has recently been deleted from trunk, was a sparse array type. However, it was a sufficiently bad implementation that I think it's good to just start over and try again.
Less concretely, I've also considered the benefits of providing some additional types, though I'm not sure how far I want to stretch the scope of this project:
  1. PMCHeap: This would be a binary-tree type of structure internally, keeping PMCs sorted according to a provided sort routine. I'm not sure whether I would want this to be more like a "max heap" or a binary search tree.
  2. ThreadSafePMCQueue: This would be a message queue variant that could be used safely across threads.
  3. PriorityPMCQueue: This is a queue type where each element has an assigned priority. Pulling an item off the queue grabs the highest-priority items first. Items of the same priority are retrieved in FIFO order.
  4. PMCGapBuffer: This type is optimized for fast insertions/deletions around a point of focus. Think about a text editor where most text changes happen right near the cursor. In the gap buffer, we would first set a "cursor point", and then we would be able to push/pop from the left side of that cursor, and shift/unshift from the right side of that cursor.
In addition to these potential types I mentioned, there are any number of searchable/sortable data structures we could implement to facilitate fast searching in a collection of elements that aren't keyed. Things like skip lists, jump lists, or Splay trees each have interesting properties that would allow searching for items in a collection in O(log n) time. If people express an interest to me about these kinds of types I might look into writing out prototypes.

These prototypes are all designed for PMC storage currently, though I plan to add similar types for INTVAL, STRING, and FLOATVAL too. This is a lot of PMC types, to be sure, but the specialization could be a huge benefit for applications that need a particular type to perform better in some metric than the Parrot core types can handle. I haven't looked into building or testing these types quite yet, but I have an idea that the build will produce several library files with different combinations of types in them. This way projects can only load the few types they need.

An ultimate goal of this project is to help out Parrot's core. There are two ways this could happen: First, the Parrot team could decide that we don't want a ton of performance-specialized or type-specialized array types in Core, in which case add-on projects like this will be necessary to provide that functionality with good performance. Second, the Parrot team might decide that these types have value and they could be added to Core later.

I actually like the second development path in general: It's a way we can prototype cool new features. Also in the second case, I don't need to worry about the scope of this project expanding too far, because it could always be considered an incubator for interesting, new PMC implementations.

If anybody reading this blog is interested in participating, I've got commit bits aplenty to hand out. If you want to participate in the fun, please let me know.

by Whiteknight (noreply@blogger.com) at February 04, 2010 19:03 UTC

February 02, 2010

v
^
x

Andrew WhitworthParrotsketch meeting has moved

The weekly Parrotsketch planning meeting has moved to 20:30 UTC from it's traditional timeslot at 18:30 UTC. For a quick reference, here is the new time for people in this random smattering of timezones:
  • 15:30 USA Eastern (EST)
  • 14:30 USA Central (CST)
  • 12:30 USA Pacific (PST)
  • 23:30 Moscow, Russia (MSK)
  • 07:30 Sydney, Australia (EDT)
  • 05:30 Tokyo, Japan
  • 04:30 Beijing, China

by Whiteknight (noreply@blogger.com) at February 02, 2010 09:25 UTC

January 30, 2010

v
^
x

Andrew WhitworthGSoC Idea: Implement a new GC

Blah Blah Blah Parrot needs a new GC. Blah Blah the current GC is lousy Blah Blah.

I've said it all before. At length. Ad nausea. Hell, I even attempted this same GSoC project myself two years ago.

The garbage collector is one of those systems that absolutely effects almost every aspect of Parrot's operation and performance. A bad GC means a slow, obnoxious VM with pauses and huge memory consumption. Of course, there is no one single "best" GC algorithm to use that provides optimal behavior to all classes of programs. This is why Parrot was designed to have a pluggable GC system where the most pertinent GC from among multiple options can be selected.

At least, that's the theory.

Parrot really only has one GC right now and it's a very simplistic Mark and Sweep collector with some less-than-desirable properties. bacek has been making good progress on porting the Boehm GC to Parrot, but that has it's drawbacks as well (though drawbacks are being mitigated). What we need is more people working on it, more people trying new things, and more eyes on the code.

So what kinds of GC could Parrot use? The projects page on the wiki describes three types that Parrot is specifically interested in:
  1. Generational
  2. Incremental
  3. Concurrent
These adjectives are not orthogonal, you could easily have an incremental-concurrent collector, or a generational-incremental collector, or even a generational-incremental-concurrect collector. But, these three things tend to have nice properties.

Generational collectors tend to have nice throughput because we subdivide the memory space and only take the time to mark and sweep a subset. Incremental collectors break the mark and sweep phases up into smaller increments which in turn decrease pause times. Concurrent collectors utilize multiprocessor systems to operate the GC in parallel with the program code and therefore experience no pauses and high throughput (but lousy performance on single-processor systems, or multiprocessor systems with many other running processes).

The aspiring GSoC student working on GC will benefit from two years of hard cleanup and refactoring work from myself and other contributors. Plus, there is strong community support to get a new GC up and running as quickly as possible to replace our current one. It's a project that will be tough no matter what, but where there are huge gains to be made for the Parrot community and a large amount of available support.

As I mentioned above, the GC system is supposed to be pluggable. So in reality you don't need to implement a great GC that will solve all our problems, you just need to implement any GC to prove that the system is, indeed, pluggable. If the GC you implement has some nice properties that's just a nice bonus.

by Whiteknight (noreply@blogger.com) at January 30, 2010 08:00 UTC

January 29, 2010

v
^
x

chromaticPerl 6 Design Minutes for 27 January 2010

The Perl 6 design team met by phone on 27 January 2010. Larry, Allison, Patrick, and chromatic attended.

Larry:

  • tweaked definition of when a series operator is considered infinite
  • nailed down more list assignment semantics with respect to interators
  • clarified how ($a, $b, @a) = 1..* works
  • KeyWeight deletion criterion kept consistent with other KeyHash types
  • negative keyweights are allowed to fail at pick time
  • "mostly eager" now assumes unknown closure generators are probably infinite
  • random whackage on List, Seq, Parcel, Capture, Iterator, Nil etc.
  • List is now simply the iterator role, and doesn't do Positional
  • Seq takes over Positional duties for reified (or reifiable) value lists
  • think of Seq now as a constant Array (but also lazy like Array)
  • Iterable now means you can ask for an iterator, but doesn't do List
  • Array, Seq, etc do Iterable, but not List
  • only actual iterators do List
  • Nil is defined as a suitable sentinel for both list and slice iterators
  • continued to rethink that with pmichaud++ et al
  • we'll probably end up with an EMPTY special exception object to be the iterator sentinal
  • proposed an E operator to go with it to make testing for EMPTY across multiple iterators very fast
  • other than that, mostly just bug whacking, no major refactors
  • still thinking about doing real LTM for STD
  • did lazify Cursor's fnum->fate translations for shorter LTM candidates in preparation for smarter LTM
  • we don't need special objects for the items that get matches
  • we do need to think more about the hyper cases
  • how to do list processing using balanced trees of delegated sub refs
  • don't want to build in serial assumptions where we don't need them

Patrick:

  • made the Rakudo #25 release last week
  • it was much easier to make the release than explain what we were planning to do instead
  • also working on iterators and lists
  • NG branch is blocking on that
  • worked on the design in my head for three weeks
  • realized that we were doing iterators completely wrong the other night
  • Larry's making some useful changes to the spec in response
  • there are still some unclear spots in the spec
  • we need an implementation to figure those out
  • my biggest question is the relationship between List, Parcel, Itertor, and array
  • as of this morning, I think I have it
  • that code seems to be working and efficient
  • so far it's working well
  • continuing with that
  • wrote a very short range iterator prototype that colomon has used
  • also write a map iterator that works
  • coming up with examples for the zip operator was nice
  • good ideas for what we need to be able to do
  • objects that can iterate have a .iterator() method
  • to interpolate that into a list, .list() returns a flat Parcel for that iterator
  • Parcels know how to generate Iterators
  • those know how to handle Iterators of Iterators
  • I suspect that's how we do hyper iteration
  • change Parcels to understand that
  • adding pieces back into the ng branch
  • next I have to fix slurpy parameters
  • many of our builtins need that
  • need to figure out Jonathan's code to do that
  • after that, I'll do arrays
  • that should remove the blockers on the ng branch

Allison:

  • working on Pynie
  • Francois has helped greatly to update it for Plumage

c:

  • still working on the TT #389 fix
  • think I have the right design, just need time to implement it
  • working on a potential new time for #parrotsketch

Allison:

  • thinking about hackathons
  • would be nice to have a Rakudo hackathon at YAPC::NA

c:

  • Parrot will come up; didn't it come up about half the time last year?

Patrick:

  • it was all Parrot

Allison:

  • you'll have an influx of Rakudo interest two months after Rakudo Star

Patrick:

  • probably will have one before then
  • but can tell people "Go to YAPC; we'll show you how to help in person there"

by chromatic at January 29, 2010 21:00 UTC

v
^
x

Andrew WhitworthGSoC Idea: Asynchronous IO System

I talked to Coke briefly yesterday evening about GSoC. I hadn't heard specifically that Parrot would be applying for GSoC again this year and even though I was hopeful I didn't want to presume. He didn't have a firm answer about whether Parrot was planning to apply as a mentoring orgnization, but he did say "I think we'd be crazy not to". That's good enough for me.

There are a few topics that I've covered more than others on this blog. One of them has been asynchronous IO. I've covered that topic at length, and was hoping to have the time to implement the system myself. Regrettably I don't think I will have time to do it before summer, which means it's ripe for prospective GSoC students to gander at.

Parrot has a basic, synchronous-blocking IO system right now. This is all well and good but let's all face facts: IO tends to be among the slowest operations that a program performs. IO represents a huge bottleneck, especially for certain types of IO-intensive programs. To alleviate this, Parrot's design documents describe an asynchronous system where IO operations are performed in the background while the program itself can continue performing non-IO operations at the same time.

I'm purposefully conflating ideas of "asynchronous" and "non-blocking" ideas, though strictly speaking there are differences between the two. What Parrot really needs is an "asynchronous non-blocking" IO system, which I will just call "AIO" for short.

In an AIO system, IO requests are dispatched to the operating system and program execution immediately resumes. The dispatch operation returns a request object. There are two ways to keep track of the IO request, if you need to keep track at all: First is to poll the request object to see if it's completed (or had an error). The second is to provide a callback with the request and the OS will execute the callback when the request has been satisfied (again, with error information if the request was not successful).

A current Parrot print operation looks like this:

print "Hello World\n"

An AIO implemention would add new forms like this:

request = print "Hello World\n", callback

This presents a lot of flexibility, and could potentially improve performance significantly for IO-intensive programs.

The IO system could use a few miscellaneous cleanups but is generally not in bad condition. A good implementation of AIO will be well-integrated with Parrot's event scheduler, and possibly with with the threading implementation as well. The scheduler is in good condition but the threading system is not. The aspiring AIO developer may find herself making fixes to any of these systems along the way. If AIO sounds like the job for you, you may want to start reading through the source code now so there are no surprises over the summer.

by Whiteknight (noreply@blogger.com) at January 29, 2010 14:24 UTC

v
^
x

Andrew WhitworthGSOC Idea: Strings and NFG

Unicode is a messy thing. You wouldn't think that something so trivial as representing all the text in all the languages in all the world, keeping track of the relationships between various related symbols and characters, and creating a clear delineation between languages and dialects would be such a hassle, but apparently it is. Pause for a second while I look for a decent way to signal my sarcasm.

Unicode text doesn't have a single format. In fact, the same string of text can be represented in a number of different normalization forms. These normalization forms can optimize for byte-width by combining character codes with modifiers like accents into a single symbol, or breaking these complex characters into a sequence of a character and modifiers. Current Unicode normalization forms are called NFC, NFKC, NFD, and NFKD.

The Parrot design documents speak of a new, special normalization form that can be used internally to Parrot to implement some of our internal string algorithms. We call this new normalization form "NFG" because it normalizes over individual graphemes. In NFG, each grapheme (which is a fancy word for a single symbol) corresponds to a single sequence of characters and composing modifers. In other normalization forms these sequences don't need to be unique.

NFG has some interesting properties that make it useful internally to Parrot as an intermediate form for string operations. Being able to convert strings from multiple encodings and charsets into a single unique character sequence could be a big help in a lot of ways. At the moment when two strings are combined together, Parrot tries to convert directly from the more restrictive encoding/charset to the less restrictive one. For N different encoding/charset combinations, we have potentially N2 such transformations. This is non-ideal.

With NFG we only need 2N transformations: One to convert each encoding/charset combo to and from NFG. Strings are converted to NFG, composed together, and then converted out into whatever format they are needed in. It's really a nice system on paper and I have high hopes that it would be a big benefit to Parrot in terms of scalability and performance.

Parrot needs two things to satisfy the letter of PDD28: A comprehensive string subsystem cleanup and refactor, and implementation of the new NFG normalization form. The later is probably more suitable for a GSoC project, while the former would be a good job for a prospective student to start looking at now to become familiarized with the system.

by Whiteknight (noreply@blogger.com) at January 29, 2010 14:20 UTC

January 28, 2010

v
^
x

chromaticPerl 6 Design Minutes for 20 January 2010

The Perl 6 design team meet by phone on 20 January 2010. Allison, Patrick, Will, and chromatic attended.

Allison:

  • did distro testing on Ubuntu and the Mac
  • set up a Hardy chroot; Parrot works just fine there
  • have a feeling we missed some deprecations, but we have a lot in there
  • enough to work on for three months

c:

  • I added the STRING idea, to give us the possibility of the value semantics change

Allison:

  • less stressful to have three months at a time
  • otherwise working on class assignments

Patrick:

  • family illness knocked me out for a few days
  • we'll postspone the January release for up to a week
  • going to make the Rakudo-ng branch the master branch
  • that won't take more than a week
  • we'll release by Thursday of next week
  • also need to make the -ng branch build with Parrot 2.0.0
  • need to merge some outstanding patches to make that work

c:

  • are you going to stick with 2.0.0?

Patrick:

  • unless we need a change in Parrot that we can't live without, yes

Will:

  • we could do a point release if you need one

Patrick:

  • that's up to Parrot
  • from Rakudo's perspective, that's not terribly important
  • we're shifting everything around for the -ng branch
  • we'll definitely stick the February release to Parrot 2.1
  • I'll post messages to the list about the new release plan shortly

Will:

  • working on the one_make branch in Parrot
  • trying to mark dependencies properly in a single Makefile
  • get some of that out of the configure system
  • we should be able to merge to trunk in a day or two
  • there's still more work to do, but we're at a merge point soon

c:

  • released Parrot 2.0.0 yesterday
  • sending out release announcements soon, but the code is out
  • Stephen Weeks helped me fix up PGE not to fetch methods from namespaces
  • should be able to merge the TT #389 fix branch to trunk very soon
  • will take a look at other Rakudo blockers after that

by chromatic at January 28, 2010 00:10 UTC

January 27, 2010

v
^
x

Andrew WhitworthGSoC Summer 2010 Projects

I saw an email today from Leslie Hawthorn, the program director for Google's GSoC program. I have not heard official confirmation from anybody, but I can't imagine that Parrot won't apply to be a mentoring organization this year. We've had such good success in the past with it, and so many of the students (including myself!) have stayed on to become long-term productive community members.

Mentoring organizations need to apply by the beginning of the March. Students can begin submitting their applications by the end of March. Before that, assuming Parrot gets accepted as a mentoring organization, I think we should have a few ideas for projects prepared. Of course a student is free to come up with any proposal they want, but I think it's good to have some ideas floating around to inspire people beforehand. Parrot has a wiki page devoted to project ideas like this, though that page is a little threadbare right now. Any ideas I get will be cross-post to that page as well.

I'm going to try to post some good GSoC project ideas on this blog in the coming weeks. If you have a good idea for a project, let me know and I'll write about it. If you're a student and want more information about a particular idea, get in touch and I will do the best I can to steer you in the right direction. Let's hope that this summer turns out as good as previous ones have!

by Whiteknight (noreply@blogger.com) at January 27, 2010 08:00 UTC

January 26, 2010

v
^
x

chromaticPerl 6 Design Minutes for 13 January 2010

The Perl 6 design team met by phone on 13 January 2010. Larry, Patrick, and chromatic attended.

Larry:

  • made constant declarations more consistent with type declaration syntax
  • removed various spec fossils regarding the old :by modifier
  • reworked KeyHash docs to make the semantics clearer
  • refactored regex AST methods out of Cursor
  • symbol table files are now compiled into their own subdirectory
  • STD can now use modules defined in the test suite
  • testing STD against the test suite now produces many fewer warnings about missing modules
  • STD now specifically disallows forms like :!foo(0) and :5bar[42] that supply unexpected args
  • STD now tracks 'of' types in declarations and prevents spurious 'of' types
  • STD treats anon enums, subsets, and constants etc more consistently
  • removed old type slot from constant declarator, now uses more standard type slots
  • random bug fixing
  • errors with expectation lists are less noisy, no longer reporting lookaheads and whitespace

Patrick:

  • lots of thinking
  • looking at Rakudo-ng today
  • plan to do lists tomorrow
  • want to get those out of the way
  • Jonathan and I think we can merge it for the January release
  • should know more after the weekend

c:

  • working on TT #389
  • :method should not add entries to NameSpace
  • PGE/TGE have problems
  • will not land for 2.0
  • may add (and immediately deprecate) an experimental op to help migration

by chromatic at January 26, 2010 22:54 UTC

January 25, 2010

v
^
x

Andrew WhitworthThe Problem with PIR

I've not liked PIR for a long time now, but I've never quite been able to put my finger on the reason why until now. It's a reasonably nice language to use for low-level interaction with Parrot and it's certainly more friendly than most other assembly-level languages that I've used in the past. So what's my problem with PIR?

Let's spin the question around and first ask this: What is the purpose of PIR?

When you talk about an assembly language, you tend to talk about a human-readable language that has a 1:1 relationship with the underlying machine code. On an x86/Windows computer for example I can write MASM code, compile that into machine code and then disassemble that back into MASM code that would be nearly identical to the original input. I can use a simple table lookup to convert between assembly language mnemonics and binary machine code words. This allows, among other things, live disassembly of code in a debugger and, within reason, in-place code modification. OllyDbg is a great program in this regard and was one of my favorites back when I was doing more work in this area.

PIR is not an assembly language. It's close, but if you read my paragraph above and take that as a strict definition of what an "assembly language" is, PIR is not that. PIR does not have a 1:1 relationship with the underlying bytecode, and it's not as simple as a table lookup to convert between the two forms. It's not significantly more difficult, but it is not strictly so easy either.

Another benefit to assembly language is that they tend to be easy to parse. Ridiculously easy, in fact. Almost all assembly languages use this format for their instructions:

<label>: <mneumonic> <arg>, <arg>, <arg> ...

If all your statements take this form (where the label is typically optional and there can be a different number of args, depending on the particular mneumonic), parsing is painfully simple. And fast. Here's pseudocode for a basic assembler using statements of this form:

Statement * stmt;
while(! file.end_of_file()) {
char * line = file.readline();
byte[INSTR_LENGTH] machinecode;
stmt = parse(line);
if(stmt.has_label)
save_label_address(stmt.label);
machinecode[0] = mneumonic_to_machine_instr(stmt.mneumonic)
for (int i = 0; i < stmt.args.length; i++)
machinecode[i+1] = convert_to_machine_addr(stmt.args[i]);
}
update_jump_addresses();

This code example calls several simple, single-purpose helper functions which I won't discuss in detail here but I could explain if any of the readers were interested. The point I am trying to make is that we could make a reasonably fast, stable assembler for a PASM-like assembly language if we optimized for parse-ability. The total parser could be within 300 lines of code and be completely readable and straight-forward.

This assembler, of course, wouldn't provide hooks for things like optimizations but that's the point exactly. What comes out of this assembler would have an exact 1:1 relationship with what went into it. We could make a disassembler that had almost exactly the same program structure too, and do complete back-and-forth translations of the code losing nothing but variable and label names, comments and whitespace.

And herein lies an interesting dichotomy: assembly language is the lowest-level interface to the machine, but is almost completely disregarded in favor of coding in C or C++. Most programmers don't want to program assembly. Most programmers shouldn't. I can entertain arguments about that point in the comments or even address them in a different post. Assembly language can be a huge waste of time, a huge headache, and your bug rate will tend to be higher than if you used a higher-level language. If the computer that you are using right now has software which represents over a billion lines of source code total, I would estimate that less than one ten-thousandth of one percent of that code was written directly in assembly language. Your bootloader is probably entirely written in assembly language, and small snippets in your kernel and device drivers may have been as well, but that's it. Almost nobody else uses assembly for anything because it doesn't make sense to do it. For almost the same capabilities and almost none of the same headache (or, different but less severe headache) people use C or C++ instead for low-level applications.

I've digressed, but I think some readers are going to understand my point. PIR isn't really an assembly language; it's had too many additions and syntax "improvements" to aid in readability and usability for the developer and has lost the intimate relationship with the underlying bytecode representaton because of it. At the same time, it's not really not an assembly language either: It's not as high-level, comparatively, as C is on the hardware machine and doesn't really provide the same kinds of structures or paradigms as a normal structured programming language would do. It's nice for the programmer, but not nice enough. Plus, programmers don't want to be using an assembly language anyway. Why pretty up an assembly-level language for developers in the first place if developers don't want to use assembly-level languages anyway? Why not just give them a low-level structured programming language that presents structures and idioms that they will be more familiar with? Sure, PIR offers a large series of complex and limited macros to simulate high-level control stuctures (which, by the way, are error-prone on many levels and difficult to parse), but why not give developers a real language with those structures properly integrated into the language from the beginning?

So what's the alternative?

NQP seems like a decent language to jump in and serve as the "system language" of choice for most developers on Parrot. Because of it's use in PCT, NQP does enjoy a certain amount of "market share" among developers and that kind of installed base is hard to ignore. Also, it's well-documented. I'm a little bit mistrustful of NQP because it is tied very closely to the Perl 6 language and doesn't necessarily represent the needs of Parrot developers in an agnostic way. I don't have any particular complaints, just a general mistrust of relying on software that has a different motivation than what I need. Also, NQP isn't even hosted in the Parrot repo, though the newest versions of it are copied to the repo for releases. A good systems language, in my opinion, has no motivation besides effectively bridging the gap between the raw power of the machine and the friendly abstraction needed by the programmers. I have high hopes for Austin Hasting's Close language, though that project appears to be on hold for now while he works on other foundational things. We want something simple but structured, with access to the intimate details of the Parrot VM but with a layer of abstraction to help keep programmers from getting lost in the arcana and minutia.

What I would like to see on the road to 3.0 are these things:
  1. PIR should really be deprecated as the primary interface to Parrot. To program on Parrot you should either use PASM (typically as the raw text-based output of automated code generators), NQP (or another similar system language), or one of the HLLs. PIR does not satisfy a useful niche, it has been thrust upon people as the way to program on Parrot when it doesn't have any particular merits in this regard.
  2. PIR can continue to exist, but as a separate compiler and not a part of Parrot directly. PIRC should be setup as a stand-alone compiler tool that converts PIR->PBC. The Parrot executable should have the ability to load this as an HLL compiler, but Parrot should natively act on PASM and PBC only.
  3. PASM should be fixed to accurately represent the Parrot bytecode format in a 1:1 way. It's not far off from this, but a few fixes are needed. PASM should also not have things like macros or any kind of "syntax sugar", and should have limited assembler directives.
  4. PCT should be updated to directly output PBC or, barring that, to output PASM instead of PIR. Of course, once we have PASM output, it's a matter of a simple table lookup to convert that into PBC. If PASM is the primary human-readable interface to Parrot, and if we have simple, fast routines for converting PASM to PBC on the fly, more programs and utilities will output PBC directly. This, in turn, improves performance for everybody.
  5. Extend NQP or whatever systems language we want to push as the default to have access to all or almost all capabilities present in PASM. Possibly give the option to inline small PASM sequences easily.
There are lots of benefits to doing this, I think: Decreasing developer confusion, increasing performance, and gaining momentum to build languages that people actually want to use and have fitness for particular purposes, etc. I think it's a really good idea and I would like to hear feedback about it. I say we let the assembly languages (PASM) be assembly languages which are intended to be close to the machine, and that we give developers structured programming languages by default that people will actually be comfortable programming in. It seems win-win to me.

by Whiteknight (noreply@blogger.com) at January 25, 2010 12:12 UTC

January 24, 2010

v
^
x

Andrew WhitworthGC Gets Kick Start

Parrot's garbage collector is starting to really get a lions share of developer attention recently, especially after some very interesting benchmark statistics from chromatic went public. For the benchmark of building the new NQP-RX project, a whopping 80% of execution time is spend in the GC mark phase. Actually, the real statistic is that 80% of the execution time was spend only in the Capture PMC's mark routine (and the functions that it calls). That's quite a huge amount of time, even for a naive GC like ours to spend.

Let's take a quick recap about GCs, and what causes them to be so expensive: GC is used to find and automatically reclaim unused data items so that the storage space can be reused. A good GC system means that the system programmer does not need to manually free memory when it is done being used: The GC will detect and automatically deallocate the unused memory. A good GC system, in essence, will completely eliminate memory leaks, and help to make the codebase much more clean and succinct. To do this, GC needs to first find all unused ("dead") objects and then free them, two phases known as mark and sweep.

In a naive implementation of a mark and sweep algorithm, there are two aptly named phases: mark and sweep. The mark phase is charged with finding dead objects to reclaim. We typically do this in a reverse order, by first finding all objects that are in current use ("alive"), and then declaring all other objects to be reclaimable (or already free). Starting from a root set, such as the register sets and interpreter globals in Parrot, we can construct a graph of all objects by following pointers and marking each reached object as alive. It stands to reason that if there is no pointer to a particular object, it cannot be accessed, and anything that we do not access during the mark is presumed unreachable, which is the same as dead.

In the sweep phase, we must iterate over the pool of all objects, finding objects marked dead and freeing them. Freeing an object typically involves calling a custom destructor if one is provided, and making the memory available to the allocator so the memory can be reused the next time an allocation is made.

In every mark and sweep GC collection run for a naive collector, we must first trace the entire memory graph and then iterate over the entire object pool. This is very expensive, and the expense grows large as the memory use of the program grows large. What we need for Parrot is something a little bit less naive.

What we probably can not do is make huge conceptual improvements to the idea of mark and sweep: We will always need to detect alive objects, and we will always need to traverse and free the dead ones. The general idea is sound and that's not something we want to change. What we can do, however, is to impose heuristics on the system to decrease the number of items to mark and decrease the number of objects to sweep. This is where the bulk of GC performance improvements can be made, by being much smarter about how the GC is used.

Allison sent a nice email to the list the other day essentially saying that GC has become an officially-recognized pain point and that we as a team are going to be looking at improvements after 2.0 (if we don't manage to start before that). Very interesting discussion has already started on ways to improve it.

As I mentioned above, the bulk of GC performance improvements are made by applying heuristics to decrease the number of objects to mark and sweep. A secondary set of improvements can then be made, often at the code level, to make the GC's operations run faster. I'll call the first set of improvements "algorithmic", and the second set "implementation". So what we the Parrot developers need to do first is pick the right algorithms to use and then implement and optimize them.

Here is a general list of things we can do to improve GC performance in Parrot:
  1. Allocate fewer GCable objects. This is typically the result of user-level code optimization. So, Parrot needs optimizers that are GC-sympathetic. Parrot also allocates a number of STRINGs and PMCs for internal purposes, so we need to minimize that.
  2. Mark fewer objects. This comes from a good generational GC system where we segregate items based on how stable they are.
  3. Sweep fewer objects. I think chromatic's linked-list idea will help us significantly in this regard.

What I think we are leaning towards in Parrot is a system called a generational GC. A generational system uses the heuristic that items which have lived for a long time without being GC collected will tend to stay alive longer, and items which are recently allocated tend to die quickly. It's an acknowledgement that a lot of garbage is created for very short-term uses, and relatively few things stand the test of time. Here's a quick example using explicitly non-idiomatic Perl 5:

my @array = fill_array(100); # 100 items in the array
foreach my $item (@array) {
my $new_item = mangle($item);
say $new_item;
}

In this loop we create a lot of garbage. Every new instance of $new_item is a new collectible item which can be declared dead at the bottom of the loop and allocated anew at the top. Also, all the local variables used inside the mangle function follow the same life cycle. The only items that survive through the entire snippet are @array and it's contents.

Every time we mark, we have to mark @array and all it's contents, even though they are long-lived and will survive the entire loop. Every time we sweep we need to separate dead items $new_item and all the local variables created inside mangle() from @array and its persistently live set.

Generational GC works by saying that @array is long-lived and putting it into an older generation. Older generations contain objects which are, by definition, older and therefore less likely to die. If we aren't worried about the item dieing, then we don't need to explicitly mark it. At least, we don't need to mark it as often. We also don't need to sweep it, if we can find a good fast way to avoid that.

by Whiteknight (noreply@blogger.com) at January 24, 2010 19:33 UTC

v
^
x

Andrew WhitworthParrot Developer Meeting Yesterday

Yesterday was the big online Parrot Developer Meeting, which I mentioned briefly last week. The idea was to have a meeting similar to the large Parrot Developer Summit from 2008 to reevaluate our long-term development roadmap and make sure we were on the right path for 2.0 and beyond. I had been one of the more vocal people saying that our roadmap up to that point was incomplete, outdated, and not reflective of our recent development priorities, so I'm particularly happy that this meeting was held.

It was a very productive meeting too, although due to time constraints and internet connectivity issues I wasn't able to participate as actively as I wanted. The meeting started with a 30-minute project retrospective lead by chromatic, a statement of our short-term project goals, some discussion of changes to the support policy, and then an item-by-item review of some of our existing and new roadmap items.

Short-Term Goals

Rakudo* ("Rakudo Star") is due shortly after Parrot's 2.3 release, and is going to be a major milestone for both projects. James Keenan suggested, to much agreement, that Parrot should be focusing almost single-mindedly on the needs of Rakudo* between now and then to help make that release as successful as possible. The desired return on investiment is that the success of Rakudo* will help to spur on increased interest from new and existing HLL developers, and demonstrate the Parrot is a viable platform to host these compiler projects.

Among the major needs of Rakudo* are improved Parrot performance. Specifically, optimizations for PCC and an overhaul of the GC were both mentioned as paths towards this goal.

Support Policy

Several days ago I sent an email to the list proposing that we rewrite our support policy to alleviate some of the problems people have been having with our long deprecation cycles. While these long deprecation cycles are good for stability and developer piece-of-mind, it has been a strong negative influence on development speed. Top this off with the fact that many of our systems are still immature and in need of major overhaul, and we run into some serious problems.

My suggestion, while a catalyst for discussion, was not accepted. What we have decided to do in it's stead is to shorten our deprecation cycle by adding more supported releases. Parrot will now have a supported release every 3 months instead of every 6. So next year we will support releases 2.0, 2.3, 2.6 and 2.9. Hopefully this improved turnaround time will alleviate some of our issues and increase the speed of development.

Roadmap

We did a complete item-by-item review of our roadmap items, and assigned specific releases when we would like to have each feature in place. Here is a quick listing of some of the major items on this list:

  • Improved GC (2.3)
  • Fixed L-Value semantics (2.6)
  • Overhaul NCI (2.6)
  • Asynchronous IO (2.9)
  • Concurrency (3.0)
  • Strings overhaul, including immutable strings (3.0)
  • Lorito (3.0)
  • PIRC (3.0)
  • JIT (3.3)
I personally suspect that some of these numbers will be rearranged in practice (AIO and Concurrency are going to go closely together, I predict Strings will become a pain-point sooner than 3.0, PIRC is going to be affected by Lorito in unforseen ways, etc), but overall it's a decent and well thought-out list, and I won't nitpick it. We would be lucky if we could get even half of these things done before 3.3, but I hold out hope that we could do better than that. I am especially hopeful considering the way our development team is steadily growing.

So that's my quick recap of the developers meeting. I'm planning to read back over the transcripts, and I'll write more posts about any topics that I find to be of particular interest.

by Whiteknight (noreply@blogger.com) at January 24, 2010 19:33 UTC

v
^
x

Andrew WhitworthParrot 1.9.0

Parrot 1.9.0 "Blue-Fronted Amazon" was released yesterday, courtesy of Gerd Pokorra. As always, Parrot showed up on time and under budget, a rarity in the software world. This is the last development release before the big 2.0 milestone on 19 January 2010. chromatic will be the release manager for 2.0, followed by darbelo (2.1, 16 Feb) and then cotto (2.2, 16 March). Probably around January I will put out a call for release managers in April and May too.

1.9.0 was a relatively conservative release in terms of entries in the changelog. Comparing the release announcement for the past month doesn't show the same number of high-profile projects that previous months had. Part of the reason for this is the ongoing focus on testing and optimization. These things are great for the project but tend not to pad the changelog too much. I also think that many of our core developers are starting to focus energy on other ecosystem-related projects: compilers, libraries, and utilities. After all, Parrot is an interesting project by itself but all the various projects that it enables are even more so, which is a major draw for our developers. Success begets success, so the rapid proliferation of these side projects are just as important to the overall success of Parrot as a whole.

I expect 2.0 to be similarly conservative with much of the development team focusing on fixing bugs, improving documentation, and expanding test coverage. There are some big changes brewing that could land before 2.0, however, so it might turn out to be a big release indeed.

by Whiteknight (noreply@blogger.com) at January 24, 2010 19:33 UTC

v
^
x

Andrew WhitworthParrot Test Matrix

In anticipation of Parrot's 2.0 release this thursday I've been going crazy trying to run tests on as many platforms as possible. Since I've started using VirtualBox I've been able to dramatically increase the number of platforms where I can do testing. Here is a matrix of all the various platforms where I've done tests, and what the results of each are.

x64 Ubuntu 9.04
Ubuntu 9.04 is my primary operating system that I use on my personal laptop. Due to problems with the upgrade I will not be switching it to 9.10 any time soon. I used to have a second partition with 64-bit Windows Vista, but after my last reinstall I wiped that partition out. Any Linux variant is going to have decent support for Parrot. Many of Parrot's developers and current end-users use Linux, so we are typically able to keep it running smoothly there.

GCC 4.3.3: Nothing to see here, GCC builds always just work. This is probably because a majority of Parrot developers and testers are working on some combination of Linux/GCC. Test failures on GCC, when they appear at all, tend to be short-lived and quickly fixed. My most recent test runs have shown no errors here. Result: No Failures

perl Configure.pl --cc=gcc --link=gcc --ld=gcc


G++ 4.3.3: The G++ builds almost always perform as well as vanilla GCC builds do. The primary difference to worry about are the few new keywords that C++ defines but C89 does not. Variables called "new" are a great example of something that breaks the G++ build. Assuming the software builds with G++, it typically passes tests just as well as the normal GCC build does. My most recent test runs have shown no test failures here. Result: No Failures

perl Configure.pl --cc=g++ --cxx=g++ --link=g++ --ld=g++


Intel C++ 11.1: Intel offers it's C++ compiler, ICC, free for non-commercial use on Linux. It's a bit of a pain to install and it isn't open-source, but it's a good compiler nonetheless. By default ICC outputs many helpful warning messages about troublesome code that GCC ignores. Also, ICC has some extremely powerful and aggressive optimizations that can make Parrot noticeably faster than a GCC build in some benchmarks. Some test failures do show up in the ICC build, but all of them that I have seen involve incorrect handling of "negative zero". I've posted a question to the list about these errors and didn't see any definitive replies that would have helped me fix these tests. So long as the software you are writing does not depend on proper handling of signed zero, this shouldn't be an issue. Result: 5 Test Failures

perl Configure.pl --cc=icc --link=icc --ld=icc


Clang 1.1: Clang isn't packaged on Ubuntu, so it isn't as easy to install as GCC or G++. ICC comes as a binary with an automated installer. Clang, on the other hand needs to be built from source. Luckily this build, though lengthy, tends to occur without any problems or magic incantations. Parrot built with Clang performs very well and typically doesn't experience any additional failures from the GCC build. Also, like ICC, Clang outputs a number of warning and diagnostics messages by default that can be very helpful. Result: No Failures

perl Configure.pl --cc=clang --link=clang --ld=clang


x86 Ubuntu 9.10
I run Ubuntu 9.10 in a virtual machine because there were so many problems running it natively on my personal laptop. However, since my processor doesn't support hardware-level virtualization I cannot virtualize a 64-bit version of it. Ubuntu 9.10 uses more recent versions of GCC and other build utilities by default, although it's not the cutting edge.

GCC 4.4.1: Again, this is basically Parrot's native environment. As with the GCC build on x64, there are plenty of developers using this platform, plenty of end-users using it, and very few bugs show up here. Result: No Failures

G++ 4.4.1: Again, G++ performs very similarly to GCC and so long as the build isn't broken because of misuse of C++ keywords, there tend to be no extra test failures. Result: No Failures
x86 OpenSolaris 2009.06

Intel C++ 11.1: Similar to my notes above: Install is a pain and is not open source. However, the build is good and optimizations are top notch. ICC on x86 has the same negative zero failures that the build on x86-64 has. For most applications, these failures won't pose any problems for end users.Result: 5 Test Failures

Clang 1.1: Clang always performs similarly to GCC in my experience, and testing produces no challenges to that rule. No problems here. Result: No Failures

x86 OpenSolaris 2009.06:
Using VirtualBox has allowed me to test out some new operating systems that I would never have gambled on otherwise. OpenSolaris is an interesting platform that is very Linux-like in most regards. It has Gnome and Bash, so any Linux user will feel mostly at home with OpenSolaris. There are a few places where the OS feels a little bit more fragile to me. For example, one installation of it started having Xorg problems after I made a change in one of the network adaptors, and I haven't been able to use it since. Intel does not appear to provide a free version of ICC for OpenSolaris, though I haven't tried yet to use the Linux installer. I tried and failed several times to build Clang on OpenSolaris.

GCC 3.4.5: Part of me is amazed that OpenSolaris uses such an old version of GCC by default. If you're trying to keep your platform stable and rock-solid it can be easy to fall into the trap of never upgrading (or upgrading very slowly) the components that "just work". Also, I absolutely refuse to build GCC myself, especially on systems that already have it available (or have a package downloadable). Even though GCC 3.4.5 is a stable and dependable release, it still doesn't produce a properly-performing Parrot binary. There are several test failures, espcially involving mathematics functions asin, acos, atan, ln, log10, and log2. I haven't been able to dig too deep into any of these, but it is appearing like some of the libc functions of the same names are just returning incorrect values. Also, there are some very troubling failures in t/compilers/complete_workflow.t that I haven't looked at in depth yet. Result: 15 Test Failures

G++ 3.4.5: Same as GCC. Result: 15 Test Failures

x86 Windows XP
On a windows platform there are basically two routes to build Parrot: The first is to use ActivePerl and MSVC, the second is to use Strawberry Perl and the included MinGW compiler. Mixing and matching causes problems. This is partly because Parrot still derives too many of it's configuration settings from the Perl binary installation, and on Windows there are too many differences to worry about. Since we can't mix and match, to test multiple compilers we need to have multiple Perl installations, which can get really ugly if we want to have at least one Perl install in your PATH. On my current Windows system I have not taken the effort to install both toolchains in parallel and properly sandbox them. So, for the time being, I am only using ActivePerl/MSVC. The build with Strawberry Perl and MinGW tends to perform very similarly to the GCC build on Linux, so it isn't as interesting to me. Intel does not provide a free compiler for Windows platforms, even for non-commercial uses. At least, I haven't found such a download. I have not tried to build Clang on Windows.

Microsoft Visual C++ 15.00: The build here is always a little sketchy, there are some tests that are marked TODO but always seem to pass anyway. There are a few other tests that fail intermittently. When tests on this platform fail they tend to remain failing because we have so few developers working on Windows. Sometimes we may not find out about and work to fix errors on Windows until moments before a release. Also the build is a little slower, Microsoft's nmake utility doesn't appear to support (or doesn't support well) parallel builds. Despite the flakeyness of this system, at the moment there are no failures here. Result: No Failures

x86 Ubuntu 8.04
This is a slightly older platform, and I would have ignored it entirely if I hadn't heard rumor of problems here. What's so interesting about Ubuntu 8.04 is that it's a long-term support release so it's very possible that Parrot end users could be using this system for some time to come. The GCC and G++ builds on this system, despite the rumors I had heard, go off without a hitch. It's interesting that, even though this OS is more than a year old, it uses a more recent version of GCC than OpenSolaris does. I haven't bothered to try installing ICC or Clang here, yet.

GCC 4.2.4: Build works without problems and all tests pass. Result: No Failures

G++ 4.2.4:
Surprise, Surprise: no difference from the GCC build. Result: No Failures

There are a handful of other systems I have been looking into, but don't have any recent test results to share. Fedora 12 uses a more recent version of GCC for instance, but VirtualBox 3.1.2 isn't compatible with the version of Xorg that's on Fedora 12, so I haven't used Fedora since I upgraded VirtualBox. I haven't been able to get any BSD variants working in VirtualBox either, even though the documentation seems to suggest OpenBSD and maybe FreeBSD should work. I no longer have access to a 64-bit Windows environment either. Apparently it's not even legal to run any Apple OS's without Apple hardware, so I haven't attempted to use any of them.

by Whiteknight (noreply@blogger.com) at January 24, 2010 19:33 UTC

v
^
x

Andrew WhitworthParrot 2.0 is Out

Behold! Parrot 2.0 has been released by chromatic. 2.0 is going to be a supported release that most other projects and end users should be able to target confidently for the next several months.

Keep in mind that Parrot uses time-based and not feature-based releases. So 2.0 doesn't strictly satisfy any pre-set group of specifications. It does represent a stable and well-tested version of the software however. The number of feature addititions or performance enhancements from 1.9.0 is actually relatively small, but we have a lot of confidence in this release because of all the testing that has been done to it.

I'll have plenty more things to say about 2.0 in the coming days.

by Whiteknight (noreply@blogger.com) at January 24, 2010 19:33 UTC

v
^
x

Andrew WhitworthParrot 2.0: Personal Retrospective

A while ago I put out a list of tasks that I wanted to see in Parrot 2.0. This wasn't necessarily a list of tasks I wanted to perform personally, more like a wishlist of tasks that I wanted to see in 2.0 and would have been willing and capable of doing myself if necessary. In other words, it was a list of projects I was personally interested in, even if I didn't have the time to complete them all myself.

Now that 2.0 is fresh out of the factory I think it's a good time to go back over this list and review the status of all the things I wanted to have been included.

GC Internals: At the time I wrote the list this task was already done. What I wanted to do was clean up the GC internals and properly encapsulate things so we could add new cores more easily. I had done that work back in May, and since then other work has happened to cleanup and improve the GC internals for clarity and encapsulation. In large part, I consider this task complete. Final Result: Done.

Incremental GC: This didn't happen, though in the buildup for the 2.0 release a lot of excitement and motivation has been generated to improve the GC. I suspect we will see a Generational GC first, but an incremental GC cannot be far behind (especially if interest in the RTEMS port continues to grow). Final Result: Not Done.

Generational GC: See my comments above. Interest is building and I think the community has at least a tenative agreement that the next GC core we work on will need to be a generational one. Downside to this is the likely need to add write barriers throughout the codebase which can be costly in terms of performance and code bloat. However, we should see performance improvements from it. Final Result: Not Done.

Boehm GC: Bacek put this together recently, but unfortunately the results were not as good as we expected. I did a brief analysis about why performance might have been a problem with the Boehm collector in Parrot, but without doing more benchmarking myself I really don't know what the problem is. Maybe 2010 will see some improvements to this core and better performance as a result. Final Result: Done.

Context PMCs: This was another project that Bacek put together, and the result was quite spectacular. It only lasted for a while, however, before the Context PMC got merged with the CallSignature PMC into the new CallContext. Some tests broke and performance isn't as good yet as I think it could be, but the hardest part is done now. Final Result: Done.

RetContinuation Unification: I had anticipated that RetContinuation would be merged into Context first, and then that PMC would have been merged with CallSignature. Things happened in the opposite order with CallSignature and Context being merged into CallContext instead. As of the 2.0 release RetContinuation is still a separate entity, though there are rumblings about changing that and cashing in on the performance win. Final Result: Not Done.

Subclassable Exception Handlers: I'm not really sure where this task ended up, though I assume the work on the Context PMC would have enabled it. At the very least (and I say this without doing any testing or exploring the codebase) the hardest part of the work is now done. I'll mark it off as a success. Final Result: Done.

PCC Refactoring: This was done and ended up being a herculean task. Also, the new (and cleaner!) code brought along a performance penalty that we've been scraping and clawing to mitigate. One thing that cannot be argued though is that the mechanism has been significantly improved and various optimizations and feature additions will be much easier now. Final Result: Done.

CallSignature Unification: My original intention for this task was that the Context+RetContinuation PMC would be merged into CallSignature. Things happened in the wrong order and CallSignature got merged into Context first. Final Result: Done.

Subclassable Everything: This task, which would allow all PMC types to be completely subclassable from PIR, was daunting and perhaps a little bit too ambitious. First stage required that most fields in existing PMCs were converted to one of the "big 4" types (INTVAL, FLOATVAL, STRING, PMC). Second stage required that we come up with a sane way of working with opaque pointers and data items from PIR, which we don't have yet. Final Result: Not Done.

:invocant: To my knowledge this one hasn't been added yet. With the new PCC system in place it should be trivial to add, but i haven't seen it done yet. Final Result: Not Done.

Morphable Objects: I can't even really remember what I wanted out of this task, but in hindsight the idea of morphing objects between types doesn't sound like something I would really want to stress. Final Result: Not Done.

Override VTABLE Invoke: I believe this one was added shortly after the PCC refactors landed, though I haven't played with the new implementation myself. I'll mark it as being completed for now. Final Result: Done.

IO Cleanups: The IO system is decent but does need some work still. I was hoping we would see a proper pipes implementation, and then once we have all the major fundamentals I was hoping we could get some of the various codepaths unified and simplified. This one, unfortunately, did not happen the way I wanted. Final Result: Not Done.

PIR IO Objects: The goal was to properly subclass IO object types from PIR. This would require both the aforementioned "IO Cleanups" with the pipes implementation, and maybe some variation of "Subclassable Everything" so we could manipulate the low-level buffers from PIR. Sadly, this did not get completed. Final Result: Not Done.

Asynch IO: I got derailed on this project, which I put quite a lot of planning effort into, when the IO Cleanups project got stalled. I still have all the best intentions of getting this feature added, but sadly it did not make it into 2.0. Final Result: Not Done.

IO Unification: This was dependent on the Asynch IO task. The goal was that we could merge the code paths between AIO and regular IO, implementing some blocking primitives in terms of their non-blocking counterparts. Without AIO, this was impossible. Final Result: Not Done.

Concurrency Cleanups: The concurrency system is a mess and I was hoping to change that. Plus, AIO needed improvements in this area to allow us to properly integrate things the way they should be. I never took the time to tackle this one and since it was deemed lower priority I don't think anybody else did either. Final Result: Not Done.

Update PDDs: I haven't spent hardly any time on documentation recently, and the docs didn't get improved to the standard that I was envisioning. Other people did work on these but I am not sure if it's the kind of comprehensive overview that I was hoping for. Final Result: Not Done.

Parrot Developer Guide: The first published book was a PIR users guide. I was hoping to at least start second book, a more in-depth guide for internals developers. This never got off the ground. Final Result: Not Done.

From among 20 tasks that I was personally interested in seeing included in the 2.0 release, 7 of them were completed. That's actually not a bad number considering the work we did accomplish as a development team, the pathetic level that I have been contributing at for the previous few months, and the enormous scope of some of these tasks.

Sometime soon I will try to post a wishlist for the upcoming supported releases: 2.3, 2.6, 2.9, and 3.0.

by Whiteknight (noreply@blogger.com) at January 24, 2010 19:33 UTC

v
^
x

Andrew WhitworthParrot: the 1.x months and beyond

Parrot 2.0 is out and the era of 1.x releases is behind us. 2.0 was envisaged as being "production ready", though I think we can all take a look at the final result and agree that this isn't the case. It was a motivation, no doubt, but the released software fell pretty far short of that goal.

But then again, maybe what I think of as being "production ready" is different from what other people think.

In contrast, Parrot 1.0 was described as being a "stable API for language developers", and in hindsight I think we can all also agree that this wasn't the reality of that release either. I'm not trying to be disparaging here, but I do want to take a critical look at these two releases and try to see what is going on with Parrot.

1.0 was stable in the sense that we had the deprecation policy and could not consciously pull the rug out from under our compiler devs without adequate warning. That said, we were making pretty significant changes to the API (defined loosely as the ops, PMCs, command-line options, C embedding API and C extending API) every opportunity we got. I have also said before that the API that we had stabilized at that time was hardly golden. There were plenty of warts, and the problem with that deprecation policy is that those warts are around to stay. We can't just fix a problem or cleanup a bad interface, we are stuck with it until the next deprecation point. But, that's only if we put in a deprecation notice sufficiently early. In some cases HLL developers, most notably Patrick Michaud from the Rakudo group, were urging us to make changes faster than the deprecation policy allowed because these warts were too taxing to work around for months on end.

In hindsight, I think we can better label 1.0 not as a stable API, but instead as a critical maturation point for the community. 1.0 was a coming of age. It was a time when the community got it's act together, put some policy together, outlined our priorities, and started making promises to people. Say whatever negative things you want about our deprecation policy (everybody knows I do), but at least we have a policy.

The magic 1.0 (and now "2.0") numbers are a little bit misleading because not a lot of people understand the way we do numbering. People think 1.0 means it is "complete", when any of the Parrot devs will tell you that this is not the case. We number and release according to the calendar, not according to the state of the repo. Similarly, we released 2.0 because the calendar said to, not because we had implemented any specific feature or reached any specific milestone. The tagline "production ready" was only a vague motivation and not the final result.

That said, what was the result of 2.0?

1.0 as we mentioned was a critical maturation point for the community. 2.0 then, I think, provides that stable API that we would have liked to have had 9 months ago. This is not to say that all our warts have washed away, but the API is in a much better condition now than it was when 1.0 came out the gates. Since the 1.0 release we have done a lot of cleanup, refactoring, and improving of various systems. It's worthwhile to mention that we haven't added a whole ton of new features during that time. It really has been 9 months of non-stop cleanup and because of that effort we are in a position that I would be happy to call "stable".

Now where are we headed? What will 3.0 look like in a year?

Coming up to 2.0 we've done much cleanup. Systems are improved, refactored, encapsulated. We've improved naming conventions and code style. All the groundwork has been laid to start adding some of the blockbuster new features that we'll need to really get Parrot accepted into the world.

When I think "production ready", I think of a few key concepts: Robustness, Scalability, Performance. Business simply aren't going to employ software that is buggy, cannot adapt to different sized tasks, and makes inefficient use of expensive hardware. We can call a piece of software "production ready" all day long and pat ourselves on the back over how great Parrot seems to us, but if we haven't satisfied these three principals nobody is going to use it. So, how do we get these properties? What do we need to do between now and 3.0 to truely make Parrot business-ready?

In terms of robustness I think we really need to focus on two things: Cleanup and documentation of our external API, and improving the comprehensiveness of our test suite.

For scalability, we absolutely need to rein in our memory usage and I think we also need to significantly improve our threading implementation.

For performance I could rattle off a laundry list of things we need to improve but I will limit my list to only a few:
  1. Improved Garbage Collector
  2. Add [pluggable] optimizations to PCT
  3. Enable PCT to output bytecode directly
These three things will improve performance for most applications, while improvements to other individual subsystems (such as MMD or IO) will improve performance for smaller classes of applications.

If we can get address the three priorities of Robustness, Scalability, and Performance, I think that 3.0 can truely be the production-ready release that we've been saying 2.0 was going to be. Because of all the wonderful cleanup work we've been doing in the past few months, I think the stage is set to really get to work on these things and make that goal happen.

by Whiteknight (noreply@blogger.com) at January 24, 2010 19:33 UTC

January 23, 2010

v
^
x

Andrew WhitworthNeko VM

There has been some traffic on the Parrot-developers mailing list recently involving the Neko VM. In terms of capability and scope, it does appear that the Neko VM is very close in scope to Parrot: a virtual machine designed to host dynamic languages. There are plenty of similarities between the two projects but a few differences too which I will try to talk about below.

This discussion really got started when people saw some incorrect ideas expressed about Parrot in the Neko FAQ:

Targeting Parrot requires you to learn another language which is more complex that Neko itself, with different possibilties at different levels (low level PASM, medium level PIR, high level NQP).

It is also difficult to differenciate beetween the language and the standard library because of the numerous cores apis (PMC) whereas NekoVM has a single builtins core api which a single highlevel language with minimal syntax and core types.

Parrot is written in C while Neko compiler is written... in Neko. The language is fully bootstrapped right now. Also, Neko is lightweight and the Virtual Machine is only 68 KB on Linux and 40 KB on Windows, while still offering a very good speed.

It's obvious from this and other mentions of Parrot I have seen around TEH INTERWEBZ that the marketing wing of Parrot isn't doing a lot to dispel these kinds of common misconceptions. Let me do what I can here.

The Parrot VM runs natively on Parrot bytecode, with the most common human-readable form of that being PIR. I see people talk about PASM all the time and I think this is a huge mistake. Nobody uses PASM. Basically nobody should use PASM either. It's mostly useless, ugly, and limited. PASM exists only as a direct human-readable translation of raw bytecode in lieu of any good utility to convert bytecode to (unnuanced) PIR or to do PIR-level debugging. Seriously, forget PASM exists.

PIR is the more friendly, more familiar programming language that people should use to interact in a basic, low-level way with Parrot. PIR is to Parrot what HLA is to the x86 platform: it's basically an assembly language but with some improved syntax, macros, and other nice features. Most coders will be reasonably familiar with the syntax of PIR, though it will feel primitive and clunky.

The FAQ above says that Parrot is written in C, while Neko is written in Neko. This is a pretty big distortion of the truth, and is certainly not comparing apples to apples. The Neko compiler frontend is written in a boot-strapped version of Neko itself. This much is true. However, the Neko VM that executes the compiled bytecode is itself written in C. Likewise, Parrot's core is written in C and the PIR compiler is written in C (with flex/bison magic), but our premier compiler utility, PCT, is written in Parrot. There is a difference in scope here, but if you are writing software of any significance on Parrot your code is more likely to be compiled by PCT than by Parrot's native PIR compiler.

The second paragraph of the FAQ is problematic too. It talks about the differentiation between the language and the runtime, which doesn't make sense in the context of Parrot. Parrot VM is, at heart, only a runtime. The PIR language exists only as a way for people to interact with that runtime. None of the Parrot developers would call the PIR language a "feature" of the VM in the sense that the language has any value outside of the Parrot project. The Neko language is in a different league from PIR because it is much higher-level than PIR is. In a direct side-by-side comparison it looks to be more similar in terms of capability and scope to NQP. PIR is a human-friendly low-level interface to Parrot, and once some of our tools mature sufficiently it is reasonable to expect that PIR may disappear completely. That occurance is still a long time away but there is no reason to suspect that it cannot or will not happen.

As an aside, since the Neko language is designed to be easy to parse using an LL(1) compiler (which PCT's recursive descent parser is), it would probably be very easy to create a Neko compiler frontend for Parrot. The benefits in interoperability might be quite nice, because any language that has a compiler into Neko could then be made to run directly on Parrot. But, that's an aside that I won't talk about at length here.

NQP is explicitly designed as a language without a runtime. Parrot provides a number of default object types (PMCs) for basic use, but they can almost all be easily overridden with user-defined types in C or PIR as necessary. In a sense, the PMCs themselves represent the low-level data structures that user-defined objects share. But they're more dynamic and general than that.

I don't see a huge difference in terms of the design goals between PMCs and Neko's underlying object model. Parrot's version appears to be a little more heavy-weight and a little bit more powerful because of that, but I would need to look much more closely at the Neko model before I make any firm conclusions. Notice that "heavy-weight" can very much be a bad thing to some people, as much as "powerful" can be a good thing to others.

The final bit of the last paragraph in the FAQ is where some of the real differences start to emerge. I don't know the complete in-memory size of the Parrot binary is but I have to suspect it is bigger than 68K on Linux and 40K on Windows. If a small footprint is your driving criterion, maybe Neko is a better choice for your language than Parrot is.

Looking at the documentation Parrot really seems to be a more ambitious project with a larger scope than Neko. I'm not going to get on my soapbox and say that bigger is always better because it isn't. Parrot, by design, offers more features and capabilities than any one language would ever need. Sometimes developers need a smaller platform with a smaller footprint. Sometimes people need more power and capability. I think you would be hard pressed to try and run a standards-compliant Perl 6 implementation, along with decent Tcl and M implementations on Neko VM. At least, I can't imagine that you could have all these things without a significantly larger development effort than those compilers currently require on Parrot.

I'm not sure what all features Neko VM supports or intends to support, but I can tell you that Parrot has several great features and all intentions of adding some other major features in the future: GC, Exceptions, Asynchronous I/O, JIT, multithreading, and aggressive optimizations. We're developing a large ecosystem of langauges, libraries, and tools for use with Parrot. We also have a world-class compiler generator tool in PCT, which I think any other project would be hard pressed to rival.

I've got the Neko sources now and am going to build it and play with it a little bit. I'll report in when I have done more playing.

by Whiteknight (noreply@blogger.com) at January 23, 2010 08:00 UTC

v
^
x

chromaticPerl 6 Design Minutes for 06 January 2010

The Perl 6 design team met by phone on 06 January 2010. Larry, Allison, Patrick, Will, and chromatic attended.

Larry:

  • in Spec Land, renamed p{} to qp{} to avoid using up another common single letter
  • bare say/print is now just a warning
  • Carl Mäsak dug up fossilized restriction on hash literals, which I removed
  • since the insides of blocks are now parsed as statements, there is no longer an inconsistency in line-ending curlies
  • refined the picking vs grabbing semantics with respect to immutable vs mutable bags and such
  • to avoid legacy confusion, renamed break/continue to succeed/proceed
  • clarified that an implicit succeed returns the value of the whole when block
  • it is not somehow magically inserted around the last statement
  • renamed true to so to avoid confusion of the predicate with the True enum value
  • at the suggestion of moritz++, split Any up into Any and Cool types
  • Cool stands for Convenient OO Loopbacks, or any other acronym you'd like
  • the built-in types derived from Cool are the ones that do Perlish dwimmy coercions
  • user types still derive from Any by default, so aren't born with gazillions of methods
  • conjecturally, also keep "last-resort" multis in Cool package
  • responded non-explosively to a potentially explosive rant/twitter
  • clarified various things in response
  • it is not necessary that all implementations be equally good at everything
  • there will be a minimal Perl 5ish grammar alongside STD that any VM can support as a well-behaved subset
  • it is also acceptable to support bug-for-bug compatibility with Perl 5
  • the language designer is neither omniscient nor omnipotent
  • the design process is therefore convergent on the part of all parties involved
  • the rate of convergence is an emergent property, and is to be forced
  • convergence is deemed to be positive as long as anyone is still working on Perl 6
  • the solidification of the spec is also part of the convergence, and depends on proven implementation
  • unproven parts of the spec are to be considered implicitly conjectural
  • as implementations converge on specs, we can throw out or delay parts of the spec that as yet unproven
  • everyone is allowed to panic once.
  • on to implementation, found fencepost precedence error inside list prefix
  • moved the default initparse method from STD into Curso so other grammars don't have to define it
  • added quote modifier :p (aka :path) so we can form the qp{} path literal
  • installed better warnings about bare say/print
  • generalized the say/print warning to anything a p5er might try that might be in p6 without the defaulting
  • STD and CORE now support recent renamings to so, succeed, and proceed
  • no longer reports "Bogus statement" when "Missing term" is more accurate
  • now catches /\b/ and advises to use an appropriate p6 word boundary assertion instead
  • emits better message when an intended reduce is interpreted as composer
  • detects most attempts to use postfix after whitespace, and suggests omitting whitespace
  • now parses tick-less embedded comment syntax as line-end comment (but still warns for now)

Patrick:

  • took the last couple of weeks off
  • keeping up with things, but not much writing code
  • read the S01 changes with great interest
  • glad to see them
  • answered some PGE questions for Carl

Larry:

  • we've talked about them all along
  • they weren't written down in an obvious place

Will:

  • working on Parrot
  • trying to move as much out of the configure process into a Makefile as possible
  • intended to improve the build
  • attempting to remove recursive makes
  • avoid unnecessary rebuilds
  • improve dependency tracking in the build
  • probably ready after 2.0

Patrick:

  • I'm impressed
  • thanks for taking that on; we needed it

Allison:

  • working on deprecation notices
  • we've talked about a lot of things over the past six months
  • not sure they're all in the file appropriately

c:

  • working on bugfixes
  • also working on deprecations

Patrick:

  • I intend to merge the ng branch before the January release
  • some people are antsy, but I have a lot of confidence
  • we'll probably pass about 70% of the test suite
  • it looks like a regression, but we have different features added now
  • lazy lists work, for example
  • lots of things fudged in the previous version work now

by chromatic at January 23, 2010 00:12 UTC

January 22, 2010

v
^
x

chromaticPerl 6 Design Minutes for 16 December 2009

The Perl 6 design team met by phone on 16 December 2009. Allison, Patrick, Jerry, and chromatic attended.

Patrick:

  • finished my Hague Grant
  • sent the final report to Jesse to wild approval
  • drafted a new version of PDD 31 on HLL interop
  • write some code to implement part of that in NQP's HLLCompiler
  • added various tests
  • need to get languages using that now
  • Rakudo will use that
  • it's the basis for Rakudo's use and import
  • if it works for Rakudo, it should follow for other languages which use HLLCmpiler
  • had a few comments about missing pieces and corrections
  • it's still a draft spec, but we need iteration to finish the spec
  • working on little bits of code here and there
  • adding features for projects which use NQP
  • trying to return to using Rakudo and updating the -ng branch

Allison:

  • working on the Pynie refresh this week
  • started over with the Python 3 grammar
  • have that translated into NQP-rx
  • it compiles and can parse a few things
  • working my way through the grammar
  • no actions set up yet

Patrick:

  • there is a .DEBUG rule
  • call it on a subrule to turn on tracing from that rule down
  • that saves you from having to put in panic statements

Allison:

  • is there a good NQP-rx tutorial for actions?

Patrick:

  • working on it
  • is your work in the Pynie repo?

Allison:

  • it's in Mercurial on bitbucket.org under project pynie
  • that's what Python 3 uses
  • also the Parrot roadmap session went very well

Jerry:

  • we still have some actions to take based on that work
  • need to convert our priority list into Trac tickets and wiki items
  • Parrot's goal for every month until Rakudo * is to support Rakudo * and HLLs in general

Allison:

  • that's valuable
  • it changes our priorities in the next year
  • it moves things between "would be nice" and "necessary"

c:

  • worked on the roadmap session
  • helping with the Context/CallSignature merge
  • will do a dry run of the Rakudo release today

by chromatic at January 22, 2010 01:52 UTC

January 21, 2010

v
^
x

chromaticPerl 6 Design Minutes for 09 December 2009

The Perl 6 design team met by phone on 09 December 2009. Larry, Allison, Patrick, Will, and chromatic attended.

Will:

  • some work on NQP port of Partcl
  • Patrick has been very helpful
  • sent a message to the Parrot list about the planning meeting this Sunday
  • initiated a community document to discuss those plans

Allison:

  • implemented large chunks of obscure C code to perform fast string matching using the FFT
  • wondering if that'd be useful in Parrot
  • maybe we do our indexing operations by character set in the NFG form
  • also does very basic pattern matching by leaving out optional characters
  • could be useable in the core tests, where it's tricky to depend on PGE
  • this was my final assignment before the Christmas break
  • have a month off to work on Parrot stuff then
  • I'll show off my assignment when I submit it

Patrick:

  • finished my final report for my Hague grant
  • haven't quite finished the grant, but left TODO items
  • rather than trying to finish everything and then write the report, I'd draft the report and keep notes on what I needed to finish
  • need to work on HLL interop
  • enable Perl 6 and other Parrot languages to load libraries from other HLLs
  • will work on that over the next few days
  • had several coversations about optimizations, constants, and inferior runloops
  • made minor PAST improvements
  • integer constants can automatically promote to num constants without going through a PMC
  • updated NQP to make it easier to write custom operator subs, if you're using the operator precedence parser
  • implemented the beginnings of smart matching
  • not full Perl 6 smart match
  • makes sense in the Parrot context
  • can match against regexes, tokens, rules, and any types with protoobjects
  • code looks more like Perl 6
  • not much on Rakudo besides answering questions
  • will get back to the Rakudo-ng merge after finishing my grant work
  • also worked on Partcl
  • updated its regex syntax, particularly for enumerated character classes
  • fixed it to handle unquoted, non-word characters in regexes
  • previously it only handled barewords as literal matches
  • it's closer to the Perl 5 syntax now

Larry:

  • didn't like the name PairValSet, renamed it EnumMap
  • likewise PairSet is now PairMap, and PairVal is just Enum
  • so individual constant pairs now called "enums"
  • we distinguish pairs, which have read-write values, from enums, which are constant in the value
  • you can now do .enums on hashes and arrays as well as enumerations
  • differs from .pairs, which give reference semantics into the values of the original data structure
  • .enums gives you a constant snapshot
  • David Green suggested renaming Enum.name to Enum.key, and he was right, since they're constant pairs
  • trying to be consistent about calling the whole type an "enumeration" and referring to the bits as "enums", even though the keyword is enum
  • thought people would rebel at typing the long name
  • clarified that the anonymous enum is compile-time evaluated as an anonymous list of constants
  • you can always cast to an EnumMap at run time for the other behavior
  • simplifying conditional semantics
  • STD parser now parses a WHENCE closure as part of the typename, rather than relying on subscript parse
  • block escape within a closure within a string used to parse as a normal block by responding to comments outside of the block
  • already fixed the embedded block in the regex syntax
  • made that usable by strings and regexes now
  • blocks in regular code try to figure out if they're at the end of a statement
  • look for the trailing curly
  • inside a string or regex, there are no statements
  • it makes no sense to look for the end of the statement there
  • the obsolescence messages were still in the old framework that upsets some Perl 5 people
  • changed the wording to "Unsupported use of ..."
  • #perl6 found a precedence inconsistency in parsing of list prefixes vs list infixes in NG
  • turned out to be wrong in STD first, and NG copied it
  • I fixed it in STD, Patrick fixed it in NG
  • otherwise last week was rather too ADD-ish, so mostly did Q&A on IRC

c:

  • fixed some bugs
  • made some optimizations
  • think I've fixed most constant PMCs in PBC now, which should help NQP and Rakudo

Patrick:

  • it'll take a while before Jonathan and I can take advantage of that
  • Allison, when you push_eh an ExceptionHandler onto an array in a context, it creates an RPA
  • does that hold other things besides an EH?

Allison:

  • potentially
  • events get stored in the scheduler
  • only EHs are scoped to a context
  • the old pushmark/popmark stuff to do actions used that same global array
  • it may have changed to use the same array
  • that's deprecated though
  • they won't use that array for long

Patrick:

  • I need something to replace pushaction and popaction before they go away
  • when we handle LEAVE semantics, we want to avoid generating an exception to leave that scope for caching
  • I don't want to generate and rethrow actions to go up the stack
  • those ops let me do that without generating exceptions

Allison:

  • we do need singleton exception objects for FAIL and RETURN
  • no extra information needed
  • right now, you can insert anything you want in that array
  • the local_branch and local_return uses that array

Patrick:

  • it shouldn't
  • bsr and ret may have
  • I provide my own there

Allison:

  • oh right
  • I might not have checked in that code

Patrick:

  • by the way, NQP doesn't use local_branch or local_return

by chromatic at January 21, 2010 04:41 UTC

v
^
x

chromaticAnnounce: Parrot 2.0.0 "Inevitable" Released!

The Beyond and below are like a deep of ocean, and we the creatures that swim in the abyss. We're so far down that the beings on the surface superior though they are can't effectively reach us. Oh, they fish, and they sometimes blight the upper levels with points we don't even understand. But the abyss remains a relatively safe place.

Vernor Vinge, A Fire Upon the Deep

On behalf of the Parrot team, I'm proud to announce Parrot 2.0.0 "Inevitable." Parrot is a virtual machine aimed at running all dynamic languages.

Parrot 2.0.0 is available on Parrot's FTP site, or follow the download instructions. For those who would like to develop on Parrot, or help develop Parrot itself, we recommend using Subversion on our source code repository to get the latest and best Parrot code.

Parrot 2.0.0 News:

  • Features

    • Context PMCs now support attribute-based introspection
    • Context and CallSignature PMCs merged into CallContext
    • .lex directive throws exceptions when used with incorrect register types
  • Platforms

    • Packaging improved for free OS distributions
    • PPC, PPC64, and ARM now tested when running Linux
  • Performance

    • Minor improvements to the profiling runcore
    • Improvements from the CallContext PMC merge
  • New deprecations

    • In/out parameters in STRING modification functions
    • Void handling in NCI signatures
    • Parameter passing opcodes order in PBC
  • Tests

    • Continued migration of core tests from Perl 5 to PIR
  • Tools

    • dependency checker improved
  • Miscellaneous

    • Deprecation cycle length changed to three months from six
    • GC accuracy improved
    • PMC freeze improvements; much more reliable
    • Makefile improvements for dependency handling

Thanks to all our contributors for making this possible, and our sponsors for supporting this project. Our next release is 16 February 2010.

Enjoy!

by chromatic at January 21, 2010 04:37 UTC

Perl.org sites : books | dev | history | jobs | learn | lists | use   
When you need perl, think perl.org  
the camel    
(Last updated: February 09, 2010 02:00 GMT)