Planet Parrot

May 10, 2008

Patrick Michaud: More rakudo and parrot news

For the past couple of weeks I've been mostly providing feedback and guidance to others on IRC, as well as trying to keep up with $otherjob. But this week I managed to get some coding in and continue progress.

Of course, it's great news that Jonathan++ has received a grant from Vienna.pm to continue his excellent work on Rakudo. He's making so much progress lately that I'm finding it difficult to keep up with him. Of course, that's a very good "problem" to have, so I'm not at all complaining.

Chromatic indicated that he had a goal of reducing the number of open RT tickets for parrot to below 700, so I spent some time over the last couple of days closing out some of my older tickets. Earlier today we were briefly down to 698, but then I found some additional Parrot bugs that bring us back to 700 as I'm writing this. But I'm sure we'll be able to close a few more quickly.

The Parrot developers could really use some help with reviewing the Parrot RT queue. Personally I think that a good number of those 700 open tickets must be irrelevant by now or otherwise already addressed. Coke wrote a nice set of guidelines for this at http://www.parrotblog.org/2008/05/700-ticket-challenge.html, if you're interested in helping.

Several RT tickets had to do with updating PGE to include some syntax changes in Synopsis 5. Of course, changing PGE's syntax also means we have to update the various grammars that are using the older syntax, which takes a little while. I think I managed to convert all of the existing languages except for Plumhead, which looked a little more involved than most. So I'm hoping Bernhard will be able to update that one soon.

I also cleared out a few RT tickets from the "perl6" queue, applying several useful patches and closing some out-of-date tickets. This leaves us with 22 new/open tickets for Rakudo, although I expect that number to grow as more people start playing with it and trying out various tests.

Tonight I updated Rakudo's interface into the operator precedence parser so that the 'fatarrow' syntax now works properly. So, we can use fat arrows for named parameters in subroutine and method calls. Soon I'll want to work on the hash and list composers, and then start to tackle list context and list assignment.

For the upcoming weekend I'm planning to update the spectests so that "make spectest" in Rakudo doesn't give back a ton of not-very-useful error messages. As things stand, that noise means we can't easily determine if a change to Rakudo causes something to break that was working previously. Fortunately, the 'fudge' script makes it much easier for us to maintain this. In an upcoming post I'll give an overview of how spectests are being marked in Rakudo.

Spinclad on #parrot remarked about working on kjs' Squaak tutorial and wanting to initialize the @?BLOCK variable using pure NQP instead of the PIR List class that the tutorial currently requires. A couple of weeks ago I updated Parrot's ResizablePMCArray class to automatically provide the methods that the PIR List class was using, so I figured we'd be able to use those directly. Turns out it's not quite as simple as that, but NQP still lets us work around it a bit. What we ended up with was the following:

        Protomaker.new_subclass('ResizablePMCArray', 'List');
        our @?BLOCK := List.new();

The first line uses PCT's "Protomaker" object to create a new "List" class that is a subclass of Parrot's ResizablePMCArray. The second line then creates a new List object, and binds it to @?BLOCK. Since List inherits from ResizablePMCArray, we automatically get the shift/unshift methods that Squaak needs to do its work.
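
For readers who don't know Parrot, the same idea can be sketched in Python: build a new, empty subclass at run time and rely on inheritance for the methods. This is only an analogy. The `ResizablePMCArray` stand-in and the `new_subclass` helper below are hypothetical and are not PCT's API.

```python
# Analogy only: a stand-in parent class providing the two methods Squaak needs.
class ResizablePMCArray(list):
    def shift(self):
        """Remove and return the first element."""
        return self.pop(0)

    def unshift(self, value):
        """Insert a value at the front."""
        self.insert(0, value)


def new_subclass(parent, name):
    """Hypothetical helper: create an empty class `name` inheriting from `parent`."""
    return type(name, (parent,), {})


List = new_subclass(ResizablePMCArray, "List")

block = List()            # roughly `our @?BLOCK := List.new();`
block.unshift("outer")
block.unshift("inner")
innermost = block.shift()
```

The new class defines nothing itself; `shift` and `unshift` both come from the parent.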

In fact, another good task will be to update the Squaak tutorial (or create another tutorial) and use some of the more recent PCT and NQP features that were added in April.

However, it's probably not a good idea to rely too strongly on Protomaker just yet. Currently we have two protoobject implementations -- one for PCT and another for Rakudo -- and after discussions with Jonathan and others I've decided that they really ought to be merged into a single implementation that is used by both. Protomaker may very likely disappear in the merged version (but we'll undoubtedly have something equivalent).

Anyway, lots to work on as usual, and I'm glad I have a few weeks of available coding time again.

Pm

by pmichaud at May 10, 2008 04:13 UTC

chromatic: Doing Nothing Better in Perl 5

Perl 6 has three code placeholder operators, known affectionately as the "yada, yada, yada" operator (see List Prefix Precedence in Synopsis 3). It's a matter of (very sarcastic) public record how much I love writing, maintaining, and patching parsers, so I've just sent a very preliminary five-line patch to p5p to add support for ... to Perl 5.

--- perly.y~    2008-05-09 17:47:35.000000000 -0700
+++ perly.y    2008-05-09 17:47:41.000000000 -0700
@@ -1227,6 +1227,11 @@
             }
     |    WORD
     |    listop
+    |    DOTDOT
+            {
+              $$ = newUNOP(OP_DIE, 0,
+                  newSVOP(OP_CONST, 0, newSVpvs("Unimplemented")))
+            }
     ;

/* "my" declarations, with optional attributes */

Apply this to recentish bleadperl sources, run perl regen_perly.pl, rebuild, and now you can run programs such as:

sub foo { ... }
foo();

And get an "Unimplemented at file line line." error message.

(Now everyone who complains that I don't code enough to match my talk, please punch yourself in the face.)

by chromatic at May 10, 2008 02:24 UTC

May 08, 2008

chromatic: Perl 6 Design Minutes for 07 May 2008

The Perl 6 design team met by phone on 07 May 2008. Larry, Allison, Patrick, Will, Jerry, Nicholas, Jesse, and chromatic attended.

Allison:

  • spent my time this week slicing and dicing the exceptions implementation
  • replaced the old internals with the new system
  • checked that in yesterday
  • still a few failing tests in edge cases on the branch
  • did more work on the Parrot Foundation

c:

  • I own an acre of Mars, we could incorporate there

Allison:

  • don't you own a cow in the Philippines?

c:

  • yes, but that doesn't give me any governmental powers

Patrick:

  • isn't that worth a lot?

c:

  • the peso is improving against the dollar

Jesse:

  • moving on...

Larry:

  • clear bill of health from my medical reports
  • hacking a lot against the standard grammar in my STD5 implementation
  • lots of refactoring
  • all of the various parameters that used to go through separately now go through as part of the Match object
  • including the "fate" and whether we're peeking at the longest token set
  • the longest token matcher works now
  • I threw out my old mechanism for gathering Match objects
  • it now creates them more-or-less correctly
  • lots of grammar tweaks, as suggested by Mitchell Charity
  • lots of refactoring of how logging works so that it doesn't always spew enormous quantities of information to the screen
  • I can actually run the parser quite quickly now, for some definition of quickly that approximates 2000 characters per second
  • matches symbols directly, rather than calling a rule, which is faster
  • does the backoff now on longest token matching
  • started refactoring the grammar on the assumption that I can trust the longest token matcher
  • no longer any nofat rule
  • the longest token should match the fat arrow, if there is one
  • started refactoring the quoting rules to parse as if they were sublanguages
  • getting rid of extra rigamarole to recreate the other mechanism we already use for other languages
  • working out the linkage for switching in and out of sublanguages
  • how to get to the outer language from the inner language
  • calling into pure Perl from closures in a regex
  • or the host language if you're calling the regex from another language
  • nailed down the available methods for Match objects in the specs
  • giving a talk in Seattle on Friday at SPU
  • flying to Japan on Saturday

Patrick:

  • spent a lot of time teaching this past week
  • cleared up now
  • mostly I've answered questions on mailing lists and IRC
  • I'm not always sure that I'm helpful, but I'm there
  • yesterday I worked on trying to get a bunch of small things done here and there
  • fixed up a few things in PCT internals
  • today I'm bringing PGE up to date with some of the latest changes in S05
  • these all help Rakudo and other languages in small ways
  • trying to clean out my backlog and clean up a bunch of RT tickets
  • I'll continue over the next couple of days
  • and blogging about it as I go

Jerry:

  • things are busy, mostly non-Parrot related stuff
  • submitted a ticket that I hope Patrick can close today

Patrick:

  • many languages depend on the old behavior, including Plumhead
  • I'm not certain about some of them

Jerry:

  • mostly otherwise answering questions on #parrot
  • making sure that things are set up for the real work phase of GSoC
  • making sure that students have their CLAs, if not commit bits
  • astonished to see how much work Jonathan is getting done in just two funded days
  • it's amazing to see how much a motivator money can be
  • I'd like to see more of it, hint hint

c:

  • working on closing as many open Parrot bugs as possible
  • applying as many open patches as possible
  • should be able to help on the concurrency branch soon
  • otherwise preparing for the release
  • going to check on received CLAs this week

Nicholas:

  • found it curious that Perl 5.10 has the best state implementation of any language
  • wanted to steal tests from another implementation
  • had a discussion with Leon about SMOP
  • there's no real description of how all of these implementations fit together
  • Rakudo plus Parrot is a complete implementation
  • SMOP and kp6 fit together nicely

Jesse:

  • I started a wiki page on the Perl 6 wiki at Perl_6_Implementations

Patrick:

  • I don't know that it says how things fit together

Jesse:

  • I tried to encourage other people to contribute stuff
  • didn't get much uptake

Nicholas:

  • should we suggest to Daniel that he should help explain things?

Jesse:

  • that's more likely to get people contributing to it

Will:

  • there's definitely some confusion about it within the Grants Committee

Jerry:

  • SMOP has the highest documentation-to-line-of-code ratio of any implementation

Patrick:

  • it needs a good overview though

Nicholas:

  • I'll ask Daniel to explain more
  • especially its relationship to Parrot and Rakudo

Jerry:

  • it sounds like it could be an alternate runcore for Perl 5 as well

Jesse:

  • tried a few different things
  • decided to write a test for Rakudo
  • tried a simple arithmetic test pulled from Pugs
  • found that Rakudo didn't implement a function specified in the S29 draft
  • Patrick helped me write a couple of lines of code to implement it
  • then discovered that fudge didn't support try blocks in a specific way
  • Larry patched that
  • then found that incrementing an undefined value didn't work in Rakudo
  • that was the end of my day
  • I still need to write up my findings
  • how easy is it for someone without experience in Rakudo and its internals to pick things up and contribute something?
  • more difficult than I thought it might be, but it's getting more doable
  • it's important to understand how it might fail before trying to get people to do it
  • then I started trying to play with MAD on the weekend
  • found and fixed a bug in its XML
  • refactored it such that you can run MAD's tests in the core if you add a copy of XML::Parser to the core
  • it's not far enough yet, but it's a start

Nicholas:

  • is it going to be difficult to restructure the Parrot Foundation from 501(c)(3) to 501(c)(6)?

Allison:

  • you can do pretty much the same thing
  • sponsors are on the board in a c6
  • they're only advisory in a c3
  • the sponsors we've talked to are mostly only interested in getting regular status reports and the like

Jesse:

  • is there any jumping around to transfer copyright to the new foundation?

Allison:

  • we'll do a copyright assignment from the Perl Foundation to the Parrot Foundation
  • all of the CLAs that went into the code up to the point of signover will be fine
  • but we'll essentially copy the Perl Foundation CLA to a Parrot Foundation CLA

Will:

  • do we need to contact committers who haven't signed a CLA?

Patrick:

  • where does Rakudo fall?

Allison:

  • still under the Perl Foundation
  • it doesn't move at all

Will:

  • do we want to split up the repository at that point?

Allison:

  • eventually, we'll want to do so anyway
  • it's not an urgent thing

Jerry:

  • what would it take to version a Perl6Regex frontend to PGE?
  • let grammars specify a version of the grammar

Patrick:

  • I did that before by having a separate compiler
  • you're talking about something a bit finer grained
  • I don't want to do that
  • as we get closer to 1.0, that'd be fine
  • I already have enough to do keeping up with the latest versions

Will:

  • I don't think we want to keep up old versions

Patrick:

  • I don't mind sticking to our deprecation cycle
  • I hadn't put the change from today into the deprecation list yet
  • we'll get to it in a couple of weeks

Jerry:

  • just trying to figure out how to push forward with changes to PGE without having to update every language in the repository

Patrick:

  • freeze S05?
  • not a great solution

Larry:

  • I heard that

Patrick:

  • the last few changes have been great
  • I'm not really serious about that

Larry:

  • some of them you even asked for

c:

  • it's an advantage to have these languages in the repository
  • we can update them
  • but only if we can run the tests before and after and know that they pass

Will:

  • we might consider removing languages with failing tests and no recent updates
  • there are 17 grant proposals, some of them Perl 6-related
  • please comment on the TPF blog
  • it'll help

Jesse:

  • blog.perlfoundation.org

by chromatic at May 08, 2008 22:42 UTC

May 07, 2008

chromatic: Good Error Messages are Important

Parrot r27355 was fun to write.

One of the persistent error messages Parrot emits for compiler writers is Null PMC access in invoke(). If you've had your hands deep in the guts of Parrot, you know what that means -- you tried to call a Sub PMC when you don't have a Sub PMC; in fact, you have no PMC at all. (If you don't know what that means, this entry is for you.)

Sometimes this means that there's a problem in Parrot. We've fixed almost all of those problems though, so the error usually comes from elsewhere. If you're writing a compiler, or running a compiler built on Parrot, the error usually means "You tried to call a function that doesn't exist."

Parrot's optimizer does something interesting at the end of compilation time. You've probably heard that Parrot's compiler, IMCC, translates PIR into PBC. That is, it turns source code into bytecode, which Parrot can either serialize to disk or execute immediately. That bytecode is just a chunk of linear data in memory. It's not really a data structure. (Okay, it's a C array, but that doesn't make it a data structure.)

After IMCC has finished building a standalone chunk of bytecode, it performs a constant fixup phase. The notable part of this phase is that it edits the bytecode in place to replace all named invocations of functions known at fixup time with offset invocations.

The previous code looks something like:

invoke known_function
null    # padding
null    # padding

If IMCC has already seen known_function by this time, the direct invocation of known_function can continue. There's no runtime lookup necessary; all functions already compiled and ready are available in the bytecode.

If IMCC hasn't seen that function, runtime lookup is necessary, and so the fixup phase replaces the earlier bytecode with the equivalent of:

.local pmc func
func = find_name 'unknown_function'
invoke func

(I've simplified what actually happens slightly, because the concepts are more important than the details. Hopefully you see why the padding is necessary. If not, just imagine trying to splice additional opcodes into what may presumably be a lengthy C array -- like I said, barely a data structure.)
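
Here is a rough model of why the padding matters, written in Python rather than Parrot's C. The opcode names, the three-slot call site and the `emit_call` and `fixup` helpers are all assumptions made up for illustration; IMCC's real encoding differs.

```python
# Toy model of the fixup phase. Bytecode is a flat list, and every call site
# reserves three slots so it can be rewritten without moving anything else.
# Opcode names and layout are illustrative assumptions, not IMCC's encoding.

def emit_call(code, name):
    """Append a call site: a named invoke followed by two padding slots."""
    site = len(code)
    code.extend([("invoke_named", name), ("noop",), ("noop",)])
    return site


def fixup(code, call_sites, known_offsets):
    """Rewrite each call site in place; the list never grows or shrinks."""
    for site in call_sites:
        name = code[site][1]
        if name in known_offsets:
            # Already compiled: call directly by offset, padding stays unused.
            code[site] = ("invoke_offset", known_offsets[name])
        else:
            # Not seen yet: spend one padding slot on a runtime lookup.
            code[site] = ("find_name", name)
            code[site + 1] = ("invoke_register",)


code = []
sites = [emit_call(code, "known_function"), emit_call(code, "unknown_function")]
length_before = len(code)
fixup(code, sites, {"known_function": 128})
```

Because both rewritten forms fit in the reserved slots, nothing after a call site has to shift, and so no offsets elsewhere in the array need recomputing.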

The problem with this second form occurs when find_name returns a NULL PMC, which it can legitimately do. In that case, the invoke opcode tries to invoke a NULL PMC and fails, and Parrot throws an exception saying "There's nothing here to invoke." There's the error message.

It's clear why that happens, but it's not useful. It would be more useful to see the name of the function you tried to invoke in the error message. Unfortunately, by the time Parrot calls the invoke op, that name is long gone.

My first idea was to rewrite the dynamic lookup form into something resembling:

.local pmc func
func = find_name 'unknown_function'
if defined func goto call_it
die "Can't invoke 'unknown_function'"

call_it:
invoke func

Unfortunately, I didn't have the space in the bytecode stream to insert that many ops, and I had no desire to move chunks of memory around in that C array. I could have added more padding after an invocation, but to be fair I'm only mostly sure that it exists there in the first place.

I had room for one op with a destination PMC and a string constant argument. I added an experimental op called find_sub_not_null which does the same thing as find_name but throws an exception which includes the requested name if Parrot can't find a PMC of that name.
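
The difference between the two lookups can be sketched in Python. A dictionary stands in for Parrot's name lookup, `None` stands in for the NULL PMC, and the error text is borrowed from the `die` line above. The function bodies are assumptions for illustration, not the op's actual C source.

```python
# Sketch only: a dict plays the namespace, None plays the NULL PMC.

def find_name(namespace, name):
    """Return the sub, or None if nothing has that name."""
    return namespace.get(name)


def find_sub_not_null(namespace, name):
    """Like find_name, but fail at lookup time, while the name is still known."""
    sub = namespace.get(name)
    if sub is None:
        raise LookupError(f"Can't invoke '{name}'")
    return sub


def error_for(lookup, namespace, name):
    """Look up and call `name`; return the error text, or None on success."""
    try:
        lookup(namespace, name)()
    except (TypeError, LookupError) as exc:
        return str(exc)
    return None


namespace = {"known_function": lambda: "called"}
old_message = error_for(find_name, namespace, "unknown_function")
new_message = error_for(find_sub_not_null, namespace, "unknown_function")
```

With the plain lookup, the failure happens at the call and the message cannot mention the name. With the checked lookup, the failure happens one step earlier and the name is still available.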

This isn't entirely an ideal situation. It's a special case op, and I prefer to remove ops where possible. It's also nearly code duplication, though it's effectively three lines of code in an op, which isn't awful. I still want to be able to perform these kinds of transformations in PBC itself, but we need a different way to generate PBC and perform op-level transformations in PIR before we can do this effectively.

There are always tradeoffs, though. Doing this check in C is slightly faster than doing it in PIR. The standard Perl 5 rule of optimization applies even in Parrot -- the fewer ops, the mostly faster you can go. As well, I was able to improve the warning message today, rather than at some point in the future when we have better PBC optimization possibilities.

After all, I can always remove this op in the future.

by chromatic at May 07, 2008 05:42 UTC

May 06, 2008

Jonathan Worthington: Grammars Get Class-Like, And Other Bits

I started out the day by looking through the RT queue for Rakudo. Two tickets were already dealt with, so I just closed those. Another was a bug report concerning assigning undef to typed variables. Doing:

my Int $x = undef;

Would give a type check failure. This is now resolved. Furthermore, if your type is a class name, then assigning undef at any point to it will result in it holding the protoobject for that class again. I also took a moment to post to Perl 6 Language to get some clarifications on what "my TypeName $x;" left $x being when TypeName was a role, subset type or a junction of types.

Last week I started getting grammars in place. I got so far as having the regex live in the correct namespaces, but that didn't make grammars at all class-like, which is how they should be. This week I set out to fix that. Grammars now get protoobjects too, which you can call .WHAT and so forth on. Furthermore, I got inheritance working and also smart-matching against a grammar, which runs its TOP rule. Therefore you can now run the following example in Rakudo.

grammar Loads { regex Lots { \d+s } };
grammar Many is Loads { rule TOP { <Lots> of <Lots> } };
if "100s of 1000s" ~~ Many {
    say $/; # 100s of 1000s
}

Having closed four RT tickets so far, I took a look through there to see what else there was. There was one that did most of what was needed to implement the .perl method on Junctions (which you can call, in theory anyway, on anything to get a Perl representation of it). I did the required fixes and applied it, but realized in the process that we were missing .perl for any of the really fundamental types, so I added it for Num, Int and Str.

With that done, I spent a little time on the S12 tests. I added fudge directives to get one of the tests that failed to parse to do so, and added an extra test. I plan to add much more here and flesh out the tests quite a bit over time.

Turning back to the OO support, I did some updates to the grammar that both brought us closer to STD.pm - the official grammar - and added the ability to parse a range of extra things. The first grammar changes were related to method calling. Normally you call a method just with ".", but private methods are called with "!". Additionally, there are ways to call sets of methods with quantifiers (.?, .+ and .*, with meanings analogous to those in regexes). Finally, I added the ability to scope declarations of routines too, so we can now parse lexical subs and private methods. I stubbed in conditionals for all of these cases that throw unimplemented exceptions, so people don't use them expecting that because they parse, they will also work.

So, now there's a bunch of stubs in there for another bunch of OO features and, if nobody beats me to it, I'll be filling some of those out on my next Rakudo day, or maybe before then if time allows (though I'm moving apartment - and country - over the coming week, so I'm not expecting to have much time). A big thanks to Vienna.pm for funding today's work.

by JonathanWorthington at May 06, 2008 22:10 UTC

May 04, 2008

chromatic: What are You Going to Do, SMS at Me?

For years, I've thought that the only thing sillier than complaining about disaffection and how the world really should work differently in IRC was to sign silly Internet petitions about it. Now I realize that people who feel compelled to register their righteous indignation in 141 characters of chatspeak matter least.

Here's a quarter, kid. Go buy a postcard.

by chromatic at May 04, 2008 00:23 UTC

May 01, 2008

Jonathan Worthington: Today's Rakudo Progress: Object Initialization And Grammars

Today's work has been a mixture of refactoring and clean-ups that had been on the want list for a while, but just hadn't happened, as well as making some new things work.

First, the initial work I did on types attached them to variables, but what we really needed was a more general way to attach properties. Therefore, there is now a hash of properties instead, where we can stash other stuff.

Next up, I had based pairs on the Parrot Pair PMC, though as Patrick pointed out it's so far off being right for Perl 6 (for example, it's mutable, the Perl 6 one isn't) that we might as well just have our own. Dropping the Parrot Pair PMC and doing that took me all of ten minutes of work, and we get the semantics of pairs a bit more correct too. So that's much cleaner now.

A few days back, dakkar sent in a bug report regarding inheritance. It was almost correct code, but didn't work on Rakudo, since initialization of parent attributes was not yet implemented. I've now implemented this, and I'll borrow the example from the bug report to demonstrate it.

class Foo {
    has $.x;
    method boo { say $.x }
}

class Bar is Foo {
    method set($v) { $.x = $v }
}

my Foo $u .= new(:x(5));
$u.boo;                    # 5

$u= Bar.new(Foo{ :x(12) }); # This is what now works
$u.boo;                    # 12
$u.set(9);
$u.boo;                    # 9

This is not some magical hacky syntax just to make constructors work; you can use it in the general case to associate some vivification data with a proto-object, which gives you a copy of it back with the data attached. It's a bit like currying the object instantiation. So after making the above work, it wasn't much more work to get the following working.

class Foo { has $.x }
my $foo42 = Foo{ :x(42) };
my $test = $foo42.new();
say $test.x; # 42

Note that the original Foo itself isn't changed. We'll have to revisit this again later, because the way I've done it now doesn't have the lazy semantics it's eventually meant to have. It makes the common use case work, though.
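
If the currying comparison is unfamiliar, Python's `functools.partial` gives a loose analogy: it wraps a constructor with some data attached in advance and leaves the original class alone. This is not how Rakudo implements it. The Python class below only mirrors the example above.

```python
from functools import partial


class Foo:
    def __init__(self, x=None):
        self.x = x


# Roughly `my $foo42 = Foo{ :x(42) };` -- a constructor with data attached.
foo42 = partial(Foo, x=42)

test = foo42()    # roughly `$foo42.new()`
plain = Foo()     # the original Foo carries no stored data
```

Here `test.x` is 42, while `plain.x` keeps the class default of `None`.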

With some time spent on objects, I moved on to some improvements to regex stuff. The upshot of this is that you can now use grammar to group regexes into a namespace.

grammar Test {
    regex Load { \d+s };
    rule Loads { <Load> of <Load> };
}
if "100s of 1000s" ~~ Test::Loads { say "yes" }
yes

Note that this is just the start of grammars; inheritance doesn't yet work and you can't smart-match against them yet. It's a step forward, though.

If you're an avid Rakudo follower, you'll have noted that regex, rule and token all (wrongly) did the same thing before today. I've fixed that too now (there was some behind the scenes work in being able to pass options to the compiler that will be useful elsewhere, as well as to users of the Parrot Compiler Toolkit in general). In a nutshell, token and rule don't backtrack, whereas regex does, and additionally rule treats whitespace in the pattern as significant, whereas normally it has no effect on the match.

# Demonstrating :ratchet semantics (rule like token here).
regex WillBT { a*a }
token WontBT { a*a }
if "aaa" ~~ WillBT { say "yes" } else { say "no" }
yes
if "aaa" ~~ WontBT { say "yes" } else { say "no" }
no

# Demonstrating :sigspace semantics.
regex Test1 { \d \d };
rule Test2 { \d \d };
if "12" ~~ Test1 { say "yes" } else { say "no" }
yes
if "1 2" ~~ Test1 { say "yes" } else { say "no" }
no
if "12" ~~ Test2 { say "yes" } else { say "no" }
no
if "1 2" ~~ Test2 { say "yes" } else { say "no" }
yes
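
The two behaviours can be imitated in Python as a rough sketch. The hand-rolled `match_a_star_a` function is hypothetical and models only the single pattern a*a. The whitespace half relies on Python's `re.VERBOSE` flag, which works the other way round from Perl 6: pattern whitespace is literal by default and ignored under the flag. Using `\s+` for the significant-whitespace case is an assumption standing in for Perl 6's whitespace rule.

```python
import re


def match_a_star_a(text, backtrack):
    """Match the pattern a*a at the start of `text`."""
    taken = 0
    while taken < len(text) and text[taken] == "a":
        taken += 1                  # greedy a* grabs every leading 'a'
    while True:
        if text[taken:taken + 1] == "a":
            return True             # the trailing 'a' matched
        if not backtrack or taken == 0:
            return False            # ratchet: never give characters back
        taken -= 1                  # backtrack: return one 'a' and retry


will_bt = match_a_star_a("aaa", backtrack=True)    # like regex WillBT
wont_bt = match_a_star_a("aaa", backtrack=False)   # like token WontBT

# Whitespace in the pattern ignored, like `regex Test1 { \d \d }`.
test1 = re.compile(r"\d \d", re.VERBOSE)
# Whitespace in the pattern significant, like `rule Test2 { \d \d }`.
test2 = re.compile(r"\d\s+\d")

sigspace_results = [
    bool(test1.fullmatch("12")),
    bool(test1.fullmatch("1 2")),
    bool(test2.fullmatch("12")),
    bool(test2.fullmatch("1 2")),
]
```

The results line up with the yes/no answers above: the backtracking matcher succeeds where the ratcheting one fails, and the two whitespace patterns accept opposite inputs.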

So, that's what got done today. I'd like to thank Vienna.pm for funding this work, and hope you'll have fun playing with it, breaking it and reporting bugs. :-)

by JonathanWorthington at May 01, 2008 21:47 UTC

Bernhard Schmalhofer: Eclectus now emits Not Quite Perl6

Eclectus is a Scheme-compiler implemented in Scheme. The compilation target is a Parrot Abstract Syntax Tree, which is being run with the help of the Parrot Compiler Toolkit.

A sideline in Eclectus is the problem of how to tell PCT about the PAST generated in Scheme. Up to now I generated PIR that built up that data structure. This involved nasty dealings with unique ids for Parrot registers.

Generating NQP is much saner. The PAST is now set up with nested PAST::Node constructors.

Even nicer would be to create a YAML representation of PAST. But PCT doesn't support this yet.

by Bernhard at May 01, 2008 10:32 UTC

Jonathan Worthington: Various Rakudo Updates

First of all, before I dig into what my recent Rakudo hackings have been, I'd like to thank Vienna.pm for funding me to work on Rakudo. I will be working one full day a week on Rakudo from now on, at least for the next three months and, hopefully, longer. Today is the first day I'm working under this funding, so I'll be posting again later on today about what I got done. This post is just to update you on little bits that I've been doing, but didn't get written up yet.

First of all, you can now use the .= operator.

class Foo { }
my Foo $x .= new();

Here we call the 'new' method on $x, which we know is of type Foo thanks to the type declaration, and assign what it returns - namely, an instance of Foo - to $x. I did initially put this in a while ago, but it was a tad buggy and I wanted to get those worked out before posting it. That's been done, so happy playing. (And note you can use it in places other than declarations too.)

Additionally, some very basic multi-method dispatch based upon types is now in place. You can only use class names, not constraints or role names at the moment for the types, and certainly not more complex types than that. However, it's a start and allows us to run the following example.

class Thing {}
class Rock is Thing {}
class Paper is Thing {}
class Scissors is Thing {}

multi sub defeats(Thing $t1, Thing $t2) { 0 };
multi sub defeats(Paper $t1, Rock $t2) { 1 };
multi sub defeats(Rock $t1, Scissors $t2) { 1 };
multi sub defeats(Scissors $t1, Paper $t2) { 1 };

my $paper = Paper.new;
my $rock = Rock.new;

say defeats($paper, $rock); # 1
say defeats($rock, $paper); # 0
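
For comparison, here is a minimal sketch of type-based multiple dispatch in Python. Ranking candidates by summed inheritance distance is an assumption made for this sketch, not a description of Rakudo's dispatcher, and the `multi` decorator and `dispatch` function are hypothetical names.

```python
# Sketch only: pick the applicable candidate with the narrowest parameter types.

class Thing: pass
class Rock(Thing): pass
class Paper(Thing): pass
class Scissors(Thing): pass

CANDIDATES = []


def multi(*types):
    """Register the decorated function as a candidate for these types."""
    def register(func):
        CANDIDATES.append((types, func))
        return dispatch     # every registered name ends up bound to dispatch
    return register


def dispatch(*args):
    """Call the applicable candidate whose types are closest to the arguments."""
    best = None
    for types, func in CANDIDATES:
        if len(types) != len(args):
            continue
        if not all(isinstance(a, t) for a, t in zip(args, types)):
            continue
        # 0 means an exact class match; larger means a more distant parent.
        distance = sum(type(a).__mro__.index(t) for a, t in zip(args, types))
        if best is None or distance < best[0]:
            best = (distance, func)
    if best is None:
        raise TypeError("no applicable candidate")
    return best[1](*args)


@multi(Thing, Thing)
def defeats(t1, t2): return 0

@multi(Paper, Rock)
def defeats(t1, t2): return 1

@multi(Rock, Scissors)
def defeats(t1, t2): return 1

@multi(Scissors, Paper)
def defeats(t1, t2): return 1


paper_beats_rock = defeats(Paper(), Rock())
rock_beats_paper = defeats(Rock(), Paper())
```

As in the Rakudo example, the (Paper, Rock) call picks the specific candidate, while the (Rock, Paper) call falls back to the (Thing, Thing) one.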

Finally, I put in a small optimization to avoid having to run some runtime type-checks when we can statically determine they're not needed. This should help performance a little.

by JonathanWorthington at May 01, 2008 08:37 UTC

April 28, 2008

Patrick Michaud: Rakudo milestones posted

Last week I heard that someone was having trouble finding the Rakudo sources, and it was suggested that blogging about it here might provide an additional pointer. So, if you're looking for the sources to Rakudo Perl, they're currently held in the languages/perl6/ directory of the Parrot repository.

Eventually Rakudo and other languages on Parrot will likely get separate repositories, but for now we all find it easier to keep everything in a single repository.

The architecture and layout of the Rakudo source code is described in docs/compiler_overview.pod. This says what each source file does and how it all fits together.

Also, by popular demand, I've created a list of "milestones" for Rakudo Perl development and stuck them in the ROADMAP. Reproducing the 2008-04-28 version here, we have:

* list context, list assignment
* return and control exceptions
* class, role, objects
* regex, token, rule, grammar
* selected libraries written in Perl 6
* modules
* junctions
* lazy lists
* slices
* multi sub & multi-method dispatch
* captures and signature handling
* operator overloading
* other S09 features (typed arrays, sized types)
* heredocs
* macros
* module versioning

While the milestones are roughly in priority sequence, this list isn't meant to be rigid or strictly sequential in nature. If someone wants to work on a later milestone even though we haven't completed the earlier ones, that's certainly okay. This list just gives us a way to see where things are fitting in my head.

Suggestions and additions to the above list are welcome.

Pm

by pmichaud at April 28, 2008 18:27 UTC

chromatic: What a GC Bug Looks Like in Parrot

Every so often someone reports weird behavior in #parrot, and someone says "Hey, that looks like a GC bug!"

Most of them aren't. (Most of them lately seem to be that we're changing the way bytecode works, and we don't have all of the dependencies for all of the generated PBC files correct, so you have to run make realclean and rebuild.)

While adding the vtable override cache the other day, I did create a forehead-slapper of a GC bug, but I caught it before I checked in the code. How did I know it was a GC bug? Easy.

The Class PMC itself contains pointers to several other PMCs and GC entities, including the name of the class and its corresponding namespace. I added a pointer to a Hash PMC which maps the names of vtable overrides to Sub PMCs.

I remember thinking at the time "Hey, it's just a cache. I don't have to mark it during the mark phase explicitly. All of the Subs it refers to will stay alive as long as their namespaces live. That's easy."

When I ran the tests, I saw a weird error about not being able to perform a keyed index PMC lookup on a Key PMC. I set a breakpoint on the real_exception function (which reports these kinds of errors) in the debugger, and the backtrace showed that the cause of the call was my cache lookup function.

"That's weird," I thought. Then I realized what I had done.

My line of thinking was correct in that I don't have to mark all of the PMCs contained in the cache PMC. They're already reachable from the rootset through the namespace. The GC won't collect them.

The problem is that the cache itself -- the Hash PMC -- is only reachable through the Class PMC. Unless it gets marked as live, the GC will reclaim its header and put it on the free list again.

The Class PMC still has a pointer to that header, but the next PMC allocated from the GC which uses that header will overwrite the PMC's information, effectively morphing my lovely cache into something else. In this case, my Hash PMC turned into a Key PMC.
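Here is a minimal sketch of that failure in Python, assuming a toy arena that recycles fixed headers the way the paragraphs above describe. Every name in it (Header, Arena, run, mark_cache) is made up for illustration; none of it is Parrot's actual API.

```python
# Toy model of a header-recycling collector; all names are illustrative, not Parrot's API.

class Header:
    """A fixed slot in the arena; the collector reuses slots much as Parrot reuses PMC headers."""
    def __init__(self):
        self.kind = "free"
        self.children = []   # headers this object reports during the mark phase
        self.live = False


class Arena:
    def __init__(self, size):
        self.headers = [Header() for _ in range(size)]
        self.free = list(self.headers)

    def allocate(self, kind):
        header = self.free.pop(0)
        header.kind, header.children = kind, []
        return header

    def collect(self, roots):
        for header in self.headers:
            header.live = False
        pending = list(roots)
        while pending:                       # mark phase
            header = pending.pop()
            if not header.live:
                header.live = True
                pending.extend(header.children)
        for header in self.headers:          # sweep phase
            if not header.live and header.kind != "free":
                header.kind, header.children = "free", []
                self.free.insert(0, header)  # reclaimed headers get reused first


def run(mark_cache):
    arena = Arena(4)
    klass = arena.allocate("Class")
    cache = arena.allocate("Hash")
    klass.cache = cache                  # raw pointer the collector knows nothing about
    if mark_cache:
        klass.children.append(cache)     # the fix: report the cache during marking
    arena.collect(roots=[klass])
    arena.allocate("Key")                # the next allocation reuses a reclaimed header
    return klass.cache.kind


print(run(mark_cache=False))  # the cache header has been recycled into a Key
print(run(mark_cache=True))   # the cache is still a Hash
```

The only difference between the two runs is whether the Class reports its cache while being marked, which is the whole bug.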

Usually they're not this obvious, but I've gone through all of the PMCs in Parrot to make sure they mark their contained GCable entities appropriately.

by chromatic at April 28, 2008 07:19 UTC

April 24, 2008


chromatic: Perl 6 Design Minutes for 23 April 2008

The Perl 6 design team met by phone on 23 April 2008. Larry, Allison, Patrick, Jerry, Will, Jesse, Nicholas, and chromatic attended.

Jerry:

  • we're in a period where everyone's trying to break Parrot
  • they're adding new features and accidentally breaking things
  • but they're fixing it
  • it's a good part of the cycle
  • people fix it
  • we don't have a build farm, so we can't test everywhere before committing to trunk

Patrick:

  • I thought that was the point of the release cycle

Jerry:

  • some people have suggested that we always keep trunk building and passing tests
  • but we don't have the means to do that
  • especially when we're playing with config
  • moving on, the big news is that TPF has six slots in Google's Summer of Code
  • one of them is fleshing out the Perl 6 test suite
  • we've needed someone to spearhead that
  • having a funded new contributor is wonderful
  • two Parrot-related projects
  • one is generating NCI stubs
  • Kevin Tew, a long-time contributor
  • the other is the incremental GC specified in the PDD
  • that's Andrew Whitworth
  • there's also an ASF project for integrating the GC from Apache Harmony into Parrot
  • they've wanted to release it as a standalone library
  • Parrot's the first test of a standalone system

Nicholas:

  • nice that it doesn't count against TPF's slot list

Jesse:

  • and it's nice visibility for Parrot from another group

Jerry:

  • I finally have six weeks of no plans to travel
  • should be able to devote more time to Parrot and Rakudo
  • looking forward to that

Larry:

  • getting some hacking in on my two pet bugaboos, the longest token matcher and match object generator
  • I refactored the matcher
  • it still uses TRE
  • instead of lumping all of the possible "expect term" tokens (that is, all of them) into one bucket, it separates them into buckets based on their first letter
  • it's a one-level tree
  • we can build a much smaller DFA for the regexes that start with that letter
  • it caches that, of course
  • can get an instant reject if the next token can't possibly start with that letter
  • also flattened out all of the rules such that the list of tokens is easy to read
  • if the first probe with the DFA engine fails, I can take that small set of tokens that start with the same letter, run all of those rules, and sort from longest to shortest
  • preserves the token matching order without building huge DFA structures
  • as a backoff strategy, that will scale pretty well
  • refactored the parameter passing on the matcher side (STD 5)
  • instead of passing an initial array of random things, I have parameters
  • constructs match objects more correctly
  • in the sense that it gets all the information it's supposed to have
  • also has some attachments where it shouldn't have
  • I'm cleaning that up
  • that should scale pretty well
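
The first-letter bucketing Larry describes might look roughly like the Python sketch below. It assumes literal tokens instead of regexes and a sorted list instead of a cached DFA, so it only illustrates the bucketing, the instant reject, and the longest-first ordering. The function names are hypothetical and do not come from STD.

```python
from collections import defaultdict


def build_buckets(tokens):
    """Group literal tokens by first character, longest candidates first."""
    buckets = defaultdict(list)
    for token in tokens:
        buckets[token[0]].append(token)
    for candidates in buckets.values():
        candidates.sort(key=len, reverse=True)
    return dict(buckets)


def longest_token(buckets, text, pos=0):
    """Return the longest token starting at pos, or None if nothing matches."""
    if pos >= len(text):
        return None
    candidates = buckets.get(text[pos])
    if candidates is None:
        return None                      # instant reject: no token starts with this letter
    for token in candidates:             # only the small bucket is ever examined
        if text.startswith(token, pos):
            return token
    return None


buckets = build_buckets(["<", "<=", "<=>", "=", "==", "if"])
print(longest_token(buckets, "<=> 1"))
print(longest_token(buckets, "x + 1"))
```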

Jerry:

  • is there a drop in memory usage?

Larry:

  • I haven't measured
  • I'm sure that not feeding 800 regexes to TRE at once will keep it from allocating 17 megabytes on the stack
  • it might still be allocating too much for some of the larger things
  • I'm still aiming for correct, as opposed to fast
  • just trying to bear fast in mind
  • the longest token matcher now returns a linked list of states
  • not a string
  • should be a lot faster; easier to cache
  • functionally it's the same as before
  • one of those things you don't even have to measure to know it'll be faster
  • trying to avoid the bugaboo of premature optimization by doing what I know will be efficient to begin with
  • all the while trying to make the thing work
  • it has a good chance of being pretty speedy
  • my talk in Tokyo will be about all of the places where the current grammar allows extensibility
  • it'd be nice to be able to demonstrate some of that

Allison:

  • getting work done again
  • launched the Strings PDD
  • list of tasks for concurrency that I'm breaking down into smaller pieces
  • may post what I have now, and leave other people to break them down

Patrick:

  • are they parallelizable? :)

Allison:

  • many of them are
  • there are some bigger things, like switching the exception system over to the event handler
  • otherwise, just life stuff

Patrick:

  • had paying work come up this past week, so not a lot of actual coding
  • need to type the milestones document
  • it's all in my head
  • managed to remove a lot of unused code thanks to chromatic's post about possible optimizations
  • mostly just cleanup
  • but helped me figure out things which will feed into my redesign of PGE for longest token matching
  • should be able to return to direct Rakudo hacking later this week

Will:

  • various Parrot cleanups
  • TPF quarterly grant proposals are due at the end of the month
  • haven't seen anything come in yet

Allison:

  • they're queued in an RT queue
  • I don't know if grant members have access to that queue

Will:

  • we do
  • please, get your proposal in now, sooner rather than later
  • that goes for you on the call as well as people reading the minutes

c:

  • mostly spent the past week optimizing Parrot and Rakudo
  • looks like the build is twice as fast as when I started
  • runtime is faster too, but the optimization target is compilation time
  • found some infelicities that need more design thought
  • but I'm happy to put these improvements in now and take them out later when the design improves around them
  • hope to start adding new features again soon

Patrick:

  • most of the test execution time is in compilation
  • how useful would it be to compile Rakudo to standalone PIR?

c:

  • I'd find it useful
  • but I'd find about ten things useful with all I work on
  • so not a blocker at the moment

Jesse:

  • how far will that get you toward native executables?

Patrick:

  • the existing trick for building perl6 would work
  • but it's not the same

c:

  • if it takes an hour or two, it would help me with debugging and profiling
  • if it takes more time, it's not that important

Patrick:

  • we have to figure out runtime deployment issues for the Perl 6 runtime library

Will:

  • we could add the requirement to run from languages/perl6/ right now

c:

  • that's fine by me for now

Patrick:

  • that's an afternoon job, not too bad
  • what do we need to do to get the Perl 6 and Parrot pages up to date?

Will:

  • I'll work on that

Nicholas:

  • why is C99 useful to Parrot and the compiler tools?

c:

  • front-end parsing for C header files to build NCI declarations automatically
  • the backend is pretty easy, that's thunk generation
  • the front-end keeps people from having to write boilerplate code by hand
  • generate the front-end once, where you have the headers, and then you can run the generated code anywhere even if you don't have the headers

Jesse:

  • how does that compare to Python's ctypes?

c:

  • as I understand it, they have the nice backend stuff
  • not so sure about the front-end
  • my P5NCI is nicer, if incomplete
  • just haven't had time to work on it...

Jesse:

  • if you put in for a TPF grant, that would be very useful

c:

  • get me a clone first, and you have a deal

Jerry:

  • Allison, we talked about implementing return
  • that requires tying in exceptions to the concurrency scheduler?

Allison:

  • yes
  • just not implemented yet
  • when we did exceptions, we didn't have concurrency
  • so it's on the top of my list to tie them together

Will:

  • Tcl's already using exceptions to handle return, break, and continue

Allison:

  • right now, you can't have an exception handler which is a full subroutine

Patrick:

  • I'm not sure we need one for that feature
  • every subroutine block decorated as such in PAST puts an exception handler in the block
  • if any nested block throws a return type exception, it grabs the arguments, does what it has to, and then does a Parrot return
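
As a rough illustration of the scheme Patrick outlines, here is a Python sketch in which return is a control exception caught by a handler attached to the subroutine. ReturnException, call_sub, and first_negative are invented names for this example. Python exceptions stand in for Parrot's, so this shows the shape of the idea and not how PAST actually generates code.

```python
class ReturnException(Exception):
    """Control exception carrying the values a nested block wants to return."""
    def __init__(self, values):
        super().__init__("return")
        self.values = values


def call_sub(body, *args):
    """Run body as a subroutine block: the return handler lives on the sub itself."""
    try:
        return body(*args)
    except ReturnException as control:
        return control.values            # grab the arguments, then do an ordinary return


def first_negative(numbers):
    def check(n):                        # a nested block inside the subroutine
        if n < 0:
            raise ReturnException(n)     # "return" from the enclosing sub, not from check
    for n in numbers:
        check(n)
    return None


print(call_sub(first_negative, [3, -7, 5]))
```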

Allison:

  • if that's what you need, go ahead and do it
  • I thought there are some features you didn't have yet

Patrick:

  • I thought there might be some opcodes I needed, like handled
  • but we can do something now
  • might not be completely optimal
  • but it's just packaging things up now
  • I have something I think will work
  • it's not trivial, but I'm 80% confident
  • just a matter of sitting down and doing it

Allison:

  • the concurrency stuff will be there before the next release
  • might not want to roll it in before the release
  • but it'll be there soon

Patrick:

  • I want to get return in for the April 15 milestone we're behind on

Jerry:

  • have you put in tickets for the breakdown of specific tasks?

Allison:

  • I've never done tickets for that
  • just sent mail to the list of the tasks
  • handed them out to people as they volunteered

c:

  • can you put them on a wiki page?
  • some of the other committers wanted that

Allison:

  • I can do that

by chromatic at April 24, 2008 21:16 UTC

April 22, 2008


chromatic: Refcounting Isn't All Bad

A lot of people complain about reference counting as a memory management strategy in Perl 5, and they have a point. Unless you're very careful, reference counting spreads its tendrils throughout your system. Unless you're very disciplined, you'll forget to increase a reference count somewhere, and you'll end up collecting things in the wrong places.

The paper A Unified Theory of Garbage Collection discusses the most popular modern GC strategies to demonstrate how reference counting and tracing garbage collection are roughly equivalent, especially as concerns about performance, correctness, and conservativeness increase.

I explain this because I added reference counting to a small part of Parrot today as an optimization -- 11.73% on my Rakudo-building benchmark.

Parrot has an internal data structure called a stack. This represents a particular environment within the call graph. If you think of a normal program flow, where function A calls function B calls function C which returns to B and returns to A, you'll have a sense of a normal call stack. Parrot's stack is similar, except that we use continuation passing style (CPS), where it's not a stack, it's a graph, and control flow doesn't have to be that linear.

There's plenty of literature on CPS (including Luke Palmer's how to get a free sandwich with continuations), but the important idea is that function A sets a bookmark inside and passes that bookmark along to function B. B does the same thing when it calls C. At any point, you can look up any bookmark you happen to have and magically teleport (okay, it's just a jump instruction in your processor -- someday I'll explain how very low level languages have eval and very high level languages have eval and it's just the poor jerks in the middle who can't do anything useful without feeding the world's pointiest programming language into their AbstractFactoryFactoryFactories) to the point of that bookmark.

Ranty story short, our stack is a mostly-acyclic graph represented as a linked list, and stack entries -- or chunks -- are garbage collectable entities.

If you've even sat next to one of my "Let's Make Parrot Go Faster!" entries on a bus somewhere, or heard it playing softly in an elevator, you know that one good way to make Parrot go faster is to use fewer garbage collectable entities. One way to make an O(2n) algorithm faster is to make it an O(log n) algorithm. Another way is to make n smaller, and that's easier for now, especially because all I have to do is watch over Andrew Whitworth who's going to write a nice new garbage collector and keep track of Senaka Fernando, who's going to connect the nice garbage collector from Apache Harmony into Parrot.

One nice feature of reference counting (ah, now you see the connection!) is that you can recycle dead objects as soon as you know they're dead. One sad misfeature of a conservative garbage collector is that dead bodies stack up for a while until you can mulch them. One horrible thing about a stop-the-world full mark-and-sweep garbage collector (and try to guess what Parrot has at the moment) is that if you're out of memory in one of the arenas you care about, you have to mark all of the living GCable entities in your system and sweep all of the pools in all of the arenas to find everything that's alive and to recycle everything that isn't.

If you want to go fast, you want to avoid this until it's absolutely necessary.

Here's the interesting thing about stack chunks: Parrot stores them in one place in one function (stack_push) and unstores them in one place in one function (stack_pop). Again, the names are really bad because it's a graph and not a stack, but you get the point.

I noticed this when I profiled the Rakudo-building benchmark again and saw that stack_push was one of the most expensive inclusive calls, apart from the garbage collector. It calls a function which calls a function which calls a function which asks for a new stack chunk from the GC. When there are no more free stack chunks, the GC runs again, and that eats up precious cycles.

After a few minutes of thinking, I realized that the stack_push/stack_pop pair formed a boundary around a stack chunk's lifespan. Popping a chunk off of the stack meant that nothing else needed it. That chunk is immediately identifiably a dead object, suitable for recycling.

Rather than waiting for a full GC run to find every live object in the system and then run across this dead object and recycle it, I added a couple of lines of code to recycle it automatically then and there.

There was my immediate 11.73% speedup.

There was also a problem in the Continuation PMC tests; I noticed this only after I patted myself on the back for moving the bottleneck in my benchmark elsewhere. I'm sure you'll see it if you think for a moment, especially about how I kept writing that the stack isn't a stack, it's a graph implemented as a linked list.

The problem is that there's not a single linked list. If you take a continuation at various points in the call graph, you can have several linked lists that all share the same tail. If you exhaust one of those call chains, you may pop a stack chunk that's part of another call graph elsewhere. Your reference to it goes away, but it's still live elsewhere, so if you recycle it then and there, you may end up reusing the chunk and storing inappropriate data in it while something else expects it to stay pristine.

If you're very clever, you've figured out the solution.

I added a reference count to the stack chunk structure, incremented it on stack_push, decremented it on stack_pop, and recycled the chunk in the latter function only if the refcount is zero.
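
A small Python sketch of the idea follows. It assumes a simplified chunk structure and a plain free list, and it borrows the names stack_push and stack_pop only for readability. take_continuation is a hypothetical helper standing in for whatever creates a second chain, and the bookkeeping is one plausible scheme, not a copy of Parrot's code.

```python
class Chunk:
    """One stack entry; recycled chunks are reused instead of reallocated."""
    def __init__(self):
        self.value = None
        self.next = None
        self.refcount = 0


free_list = []


def stack_push(top, value):
    chunk = free_list.pop() if free_list else Chunk()
    chunk.value, chunk.next = value, top
    chunk.refcount = 1              # held by whoever receives the new top
    return chunk


def take_continuation(chunk):
    chunk.refcount += 1             # a second chain now starts at this chunk
    return chunk


def stack_pop(top):
    value, below = top.value, top.next
    top.refcount -= 1
    if top.refcount == 0:
        top.next = None
        free_list.append(top)       # dead right now: recycle without a GC run
    elif below is not None:
        below.refcount += 1         # top survives and still points at below
    return value, below


main = stack_push(None, "main")
saved = take_continuation(main)     # two chains now share this tail
top = stack_push(main, "A")
_, top = stack_pop(top)             # "A" is recycled immediately
_, top = stack_pop(top)             # "main" is popped but still referenced by saved
other = stack_push(saved, "B")      # reuses the chunk that held "A"
print(saved.value, other.next is saved, len(free_list))
```

Without the count, the second pop would put the shared tail on the free list, and the push of "B" would then overwrite a chunk that the saved chain still expects to find intact.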

Is this a long-term solution? Probably not; I posted my initial investigation to the parrot-porters list, and we had a nice discussion on alternate techniques. We'll have to refactor the code more heavily once we find a better approach, but for now, I'll live with this optimization.

(Seneca Cunningham found what appears to be a crash related to this, but I think I've found the culprit and fixed it by adding another reference count pair to the Continuation PMC. See? I told you reference counting made itself pervasive.)

by chromatic at April 22, 2008 06:50 UTC

April 21, 2008


chromatic: It's Cheaper at Compile Time

I've long used part of the Rakudo-building process as my benchmark for Parrot optimizations. It doesn't exercise all of Parrot by any means, but it pushes the garbage collector, the string system, and the object system pretty hard. It's also the longest part of the build process anywhere, so improving performance there has a dramatic effect on productivity.

I've long known that it appends to and concatenates strings very frequently. The corresponding Parrot functions always show up near the top of the cost list from Callgrind. I've spent a few hours here and there trying to speed them up, but short of rethinking them entirely, I've never found a good solution.

We don't (yet) have a PIR-level profiler like Callgrind (but I'm working on it). Sometimes I can read through the code and see things that might be expensive, knowing what eventually gets called and where. On a whim, I browsed through the generated code for Rakudo to find concatenation. One idiom jumped out at me:

$S0 = concat "PIR", ':"'

This line of code catenates two constant strings into one, and assigns the result to the virtual string register $S0. It's not terribly inefficient; the two constant strings are in the constant table in the bytecode, so they don't get recreated on execution. However, the result is always going to be a constant string.

There's no reason the optimizer couldn't rewrite that line of code into:

$S0 = 'PIR:"'

It didn't. Now it does.

The optimizer already knew how to optimize away constant addition, subtraction, and other math operations. All I had to do was tell the opcode generator not to generate the concat_s_sc_sc op (concatenate two string constants and put the results in a string register -- the destination register always comes first in the argument list for an opcode), add the concat operator to the list of rewritable three-register opcodes, and add a case to the rewriter where the destination register is a string register.
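
To make the rewrite concrete, here is a Python sketch of constant folding for concat. It assumes instructions are represented as tuples of (opcode, destination, operands) and wraps string constants in a Const type. Both are inventions for this example and have nothing to do with IMCC's real data structures.

```python
from typing import NamedTuple


class Const(NamedTuple):
    """A string constant operand; plain strings such as "$S0" stand for registers."""
    value: str


def fold_concat(instructions):
    """Rewrite concat of two constants into a single constant assignment."""
    folded = []
    for op, dest, *args in instructions:
        both_constant = len(args) == 2 and all(isinstance(a, Const) for a in args)
        if op == "concat" and both_constant:
            folded.append(("set", dest, Const(args[0].value + args[1].value)))
        else:
            folded.append((op, dest, *args))   # registers are unknown until run time
    return folded


program = [
    ("concat", "$S0", Const("PIR"), Const(':"')),
    ("concat", "$S1", "$S0", Const("x")),
]
for instruction in fold_concat(program):
    print(instruction)
```

Only the first instruction changes. The second still involves a register, so it has to stay a run-time concat.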

The result is a modest 5.5% speedup in my benchmark. Part of the gain is avoiding run-time concatenation, but part of it is also using fewer GCable entities. The less garbage you create, the less garbage you have to collect and the less frequently you have to collect it. 5.5% isn't a huge gain, but it's a step in the right direction. Every second we shave off of the hack-compile-test cycle is an improvement, and these optimizations help all Parrot programs as well.

(Are these entries interesting? I seem to get more feedback when I complain about various Technology Distortion Fields.)

by chromatic at April 21, 2008 07:09 UTC

April 19, 2008


chromatic: Rakudo Copy on Wrong

Late last night, PerlJam posted in #parrot a small Perl 6 program which gave the wrong answer in Rakudo:

my $foo = 'fred';
say $foo;
$foo--;
my $bar = 'fred';
say $bar;

The correct output is obviously:

fred
fred

Rakudo gave:

fred
frec

PerlJam and Infinoid both correctly diagnosed the problem as a COW problem. What's that, and why does it matter?

The Rakudo compiler turns this code into PIR code. PIR is the native high level language of Parrot. Inside Parrot, the PIR compiler (IMCC) turns PIR into Parrot bytecode. As part of that process, IMCC identifies constant string literals and treats them specially.

Like the Perl 6 code, the PIR code produced by Rakudo contains the string literal fred twice. The PBC produced by IMCC doesn't; it refers to a single internal data structure twice.

This is usually the right approach. In this case, where the literal string appears twice and is only four characters long, there's little benefit, but in a complex program, you can save a lot of memory and time with judicious caching.

Now of course sometimes people want to mutate these strings. They're mutable; you can change them. That's where the COW comes in. It's like memory handling on a decent operating system. You only make a copy of the memory at the last possible point, where you know you're going to modify your copy. Parrot strings support this, so if you use Parrot operations directly, you don't even have to know that COW exists. It just works.

The problem was that the string modification took place outside of Parrot, in a custom Perl6Str PMC. Think of a PMC like a class which represents internal data structures, and you're most of the way to understanding them. The Perl6Str PMC has two operations, increment and decrement which do exactly what you'd expect to strings on the C level. This means that they modify the C string directly.

Because this occurs at the C level (working directly on C pointers), Parrot doesn't have a chance to perform the copy-on-write operation to the string, and the modification of one string produces the modification of all other strings which refer to the same string literal.
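
The difference is easy to reproduce in a toy model. The Python sketch below assumes a buffer with a simple shared flag. Buffer, CowString, and both decrement methods are hypothetical names for illustration and only approximate what Parrot's string functions and the Perl6Str PMC do.

```python
class Buffer:
    """Raw character storage; shared is True while other owners may read it."""
    def __init__(self, text, shared):
        self.data = bytearray(text, "ascii")
        self.shared = shared


class CowString:
    def __init__(self, buffer):
        self.buffer = buffer

    def unshare(self):
        """Copy on write: take a private copy before the first modification."""
        if self.buffer.shared:
            self.buffer = Buffer(self.buffer.data.decode("ascii"), shared=False)

    def decrement_in_place(self):
        self.buffer.data[-1] -= 1        # the bug: writes straight into shared storage

    def decrement(self):
        self.unshare()                   # the fix: copy first, then write
        self.buffer.data[-1] -= 1

    def __str__(self):
        return self.buffer.data.decode("ascii")


def demo(decrement):
    literal = Buffer("fred", shared=True)    # one constant-table entry for both literals
    foo = CowString(literal)
    decrement(foo)
    bar = CowString(literal)
    return str(foo), str(bar)


print(demo(CowString.decrement_in_place))    # ('frec', 'frec')
print(demo(CowString.decrement))             # ('frec', 'fred')
```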

My first solution was to call the Parrot string function to perform the copy (because there's a write coming up) directly, but that made too much code move around (C89 and declarations before code, grr). Instead, I made a two-line macro which does an in-place copy and assign, and only two lines of code had to change to do the right thing. Now the code prints, as it should:

fred
fred

(I spent more time writing this entry than I did fixing the problem.)

by chromatic at April 19, 2008 17:32 UTC


chromatic: Those with Loaded Mouths, and Those Who Code

This is complete vaporware. The constraints and software development link above is just annoying. There is no excuse for an 8 year long dev effort with no milestones or end in sight. We could have had a nice language by now with some simple features like parameter lists...but we're stuck with Perl 5 circa 1996 because some people got big useless ideas into their heads and hijacked what was once a useful technology and turned it into an obsolete pile of code about as valueable as awk or sed. The article should really begin to change, from a "whats new and cool in [Perl 6]" to "why [Perl 6] was such an utter failure". the [Perl 6] team has failed us all.

Justforasecond, Personal Comments on the Perl 6 discussion page on Wikipedia

If my economics are wrong, show me. If my math is wrong, show me. If you know the secret to software development, show me. Here's zero dollars and the source code. Fix everything you can about Perl 5. See you in two years.

Here's 1084 patches I've made or applied to Parrot in the past three years. Not coincidentally, I've had actual Perl 6 code running in public for about the same amount of time. You can find it if you search the web. Of course, you can find Perl 6 and Parrot milestones if you search the web too.

P.S. Perl 5.10, circa 2007 is quite decent. Try upgrading to software released this millennium.

by chromatic at April 19, 2008 01:54 UTC

April 18, 2008


chromatic: Perl 6 Design Minutes for 16 April 2008

The Perl 6 design team met by phone on 16 April 2008. Larry, Allison, Jerry, Will, Nicholas, Jesse, and chromatic attended.

Jerry:

  • the release went pretty smoothly
  • had some help
  • the make release target is broken on Windows, and that's the only platform I had out here
  • we'll fix that before the next release from Windows
  • talking to Andy Armstrong about getting Test::Harness to 3.0 and subclass TAP::Parser so that we can report Rakudo's fudge tests better
  • every fudged test is a failure right now, even if all subtests pass
  • I talked Jonathan into implementing simple MMD in Rakudo
  • chromatic wrote about it
  • but it's broken in the release (and only the release)
  • put some work into internationalization
  • need to figure out the make rules
  • then I can put more work into localization too
  • trying to secure the parrot.org domain too
  • TPF is likely to get five slots for GSoC
  • some will be Parrot and Perl 6 related
  • the official announcement is on Monday
  • trying to encourage others to take on responsibility
  • seems to be working
  • some of the committers I've mentored are becoming mentors themselves

Patrick:

  • mostly reviewing different things
  • working with Jonathan on his various objects and MMD implementations
  • will check in my Rakudo milestones document tonight or tomorrow morning
  • cleaning up bug reports, closing tickets, catching up on patches
  • need to spend a little more time on paying work this week

Will:

  • trying to get the most recent release bundled as a macport
  • there's apparently a build issue since the previous macport
  • should have an easy way to install Rakudo as perl6 after that gets straightened out too
  • still trying to cut dead things out of Parrot

c:

  • poked at optimizations per Patrick's request
  • sped up OO and Rakudo by about 40%
  • not as much as I wanted, but you notice it
  • looked at a few more optimizations, but they're bigger and take more work
  • GC is the biggest, so let's hope we get that as one of the GSoC projects
  • think I can get the profiling core to emit Callgrind-compatible output for PIR
  • I know mostly how to do it now

Larry:

  • spending a lot of time tweaking a new laptop and getting it all set up and customized nicely
  • feebly trying to keep up with the onslaught in p6l
  • I haven't been keeping up, but I'm keeping my eye out for things going off track badly
  • plotting how to get rid of TRE as my longest token matcher
  • want something that will scale better
  • the handwriting was on the wall the first time I went in it with the debugger
  • for the regex matching where a token is expected, TRE was allocating 17 MB on the stack
  • I have some ideas for something with better semantics and less memory usage
  • TRE optimizes for running one regex a lot over a lot of data, rather than running a lot of regexes

Jesse:

  • spent a lot of time in discussions about funding Perl 5 hackers
  • I see François Perrad has released a Win32 binary of Parrot

Jerry:

  • he's been doing that for the past few releases
  • just a release in binary form, not a fork or branch

c:

  • what's the status of Parrot in Debian?

Allison:

  • one final build bug in IA-64
  • the guy with the box should be getting to it this week
  • we're planning to put up the 0.6.0 release
  • if it doesn't go soon, I'll just put the Fs on the site
  • we might put IA-64 builds on our platform wishlist

c:

  • we need someone who knows how to fix them too

Allison:

  • it was a PGE bug unrelated to the arch, I think, in 0.4.0
  • we just need the bug confirmed fixed on that arch for Debian

Jerry:

  • PAUSE kinda stinks
  • I hate that we don't know if we'll have an authorized release of Parrot until it gets uploaded
  • PARROTRE lacks permissions on eight modules
  • all of which have been refactored on something else
  • they're all related to the configure or test system

Allison:

  • we don't have to distribute through PAUSE

Jerry:

  • we should not index those modules anyway

c:

  • they're not that useful outside of Parrot anyway

Will:

  • the release link on the Parrotcode site, http://www.parrotcode.org/release/devel, links to the CPAN download
  • there's a lag between the update and the availability

Jerry:

  • we could upload to Parrotcode first
  • upload to the CPAN from that site

c:

  • I'm not sure why we should index the config::* modules
  • the only ones I care about are in Parrot::Embed

by chromatic at April 18, 2008 07:11 UTC

(Last updated: May 13, 2008 15:40 GMT)