Monday, July 7, 2008

Parrot magic cookies

In a statically typed language, all variables are given a type when they are created. Take these declarations in C:


int i = 4;
char c = '4';


The variables i and c are not equal, and can't interact with each other. You can't add i and c to get 8, and you can't concatenate i and c to get 44, not without some sort of explicit conversion. (Well, technically you can add or concatenate them, but you'll be surprised by the results.)


int add = i + c;


The value of add is not 8, but 56. When a char variable is used as an integer, its ASCII value (in this case, 52 for the character '4') is what gets used.
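Here is a minimal C sketch of the explicit conversions mentioned above. The helper names digit_value and concat_int_char are made up for illustration, and the comparison with 56 assumes an ASCII-compatible character set.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: numeric value of a decimal digit character.
   C guarantees '0'..'9' are contiguous, so subtracting '0' works. */
int digit_value(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}

/* Hypothetical helper: write the decimal digits of i, followed by the
   character c, into buf. Returns what snprintf returns. */
int concat_int_char(char *buf, size_t size, int i, char c)
{
    return snprintf(buf, size, "%d%c", i, c);
}
```

With i = 4 and c = '4', the expression i + digit_value(c) is 8, and concat_int_char(buf, sizeof buf, i, c) leaves the string 44 in buf.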

In a dynamically typed language, that's not the case. A variable has no fixed type, and its value is converted as needed at run time. For example, in Perl:


$i = 4;
$c = '4';

$add = $i + $c;
$cat = $i . $c;


In this case, $add = 8 and $cat = 44. A number or a string will be implicitly converted to the type the operator needs.

The currently most popular virtual machines (the Java VM and Microsoft's CLR) were built for statically typed languages. Parrot, since it will target Perl along with other dynamically typed languages, must have a way to handle implicit type conversion.

Parrot's solution is the PMC -- Parrot Magic Cookie. What can we do with PMCs?


.sub main :main
    $P0 = new Integer
    $P0 = 2
    printout($P0)
    inc $P0
    printout($P0)
    $P0 = "0"
    printout($P0)
    $P1 = new Float
    $P1 = 3.0
    $P0 = $P1
    printout($P0)
    $P0 = $P0 / 2.0
    printout($P0)
    $P0 = "2"
    printout($P0)
    $P0 = $P0 + 2
    printout($P0)
    $P0 = $P0 . "2"
    printout($P0)
    $P0 = $P0 / 8
    printout($P0)
    $P1 = "String"
    $P0 = $P0 + $P1
    printout($P0)
.end

.sub printout
    .param pmc pmcvalue
    .local string pmctype
    pmctype = typeof pmcvalue
    print pmcvalue
    print "\t"
    print pmctype
    print "\n"
.end


$P0 is a PMC register. The value in $P0 can be of any type, and can change its type dynamically based on the current context. The output of this script:


2       Integer
3       Integer
0       String
3       Float
1.5     Float
2       String
4       String
42      String
5.25    String
0       String


The biggest surprise to me was that after the $P0 = "2" assignment, the typeof $P0 remained a string even after arithmetic operations, yet those arithmetic operations accurately updated the value of $P0. The only way to change the PMC type was to explicitly assign a different type. Less of a surprise was that the attempt to add a string to the float value stored in $P0 yielded a value of 0.
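One way to picture a value that stays a string while arithmetic still works is to store it as text and convert it on every operation. The following is a toy model in C under that assumption. It is not how Parrot implements PMCs, and ToyString and the toy_* functions are invented for illustration. It makes no attempt to reproduce the final line of the output above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { TOY_TEXT_MAX = 32 };

/* Toy stand-in for a string-typed value; NOT Parrot's real PMC layout. */
typedef struct {
    char text[TOY_TEXT_MAX];
} ToyString;

void toy_set(ToyString *s, const char *text)
{
    snprintf(s->text, sizeof s->text, "%s", text);
}

/* Arithmetic: read the text as a number, compute, store the result as text. */
void toy_add(ToyString *s, double n)
{
    double v = strtod(s->text, NULL) + n;
    snprintf(s->text, sizeof s->text, "%g", v);
}

void toy_div(ToyString *s, double n)
{
    assert(n != 0.0);
    double v = strtod(s->text, NULL) / n;
    snprintf(s->text, sizeof s->text, "%g", v);
}

/* Concatenation: append the suffix to the stored text. */
void toy_concat(ToyString *s, const char *suffix)
{
    size_t used = strlen(s->text);
    snprintf(s->text + used, sizeof s->text - used, "%s", suffix);
}
```

Starting from "2", calling toy_add with 2, toy_concat with "2", and toy_div with 8 leaves the text as 4, then 42, then 5.25, which matches the corresponding three lines of the PIR output.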

PMCs are useful for much more, however, than mere dynamic typing. I'll cover some other uses in later posts.

Tuesday, July 1, 2008

grokking pir

If you're writing a compiler for the Parrot Virtual Machine, you'll need to understand Parrot Intermediate Representation (PIR). I've been slogging through the PIR documentation, and am slowly beginning to understand how it works.

PIR is not Parrot's bytecode, and it's not Parrot's assembly language. Those are PBC and PASM, respectively. PIR resembles PASM, but has some additional syntactic sugar that makes a lot of things easier. With PASM, for example, you have to assign variables to registers manually. PIR has "temporary registers" that let you assign a register type and let Parrot take it from there.

The following examples are all taken from somewhere within the Parrot documentation (although I can't locate the first one at the moment).



.sub _ :main
    $P0 = new .Random
    $N0 = $P0
    print $N0
    print "\n"
    $N0 = $P0
    print $N0
    print "\n"
.end



This short code snippet generates random numbers. (Well, not true random numbers; I get the same two values every time I run it.) Notice the $P0 and $N0: these represent a PMC register and a numeric (floating point) register, respectively. Rather than assigning these directly to registers P0 and N0, we can use register aliases. This will be very helpful in larger programs where it might otherwise be difficult to keep track of what's available.

Here's another snippet:



.sub double
    .param int arg
    arg *= 2
    .return(arg)
.end

.sub main :main
    .local int result
    result = double(42)
    print result
    print " was returned\n"
.end



Note the :main directive in .sub main. It is the :main directive, not the name of the sub, that tells Parrot where to start. If :main does not appear, by default the first sub will be the entry point.

Note, too, the .local int result directive in main. This is not a local variable; Parrot will assign an integer register for the result. The .param directive in the subroutine tells Parrot to expect a parameter of type int.

One more example for now:


.sub almost
    .local num almost_pi
    almost_pi = 22/7
    print almost_pi
    print "\n"
.end



I've modified this from the Parrot documentation code. almost_pi is assigned to a num (float) register and given a value of 22/7. Although 22 and 7 are both integers, the result is a float because of the register type. If almost_pi had been an int register, the resulting value would have been 3.
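For comparison, C picks the kind of division from the operand types, not from the type of the variable that receives the result. A small sketch, with ratio_int and ratio_num as made-up names:

```c
#include <assert.h>

/* Integer division: both operands are int, so the fraction is discarded. */
int ratio_int(int a, int b)
{
    assert(b != 0);
    return a / b;
}

/* Floating-point division: convert one operand to double before dividing. */
double ratio_num(int a, int b)
{
    assert(b != 0);
    return (double)a / b;
}
```

Here ratio_int(22, 7) is 3, while ratio_num(22, 7) is roughly 3.142857. Writing double x = 22/7; in C would still store 3.0, because the division happens on two ints before the assignment.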

Knowing PIR is essential to designing a compiler for Parrot, because all the other languages targeting Parrot are compiled to PIR. The source code for all PBC is PIR, or to put it another way, from Parrot's perspective, there is only one source language. This means that any language targeting Parrot could theoretically have access to any code written in any other language targeting Parrot, as long as they compile to compatible PIR.
