Thursday, July 30, 2009

Parrot documentation

It's been a while since I last blogged about Parrot and PIR. In the meantime, Parrot documentation has been coalescing nicely. The Parrot docs page even includes chapter-by-chapter links to drafts of a Parrot book and a PIR book. It's nice to see how much improvement has been made in Parrot documentation.

I made my first contribution to the documentation last weekend, a trivial correction of a typo. I hope to be able to make more substantial contributions in the future.

Wednesday, December 3, 2008

introducing SPTL

There seems to be a long period of initial obscurity for any new language. Then after that comes a long period of semi-obscurity, followed by total obscurity.

Paul Bissex, quoted by Steve Yegge in The Next Big Language



So I've been working on the grammar for my language to target the Parrot Virtual Machine. As it turns out, creating a consistent context-free grammar is a lot harder than it may seem. Or maybe I've got a warped sense of what seems easy and what seems hard.

At any rate, the Simple Parrot Test Language (SPTL) is a small Perl-like language. In fact, at a glance it might appear to be a subset of Perl. However, some elements have different meanings in SPTL than in Perl. I'm going to spend the next few posts going over the details and explaining my design decisions. For now, the full BNF grammar can be found here (text file).

Tuesday, August 12, 2008

PIR file i/o

File I/O in PIR is simple and straightforward, much as in a high-level language. Here's a short code snippet that opens an existing file, reads its first line, and displays it.



.sub main
    .local string readfile
    .local string data
    .local pmc fd

    readfile = 'read.txt'
    print 'Opening: '
    say readfile

    # open for reading; readline fetches a single line, not the whole file
    fd = open readfile, '<'
    data = readline fd
    close fd
    say data
    say "File closed.\n"

.end



And here's one to read input from stdin, line by line. A blank line marks the end of input. As the lines are received, they are written to an output file. The final blank line is not written.



.sub main
    .local string writefile
    .local string data
    .local pmc fd

    writefile = 'write.txt'
    # open for writing
    fd = open writefile, '>'
    say "Enter text to write. Blank line to end."
  loop:
    # read up to 80 bytes from stdin
    data = read 80
    # a lone newline is the blank line that ends input; it is not written
    if data == "\n" goto lastline
    print fd, data
    goto loop
  lastline:
    print "Writing to "
    say writefile
    close fd
    say "File closed."

.end



I'm a little puzzled about the read command. The Parrot IO API shows the following forms to be valid:

read(out STR, in INT): Read up to N bytes from standard input stream
read(out STR, invar PMC, in INT): Read up to N bytes from IO PMC stream.
readline(out STR, invar PMC): Read a line up to EOL from filehandle $2. This switches the filehandle to linebuffer-mode.

It seems to me that there ought to be a generic read(out STR) that reads until an EOF or a \n. Maybe there is a value (zero? minus one?) that causes read to behave this way. I'll have to do some more digging until I understand it better.
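
For comparison, here is how the same distinction looks in Python, whose file objects offer both a size-limited read and a line-oriented read. This is only an analogy, offered as a minimal sketch: the helper names read_up_to and read_line are made up for illustration, and nothing here says how Parrot's read opcode treats a zero or negative length.

```python
import io


def read_up_to(stream, n):
    """Return at most n characters from the stream (a size-limited read)."""
    return stream.read(n)


def read_line(stream):
    """Return text up to and including the next newline, or up to EOF."""
    return stream.readline()


if __name__ == "__main__":
    # An in-memory stream stands in for a file or stdin.
    stream = io.StringIO("first line\nsecond line\n")
    print(repr(read_up_to(stream, 5)))  # stops after 5 characters
    print(repr(read_line(stream)))      # continues to the end of that line
    print(repr(read_line(stream)))      # the next full line
    print(repr(read_line(stream)))      # an empty string signals EOF
```

The line-oriented call is the behavior I was hoping to find in a plain read: it stops at a newline or at end of input, whichever comes first, without needing a byte count.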

Monday, July 7, 2008

Parrot magic cookies

In a statically typed language, all variables are given a type when they are created. Take these declarations in C:


int i = 4;
char c = '4';


The variables i and c are not equal, and can't interact with each other. You can't add i and c to get 8, and you can't concatenate i and c to get 44, not without some sort of explicit conversion. (Well, technically you can add or concatenate them, but you'll be surprised by the results.)


int add = i + c;


The value of add is not 8, but 56. When a char variable is used as an integer, it is the ASCII value (in this case, 52 for the character '4') that is used.
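
The arithmetic can be checked outside C as well. The following Python sketch reproduces it with hypothetical helper names (add_like_c, add_digits, and concat_digits are invented here, not taken from any library): ord gives the character code that C uses implicitly, while int and str are the explicit conversions needed to get 8 or 44.

```python
def add_like_c(i, c):
    """Add an int and a one-character string using the character's code."""
    return i + ord(c)


def add_digits(i, c):
    """Explicitly convert the character to the digit it shows, then add."""
    return i + int(c)


def concat_digits(i, c):
    """Explicitly convert the int to text, then concatenate."""
    return str(i) + c


if __name__ == "__main__":
    i = 4
    c = "4"
    print(add_like_c(i, c))     # 4 plus the character code of '4'
    print(add_digits(i, c))     # 4 plus the digit 4
    print(concat_digits(i, c))  # the text "4" followed by "4"
```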

In a dynamically typed language, that's not the case. A variable is not bound to a single type; its value is converted as needed at run time. For example, in Perl:


$i = 4;
$c = '4';

$add = $i + $c;
$cat = $i . $c;


In this case, $add = 8 and $cat = 44. An integer or a character will be implicitly converted to the type the operator needs.
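
A rough way to picture operator-driven conversion is sketched below in Python. It is a simplification under stated assumptions: the names to_number, perl_add, and perl_concat are invented for this example, and to_number treats any string that does not parse as a number as 0, whereas Perl itself uses the leading numeric part of the string.

```python
def to_number(value):
    """Convert a value to a number; unparseable strings become 0 (simplified)."""
    if isinstance(value, (int, float)):
        return value
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return 0


def perl_add(a, b):
    """Model of '+': both operands are treated as numbers."""
    return to_number(a) + to_number(b)


def perl_concat(a, b):
    """Model of '.': both operands are treated as strings."""
    return str(a) + str(b)


if __name__ == "__main__":
    print(perl_add(4, "4"))     # numeric context
    print(perl_concat(4, "4"))  # string context
```

The point of the sketch is that the operator, not the variable, picks the type: the same two operands give a number under one operator and a string under the other.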

The currently most popular virtual machines (the Java VM and Microsoft's CLR) were built for statically typed languages. Parrot, since it will target Perl along with other dynamically typed languages, must have a way to handle implicit type conversion.

Parrot's solution is the PMC -- Parrot Magic Cookie. What can we do with PMCs?


.sub main :main
    $P0 = new Integer
    $P0 = 2
    printout($P0)
    inc $P0
    printout($P0)
    $P0 = "0"
    printout($P0)
    $P1 = new Float
    $P1 = 3.0
    $P0 = $P1
    printout($P0)
    $P0 = $P0 / 2.0
    printout($P0)
    $P0 = "2"
    printout($P0)
    $P0 = $P0 + 2
    printout($P0)
    $P0 = $P0 . "2"
    printout($P0)
    $P0 = $P0 / 8
    printout($P0)
    $P1 = "String"
    $P0 = $P0 + $P1
    printout($P0)
.end

.sub printout
    .param pmc pmcvalue
    .local string pmctype
    pmctype = typeof pmcvalue
    print pmcvalue
    print "\t"
    print pmctype
    print "\n"
.end


$P0 is a PMC register. The value in $P0 can be of any type, and can change its type dynamically based on the current context. The output of this script:


2       Integer
3       Integer
0       String
3       Float
1.5     Float
2       String
4       String
42      String
5.25    String
0       String


The biggest surprise to me was that after the $P0 = "2" assignment, the typeof $P0 remained a string even after arithmetic operations, yet those arithmetic operations accurately updated the value of $P0. The only way to change the PMC type was to explicitly assign a different type. Less of a surprise was that the attempt to add a string to the float value stored in $P0 yielded a value of 0.
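
One way to picture the String rows of that output is a container that always stores text but does its arithmetic numerically. The Python class below is a toy model only: StringCell and its methods are hypothetical names, it is not how Parrot implements its String PMC, and it does not try to reproduce the final row, where adding "String" produced 0.

```python
class StringCell:
    """Toy container: stores text, computes numerically, stores text again."""

    def __init__(self, text):
        self.text = str(text)

    def type_name(self):
        # The reported type never changes, whatever the operation.
        return "String"

    def _as_number(self):
        # Assumption for this sketch: text that is not numeric counts as 0.
        try:
            return float(self.text)
        except ValueError:
            return 0.0

    def _store(self, number):
        # Whole numbers are written without a trailing ".0".
        if number == int(number):
            self.text = str(int(number))
        else:
            self.text = str(number)

    def add(self, n):
        self._store(self._as_number() + n)

    def divide(self, n):
        self._store(self._as_number() / n)

    def concat(self, s):
        self.text = self.text + str(s)


if __name__ == "__main__":
    cell = StringCell("2")
    for step in (lambda: cell.add(2), lambda: cell.concat("2"), lambda: cell.divide(8)):
        step()
        print(cell.text, cell.type_name(), sep="\t")
```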

PMCs are useful for much more, however, than mere dynamic typing. I'll cover some other uses in later posts.

Tuesday, July 1, 2008

grokking pir

If you're writing a compiler for the Parrot Virtual Machine, you'll need to understand Parrot Intermediate Representation (PIR). I've been slogging through the PIR documentation, and am slowly beginning to understand how it works.

PIR is not Parrot's bytecode, and it's not Parrot's assembly language. Those are PBC and PASM, respectively. PIR resembles PASM, but has some additional syntactic sugar that makes a lot of things easier. With PASM, for example, you have to assign variables to registers manually. PIR has "temporary registers" that let you assign a register type and let Parrot take it from there.

The following examples are all taken from somewhere within the Parrot documentation (although I can't locate the first one at the moment).



.sub _ :main
    $P0 = new .Random
    $N0 = $P0
    print $N0
    print "\n"
    $N0 = $P0
    print $N0
    print "\n"
.end



This short code snippet generates random numbers. (Well, not true random numbers; I get the same two values every time I run it.) Notice the $P0 and $N0: these represent a PMC register and a numeric (floating-point) register, respectively. Rather than assigning these directly to registers P0 and N0, we can use register aliases. This will be very helpful in larger programs where it might otherwise be difficult to keep track of what's available.

Here's another snippet:



.sub double
    .param int arg
    arg *= 2
    .return(arg)
.end

.sub main :main
    .local int result
    result = double(42)
    print result
    print " was returned\n"
.end



Note the :main directive in .sub main. It is the :main directive, not the name of the sub, that tells Parrot where to start. If :main does not appear, by default the first sub will be the entry point.

Note, too, the .local int result directive in main. This is not a local variable; Parrot will assign an integer register for the result. The .param directive in the subroutine tells Parrot to expect a parameter of type int.

One more example for now:


.sub almost
    .local num almost_pi
    almost_pi = 22/7
    print almost_pi
    print "\n"
.end



I've modified this from the Parrot documentation code. almost_pi is assigned to a num (float) register and given a value of 22/7. Although 22 and 7 are both integers, the result is a float because of the register type. If almost_pi had been an int register, the resulting value would have been 3.
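
The difference between the two register types comes down to whether the fractional part of the quotient is kept. Here is a minimal Python sketch of the same arithmetic, with variable names chosen for illustration: the / operator keeps the fraction, while int() truncates it the way storing the result as an integer would.

```python
# Keeping the fraction, as a floating-point destination would.
as_num = 22 / 7

# Dropping the fraction, as an integer destination would.
as_int = int(22 / 7)

if __name__ == "__main__":
    print(as_num)  # roughly 3.142857
    print(as_int)
```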

Knowing PIR is essential to designing a compiler for Parrot, because all the other languages targeting Parrot are compiled to PIR. The source code for all PBC is PIR, or to put it another way, from Parrot's perspective, there is only one source language. This means that any language targeting Parrot could theoretically have access to any code written in any other language targeting Parrot, as long as they compile to compatible PIR.
