Tuesday, August 12, 2008

pir file i/o

File I/O in PIR is simple and straightforward, much as in a high-level language. Here's a short code snippet that opens an existing file, reads its first line, and displays it.



.sub main
.local string readfile
.local string data
.local pmc fd

readfile = 'read.txt'
print 'Opening: '
say readfile

fd = open readfile, '<'
data = readline fd
close fd
say data
say "File closed.\n"

.end



And here's one to read input from stdin, line by line. A blank line marks the end of input. As the lines are received, they are written to an output file. The final blank line is not written.



.sub main
.local string writefile
.local string data
.local pmc fd

writefile = 'write.txt'
fd = open writefile, '>'
say "Enter text to write. Blank line to end."
loop:
data = read 80
if data == "\n" goto lastline
print fd, data
goto loop
lastline:
print "Writing to "
say writefile
close fd
say "File closed."

.end



I'm a little puzzled about the read command. The Parrot IO API shows the following forms to be valid:

read(out STR, in INT): Read up to N bytes from standard input stream
read(out STR, invar PMC, in INT): Read up to N bytes from IO PMC stream.
readline(out STR, invar PMC): Read a line up to EOL from filehandle $2. This switches the filehandle to linebuffer-mode.

It seems to me that there ought to be a generic read(out STR) that reads until an EOF or a \n. Maybe there is a value (zero? minus one?) that causes read to behave this way. I'll have to do some more digging until I understand it better.
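For comparison, here is roughly the behavior I was hoping for, sketched in C rather than PIR. It reads a line at a time up to the newline and stops at a blank line or EOF. The function name copy_until_blank and the 256-byte buffer are my own choices for illustration, not anything from the Parrot API.

```c
#include <stdio.h>
#include <string.h>

/* Copies lines from in to out until a blank line or end of input.
   The blank line itself is not written. Returns the number of
   complete lines copied. */
int copy_until_blank(FILE *in, FILE *out)
{
    char buf[256];
    int lines = 0;
    int at_line_start = 1;  /* true when the next chunk begins a new line */

    while (fgets(buf, sizeof buf, in) != NULL) {
        /* A lone newline only counts as a blank line at the start of a line,
           not when it is the tail of a line longer than the buffer. */
        if (at_line_start && strcmp(buf, "\n") == 0)
            break;
        fputs(buf, out);
        size_t len = strlen(buf);
        at_line_start = (len > 0 && buf[len - 1] == '\n');
        if (at_line_start)
            lines++;
    }
    return lines;
}
```

Calling copy_until_blank(stdin, fd), with fd opened by fopen("write.txt", "w"), would mirror the PIR loop above.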


Monday, July 7, 2008

parrot magic cookies

In a statically typed language, all variables are given a type when they are created. Take these declarations in C:


int i = 4;
char c = '4';


The variables i and c are not equal, and can't interact with each other. You can't add i and c to get 8, and you can't concatenate i and c to get 44, not without some sort of explicit conversion. (Well, technically you can add or concatenate them, but you'll be surprised by the results.)


int add = i + c;


The value of add is not 8, but 56. When a char variable is used as an integer, it is its ASCII value (in this case, 52 for the character '4') that is used.
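To make the arithmetic concrete, here is a small C sketch. The function names add_raw, add_digit, and show_sums are made up for illustration, and add_digit assumes the char holds a decimal digit.

```c
#include <stdio.h>

/* Adds the int to the char's character code, which is what i + c does. */
int add_raw(int i, char c)
{
    return i + c;
}

/* Converts the digit character to its numeric value first.
   Assumes c is one of '0' through '9'. */
int add_digit(int i, char c)
{
    return i + (c - '0');
}

/* Prints both results for the values used in this post. */
void show_sums(void)
{
    printf("raw: %d, converted: %d\n", add_raw(4, '4'), add_digit(4, '4'));
}
```

On an ASCII system, calling show_sums() from main prints "raw: 56, converted: 8".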

In a dynamically typed language, that's not the case. Types are inferred. For example, in Perl:


$i = 4;
$c = '4';

$add = $i + $c;
$cat = $i . $c;


In this case, $add = 8 and $cat = 44. An integer or a character will be implicitly converted to the type the operator needs.
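For contrast, getting the same "44" in C means spelling out the conversion yourself. This is only a sketch; the function name cat_int_char is hypothetical, and it relies on the standard snprintf to format the integer and the character into one buffer.

```c
#include <stdio.h>

/* Writes the decimal digits of i followed by the character c into buf.
   Returns 0 on success, -1 if buf is too small to hold the result. */
int cat_int_char(char *buf, size_t size, int i, char c)
{
    int n = snprintf(buf, size, "%d%c", i, c);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With a buffer of a few bytes, cat_int_char(buf, sizeof buf, 4, '4') leaves the string "44" in buf. Perl does the equivalent behind the scenes whenever the . operator sees a number.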

The most popular virtual machines at the moment (the Java VM and Microsoft's CLR) were built for statically typed languages. Parrot, since it will target Perl along with other dynamically typed languages, must have a way to handle implicit type conversion.

Parrot's solution is the PMC -- Parrot Magic Cookie. What can we do with PMCs?


.sub main :main
    $P0 = new Integer
    $P0 = 2
    printout($P0)
    inc $P0
    printout($P0)
    $P0 = "0"
    printout($P0)
    $P1 = new Float
    $P1 = 3.0
    $P0 = $P1
    printout($P0)
    $P0 = $P0 / 2.0
    printout($P0)
    $P0 = "2"
    printout($P0)
    $P0 = $P0 + 2
    printout($P0)
    $P0 = $P0 . "2"
    printout($P0)
    $P0 = $P0 / 8
    printout($P0)
    $P1 = "String"
    $P0 = $P0 + $P1
    printout($P0)
.end

.sub printout
    .param pmc pmcvalue
    .local string pmctype
    pmctype = typeof pmcvalue
    print pmcvalue
    print "\t"
    print pmctype
    print "\n"
.end


$P0 is a PMC register. The value in $P0 can be of any type, and can change its type dynamically based on the current context. The output of this script:


2       Integer
3       Integer
0       String
3       Float
1.5     Float
2       String
4       String
42      String
5.25    String
0       String


The biggest surprise to me was that after the $P0 = "2" assignment, the typeof $P0 remained a string even after arithmetic operations, yet those arithmetic operations accurately updated the value of $P0. The only way to change the PMC type was to explicitly assign a different type. Less of a surprise was that the attempt to add a string to the float value stored in $P0 yielded a value of 0.

PMCs are useful for much more, however, than mere dynamic typing. I'll cover some other uses in later posts.


Tuesday, July 1, 2008

grokking pir

If you're writing a compiler for the Parrot Virtual Machine, you'll need to understand Parrot Intermediate Representation (PIR). I've been slogging through the PIR documentation, and am slowly beginning to understand how it works.

PIR is not Parrot's bytecode, and it's not Parrot's assembly language. Those are PBC and PASM, respectively. PIR resembles PASM, but has some additional syntactic sugar that makes a lot of things easier. With PASM, for example, you have to assign variables to registers manually. PIR has "temporary registers" that let you assign a register type and let Parrot take it from there.

The following examples are all taken from somewhere within the Parrot documentation (although I can't locate the first one at the moment).



.sub _ :main
$P0 = new .Random
$N0 = $P0
print $N0
print "\n"
$N0 = $P0
print $N0
print "\n"
.end



This short code snippet generates random numbers. (Well, not true random numbers; I get the same two values every time I run it.) Notice the $P0 and $N0: These represent a PMC register and a numeric (floating point) register, respectively. Rather than assigning these directly to registers P0 and N0, we can use register aliases. This will be very helpful in larger programs where it might otherwise be difficult to keep track of what's available.

Here's another snippet:



.sub double
.param int arg
arg *= 2
.return(arg)
.end

.sub main :main
.local int result
result = double(42)
print result
print " was returned\n"
.end



Note the :main directive in .sub main. It is the :main directive, not the name of the sub, that tells Parrot where to start. If :main does not appear, by default the first sub will be the entry point.

Note, too, the .local int result directive in main. This is not a local variable in the usual sense; it is a name for an integer register that Parrot assigns to hold the result. The .param directive in the subroutine tells Parrot to expect a parameter of type int.

One more example for now:


.sub almost
.local num almost_pi
almost_pi = 22/7
print almost_pi
print "\n"
.end



I've modified this from the Parrot documentation code. almost_pi is assigned to a num (float) register and given a value of 22/7. Although 22 and 7 are both integers, the result is a float because of the register type. If almost_pi had been an int register, the resulting value would have been 3.
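C, for comparison, picks the kind of division from the operands rather than from the variable being assigned. This is a C sketch of my own, not PIR, and the function names are made up for illustration.

```c
/* Integer division: both operands are ints, so the fraction is
   discarded before the result is converted to double. */
double almost_pi_int(void)
{
    return 22 / 7;
}

/* Floating-point division: one double operand makes the whole
   division happen in floating point. */
double almost_pi_float(void)
{
    return 22.0 / 7;
}
```

Here almost_pi_int() returns exactly 3.0 even though its return type is double, while almost_pi_float() returns roughly 3.142857.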

Knowing PIR is essential to designing a compiler for Parrot, because all the other languages targeting Parrot are compiled to PIR. The source code for all PBC is PIR, or to put it another way, from Parrot's perspective, there is only one source language. This means that any language targeting Parrot could theoretically have access to any code written in any other language targeting Parrot, as long as they compile to compatible PIR.


Monday, June 16, 2008

what you might see here

I plan to use this blog to post my thoughts about my current non-work-related projects, whatever they may be at the time. Currently, I'm interested in the following:


  • The Parrot Virtual Machine. I'm creating a small language I call Simple Parrot Test Language (SPTL), and writing a compiler to translate SPTL to Parrot's PIR code.

  • The jQuery JavaScript Library. I've just started looking into this, and it might be a little while before I post about it.

  • The new SearchMonkey API from Yahoo! I don't know what all can be done with it, but I'd like to try it out.

  • A project I hope to start this summer, tentatively called WishList. More information on this later.



I also plan to post occasionally on my non-computer (but somewhat related) interests such as chess, logic puzzles, and game theory. Posts that are more opiniony will appear on my other blog, it seems to me...


Sunday, June 15, 2008

what's in a name?

It's not every day that someone, upon starting a new blog, decides to name said blog considered harmful, even if the blog's subject is computer programming, where this phrase has been part of the lexicon since the publication of Edsger Dijkstra's seminal 1968 paper Go To Statement Considered Harmful (pdf) in Communications of the ACM.

Considered harmful has become such an overused phrase in computer science that some have opined that the phrase itself should be considered harmful.

So why would I use it for the name of my blog?

It's not because I'm an expert programmer who plans to use this blog to analyze others' code and point out their failures and inefficiencies. If I want to do that, I've got plenty of my own bad code to analyze.

No, the reason I'm calling this blog considered harmful is that I think blogs are a poor medium for dispensing expert advice. First, the format lends itself neither to a comprehensive overview nor to a detailed examination of any subject. Second, anybody can create a blog with minimal effort and dispense bad advice. Third, blogging often generates more heat than light; in fact, the more controversial the blog, the more readers it attracts. Conversely, the more popular a blog is, the more controversy it attracts.

Alastair Rankine recently offered a stinging critique of Jeff Atwood's popular Coding Horror blog. Among other things, Rankine does not think Atwood is the expert that the tone of some of Atwood's posts seems to convey. Atwood, in his response, admits no expertise: "I've always thought of myself as nothing more than a rank amateur seeking enlightenment." (emphasis in original)

The reality is, we are all rank amateurs. Computer science is still in its infancy. Every few years a new language, a new framework, a new methodology appears, promising to make programmers more productive. Some of this technology has delivered only a fraction of this promise; most has fallen by the wayside, leaving nothing but tons of legacy code to maintain. In the end, for all the advances in computer science over the last half century, none of us really knows anything.

So with all that in mind, why even start a programming blog? Simply because experience tells me that typing my thoughts helps me to organize them, and that gives me a better understanding of what I'm doing. Like Jeff Atwood, I'm seeking enlightenment. If I post a tutorial, it's mostly for my own use, though others might find it helpful as well. I may post work in progress, code that was hacked together for a small task that slowly accreted more features and is now in desperate need of refactoring. It's always interesting to see what vastly different approaches people will take toward improving kludgy code.

Furthermore, my experience with my theology blog (another area where nobody really knows anything) has taught me that feedback from others enhances the learning experience. Commenters might provide insight that I may never have found on my own, or they might provide the spark that sends my thoughts down a new alley to explore.

So, welcome to considered harmful. Maybe we can learn something together.
