Published by Rosalind Robertson. Modified over 9 years ago.
1
Perl 6 Internals
Dan Sugalski
TPC 5.0
“Here there be dragons”
2
The big goals of perl 6's internals: Speed, Extendibility, Cleanliness, Compatibility, Modularity, Thread Safety, Flexibility
3
Some global decisions
The core will be in C. (Like it or not, it's appropriate for code at this level)
The core must be modular, so pieces can be swapped out without rebuilding
It must be fast
Long-term binary compatibility is a must
Your average perl coder or extension writer shouldn't need any info about the guts
Things should generally be thought out, documented, and engineered
4
The quick overview: Parser, Compiler, Optimizer, Runtime engine
5
[Diagram: pipeline of Parser, Compiler, Optimizer, Interpreter; data labels: Syntax Tree, Unoptimized Bytecode, Optimized Bytecode, Precompiled Bytecode, Fully-laden Interpreter]
6
The parser
Where the whole thing starts
Generally takes source of some sort and turns it into a syntax tree
7
The Bytecode Compiler
Turns a syntax tree into bytecode
Performs some simple optimization
8
The optimizer
Takes the plain bytecode from the compiler and abuses it heavily
An optional step, generally skipped for compile-and-go execution
Should be able to work on small parts of a program for JIT optimization
9
The Interpreter
Takes compiled (and possibly optimized) bytecode and does something with it
Generally that something is execute, but it might also be:
    Save to disk
    Translate to another format (.NET, Java bytecode)
    Compile to machine code
10
The Parser “Double, double, toil and trouble Fire burn, and cauldron bubble”
11
Parser goals
Extendible in perl
More powerful than what we have now
Retargetable
Self-contained and removable
12
Parsing perl isn't easy
May well be one of the toughest languages to properly parse
If we get perl right other languages are easy. Or at least easier
We have the full power of perl to draw on to do the parsing (including the regex engine and Damian's Bizarre Idea du Jour)
13
The parser will be in C
We will be using C for the parser
A full set of callbacks will be available to hook into the parser in lots of places
Adding new parsing rules (probably with regexes describing them) will be easy
The parser will be extendable via perl code
14
The Compiler “Mmmmm, tasty!”
15
From syntax tree to bytecode
The compiler takes a syntax tree and turns it into bytecode
Very little optimization is done here. Optimization is expensive and optional
Pretty straightforward; this isn't rocket science
16
The Optimizer “We can rebuild it. Make it better, faster, stronger”
17
The Optimizer
Takes plain bytecode and makes it faster
Does all the sorts of things that you expect an optimizer to do: code motion, loop unrolling, common subexpression work, etc.
Will be an iterative process
This will be interesting, as perl's a pain to optimize
An optional step, of course
18
Things that make optimizing perl tough
Active data
Runtime redefinitions of everything
Really, really late binding (Waiting for Godot late)
Perl programmers are used to more predictable runtime characteristics than, say, C programmers.
19
The Interpreter “Polly want a cracker?”
20
Interpreter goals
Fast
Tuned for perl
Language neutral where possible
Event capable
Sandboxable
Asynchronous I/O built in
Built with an eye towards TIL and/or native code compilation
Better debugging support than perl 5
21
The perl 6 interpreter is a software CPU
Complete with registers and an assembly language
This can make translating perl 6 bytecode into native machine code easier
There's a lot of literature on building optimizing compilers that can be leveraged
While more complex than a pure stack-based machine, it's also faster
Opcode dispatch needs to be faster than perl 5
Opcode functions can be written in perl
22
CPU specs
64 int, float, string, and PMC registers
A segmented multiple stack architecture
Interrupt-capable (for events)
Pretty much completely position independent: everything is referenced via register, pad entry, or name
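The "software CPU" idea can be illustrated with a toy register machine. This is only a sketch under stated assumptions, not the Perl 6 opcode set: the opcode names (`OP_SET`, `OP_ADD`, `OP_MUL`, `OP_END`), the operand encoding, and the functions `reg_at`, `run`, and `demo_register_vm` are all hypothetical, and only the integer register file is modeled.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 64  /* register count per type, as on the slide; only ints are modeled */

/* Hypothetical opcodes; the slides do not list the real ones. */
enum { OP_END, OP_SET, OP_ADD, OP_MUL };

typedef struct {
    long int_reg[NUM_REGS];
} Interp;

/* Returns a pointer to integer register idx, checking the index. */
long *reg_at(Interp *in, long idx)
{
    assert(idx >= 0 && idx < NUM_REGS);
    return &in->int_reg[idx];
}

/* Executes a flat stream of words: an opcode followed by its operands.
 *   OP_SET dst, constant
 *   OP_ADD dst, src1, src2
 *   OP_MUL dst, src1, src2
 *   OP_END
 */
void run(Interp *in, const long *pc)
{
    for (;;) {
        switch (*pc) {
        case OP_SET:
            *reg_at(in, pc[1]) = pc[2];
            pc += 3;
            break;
        case OP_ADD:
            *reg_at(in, pc[1]) = *reg_at(in, pc[2]) + *reg_at(in, pc[3]);
            pc += 4;
            break;
        case OP_MUL:
            *reg_at(in, pc[1]) = *reg_at(in, pc[2]) * *reg_at(in, pc[3]);
            pc += 4;
            break;
        case OP_END:
        default:            /* unknown opcode: stop rather than guess */
            return;
        }
    }
}

/* Computes (2 + 3) * 4 entirely in registers and returns register 2. */
long demo_register_vm(void)
{
    static const long program[] = {
        OP_SET, 0, 2,
        OP_SET, 1, 3,
        OP_ADD, 2, 0, 1,
        OP_SET, 3, 4,
        OP_MUL, 2, 2, 3,
        OP_END
    };
    Interp in = {{0}};
    run(&in, program);
    return in.int_reg[2];
}
```

Operands name registers directly, so an add is one dispatch instead of the push, push, add sequence a pure stack machine would need. That is one plausible reading of the "faster than a stack-based machine" claim.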
23
The regex engine
The regex engine is going to be part of the perl 6 CPU, not separate as it is now
A good incentive to get opcode dispatch fast
Makes expanding the regex engine a bit easier
Details will be hidden as a set of regex opcodes
24
A few words on the stack system
Each register file has an associated stack
All registers of a particular type can be pushed onto or popped off the stack in one go
Individual registers or groups of registers can be pushed or popped
The stacks are all segmented so we're not relying on finding contiguous chunks of memory for them
There's also a set of call and scratch stacks
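A segmented register stack might look roughly like the following. This is a sketch, assuming a linked list of fixed-size segments; the names `Segment`, `RegStack`, `regstack_push`, `regstack_pop`, and the tiny `FRAMES_PER_SEGMENT` value are hypothetical and chosen so the demo crosses a segment boundary.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_REGS 64
#define FRAMES_PER_SEGMENT 4   /* hypothetical; kept small for the demo */

typedef struct { long reg[NUM_REGS]; } IntRegFile;

/* One stack segment; segments are chained, so no large contiguous block is needed. */
typedef struct Segment {
    struct Segment *prev;
    int used;
    IntRegFile frames[FRAMES_PER_SEGMENT];
} Segment;

typedef struct { Segment *top; } RegStack;

/* Pushes the whole register file in one go. Returns 0 on success, -1 if out of memory. */
int regstack_push(RegStack *s, const IntRegFile *regs)
{
    assert(s != NULL && regs != NULL);
    if (s->top == NULL || s->top->used == FRAMES_PER_SEGMENT) {
        Segment *seg = malloc(sizeof *seg);
        if (seg == NULL)
            return -1;
        seg->prev = s->top;
        seg->used = 0;
        s->top = seg;
    }
    s->top->frames[s->top->used++] = *regs;
    return 0;
}

/* Pops the most recent register file. Returns 0 on success, -1 if the stack is empty. */
int regstack_pop(RegStack *s, IntRegFile *regs)
{
    assert(s != NULL && regs != NULL);
    if (s->top == NULL)
        return -1;
    *regs = s->top->frames[--s->top->used];
    if (s->top->used == 0) {        /* segment drained: unlink and release it */
        Segment *empty = s->top;
        s->top = empty->prev;
        free(empty);
    }
    return 0;
}

/* Pushes six register files (two segments), then pops them in reverse order. */
long demo_register_stack(void)
{
    RegStack s = { NULL };
    IntRegFile regs = {{0}};
    IntRegFile out;
    long digits = 0;
    int i;

    for (i = 1; i <= 6; i++) {
        regs.reg[0] = i;
        if (regstack_push(&s, &regs) != 0)
            return -1;
    }
    while (regstack_pop(&s, &out) == 0)
        digits = digits * 10 + out.reg[0];
    return digits;
}
```

The demo pushes six frames, so the fifth push allocates a second segment; popping returns them last-in, first-out and frees each segment as it empties.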
25
Bytecode “Could you say that a little differently?”
26
What is bytecode?
A distilled version of a program
Machine language for the PVM
Can contain a lot of 'extra' information, including full source
Designed to be platform independent
Should be mostly mappable as shared data (modulo the fixup sections)
27
Data Structures “Vtables and strings and floats, oh my!”
28
Variables
Fields: Vtable Pointer, Data Pointer, Integer Value, Float Value, Flags, Synchronization, GC Data
Generically called a PMC
Bigger than Perl 5's base data structure
Synchronization data built-in
Same for all variable types
GC data is not part of base structure
29
Scalars
Built off the base PMC structure
Use the integer and float areas as caches
Data pointer points off to string, large int, or large float
Vtable functions determine how it all works
30
Arrays
Built off the base PMC structure
Data pointer points to array data
All perl 6 arrays are typed
May have an array of scalars, strings, integers, or floats
Arrays only take up enough memory to hold their types
31
Hashes
Built off the base PMC structure
Data pointer points to array data
All perl 6 hashes are typed
May have a hash of scalars, strings, integers, or floats
Hashes only take up enough memory to hold their types
Hashing function is overridable
32
Strings
Header fields: Encoding, Type, Buffer Start, Buffer Length, String Length, String Size, Flags, Unused
Strings are sort of abstract
Perl 6 can mix and match string data (Unicode, ASCII, EBCDIC, etc)
New string types can be loaded on the fly
33
String handling
Perl 6 has no 'built-in' string support; all string support is via loadable libraries
There'll be Unicode, ASCII, and EBCDIC support provided (at least) to start
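One way to read "no built-in string support" is that each string header points at a table of encoding-specific functions, so the core never inspects the bytes itself. The sketch below assumes that reading; the `Encoding` and `String` layouts, the `char_count` hook, and the two sample encodings are hypothetical, and only the buffer-length versus string-length distinction from the header fields is modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-encoding function table, standing in for a loadable string library. */
typedef struct Encoding {
    const char *name;
    size_t (*char_count)(const unsigned char *buf, size_t nbytes);
} Encoding;

/* Single-byte encodings (a stand-in for ASCII or EBCDIC): one byte per character. */
size_t single_byte_char_count(const unsigned char *buf, size_t nbytes)
{
    (void)buf;
    return nbytes;
}

/* UTF-8: count every byte that is not a continuation byte (10xxxxxx). */
size_t utf8_char_count(const unsigned char *buf, size_t nbytes)
{
    size_t i, n = 0;
    for (i = 0; i < nbytes; i++)
        if ((buf[i] & 0xC0) != 0x80)
            n++;
    return n;
}

const Encoding single_byte_encoding = { "single-byte", single_byte_char_count };
const Encoding utf8_encoding = { "utf8", utf8_char_count };

/* Simplified string header: encoding pointer, buffer start, buffer length in bytes. */
typedef struct {
    const Encoding *encoding;
    const unsigned char *bufstart;
    size_t buflen;
} String;

/* String length in characters, delegated to the string's own encoding. */
size_t string_length(const String *s)
{
    assert(s != NULL && s->encoding != NULL);
    return s->encoding->char_count(s->bufstart, s->buflen);
}

/* The same five bytes measured under two encodings. */
size_t demo_string_length(int use_utf8)
{
    static const unsigned char cafe[] = { 'c', 'a', 'f', 0xC3, 0xA9 };
    String s;
    s.encoding = use_utf8 ? &utf8_encoding : &single_byte_encoding;
    s.bufstart = cafe;
    s.buflen = sizeof cafe;
    return string_length(&s);
}
```

The five-byte buffer is four characters when treated as UTF-8 and five under a single-byte encoding, which is why the header would carry both a buffer length and a string length.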
34
Numbers
Header fields: Buffer Pointer, Length, Exponent, Flags
Bigints and bigfloats share the same header
Arbitrary-length floating point and integer numbers are supported
Perl automagically upgrades ints and floats when needed
35
Vtables
All variable data access is done through a table of functions that the variable carries around with it
This allows us faster access, since code paths are specialized for just the functions they need to perform
Isolates us from the implementation of variables internally
Allows special purpose behaviour (like perl 5's magic) to be attached without cost to the rest of perl
36
Vtables (cont'd)
Makes thread safety easier
A little bit more overhead because of the extra level of indirection, but the smaller functions make up for that
Vtable functions can be written in perl. (Each class with objects blessed into it will have at least one)
There may be more than one vtable per package
37
Vtables hide data manipulation
Pretty much all the code to handle data manipulation will be done via variable vtables
This allows the variable implementation to change without perl needing to know
Allows far more flexibility in what you can make a variable do
Shortens the code path for data functions and trims out extraneous conditionals
38
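For example: Fetching the string value of a scalar
For scalars with strings:
    /* The string already exists, so just hand it back. */
    String *get_str(PMC *my_PMC) {
        return my_PMC->data_pointer;
    }
For int-only scalars:
    /* Build the string once, cache it, then switch vtables so later
       calls take the short path above. */
    String *get_str(PMC *my_PMC) {
        my_PMC->data_pointer = make_string(my_PMC->integer);
        my_PMC->vtable = int_and_string_vtable;
        return my_PMC->data_pointer;
    }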
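The slide's two `get_str` functions can be fleshed out into something that compiles. This is a sketch, assuming a stripped-down PMC with only a vtable pointer, a data pointer, and an integer value; plain C strings stand in for the `String` type, `snprintf` stands in for `make_string`, and the names `VTable`, `int_only_vtable`, and `demo_vtable_swap` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct PMC PMC;

/* One-entry vtable; the real table would carry many more functions. */
typedef struct {
    const char *(*get_str)(PMC *self);
} VTable;

struct PMC {
    const VTable *vtable;
    char *data_pointer;   /* cached string value, or NULL */
    long integer;         /* integer value */
};

const char *str_get_str(PMC *p);
const char *int_get_str(PMC *p);

const VTable int_and_string_vtable = { str_get_str };
const VTable int_only_vtable = { int_get_str };

/* Scalar that already has a string: no conditionals, just return it. */
const char *str_get_str(PMC *p)
{
    assert(p != NULL);
    return p->data_pointer;
}

/* Int-only scalar: stringify once, cache the result, and swap vtables. */
const char *int_get_str(PMC *p)
{
    char *buf;
    assert(p != NULL);
    buf = malloc(32);
    if (buf == NULL)
        return NULL;
    snprintf(buf, 32, "%ld", p->integer);
    p->data_pointer = buf;
    p->vtable = &int_and_string_vtable;
    return p->data_pointer;
}

/* Returns 1 if the first call converts and swaps, and the second call reuses the cache. */
int demo_vtable_swap(void)
{
    PMC p = { &int_only_vtable, NULL, 42 };
    const char *first = p.vtable->get_str(&p);
    int swapped = (p.vtable == &int_and_string_vtable);
    const char *second = p.vtable->get_str(&p);
    int ok = first != NULL && swapped && first == second
             && strcmp(first, "42") == 0;
    free(p.data_pointer);
    return ok;
}
```

The caller always writes `p.vtable->get_str(&p)`. Whether that converts or merely returns a cached pointer is decided by which table the variable currently carries, which is the "trims out extraneous conditionals" point from the slides.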
39
Memory Management “Now where did I put that?”
40
Getting headers
All the fixed-size things (PMCs, string/number headers) get allocated from arenas
All headers, with the exception of PMCs (maybe) are moveable by the garbage collector
Non-PMC header allocation is very fast
PMC allocation is only mostly fast
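Arena allocation of fixed-size headers is commonly done with a free list threaded through the unused slots. The sketch below shows that general technique, not the actual Perl 6 allocator; `Header`, `Arena`, `HeaderPool`, and `HEADERS_PER_ARENA` are hypothetical names and sizes.

```c
#include <assert.h>
#include <stdlib.h>

#define HEADERS_PER_ARENA 128   /* hypothetical arena size */

/* A fixed-size header; while unused, the slot stores the free-list link. */
typedef union Header {
    union Header *next_free;
    struct { void *buf; size_t len; } str;   /* placeholder payload */
} Header;

typedef struct Arena {
    struct Arena *next;
    Header slots[HEADERS_PER_ARENA];
} Arena;

typedef struct {
    Arena *arenas;       /* every arena ever allocated */
    Header *free_list;   /* unused slots across all arenas */
} HeaderPool;

/* Pops a header off the free list, adding a fresh arena when the list is empty. */
Header *header_alloc(HeaderPool *pool)
{
    Header *h;
    assert(pool != NULL);
    if (pool->free_list == NULL) {
        Arena *a = malloc(sizeof *a);
        size_t i;
        if (a == NULL)
            return NULL;
        a->next = pool->arenas;
        pool->arenas = a;
        for (i = 0; i < HEADERS_PER_ARENA; i++) {
            a->slots[i].next_free = pool->free_list;
            pool->free_list = &a->slots[i];
        }
    }
    h = pool->free_list;
    pool->free_list = h->next_free;
    return h;
}

/* Returns a header to the free list. */
void header_free(HeaderPool *pool, Header *h)
{
    assert(pool != NULL && h != NULL);
    h->next_free = pool->free_list;
    pool->free_list = h;
}

/* Releases every arena. */
void pool_destroy(HeaderPool *pool)
{
    assert(pool != NULL);
    while (pool->arenas != NULL) {
        Arena *next = pool->arenas->next;
        free(pool->arenas);
        pool->arenas = next;
    }
    pool->free_list = NULL;
}

/* Returns 1 if a freed header is handed out again by the next allocation. */
int demo_header_pool(void)
{
    HeaderPool pool = { NULL, NULL };
    Header *a = header_alloc(&pool);
    Header *b;
    int reused;
    if (a == NULL)
        return -1;
    header_free(&pool, a);
    b = header_alloc(&pool);
    reused = (a == b);
    pool_destroy(&pool);
    return reused;
}
```

In the common case an allocation is two pointer operations, which fits the "very fast" description; the extra work that makes PMC allocation "only mostly fast" is not specified in the slides and is not modeled here.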
41
Buffer Management
Anything that isn't a fixed size gets allocated from the buffer pools
All buffered data, with the exception of data allocated in special pools, is moveable by the garbage collector
Because of GC, allocation is very quick
42
Garbage Collection “Bring out yer dead!”
43
The perl 6 GC is a copying collector
Everything except PMCs is moveable in Perl 6
PMCs might be moveable too
We get a compact memory heap out of this, which allows for fast allocation
Perl 6 will release empty memory back to the system when it can
Refcounts are used only to note object lifetimes, not for GC
Refcounts, for the most part, are dead
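The link between a copying collector and fast allocation can be sketched as follows. This is an illustration of the general technique under simplifying assumptions: liveness is supplied as a flag instead of being found by tracing, alignment is ignored, and `Pool`, `BufHeader`, `pool_alloc`, and `pool_compact` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A compact pool: everything below `used` is allocated, everything above is free. */
typedef struct {
    char *base;
    size_t size;
    size_t used;
} Pool;

/* Fixed-size header owning a moveable buffer. `live` would be set by a tracing pass. */
typedef struct {
    char *start;
    size_t len;
    int live;
} BufHeader;

/* Bump allocation: a bounds check and an addition. Alignment is ignored in this sketch. */
void *pool_alloc(Pool *p, size_t n)
{
    void *mem;
    assert(p != NULL);
    if (n > p->size - p->used)
        return NULL;          /* a real collector would run a collection here */
    mem = p->base + p->used;
    p->used += n;
    return mem;
}

/* Copies each live buffer into `to` and repoints its header; dead buffers are left behind. */
int pool_compact(Pool *to, BufHeader *headers, size_t count)
{
    size_t i;
    assert(to != NULL && headers != NULL);
    for (i = 0; i < count; i++) {
        char *dst;
        if (!headers[i].live)
            continue;
        dst = pool_alloc(to, headers[i].len);
        if (dst == NULL)
            return -1;
        memcpy(dst, headers[i].start, headers[i].len);
        headers[i].start = dst;
    }
    return 0;
}

/* One dead and one live buffer; after compaction only the live one occupies the new pool. */
int demo_compaction(void)
{
    char from_mem[64], to_mem[64];
    Pool from = { from_mem, sizeof from_mem, 0 };
    Pool to = { to_mem, sizeof to_mem, 0 };
    BufHeader h[2];

    h[0].start = pool_alloc(&from, 8); h[0].len = 8; h[0].live = 0;
    h[1].start = pool_alloc(&from, 5); h[1].len = 5; h[1].live = 1;
    memcpy(h[1].start, "perl6", 5);

    if (pool_compact(&to, h, 2) != 0)
        return -1;
    return to.used == 5 && h[1].start == to_mem
           && memcmp(h[1].start, "perl6", 5) == 0;
}
```

Because other code reaches the data only through the header, the collector can move the bytes and update a single pointer. The surviving data ends up packed at the start of the new pool, so the next allocation is again just a bump.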
44
GC considerations for Objects
Garbage collection and object death are now separate things
Perl's guarantee of timely object death is stronger
We still don't guarantee perfect collection (but it sucks less)
We still refcount for real perl references, but only 2 bits are used
Objects with more than two simultaneous references won't get collected until a full dead variable scan is made
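One plausible reading of the 2-bit refcount is a saturating counter: counts of 0, 1, and 2 are exact, and the top value means "too many to track". The sketch below assumes that interpretation; the slides do not spell out the scheme, and `ObjFlags`, `rc_inc`, `rc_dec`, and `demo_refs` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define RC_SATURATED 3u   /* largest 2-bit value; treated as sticky */

typedef struct {
    unsigned refcount : 2;
} ObjFlags;

/* Adds a reference; the count sticks once it reaches the saturated value. */
void rc_inc(ObjFlags *o)
{
    assert(o != NULL);
    if (o->refcount < RC_SATURATED)
        o->refcount++;
}

/* Drops a reference. Returns 1 if the object can die right now, 0 otherwise.
 * A saturated count is never decremented, because the true count is unknown;
 * such objects wait for the full dead-variable scan. */
int rc_dec(ObjFlags *o)
{
    assert(o != NULL);
    if (o->refcount == RC_SATURATED || o->refcount == 0)
        return 0;
    o->refcount--;
    return o->refcount == 0;
}

/* Takes nrefs references and then drops them all; reports whether timely death happened. */
int demo_refs(int nrefs)
{
    ObjFlags o = { 0 };
    int i, died = 0;
    for (i = 0; i < nrefs; i++)
        rc_inc(&o);
    for (i = 0; i < nrefs; i++)
        died |= rc_dec(&o);
    return died;
}
```

With one or two simultaneous references the object dies as soon as the last one goes away; with three the counter saturates and the object is left for the full scan, matching the "more than two simultaneous references" caveat.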
45
Extensions beware!
Since we have no refcounts, extensions must tell perl when they hold on to PMCs
Not a huge deal, as we piggy-back on the cross-interpreter PMC tracking we use for threads
No more struct PMC; in extensions...
46
Extending Perl 6
47
Extensions Made Easier
Perl 6 will have a real API
The API is multilevel
    Simple for embedders
    More complex for extension authors
    Pretty messy for vtable or opcode writers
Binary compatibility is a very strong consideration
48
Embedding
Guaranteed stable and binary compatible for the life of perl 6
Very simple API
    Create interpreter
    Destroy interpreter
    Parse source
    Run code
    Register native functions
49
Extensions
Much simpler interface to perl's internals
The gory details are hidden
Stable binary compatibility is a very strong goal
We may add functions or options, but we won't take them away
Extensions built for perl 6.0.1 should still run with perl 6.8.12 without rebuilding
Manipulating perl data should be much easier
If you have to resort to Inline to wrap a library then it means we've not got it right
50
Extensions (cont)
Inline, or something like it, is probably going to be the standard for extending perl
XS, when you have to resort to it, will be far less nasty than it is now
51
Homegrown Opcodes and Vtables
This is part of the grubby inside of perl 6
You can use any of the internal routines of perl
If you do, though, you may run into backward-compatibility issues at some point. (If it's not part of the embedding, utility, or extension API, we make no promises)
There's no guarantee that calling conventions won't change. No guarantees that perl 6.4 will even use vtables or opcodes
52
Utility library
Perl 6 will provide a set of utility routines to handle common tasks
    String manipulation
    Encoding changes (Shift-JIS to Unicode, EBCDIC to ASCII)
    Conversion routines (string to int or float)
    Extended precision math (int and float)
These will be stable, like the rest of the API
53
Variations on a Theme “Toccata and Fugue in perl minor by Wall”
54
The source doesn't have to be perl
The parser isn't obligated to be parsing perl
Input source could be Python, Ruby, Java, or INTERCAL
The full perl parser is optional
55
The interpreter doesn't have to interpret
The interpreter is the destination for bytecode, but it doesn't have to interpret it
It might save directly to disk
It might translate the bytecode into an alternate form (Java bytecode, .NET code, or executable code, for example)
The interpreter might translate to machine code on the fly, as a sort of JIT compiler. (Well, really a TIL, but...)