=head1 NAME

Architecture - the Parse::Rule OO architecture

=head1 DESCRIPTION

I wrote this architecture guide largely for my own benefit as the
implementor: as I convert to the new architecture, I'm having trouble
navigating it and keeping track of what I still need to do. It should
also serve as a useful guide for anyone hacking on or extending the
module.

Here are the goals:

=over

=item * Not to commit to a particular evaluation strategy.

PGE uses coroutines, and my version uses continuation-passing style
(CPS). We don't know yet which will be better or faster. We'd also
like to support local DFA optimization eventually. Staying
noncommittal is the best choice.

=item * Not to commit to particular media.

In Perl 6, rules can match against strings and against arrays. That's
only two media, and we could support each explicitly. But what about
matching against parameter lists, or against trees and other data
structures? Those are possibilities we would like to open up to the
module writer, if not the language designer (actually, I would rather
give those choices to module writers than to language designers).

=back

However, staying noncommittal about both at once is a bit of a
challenge, because the two are not orthogonal. For example, the CPS
runtime would like the Text medium to take a match object and, upon
matching a literal, return an updated match object, or fail if it
couldn't match. The optimized DFA runtime, however, would like to
query the Text medium for each of the literal's characters and then
compile a match itself.

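To make the mismatch concrete, here is a minimal sketch in present-day
Perl 6 (Raku) syntax. The names C<cps_literal> and C<dfa_literal> are
invented for this illustration and are not part of Parse::Rule, and a
plain position stands in for a full match object. Both subs recognize
the same literal, but they want very different things from the medium.

```raku
# Hypothetical sketch; not Parse::Rule's actual API.

# CPS-style view: the literal is an opaque step. Given the text and a
# position, it returns the updated position, or Nil on failure.
sub cps_literal(Str $lit) {
    -> Str $text, Int $pos {
        $text.substr($pos, $lit.chars) eq $lit ?? $pos + $lit.chars !! Nil
    }
}

# DFA-style view: the runtime asks for the literal's characters and
# builds its own transition table (state => char => next state).
sub dfa_literal(Str $lit) {
    my @chars = $lit.comb;
    my %delta;
    for @chars.kv -> $state, $char {
        %delta{$state}{$char} = $state + 1;
    }
    -> Str $text, Int $pos {
        my $state = 0;
        my $i     = $pos;
        while $state < @chars.elems && $i < $text.chars {
            my $next = %delta{$state}{$text.substr($i, 1)};
            last unless $next.defined;
            $state = $next;
            $i++;
        }
        $state == @chars.elems ?? $i !! Nil
    }
}

say cps_literal("foo")("foobar", 0);   # 3
say dfa_literal("foo")("foobar", 0);   # 3
```

The first form only needs "match this literal here"; the second needs
the medium to expose the literal's individual characters.
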
So I designed it as a combinator library, much like Haskell's Parsec.
However, instead of providing a single library of combinators, I make
a "combinator library" an object. To build a match, you call
combinators as methods on that object:

    my $c = Parse::Rule::CPS::Text.new;
    # / [foo]+ /
    $c.quantify(:min(1), $c.literal("foo"));

This combinator library object is composed out of roles to achieve
almost independent modularity. There is a base role called C<Strategy>
that declares combinators such as C<quantify>, C<concat>, and
C<capture>. Every runtime must implement all of these combinators
(except those -- currently none -- that can be built out of other
combinators).

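As a rough sketch of that contract in present-day Perl 6 (Raku)
syntax: a method whose body is C<...> is a stub, so any class
composing the role must end up with a real implementation. The
C<Greedy> role below is a toy, non-backtracking strategy made up for
this example (it is neither PGE's coroutine approach nor the CPS
runtime), and the two hand-written character parsers stand in for a
medium's combinators.

```raku
# Hypothetical sketch; not Parse::Rule's actual API.
role Strategy {
    method concat($a, $b)          { ... }   # stubs: must be implemented
    method quantify($p, :$min = 0) { ... }
}

# Toy strategy: a parser is a block ($text, $pos) that returns the end
# position of the match, or Nil on failure. No backtracking.
role Greedy does Strategy {
    method concat($a, $b) {
        -> $text, $pos {
            my $mid = $a($text, $pos);
            $mid.defined ?? $b($text, $mid) !! Nil
        }
    }
    method quantify($p, :$min = 0) {
        -> $text, $pos {
            my $end   = $pos;
            my $count = 0;
            loop {
                my $next = $p($text, $end);
                # Stop on failure or on a zero-width match.
                last unless $next.defined && $next > $end;
                $end = $next;
                $count++;
            }
            $count >= $min ?? $end !! Nil
        }
    }
}

class GreedyDemo does Greedy { }

# Hand-written primitives standing in for a medium's combinators.
my $a_char = -> $text, $pos { $text.substr($pos, 1) eq 'a' ?? $pos + 1 !! Nil };
my $b_char = -> $text, $pos { $text.substr($pos, 1) eq 'b' ?? $pos + 1 !! Nil };

my $c    = GreedyDemo.new;
my $rule = $c.concat($c.quantify(:min(1), $a_char), $b_char);   # / a+ b /

say $rule("aaab", 0);        # 4
say $rule("b", 0).defined;   # False
```
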
There is also a base role for each medium. For example, the base role
for C<Text> requires the C<literal> and C<any_char> combinators to be
implemented. On top of those, together with the combinators from a
C<Strategy> role, it builds things like C<beginning_of_line> and
C<word_boundary>. Since these are written only in terms of
combinators that a role provides, they are not specific to any
particular strategy.

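The strategy independence of derived combinators can be sketched as
follows, again in present-day Perl 6 (Raku) syntax. Everything here is
an assumption made for illustration: C<alternate>, C<newline>, and
C<blank_lines> are invented combinators, and the two toy strategies
(one renders a description string, the other counts primitives) only
exist to show the same medium role working unchanged under both.

```raku
# Hypothetical sketch; not Parse::Rule's actual API.
role Strategy {
    method concat($a, $b)          { ... }
    method alternate($a, $b)       { ... }
    method quantify($p, :$min = 0) { ... }
}

role TextMedium {
    # Primitives that each strategy/medium class must supply.
    method literal(Str $s) { ... }
    method any_char()      { ... }

    # Derived combinators, written only in terms of the primitives and
    # the Strategy combinators, so they work under any strategy.
    method newline()     { self.alternate(self.literal("\r\n"), self.literal("\n")) }
    method blank_lines() { self.quantify(:min(1), self.newline) }
}

# Toy strategy 1: "compile" a rule to a printable description.
role Describe does Strategy {
    method concat($a, $b)          { "$a $b" }
    method alternate($a, $b)       { "[ $a | $b ]" }
    method quantify($p, :$min = 0) { "[ $p ] ** {$min}..*" }
}

# Toy strategy 2: count how many primitive combinators a rule uses.
role CountPrimitives does Strategy {
    method concat($a, $b)          { $a + $b }
    method alternate($a, $b)       { $a + $b }
    method quantify($p, :$min = 0) { $p }
}

class DescribeText does Describe does TextMedium {
    method literal(Str $s) { $s.raku }
    method any_char()      { '.' }
}

class CountText does CountPrimitives does TextMedium {
    method literal(Str $s) { 1 }
    method any_char()      { 1 }
}

say DescribeText.new.blank_lines;   # a description of the rule
say CountText.new.blank_lines;      # 2
```
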
Then, to build the final library object, create a class that combines
a C<Strategy> role and a medium role, and implement the methods
required by the medium role in terms of that strategy. There will
therefore be one class for every strategy/medium combination, but each
one should be very small.

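Putting the pieces together might look roughly like the sketch below,
in present-day Perl 6 (Raku) syntax. It is an assumption-laden toy and
not the real C<Parse::Rule::CPS::Text>: the names C<CPSStrategy>,
C<TextMedium>, C<CPSText>, C<cps_star>, and C<match_end> are invented,
and a parser is modeled as a block taking the text, a position, and a
continuation, returning the final end position or Nil.

```raku
# Hypothetical sketch; not Parse::Rule's actual API.
role Strategy {
    method concat($a, $b)          { ... }
    method quantify($p, :$min = 0) { ... }
}

role TextMedium {
    method literal(Str $s) { ... }
    method any_char()      { ... }
}

# Greedy repetition in CPS: try one more repetition first; if the rest
# of the match then fails, fall back to stopping at this position.
sub cps_star($p, $min, $count) {
    -> $text, $pos, &k {
        my $more = $p($text, $pos, -> $next {
            $next > $pos
                ?? cps_star($p, $min, $count + 1)($text, $next, &k)
                !! Nil
        });
        $more // ($count >= $min ?? k($pos) !! Nil)
    }
}

role CPSStrategy does Strategy {
    method concat($a, $b) {
        -> $text, $pos, &k { $a($text, $pos, -> $mid { $b($text, $mid, &k) }) }
    }
    method quantify($p, :$min = 0) { cps_star($p, $min, 0) }
}

# The strategy/medium class: only the medium's primitives live here.
class CPSText does CPSStrategy does TextMedium {
    method literal(Str $s) {
        -> $text, $pos, &k {
            $text.substr($pos, $s.chars) eq $s ?? k($pos + $s.chars) !! Nil
        }
    }
    method any_char() {
        -> $text, $pos, &k { $pos < $text.chars ?? k($pos + 1) !! Nil }
    }
}

# Run a rule from position 0 with a continuation that reports the end.
sub match_end($rule, Str $text) { $rule($text, 0, -> $end { $end }) }

my $c    = CPSText.new;
my $foos = $c.quantify(:min(1), $c.literal("foo"));              # / [foo]+ /
my $tail = $c.concat($c.quantify($c.any_char), $c.literal("foo"));  # / .* foo /

say match_end($foos, "foofoobar");     # 6
say match_end($foos, "bar").defined;   # False
say match_end($tail, "xxfoo");         # 5 (the quantifier backtracks)
```

Note that the class body contains nothing but the two primitives the
medium role asks for; everything else comes from the two roles.
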
Here is the module tree layout:

    Core         - absolutely global stuff, like the structure of the
                   match object
    Strategy     - the strategy base role
    Medium       - the medium and pos base roles
    Media::      - the base roles and pos objects for each medium
    Strategies:: - one module for each strategy, with a submodule for
                   each medium