root/v6/v6-KindaPerl6/docs/p6regex-on-p5regex.txt

Revision 18821, 3.1 kB (checked in by fglock, 13 months ago)

Perl6regex on Perl5regex

### Note: This Document is a Draft

------------------
Introduction:

"Perl6regex-on-Perl5regex" is a "Perl 6 regex engine" that uses Perl 5 regexes to implement the matching, and Perl 5 code to implement the OO "Match" structure.

The implementation so far is compatible with Perl 5.8.8.


------------------
The compilation is implemented as follows:

- a regex Grammar is run on the Perl 6 regex source code, and returns an AST

- the AST is annotated for positional capture numbering, and for "capture to array" flags

- the Perl 5 regexes and the Perl 5 methods are emitted

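The annotation step can be sketched as follows. This is only an illustration, not the KindaPerl6 code: the node layout ("type", "kids"), the annotate() function and the "position" / "to_array" fields are hypothetical names. The sketch assumes that a capture under a quantifier is the only case that captures to an array, and that captures nested inside another capture restart their numbering.

```perl
use strict;
use warnings;

# Hypothetical AST for the Perl 6 regex  (a) [ (b) ]* (c)
my $ast = {
    type => 'concat',
    kids => [
        { type => 'capture', kids => [ { type => 'literal', text => 'a' } ] },
        { type => 'quant',   kids => [
            { type => 'capture', kids => [ { type => 'literal', text => 'b' } ] },
        ] },
        { type => 'capture', kids => [ { type => 'literal', text => 'c' } ] },
    ],
};

my @captures;   # every capture node, in source order

# Walk the tree, numbering captures and flagging the quantified ones.
sub annotate {
    my ($node, $counter, $in_quant) = @_;
    my @kids = @{ $node->{kids} || [] };
    if ($node->{type} eq 'capture') {
        $node->{position} = $$counter++;           # next free positional slot
        $node->{to_array} = $in_quant ? 1 : 0;     # quantified: capture to array
        push @captures, $node;
        my $inner = 0;                             # assumption: nested scope restarts at 0
        annotate($_, \$inner, 0) for @kids;
        return;
    }
    $in_quant = 1 if $node->{type} eq 'quant';
    annotate($_, $counter, $in_quant) for @kids;
    return;
}

my $next = 0;
annotate($ast, \$next, 0);

printf "capture %d, to_array=%d\n", $_->{position}, $_->{to_array} for @captures;
```

In this sketch the emitter would later read "position" and "to_array" from each capture node when generating the Perl 5 regex.
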

------------------
At runtime:

- while the regex is matching, it generates a linked list of operations

- the operation list is rolled back on backtracking.
"Safe-backtracking" is implemented with "local" redeclarations inside the Perl 5 regex (see [1], [2]).

- after the match finishes, the operations are interpreted, and the result is a Match object.
The interpreter is implemented as a subroutine in the Match class.

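The "local" technique can be sketched in plain Perl 5 as below. This is a minimal illustration under assumptions, not the engine's actual code: the variables $ops and $final and the node layout [ end position, previous node ] are made up for the example. Each (?{ ... }) block prepends a node with "local", so Perl undoes the assignment when the regex backtracks over that block.

```perl
use strict;
use warnings;

our $ops;     # head of the list while matching; a node is [ $end_pos, $previous_node ]
our $final;   # head of the list, saved at the moment the match succeeds

my $text = 'aaab';
$text =~ m/
    (?{ $ops = undef })
    (?: a (?{ local $ops = [ pos(), $ops ] }) )*
    ab
    (?{ $final = $ops })
/x or die "no match";

# The loop first consumes three a characters, then has to give one back so
# that the trailing "ab" can match. The node for the third one is rolled back.

# "Interpret" the surviving operations, oldest first.
my @replay;
for (my $node = $final; $node; $node = $node->[1]) {
    unshift @replay, $node->[0];
}
print "kept operation ending at position $_\n" for @replay;
```

The copy into $final is needed because the "local" values are unwound once the regex finishes; the real engine would hand the saved list to the interpreter in the Match class instead of printing it.
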

------------------
The operations mini-language is implemented like this:

op-list
... TODO ...

------------------
Differences from the Perl 6 specification

* <after> only matches fixed-width patterns,
because that's how Perl 5 "(?<=pattern)" works.
There is no fix for this problem yet.

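The Perl 5 restriction behind this is easy to reproduce. In the sketch below, compile_error() is a hypothetical helper written for this example; it compiles a pattern at runtime and reports whether Perl rejected it. An unbounded lookbehind such as "(?<=fo+)" is refused at compile time, while the fixed-width "(?<=foo)" is accepted. The exact error text varies between Perl versions, so it is not checked here.

```perl
use strict;
use warnings;

# Hypothetical helper: returns undef if the pattern compiles,
# otherwise the error message raised by the regex compiler.
sub compile_error {
    my ($source) = @_;
    my $ok = eval { my $re = qr/$source/; 1 };
    return $ok ? undef : $@;
}

my $fixed    = compile_error('(?<=foo)bar');    # lookbehind of exactly 3 characters
my $variable = compile_error('(?<=fo+)bar');    # lookbehind of unbounded length

print "fixed-width lookbehind:    ", (defined $fixed    ? "rejected" : "compiles"), "\n";
print "variable-width lookbehind: ", (defined $variable ? "rejected" : "compiles"), "\n";
```
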
------------------
Fixable Differences from the Perl 6 specification

* <?after> and <?before> do not create a lexical scope:
this means that <?before (.) > wrongly does a positional capture.
This is fixable by adding a discard_capture operation.

* return() in a block doesn't cause the regex to succeed, and doesn't terminate the regex.
The Perl 5.10 version should use (*ACCEPT).

* The $/ inside regex closures is a copy of the matching $/.
This means that modifying $/ inside a closure does not modify the match.
This can be fixed with some magic in the Match class.

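For reference, this is how (*ACCEPT) behaves in Perl 5.10 and later. The pattern is the example given in perlre [2]; how the emitter would actually tie it to return() inside a closure is not decided in this document, so the sketch only shows the verb ending a match successfully before the rest of the pattern has been tried.

```perl
use strict;
use warnings;
use 5.010;   # (*ACCEPT) does not exist in Perl 5.8

# Without (*ACCEPT) this pattern needs at least "ABDE" to match.
# With it, the match succeeds as soon as the "B" branch is taken.
my $matched = 'AB' =~ /(A (A|B(*ACCEPT)|C) D)(E)/x;
my ($outer, $inner, $last) = ($1, $2, $3);

print "matched: ", ($matched ? "yes" : "no"), "\n";
print "group 1: $outer\n";
print "group 2: $inner\n";
print "group 3: ", (defined $last ? $last : "(not set)"), "\n";
```
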
------------------
TODO list:

* longest-token and multi-regex

* identify possible Perl 5.8 bugs that could justify requiring Perl 5.10

* regexes inside code blocks may have side effects inside a regex; this needs further testing

* the Match class needs some tweaks to follow the MOP calling convention better
** hash, array, from, to should be Perl 6 objects; autoboxing can fix that

* backtracking controls; token/rule/regex

* the $_ and $/ scopes need to be fixed

* in order to support Matcher methods, OUTER::<$/> needs to be implemented

* rule/subrule parameters

* the way inheritance works right now is by eval'ing the regex variable in the grammar's namespace;
this is supposed to be refined later

* calling subrules in other grammars
** there should probably be a method that returns the regex, because directly accessing the $_regex_name variable doesn't work with inheritance.
** code blocks should probably be installed as methods, because regexes are inlined as strings, which breaks lexical scoping, package names, and inheritance.

* <at()>

* infix:<~~>

* variable interpolation

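The TODO item about "a method that returns the regex" can be illustrated with ordinary Perl 5 method dispatch. The package and method names below (BaseGrammar, HexGrammar, digit, number) are hypothetical and do not come from the KindaPerl6 sources. The point of the sketch is that looking the subrule up through a method call lets a derived grammar override it, which a direct reference to a package variable would not.

```perl
use strict;
use warnings;

package BaseGrammar;

# Each rule is exposed as a class method returning a compiled regex.
sub digit { return qr/[0-9]/ }

sub number {
    my ($class) = @_;
    my $digit = $class->digit;    # resolved through inheritance
    return qr/$digit+/;
}

package HexGrammar;
our @ISA = ('BaseGrammar');

# Override only the subrule; number() is inherited unchanged.
sub digit { return qr/[0-9a-fA-F]/ }

package main;

my $base_number = BaseGrammar->number;
my $hex_number  = HexGrammar->number;

for my $input ('2007', 'ff') {
    printf "%-4s base=%s hex=%s\n", $input,
        ($input =~ /^$base_number$/ ? 'yes' : 'no'),
        ($input =~ /^$hex_number$/  ? 'yes' : 'no');
}
```
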
------------------
Blogs:

http://pugs.blogs.com/pugs/2007/07/perl6-regex-on-.html

------------------
References:

[1] http://www.justatheory.com/computers/programming/perl/regex_named_captures.html

[2] perldoc perlre