| | 44 | |
| | 45 | <fglock> TimToady: is the lexer the right place to make the '<'/'<a>' distinction? (it is not user-modifiable) |
| | 46 | |
| | 47 | <TimToady> fglock: that depends on what you count as part of the lexer. |
| | 48 | The bottom-up parser knows when it's looking for a <%infix> vs <%prefix> vs <%postfix>, so only those tokens are active that would be valid at that spot. |
| | 49 | <TimToady> (I oversimplify the hash stuff slightly there.) |
| | 50 | <TimToady> It's more like: |
| | 51 | <TimToady> <%infix> vs <%prefix|%term|%circumfix> vs <%postfix|%postcircumfix> |
| | 52 | <TimToady> assuming we adopt the new <%a|%b|%c> notation to combine |
| | 53 | longest-token processing of multiple hashes. |
| | 54 | <TimToady> fglock: for speed one could cache all the hash keys for all the hashes in a trie or some similar structure. Just have to be careful that longest key wins regardless of hash, and in case of tie first hash wins. |
| | 55 | <TimToady> 'course you have to recalculate if any of the hashes is modified... |
| | 56 | |
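Editor's sketch of the caching idea above: the keys of several token hashes are merged into one trie, the longest key wins regardless of which hash it came from, and on a tie the earlier hash wins. The names `build_trie` and `longest_token` and the sample tables are hypothetical, chosen only for illustration; this is not taken from any Perl 6 implementation.

```python
def build_trie(hashes):
    """Merge the keys of several token tables into one trie.

    `hashes` is an ordered list of (name, dict) pairs. Earlier tables
    win ties, so a key already claimed by an earlier table is kept.
    """
    root = {}
    for name, table in hashes:
        for key, value in table.items():
            node = root
            for ch in key:
                node = node.setdefault(ch, {})
            # The "" entry marks end-of-key; it cannot collide with a
            # one-character edge label. setdefault keeps the first owner.
            node.setdefault("", (name, key, value))
    return root


def longest_token(trie, text, pos=0):
    """Return (table_name, key, value) for the longest key at `pos`, or None."""
    node = trie
    best = None
    i = pos
    while True:
        if "" in node:
            best = node[""]  # a longer match replaces a shorter one
        if i >= len(text) or text[i] not in node:
            return best
        node = node[text[i]]
        i += 1


# Hypothetical sample tables, listed in priority order.
postcircumfix = {"<": "angle-subscript"}
infix = {"<": "lt", "<=": "le", "+": "add"}
trie = build_trie([("postcircumfix", postcircumfix), ("infix", infix)])
```

As the next message in the log points out, a cache like this has to be rebuilt whenever one of the underlying hashes changes; the sketch simply calls `build_trie` again in that case.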
| | 57 | <TimToady> can probably treat alphanumeric sub names specially so that you don't have to recalculate on every sub declaration. |
| | 58 | <TimToady> if you assume that no "foo" prefix operator or term can match if the next char is alphanumeric. |
| | 59 | <TimToady> maybe just run the prescanned identifier down a different trie than the non-alpha ops. |
| | 60 | <TimToady> actually, if you know the length then the ident one doesn't need a tree. Just a hash would work. |
| | 61 | <TimToady> since you know its length already. |
| | 62 | |
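Editor's sketch of the identifier shortcut described above: if alphanumeric names are prescanned as whole identifiers, their length is already known, so one hash lookup replaces a trie walk, and declaring a new sub only adds a hash entry. The `IDENT` pattern is a simplified assumption (plain ASCII identifiers), and `match_named_op` is a hypothetical helper name.

```python
import re

# Simplified identifier pattern; an assumption made for this sketch only.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def match_named_op(named_ops, text, pos=0):
    """Return the identifier at `pos` if it is a known named operator.

    The whole identifier is scanned first, so "foo" can never match
    when the next character is alphanumeric.
    """
    m = IDENT.match(text, pos)
    if m is None:
        return None  # not alphabetic; leave it to the non-alpha operator trie
    word = m.group()
    return word if word in named_ops else None


# Hypothetical set of alphanumeric prefix operators and terms.
named_ops = {"not", "abs"}
```

With this split, "nothing" is scanned as one identifier and looked up once, so it does not match the shorter operator "not".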
| | 63 | <fglock> TimToady: what if both postcircumfix and infix are expected? then the op is chosen based on if there is whitespace or not? |
| | 64 | <fglock> like in %ENV<x> vs. %ENV <... |
| | 65 | <TimToady> <%postcircumfix|%infix> is what you look for before whitespace, and <%infix> after. |
| | 66 | <TimToady> that's why we completely outlawed whitespace before postfix. |
| | 67 | <TimToady> hmm, that doesn't quite work. |
| | 68 | <TimToady> I think at postfix location you actually look for <%postfix|%postcircumfix>|<%infix> because |
| | 69 | <TimToady> you don't want the %infix participating in longest token there. |
| | 70 | <TimToady> $x<=2 is an error, but $x <= 2 is okay. |
| | 71 | <TimToady> or looking at it in terms of whitespace, if you don't get any match on a postfix, then you can pretend there was whitespace even if there wasn't, and try %infix. |
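Editor's sketch of the rule just described: at a postfix location with no preceding whitespace, the postfix and postcircumfix tokens are tried first with longest-token matching among themselves, and the infix table is consulted only if nothing matched or whitespace was seen. The function names and the sample tables are hypothetical, and a linear scan stands in for whatever lookup structure a real parser would use.

```python
def longest_key(table, text, pos):
    """Return the longest key of `table` that matches `text` at `pos`, or None."""
    best = None
    for key in table:
        if text.startswith(key, pos) and (best is None or len(key) > len(best)):
            best = key
    return best


def match_after_term(postfix_like, infix, text, pos=0):
    """Pick the next operator after a term.

    Infix keys do not take part in the longest-token contest against
    postfix keys; they are only tried as a fallback.
    """
    saw_space = pos < len(text) and text[pos].isspace()
    if not saw_space:
        key = longest_key(postfix_like, text, pos)
        if key is not None:
            return ("postfix", key)
    # No postfix matched (or whitespace was present): skip any
    # whitespace and try the infix table instead.
    while pos < len(text) and text[pos].isspace():
        pos += 1
    key = longest_key(infix, text, pos)
    return ("infix", key) if key is not None else None


# Hypothetical sample tables.
postfix_like = {"<", "++", "["}
infix = {"<", "<=", "+"}
```

Under these assumptions the text after `$x` in `$x<=2` is read as the postcircumfix `<`, which is why that form ends up as an error, while ` <= 2` reaches the infix table and matches `<=`.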
| | 72 | <fglock> I think I'll need to do some tests ... - how about /rule/ vs. division? is it just that rule is a term and division is an op? |
| | 73 | <TimToady> yeah, that's just simple term vs op expectation. |
| | 74 | <TimToady> just as in P5. |
| | 75 | <TimToady> It's really only the postfix category that's new to P6 |