| 1 | = Laziness |
|---|
| 2 | |
|---|
| 3 | A draft on the runtime view of how Lazy things (like Arrays) work. |
|---|
| 4 | |
|---|
| 5 | # by fglock |
|---|
| 6 | |
|---|
| 7 | The text currently has 2 main sections: |
|---|
| 8 | - a review of the Perl 6 Bible, with some interpretation and guessing |
|---|
| 9 | - implementation notes. |
|---|
| 10 | |
|---|
| 11 | Each section has 3 subsections: |
|---|
| 12 | - Native data - about the internal representation of data types |
|---|
| 13 | - Values - about the basic Perl 6 "OO" things |
|---|
| 14 | - Containers - about the glue that hold things together in data structures |
|---|
| 15 | |
|---|
| 16 | == TODO |
|---|
| 17 | - include drawings |
|---|
| 18 | - comparing with Ruby/Python should help explaining some concepts |
|---|
| 19 | - this text is supposed to talk about all these things: |
|---|
| 20 | Scalar / Array / Hash Container |
|---|
| 21 | Tuple / List / Generator / Iterator / Range |
|---|
| 22 | file iterator |
|---|
| 23 | Signature / Argument List |
|---|
| 24 | Array / Hash slice |
|---|
| 25 | enum |
|---|
| 26 | tie / proxy / rw / env |
|---|
| 27 | Constrained Types (eg: Int / Even / 0..100 where Even) |
|---|
| 28 | Piddle |
|---|
| 29 | lazy string / int / num |
|---|
| 30 | |
|---|
| 31 | |
|---|
| 32 | = Part I - Perl 6 Bible Readings |
|---|
| 33 | |
|---|
| 34 | References: |
|---|
| 35 | - S02 Bits and Pieces - for "native data" |
|---|
| 36 | - S09 Data Structures - for Object types |
|---|
| 37 | - A bit of each of the other AES |
|---|
| 38 | |
|---|
| 39 | = Native data |
|---|
| 40 | |
|---|
| 41 | Native data is built with a thin, (possibly non-OO) library over the virtual |
|---|
| 42 | machine data types. |
|---|
| 43 | |
|---|
| 44 | For example, you may need a wrapper to implement 1-bit values, if the |
|---|
| 45 | virtual machine only provides bytes. |
|---|
| 46 | |
|---|
| 47 | - native "singular" value |
|---|
| 48 | |
|---|
| 49 | Native values are: num, int, uint, bit, complex, ref. |
|---|
| 50 | Native values can have a bit-size specification: int1, int16. |
|---|
| 51 | |
|---|
| 52 | A "native container" is whatever a "ref" value points to. |
|---|
| 53 | |
|---|
| 54 | A "num" value must support IEEE floating point conventions - which means there |
|---|
| 55 | are native "Inf" and "NaN". |
|---|
| 56 | Native values don't need to be able to represent "undef". |
|---|
| 57 | |
|---|
| 58 | - native "plural" value |
|---|
| 59 | |
|---|
| 60 | Native plural values are: array, str, hash. |
|---|
| 61 | |
|---|
| 62 | A "str" is an array of "int". |
|---|
| 63 | |
|---|
| 64 | There may be separate implementations for fixed-size and for |
|---|
| 65 | auto-extending allocation. |
|---|
| 66 | |
|---|
| 67 | These are fast, in-memory structures provided by the virtual machine. |
|---|
| 68 | These structures are assumed to be one-dimensional. |
|---|
| 69 | |
|---|
| 70 | Native hash is mostly used internally, for holding "namespaces": Classes, variables, labels. |
|---|
| 71 | |
|---|
| 72 | There may be separate implementations for string-indexed and for object-indexed hashes. |
|---|
| 73 | |
|---|
| 74 | = Value Class |
|---|
| 75 | |
|---|
| 76 | Values can be "singular" or "plural". |
|---|
| 77 | |
|---|
| 78 | Singular Values are: Str, Int, Num, Pair, Ref, Junction, |
|---|
| 79 | Code, Block, Class, Grammar, Macro ... |
|---|
| 80 | |
|---|
| 81 | Plural Values are things that look like single-dimensional arrays: |
|---|
| 82 | List, Generator, Iterator. |
|---|
| 83 | |
|---|
| 84 | -- List, Lazy List, Tuple classes |
|---|
| 85 | |
|---|
| 86 | List is implemented as a native Array, or as a contiguous Stack area if this |
|---|
| 87 | is supported by the virtual machine. |
|---|
| 88 | |
|---|
| 89 | List is set of things. |
|---|
| 90 | List is a subclass of "Value". |
|---|
| 91 | List is one-dimensional. |
|---|
| 92 | |
|---|
| 93 | A List may be built with a mix of Values and Containers. |
|---|
| 94 | |
|---|
| 95 | A List might exist in 4 "states": |
|---|
| 96 | |
|---|
| 97 | @a = ( 7, 20..30 ); |
|---|
| 98 | - this is the Array that will be used in the examples |
|---|
| 99 | @a = ( |
|---|
| 100 | ( (Cell = 7), ), |
|---|
| 101 | ( Generator 20..30) |
|---|
| 102 | ); |
|---|
| 103 | - this is more-or-less what it looks like internally |
|---|
| 104 | |
|---|
| 105 | - unflattened List |
|---|
| 106 | |
|---|
| 107 | ( 1, 5..10, @a ) |
|---|
| 108 | - a List built from a Scalar, a Range, and an Array |
|---|
| 109 | |
|---|
| 110 | - lazily flattened with "*" operator |
|---|
| 111 | |
|---|
| 112 | ( 1, 5..10, (Scalar := @a[0]), (Generator (Scalar := @a[1])..(Scalar := @a[10])) ) |
|---|
| 113 | - the Array was "flattened", but we still have generators. |
|---|
| 114 | |
|---|
| 115 | - non-lazily flattened (eager) with "**" operator |
|---|
| 116 | |
|---|
| 117 | ( 1, 5, 6, 7, 8, 9, 10, (Scalar := @a[0]), (Scalar := @a[1]), (Scalar := @a[2]), ..... ) |
|---|
| 118 | - everything is flattened |
|---|
| 119 | |
|---|
| 120 | - fully evaluated (Tuple) |
|---|
| 121 | |
|---|
| 122 | ( 1, 5, 6, 7, 8, 9, 10, @a[0].value, @a[1].value, @a[2].value, ..... ) |
|---|
| 123 | - fully evaluated and immutable |
|---|
| 124 | |
|---|
| 125 | List only implements a subset of the Array API. |
|---|
| 126 | A List slice is a new List - elements are shallow copied. |
|---|
| 127 | List don't support adding, removing or substituting elements. |
|---|
| 128 | |
|---|
| 129 | List has no Cells - thus you can't bind a Container to a List element or slice. ??? |
|---|
| 130 | |
|---|
| 131 | $a := (1,2,3)[2]; # compile error - or is this creating a Constant ??? |
|---|
| 132 | |
|---|
| 133 | If a List is built with Generators, it is a Lazy List. |
|---|
| 134 | |
|---|
| 135 | -- Generator / Range Classes |
|---|
| 136 | |
|---|
| 137 | Generator is an object that creates Values on demand. |
|---|
| 138 | Generator is "functional" rather than "physical memory" |
|---|
| 139 | |
|---|
| 140 | Range is a special case, in which there is a start, an end, and a step. |
|---|
| 141 | |
|---|
| 142 | #At runtime, you only access Generator elements through Containers. |
|---|
| 143 | |
|---|
| 144 | Generator also implements "match": |
|---|
| 145 | |
|---|
| 146 | 5 ~~ 1..10 |
|---|
| 147 | |
|---|
| 148 | The "*" operator flattens a Generator eagerly. |
|---|
| 149 | |
|---|
| 150 | TODO - detail the behaviour of "Str Range" |
|---|
| 151 | |
|---|
| 152 | -- Iterator Class |
|---|
| 153 | |
|---|
| 154 | TODO |
|---|
| 155 | |
|---|
| 156 | =<> |
|---|
| 157 | |
|---|
| 158 | The "*" operator flattens an Iterator eagerly. |
|---|
| 159 | |
|---|
| 160 | -- List Slice |
|---|
| 161 | |
|---|
| 162 | A List Slice is a shallow copy of a part of a List. |
|---|
| 163 | |
|---|
| 164 | A List Slice is just a List - there is not a separate Class for it. |
|---|
| 165 | |
|---|
| 166 | = Container Class |
|---|
| 167 | |
|---|
| 168 | Containers are the glue that hold Values together in data structures. |
|---|
| 169 | |
|---|
| 170 | The container classes are: Scalar, Array, Hash. |
|---|
| 171 | |
|---|
| 172 | Array, like List, is Lazy by default. |
|---|
| 173 | Hash and Scalar can be lazy too, but they are not Lazy by default. |
|---|
| 174 | |
|---|
| 175 | == Scalar Container Class |
|---|
| 176 | |
|---|
| 177 | Scalar is a "singular Value" Container. |
|---|
| 178 | |
|---|
| 179 | One of the reasons to use a container is that you can easily change |
|---|
| 180 | it's semantics by replacing what's inside it - that is, a Container implements |
|---|
| 181 | a "variable": |
|---|
| 182 | |
|---|
| 183 | value '1' - always means 'one' |
|---|
| 184 | |
|---|
| 185 | container ('1') - it means 'one' too, but you can increment it to mean 'two' |
|---|
| 186 | |
|---|
| 187 | == Array Container Class |
|---|
| 188 | |
|---|
| 189 | The Array class defines all objects that implement the Array API. |
|---|
| 190 | |
|---|
| 191 | Array is a subclass of "Container" |
|---|
| 192 | Array contains an ordered set of Cells. |
|---|
| 193 | Array is multi-dimensional. |
|---|
| 194 | |
|---|
| 195 | Array is not meant to be subclassed - use "roles" to make different implementations. |
|---|
| 196 | |
|---|
| 197 | The actual implementation depends on the "Eager" or the "Lazy" roles. |
|---|
| 198 | Other implementations can be specified. |
|---|
| 199 | |
|---|
| 200 | --- Eager Role |
|---|
| 201 | |
|---|
| 202 | An Array object that wraps a native array. |
|---|
| 203 | |
|---|
| 204 | --- Lazy Role |
|---|
| 205 | |
|---|
| 206 | An Array object that generates Cells on demand. |
|---|
| 207 | This is the default implementation. |
|---|
| 208 | |
|---|
| 209 | --- Shape (Role or parameter? trait?) |
|---|
| 210 | |
|---|
| 211 | --- Compact Array Role (piddle) |
|---|
| 212 | |
|---|
| 213 | TODO - piddle is different from Compact |
|---|
| 214 | |
|---|
| 215 | -- Container Slice |
|---|
| 216 | |
|---|
| 217 | A Container Slice is a specialized Array/Hash, in which all the elements |
|---|
| 218 | are bound to objects inside other Containers (Array, Hash or Scalar). |
|---|
| 219 | |
|---|
| 220 | A Lazy Container Slice contains pointers to Generators that are inside other |
|---|
| 221 | Containers. |
|---|
| 222 | |
|---|
| 223 | Container Slice is multi-dimensional. |
|---|
| 224 | |
|---|
| 225 | == Hash |
|---|
| 226 | |
|---|
| 227 | The Hash class defines all objects that implement the hash API. |
|---|
| 228 | |
|---|
| 229 | Hash is like an Array, in which the Cell indexes can be of any type |
|---|
| 230 | (not just numbers). |
|---|
| 231 | |
|---|
| 232 | Hash is multi-dimensional. |
|---|
| 233 | |
|---|
| 234 | |
|---|
| 235 | = Part II - Internals |
|---|
| 236 | |
|---|
| 237 | == Generator Internals |
|---|
| 238 | |
|---|
| 239 | - TODO |
|---|
| 240 | |
|---|
| 241 | |
|---|
| 242 | == Generator API |
|---|
| 243 | |
|---|
| 244 | - match |
|---|
| 245 | |
|---|
| 246 | This operator is used in a match statement: |
|---|
| 247 | |
|---|
| 248 | 5 ~~ 1..10 |
|---|
| 249 | |
|---|
| 250 | - shift / pop |
|---|
| 251 | |
|---|
| 252 | TODO - not sure if "Generator" is an iterator (read-only) or uses destructive |
|---|
| 253 | read (shift/pop) |
|---|
| 254 | |
|---|
| 255 | - sum / reverse / map / zip |
|---|
| 256 | |
|---|
| 257 | - flatten |
|---|
| 258 | |
|---|
| 259 | Takes all Values from the Generator, and returns an Array (possibly Native). |
|---|
| 260 | |
|---|
| 261 | - elems |
|---|
| 262 | |
|---|
| 263 | The number of elements |
|---|
| 264 | |
|---|
| 265 | === This part of the Generator API is only used by the runtime internals. |
|---|
| 266 | |
|---|
| 267 | - head(n) |
|---|
| 268 | |
|---|
| 269 | This operator is used by the Array splice operator. |
|---|
| 270 | |
|---|
| 271 | Takes "n" Values from the Generator, like "shift" does. |
|---|
| 272 | The result can be a Generator (when n is big), an Array (possibly Native), |
|---|
| 273 | a single Value (when n == 1), or Void (when n <= 0). |
|---|
| 274 | |
|---|
| 275 | TODO - not sure if "Generator" is an iterator (read-only) or uses destructive |
|---|
| 276 | read (head/tail) |
|---|
| 277 | |
|---|
| 278 | - tail(n) |
|---|
| 279 | |
|---|
| 280 | This operator is used by the Array splice operator. |
|---|
| 281 | |
|---|
| 282 | Takes "n" Values from the back of the Generator, like "pop" does. |
|---|
| 283 | The result can be a Generator (when n is big), an Array (possibly Native), |
|---|
| 284 | a single Value (when n == 1), or Void (when n <= 0). |
|---|
| 285 | |
|---|
| 286 | TODO - check if "Generator" is an iterator (read-only) or uses destructive |
|---|
| 287 | read (head/tail) |
|---|
| 288 | |
|---|
| 289 | - is_contiguous |
|---|
| 290 | |
|---|
| 291 | This is used for the optimization of Slice, when the sequence is used |
|---|
| 292 | as a slice index. |
|---|
| 293 | |
|---|
| 294 | is_contiguous() is True is the generated sequence is like "2 .. 10". |
|---|
| 295 | |
|---|
| 296 | |
|---|
| 297 | = Array internals |
|---|
| 298 | |
|---|
| 299 | == Array (base Class) |
|---|
| 300 | |
|---|
| 301 | TODO |
|---|
| 302 | |
|---|
| 303 | == Eager (Role) |
|---|
| 304 | |
|---|
| 305 | An Eager Array contains a Native Array. |
|---|
| 306 | |
|---|
| 307 | TODO - explain "Cell" |
|---|
| 308 | |
|---|
| 309 | Contents: 1 2 3 |
|---|
| 310 | Internals: [ 1, 2, 3 ] |
|---|
| 311 | |
|---|
| 312 | == Lazy (Role) |
|---|
| 313 | |
|---|
| 314 | A Lazy Array contains a "specs" Array, which contains Generators, or Native Arrays. |
|---|
| 315 | |
|---|
| 316 | Contents: 1 2 3 |
|---|
| 317 | Internals: [ |
|---|
| 318 | [ 1, 2, 3 ] |
|---|
| 319 | ] |
|---|
| 320 | |
|---|
| 321 | Contents: 10 20 30 100..10000 50 400..900 |
|---|
| 322 | Internals: [ |
|---|
| 323 | [ 10, 20, 30 ], |
|---|
| 324 | Generator( 100, 10000 ), |
|---|
| 325 | [ 50 ], |
|---|
| 326 | Generator( 400, 900 ) |
|---|
| 327 | ] |
|---|
| 328 | |
|---|
| 329 | |
|---|
| 330 | = Array API |
|---|
| 331 | |
|---|
| 332 | - TODO |
|---|
| 333 | Array can do "zip", "map", "grep", splice, push, pop, store, fetch. |
|---|
| 334 | Array can be "bound" and "tied". |
|---|
| 335 | |
|---|
| 336 | |
|---|
| 337 | == map() |
|---|
| 338 | |
|---|
| 339 | map() applies a 'Code' to an Array state: |
|---|
| 340 | |
|---|
| 341 | 1: @a = 1..100; # 1, 2, 3, ... |
|---|
| 342 | 2: @b = @a.map:{ $_ * 2 }; # 2, 4, 6, ... |
|---|
| 343 | 3: |
|---|
| 344 | 4: @a[2] = 0; # 1, 0, 3, ... - original has changed |
|---|
| 345 | 5: say @b[1..3]; # 2, 4, 6, ... - result didn't change |
|---|
| 346 | |
|---|
| 347 | Line 1: |
|---|
| 348 | |
|---|
| 349 | Container: @a |
|---|
| 350 | Internals: [ |
|---|
| 351 | Generator( 1, 100 ), |
|---|
| 352 | ] |
|---|
| 353 | |
|---|
| 354 | Line 2: |
|---|
| 355 | |
|---|
| 356 | Container: @_TEMP_ |
|---|
| 357 | Internals: [ |
|---|
| 358 | Generator( 1, 100 ), |
|---|
| 359 | ] |
|---|
| 360 | |
|---|
| 361 | Container: @a |
|---|
| 362 | Internals: [ |
|---|
| 363 | RefArray( @_TEMP_ ), |
|---|
| 364 | ] |
|---|
| 365 | |
|---|
| 366 | Container: @b |
|---|
| 367 | Internals: [ |
|---|
| 368 | MapArray( @_TEMP_, { $_ * 2 } ), |
|---|
| 369 | ] |
|---|
| 370 | |
|---|
| 371 | Line 4: |
|---|
| 372 | |
|---|
| 373 | Container: @_TEMP_ |
|---|
| 374 | Internals: [ |
|---|
| 375 | [ 1, 2 ] |
|---|
| 376 | Generator( 3, 100 ), |
|---|
| 377 | ] |
|---|
| 378 | |
|---|
| 379 | Container: @a |
|---|
| 380 | Internals: [ |
|---|
| 381 | [ 1, 0 ], |
|---|
| 382 | SpliceArray( @_TEMP_, 2 ), |
|---|
| 383 | ] |
|---|
| 384 | |
|---|
| 385 | Container: @b |
|---|
| 386 | Internals: [ |
|---|
| 387 | MapArray( @_TEMP_, { $_ * 2 } ), |
|---|
| 388 | ] |
|---|
| 389 | |
|---|
| 390 | == splice() |
|---|
| 391 | |
|---|
| 392 | This pseudo-code implements splice(). |
|---|
| 393 | It assumes that the internal representation of Array can do .head(n) and .tail(n) - |
|---|
| 394 | see "Generator" for details. |
|---|
| 395 | |
|---|
| 396 | sub splice ( @array, $offset? = 0, $length? = Inf, @list? = () ) { |
|---|
| 397 | my ( @head, @body, @tail ); |
|---|
| 398 | if ( $offset >= 0 ) { |
|---|
| 399 | @head = @array.head( $offset ); |
|---|
| 400 | if ( $length >= 0 ) { |
|---|
| 401 | @body = @array.head( $length ); |
|---|
| 402 | @tail = @array; |
|---|
| 403 | } |
|---|
| 404 | else { |
|---|
| 405 | @tail = @array.tail( -$length ); |
|---|
| 406 | @body = @array; |
|---|
| 407 | } |
|---|
| 408 | } |
|---|
| 409 | else { |
|---|
| 410 | @tail = @array.tail( -$offset ); |
|---|
| 411 | @head = @array; |
|---|
| 412 | if ( $length >= 0 ) { |
|---|
| 413 | @body = $tail.head( $length ); |
|---|
| 414 | } |
|---|
| 415 | else { |
|---|
| 416 | @body = @tail; |
|---|
| 417 | @tail = $body.tail( -$length ); |
|---|
| 418 | } |
|---|
| 419 | }; |
|---|
| 420 | @array = ( @head, @list, @tail ); |
|---|
| 421 | return @body; |
|---|
| 422 | } |
|---|
| 423 | |
|---|
| 424 | = Hash internals |
|---|
| 425 | |
|---|
| 426 | A Hash contains a Native Hash. |
|---|
| 427 | |
|---|
| 428 | Hash keys can be objects, but the objects must be immutable |
|---|
| 429 | |
|---|
| 430 | Contents: a:1 b:2 c:3 |
|---|
| 431 | Internals: { a:1 b:2 c:3 } |
|---|
| 432 | |
|---|
| 433 | A Hash may contain Lazy items. |
|---|
| 434 | This structure contains a Native Array, which contains Generators or Native Hashes. |
|---|
| 435 | This structure permits near O(1) complexity for most operations. |
|---|
| 436 | |
|---|
| 437 | Contents: a:1 b:2 c:3 |
|---|
| 438 | Internals: [ |
|---|
| 439 | { a:1, b:2, c:3 } |
|---|
| 440 | ] |
|---|
| 441 | |
|---|
| 442 | Contents: a:10 b:20 c:30 (1:100 .. 10000:10100) d:50 |
|---|
| 443 | Internals: [ |
|---|
| 444 | { a:10 b:20 c:30 }, |
|---|
| 445 | Generator( 1:100, 10000:10100 ), |
|---|
| 446 | { d:50 }, |
|---|
| 447 | ] |
|---|
| 448 | |
|---|
| 449 | |
|---|
| 450 | = Hash API |
|---|
| 451 | |
|---|
| 452 | - TODO |
|---|
| 453 | |
|---|
| 454 | |
|---|
| 455 | = Slice API |
|---|
| 456 | |
|---|
| 457 | - TODO |
|---|
| 458 | |
|---|
| 459 | |
|---|
| 460 | = Slice Internals |
|---|
| 461 | |
|---|
| 462 | - TODO - hash slice / array slice |
|---|
| 463 | |
|---|
| 464 | |
|---|
| 465 | __END__ |
|---|