seqlua
annotate README @ 52:3362ec36cb09
Do not automatically assume that functions passed to ipairs are iterators
but require ipairs(func, mode) to have an explicit mode set to "call" or "generator"
author | jbe |
---|---|
date | Tue Aug 26 21:10:03 2014 +0200 (2014-08-26) |
parents | 06c5f2f9ec41 |
children | 664736a8fcbf |
rev | line source |
---|---|
jbe@37 | 1 seqlua: Extension for handling sequential data in Lua |
jbe@37 | 2 ===================================================== |
jbe@0 | 3 |
jbe@52 | 4 This package is an experimental extension for the Lua programming language |
jbe@52 | 5 (version 5.2) which: |
jbe@0 | 6 |
jbe@52 | 7 * makes ``ipairs(tbl)`` respect both metamethods ``__index`` and ``__ipairs`` |
jbe@52 | 8 (where ``__ipairs`` has precedence over ``__index``), |
jbe@52 | 9 * allows ``ipairs(seq, "call")`` to accept either tables or functions as first |
jbe@52 | 10 argument where a function is used as iterator, |
jbe@52 | 11 * allows ``ipairs(seq, "generator")`` to accept either tables or functions as |
jbe@52 | 12 first argument where a function is used as generator for an iterator, |
jbe@49 | 13 * adds a new function ``string.concat(separator, seq)`` that concatenates either
jbe@32 | 14 table entries or function return values, |
jbe@49 | 15 * provides auxiliary C functions and macros to simplify iterating over both |
jbe@38 | 16 tables and iterator functions with a generic statement. |
jbe@0 | 17 |
jbe@33 | 18 Existing ``__ipairs`` or ``__index`` (but not ``__len``) metamethods are |
jbe@33 | 19 respected by both the Lua functions and the C functions and macros. The |
jbe@32 | 20 ``__ipairs`` metamethod takes precedence over ``__index``, while the |
jbe@32 | 21 ``__len`` metamethod is never used. |
jbe@32 | 22 |
jbe@37 | 23 Metamethod handling is explained in detail in the last section
jbe@37 | 24 ("Respected metamethods") at the bottom of this README.
jbe@37 | 25 |
jbe@49 | 26 In Lua, this extension is loaded by ``require "seqlua"``. In order to use the |
jbe@49 | 27 auxiliary C functions and macros, add ``#include <seqlualib.h>`` to your C file |
jbe@49 | 28 and ensure that the functions implemented in ``seqlualib.c`` are statically or |
jbe@49 | 29 dynamically linked with your C Lua library. |
jbe@49 | 30 |
jbe@37 | 31 |
jbe@37 | 32 |
jbe@37 | 33 Motivation |
jbe@37 | 34 ---------- |
jbe@37 | 35 |
jbe@37 | 36 Sequential data (such as arrays or streams) is often represented in two |
jbe@37 | 37 different ways: |
jbe@37 | 38 |
jbe@37 | 39 * as an ordered set of values (usually implemented as an array in other |
jbe@37 | 40 programming languages, or as a sequence in Lua: a table with numeric keys |
jbe@37 | 41 {1..n} associated with a value each), |
jbe@37 | 42 * as some sort of data stream (sometimes implemented as a class of objects |
jbe@37 | 43 providing certain methods, or as an iterator function in Lua: a function that |
jbe@37 | 44 returns the next value with every call, where nil indicates the end of the |
jbe@37 | 45 stream). |
jbe@37 | 46 |
jbe@37 | 47 Quite often, when functions work on sequential data, it shouldn't matter in |
jbe@37 | 48 which form the sequential data is being provided to the function. As an |
jbe@37 | 49 example, consider a function that is writing a sequence of strings to a file. |
jbe@37 | 50 Such function could either be fed with an array of strings (a table with |
jbe@37 | 51 numeric keys in Lua) or with a (possibly infinite) stream of data (an iterator |
jbe@37 | 52 function in Lua). |
jbe@37 | 53 |
jbe@37 | 54 A function in Lua that accepts a table might look as follows:
jbe@37 | 55 |
jbe@37 | 56 function write_lines(lines) |
jbe@37 | 57 for i, line in ipairs(lines) do |
jbe@37 | 58 io.stdout:write(line) |
jbe@37 | 59 io.stdout:write("\n") |
jbe@37 | 60 end |
jbe@37 | 61 end |
jbe@37 | 62 |
jbe@37 | 63 In contrast, a function in Lua that accepts an iterator function would have to |
jbe@37 | 64 be implemented differently: |
jbe@37 | 65 |
jbe@37 | 66 function write_lines(get_next_line) |
jbe@37 | 67 for line in get_next_line do |
jbe@37 | 68 io.stdout:write(line) |
jbe@37 | 69 io.stdout:write("\n") |
jbe@37 | 70 end |
jbe@37 | 71 end |
jbe@37 | 72 |
jbe@37 | 73 If one wanted to create a function that accepts either a sequence in form of a |
jbe@37 | 74 table or an iterator function, then one might need to write: |
jbe@37 | 75 |
jbe@41 | 76 do |
jbe@41 | 77 local function write_line(line) |
jbe@37 | 78 io.stdout:write(line) |
jbe@37 | 79 io.stdout:write("\n") |
jbe@37 | 80 end |
jbe@41 | 81 function write_lines(lines) |
jbe@41 | 82 if type(lines) == "function" then |
jbe@41 | 83 for line in lines do |
jbe@41 | 84 write_line(line) |
jbe@41 | 85 end |
jbe@41 | 86 else |
jbe@41 | 87 for i, line in ipairs(lines) do |
jbe@41 | 88 write_line(line) |
jbe@41 | 89 end |
jbe@41 | 90 end |
jbe@41 | 91 end |
jbe@37 | 92 end |
jbe@37 | 93 |
jbe@41 | 94 Obviously, this isn't something we want to do in every function that accepts |
jbe@37 | 95 sequential data. Therefore, we usually settle on one of the first two forms
jbe@48 | 96 and thus prevent the other possible representation of sequential data from
jbe@48 | 97 being passed to the function.
jbe@37 | 98 |
jbe@37 | 99 This extension, however, modifies Lua's ``ipairs`` function in such a way that
jbe@37 | 100 it accepts either a table or, when a mode is given, an iterator function as argument.
jbe@52 | 101 Thus, the function below will accept both (table) sequences and (function) |
jbe@52 | 102 iterators: |
jbe@52 | 103 |
jbe@52 | 104 function write_lines(lines) |
jbe@52 | 105 for i, line in ipairs(lines, "call") do |
jbe@52 | 106 io.stdout:write(line) |
jbe@52 | 107 io.stdout:write("\n") |
jbe@52 | 108 end |
jbe@52 | 109 end |
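The unified loop above can be approximated in plain Lua without the extension. The following is a minimal sketch, assuming that functions require an explicit mode and that all other values fall through to the stock ``ipairs``. The names ``seq_ipairs``, ``countdown`` and ``collect_pairs`` are hypothetical and not part of seqlua, and unlike the extension this version allocates a closure for every function argument:

```lua
-- Hypothetical pure-Lua emulation of ipairs(seq, mode); NOT seqlua's code.
function seq_ipairs(seq, mode)
  if type(seq) ~= "function" then
    return ipairs(seq)            -- tables: defer to the stock ipairs
  end
  if mode == "generator" then
    seq = seq()                   -- call the generator once to get the iterator
  elseif mode ~= "call" then
    -- assumption: a function without an explicit mode is rejected
    error('mode must be "call" or "generator" for functions', 2)
  end
  local i = 0
  return function()
    local v = seq()
    if v ~= nil then              -- nil marks the end of the stream
      i = i + 1
      return i, v
    end
  end
end

-- Example generator: returns an iterator that yields n, n-1, ..., 1.
function countdown(n)
  return function()
    if n > 0 then
      n = n - 1
      return n + 1
    end
  end
end

-- Runs the loop and records "index=value" for every step.
function collect_pairs(seq, mode)
  local out = {}
  for i, v in seq_ipairs(seq, mode) do
    out[#out + 1] = i .. "=" .. v
  end
  return table.concat(out, " ")
end

print(collect_pairs({"a", "b"}))
-- prints: 1=a 2=b
print(collect_pairs(countdown(3), "call"))
-- prints: 1=3 2=2 3=1
print(collect_pairs(function() return countdown(2) end, "generator"))
-- prints: 1=2 2=1
```

The index is supplied by the wrapper, so the iterator function itself only has to return one value per call.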
jbe@37 | 110 |
jbe@37 | 111 In addition to the modification of ``ipairs``, it also provides C functions and |
jbe@37 | 112 macros to iterate over values in the same manner as a generic loop statement |
jbe@37 | 113 with ``ipairs`` would do. |
jbe@37 | 114 |
jbe@37 | 115 Note that this extension doesn't aim to supersede Lua's concept of iterator |
jbe@37 | 116 functions. While metamethods (see section "Respected metamethods" below) may be |
jbe@37 | 117 used to customize iteration behavior on values, this extension isn't intended to
jbe@37 | 118 replace the common practice of using function closures as iterators. Consider the
jbe@37 | 119 following example: |
jbe@37 | 120 |
jbe@37 | 121 local result = sql_query("SELECT * FROM actor ORDER BY birthdate") |
jbe@37 | 122 write_lines(result:get_column_entries("name")) |
jbe@37 | 123 |
jbe@37 | 124 The ``get_column_entries`` method can return a simple function closure that |
jbe@37 | 125 returns the next entry in the "name" column (returning ``nil`` to indicate the |
jbe@37 | 126 end). Such a closure can then be passed to another function that iterates |
jbe@37 | 127 through a sequence of values by invoking ``ipairs`` in a generic for-loop
jbe@37 | 128 (as previously shown). |
jbe@37 | 129 |
jbe@37 | 130 Where desired, it is also possible to use metamethods to customize iteration |
jbe@44 | 131 behavior: |
jbe@44 | 132 |
jbe@44 | 133 function print_rows(rows) |
jbe@44 | 134 for i, row in ipairs(rows) do |
jbe@44 | 135 print_row(row) |
jbe@44 | 136 end |
jbe@44 | 137 end |
jbe@44 | 138 local result = sql_query("SELECT * FROM actor ORDER BY birthdate")
jbe@46 | 139 assert(type(result) == "userdata") |
jbe@44 | 140 |
jbe@44 | 141 -- we may rely on the ``__index`` or ``__ipairs`` metamethod to |
jbe@44 | 142 -- iterate through all result rows here: |
jbe@44 | 143 print_rows(result) -- no need to use ":rows()" or a similar syntax |
jbe@44 | 144 |
jbe@45 | 145 -- but we can also still pass an individual set of result rows to the |
jbe@44 | 146 -- print_rows function: |
jbe@44 | 147 print_rows{result[1], result[#result]} |
jbe@44 | 148 |
jbe@44 | 149 This extension, however, doesn't respect the ``__len`` metamethod due to the |
jbe@47 | 150 following considerations: |
jbe@37 | 151 |
jbe@39 | 152 * An efficient implementation where ``for i, v in ipairs(tbl) do ... end``
jbe@39 | 153 neither creates a closure nor repeatedly evaluates ``#tbl`` seems to be
jbe@39 | 154 impossible.
jbe@37 | 155 * Respecting ``__len`` could be used to implement sparse arrays, but this would |
jbe@37 | 156 require iterating functions to expect ``nil`` as a potential value. This may |
jbe@37 | 157 lead to problems because ``nil`` is usually also used to indicate the absence |
jbe@37 | 158 of a value. |
jbe@37 | 159 |
jbe@40 | 160 If such behavior is desired, however, it can still be implemented through the
jbe@37 | 161 ``__ipairs`` metamethod. |
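The ``__ipairs`` route mentioned above could look like the following sketch. A hypothetical sparse array stores its length in a field ``n`` (an assumption made for illustration) and its metamethod returns the three-value form, so consumers have to cope with ``nil`` values. The metamethod is called directly here so that the snippet does not depend on which ``ipairs`` is installed:

```lua
do
  -- Stateless step function: stops at the stored length, not at the first nil.
  local function sparse_step(t, i)
    i = i + 1
    if i <= t.n then
      return i, t[i]              -- t[i] may be nil for a hole
    end
  end
  sparse = setmetatable({n = 4, [1] = "a", [4] = "d"}, {
    __ipairs = function(self)
      return sparse_step, self, 0
    end
  })
end

seen = {}
for i, v in getmetatable(sparse).__ipairs(sparse) do
  seen[#seen + 1] = i .. ":" .. tostring(v)
end
print(table.concat(seen, " "))
-- prints: 1:a 2:nil 3:nil 4:d
```

The loop keeps running through the holes because a generic for-loop only stops when the first returned value (the index) is ``nil``.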
jbe@37 | 162 |
jbe@48 | 163 Unless manually done by the user in the ``__ipairs`` metamethod, the ``ipairs`` |
jbe@48 | 164 function as well as the corresponding C functions and macros provided by this |
jbe@48 | 165 extension never create any closures or other values that need to be garbage |
jbe@48 | 166 collected. |
jbe@37 | 167 |
jbe@0 | 168 |
jbe@0 | 169 |
jbe@0 | 170 Lua part of the library |
jbe@0 | 171 ----------------------- |
jbe@0 | 172 |
jbe@30 | 173 The modified ``ipairs(seq)`` and the new ``string.concat(sep, seq)`` functions |
jbe@30 | 174 accept either a table or a function as ``seq``. This is demonstrated in the |
jbe@30 | 175 following examples: |
jbe@0 | 176 |
jbe@0 | 177 require "seqlua" |
jbe@0 | 178 |
jbe@0 | 179 t = {"a", "b", "c"} |
jbe@0 | 180 |
jbe@52 | 181 for i, v in ipairs(t, "call") do |
jbe@0 | 182 print(i, v) |
jbe@0 | 183 end |
jbe@0 | 184 -- prints: |
jbe@0 | 185 -- 1 a |
jbe@0 | 186 -- 2 b |
jbe@0 | 187 -- 3 c |
jbe@0 | 188 |
jbe@25 | 189 print(string.concat(",", t)) |
jbe@25 | 190 -- prints: a,b,c |
jbe@25 | 191 |
jbe@19 | 192 function alphabet() |
jbe@0 | 193 local letter = nil |
jbe@0 | 194 return function() |
jbe@0 | 195 if letter == nil then |
jbe@19 | 196 letter = "a" |
jbe@19 | 197 elseif letter == "z" then |
jbe@0 | 198 return nil |
jbe@0 | 199 else |
jbe@0 | 200 letter = string.char(string.byte(letter) + 1) |
jbe@0 | 201 end |
jbe@0 | 202 return letter |
jbe@0 | 203 end |
jbe@0 | 204 end |
jbe@0 | 205 |
jbe@52 | 206 for i, v in ipairs(alphabet(), "call") do |
jbe@52 | 207 print(i, v) |
jbe@52 | 208 end |
jbe@52 | 209 -- prints: |
jbe@52 | 210 -- 1 a |
jbe@52 | 211 -- 2 b |
jbe@52 | 212 -- 3 c |
jbe@52 | 213 -- ... |
jbe@52 | 214 -- 25 y |
jbe@52 | 215 -- 26 z |
jbe@52 | 216 |
jbe@52 | 217 for i, v in ipairs(alphabet, "generator") do |
jbe@0 | 218 print(i, v) |
jbe@0 | 219 end |
jbe@0 | 220 -- prints: |
jbe@0 | 221 -- 1 a |
jbe@0 | 222 -- 2 b |
jbe@0 | 223 -- 3 c |
jbe@0 | 224 -- ... |
jbe@0 | 225 -- 25 y |
jbe@0 | 226 -- 26 z |
jbe@0 | 227 |
jbe@25 | 228 print(string.concat(",", alphabet())) |
jbe@25 | 229 -- prints: a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z |
jbe@25 | 230 |
jbe@26 | 231 function filter(f) |
jbe@26 | 232 return function(seq) |
jbe@26 | 233 return coroutine.wrap(function() |
jbe@52 | 234 for i, v in ipairs(seq, "call") do f(v) end |
jbe@26 | 235 end) |
jbe@26 | 236 end |
jbe@0 | 237 end |
jbe@19 | 238 |
jbe@29 | 239 alpha_beta_x = filter(function(v) |
jbe@28 | 240 if v == "a" then |
jbe@28 | 241 coroutine.yield("alpha") |
jbe@28 | 242 elseif v == "b" then |
jbe@28 | 243 coroutine.yield("beta") |
jbe@28 | 244 elseif type(v) == "number" then |
jbe@23 | 245 for i = 1, v do |
jbe@28 | 246 coroutine.yield("X") |
jbe@23 | 247 end |
jbe@0 | 248 end |
jbe@26 | 249 end) |
jbe@0 | 250 |
jbe@29 | 251 print((","):concat(alpha_beta_x{"a", 3, "b", "c", "d"})) |
jbe@28 | 252 -- prints: alpha,X,X,X,beta |
jbe@25 | 253 |
jbe@29 | 254 print((","):concat(alpha_beta_x(alphabet()))) |
jbe@28 | 255 -- prints: alpha,beta |
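For readers without the extension at hand, the behavior of ``string.concat`` on plain tables and iterator functions can be approximated as follows. This is a sketch only: ``concat_seq`` is a hypothetical name, metamethods are ignored, and converting entries with ``tostring`` is an assumption rather than documented behavior.

```lua
-- Hypothetical stand-in for string.concat(sep, seq); NOT seqlua's code.
function concat_seq(sep, seq)
  local parts = {}
  if type(seq) == "function" then
    for v in seq do                     -- iterator: call until it returns nil
      parts[#parts + 1] = tostring(v)
    end
  else
    for i, v in ipairs(seq) do          -- table: stock ipairs
      parts[i] = tostring(v)
    end
  end
  return table.concat(parts, sep)
end

print(concat_seq(",", {"a", "b", "c"}))
-- prints: a,b,c

words = ("x y z"):gmatch("%S+")         -- a stock iterator function
print(concat_seq("-", words))
-- prints: x-y-z
```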
jbe@27 | 256 |
jbe@0 | 257 |
jbe@37 | 258 |
jbe@0 | 259 C part of the library |
jbe@0 | 260 --------------------- |
jbe@0 | 261 |
jbe@0 | 262 In ``seqlualib.h``, the following macros are defined:
jbe@0 | 263 |
jbe@0 | 264 #define seqlua_iterloop(L, iter, idx) \ |
jbe@0 | 265 for ( \ |
jbe@0 | 266 seqlua_iterinit((L), (iter), (idx)); \ |
jbe@0 | 267 seqlua_iternext(iter); \ |
jbe@25 | 268 ) |
jbe@25 | 269 |
jbe@25 | 270 and |
jbe@25 | 271 |
jbe@25 | 272 #define seqlua_iterloopauto(L, iter, idx) \ |
jbe@25 | 273 for ( \ |
jbe@25 | 274 seqlua_iterinit((L), (iter), (idx)); \ |
jbe@25 | 275 seqlua_iternext(iter); \ |
jbe@0 | 276 lua_pop((L), 1) \ |
jbe@0 | 277 ) |
jbe@0 | 278 |
jbe@23 | 279 These macros allow iteration over either tables or iterator functions as the
jbe@23 | 280 following example function demonstrates: |
jbe@0 | 281 |
jbe@0 | 282 int printcsv(lua_State *L) { |
jbe@0 | 283 seqlua_Iterator iter; |
jbe@0 | 284 seqlua_iterloop(L, &iter, 1) { |
jbe@0 | 285 if (seqlua_itercount(&iter) > 1) fputs(",", stdout); |
jbe@0 | 286 fputs(luaL_tolstring(L, -1, NULL), stdout); |
jbe@25 | 287 // two values need to be popped (the value pushed by |
jbe@25 | 288 // seqlua_iternext and the value pushed by luaL_tolstring) |
jbe@25 | 289 lua_pop(L, 2); |
jbe@0 | 290 } |
jbe@0 | 291 fputs("\n", stdout); |
jbe@0 | 292 return 0; |
jbe@0 | 293 } |
jbe@0 | 294 |
jbe@11 | 295 printcsv{"a", "b", "c"} |
jbe@11 | 296 -- prints: a,b,c |
jbe@11 | 297 |
jbe@11 | 298 printcsv(assert(io.open("testfile")):lines()) |
jbe@11 | 299 -- prints: line1,line2,... of "testfile" |
jbe@0 | 300 |
jbe@31 | 301 NOTE: During iteration using ``seqlua_iterloop``, ``seqlua_iterloopauto``, or |
jbe@31 | 302 ``seqlua_iterinit``, three extra elements are stored on the stack (in addition
jbe@31 | 303 to the value). These extra elements are removed automatically when the loop ends
jbe@31 | 304 (i.e. when ``seqlua_iternext`` returns zero). The value pushed onto the stack |
jbe@31 | 305 for every iteration step has to be removed manually from the stack, unless |
jbe@31 | 306 ``seqlua_iterloopauto`` is used. |
jbe@0 | 307 |
jbe@31 | 308 |
jbe@37 | 309 |
jbe@35 | 310 Respected metamethods |
jbe@35 | 311 --------------------- |
jbe@35 | 312 |
jbe@35 | 313 Regarding the behavior of the Lua functions and the C functions and macros |
jbe@35 | 314 provided by this extension, an existing ``__index`` metamethod will be |
jbe@35 | 315 respected automatically. An existing ``__ipairs`` metamethod, however, takes |
jbe@35 | 316 precedence. |
jbe@35 | 317 |
jbe@35 | 318 If the ``__ipairs`` field of a value's metatable is set, then it must always |
jbe@35 | 319 refer to a function. When starting iteration over a value with such a |
jbe@35 | 320 metamethod being set, then this function is called with ``self`` (i.e. the |
jbe@35 | 321 value itself) passed as first argument. The return values of the ``__ipairs`` |
jbe@35 | 322 metamethod may take one of the following 4 forms: |
jbe@35 | 323 |
jbe@35 | 324 * ``return function_or_callable, static_argument, startindex`` causes these three
jbe@35 | 325 values to be returned by ``ipairs`` without further modification. Using
jbe@35 | 326 the C macros and functions for iteration, the behavior is according to the |
jbe@35 | 327 generic loop statement in Lua: |
jbe@35 | 328 ``for i, v in function_or_callable, static_argument, startindex do ... end`` |
jbe@35 | 329 * ``return "raw", table`` will result in iteration over the table ``table`` |
jbe@35 | 330 using ``lua_rawgeti`` |
jbe@35 | 331 * ``return "index", table_or_userdata`` will result in iteration over the table |
jbe@35 | 332 or userdata while respecting any ``__index`` metamethod of the table or |
jbe@35 | 333 userdata value |
jbe@35 | 334 * ``return "call", function_or_callable`` will use the callable value as |
jbe@35 | 335 (function) iterator where the function is expected to return a single value |
jbe@35 | 336 without any index (the index is inserted automatically when using the |
jbe@35 | 337 ``ipairs`` function for iteration) |
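The four forms listed above can be pictured as a dispatch on the first return value. The following pure-Lua sketch is an illustration under assumptions, not seqlua's implementation: ``seq_iter``, ``index_step``, ``raw_step``, ``collect`` and ``defaults`` are hypothetical names, and the "call" branch allocates a closure, which the extension is stated to avoid.

```lua
-- Step functions following the generic for-loop protocol.
local function index_step(t, i)     -- respects __index
  i = i + 1
  local v = t[i]
  if v ~= nil then return i, v end
end
local function raw_step(t, i)       -- bypasses metamethods
  i = i + 1
  local v = rawget(t, i)
  if v ~= nil then return i, v end
end

function seq_iter(value)
  local mt = getmetatable(value)
  local handler = type(mt) == "table" and mt.__ipairs
  if not handler then
    return index_step, value, 0     -- no __ipairs: plain indexing
  end
  local a, b, c = handler(value)
  if a == "raw" then
    return raw_step, b, 0
  elseif a == "index" then
    return index_step, b, 0
  elseif a == "call" then
    local i = 0
    return function()               -- the index is inserted here
      local v = b()
      if v ~= nil then
        i = i + 1
        return i, v
      end
    end
  end
  return a, b, c                    -- iterator triplet, passed through as is
end

-- Gathers all iterated values into a comma-separated string.
function collect(value)
  local out = {}
  for i, v in seq_iter(value) do out[i] = v end
  return table.concat(out, ",")
end

defaults = setmetatable({"x", "y"}, {
  __index = function() return "default" end,
  __ipairs = function(self) return "raw", self end
})
print(collect(defaults))
-- prints: x,y
print(defaults[3])
-- prints: default
```

Without the ``"raw"`` form, iterating over ``defaults`` by indexing would never terminate, because ``__index`` supplies a value for every key.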
jbe@35 | 338 |
jbe@35 | 339 These possibilities are demonstrated by the following example code:
jbe@35 | 340 |
jbe@35 | 341 require "seqlua" |
jbe@35 | 342 |
jbe@35 | 343 do |
jbe@35 | 344 local function ipairsaux(t, i) |
jbe@35 | 345 i = i + 1 |
jbe@35 | 346 if i <= 3 then |
jbe@35 | 347 return i, t[i] |
jbe@35 | 348 end |
jbe@35 | 349 end |
jbe@35 | 350 custom = setmetatable( |
jbe@35 | 351 {"one", "two", "three", "four", "five"}, |
jbe@35 | 352 { |
jbe@35 | 353 __ipairs = function(self) |
jbe@35 | 354 return ipairsaux, self, 0 |
jbe@35 | 355 end |
jbe@35 | 356 } |
jbe@35 | 357 ) |
jbe@35 | 358 end |
jbe@35 | 359 print(string.concat(",", custom)) |
jbe@36 | 360 -- prints: one,two,three |
jbe@35 | 361 -- (note that "four" and "five" are not printed) |
jbe@35 | 362 |
jbe@35 | 363 tbl = {"alpha", "beta"} |
jbe@35 | 364 |
jbe@35 | 365 proxy1 = setmetatable({}, {__index = tbl}) |
jbe@35 | 366 for i, v in ipairs(proxy1) do print(i, v) end |
jbe@35 | 367 -- prints: |
jbe@35 | 368 -- 1 alpha |
jbe@35 | 369 -- 2 beta |
jbe@35 | 370 |
jbe@35 | 371 proxy2 = setmetatable({}, { |
jbe@35 | 372 __ipairs = function(self) |
jbe@35 | 373 return "index", proxy1 |
jbe@35 | 374 end |
jbe@35 | 375 }) |
jbe@35 | 376 for i, v in ipairs(proxy2) do print(i, v) end |
jbe@35 | 377 -- prints: |
jbe@35 | 378 -- 1 alpha |
jbe@35 | 379 -- 2 beta |
jbe@35 | 380 print(proxy2[1]) |
jbe@35 | 381 -- prints: nil |
jbe@35 | 382 |
jbe@35 | 383 cursor = setmetatable({ |
jbe@35 | 384 "alice", "bob", "charlie", pos=1 |
jbe@35 | 385 }, { |
jbe@35 | 386 __call = function(self) |
jbe@35 | 387 local value = self[self.pos] |
jbe@35 | 388 if value == nil then |
jbe@35 | 389 self.pos = 1 |
jbe@35 | 390 else |
jbe@35 | 391 self.pos = self.pos + 1 |
jbe@35 | 392 end |
jbe@35 | 393 return value |
jbe@35 | 394 end, |
jbe@35 | 395 __ipairs = function(self) |
jbe@35 | 396 return "call", self |
jbe@35 | 397 end |
jbe@35 | 398 }) |
jbe@35 | 399 for i, v in ipairs(cursor) do print(i, v) end |
jbe@35 | 400 -- prints: |
jbe@35 | 401 -- 1 alice |
jbe@35 | 402 -- 2 bob |
jbe@35 | 403 -- 3 charlie |
jbe@35 | 404 print(cursor()) |
jbe@35 | 405 -- prints: alice |
jbe@35 | 406 for i, v in ipairs(cursor) do print(i, v) end |
jbe@35 | 407 -- prints: |
jbe@35 | 408 -- 1 bob |
jbe@35 | 409 -- 2 charlie |
jbe@35 | 410 -- (note that "alice" has been returned earlier) |
jbe@35 | 411 |
jbe@35 | 412 coefficients = setmetatable({1.25, 3.14, 17.5}, { |
jbe@35 | 413 __index = function(self) return 1 end, |
jbe@35 | 414 __ipairs = function(self) return "raw", self end |
jbe@35 | 415 }) |
jbe@35 | 416 for i, v in ipairs(coefficients) do print(i, v) end |
jbe@35 | 417 -- prints: |
jbe@35 | 418 -- 1 1.25 |
jbe@35 | 419 -- 2 3.14 |
jbe@35 | 420 -- 3 17.5 |
jbe@35 | 421 -- (note that iteration terminates even if coefficients[4] == 1) |
jbe@35 | 422 print(coefficients[4]) |
jbe@35 | 423 -- prints: 1 |
jbe@35 | 424 |
jbe@35 | 425 |