seqlua: Extension for handling sequential data in Lua
=====================================================

This package is an experimental extension for the Lua 5.2 programming language
which:

* allows ``ipairs(seq)`` to accept either tables or functions (i.e. function
  iterators) as an argument,
* adds a new function ``string.concat(separator, seq)`` that concatenates either
  table entries or function return values,
* provides auxiliary C functions and macros to simplify iterating over both
  tables and iterator functions with a single generic statement.

Existing ``__ipairs`` or ``__index`` (but not ``__len``) metamethods are
respected by both the Lua functions and the C functions and macros. The
``__ipairs`` metamethod takes precedence over ``__index``, while the
``__len`` metamethod is never used.

Metamethod handling is explained in detail in the last section
("Respected metamethods") at the bottom of this README.

In Lua, this extension is loaded by ``require "seqlua"``. In order to use the
auxiliary C functions and macros, add ``#include <seqlualib.h>`` to your C file
and ensure that the functions implemented in ``seqlualib.c`` are statically or
dynamically linked with your C Lua library.



Motivation
----------

Sequential data (such as arrays or streams) is often represented in two
different ways:

* as an ordered set of values (usually implemented as an array in other
  programming languages, or as a sequence in Lua: a table with numeric keys
  {1..n}, each associated with a value),
* as some sort of data stream (sometimes implemented as a class of objects
  providing certain methods, or as an iterator function in Lua: a function that
  returns the next value with every call, where nil indicates the end of the
  stream).

Quite often, when functions work on sequential data, it shouldn't matter in
which form the sequential data is provided to the function. As an
example, consider a function that writes a sequence of strings to a file.
Such a function could either be fed with an array of strings (a table with
numeric keys in Lua) or with a (possibly infinite) stream of data (an iterator
function in Lua).

A function in Lua that accepts a table might look as follows:

    function write_lines(lines)
      for i, line in ipairs(lines) do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end

In contrast, a function in Lua that accepts an iterator function would have to
be implemented differently:

    function write_lines(get_next_line)
      for line in get_next_line do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end

If one wanted to create a function that accepts either a sequence in the form
of a table or an iterator function, then one might need to write:

    do
      local function write_line(line)
        io.stdout:write(line)
        io.stdout:write("\n")
      end
      function write_lines(lines)
        if type(lines) == "function" then
          for line in lines do
            write_line(line)
          end
        else
          for i, line in ipairs(lines) do
            write_line(line)
          end
        end
      end
    end

Obviously, this isn't something we want to do in every function that accepts
sequential data. Therefore, we usually settle on one of the first two forms
and thus disallow the other possible representation of sequential data from
being passed to the function.

This extension, however, modifies Lua's ``ipairs`` function in such a way that
it automatically accepts either a table or an iterator function as argument.
Thus, the first of the three ``write_lines`` functions above will accept both
(table) sequences and (function) iterators.

In addition to the modification of ``ipairs``, it also provides C functions and
macros to iterate over values in the same manner as a generic loop statement
with ``ipairs`` would do.
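
To make the idea concrete without the C extension, the behavior described
above can be approximated in plain Lua. This is only a minimal sketch under
stated assumptions: ``seq_ipairs``, ``call_aux`` and ``collect`` are
hypothetical names that are not part of seqlua, metamethods are ignored, and
the code says nothing about how the extension itself is implemented.

```lua
-- Hypothetical helper; NOT part of seqlua and not its implementation.
-- Stateless step function: calls the iterator `f` and prepends an index.
local function call_aux(f, i)
  local v = f()
  if v ~= nil then
    return i + 1, v
  end
end

-- Returns a generic-for triplet for either a table or an iterator function.
local function seq_ipairs(seq)
  if type(seq) == "function" then
    return call_aux, seq, 0
  end
  return ipairs(seq)
end

-- One loop body now serves both representations of sequential data.
local function collect(seq)
  local out = {}
  for i, v in seq_ipairs(seq) do
    out[i] = v
  end
  return out
end

local from_table = collect({"a", "b", "c"})

local n = 0
local function counter()  -- iterator function returning 1, 2, 3, then nil
  n = n + 1
  if n <= 3 then return n end
end
local from_function = collect(counter)
```

Because the step function receives the iterator as its static argument and the
running index as its control variable, this sketch needs no additional closure
for the function case.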

This extension doesn't aim to supersede Lua's concept of iterator functions.
While metamethods (see section "Respected metamethods" below) may be used to
customize iteration behavior on values, this extension isn't intended to
replace the common practice of using function closures as iterators. Consider
the following example:

    function write_lines(lines)
      for i, line in ipairs(lines) do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end
    local result = sql_query("SELECT * FROM actor ORDER BY birthdate")
    -- assert(type(result:get_column_entries("name")) == "function")
    write_lines(result:get_column_entries("name"))

Note, however, that in case of repeated or nested loops, using function
iterators may not be feasible:

    function print_list_twice(seq)
      for i = 1, 2 do
        for i, v in ipairs(seq) do
          print(v)
        end
      end
    end
    -- won't work as expected (the iterator is exhausted after the first pass):
    print_list_twice(io.stdin:lines())

Where desired, it is possible to use metamethods to customize iteration
behavior:

    function print_rows(rows)
      for i, row in ipairs(rows) do
        print_row(row)
      end
    end
    local result = sql_query("SELECT * FROM actor ORDER BY birthday")
    assert(type(result) == "userdata")

    -- we may rely on the ``__index`` or ``__ipairs`` metamethod to
    -- iterate through all result rows here:
    print_rows(result)  -- no need to use ":rows()" or a similar syntax

    -- but we can also still pass an individual set of result rows to the
    -- print_rows function:
    print_rows{result[1], result[#result]}

This extension, however, doesn't respect the ``__len`` metamethod due to the
following considerations:

* An efficient implementation where
  ``for i, v in ipairs(tbl) do ... end``
  neither creates a closure nor repeatedly evaluates ``#tbl`` seems to be
  impossible.
* Respecting ``__len`` could be used to implement sparse arrays, but this would
  require iterating functions to expect ``nil`` as a potential value. This may
  lead to problems because ``nil`` is usually also used to indicate the absence
  of a value.

However, if such behavior is desired, it can still be implemented through the
``__ipairs`` metamethod.

Unless the user does so manually in the ``__ipairs`` metamethod, the ``ipairs``
function as well as the corresponding C functions and macros provided by this
extension never create any closures or other values that need to be garbage
collected.



Lua part of the library
-----------------------

The modified ``ipairs(seq)`` and the new ``string.concat(sep, seq)`` functions
accept either a table or a function as ``seq``. This is demonstrated in the
following examples:

    require "seqlua"

    t = {"a", "b", "c"}

    for i, v in ipairs(t) do
      print(i, v)
    end
    -- prints:
    --  1   a
    --  2   b
    --  3   c

    print(string.concat(",", t))
    -- prints: a,b,c

    function alphabet()
      local letter = nil
      return function()
        if letter == nil then
          letter = "a"
        elseif letter == "z" then
          return nil
        else
          letter = string.char(string.byte(letter) + 1)
        end
        return letter
      end
    end

    for i, v in ipairs(alphabet()) do
      print(i, v)
    end
    -- prints:
    --  1   a
    --  2   b
    --  3   c
    --  ...
    --  25  y
    --  26  z

    print(string.concat(",", alphabet()))
    -- prints: a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z

    function filter(f)
      return function(seq)
        return coroutine.wrap(function()
          for i, v in ipairs(seq) do f(v) end
        end)
      end
    end

    alpha_beta_x = filter(function(v)
      if v == "a" then
        coroutine.yield("alpha")
      elseif v == "b" then
        coroutine.yield("beta")
      elseif type(v) == "number" then
        for i = 1, v do
          coroutine.yield("X")
        end
      end
    end)

    print((","):concat(alpha_beta_x{"a", 3, "b", "c", "d"}))
    -- prints: alpha,X,X,X,beta

    print((","):concat(alpha_beta_x(alphabet())))
    -- prints: alpha,beta



C part of the library
---------------------

In ``seqlualib.h``, the following macros are defined:

    #define seqlua_iterloop(L, iter, idx) \
      for ( \
        seqlua_iterinit((L), (iter), (idx)); \
        seqlua_iternext(iter); \
      )

and

    #define seqlua_iterloopauto(L, iter, idx) \
      for ( \
        seqlua_iterinit((L), (iter), (idx)); \
        seqlua_iternext(iter); \
        lua_pop((L), 1) \
      )

These macros allow iteration over either tables or iterator functions, as the
following example function demonstrates:

    int printcsv(lua_State *L) {
      seqlua_Iterator iter;
      seqlua_iterloop(L, &iter, 1) {
        if (seqlua_itercount(&iter) > 1) fputs(",", stdout);
        fputs(luaL_tolstring(L, -1, NULL), stdout);
        // two values need to be popped (the value pushed by
        // seqlua_iternext and the value pushed by luaL_tolstring)
        lua_pop(L, 2);
      }
      fputs("\n", stdout);
      return 0;
    }

    printcsv{"a", "b", "c"}
    -- prints: a,b,c

    printcsv(assert(io.open("testfile")):lines())
    -- prints: line1,line2,... of "testfile"

NOTE: During iteration using ``seqlua_iterloop``, ``seqlua_iterloopauto``, or
``seqlua_iterinit``, three extra elements are stored on the stack (in addition
to the value). These extra elements are removed automatically when the loop ends
(i.e. when ``seqlua_iternext`` returns zero). The value pushed onto the stack
for every iteration step has to be removed manually from the stack, unless
``seqlua_iterloopauto`` is used.



Respected metamethods
---------------------

Regarding the behavior of the Lua functions and the C functions and macros
provided by this extension, an existing ``__index`` metamethod will be
respected automatically. An existing ``__ipairs`` metamethod, however, takes
precedence.

If the ``__ipairs`` field of a value's metatable is set, then it must always
refer to a function. When iteration starts over a value with such a
metamethod set, this function is called with ``self`` (i.e. the
value itself) passed as its first argument. The return values of the
``__ipairs`` metamethod may take one of the following four forms:

* ``return function_or_callable, static_argument, startindex`` causes the three
  values to be returned by ``ipairs`` without further modification. When using
  the C macros and functions for iteration, the behavior corresponds to the
  generic loop statement in Lua:
  ``for i, v in function_or_callable, static_argument, startindex do ...
  end``
* ``return "raw", table`` will result in iteration over the table ``table``
  using ``lua_rawgeti``
* ``return "index", table_or_userdata`` will result in iteration over the table
  or userdata while respecting any ``__index`` metamethod of the table or
  userdata value
* ``return "call", function_or_callable`` will use the callable value as
  (function) iterator where the function is expected to return a single value
  without any index (the index is inserted automatically when using the
  ``ipairs`` function for iteration)

These possibilities are demonstrated by the following example code:

    require "seqlua"

    do
      local function ipairsaux(t, i)
        i = i + 1
        if i <= 3 then
          return i, t[i]
        end
      end
      custom = setmetatable(
        {"one", "two", "three", "four", "five"},
        {
          __ipairs = function(self)
            return ipairsaux, self, 0
          end
        }
      )
    end
    print(string.concat(",", custom))
    -- prints: one,two,three
    -- (note that "four" and "five" are not printed)

    tbl = {"alpha", "beta"}

    proxy1 = setmetatable({}, {__index = tbl})
    for i, v in ipairs(proxy1) do print(i, v) end
    -- prints:
    --  1   alpha
    --  2   beta

    proxy2 = setmetatable({}, {
      __ipairs = function(self)
        return "index", proxy1
      end
    })
    for i, v in ipairs(proxy2) do print(i, v) end
    -- prints:
    --  1   alpha
    --  2   beta
    print(proxy2[1])
    -- prints: nil

    cursor = setmetatable({
      "alice", "bob", "charlie", pos=1
    }, {
      __call = function(self)
        local value = self[self.pos]
        if value == nil then
          self.pos = 1
        else
          self.pos = self.pos + 1
        end
        return value
      end,
      __ipairs = function(self)
        return "call", self
      end
    })
    for i, v in ipairs(cursor) do print(i, v) end
    -- prints:
    --  1   alice
    --  2   bob
    --  3   charlie
    print(cursor())
    -- prints: alice
    for i, v in ipairs(cursor) do print(i, v) end
    -- prints:
    --  1   bob
    --  2   charlie
    -- (note that "alice" has been returned earlier)

    coefficients = setmetatable({1.25, 3.14, 17.5}, {
      __index = function(self) return 1 end,
      __ipairs = function(self) return "raw", self end
    })
    for i, v in ipairs(coefficients) do print(i, v) end
    -- prints:
    --  1   1.25
    --  2   3.14
    --  3   17.5
    -- (note that iteration terminates even if coefficients[4] == 1)
    print(coefficients[4])
    -- prints: 1
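
The dispatch rules of this section can also be written down in plain Lua. The
following is a rough sketch only, assuming hypothetical names
(``dispatch_ipairs``, ``raw_aux``, ``index_aux``, ``call_aux``) that are not
part of seqlua; the real extension is implemented in C and may differ in
details such as error handling.

```lua
-- Hypothetical pure-Lua sketch; NOT seqlua's actual (C) implementation.

local function raw_aux(t, i)    -- "raw" form: bypasses any __index
  i = i + 1
  local v = rawget(t, i)
  if v ~= nil then return i, v end
end

local function index_aux(t, i)  -- "index" form: respects __index
  i = i + 1
  local v = t[i]
  if v ~= nil then return i, v end
end

local function call_aux(f, i)   -- "call" form: index is added automatically
  local v = f()
  if v ~= nil then return i + 1, v end
end

local function dispatch_ipairs(seq)
  local mt = getmetatable(seq)
  local handler = type(mt) == "table" and rawget(mt, "__ipairs") or nil
  if handler then               -- __ipairs takes precedence over __index
    local a, b, c = handler(seq)
    if a == "raw" then return raw_aux, b, 0 end
    if a == "index" then return index_aux, b, 0 end
    if a == "call" then return call_aux, b, 0 end
    return a, b, c              -- plain triplet, passed through unchanged
  elseif type(seq) == "function" then
    return call_aux, seq, 0
  end
  return index_aux, seq, 0      -- default: index access, never __len
end

-- "raw" form: the __index fallback value 1 does not prolong the loop.
local coefficients = setmetatable({1.25, 3.14, 17.5}, {
  __index = function() return 1 end,
  __ipairs = function(self) return "raw", self end
})
local count = 0
for i, v in dispatch_ipairs(coefficients) do count = count + 1 end

-- No __ipairs set: iteration falls back to __index-aware access.
local proxy = setmetatable({}, {__index = {"alpha", "beta"}})
local seen = {}
for i, v in dispatch_ipairs(proxy) do seen[i] = v end
```

In this sketch every form ends the loop at the first ``nil`` value, which
mirrors the decision not to consult ``__len``.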