Added warning regarding nested/repeated loops to README
| author | jbe | 
|---|---|
| date | Wed Aug 27 00:21:04 2014 +0200 (2014-08-27) | 
| parents | 92ce3958aca7 | 
| children | c3976eacc6ab | 

seqlua: Extension for handling sequential data in Lua
=====================================================

This package is an experimental extension for the Lua 5.2 programming language
which:

* allows ``ipairs(seq)`` to accept either tables or functions (i.e. function
  iterators) as an argument,
* adds a new function ``string.concat(separator, seq)`` that concatenates
  either table entries or function return values,
* provides auxiliary C functions and macros to simplify iterating over both
  tables and iterator functions with a generic statement.

Existing ``__ipairs`` or ``__index`` (but not ``__len``) metamethods are
respected by both the Lua functions and the C functions and macros. The
``__ipairs`` metamethod takes precedence over ``__index``, while the
``__len`` metamethod is never used.

Metamethod handling is explained in detail in the last section
("Respected metamethods") at the bottom of this README.

In Lua, this extension is loaded by ``require "seqlua"``. In order to use the
auxiliary C functions and macros, add ``#include <seqlualib.h>`` to your C file
and ensure that the functions implemented in ``seqlualib.c`` are statically or
dynamically linked with your C Lua library.


Motivation
----------

Sequential data (such as arrays or streams) is often represented in two
different ways:

* as an ordered set of values (usually implemented as an array in other
  programming languages, or as a sequence in Lua: a table with numeric keys
  {1..n} associated with a value each),
* as some sort of data stream (sometimes implemented as a class of objects
  providing certain methods, or as an iterator function in Lua: a function that
  returns the next value with every call, where nil indicates the end of the
  stream).

Quite often, when functions work on sequential data, it shouldn't matter in
which form the sequential data is being provided to the function. As an
example, consider a function that writes a sequence of strings to a file.
Such a function could either be fed with an array of strings (a table with
numeric keys in Lua) or with a (possibly infinite) stream of data (an iterator
function in Lua).

A function in Lua that accepts a table might look as follows:

    function write_lines(lines)
      for i, line in ipairs(lines) do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end

In contrast, a function in Lua that accepts an iterator function would have to
be implemented differently:

    function write_lines(get_next_line)
      for line in get_next_line do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end

If one wanted to create a function that accepts either a sequence in the form
of a table or an iterator function, then one might need to write:

    do
      local function write_line(line)
        io.stdout:write(line)
        io.stdout:write("\n")
      end
      function write_lines(lines)
        if type(lines) == "function" then
          for line in lines do
            write_line(line)
          end
        else
          for i, line in ipairs(lines) do
            write_line(line)
          end
        end
      end
    end

Obviously, this isn't something we want to do in every function that accepts
sequential data. Therefore, we usually settle on one of the first two forms
and thus disallow the other possible representation of sequential data to be
passed to the function.

This extension, however, modifies Lua's ``ipairs`` function in such a way that
it automatically accepts either a table or an iterator function as its
argument. Thus, the first of the three ``write_lines`` functions above will
accept both (table) sequences and (function) iterators.

In addition to the modification of ``ipairs``, it also provides C functions and
macros to iterate over values in the same manner as a generic loop statement
with ``ipairs`` would do.

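To make the idea concrete, here is a minimal plain-Lua sketch of an
``ipairs``-like function that accepts both forms. It is not the extension's
implementation (which is written in C); the names ``seq_ipairs``,
``table_step``, ``func_step`` and ``collect`` are hypothetical and only serve
as illustration. The sketch assumes that a function argument is an iterator
returning one value per call and ``nil`` at the end.

```lua
-- Hypothetical names; this is an illustration, not seqlua's code.

-- Stateless step function for tables. Plain indexing is used, so an
-- __index metamethod would be honored.
local function table_step(t, i)
  i = i + 1
  local v = t[i]
  if v ~= nil then
    return i, v
  end
end

-- Stateless step function for iterator functions. The index is not provided
-- by the iterator, so it is counted here.
local function func_step(f, i)
  local v = f()
  if v ~= nil then
    return i + 1, v
  end
end

-- Returns the (iterator, state, control) triple expected by a generic for.
function seq_ipairs(seq)
  if type(seq) == "function" then
    return func_step, seq, 0
  end
  return table_step, seq, 0
end

-- Copies all values into a new table so the result is easy to inspect.
function collect(seq)
  local out = {}
  for i, v in seq_ipairs(seq) do
    out[i] = v
  end
  return out
end

print(table.concat(collect({"a", "b", "c"}), ","))           -- prints: a,b,c
print(table.concat(collect(("x y z"):gmatch("%a+")), ","))   -- prints: x,y,z
```

Because both step functions keep their state in the loop's control variable,
this sketch does not need to create a closure per loop.
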
Note that function iterators may not be usable where a sequence is iterated
repeatedly or in nested loops, because an iterator function is consumed by the
first pass over it:

    function print_list_twice(seq)
      for i = 1, 2 do
        for i, v in ipairs(seq) do
          print(v)
        end
      end
    end
    print_list_twice(io.stdin:lines())  -- won't work as expected

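One possible workaround, sketched here in plain Lua, is to copy a function
iterator into a table before iterating more than once. The helper name
``to_table`` is hypothetical and not part of this extension, and the approach
only works for finite streams.

```lua
-- Hypothetical helper (not part of seqlua): buffers an iterator in a table.
function to_table(seq)
  if type(seq) ~= "function" then
    return seq            -- tables can already be iterated repeatedly
  end
  local buffer = {}
  for value in seq do     -- a plain generic for drains the iterator once
    buffer[#buffer + 1] = value
  end
  return buffer
end

function print_list_twice(seq)
  local items = to_table(seq)
  for pass = 1, 2 do
    for i, v in ipairs(items) do
      print(v)
    end
  end
end

print_list_twice(("one two"):gmatch("%a+"))
-- prints "one", "two", "one", "two" (each on its own line)
```

The price is memory proportional to the length of the stream, which is why
buffering is left to the caller here rather than done implicitly.
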
Also note that this extension doesn't aim to supersede Lua's concept of
iterator functions. While metamethods (see section "Respected metamethods"
below) may be used to customize iteration behavior on values, this extension
isn't meant to replace the common practice of using function closures as
iterators. Consider the following example:

    local result = sql_query("SELECT * FROM actor ORDER BY birthdate")
    write_lines(result:get_column_entries("name"))

The ``get_column_entries`` method can return a simple function closure that
returns the next entry in the "name" column (returning ``nil`` to indicate the
end). Such a closure can then be passed to another function that iterates
through a sequence of values by invoking ``ipairs`` in a generic for loop
(as previously shown).

Where desired, it is also possible to use metamethods to customize iteration
behavior:

    function print_rows(rows)
      for i, row in ipairs(rows) do
        print_row(row)
      end
    end
    local result = sql_query("SELECT * FROM actor ORDER BY birthdate")
    assert(type(result) == "userdata")

    -- we may rely on the ``__index`` or ``__ipairs`` metamethod to
    -- iterate through all result rows here:
    print_rows(result)  -- no need to use ":rows()" or a similar syntax

    -- but we can also still pass an individual set of result rows to the
    -- print_rows function:
    print_rows{result[1], result[#result]}

This extension, however, doesn't respect the ``__len`` metamethod due to the
following considerations:

* An efficient implementation where ``for i, v in ipairs(tbl) do ... end``
  neither creates a closure nor repeatedly evaluates ``#tbl`` seems to be
  impossible.
* Respecting ``__len`` could be used to implement sparse arrays, but this would
  require iterating functions to expect ``nil`` as a potential value. This may
  lead to problems because ``nil`` is usually also used to indicate the absence
  of a value.

If such behavior is desired, however, it can still be implemented through the
``__ipairs`` metamethod.

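As an illustration, the following sketch shows how a sparse array could expose
``nil`` entries through ``__ipairs``. The field name ``n`` (used as an explicit
length) and the names ``sparse_step``, ``sparse_mt`` and ``sparse`` are
assumptions made for this example only. The loop below calls the metamethod
directly so that the sketch does not depend on which ``ipairs`` implementation
is loaded; with stock Lua 5.2, or with this extension according to the first
return form described under "Respected metamethods", ``ipairs(sparse)`` would
be expected to go through the same metamethod.

```lua
-- Hypothetical sparse array that stores its length in the field "n".
local function sparse_step(t, i)
  i = i + 1
  if i <= t.n then
    return i, t[i]   -- the value may be nil; only a nil index ends the loop
  end
end

sparse_mt = {
  __ipairs = function(self)
    return sparse_step, self, 0
  end
}

sparse = setmetatable({n = 4, [1] = "a", [4] = "d"}, sparse_mt)

for i, v in sparse_mt.__ipairs(sparse) do
  print(i, v)
end
-- prints:
--  1   a
--  2   nil
--  3   nil
--  4   d
```

Any function consuming such a sequence has to be prepared for ``nil`` values,
which is exactly the caveat named in the second consideration above.
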
Unless the user does so explicitly in an ``__ipairs`` metamethod, the
``ipairs`` function as well as the corresponding C functions and macros
provided by this extension never create any closures or other values that need
to be garbage collected.


Lua part of the library
-----------------------

The modified ``ipairs(seq)`` and the new ``string.concat(sep, seq)`` functions
accept either a table or a function as ``seq``. This is demonstrated in the
following examples:

    require "seqlua"

    t = {"a", "b", "c"}

    for i, v in ipairs(t) do
      print(i, v)
    end
    -- prints:
    --  1   a
    --  2   b
    --  3   c

    print(string.concat(",", t))
    -- prints: a,b,c

    function alphabet()
      local letter = nil
      return function()
        if letter == nil then
          letter = "a"
        elseif letter == "z" then
          return nil
        else
          letter = string.char(string.byte(letter) + 1)
        end
        return letter
      end
    end

    for i, v in ipairs(alphabet()) do
      print(i, v)
    end
    -- prints:
    --  1   a
    --  2   b
    --  3   c
    --  ...
    --  25  y
    --  26  z

    print(string.concat(",", alphabet()))
    -- prints: a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z

    function filter(f)
      return function(seq)
        return coroutine.wrap(function()
          for i, v in ipairs(seq) do f(v) end
        end)
      end
    end

    alpha_beta_x = filter(function(v)
      if v == "a" then
        coroutine.yield("alpha")
      elseif v == "b" then
        coroutine.yield("beta")
      elseif type(v) == "number" then
        for i = 1, v do
          coroutine.yield("X")
        end
      end
    end)

    print((","):concat(alpha_beta_x{"a", 3, "b", "c", "d"}))
    -- prints: alpha,X,X,X,beta

    print((","):concat(alpha_beta_x(alphabet())))
    -- prints: alpha,beta

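For readers who want to see the accepted argument forms spelled out, here is a
rough plain-Lua stand-in for ``string.concat(sep, seq)``. It is only a sketch:
the name ``concat_seq`` is hypothetical, metamethods other than ``__index`` are
ignored, and converting each value with ``tostring`` is a choice made for this
example rather than documented behavior of the extension.

```lua
-- Hypothetical stand-in for string.concat(sep, seq); not seqlua's code.
function concat_seq(sep, seq)
  local parts = {}
  if type(seq) == "function" then
    -- function iterator: one value per call, nil marks the end
    for value in seq do
      parts[#parts + 1] = tostring(value)
    end
  else
    -- table-like value: walk indices 1, 2, ... until the first nil
    local i = 1
    local value = seq[i]
    while value ~= nil do
      parts[#parts + 1] = tostring(value)
      i = i + 1
      value = seq[i]
    end
  end
  return table.concat(parts, sep)
end

print(concat_seq(",", {"a", "b", "c"}))          -- prints: a,b,c
print(concat_seq("-", ("x y"):gmatch("%a+")))    -- prints: x-y
```

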
C part of the library
---------------------

In ``seqlualib.h``, the following macros are defined:

    #define seqlua_iterloop(L, iter, idx) \
      for ( \
        seqlua_iterinit((L), (iter), (idx)); \
        seqlua_iternext(iter); \
      )

and

    #define seqlua_iterloopauto(L, iter, idx) \
      for ( \
        seqlua_iterinit((L), (iter), (idx)); \
        seqlua_iternext(iter); \
        lua_pop((L), 1) \
      )

These macros allow iteration over either tables or iterator functions, as the
following example function demonstrates:

    #include <stdio.h>
    #include <lua.h>
    #include <lauxlib.h>
    #include <seqlualib.h>

    // prints all values of the sequence at stack index 1, comma-separated
    int printcsv(lua_State *L) {
      seqlua_Iterator iter;
      seqlua_iterloop(L, &iter, 1) {
        if (seqlua_itercount(&iter) > 1) fputs(",", stdout);
        fputs(luaL_tolstring(L, -1, NULL), stdout);
        // two values need to be popped (the value pushed by
        // seqlua_iternext and the value pushed by luaL_tolstring)
        lua_pop(L, 2);
      }
      fputs("\n", stdout);
      return 0;
    }

    -- from Lua, once printcsv has been registered as a global function:
    printcsv{"a", "b", "c"}
    -- prints: a,b,c

    printcsv(assert(io.open("testfile")):lines())
    -- prints: line1,line2,... of "testfile"

NOTE: During iteration using ``seqlua_iterloop``, ``seqlua_iterloopauto``, or
``seqlua_iterinit``, three extra elements are stored on the stack (in addition
to the value). These extra elements are removed automatically when the loop
ends (i.e. when ``seqlua_iternext`` returns zero). The value pushed onto the
stack for every iteration step has to be removed manually from the stack,
unless ``seqlua_iterloopauto`` is used.


Respected metamethods
---------------------

Regarding the behavior of the Lua functions and the C functions and macros
provided by this extension, an existing ``__index`` metamethod will be
respected automatically. An existing ``__ipairs`` metamethod, however, takes
precedence.

If the ``__ipairs`` field of a value's metatable is set, then it must always
refer to a function. When iteration over a value with such a metamethod
starts, this function is called with ``self`` (i.e. the value itself) passed
as its first argument. The return values of the ``__ipairs`` metamethod may
take one of the following four forms:

* ``return function_or_callable, static_argument, startindex`` causes the three
  values to be returned by ``ipairs`` without further modification. When the
  C macros and functions are used for iteration, the behavior matches that of
  the generic loop statement in Lua:
  ``for i, v in function_or_callable, static_argument, startindex do ... end``
* ``return "raw", table`` will result in iteration over the table ``table``
  using ``lua_rawgeti``
* ``return "index", table_or_userdata`` will result in iteration over the table
  or userdata while respecting any ``__index`` metamethod of the table or
  userdata value
* ``return "call", function_or_callable`` will use the callable value as
  (function) iterator where the function is expected to return a single value
  without any index (the index is inserted automatically when using the
  ``ipairs`` function for iteration)

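The dispatch described by these four forms can be pictured with the following
plain-Lua sketch. It is an approximation for illustration only and not the
extension's actual C implementation; the names ``resolve_ipairs``,
``raw_step``, ``index_step`` and ``call_step`` are hypothetical, and
``rawget`` stands in for ``lua_rawgeti``.

```lua
-- Hypothetical step functions, one per iteration mode.
local function raw_step(t, i)
  i = i + 1
  local v = rawget(t, i)          -- bypasses __index
  if v ~= nil then return i, v end
end

local function index_step(t, i)
  i = i + 1
  local v = t[i]                  -- respects __index
  if v ~= nil then return i, v end
end

local function call_step(f, i)
  local v = f()                   -- f may be a function or a callable value
  if v ~= nil then return i + 1, v end
end

-- Maps a value to the (iterator, state, control) triple of a generic for.
function resolve_ipairs(value)
  local mt = getmetatable(value)
  local handler = mt and mt.__ipairs
  if handler then
    local a, b, c = handler(value)
    if a == "raw" then return raw_step, b, 0 end
    if a == "index" then return index_step, b, 0 end
    if a == "call" then return call_step, b, 0 end
    return a, b, c                -- first form: passed through unchanged
  end
  if type(value) == "function" then
    return call_step, value, 0
  end
  return index_step, value, 0
end

limited = setmetatable({"one", "two", "three"}, {
  __index  = function() return "default" end,
  __ipairs = function(self) return "raw", self end
})
for i, v in resolve_ipairs(limited) do print(i, v) end
-- prints:
--  1   one
--  2   two
--  3   three
```
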
These possibilities are demonstrated by the following example code:

    require "seqlua"

    do
      local function ipairsaux(t, i)
        i = i + 1
        if i <= 3 then
          return i, t[i]
        end
      end
      custom = setmetatable(
        {"one", "two", "three", "four", "five"},
        {
          __ipairs = function(self)
            return ipairsaux, self, 0
          end
        }
      )
    end
    print(string.concat(",", custom))
    -- prints: one,two,three
    -- (note that "four" and "five" are not printed)

    tbl = {"alpha", "beta"}

    proxy1 = setmetatable({}, {__index = tbl})
    for i, v in ipairs(proxy1) do print(i, v) end
    -- prints:
    --  1   alpha
    --  2   beta

    proxy2 = setmetatable({}, {
      __ipairs = function(self)
        return "index", proxy1
      end
    })
    for i, v in ipairs(proxy2) do print(i, v) end
    -- prints:
    --  1   alpha
    --  2   beta
    print(proxy2[1])
    -- prints: nil

    cursor = setmetatable({
      "alice", "bob", "charlie", pos=1
    }, {
      __call = function(self)
        local value = self[self.pos]
        if value == nil then
          self.pos = 1
        else
          self.pos = self.pos + 1
        end
        return value
      end,
      __ipairs = function(self)
        return "call", self
      end
    })
    for i, v in ipairs(cursor) do print(i, v) end
    -- prints:
    --  1   alice
    --  2   bob
    --  3   charlie
    print(cursor())
    -- prints: alice
    for i, v in ipairs(cursor) do print(i, v) end
    -- prints:
    --  1   bob
    --  2   charlie
    -- (note that "alice" has been returned earlier)

    coefficients = setmetatable({1.25, 3.14, 17.5}, {
      __index  = function(self) return 1 end,
      __ipairs = function(self) return "raw", self end
    })
    for i, v in ipairs(coefficients) do print(i, v) end
    -- prints:
    --  1   1.25
    --  2   3.14
    --  3   17.5
    -- (note that iteration terminates even though coefficients[4] == 1)
    print(coefficients[4])
    -- prints: 1
