seqlua
view README @ 52:3362ec36cb09
Do not automatically assume that functions passed to ipairs are iterators
but require ipairs(func, mode) to have an explicit mode set to "call" or "generator"
| author | jbe | 
|---|---|
| date | Tue Aug 26 21:10:03 2014 +0200 (2014-08-26) | 
| parents | 06c5f2f9ec41 | 
| children | 664736a8fcbf | 
seqlua: Extension for handling sequential data in Lua
=====================================================

This package is an experimental extension for the Lua programming language
(version 5.2) which:

* makes ``ipairs(tbl)`` respect both metamethods ``__index`` and ``__ipairs``
  (where ``__ipairs`` has precedence over ``__index``),
* allows ``ipairs(seq, "call")`` to accept either tables or functions as first
  argument where a function is used as iterator,
* allows ``ipairs(seq, "generator")`` to accept either tables or functions as
  first argument where a function is used as generator for an iterator,
* adds a new function ``string.concat(separator, seq)`` that concatenates
  either table entries or function return values,
* provides auxiliary C functions and macros to simplify iterating over both
  tables and iterator functions with a generic statement.

Existing ``__ipairs`` or ``__index`` (but not ``__len``) metamethods are
respected by both the Lua functions and the C functions and macros. The
``__ipairs`` metamethod takes precedence over ``__index``, while the
``__len`` metamethod is never used.

Metamethod handling is explained in detail in the last section
("Respected metamethods") at the bottom of this README.

In Lua, this extension is loaded by ``require "seqlua"``. In order to use the
auxiliary C functions and macros, add ``#include <seqlualib.h>`` to your C file
and ensure that the functions implemented in ``seqlualib.c`` are statically or
dynamically linked with your C Lua library.


Motivation
----------

Sequential data (such as arrays or streams) is often represented in two
different ways:

* as an ordered set of values (usually implemented as an array in other
  programming languages, or as a sequence in Lua: a table with numeric keys
  {1..n} associated with a value each),
* as some sort of data stream (sometimes implemented as a class of objects
  providing certain methods, or as an iterator function in Lua: a function that
  returns the next value with every call, where nil indicates the end of the
  stream).

Quite often, when functions work on sequential data, it shouldn't matter in
which form the sequential data is being provided to the function. As an
example, consider a function that writes a sequence of strings to a file.
Such a function could either be fed with an array of strings (a table with
numeric keys in Lua) or with a (possibly infinite) stream of data (an iterator
function in Lua).

A function in Lua that accepts a table might look as follows:

    function write_lines(lines)
      for i, line in ipairs(lines) do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end

In contrast, a function in Lua that accepts an iterator function would have to
be implemented differently:

    function write_lines(get_next_line)
      for line in get_next_line do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end

If one wanted to create a function that accepts either a sequence in form of a
table or an iterator function, then one might need to write:

    do
      local function write_line(line)
        io.stdout:write(line)
        io.stdout:write("\n")
      end
      function write_lines(lines)
        if type(lines) == "function" then
          for line in lines do
            write_line(line)
          end
        else
          for i, line in ipairs(lines) do
            write_line(line)
          end
        end
      end
    end

Obviously, this isn't something we want to do in every function that accepts
sequential data. Therefore, we usually settle on one of the first two forms
and thus disallow the other possible representation of sequential data to be
passed to the function.

This extension, however, modifies Lua's ``ipairs`` function in such a way that,
given an explicit mode such as ``"call"``, it accepts either a table or an
iterator function as first argument. Thus, the function below will accept both
(table) sequences and (function) iterators:

    function write_lines(lines)
      for i, line in ipairs(lines, "call") do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end
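
As a rough illustration of the two modes, the following plain-Lua sketch pairs
each value returned by a function with a running index. It is not the
extension's implementation (which is written in C): the names ``seq_ipairs``,
``call_aux`` and ``counter`` are hypothetical, and rejecting a function that is
passed without a mode is an assumption of this sketch. The function is handed
to the loop as its static state, so no closure is created per loop.

```lua
-- Plain-Lua approximation of ipairs(seq, mode); NOT the seqlua implementation.
-- seq_ipairs, call_aux and counter are hypothetical names for illustration.

-- Stateless helper: f is the iterator function, i the previous index.
local function call_aux(f, i)
  local v = f()
  if v ~= nil then
    return i + 1, v
  end
  -- returning nothing ends the generic for loop
end

function seq_ipairs(seq, mode)
  if type(seq) == "function" then
    if mode == "call" then
      return call_aux, seq, 0      -- seq itself returns the next value
    elseif mode == "generator" then
      return call_aux, seq(), 0    -- seq() returns the iterator function
    end
    -- Assumption of this sketch: a function without a mode is rejected.
    error('mode must be "call" or "generator" for functions', 2)
  end
  return ipairs(seq)               -- tables: standard behavior
end

-- Usage: count from 1 to n, once as iterator and once via a generator.
function counter(n)
  local i = 0
  return function()
    if i < n then
      i = i + 1
      return i
    end
  end
end

for i, v in seq_ipairs(counter(3), "call") do print(i, v) end
for i, v in seq_ipairs(function() return counter(2) end, "generator") do
  print(i, v)
end
```

The only difference between the two modes in this sketch is whether the
function is used directly or called once to obtain the iterator.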

In addition to the modification of ``ipairs``, it also provides C functions and
macros to iterate over values in the same manner as a generic loop statement
with ``ipairs`` would do.

Note that this extension doesn't aim to supersede Lua's concept of iterator
functions. While metamethods (see section "Respected metamethods" below) may be
used to customize iteration behavior on values, this extension isn't intended
to replace the common practice of using function closures as iterators.
Consider the following example:

    local result = sql_query("SELECT * FROM actor ORDER BY birthdate")
    write_lines(result:get_column_entries("name"))

The ``get_column_entries`` method can return a simple function closure that
returns the next entry in the "name" column (returning ``nil`` to indicate the
end). Such a closure can then be passed to another function that iterates
through a sequence of values by invoking ``ipairs`` in a generic for loop
(as previously shown).

Where desired, it is also possible to use metamethods to customize iteration
behavior:

    function print_rows(rows)
      for i, row in ipairs(rows) do
        print_row(row)
      end
    end
    local result = sql_query("SELECT * FROM actor ORDER BY birthday")
    assert(type(result) == "userdata")

    -- we may rely on the ``__index`` or ``__ipairs`` metamethod to
    -- iterate through all result rows here:
    print_rows(result)  -- no need to use ":rows()" or a similar syntax

    -- but we can also still pass an individual set of result rows to the
    -- print_rows function:
    print_rows{result[1], result[#result]}

This extension, however, doesn't respect the ``__len`` metamethod due to the
following considerations:

* An efficient implementation in which ``for i, v in ipairs(tbl) do ... end``
  neither creates a closure nor repeatedly evaluates ``#tbl`` seems to be
  impossible.
* Respecting ``__len`` could be used to implement sparse arrays, but this would
  require iterating functions to expect ``nil`` as a potential value. This may
  lead to problems because ``nil`` is usually also used to indicate the absence
  of a value.

If such behavior is desired, however, it can still be implemented through the
``__ipairs`` metamethod.

Unless manually done by the user in the ``__ipairs`` metamethod, the ``ipairs``
function as well as the corresponding C functions and macros provided by this
extension never create any closures or other values that need to be garbage
collected.
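
For illustration, a sparse array with an explicit length could supply such an
``__ipairs`` metamethod as sketched below. The type, the field name ``n`` and
the names ``sparse_aux``, ``sparse_mt`` and ``new_sparse`` are hypothetical.
The metamethod is invoked directly so that the sketch also runs without the
extension loaded. Note that the loop body has to cope with ``nil`` values,
which is the drawback described in the list above.

```lua
-- Hypothetical sparse-array type; the length is kept in the field "n"
-- (an assumption of this sketch, not something seqlua prescribes).

-- Stateless iteration helper: t is the array, i the previous index.
local function sparse_aux(t, i)
  i = i + 1
  if i <= t.n then
    return i, t[i]   -- t[i] may be nil for a hole
  end
end

sparse_mt = {
  __len = function(self) return self.n end,
  -- function form: iterator function, static argument, start index
  __ipairs = function(self) return sparse_aux, self, 0 end,
}

function new_sparse(n, values)
  values.n = n
  return setmetatable(values, sparse_mt)
end

local s = new_sparse(4, {[1] = "a", [4] = "d"})
for i, v in sparse_mt.__ipairs(s) do
  print(i, v)
end
-- prints:
--  1   a
--  2   nil
--  3   nil
--  4   d
```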


Lua part of the library
-----------------------

The modified ``ipairs(seq, mode)`` and the new ``string.concat(sep, seq)``
functions accept either a table or a function as ``seq``. This is demonstrated
in the following examples:

    require "seqlua"

    t = {"a", "b", "c"}

    for i, v in ipairs(t, "call") do
      print(i, v)
    end
    -- prints:
    --  1   a
    --  2   b
    --  3   c

    print(string.concat(",", t))
    -- prints: a,b,c

    function alphabet()
      local letter = nil
      return function()
        if letter == nil then
          letter = "a"
        elseif letter == "z" then
          return nil
        else
          letter = string.char(string.byte(letter) + 1)
        end
        return letter
      end
    end

    for i, v in ipairs(alphabet(), "call") do
      print(i, v)
    end
    -- prints:
    --  1   a
    --  2   b
    --  3   c
    --  ...
    --  25  y
    --  26  z

    for i, v in ipairs(alphabet, "generator") do
      print(i, v)
    end
    -- prints:
    --  1   a
    --  2   b
    --  3   c
    --  ...
    --  25  y
    --  26  z

    print(string.concat(",", alphabet()))
    -- prints: a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z

    function filter(f)
      return function(seq)
        return coroutine.wrap(function()
          for i, v in ipairs(seq, "call") do f(v) end
        end)
      end
    end

    alpha_beta_x = filter(function(v)
      if v == "a" then
        coroutine.yield("alpha")
      elseif v == "b" then
        coroutine.yield("beta")
      elseif type(v) == "number" then
        for i = 1, v do
          coroutine.yield("X")
        end
      end
    end)

    print((","):concat(alpha_beta_x{"a", 3, "b", "c", "d"}))
    -- prints: alpha,X,X,X,beta

    print((","):concat(alpha_beta_x(alphabet())))
    -- prints: alpha,beta
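
The behavior of ``string.concat`` on both kinds of input can be approximated in
plain Lua as follows. ``concat_seq`` is a hypothetical name, and converting
each value with ``tostring`` is an assumption of this sketch; the actual
function is part of the extension and may treat non-string values differently.

```lua
-- Plain-Lua approximation of string.concat(sep, seq); NOT the seqlua code.
-- concat_seq is a hypothetical name used for illustration only.
function concat_seq(sep, seq)
  local parts = {}
  if type(seq) == "function" then
    -- function iterator: call it until it returns nil
    for v in seq do
      parts[#parts + 1] = tostring(v)
    end
  else
    -- table sequence: walk the entries 1..n
    for _, v in ipairs(seq) do
      parts[#parts + 1] = tostring(v)
    end
  end
  return table.concat(parts, sep)
end

print(concat_seq(",", {"a", "b", "c"}))
-- prints: a,b,c

local n = 0
print(concat_seq("-", function()
  n = n + 1
  if n <= 3 then return n end
end))
-- prints: 1-2-3
```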


C part of the library
---------------------

In ``seqlualib.h``, the following macros are defined:

    #define seqlua_iterloop(L, iter, idx) \
      for ( \
        seqlua_iterinit((L), (iter), (idx)); \
        seqlua_iternext(iter); \
      )

and

    #define seqlua_iterloopauto(L, iter, idx) \
      for ( \
        seqlua_iterinit((L), (iter), (idx)); \
        seqlua_iternext(iter); \
        lua_pop((L), 1) \
      )

These macros allow iteration over either tables or iterator functions as the
following example function demonstrates:

    int printcsv(lua_State *L) {
      seqlua_Iterator iter;
      seqlua_iterloop(L, &iter, 1) {
        if (seqlua_itercount(&iter) > 1) fputs(",", stdout);
        fputs(luaL_tolstring(L, -1, NULL), stdout);
        // two values need to be popped (the value pushed by
        // seqlua_iternext and the value pushed by luaL_tolstring)
        lua_pop(L, 2);
      }
      fputs("\n", stdout);
      return 0;
    }

    printcsv{"a", "b", "c"}
    -- prints: a,b,c

    printcsv(assert(io.open("testfile")):lines())
    -- prints: line1,line2,... of "testfile"

NOTE: During iteration using ``seqlua_iterloop``, ``seqlua_iterloopauto``, or
``seqlua_iterinit``, three extra elements are stored on the stack (in addition
to the value). These extra elements are removed automatically when the loop ends
(i.e. when ``seqlua_iternext`` returns zero). The value pushed onto the stack
for every iteration step has to be removed manually from the stack, unless
``seqlua_iterloopauto`` is used.


Respected metamethods
---------------------

The Lua functions and the C functions and macros provided by this extension
automatically respect an existing ``__index`` metamethod. An existing
``__ipairs`` metamethod, however, takes precedence.

If the ``__ipairs`` field of a value's metatable is set, then it must always
refer to a function. When iteration over such a value starts, this function is
called with ``self`` (i.e. the value itself) passed as first argument. The
return values of the ``__ipairs`` metamethod may take one of the following
four forms:

* ``return function_or_callable, static_argument, startindex`` causes the three
  values to be returned by ``ipairs`` without further modification. When the C
  macros and functions are used for iteration, the behavior matches the
  generic loop statement in Lua:
  ``for i, v in function_or_callable, static_argument, startindex do ... end``
* ``return "raw", table`` will result in iteration over the table ``table``
  using ``lua_rawgeti``
* ``return "index", table_or_userdata`` will result in iteration over the table
  or userdata while respecting any ``__index`` metamethod of the table or
  userdata value
* ``return "call", function_or_callable`` will use the callable value as
  (function) iterator where the function is expected to return a single value
  without any index (the index is inserted automatically when using the
  ``ipairs`` function for iteration)

These possibilities are demonstrated by the following example code:

    require "seqlua"

    do
      local function ipairsaux(t, i)
        i = i + 1
        if i <= 3 then
          return i, t[i]
        end
      end
      custom = setmetatable(
        {"one", "two", "three", "four", "five"},
        {
          __ipairs = function(self)
            return ipairsaux, self, 0
          end
        }
      )
    end
    print(string.concat(",", custom))
    -- prints: one,two,three
    -- (note that "four" and "five" are not printed)

    tbl = {"alpha", "beta"}

    proxy1 = setmetatable({}, {__index = tbl})
    for i, v in ipairs(proxy1) do print(i, v) end
    -- prints:
    --  1   alpha
    --  2   beta

    proxy2 = setmetatable({}, {
      __ipairs = function(self)
        return "index", proxy1
      end
    })
    for i, v in ipairs(proxy2) do print(i, v) end
    -- prints:
    --  1   alpha
    --  2   beta
    print(proxy2[1])
    -- prints: nil

    cursor = setmetatable({
      "alice", "bob", "charlie", pos=1
    }, {
      __call = function(self)
        local value = self[self.pos]
        if value == nil then
          self.pos = 1
        else
          self.pos = self.pos + 1
        end
        return value
      end,
      __ipairs = function(self)
        return "call", self
      end
    })
    for i, v in ipairs(cursor) do print(i, v) end
    -- prints:
    --  1   alice
    --  2   bob
    --  3   charlie
    print(cursor())
    -- prints: alice
    for i, v in ipairs(cursor) do print(i, v) end
    -- prints:
    --  1   bob
    --  2   charlie
    -- (note that "alice" has been returned earlier)

    coefficients = setmetatable({1.25, 3.14, 17.5}, {
      __index  = function(self) return 1 end,
      __ipairs = function(self) return "raw", self end
    })
    for i, v in ipairs(coefficients) do print(i, v) end
    -- prints:
    --  1   1.25
    --  2   3.14
    --  3   17.5
    -- (note that iteration terminates even if coefficients[4] == 1)
    print(coefficients[4])
    -- prints: 1
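
The four return forms described in this section can be modelled in plain Lua
with a small dispatcher. This is only a sketch of the documented behavior, not
the extension's C implementation: ``resolve_ipairs`` and the three helper
functions are hypothetical names, and falling back to ``__index``-respecting
iteration when no ``__ipairs`` metamethod exists is taken from the description
above.

```lua
-- Plain-Lua model of how the __ipairs return forms could be interpreted.
-- All names here are hypothetical; this is NOT the seqlua implementation.

local function raw_aux(t, i)      -- "raw": bypass metamethods
  i = i + 1
  local v = rawget(t, i)
  if v ~= nil then return i, v end
end

local function index_aux(t, i)    -- "index": respect __index
  i = i + 1
  local v = t[i]
  if v ~= nil then return i, v end
end

local function call_aux(f, i)     -- "call": f returns values without index
  local v = f()
  if v ~= nil then return i + 1, v end
end

function resolve_ipairs(value)
  local mt = getmetatable(value)
  local handler = mt and mt.__ipairs
  if handler == nil then
    return index_aux, value, 0    -- no __ipairs: __index is respected
  end
  local a, b, c = handler(value)
  if a == "raw" then
    return raw_aux, b, 0
  elseif a == "index" then
    return index_aux, b, 0
  elseif a == "call" then
    return call_aux, b, 0
  end
  return a, b, c                  -- function form: passed through unchanged
end

-- Usage with a table whose __index would otherwise never yield nil:
local coefficients = setmetatable({1.25, 3.14, 17.5}, {
  __index  = function(self) return 1 end,
  __ipairs = function(self) return "raw", self end
})
for i, v in resolve_ipairs(coefficients) do print(i, v) end
```

Because every helper receives its state through the loop's static argument and
control variable, this model also avoids creating a closure per loop.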
