seqlua
view README @ 47:31a78781a1e0
Replaced "due to the following reasons" with "due to the following considerations" in README
| author | jbe | 
|---|---|
| date | Mon Aug 25 04:11:58 2014 +0200 (2014-08-25) | 
| parents | 8889bb4c9b24 | 
| children | facf29831f6f | 

seqlua: Extension for handling sequential data in Lua
=====================================================

This is an experimental package to extend Lua in the following manner:

* allow ``ipairs(seq)`` to accept either tables or functions (i.e. function
  iterators) as an argument,
* add a new function ``string.concat(separator, seq)`` that concatenates either
  table entries or function return values,
* provide auxiliary C functions and macros to simplify iterating over both
  tables and iterator functions with a generic statement.

Existing ``__ipairs`` or ``__index`` (but not ``__len``) metamethods are
respected by both the Lua functions and the C functions and macros. The
``__ipairs`` metamethod takes precedence over ``__index``, while the
``__len`` metamethod is never used.

Metamethod handling is explained in detail in the last section
("Respected metamethods") at the bottom of this README.



Motivation
----------

Sequential data (such as arrays or streams) is often represented in two
different ways:

* as an ordered set of values (usually implemented as an array in other
  programming languages, or as a sequence in Lua: a table with numeric keys
  {1..n}, each associated with a value),
* as some sort of data stream (sometimes implemented as a class of objects
  providing certain methods, or as an iterator function in Lua: a function that
  returns the next value with every call, where nil indicates the end of the
  stream).

Quite often, when functions work on sequential data, it shouldn't matter in
which form the sequential data is provided to the function. As an
example, consider a function that writes a sequence of strings to a file.
Such a function could be fed either an array of strings (a table with
numeric keys in Lua) or a (possibly infinite) stream of data (an iterator
function in Lua).

A function in Lua that accepts a table might look as follows:

    function write_lines(lines)
      for i, line in ipairs(lines) do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end

In contrast, a function in Lua that accepts an iterator function would have to
be implemented differently:

    function write_lines(get_next_line)
      for line in get_next_line do
        io.stdout:write(line)
        io.stdout:write("\n")
      end
    end

If one wanted to create a function that accepts either a sequence in the form
of a table or an iterator function, then one might need to write:

    do
      local function write_line(line)
        io.stdout:write(line)
        io.stdout:write("\n")
      end
      function write_lines(lines)
        if type(lines) == "function" then
          for line in lines do
            write_line(line)
          end
        else
          for i, line in ipairs(lines) do
            write_line(line)
          end
        end
      end
    end

Obviously, this isn't something we want to do in every function that accepts
sequential data. Therefore, we usually settle on one of the first two forms
and thereby prevent the other representation of sequential data from being
passed to the function.

This extension, however, modifies Lua's ``ipairs`` function in such a way that
it automatically accepts either a table or an iterator function as its
argument. Thus, the first of the three ``write_lines`` functions above will
accept both (table) sequences and (function) iterators.
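
The dispatch that the modified ``ipairs`` performs can be approximated in plain
Lua. The following is a minimal sketch only, not the extension's
implementation: ``seq_ipairs`` and ``countdown`` are hypothetical names, and
unlike the extension as described in this README, the sketch allocates a
closure for every function iterator.

```lua
-- Hypothetical pure-Lua approximation; seqlua itself is not required here.
function seq_ipairs(seq)
  if type(seq) == "function" then
    local i = 0
    -- Pair every value returned by the iterator with a running index.
    return function()
      local v = seq()
      if v ~= nil then
        i = i + 1
        return i, v
      end
    end
  end
  -- Anything else is handed to the standard ipairs unchanged.
  return ipairs(seq)
end

-- Hypothetical iterator function: returns n, n-1, ..., 1 and then nil.
function countdown(n)
  return function()
    if n > 0 then
      n = n - 1
      return n + 1
    end
  end
end

-- The same loop shape works for both representations.
for i, v in seq_ipairs({"a", "b"}) do print(i, v) end
for i, v in seq_ipairs(countdown(3)) do print(i, v) end
```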

In addition to modifying ``ipairs``, the extension also provides C functions
and macros to iterate over values in the same manner as a generic loop
statement with ``ipairs`` would.

Note that this extension doesn't aim to supersede Lua's concept of iterator
functions. While metamethods (see section "Respected metamethods" below) may be
used to customize iteration behavior on values, this extension isn't meant to
replace the common practice of using function closures as iterators. Consider
the following example:

    local result = sql_query("SELECT * FROM actor ORDER BY birthdate")
    write_lines(result:get_column_entries("name"))

The ``get_column_entries`` method can return a simple function closure that
returns the next entry in the "name" column (returning ``nil`` to indicate the
end). Such a closure can then be passed to another function that iterates
through a sequence of values by invoking ``ipairs`` with the generic for loop
(as previously shown).
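
One way such a method could be written is sketched below. The result set is
modeled as a plain table of row tables, which is an assumption made purely for
illustration; ``column_entries`` is a hypothetical stand-in for
``get_column_entries``, the sample rows are made up, and no real database API
is implied.

```lua
-- Hypothetical stand-in for result:get_column_entries(name).
-- "rows" is assumed to be a plain table of row tables.
function column_entries(rows, name)
  local i = 0
  return function()
    i = i + 1
    local row = rows[i]
    if row ~= nil then
      return row[name]  -- a nil entry also ends the stream
    end
  end
end

-- Made-up sample rows, used only to exercise the closure.
actors = {
  {name = "Alice", birthdate = "1970-01-01"},
  {name = "Bob", birthdate = "1980-06-15"},
}

-- The closure also works with the plain generic for loop.
for name in column_entries(actors, "name") do
  io.stdout:write(name, "\n")
end
```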

Where desired, it is also possible to use metamethods to customize iteration
behavior:

    function print_rows(rows)
      for i, row in ipairs(rows) do
        print_row(row)
      end
    end
    local result = sql_query("SELECT * FROM actor ORDER BY birthday")
    assert(type(result) == "userdata")

    -- we may rely on the ``__index`` or ``__ipairs`` metamethod to
    -- iterate through all result rows here:
    print_rows(result)  -- no need to use ":rows()" or a similar syntax

    -- but we can also still pass an individual set of result rows to the
    -- print_rows function:
    print_rows{result[1], result[#result]}

This extension, however, doesn't respect the ``__len`` metamethod due to the
following considerations:

* An efficient implementation where ``for i, v in ipairs(tbl) do ... end``
  neither creates a closure nor repeatedly evaluates ``#tbl`` seems to be
  impossible.
* Respecting ``__len`` could be used to implement sparse arrays, but this would
  require iterating functions to expect ``nil`` as a potential value. This may
  lead to problems because ``nil`` is usually also used to indicate the absence
  of a value.

If such behavior is desired, however, it can still be implemented through the
``__ipairs`` metamethod.
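
A sparse sequence of that kind could be sketched as follows. The length field
``n`` and the names ``sparse_next`` and ``sparse`` are assumptions made for
illustration. To keep the sketch runnable without the extension, the loop
calls the ``__ipairs`` metamethod directly; with the extension loaded,
``ipairs(sparse)`` is expected to yield the same triplet.

```lua
-- Steps through indices 1..t.n, returning nil values for the holes.
function sparse_next(t, i)
  i = i + 1
  if i <= t.n then
    return i, t[i]
  end
end

-- "n" is a hypothetical field holding the logical length.
sparse = setmetatable({n = 4, [1] = "a", [4] = "d"}, {
  __ipairs = function(self)
    return sparse_next, self, 0
  end
})

-- Callers have to expect nil as a value at indices 2 and 3.
for i, v in getmetatable(sparse).__ipairs(sparse) do
  print(i, v)
end
```

The loop keeps running across the holes because the generic for statement
stops only when the first returned value (the index) is ``nil``.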

Unless manually done by the user in the ``__ipairs`` metamethod, this extension
never creates any closures or other values that need to be garbage collected.



Lua part of the library
-----------------------

The modified ``ipairs(seq)`` and the new ``string.concat(sep, seq)`` functions
accept either a table or a function as ``seq``. This is demonstrated in the
following examples:

    require "seqlua"

    t = {"a", "b", "c"}

    for i, v in ipairs(t) do
      print(i, v)
    end
    -- prints:
    --  1   a
    --  2   b
    --  3   c

    print(string.concat(",", t))
    -- prints: a,b,c

    function alphabet()
      local letter = nil
      return function()
        if letter == nil then
          letter = "a"
        elseif letter == "z" then
          return nil
        else
          letter = string.char(string.byte(letter) + 1)
        end
        return letter
      end
    end

    for i, v in ipairs(alphabet()) do
      print(i, v)
    end
    -- prints:
    --  1   a
    --  2   b
    --  3   c
    --  ...
    --  25  y
    --  26  z

    print(string.concat(",", alphabet()))
    -- prints: a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z

    function filter(f)
      return function(seq)
        return coroutine.wrap(function()
          for i, v in ipairs(seq) do f(v) end
        end)
      end
    end

    alpha_beta_x = filter(function(v)
      if v == "a" then
        coroutine.yield("alpha")
      elseif v == "b" then
        coroutine.yield("beta")
      elseif type(v) == "number" then
        for i = 1, v do
          coroutine.yield("X")
        end
      end
    end)

    print((","):concat(alpha_beta_x{"a", 3, "b", "c", "d"}))
    -- prints: alpha,X,X,X,beta

    print((","):concat(alpha_beta_x(alphabet())))
    -- prints: alpha,beta
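
For readers who want to experiment without the extension, the behavior of
``string.concat`` shown above can be approximated in plain Lua. This is a
sketch under assumptions: ``concat_seq`` and ``upto`` are hypothetical names,
and converting every value with ``tostring`` is a guess about how non-string
values are treated.

```lua
-- Hypothetical pure-Lua approximation of string.concat(sep, seq).
function concat_seq(sep, seq)
  local parts = {}
  if type(seq) == "function" then
    -- Function iterator: call it until it returns nil.
    for v in seq do
      parts[#parts + 1] = tostring(v)
    end
  else
    -- Table sequence: use the standard ipairs.
    for _, v in ipairs(seq) do
      parts[#parts + 1] = tostring(v)
    end
  end
  return table.concat(parts, sep)
end

-- Hypothetical iterator function: returns 1, 2, ..., n and then nil.
function upto(n)
  local i = 0
  return function()
    if i < n then
      i = i + 1
      return i
    end
  end
end

print(concat_seq(",", {"a", "b", "c"}))  -- prints: a,b,c
print(concat_seq("-", upto(3)))          -- prints: 1-2-3
```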



C part of the library
---------------------

In ``seqlualib.h``, the following macros are defined:

    #define seqlua_iterloop(L, iter, idx) \
      for ( \
        seqlua_iterinit((L), (iter), (idx)); \
        seqlua_iternext(iter); \
      )

and

    #define seqlua_iterloopauto(L, iter, idx) \
      for ( \
        seqlua_iterinit((L), (iter), (idx)); \
        seqlua_iternext(iter); \
        lua_pop((L), 1) \
      )

These macros allow iteration over either tables or iterator functions, as the
following example function demonstrates:

    int printcsv(lua_State *L) {
      seqlua_Iterator iter;
      seqlua_iterloop(L, &iter, 1) {
        if (seqlua_itercount(&iter) > 1) fputs(",", stdout);
        fputs(luaL_tolstring(L, -1, NULL), stdout);
        // two values need to be popped (the value pushed by
        // seqlua_iternext and the value pushed by luaL_tolstring)
        lua_pop(L, 2);
      }
      fputs("\n", stdout);
      return 0;
    }

    printcsv{"a", "b", "c"}
    -- prints: a,b,c

    printcsv(assert(io.open("testfile")):lines())
    -- prints: line1,line2,... of "testfile"

NOTE: During iteration using ``seqlua_iterloop``, ``seqlua_iterloopauto``, or
``seqlua_iterinit``, three extra elements are stored on the stack (in addition
to the value). These extra elements are removed automatically when the loop
ends (i.e. when ``seqlua_iternext`` returns zero). The value pushed onto the
stack for every iteration step has to be removed manually from the stack,
unless ``seqlua_iterloopauto`` is used.



Respected metamethods
---------------------

Regarding the behavior of the Lua functions and the C functions and macros
provided by this extension, an existing ``__index`` metamethod will be
respected automatically. An existing ``__ipairs`` metamethod, however, takes
precedence.

If the ``__ipairs`` field of a value's metatable is set, then it must always
refer to a function. When iteration starts over a value with such a
metamethod set, this function is called with ``self`` (i.e. the value itself)
passed as the first argument. The return values of the ``__ipairs``
metamethod may take one of the following four forms:

* ``return function_or_callable, static_argument, startindex`` causes the three
  arguments to be returned by ``ipairs`` without further modification. When
  using the C macros and functions for iteration, the behavior corresponds to
  the generic loop statement in Lua:
  ``for i, v in function_or_callable, static_argument, startindex do ... end``
* ``return "raw", table`` will result in iteration over the table ``table``
  using ``lua_rawgeti``.
* ``return "index", table_or_userdata`` will result in iteration over the table
  or userdata while respecting any ``__index`` metamethod of the table or
  userdata value.
* ``return "call", function_or_callable`` will use the callable value as a
  (function) iterator, where the function is expected to return a single value
  without any index (the index is inserted automatically when using the
  ``ipairs`` function for iteration).
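
The four forms can be read as a small dispatch table. The following pure-Lua
sketch mirrors the list above but is not the extension's code: all names
(``sketch_ipairs`` and the ``*_next`` helpers) are hypothetical, and error
checking, such as rejecting a non-function ``__ipairs`` field, is omitted.

```lua
-- Index-based stepping that honours __index (plain t[i] access).
local function index_next(t, i)
  i = i + 1
  local v = t[i]
  if v ~= nil then return i, v end
end

-- Index-based stepping that bypasses metamethods.
local function raw_next(t, i)
  i = i + 1
  local v = rawget(t, i)
  if v ~= nil then return i, v end
end

-- Stepping for a callable that returns bare values; the index is added here.
local function call_next(f, i)
  local v = f()
  if v ~= nil then return i + 1, v end
end

-- Hypothetical approximation of how ipairs might pick an iteration mode.
function sketch_ipairs(value)
  local mt = getmetatable(value)
  local handler = type(mt) == "table" and mt.__ipairs
  if handler then
    local a, b, c = handler(value)
    if a == "raw" then return raw_next, b, 0 end
    if a == "index" then return index_next, b, 0 end
    if a == "call" then return call_next, b, 0 end
    return a, b, c  -- first form: pass the triplet through unchanged
  end
  if type(value) == "function" then return call_next, value, 0 end
  return index_next, value, 0
end

-- Usage: a proxy that redirects iteration to another table in "raw" mode.
proxy = setmetatable({}, {
  __ipairs = function(self) return "raw", {"x", "y"} end
})
for i, v in sketch_ipairs(proxy) do print(i, v) end
```

Because the stepping functions receive their state through the generic for
statement's arguments, this sketch needs no per-loop closure.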

These possibilities are demonstrated by the following example code:

    require "seqlua"

    do
      local function ipairsaux(t, i)
        i = i + 1
        if i <= 3 then
          return i, t[i]
        end
      end
      custom = setmetatable(
        {"one", "two", "three", "four", "five"},
        {
          __ipairs = function(self)
            return ipairsaux, self, 0
          end
        }
      )
    end
    print(string.concat(",", custom))
    -- prints: one,two,three
    -- (note that "four" and "five" are not printed)

    tbl = {"alpha", "beta"}

    proxy1 = setmetatable({}, {__index = tbl})
    for i, v in ipairs(proxy1) do print(i, v) end
    -- prints:
    --  1   alpha
    --  2   beta

    proxy2 = setmetatable({}, {
      __ipairs = function(self)
        return "index", proxy1
      end
    })
    for i, v in ipairs(proxy2) do print(i, v) end
    -- prints:
    --  1   alpha
    --  2   beta
    print(proxy2[1])
    -- prints: nil

    cursor = setmetatable({
      "alice", "bob", "charlie", pos=1
    }, {
      __call = function(self)
        local value = self[self.pos]
        if value == nil then
          self.pos = 1
        else
          self.pos = self.pos + 1
        end
        return value
      end,
      __ipairs = function(self)
        return "call", self
      end
    })
    for i, v in ipairs(cursor) do print(i, v) end
    -- prints:
    --  1   alice
    --  2   bob
    --  3   charlie
    print(cursor())
    -- prints: alice
    for i, v in ipairs(cursor) do print(i, v) end
    -- prints:
    --  1   bob
    --  2   charlie
    -- (note that "alice" has been returned earlier)

    coefficients = setmetatable({1.25, 3.14, 17.5}, {
      __index  = function(self) return 1 end,
      __ipairs = function(self) return "raw", self end
    })
    for i, v in ipairs(coefficients) do print(i, v) end
    -- prints:
    --  1   1.25
    --  2   3.14
    --  3   17.5
    -- (note that iteration terminates even if coefficients[4] == 1)
    print(coefficients[4])
    -- prints: 1
