TIP 232: Creating New Math Functions for the 'expr' Command

Author:         Arjen Markus <[email protected]>
Author:         Kevin Kenny <[email protected]>
State:          Final
Type:           Project
Vote:           Done
Created:        26-Nov-2004
Post-History:   
Keywords:       math,expr,Tcl
Tcl-Version:    8.5

Abstract

This TIP proposes to replace Tcl's math functions, ordinarily created using the Tcl_CreateMathFunc() function, with ordinary Tcl commands created in a known namespace, ::tcl::mathfunc. This change has two chief motivations: it allows programmers to define new math functions at the Tcl level, and it encapsulates the Tcl_Value API so that new math functions can work on data types that are not described adequately as Tcl_Value objects.

Rationale

The two authors of this TIP, Kevin Kenny and Arjen Markus, arrived at the same desired change to Tcl from two distinct directions.

Arjen Markus has been the maintainer of several modules in Tcllib that implement "special functions" such as the exponential integral, the incomplete gamma function and the Bessel functions. He has wanted to have "scripted" math functions available to simplify his notation. In place of a complex expression like

  set f [expr { $x*[J0 $x] + [J1 [expr {2.0*$x}]] }]

which is pretty unreadable (and prone to error because of the need to brace the subexpressions), he would like a mechanism for defining new mathematical functions so that he can write:

  set f [expr { $x*J0($x) + J1(2.0*$x) }]

Donal Fellows has felt the need for such a construct, too, enough to implement a "funcproc" extension that provides it. http://wiki.tcl.tk/541

Kevin Kenny has come at the desire for rework of the math function mechanism from a different direction. He has been investigating Tcl Core support for arbitrary-precision integers [237], and discovered that implementing math functions that accept them as arguments or return them as results would require an incompatible change to the Tcl_Value data type that is used to communicate with the math functions. When the last such change was made (in implementing 64-bit integers [72]), it required a source code change to several popular extensions that created new math functions min and max. Wishing the incompatibility introduced by [237] to be the last of the type, he decided to eliminate Tcl_Value and use the open-ended Tcl_Obj data type in its place. When he derived the type signature that a math function would have if it accepted its parameters in an array of Tcl_Obj pointers, he discovered that it was identical to the Tcl_ObjCmdProc -- making him ask, "what if math functions really were commands?"

This TIP is the result of those two investigations, and proposes a unification of math functions with Tcl commands.

Proposed Changes

This TIP proposes that:

  1. The [expr] command shall be modified so that an expression of the form:

       f(arg1,arg2,...)
    

    shall generate code equivalent to that generated by

       [tcl::mathfunc::f [expr { arg1 }] [expr { arg2 }] ...]
    

    so that math functions are interpreted as Tcl commands whose arguments are parsed as subexpressions. The existing code in [expr] that checks for correct argument counts to the math functions at compile time shall be removed (the general consensus among Core maintainers is that compile-time checks of this sort are a bad idea anyway).

    Note that the call to tcl::mathfunc::f has no leading namespace delimiter. A search for the function will try to resolve it first relative to the current namespace.

  2. The current math functions in Tcl shall be reimplemented as ordinary Tcl commands in the ::tcl::mathfunc namespace; a Tcl interpreter will contain commands ::tcl::mathfunc::abs, ::tcl::mathfunc::acos, etc.

  3. The Tcl command [info functions] shall be deprecated in favor of searching for math functions using [info commands] in the appropriate namespace.

  4. The C functions Tcl_CreateMathFunc, Tcl_GetMathFuncInfo, and Tcl_ListMathFuncs shall be deprecated in favor of Tcl_CreateObjCommand, Tcl_GetCommandInfo and [info commands]. (The last is provided only in Tcl at the present time; we do not export a C-level interface to enumerate commands in a namespace.) The functions will continue to work for extensions that use them. The Tcl_CreateMathFunc function will create a Tcl command in the ::tcl::mathfunc namespace whose command procedure checks argument count and types and dispatches to the math function handler. The Tcl_GetMathFuncInfo procedure will return TCL_OK or TCL_ERROR according to whether the given math function exists in ::tcl::mathfunc and will return parameter information for (only) those functions defined using Tcl_CreateMathFunc. For functions defined as ordinary Tcl commands, it shall return TCL_OK but report a parameter count of -1. The Tcl_ListMathFuncs procedure will simply enumerate all the commands in the ::tcl::mathfunc namespace.

  5. Several error messages change as a side effect of changing the math function implementation. All of the new messages are more informative than the old ones (for instance, identifying which math function encountered a parameter error rather than a generic "too few/too many arguments to math function"), with the exception of the error message for an unknown function, which is replaced with "unknown command ::tcl::mathfunc::<name>", where <name> is the name of the missing function.
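
The first three changes above can be illustrated with a short sketch. It assumes an interpreter that already implements this TIP (Tcl 8.5 or later); the function name cube and the name nosuchfunc are hypothetical and chosen only for illustration.

```tcl
# A hypothetical scripted math function, defined as an ordinary command.
proc ::tcl::mathfunc::cube {x} {
    expr { $x * $x * $x }
}

# Function-call syntax and the explicit command call give the same result.
set viaExpr    [expr { cube(1 + 2) }]
set viaCommand [tcl::mathfunc::cube [expr { 1 + 2 }]]
puts "$viaExpr $viaCommand"

# Enumerate math functions with [info commands] rather than [info functions].
set names {}
foreach cmd [info commands ::tcl::mathfunc::*] {
    lappend names [namespace tail $cmd]
}
puts [lsort $names]

# Calling an undefined function raises an error when the expression is
# evaluated. The exact wording depends on the Tcl version, so the message
# is only printed here, not compared against a fixed string.
set failed [catch { expr { nosuchfunc(1) } } msg]
puts $msg
```

The list printed by the second step should contain both the built-in functions (abs, acos, and so on) and cube, since a scripted function is just another command in the same namespace.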

Discussion

The proposed change lifts several other restrictions in the way math functions operate:

  1. There is no longer any restriction that new math functions must accept a fixed number of arguments, or, indeed, that their arguments and results must be numeric. It will be possible to create variadic functions like:

       proc ::tcl::mathfunc::min args {
           set m Inf
           foreach a $args {
               if { $a < $m } {
                   set m $a
               }
           }
           return $m
       }
    
  2. It will be possible to compile Tcl code that refers to a math function that has not yet been defined. That is:

       proc foo { x } {
           set have_f [llength [info commands ::tcl::mathfunc::f]]
           if { $have_f } {
               return [expr { f($x) }]
           } else {
               ... fallback ...
           }
       }
    

    will be a valid procedure. (In Tcl 8.4, the procedure will fail to compile if f is not known at compile time, something that runs contrary to the dynamic nature of Tcl.)

  3. Namespaces will be able to define their own math functions that are not visible outside those namespaces. If a namespace defines a function [namespace current]::tcl::mathfunc::f, then calls to f in expressions evaluated in that namespace will resolve to it in preference to ::tcl::mathfunc::f. Not only does this rule allow two extensions both to define functions f without collision, but it also allows an extension to override a builtin function such as sin.
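
The namespace-local lookup described in the last item can be sketched as follows, again assuming an interpreter that implements this TIP. The namespace ::demo and the function twice are hypothetical names introduced for the example.

```tcl
# The namespace-local mathfunc namespace must exist before commands are
# defined in it; [namespace eval] creates it on demand.
namespace eval ::demo::tcl::mathfunc {
    # Hypothetical function, visible only to expressions evaluated in ::demo.
    proc twice {x} {
        expr { 2 * $x }
    }
}

namespace eval ::demo {
    # Inside ::demo, twice() resolves to ::demo::tcl::mathfunc::twice.
    proc run {x} {
        expr { twice($x) }
    }
}

set inside [::demo::run 21]
puts $inside

# In the global namespace there is no ::tcl::mathfunc::twice, so the same
# expression raises an error.
set outsideFailed [catch { expr { twice(21) } }]
puts $outsideFailed
```

The same pattern, with sin in place of twice, would let ::demo override a built-in function for its own expressions without affecting any other namespace.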

Alas, all these improvements come at some cost of performance. On Kevin Kenny's machine, the command

 expr {sin(0.325)}

executes in roughly 520 nanoseconds before the change, and 870 nanoseconds afterward. Since the resolution of the function name is cached, name lookup is not the problem; rather, the issue is that invoking the function as a command requires checking for command traces; this whole mechanism costs about 300-350 ns (and has been observed to do so in other contexts). For real-life expressions, the additional cost tends to vanish quickly into statistical noise; variable access (with corresponding checks for traces) and bytecode overhead quickly come to dominate the performance for all but the simplest expressions.
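
Readers who want to repeat the measurement on their own machine could use something like the following sketch. The helper name bench and the iteration count are assumptions made for this example, and the figures obtained will differ from those quoted above.

```tcl
# Hypothetical helper: average the cost of one evaluation over many
# iterations. [time] reports the result in microseconds per iteration.
proc bench {{iterations 100000}} {
    time { expr { sin(0.325) } } $iterations
}

set report [bench]
puts $report
```

Wrapping the expression in a procedure ensures that it is bytecode-compiled once and then reused, which is the case the figures above describe.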

Safe Interpreters

It is not anticipated that exposing this functionality in a safe interpreter will present any new problems for safety. Any functionality that the interpreter can access by defining or overriding math functions is functionality that would have been available to it by calling the functions as commands.
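
As a minimal sketch of this point, assuming an interpreter that implements this TIP, a safe interpreter can define a math function of its own without gaining anything it could not already do with proc. The function name sq is hypothetical.

```tcl
# Create a safe child interpreter.
set child [interp create -safe]

# Inside the child, define a hypothetical math function and call it.
set result [interp eval $child {
    proc ::tcl::mathfunc::sq {x} {
        expr { $x * $x }
    }
    expr { sq(3) }
}]
puts $result

# The definition lives only in the child; the parent interpreter has no
# sq() function, so the same expression fails here.
set parentFailed [catch { expr { sq(3) } }]
puts $parentFailed

interp delete $child
```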

Impact on C-coded Extensions

Extensions coded in C that wish to create math functions accepting parameters of type TCL_EITHER may find that they do not get type coercion from parameters of new numeric types, such as extended-precision integers. The coding change to replace them with Tcl commands is fairly easy and mechanical, at a level of effort comparable to that needed for [72]. Moreover, once it is completed, a math function will be using the well-known Tcl_CreateObjCommand API, which has been stable since Tcl 8.0 and is unlikely to change substantially in future releases.

The tbcload extension will need to implement a small amount of bytecode translation to preserve compatibility with bytecode-compiled modules built against earlier versions of Tcl. The reason is that two bytecodes, INST_INVOKE_BUILTIN1 and INST_INVOKE_FUNC1, have been eliminated from Tcl_ExecuteByteCode since the compiler no longer emits them. If this change should for some reason prove infeasible, we can always put the bytecodes back into Tcl_ExecuteByteCode, but the authors of this TIP would prefer to avoid the code bloat.

Reference Implementation

A reference implementation is committed on the 'tcl-numerics-branch' at SourceForge and may be retrieved with

 cvs -d:ext:[email protected]:/cvsroot/tcl checkout -rTIP232 tcl_numerics

Copyright

Copyright (c) 2005 by Kevin B. Kenny and Adriaan Markus. All rights reserved.

This document may be distributed subject to the terms and conditions set forth in the Open Publication License, version 1.0 http://www.opencontent.org/openpub/ .