I am planning to do a major rewrite of my LLA library in the near future. LLA, which stands for Lisp Linear Algebra, uses BLAS and LAPACK to perform operations on matrices (and vectors), ranging from matrix multiplication to singular value decompositions. LLA is experimental, and its aim is to provide semantics which recognize special matrix types both as inputs and outputs. Maybe I will blog about it sometime, but that's not the focus of this post.
Currently, LLA uses Lisp arrays, but wrapped in a class which has a slot signifying the element type of the arrays. This is because it is meant to be portable, and not all implementations have upgraded array element types for every type LAPACK recognizes (see below). Also, LLA matrix representation is currently column major, and Lisp arrays are row-major.
However, I realized that I can use Lisp arrays in more places - LAPACK
handles row-major arrays just fine (as transposed column-major
arrays). Besides, having a native Lisp array is nice, eg one can just
use functions like REDUCE
or REPLACE
, without dealing with a
special type. However, there is a price to pay: when LAPACK routines
are called, the array element type has to be detectable. For
implementations which upgrade element types use by LLA to themselves
or something specific, this is a cheap operation. For implementations
that don't, it requires an extra pass.
Another issue is whether your implementation has what SBCL calls "pinned" arrays. Pinned arrays can make their contents available to foreign routines directly with negligible overhead for a limited duration (or maybe indefinitely, but I prefer the first).
I asked for implementation-specific details on c.l.l, and got the following information:
Implementation | Machine | S | D | CS | CD | I32 | I64 | pinned? |
---|---|---|---|---|---|---|---|---|
SBCL (1.0.38) | 64-bit | * | * | * | * | * | * | yes |
Lispworks (6.0.1) | 32-bit | * | * | T | T | * | * | ? |
Lispworks Personal Edition (5.1.1) | 32-bit | * | * | T | T | * | T | ? |
Clozure CL (1.4, 1.6) | 32-bit | * | * | T | T | * | T | yes |
Clozure CL (1.6) | 64-bit | * | * | T | T | * | * | yes |
ECL (10.3.1) | 32-bit | * | * | T | T | T | T | yes |
ABCL (0.13.0) | 64-bit | T | T | T | T | T | T | ? |
Allegco CL Enterprise Edition (8.1 & 8.2, Linux and Windows) | 32-bit | * | * | * | * | * | T | ? |
Allegco CL Enterprise Edition (8.1 & 8.2, Linux and Windows) | 64-bit | * | * | * | * | * | * | ? |
CLISP (2.48) | both | T | T | T | T | T | T | ? |
CMUCL (19e) | 32-bit | * | * | * | * | * | T | yes |
Corman Common Lisp (3.01, Windows) | ? | * | * | T | T | T | T | ? |
The interpretation is this: S, D, CS and CD correspond to (complex) single and double float. I32 and I64 are signed-byte types of 32 and 64 bits. * means that the element type is upgraded to itself, otherwise the table shows the upgraded type.
The following table gives some details on the pinning mechanism when the implementation has it:
Implementation | Pinning construct | Notes |
---|---|---|
SBCL | sb-sys:with-pinned-objects with sb-sys:vector-sap | pins object only where GC granularity allows |
Clozure CL | ccl:with-pointer-to-ivector | disables GC |
ECL | ? | conservative GC, data can be used as long as the object is alive |
So LLA is likely to have the following optimization model: it will be blazingly fast (basically the speed of LAPACK, with a tiny bit of overhead) on implementations which support all upgraded types and pinning, and a bit slower on other ones. If your implementation supports upgrading some of the above element types to themselves, those arrays will require no element type detection so they will be faster. I64 is only needed on 64-bit machines.
The motivation behind this decision is that if you are doing serious numerical work, your implementation should support all relevant array element types and also pinning. SBCL does, and since I am using that, the new version of LLA will take advantage of all its nice facilities — it already does, but currently it only runs on SBCL so that was natural. LLA will be fully functional on other implementations, but may be slower.
Please keep sending in information for the implementations you don't see in the table above. I am especially interested in which implementations support pinning. You can use this code snipped to generate output on array element type upgrading:
(flet ((check-upgraded (type) (let ((upgraded (upgraded-array-element-type type))) (format t "~A is upgraded to ~A~%" type (if (equal type upgraded) "itself" upgraded))))) (format t "~2&~A (~A) on ~A (~A)~2%" (lisp-implementation-type) (lisp-implementation-version) (machine-type) (machine-version)) (map nil #'check-upgraded '(single-float double-float (complex single-float) (complex double-float) (signed-byte 32) (signed-byte 64))))
;; cygwin clisp:
ReplyDeleteCLISP (2.48 (2009-07-28) (built on ATGRZWN502840.avl01.avlcorp.lan [157.247.26.41])) on I686 (I686)
SINGLE-FLOAT is upgraded to T
DOUBLE-FLOAT is upgraded to T
(COMPLEX SINGLE-FLOAT) is upgraded to T
(COMPLEX DOUBLE-FLOAT) is upgraded to T
(SIGNED-BYTE 32) is upgraded to T
(SIGNED-BYTE 64) is upgraded to T
CCL has ccl:with-pointer-to-ivector, the body of which disables GC, so that it won't be moved.
ReplyDeleteTwo things. The first one is that all arrays in ECL are pinned: our garbage collector is conservative and thus the pointer to its data can be passed to foreign routines. You just have to make sure that a reference to the array is kept live somewhere.
ReplyDeleteThe other one is that the I32 and I64 integers are upgraded not to "internal" types, but just to integers. The fact we use some names for them should not prevent you from supporting them efficiently -- if you don't, I would like to read why and help you solve whatever problems you find.
Hey, I'd love something like this to be available on Windows with Corman CL, so here's:
ReplyDeleteCorman Common Lisp (3.01) on NIL (NIL)
SINGLE-FLOAT is upgraded to itself
DOUBLE-FLOAT is upgraded to itself
(COMPLEX SINGLE-FLOAT) is upgraded to T
(COMPLEX DOUBLE-FLOAT) is upgraded to T
(SIGNED-BYTE 32) is upgraded to T
(SIGNED-BYTE 64) is upgraded to T
Your data regarding Clozure CL (64-bit) seems to be wrong w.r.t. I32 and I64. See output below:
ReplyDeleteClozure Common Lisp (Version 1.6-dev-r13691M-trunk (DarwinX8664)) on i386 (iMac6,1)
SINGLE-FLOAT is upgraded to itself
DOUBLE-FLOAT is upgraded to itself
(COMPLEX SINGLE-FLOAT) is upgraded to T
(COMPLEX DOUBLE-FLOAT) is upgraded to T
(SIGNED-BYTE 32) is upgraded to itself
(SIGNED-BYTE 64) is upgraded to itself
Thanks for all the data and corrections, I have updated the table.
ReplyDeleteThere's an alternative to pinning arrays: to use static arrays, which are never moved by the GC.
ReplyDeleteI wrote a library that wraps the various implementation's way of allocating static arrays: http://gitorious.org/iolib/static-vectors/
Allegro CL is now getting a complete SMP infrastructure, and provides (at least in 8.1 w/patches and 8.2) with-pinned-objects.
ReplyDeleteSee http://www.franz.com/support/documentation/8.1/doc/operators/excl/with-pinned-objects.htm
Cheers,
Don't know if this is made obsolete by cyclopes' post:
ReplyDeleteInternational Allegro CL Enterprise Edition (8.2 [64-bit Mac OS X (Intel)] (Jan 25, 2010 14:49)) on 64-bit x86-64 (i386)
SINGLE-FLOAT is upgraded to itself
DOUBLE-FLOAT is upgraded to itself
(COMPLEX SINGLE-FLOAT) is upgraded to itself
(COMPLEX DOUBLE-FLOAT) is upgraded to itself
(SIGNED-BYTE 32) is upgraded to itself
(SIGNED-BYTE 64) is upgraded to itself
NIL
Did you know that a limit on the rank of an array should not not be smaller than 7?
ReplyDeleteLispWorks (6.1.0) on i686 (NIL)
ReplyDeleteSINGLE-FLOAT is upgraded to itself
DOUBLE-FLOAT is upgraded to itself
(COMPLEX SINGLE-FLOAT) is upgraded to T
(COMPLEX DOUBLE-FLOAT) is upgraded to T
(SIGNED-BYTE 32) is upgraded to itself
(SIGNED-BYTE 64) is upgraded to T
NIL