## Monday, May 17, 2010

### Upgraded array element types and pinned arrays (updated)

I am planning to do a major rewrite of my LLA library in the near future. LLA, which stands for Lisp Linear Algebra, uses BLAS and LAPACK to perform operations on matrices (and vectors), ranging from matrix multiplication to singular value decompositions. LLA is experimental, and its aim is to provide semantics which recognize special matrix types both as inputs and outputs. Maybe I will blog about it sometime, but that's not the focus of this post.

Currently, LLA uses Lisp arrays, but wrapped in a class which has a slot signifying the element type of the arrays. This is because it is meant to be portable, and not all implementations have upgraded array element types for every type LAPACK recognizes (see below). Also, LLA matrix representation is currently column major, and Lisp arrays are row-major.

However, I realized that I can use Lisp arrays in more places - LAPACK handles row-major arrays just fine (as transposed column-major arrays). Besides, having a native Lisp array is nice, eg one can just use functions like REDUCE or REPLACE, without dealing with a special type. However, there is a price to pay: when LAPACK routines are called, the array element type has to be detectable. For implementations which upgrade element types use by LLA to themselves or something specific, this is a cheap operation. For implementations that don't, it requires an extra pass.

Another issue is whether your implementation has what SBCL calls "pinned" arrays. Pinned arrays can make their contents available to foreign routines directly with negligible overhead for a limited duration (or maybe indefinitely, but I prefer the first).

I asked for implementation-specific details on c.l.l, and got the following information:

ImplementationMachineSDCSCDI32I64pinned?
SBCL (1.0.38)64-bit******yes
Lispworks (6.0.1)32-bit**TT**?
Lispworks Personal Edition (5.1.1)32-bit**TT*T?
Clozure CL (1.4, 1.6)32-bit**TT*Tyes
Clozure CL (1.6)64-bit**TT**yes
ECL (10.3.1)32-bit**TTTTyes
ABCL (0.13.0)64-bitTTTTTT?
Allegco CL Enterprise Edition (8.1 & 8.2, Linux and Windows)32-bit*****T?
Allegco CL Enterprise Edition (8.1 & 8.2, Linux and Windows)64-bit******?
CLISP (2.48)bothTTTTTT?
CMUCL (19e)32-bit*****Tyes
Corman Common Lisp (3.01, Windows)?**TTTT?

The interpretation is this: S, D, CS and CD correspond to (complex) single and double float. I32 and I64 are signed-byte types of 32 and 64 bits. * means that the element type is upgraded to itself, otherwise the table shows the upgraded type.

The following table gives some details on the pinning mechanism when the implementation has it:

ImplementationPinning constructNotes
SBCLsb-sys:with-pinned-objects with sb-sys:vector-sappins object only where GC granularity allows
Clozure CLccl:with-pointer-to-ivectordisables GC
ECL?conservative GC, data can be used as long as the object is alive

So LLA is likely to have the following optimization model: it will be blazingly fast (basically the speed of LAPACK, with a tiny bit of overhead) on implementations which support all upgraded types and pinning, and a bit slower on other ones. If your implementation supports upgrading some of the above element types to themselves, those arrays will require no element type detection so they will be faster. I64 is only needed on 64-bit machines.

The motivation behind this decision is that if you are doing serious numerical work, your implementation should support all relevant array element types and also pinning. SBCL does, and since I am using that, the new version of LLA will take advantage of all its nice facilities — it already does, but currently it only runs on SBCL so that was natural. LLA will be fully functional on other implementations, but may be slower.

Please keep sending in information for the implementations you don't see in the table above. I am especially interested in which implementations support pinning. You can use this code snipped to generate output on array element type upgrading:

(flet ((check-upgraded (type)
(format t "~A is upgraded to ~A~%"
type
"itself"
(format t "~2&~A (~A) on ~A (~A)~2%"
(lisp-implementation-type) (lisp-implementation-version)
(machine-type) (machine-version))
(complex single-float) (complex double-float)
(signed-byte 32) (signed-byte 64))))


1. ;; cygwin clisp:

CLISP (2.48 (2009-07-28) (built on ATGRZWN502840.avl01.avlcorp.lan [157.247.26.41])) on I686 (I686)

(COMPLEX SINGLE-FLOAT) is upgraded to T
(COMPLEX DOUBLE-FLOAT) is upgraded to T
(SIGNED-BYTE 32) is upgraded to T
(SIGNED-BYTE 64) is upgraded to T

2. CCL has ccl:with-pointer-to-ivector, the body of which disables GC, so that it won't be moved.

3. Two things. The first one is that all arrays in ECL are pinned: our garbage collector is conservative and thus the pointer to its data can be passed to foreign routines. You just have to make sure that a reference to the array is kept live somewhere.

The other one is that the I32 and I64 integers are upgraded not to "internal" types, but just to integers. The fact we use some names for them should not prevent you from supporting them efficiently -- if you don't, I would like to read why and help you solve whatever problems you find.

4. Hey, I'd love something like this to be available on Windows with Corman CL, so here's:

Corman Common Lisp (3.01) on NIL (NIL)

(COMPLEX SINGLE-FLOAT) is upgraded to T
(COMPLEX DOUBLE-FLOAT) is upgraded to T
(SIGNED-BYTE 32) is upgraded to T
(SIGNED-BYTE 64) is upgraded to T

5. Your data regarding Clozure CL (64-bit) seems to be wrong w.r.t. I32 and I64. See output below:

Clozure Common Lisp (Version 1.6-dev-r13691M-trunk (DarwinX8664)) on i386 (iMac6,1)

(COMPLEX SINGLE-FLOAT) is upgraded to T
(COMPLEX DOUBLE-FLOAT) is upgraded to T
(SIGNED-BYTE 32) is upgraded to itself
(SIGNED-BYTE 64) is upgraded to itself

6. Thanks for all the data and corrections, I have updated the table.

7. There's an alternative to pinning arrays: to use static arrays, which are never moved by the GC.

I wrote a library that wraps the various implementation's way of allocating static arrays: http://gitorious.org/iolib/static-vectors/

8. Allegro CL is now getting a complete SMP infrastructure, and provides (at least in 8.1 w/patches and 8.2) with-pinned-objects.
See http://www.franz.com/support/documentation/8.1/doc/operators/excl/with-pinned-objects.htm

Cheers,

9. Don't know if this is made obsolete by cyclopes' post:

International Allegro CL Enterprise Edition (8.2 [64-bit Mac OS X (Intel)] (Jan 25, 2010 14:49)) on 64-bit x86-64 (i386)

(COMPLEX SINGLE-FLOAT) is upgraded to itself
(COMPLEX DOUBLE-FLOAT) is upgraded to itself
(SIGNED-BYTE 32) is upgraded to itself
(SIGNED-BYTE 64) is upgraded to itself
NIL

10. Did you know that a limit on the rank of an array should not not be smaller than 7?

11. LispWorks (6.1.0) on i686 (NIL)