Thursday, July 21, 2011

Library configuration in Common Lisp: an example

I have been dodging the issue of configuration for LLA for a while, but eventually I had to come up with a solution for the revised version that I pushed to Github yesterday.

LLA users may need to configure two things: the location and name of libraries that they want to use, and whether LLA should use 64-bit integers to interface with BLAS/LAPACK (this is rarely needed, even on 64-bit platforms, as many implementations still use 32-bit integers). In the previous versions of LLA, I tried to do this with read-time conditionals (based on trivial-features) but this was not satisfactory as it is impossible to figure out the list of libraries from nothing but the platform information — for example, a Linux user could use ATLAS or MKL.

I read Maintaining Portable Lisp Programs by Christophe Rhodes and also asked on cl-pro, where I got a lot of good suggestions and decided to follow the advice of Pascal Costanza. This is how configuration works at the moment for LLA:

  1. The user may define a variable *lla-configuration* in the CL-USER package. The variable should contain a plist of configuration options, e.g. :libraries. The variable does not need to be bound, and it can be NIL or incomplete: users only need to deal with configuration when they want to override the defaults of the library. LLA makes an effort to come up with sensible platform-specific defaults.
  2. When loaded/compiled, LLA checks whether cl-user::*lla-configuration* is bound, and if it is, it uses the corresponding values. If the variable is not bound or doesn't contain the desired property, a default value is used instead.
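
The lookup in step 2 can be sketched in a few lines. This only illustrates the mechanism described above and is not LLA's actual code: the function name EXAMPLE-CONFIGURATION-VALUE and the library path are made up, and the format of the :libraries value is an assumption.

```lisp
;; User side, e.g. in an init file evaluated before LLA is loaded.
;; The path is a placeholder and the value format is assumed.
(defparameter cl-user::*lla-configuration*
  '(:libraries ("/path/to/libexample.so")))

;; Library side (hypothetical helper, not part of LLA's API).
(defun example-configuration-value (key default)
  "Return the value of KEY in the plist CL-USER::*LLA-CONFIGURATION*.
Return DEFAULT if the variable does not exist, is unbound, or lacks KEY."
  (let ((variable (find-symbol "*LLA-CONFIGURATION*" "CL-USER")))
    (if (and variable (boundp variable))
        (getf (symbol-value variable) key default)
        default)))

;; The configured value wins over the default.
(example-configuration-value :libraries '("libdefault.so"))
;; => ("/path/to/libexample.so")

;; A property missing from the plist falls back to the default.
(example-configuration-value :missing-option :fallback)
;; => :FALLBACK
```

BOUNDP is what lets the user skip configuration entirely: an unbound variable is not an error, it just selects the defaults.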

Internally, some features are implemented by pushing symbols like lla::int64 to *features*. I have to admit that I didn't even consider the possibility of package qualifiers in read-time conditionals before I read the appropriate sections of the Hyperspec, but in retrospect it makes perfect sense as it helps to avoid name clashes. LLA also removes its own symbols from *features* if they don't belong there: this means that you can reload the library with a different configuration without restarting your CL image.
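
Here is a minimal sketch of that technique, assuming a throwaway package FEATURES-SKETCH and a hypothetical helper SET-FEATURE (neither is part of LLA). Feature expressions are read with the KEYWORD package as the current package, which is why the symbol has to be written with its package qualifier after #+ and #-.

```lisp
;; A package of our own, so the feature symbol cannot clash with others.
(defpackage #:features-sketch (:use))

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun set-feature (symbol enabled)
    "Ensure SYMBOL is in *FEATURES* exactly when ENABLED is true."
    (if enabled
        (pushnew symbol *features*)
        ;; Removing stale symbols is what makes reloading with a
        ;; different configuration work in the same image.
        (setf *features* (remove symbol *features*)))
    enabled)
  ;; Pretend the configuration asked for 64-bit integers.
  (set-feature 'features-sketch::int64 t))

(defun blas-integer-bits ()
  "The branch is selected when this form is read, not when it is called."
  #+features-sketch::int64 64
  #-features-sketch::int64 32)

(blas-integer-bits) ; => 64
```

Because the conditional is resolved at read time, calling SET-FEATURE later does not change BLAS-INTEGER-BITS until the definition is read again, which is why a changed configuration requires reloading the library.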

Wednesday, July 20, 2011

LLA reorganization almost finished, new version on Github

I have almost completed the reorganization of LLA and merged it to the main branch on Github (sorry, no installation via Quicklisp until the library is fully stable; for the moment you have to check it out from Github). Make sure that you get the latest dependencies, most importantly cl-num-utils and let-plus.

This version of LLA boasts a reorganized interface: most importantly, now it works with plain vanilla Common Lisp arrays:

LLA> (defparameter *a* #2A((1 2) (3 4)))
*A*
LLA> (mm t *a*) ; same as A^T A, but LLA recognizes that the result is Hermitian
#<HERMITIAN-MATRIX 
  10.00000        *
  14.00000 20.00000>
LLA> (elements *)  ; Lisp arrays inside all special matrices
#2A((10.0d0 0.0d0) (14.0d0 20.0d0))
LLA> (svd *a* :thin)
#S(SVD
   :U #2A((-0.40455358483375703d0 0.9145142956773045d0)
          (-0.9145142956773045d0 -0.40455358483375703d0))
   :D #<DIAGONAL 
  5.46499       .
        . 0.36597>
   :VT #2A((-0.5760484367663207d0 -0.817415560470363d0)
           (-0.817415560470363d0 0.5760484367663208d0)))
LLA> (elements (svd-d *)) ; Lisp arrays again
#(5.464985704219043d0 0.3659661906262576d0)

LLA uses clever tricks to avoid transposing when it can. Row-major and column-major representations are transposes of each other, and instead of calculating an SVD as $$A=UDV^T$$ LLA just calculates $$A^T={V^T}^T D^T U^T$$ and returns \(U\) as \(V^T\) and vice versa. Of course transposing cannot be avoided all the time, but it is not needed often and should not impose a large performance penalty. So LLA can work with row-major Common Lisp arrays natively; in fact, that's the only array representation it supports.
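
The claim that the two layouts are transposes of each other is easy to check in plain Common Lisp. The sketch below uses only standard functions; COLUMN-MAJOR-REF and READS-AS-TRANSPOSE-P are illustrative names, not LLA functions.

```lisp
(defun column-major-ref (array nrow row col)
  "Element (ROW, COL) of ARRAY's storage when read as a column-major
matrix with NROW rows, which is how a Fortran routine would see it."
  (row-major-aref array (+ row (* col nrow))))

(defun reads-as-transpose-p (a)
  "True if the storage of the row-major matrix A, read column-major with
the dimensions swapped, is the transpose of A."
  (destructuring-bind (nrow ncol) (array-dimensions a)
    (loop for i below ncol
          always (loop for j below nrow
                       always (eql (column-major-ref a ncol i j)
                                   (aref a j i))))))

(reads-as-transpose-p #2A((1 2 3) (4 5 6))) ; => T
```

So handing the untouched storage of a 2x3 Lisp array to Fortran and declaring it 3x2 gives Fortran the transpose for free, and the SVD identity above turns that into the result for the original matrix.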

This version of LLA has a reorganized DSL for BLAS/LAPACK calls which makes it really easy to wrap Fortran functions in general, not just LAPACK/BLAS functions for which it has extensions. For example, this is how Cholesky factorization is implemented:

(defmethod cholesky ((a hermitian-matrix))
  (let+ ((a (elements a))
         ((a0 a1) (array-dimensions a)))
    (assert (= a0 a1))
    (lapack-call ("potrf" (common-float-type a)
                          (make-instance 'cholesky :left-square-root 
                                         (make-lower-triangular-matrix l)))
                 #\U (&integer a0) (&array a :output l) (&integer a0) &info)))

The first two lines are just bookkeeping: the array A containing the elements of the Hermitian (symmetric) matrix (in the lower triangle) is extracted, and its dimensions are examined. LAPACK-CALL figures out whether to use SPOTRF, DPOTRF, CPOTRF, or ZPOTRF from the element type of the matrix, allocates memory for atoms (recall that Fortran takes pointers) using macros such as &integer, and automatically examines the INFO parameter at &info to catch exceptions. The #\U in the code tells LAPACK that storage is upper-triangular, even though in CL we store elements of the Hermitian matrix in the lower triangle: because we avoid transposing, Fortran sees the row-major array as its transpose, so the lower triangle in CL is the upper triangle there. I find that it is really easy to write wrappers to Fortran code using this macro family (there is also a BLAS-CALL and a LAPACK-CALL-W/QUERY that queries workspace sizes), and it should be easy to write a FORTRAN-CALL for general Fortran code, or even a wrapper for C functions that handle arrays. Let me know if you are interested in any of these; I would be happy to add them or even factor out this part of LLA into another library.
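
To make the name selection concrete, here is a rough sketch of how a routine name could be derived from an element type. It follows the standard LAPACK prefix convention (S, D, C, Z), but the function EXAMPLE-LAPACK-NAME is hypothetical and says nothing about how LAPACK-CALL is actually implemented.

```lisp
(defun example-lapack-name (base-name element-type)
  "Prepend the LAPACK type prefix for ELEMENT-TYPE to BASE-NAME."
  (let ((prefix
          (cond ((subtypep element-type 'single-float) "S")
                ((subtypep element-type 'double-float) "D")
                ((subtypep element-type '(complex single-float)) "C")
                ((subtypep element-type '(complex double-float)) "Z")
                (t (error "No LAPACK prefix for element type ~S."
                          element-type)))))
    (concatenate 'string prefix (string-upcase base-name))))

(example-lapack-name "potrf" 'double-float)           ; => "DPOTRF"
(example-lapack-name "potrf" '(complex single-float)) ; => "CPOTRF"
```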

By the way, contrary to what my previous post on LLA said, you no longer need the C wrappers in MKL or CLAPACK, so ATLAS or something similar is just fine. The README has instructions on how to set up these libraries (unless you want to compile your own, a single apt-get install or similar is enough to satisfy LLA's dependencies) and configure custom library locations.

Even though LLA is now perfectly usable and all the unit tests work just fine, there are still some things left to do:

  1. Write the implementation-specific array sharing macros. LLA uses a few macros to make arrays available for foreign functions, and all the wrappers are built around these. The idea is to avoid copying if the implementation supports it, and otherwise fall back to copying. I decided to focus my efforts on getting LLA to work again first, and this means that currently all operations fall back to copying, which should entail a slight performance penalty. I will of course remedy this at some point, first for SBCL and ECL and then for the other implementations, but I am waiting for the rest of LLA to stabilize before I start fiddling with low-level optimizations.
  2. Eigenvalues/eigenvectors are not yet implemented. LAPACK's handling of eigenvalues and especially eigenvectors is a mess, and I need to dust off the code that picks out paired eigenvectors. It should be fairly easy, but I prefer not to add functionality without thorough testing, so currently I am postponing this until someone needs it (just drop me a line).

Also, even though I wrote a lot of unit tests, it is of course possible (and probable) that LLA has bugs. Please report them on Github.

I would like to thank all users of LLA for helpful suggestions and for the gentle nudging I received during LLA's reorganization. I believe that LLA fills an important niche, and I will continue to work on it.

Thursday, July 14, 2011

Why I only buy e-books from indie publishers

Nikodemus Siivola blogged about the $2 surcharge Amazon extracts from Kindle customers outside the US. I use my Kindle every day, and I buy e-books to read on a regular basis, spending around $10-30 a month (depending on how much time I have for reading). I have owned a Kindle for almost a year now, but I didn't know about this surcharge: I guess the reason for this is that none of the money I spent on e-books went to Amazon.

There are two reasons for this: their non-competitive pricing and DRM. First, I am reluctant to pay more for an e-book than I would for a paperback, yet I frequently saw e-books on Amazon that cost more than their paperback versions. I understand that Amazon is trying to extract economic rents using its strong position in this market, but I would feel like an idiot if I participated in this scheme.

Regarding DRM: I like my Kindle very much, but I think that in the long run e-book readers will become even more of a commodity (like laptops or cell phones), and I don't wish my e-book collection to be tied to Amazon. Call me old-fashioned, but I do want to own the content I pay for, especially if I paid almost as much as I would pay for a hardcopy. So I only buy DRM-free books.

I mostly read literature on my Kindle. I buy quite a bit of sci-fi from Baen Books. They also have a lot of books available for free, knowing full well that good sci-fi is addictive for some people anyway, but they don't exploit this: most of their books sell for $5-6. I haven't had a chance to check out Calibre's DRM-free book store, but it looks very promising.

I think that unless Amazon changes its pricing policy and considers offering DRM-free content, there is very little chance that I will buy anything from them. If you know some indie authors or publishers who sell reasonably priced DRM-free e-books, I would be interested in hearing about them.