Thursday, July 21, 2011

Library configuration in Common Lisp: an example

I have been dodging the issue of configuration for LLA for a while, but eventually I had to come up with a solution for the revised version that I pushed to Github yesterday.

LLA users may need to configure two things: the location and name of libraries that they want to use, and whether LLA should use 64-bit integers to interface with BLAS/LAPACK (this is rarely needed, even on 64-bit platforms, as many implementations still use 32-bit integers). In the previous versions of LLA, I tried to do this with read-time conditionals (based on trivial-features) but this was not satisfactory as it is impossible to figure out the list of libraries from nothing but the platform information — for example, a Linux user could use ATLAS or MKL.

I read Maintaining Portable Lisp Programs by Christophe Rhodes and also asked on cl-pro, where I got a lot of good suggestions and decided to follow the advice of Pascal Costanza. This is how configuration works at the moment for LLA:

  1. The user may define a variable *lla-configuration* in the CL-USER package. The variable should contain a plist of configuration options, e.g. :libraries. The variable does not need to be bound, and it can be NIL or incomplete: the user only needs to deal with configuration when they want to override the defaults of the library. LLA makes an effort to come up with sensible platform-specific defaults.
  2. When loaded/compiled, LLA checks whether cl-user::*lla-configuration* is bound, and if it is, it uses the corresponding values. If the variable is not bound or doesn't contain the desired property, a default value is used instead.
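Concretely, a user who wants to point LLA at ATLAS could evaluate something like the following before loading LLA. The :libraries key is the one described above; the path is only an example, so adjust it to your system (the README lists the currently supported options):

```lisp
;; Evaluate this (or put it in your init file) *before* loading LLA.
;; The library path below is illustrative; use your own.
(in-package :cl-user)

(defvar *lla-configuration*
  '(:libraries ("/usr/lib/atlas-base/atlas/liblapack.so.3")))
```

If the variable is left unbound, or a key is missing, LLA falls back to its platform-specific defaults as described above.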

Internally, some features are implemented by pushing symbols like lla::int64 to *features*. I have to admit that I didn't even consider the possibility of package qualifiers in read-time conditionals before I read the appropriate sections of the Hyperspec, but in retrospect it makes perfect sense as it helps to avoid name clashes. LLA also removes its own symbols from *features* if they don't belong there: this means that you can reload the library with a different configuration without restarting your CL image.
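A minimal sketch of this mechanism (the feature symbol is the one mentioned above, but the surrounding code is illustrative, not LLA's actual source):

```lisp
;; Sketch only -- illustrative, not LLA's actual source.
;; During loading, push a package-qualified feature symbol:
(when (use-64-bit-integers-p)           ; hypothetical predicate
  (pushnew 'lla::int64 *features*))

;; Elsewhere, read-time conditionals test the qualified symbol, so
;; there is no clash with another library's INT64:
#+lla::int64 (define-blas-int 64)       ; hypothetical macros
#-lla::int64 (define-blas-int 32)

;; Before reloading with a new configuration, LLA can drop its own
;; symbols so stale features don't linger:
(setf *features*
      (remove (find-package 'lla) *features*
              :key #'symbol-package))
```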

Wednesday, July 20, 2011

LLA reorganization almost finished, new version on Github

I have almost completed the reorganization of LLA and merged it into the main branch on Github (sorry, no installation via Quicklisp until the library is fully stable; for the moment you have to check it out from Github). Make sure that you get the latest dependencies, most importantly cl-num-utils and let-plus.

This version of LLA boasts a reorganized interface: most importantly, now it works with plain vanilla Common Lisp arrays:

LLA> (defparameter *a* #2A((1 2) (3 4)))
LLA> (mm t *a*) ; same as A^T A, but LLA recognizes that the result is Hermitian
  10.00000        *
  14.00000 20.00000>
LLA> (elements *)  ; Lisp arrays inside all special matrices
#2A((10.0d0 0.0d0) (14.0d0 20.0d0))
LLA> (svd *a* :thin)
   :U #2A((-0.40455358483375703d0 0.9145142956773045d0)
          (-0.9145142956773045d0 -0.40455358483375703d0))
  5.46499       .
        . 0.36597>
   :VT #2A((-0.5760484367663207d0 -0.817415560470363d0)
           (-0.817415560470363d0 0.5760484367663208d0)))
LLA> (elements (svd-d *)) ; Lisp arrays again
#(5.464985704219043d0 0.3659661906262576d0)

LLA uses clever tricks to avoid transposing when it can. Row-major and column-major representations are transposes of each other, and instead of calculating an SVD as $$A=UDV^T$$ LLA just calculates $$A^T={V^T}^T D^T U^T$$ and returns \(U\) as \(V^T\) and vice versa. Of course transposing cannot be avoided all the time, but it is not needed often and should not impose a large performance penalty. So LLA can work with row-major Common Lisp arrays natively — in fact, that's the only array representation it supports.

This version of LLA has a reorganized DSL for BLAS/LAPACK calls which makes it really easy to wrap Fortran functions in general, not just LAPACK/BLAS functions for which it has extensions. For example, this is how Cholesky factorization is implemented:

(defmethod cholesky ((a hermitian-matrix))
  (let+ ((a (elements a))
         ((a0 a1) (array-dimensions a)))
    (assert (= a0 a1))
    (lapack-call ("potrf" (common-float-type a)
                          (make-instance 'cholesky :left-square-root 
                                         (make-lower-triangular-matrix l)))
                 #\U (&integer a0) (&array a :output l) (&integer a0) &info)))

The first two lines are just bookkeeping: the array A containing the elements of the hermitian (symmetric) matrix (in the lower triangle) is extracted, and its dimensions are checked. LAPACK-CALL figures out whether to use SPOTRF, DPOTRF, CPOTRF or ZPOTRF from the element type of the matrix, allocates foreign memory for atoms (recall that Fortran takes pointers) using the & macros, and automatically examines the INFO parameter at &info to catch errors. The #\U in the code indicates upper-triangular storage, even though in CL we store the elements of the hermitian matrix in the lower triangle: this is again because we avoid transposing.

I find that it is really easy to write wrappers for Fortran code using this macro family (there is also a BLAS-CALL and a LAPACK-CALL-W/QUERY that queries workspace sizes), and it would be easy to write a FORTRAN-CALL for general Fortran code, or even a wrapper for C functions that handle arrays. Let me know if you are interested in any of these; I would be happy to add them, or even factor this part of LLA out into a separate library.
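To illustrate, here is how one might wrap another LAPACK routine, xPOTRS (back-substitution using a Cholesky factor), in the same style. This is a sketch based on my reading of the DSL above, not code from LLA: the SOLVE method, the LEFT-SQUARE-ROOT accessor and the exact &-forms are assumptions patterned on the CHOLESKY example.

```lisp
;; Hypothetical sketch, not LLA source.  LAPACK-CALL would again pick
;; SPOTRS/DPOTRS/CPOTRS/ZPOTRS from the element type; B is both input
;; and output (the solution X).
(defmethod solve ((cholesky cholesky) b)
  (let+ ((l (elements (left-square-root cholesky)))
         (b (elements b))
         ((n n2) (array-dimensions l))
         ((&ign nrhs) (array-dimensions b)))
    (assert (= n n2))
    (lapack-call ("potrs" (common-float-type l) x)
                 #\U (&integer n) (&integer nrhs)
                 (&array l) (&integer n)
                 (&array b :output x) (&integer n) &info)))
```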

By the way, contrary to what my previous post on LLA said, you no longer need the C wrappers in MKL or CLAPACK, so ATLAS or something similar is just fine. The README has instructions on how to set up these libraries (unless you want to compile your own, a single apt-get install or similar is enough to satisfy LLA's dependencies) and configure custom library locations.

Even though LLA is now perfectly usable and all the unit tests work just fine, there are still some things left to do:

Write the implementation-specific array sharing macros.
LLA uses a few macros to make arrays available for foreign functions, and all the wrappers are built around these. The idea is to avoid copying if the implementation supports it, and otherwise fall back to copying. I decided to focus my efforts on getting LLA to work again first, and this means that currently all operations fall back to copying, which should entail a slight performance penalty. I will of course remedy this at some point, first for SBCL and ECL and then for the other distributions, but I am waiting for the rest of LLA to stabilize before I start fiddling with low-level optimizations.
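For the curious, on SBCL the zero-copy path could look roughly like this. This is a sketch of the idea, not LLA's macro, and it assumes the data already lives in a simple specialized vector, which is what sb-sys:vector-sap requires:

```lisp
#+sbcl
(defmacro with-shared-array ((pointer vector) &body body)
  "Sketch: run BODY with POINTER pointing at VECTOR's data, without
copying.  VECTOR must be a simple specialized vector, and it stays
pinned (the GC won't move it) while foreign code uses the pointer."
  `(sb-sys:with-pinned-objects (,vector)
     (let ((,pointer (sb-sys:vector-sap ,vector)))
       ,@body)))
```

On implementations without such facilities, the same macro would expand to code that copies the vector to foreign memory and back.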
Eigenvalues/eigenvectors are not yet implemented.
LAPACK's handling of eigenvalues and especially eigenvectors is a mess, and I need to dust off the code that picks out paired eigenvectors. It should be fairly easy, but I prefer not to add functionality without thorough testing, so currently I am postponing this until someone needs it (just drop me a line).

Also, even though I wrote a lot of unit tests, it is of course possible (and probable) that LLA has bugs. Please report them on Github.

I would like to thank all users of LLA for helpful suggestions and for the gentle nudging I received during LLA's reorganization. I believe that LLA fills an important niche, and I will continue to work on it.

Thursday, July 14, 2011

Why I only buy e-books from indie publishers

Nikodemus Siivola blogged about the $2 surcharge Amazon extracts from Kindle customers outside the US. I use my Kindle every day, and I buy e-books to read on a regular basis, spending around $10-30 a month (depending on how much time I have for reading). I have owned a Kindle for almost a year now, but I didn't know about this surcharge: I guess the reason for this is that none of the money I spent on e-books went to Amazon.

There are two reasons for this: their non-competitive pricing and DRM. First, I am reluctant to pay more for an e-book than I would for a paperback, yet I frequently saw e-books on Amazon that cost more than their paperback versions. I understand that Amazon is trying to extract economic rents using its strong position in this market, but I would feel like an idiot if I participated in this scheme.

Regarding DRM: I like my Kindle very much, but I think that in the long run e-book readers will become even more of a commodity (like laptops or cell phones), and I don't wish my e-book collection to be tied to Amazon. Call me old-fashioned, but I do want to own the content I pay for, especially if I paid almost as much as I would pay for a hardcopy. So I only buy DRM-free books.

I mostly read literature on my Kindle. I buy quite a bit of sci-fi from Baen books — they also have a lot of books available for free, knowing full well that good sci-fi is addictive for some people anyway, but they don't exploit this: most of their books sell for $5-6. I haven't had a chance to check out Calibre's DRM-free book store, but it looks very promising.

I think that unless Amazon changes its pricing policy and considers offering DRM-free content, there is very little chance that I will buy anything from them. If you know some indie authors or publishers who sell reasonably priced DRM-free e-books, I would be interested in hearing about them.

Tuesday, July 12, 2011

LinkedIn: an obnoxious spam engine

I have to admit that I don't see the point of so-called social networking sites, but I recognize the possibility that some people may find them useful, and I am just not one of them. Nevertheless, I still fail to see why random people I have never met want to add me as a contact. I get invitations from these people on LinkedIn almost daily, and to make it worse, LinkedIn then sends me reminders about them. Note that I am not even registered on their damn site. I don't have a profile, and I have no intention of creating one.

Which presents another problem: I couldn't find a way of opting out from these invitations without creating a profile. Clever, isn't it? They send you spam, and then they try to sucker people into registering to stop it. I could just start marking these messages as spam, but I think that my spam filter would take a while to learn this, and in any case, I don't want to add noise to the Bayesian model and risk false positives for relevant e-mail. The cleanest solution I found is putting the whole domain on my blacklist.

I want to emphasize that this is a rare distinction: with spam filters getting so clever (thank you, Gmail!), I haven't had to manually blacklist a sender in years. Congratulations are in order: dear LinkedIn, you are a really obnoxious engine for spam! I sincerely hope that you vanish into oblivion soon.

Monday, July 11, 2011

Goodbye Ubuntu, hello Debian

My road to Ubuntu

I started using Linux around 1998. I remember that I wanted to try out several distributions before settling on the "best" one, but I had a dial-up connection (28 kbps, if I remember correctly) and downloading all of them seemed infeasible, so I mail-ordered a set of CDs that contained Red Hat, Slackware and Debian. I think I decided on Debian because I liked its clean and transparent package management system, especially APT, which seemed way ahead of other distributions at the time. Linux became my primary OS within a few months: even though it didn't have a fraction of the tools that are available today, it had everything I needed.

A lot has happened in the Linux world since then. I don't know when (if?) Linux "conquered" the desktop, but in 2003 it became mature enough that I installed Linux (Debian, of course) on my parents' desktop. Neither of my parents is very computer-savvy, but they also found Linux much nicer than Windows. I think what mattered most to them was its stability: I would update the software once in a while, test that everything was working (and fix it when it didn't), but in the meantime, the system was rock-solid. This was important as I no longer lived in the same city, so I couldn't just drop by to fix computer problems on short notice, which I had to do with Windows.

When they got another computer (I think it was around 2005), I replaced Debian with Ubuntu, figuring that it would provide a more polished user experience, and I was pleased with the result. I installed Ubuntu on all computers that family members asked me to help with, and in 2009, I installed Ubuntu on a new laptop I got. It was a bit more polished than Debian, especially when it came to setting up things like printers and network drives. I knew how to do those things via the relevant config files, but I appreciated the GUI.

Why I moved back to Debian

I think that I like Linux because it allows me to concentrate on my work: I came to accept that setting things up sometimes requires a bit of tweaking, but once I configured something, it would work exactly the way I liked it. I welcome changes if they make sense (e.g. a feature extension), but I don't see the point of gratuitous ones. In retrospect, I should have considered the moving of the window buttons a warning sign, especially when Mark Shuttleworth refused to move them back despite popular demand. The fix was easy, but the whole move was totally unnecessary and resulted in a lot of confusion: until I fixed it (I really did give it a try for a few days), my desktop just felt like a car with the gas and brake pedals interchanged.

The controversy surrounding Unity was the real eye-opener for me. I dist-upgraded Ubuntu on my parents' current computer (a perfectly good desktop for office work, around 2 years old) and was informed that the system didn't have the resources to run the damn thing. I tried it out on my laptop briefly, and realized that all that glitz and glitter is unnecessary and a waste of resources (most importantly battery power), and Ubuntu was trying to force yet another gratuitous UI change on its users. I backed up my data, wiped my HD, and within an hour, I was back to good old Debian. Which still rocks.

Why Debian is perfect for me

To answer that, here is a sample of the desktop software that I use (of course I use a lot of other programs, e.g. Git or SBCL, but there is no point in enumerating all of them):

- desktop environment: XFCE (with some Gnome applets in the panel)
- mutt, offlineimap, msmtp
- zsh, running under screen in yakuake (with the Tango color scheme — I got used to that from Gnome)
- emacs, TeX Live, AUCTeX, org-mode, …
- a password manager

This is a pretty eclectic mix with no overall GUI concept — for example, it has programs from both Gnome and KDE — but I have come to like each of these programs, and they work very well for me. Debian testing has pretty recent releases of all of them, and more importantly, the idea of Debian developers forcing any particular choice of desktop setup on me never comes up. They provide excellent software, take care of packaging new releases and fixing bugs, but I doubt that any of them would even entertain the idea of pushing some wacky desktop-du-jour concept on their users. I think they have better things to do with their time, including maintaining the best Linux distro out there, and that's the way I like it.

The icing on the cake

Compared to my previous desktop (plain vanilla Gnome on Ubuntu/Maverick), the above setup appears to be much less resource-intensive. I have plenty of RAM and CPU power, but I noticed that my laptop gives me about 20% more battery time than it used to. I cannot help but love Debian. It is good to be back!