Archive for the ‘Library 2.0’ category

Vufind 0.6 on Ubuntu 7.10

October 30, 2007

Update: The instructions on installing Vufind have been moved to the Vufind Wiki. Please check there for the most up-to-date instructions.

This is an update to my previous post on configuring Ubuntu to run Vufind…

First, upgrade your server distribution to the latest-and-greatest

sudo apt-get dist-upgrade

If you’re on Feisty (7.04), this may take a while. Next, install the Java 6 JDK and build-essential (for building Yaz).

sudo apt-get -y install sun-java6-jdk build-essential

When you’re prompted, answer the questions and let Ubuntu finish setting up Java. As a side note, the reason you want the JDK and not the JRE is that we want to run the Solr instance with the -server switch to improve performance. To do this, you need the JDK.

Next, we install Apache2 and configure the mod_rewrite extension (and reload Apache2):

sudo apt-get -y install apache2
sudo a2enmod rewrite
sudo /etc/init.d/apache2 force-reload

Now, download the Vufind tarball and extract it:

tar zxvf VuFind-0.6.1.tar.gz

Now, we need to move the Vufind files to the proper location. By default this should be /usr/local/vufind. If you choose a different location, you’ll need to set a VUFIND_HOME environment variable that points to your installation location, but I’ll get into that a bit later. You also need to change the ownership of the compile and cache folders in the web/interface folder so that Apache can write to them.

sudo mv vufind-0.6.1 /usr/local/vufind
sudo chown www-data:www-data /usr/local/vufind/web/interface/compile
sudo chown www-data:www-data /usr/local/vufind/web/interface/cache
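If you do install somewhere other than the default, the two chown targets move with it. Here is a minimal sketch of deriving them from VUFIND_HOME; the helper name vufind_writable_dirs and the /opt/vufind location are my own illustrations, not part of Vufind.

```shell
# Print the two folders Apache must be able to write to.
# $1 is the install location; it falls back to the default from this post.
vufind_writable_dirs() {
    home=${1:-/usr/local/vufind}
    printf '%s\n' "$home/web/interface/compile" "$home/web/interface/cache"
}

# Hypothetical alternate location (an assumption for illustration only).
VUFIND_HOME=/opt/vufind
export VUFIND_HOME

# List what would need chown-ing; nothing is changed here.
vufind_writable_dirs "$VUFIND_HOME"
```

Each printed path can then be handed to the same sudo chown www-data:www-data command shown above.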

Now to install MySQL:

sudo apt-get -y install mysql-server

Vufind requires PHP5, along with several dependencies:

sudo apt-get -y install php5 php5-dev php-pear php5-ldap php5-mysql php5-xsl php5-pspell aspell aspell-en

I don’t have an Oracle backend, so I haven’t tested the installation of the pdo-oci driver listed in the “official” documentation, but this page will hopefully walk you through installing the driver.

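Lastly, we need the Yaz library. With the yaz-3.0.14 tarball downloaded to /tmp, extract it, then configure, build, and install it:

cd /tmp
tar -zxvf yaz-3.0.14.tar.gz
cd yaz-3.0.14
./configure
make
sudo make install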

Ok, we’re now finished with adding the packages to get Vufind running. It’s time to run the installation script.

sudo /usr/local/vufind/install

You’ll be walked through the configuration of your Vufind instance. There’s a slight issue in the database setup script, as it assumes you haven’t set a root password (in Gutsy, you now set a password when you install MySQL). No biggie; just let the script run through the installation of the PEAR libraries and we’ll fix it with the following:

mysql -u root -p
GRANT ALL ON vufind.* TO vufind@localhost IDENTIFIED BY 'secretPassword';

Now we need to edit a few files. First, we’ll edit /usr/local/vufind/web/conf/config.ini. The big sections that need editing are Site, Amazon, and Catalog (though you probably want to take a look at LDAP too). The Amazon id is your web services access id (not your affiliate ID), and you must change the driver to the one for the ILS you’re using (e.g. Voyager, SirsiDynix, Koha, Evergreen, Aleph).

Next, the /usr/local/vufind/web/.htaccess file. You’ll need to change the RewriteBase, and you’ll most likely need to tweak the RewriteRule lines for your specific institution. The default is to use numeric call numbers, but if you’re like us, you have OCLC numbers and many others. In case you’re not a RegEx expert, these are the settings I use:

RewriteRule ^([^/]+)/([a-zA-Z]*[0-9\s]+)/(.+)$
RewriteRule ^([^/]+)/([a-zA-Z]+[0-9\s]+)$
RewriteRule ^([^/]+)/([^0-9/]+)$
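If you want to see what those three patterns accept before touching Apache, here is a rough sketch that tries them with grep -E. It only exercises the pattern half of each rule, since that is all that is shown above. Apache uses PCRE, so [[:space:]] stands in for \s here, and the request paths (Record, Holdings, Search/Home) are made-up examples, not necessarily real Vufind URLs.

```shell
# Return success if the request path ($2) matches the extended regex ($1).
matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# The three patterns from above, with \s written as [[:space:]] for grep -E.
rule1='^([^/]+)/([a-zA-Z]*[0-9[:space:]]+)/(.+)$'
rule2='^([^/]+)/([a-zA-Z]+[0-9[:space:]]+)$'
rule3='^([^/]+)/([^0-9/]+)$'

# Hypothetical paths, relative to the RewriteBase.
for path in "Record/ocm12345678/Holdings" "Record/ocm12345678" "Record/12345" "Search/Home"; do
    for name in rule1 rule2 rule3; do
        eval "pattern=\$$name"
        if matches "$pattern" "$path"; then
            echo "$path matches $name"
        fi
    done
done
```

Note that the second pattern requires at least one leading letter, so a purely numeric ID such as Record/12345 falls through all three unless a third path segment follows it.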

We’re almost there!

By default, the Ubuntu Apache2 distribution ignores .htaccess files, so we need to configure Apache to actually use the file. Edit the /etc/apache2/apache2.conf file with the following:

Alias /vufind /usr/local/vufind/web

<Directory /usr/local/vufind/web/>
AllowOverride All
Order allow,deny
Allow from all
</Directory>

And reload Apache

sudo /etc/init.d/apache2 reload

Ok, let’s check to make sure that the interface is working before we do the final installation of the Solr backend. If you point your browser to http://<your_server>/vufind, you should see the default template, with a message on the page stating “Hey! You should customize this space.” If you see an error message instead, you’ll need to do a little debugging (just read the message).

Ok, now for Solr. Vufind is packaged with Solr and Jetty. And, before we get going, we need to set an environmental variable JAVA_HOME. The way I do it is by adding the following line to /etc/profile

export JAVA_HOME

I always reboot, just to make sure that this really takes.
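If you’re not sure where the JDK landed, one common trick is to resolve the javac symlink and strip the trailing /bin/javac. This is only a sketch: the helper name jdk_home_from_javac is mine, readlink -f assumes GNU coreutils, and the path your system reports may look nothing like the one in the comment.

```shell
# Given a fully resolved path to javac, print the JDK root directory
# by removing the trailing /bin/javac.
jdk_home_from_javac() {
    javac_path=$1
    printf '%s\n' "${javac_path%/bin/javac}"
}

# On a machine with a JDK installed, resolve the real javac location.
# A result like /usr/lib/jvm/<some-jdk-directory> is typical on Ubuntu,
# but that is an assumption, not something this post confirms.
if command -v javac >/dev/null 2>&1; then
    JAVA_HOME=$(jdk_home_from_javac "$(readlink -f "$(command -v javac)")")
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
fi
```

Whatever value this prints is what would go on the right-hand side of the export line in /etc/profile.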

I forgot to change the permissions on the startup script when I sent it to Andrew, so you need to make it executable:

sudo chmod +x /usr/local/vufind/

And now to fire everything up

sudo /usr/local/vufind/ start

Now, we want to make sure that Jetty and Solr start up all the time, so we create a symbolic link into /etc/init.d to the /usr/local/vufind/ script and then run the update-rc.d script:

sudo ln -s /usr/local/vufind/ /etc/init.d/vufind
sudo update-rc.d vufind defaults

Now, if everything went well, you should be able to check out the Solr interface at http://<your_server>:8080/solr/admin.

With everything running, it’s time to create the index of MARC records.

First, export your catalog holdings in MARC format and put them in your /usr/local/vufind/import folder. The way I do this is to get the exported files, use scp to copy them to my user account on the Vufind server, and then sudo mv them into place:

[On the ILS server]

tar czvf catalog.tar.gz catalog.mrc
scp catalog.tar.gz user@your.vufind.server:~

[On your Ubuntu server]

sudo mv ~/catalog.tar.gz /usr/local/vufind/import
cd /usr/local/vufind/import
sudo tar zxvf catalog.tar.gz

Now, we need to create the MARCXML file and run the import:

sudo touch catalog.xml
sudo sh -c 'yaz-marcdump -f MARC-8 -t UTF-8 -o marcxml catalog.mrc > catalog.xml'
sudo php import-solr.php
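If your ILS hands you the export as several .mrc files rather than one, the same conversion can be repeated per file. Here is a dry-run sketch that only prints the commands it would run. It assumes the same yaz-marcdump options used above, and the helper name marcxml_cmd is made up for illustration.

```shell
# Print (rather than run) the conversion command for one MARC file,
# naming the output after the input: part1.mrc -> part1.xml.
marcxml_cmd() {
    src=$1
    out="${src%.mrc}.xml"
    printf 'yaz-marcdump -f MARC-8 -t UTF-8 -o marcxml %s > %s\n' "$src" "$out"
}

# Dry run over every .mrc file in the current directory.
for f in *.mrc; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    marcxml_cmd "$f"
done
```

Once the printed commands look right, they can be run one at a time from the import folder.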

This is a good time to take a coffee break…or a lunch break…or come back tomorrow 😉 Seriously, the import takes a while. There are some big (ok, they’re HUGE) improvements to indexing speed in the Subversion branch, but those haven’t been officially tagged yet, so just be aware that while this is slow, it’s been significantly improved for future releases.

The only thing left to do is to tune the JVM.

As always, if you have questions, leave a comment, or join the Vufind lists.


Library Find

August 29, 2007

Stumbled across LibraryFind the other day and have been playing around trying to get it installed. I’ve not had many good experiences with Ruby-based apps, but this looked really promising so I took the plunge. Unfortunately the searching doesn’t work; it just states that there was an error. Looking in the log files, it states that it’s “missing default helper dispatch_helper” and the record_set_helper. I also ran into a problem in the admin module when I attempted to add a target…just got a recordschema error. I ended up just writing a script to install a couple of EBSCO targets we had, but hopefully once I figure out what’s going on with the helpers, that problem will be resolved too.

Java Tuning for VuFind

August 1, 2007

Had a few more notes on running VuFind.

Java Tuning

Something that is generally overlooked when setting up a Java application is tuning Java. This can be a very daunting endeavor, as you generally see tutorials that reference things like interpreting p-values and power analysis. However, if you’re just wanting to set an application up, this is a much larger investment of time and effort than is really needed. So, here are some things you probably want to do.

To set the Java ergonomics for server applications, you simply set a new environment variable. For Tomcat, this is CATALINA_OPTS. For development boxes, I tend to make these global variables, but as long as the user account that’s running VuFind’s Tomcat instance has CATALINA_OPTS defined, you’ll see the performance boost.

For those who can’t wait, this is what I set for my instance in a virtualized instance of Ubuntu server (Feisty) that runs with 2 GB RAM and a dedicated dual-core x86_64 processor.

CATALINA_OPTS="-server -Xmx1024m -Xms1024m -XX:+UseParallelGC -XX:+AggressiveOpts"

I don’t have any hard numbers on the improvement, but there is a noticeable difference in both speed and processor utilization.

Without attempting to rehash the nitty-gritty of the ergonomics of the JVM, you’re basically telling Java to act as a server, use a statically sized heap (the memory allocated for object storage), use the parallel young-generation garbage collector (it divides garbage collection across processors), and turn on point-release performance optimizations.
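To make the heap size easy to change, the same options can be assembled from a single number. This is a minimal sketch; the helper name catalina_opts is mine. Note the m suffix: the JVM reads a bare -Xmx1024 as 1024 bytes, so the unit has to be there. Setting -Xmx and -Xms to the same value is what makes the heap statically sized.

```shell
# Build a CATALINA_OPTS string with a fixed-size heap.
# $1 is the heap size in megabytes, used for both -Xmx and -Xms.
catalina_opts() {
    heap_mb=$1
    printf '%s\n' "-server -Xmx${heap_mb}m -Xms${heap_mb}m -XX:+UseParallelGC -XX:+AggressiveOpts"
}

# 1024 MB matches the 2 GB virtual machine described above.
CATALINA_OPTS=$(catalina_opts 1024)
export CATALINA_OPTS
echo "$CATALINA_OPTS"
```

On a box with more or less RAM, only the number passed to catalina_opts would need to change.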

For more info on setting up the JVM to be “server-class”, check out the Java Tuning White Paper. While this paper specifically refers to the Java 5 platform, these same options will work if you’ve deployed under Java 6.

The Library As Text Part III: Or The Finest Possible Communication Apparatus in Public Life

May 31, 2007

Part 1/2

“But quite apart from the dubiousness of its functions, radio is one-sided when it should be two. It is purely an apparatus for distribution, for mere sharing out. So here is a positive suggestion: change this apparatus over from distribution to communication. The radio would be the finest possible communication apparatus in public life, a vast network of pipes. That is to say, it would be if it knew how to receive as well as to transmit, how to let the listener speak as well as hear, how to bring him into a relationship instead of isolating him. On this principle the radio should step out of the supply business and organize its listeners as suppliers.” (Brecht, p. 616, “The Radio as an Apparatus of Communication,” in The Weimar Republic Sourcebook, first published in 1932).

The heart wants what the heart wants. Woody Allen

Back to the title of this series: The Library As Text. This is not a completely original characterization of the library; in fact, it was suggested before in an interesting article by John Budd (“An Epistemological Foundation for Library and Information Science,” Library Quarterly, 65:3, 295-318). The article jibes quite well with the “wrought manifesto” vibe I’m going for here, in that it calls for the Library and Information Science (LIS) community to consider engaging in a more intellectually textured way of looking at what we do, moving away from our positivistic roots and adopting a more playful, perhaps meaningful, approach in the direction of hermeneutics and phenomenology (pick up a reader on Heidegger, Gadamer or Ricoeur and you’ll catch his drift).


February 14, 2007

As I’ve delved deeper into interface design, one of the big things I’ve come up against is organizing a lot of data into something meaningful. I’ve done some experimenting with different visualization algorithms and implementations (check out my “real” blog on RSS Information Visualization). A few days ago, I ran across Many Eyes on IBM’s alphaWorks.

Seeing the treemap visualization (and for the really geeky folks, their treemap algorithm is based on the paper Squarified Treemaps) made me think it would be really cool to display a library catalog this way. Without doing the work to create a real treemap, I suspect it would look something along the lines of this:

[Image: Treemap Visualization]

Imagine each subject heading to be a big box, with sub-categories being smaller sub-sections, and books being the smallest boxes of all. Done correctly, this could be a really cool way to browse and discover items in the catalog.

Social Software

June 22, 2006

Social Software: A Survey of Web 2.0, Michael Stephens' third session in the Library 2.0 Extravaganza, is available at the OPAL Library and Information Science Archives. Make sure you use Internet Explorer. It is 59 minutes of viewing/listening pleasure.

I won't try to summarize Michael's presentation except to say that it further reinforces the community building and productivity enhancing aspects of the Web 2.0 applications that are springing up with increasing frequency.

Let's look at a few examples:

  • Bloglines — I use Bloglines to manage the RSS blog feeds that I read. The advantage of Bloglines is that it is a web application so I don't have to try to keep my aggregator on my home computer synchronized with my office computer. Productivity. I can also share my RSS feeds with others. If someone asks, "Mack, what blogs do you find helpful to read?" I can point them to my Bloglines account. I could also export my Blogline entries as an OPML file which another user could import into their Bloglines account. Community and sharing.
  • — is a web service that allows you to bookmark a web site. Have you ever arrived at a website and thought "I need to remember this"? On your office computer you can bookmark it, in which case you have to remember where you put it and you're out of luck if you are at a different computer when you want to revisit the site. If you are away from your computer you scramble for a piece of paper, write the URL, then leave it in your pocket when you wash your clothing. The service gives you a central place, available from any Internet accessible computer, to store bookmarks. Plus, you can tag the links to provide organization and make them findable later, and add notes for additional information about the site and why you bookmarked it. Productivity. It also introduces the possibility of serendipity. I can see how many other people used the same tag to describe the website AND I can also see what they bookmarked in their accounts, thus opening the possibility of locating related sites. If you want to see the sites that I have tagged web2.0, I can point you to my account filtered by the tag. You will see the links to which I have given this tag as well as other tags that I related to web2.0. I see considerable professional and educational possibilities. Community and sharing.
  • LibraryThing — is a web service that, at its most basic, lets you catalog your personal library. You categorize your books with tags. It is a nifty tool for keeping track of your books, when you got them, when you read them, what you thought about them. You could use it as an on-line journal of your reading habits. LibraryThing as a company is very interested in finding ways to hook their service to the library catalog. Productivity. As with the bookmarking service above, you can see other books that were given the same tag. You can see the libraries of other LibraryThing users. You could see if anyone has written a review of a book. If you want to see an example, take a look at my LibraryThing account. It isn't up-to-date and I haven't been good with reviews but I think you'll see the possibilities. Community and sharing.

There are many other examples of social software: Flickr (for pictures); MySpace and FaceBook (lots of press about MySpace lately); calendars; personal organizers. All of these tools can be used by you as an individual as well as in your professional life. Our students are using these tools and we should explore ways that they might help us connect with them.

Someone commented on an interesting side-effect of the widespread adoption of social software: it might be lowering the expectation of privacy among its users. People, particularly millennials, are putting their lives on the web for all to see. There can be consequences. That picture of you doing a keg stand might not go over that well if a potential employer finds it while searching applicants in Facebook.

At this point, you might want to take a moment and think about account names. I subscribed to the three services above at different times. I have three different account names: malundy, Mack42, and Mack46. In retrospect I wish I had standardized on one account name, with variations used only if the standard name was already taken by someone else. You can decide if you want to use a variation of your own name or come up with an alias that describes you, acelibrarian for example.

The phenomenon of social software is one I find very interesting and I will return to it in later posts.

Library 2.0 Innovation Extravaganza

June 15, 2006

Today I participated in the Alliance Online Innovation Institute Library 2.0 Extravaganza. The speaker was Michael Stephens, well-known blogger, writer, trainer, and soon-to-be professor. His blog, Tame the Web, has been one of my must-reads since I discovered blogs. Michael presented four one-hour sessions on these topics:

  • Weblogs & Libraries
  • Instant Messaging: Do You IM?
  • Social software: A Survey of Web 2.0
  • Creating Staff Buy In for New Technologies

I will write a blog on each topic as soon as I review and assimilate what I learned.

I haven't participated in an online presentation before (I don't count vendor webinars) and was pleased with how well it worked. In addition to slides and two-way audio we used chat for comments and questions. This way audio flow was not interrupted.

The presentation was sponsored by the Alliance Library System. Alliance is also the sponsor of Second Life Library 2.0, which is located in the 3-D virtual reality world of Second Life. I plan to blog on it later, but if you are curious and can't wait, here is the Second Life Library 2.0 blog. The Extravaganza was held in the OPAL (Online Programming for All Libraries) online auditorium. Take a look at the OPAL archives. They do interesting stuff.