Archive for September, 2006

Zotero’s coming

I’m one of the many beta testers for Zotero and have been working with one version or another of this software for several months. I see on Dan Cohen’s blog that the public beta release is just a week or so away. Get ready.

Beyond developing what I think is a really great piece of software, the Zotero group is also interested in building a community of users and developers around the product. Like Firefox 2 (which uses SQLite to store bookmarks and browser history), Zotero is built around an SQLite database that lives on your local computer. A smart decision for Zotero’s purposes, but it also opens a low-barrier door to those interested in playing around with creation of add-on services for the product.

For example, I’m not all that interested in leveraging the “social” aspects of Zotero (tagging and swapping entries and the like) but I do have an interest in syncing my Zotero database across multiple computers—like I now do with my various Yojimbo databases. My initial work on extending Zotero will focus on developing a similar capability (of course, I’ll have to learn how to code this sort of thing first so I’m sure progress will be slow).

sqlite_zotero.jpg

A number of open-source and commercial products support SQLite databases and can help you get a handle on the Zotero database structure. I’d recommend starting with SQLite Browser on SourceForge. The latest version works well with the Zotero database and it’s available for Windows, Linux and OS X.

I tried the sqlite3 command-line utility that’s installed in /usr/bin on Mac OS X (version 3.1.3 on 10.4) but it didn’t work; it complained about an invalid file type. I tried Fink, but there’s no binary distribution of sqlite3 available. I then went to DarwinPorts and, after installing the latest version and issuing the “port selfupdate” command, grabbed the latest sqlite3 with a simple “port install sqlite3”. This version (3.3.7) opened the Zotero file with no problems (as the little graphic illustrates).
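If you would rather script the exploration than use a GUI or the sqlite3 shell, here is a minimal sketch using Python’s standard-library sqlite3 module. The helper names (list_tables, table_schema) and the file name in the usage note are assumptions for illustration, not anything defined by Zotero. The sketch only reads the sqlite_master catalog that every SQLite database carries, and it is best pointed at a copy of your database.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def table_schema(db_path, table):
    """Return the CREATE TABLE statement stored for one table, or None."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT sql FROM sqlite_master "
            "WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
    finally:
        conn.close()
    return row[0] if row else None
```

For example, list_tables("zotero-copy.sqlite") would return the table names in a copy saved under that (hypothetical) name. Note that sqlite3.connect creates an empty file if the path does not exist, so check the path first.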

Let me also suggest that you not have Firefox running when you poke around your Zotero datastore—SQLite does perform an exclusive lock on the whole file for any writes but it’s better to be safe than sorry.

Looking forward to what I hope will be the public beta release next week…

http://zotero.org


eAccelerator update

A very quick note on the sort of improvement we’re seeing with eAccelerator in place on our CWIS application. I ran these tests with ab (ApacheBench, version 1.3d) which may or may not be the right tool for this task.

Without eAccelerator caching, 30 requests: 0.61 requests per second
With eAccelerator caching, 30 requests: 0.94 requests per second

So, with eAccelerator we can boost throughput on this complex application by about 54% (from 0.61 to 0.94 requests per second), which works out to roughly 35% less time per request. On the face of it, that doesn’t seem like a tremendous increase, but then this application does a lot of MySQL work behind the scenes so opcode caching can only take us so far.

On a relatively simple page (a single, lightweight MySQL query), we see much more dramatic improvement:

Without eAccelerator caching, 200 requests: 47 requests per second
With eAccelerator caching, 200 requests: 384 requests per second

Happily, getting this sort of boost will free up server cycles to help process our more weighty applications. :)
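
For anyone who wants to check the arithmetic on the requests-per-second figures above, here is a small sketch. The function names are made up for illustration. It reports both the gain in throughput and the equivalent reduction in time per request, since the two are easy to confuse.

```python
def percent_increase(before, after):
    """Percentage gain in throughput going from before to after."""
    return (after - before) / before * 100.0


def percent_time_saved(before_rps, after_rps):
    """Percentage reduction in time per request, given requests per second."""
    return (1.0 - before_rps / after_rps) * 100.0


# Figures taken from the ab runs reported in this post.
for label, before, after in [
    ("complex CWIS page", 0.61, 0.94),
    ("simple page", 47.0, 384.0),
]:
    print("%s: +%.0f%% throughput, %.0f%% less time per request"
          % (label, percent_increase(before, after),
             percent_time_saved(before, after)))
```

On these numbers the complex page gains about 54% in throughput (about 35% less time per request), while the simple page gains roughly 717% (about 88% less time per request).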


eAccelerator and Tiger Server

eaccelerator.jpg

Spent an hour or so today working on a PHP accelerator for a modified Scout Portal Toolkit application we run on one of the library’s Mac OS X servers. The CWIS version of the SPT software is a really useful and well-designed application but it’s not clear that much time has been spent on optimizing it for speed. Or perhaps it’s been carefully hand-tweaked but still lumbers along. It’s not painfully slow, but it does seem to take a couple of seconds to load a page and I decided that was at least a second too long (can it ever be too fast?).

I spent a few minutes looking over the source but quickly noticed that the initial “require” statements brought in many hundreds of lines of code and optimizing all that would take much more time and effort than I felt was justified (and besides, the application’s running and who knows what trouble I’d introduce).

Time for a different tack.

I recently spent some time with the MAMP package (Macintosh, Apache, MySQL, PHP) and noticed that it includes eAccelerator for PHP caching. If you’re not familiar with MAMP, you might want to take a look: it is an all-in-one package for quick and easy installation of all the software needed to do Apache/PHP/MySQL development on an Apple desktop or notebook (and yes, it’s a universal compilation). You can’t use MAMP on OS X Server, so I went off in search of the eAccelerator software to add to my existing server’s Apache/PHP/MySQL setup. I wasn’t surprised to find that Mac OS X isn’t much discussed in this context (is it ever?) but thanks to MAMP, I knew it should work.

The software (source only) is at http://eaccelerator.net/

An hour later, I can report that eAccelerator works with the default Apache/PHP/MySQL setup on Tiger server but it does require a bit of tweaking to get running. Here are the highlights:

1. If it’s not there already, install Xcode (Developer Tools) on your server. The latest version (2.3) is available for download but you’ll have to join the Apple Developer Connection (it’s free). Another option is the 10.4 install disks, but that copy of Xcode is probably not the latest release.

2. Correct the information in this file if (like my 10.4.7 installation) it is incorrect:

/usr/include/php/main/php_version.h

The php_version.h on my server was misreporting the actual version of PHP installed. It should have looked like this:

#define PHP_MAJOR_VERSION 4
#define PHP_MINOR_VERSION 3
#define PHP_RELEASE_VERSION 3
#define PHP_EXTRA_VERSION ""
#define PHP_VERSION "4.3.3"

3. Download the eAccelerator source, unzip/untar it and take a look at the README. It will suggest this sequence:

/usr/bin/phpize
./configure \
--enable-eaccelerator=shared \
--with-php-config=/usr/bin/php-config
make
make install

With OS X Server, I found it would not compile correctly until I also added one more argument to the configure command:

--with-eaccelerator-userid=70

Use whatever user ID eAccelerator runs as; in most cases this should be the same as the “apache” user (e.g., www or 70).

Once “make install” finishes, you have to do a bit of work to enable PHP to find and load the module.

4. Modify /etc/php.ini, adding this line (assuming you have no other extensions already running):

extension_dir="/usr/lib/php/extensions/no-debug-non-zts-20020429/"

The “make install” process will tell you where it put eaccelerator.so, and that is the path to use in the “extension_dir” statement above.

You will also want to add these lines to your php.ini file:

extension="eaccelerator.so"
eaccelerator.shm_size="16"
eaccelerator.cache_dir="/tmp/eaccelerator"
eaccelerator.enable="1"
eaccelerator.optimizer="1"
eaccelerator.check_mtime="1"
eaccelerator.debug="0"
eaccelerator.filter=""
eaccelerator.shm_max="0"
eaccelerator.shm_ttl="0"
eaccelerator.shm_prune_period="0"
eaccelerator.shm_only="0"
eaccelerator.compress="1"
eaccelerator.compress_level="9"

In my case, I turned compression off (eaccelerator.compress="0"). When on, it compresses the data before putting it in the cache, and I was looking to save time, not memory or disk space. Of course, I could be wrong about this. Perhaps compressing takes little CPU power and the smaller cache blocks can be read and decompressed in less time than it takes to read the larger, uncompressed versions. I’ll do a bit more research on this setting. You can get some documentation for these settings on the eAccelerator project site.

You have to create the directory for the cache (e.g., /tmp/eaccelerator) and make it world-writable:

mkdir /tmp/eaccelerator
chmod 0777 /tmp/eaccelerator

Then restart Apache. You should begin to see data written to your /tmp/eaccelerator directory.

What exactly does this eAccelerator module do? Without a cache in place, every time a PHP script runs the source code must be scanned, compiled and then executed. Once installed, eAccelerator grabs a copy of the compiled code the first time a script is scanned and compiled, then stores it in the cache for reuse. The next time the same script is called, PHP executes the cached, compiled copy instead, skipping the scan/compile step. Eliminating that redundant overhead speeds everything up.

In the coming days, now that I have eAccelerator working, I’m going to find a moment to do some benchmarking on performance with and without the cache. Not so much for this particular application but I have a few future projects for the CWIS platform and need to figure out what sort of performance I can squeeze from it. I’ll report those results here but my empirical observations this afternoon suggest that load times for pages from this application have been cut in half.


No Quotes, No Service

It really must be Voyager month…I’ve spent more time on this system in the past 30 days than in the preceding year. But that’s OK, much of the time’s been spent removing a batch of little annoyances that have crept in across various releases and hung around seemingly forever.

opacscreen.jpg

For example, since the Voyager epoch (for us, around 1996) it’s been the case that our web OPAC defaults to a keyword search, and of course most users approach the search box as though they’re at Google. Like Google, our system accepts most anything you might care to enter, but more often than not your first attempt yields a “Your search resulted in no hits!” message.

As an aside, I’ve always wondered about that little exclamation point. Is the system happy about this or just trying to spin an unpleasant message?

So you either pose a different query or, knowing that the library must have the title you’re trying to locate, you read the help screen and see that you have to enclose a multi-word search in quotes or connect the terms with upper-case Boolean connectors.

For years, that procrustean behavior was just chalked up as a training opportunity.

Today I got an email from a colleague at our Law School’s library (which shares our OPAC), mentioning first that he really appreciated a few of my recent “fixes” and asking if I’d take a look at improving the behavior of our keyword search box. Helpfully, he included a couple of links to Voyager sites where it worked better.

Now that’s how you send a complaint to your Systems Office.

Of course, I immediately dropped everything and got to work on the problem. How dare some other site not only fix the problem but then put it out there where everyone can see it!

Following his links, I poked around Yale’s OPAC, then UCLA’s. Sure enough, both had figured out a way to inject a bit of JavaScript into the process, compensating (where needed) for the absence of either quotes or connectors.
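
To make the idea concrete, here is a minimal sketch of that kind of query clean-up. It is written in Python purely to illustrate the logic; the real fix is JavaScript running in the browser, and this is not a transcription of opac.js. The function name and the choice to join bare terms with AND (rather than wrapping them in quotes) are assumptions.

```python
BOOLEAN_CONNECTORS = {"AND", "OR", "NOT"}


def normalize_keyword_query(query):
    """Join bare multi-word queries with AND; leave explicit ones alone."""
    terms = query.split()
    if len(terms) < 2:
        return query.strip()          # single word: nothing to fix
    if '"' in query:
        return query.strip()          # user already quoted a phrase
    if any(term in BOOLEAN_CONNECTORS for term in terms):
        return query.strip()          # user already supplied connectors
    return " AND ".join(terms)        # Google-style input: connect the terms
```

With this sketch, a bare query such as civil war history becomes civil AND war AND history, while a quoted phrase or a query that already contains an upper-case connector passes through unchanged.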

Peeling back the URLs I soon had the JavaScript (opac.js) displayed and could see that UCLA started the process and Yale built upon their work. I next visited Kansas State University and found a later evolution of the same codebase. The KSU script mentioned Harish Maringanti, a member of their digital initiatives team, and credited him with optimizations. I’d say so: Harish’s work reduced the original code from 32K to just over 5K. I later got it down to a quick-loading 2K by removing comments and spaces, joining lines and generally rendering the code indecipherable to all but a browser.

From the KSU library staff directory I found Harish’s email address and sent a message. Within a minute I had a reply and his blessing on my reuse of his code. Total turnaround time from “suggestion email” to fixed OPAC was probably something under five minutes.

Here’s a simple recipe for any other Voyager admin who’s interested in making the fix:

1. Log onto your WebVoyage server and create a “js” directory under your /m1/voyager/xxxdb/webvoyage/html directory. You’ll end up with /m1/voyager/xxxdb/webvoyage/html/js

2. Download this file and save it as “opac_stuff.js” or “opac.js” or “whatever.js” in the new directory you created in step 1.

http://magik.gmu.edu/js/opac_stuff.js

3. Add this line to your header.htm file:

<script language="javascript" src="/js/opac_stuff.js"></script>

4. That’s it. Now send an email to Harish Maringanti (harish [at] ksu.edu) and tell him how much you appreciate the work he did on this. And remember to thank anyone you meet from Yale or UCLA for getting the ball rolling.

You can try this out here:

http://magik.gmu.edu/cgi-bin/Pwebrecon.cgi?DB=local&PAGE=First


Nisus Writer Express

Voyager persistent links

Before getting to the point of this post, I thought I’d mention the persistent link we’ve added to each item in our OPAC (useful when someone wants to directly reference an individual item in the catalog).

Over the weekend I received an email from an iNode reader in New Zealand asking how that was done. To prove there’s very little wizardry involved, I offer this link for any interested Voyager admin (and putting it here will help me remember how I did it the next time a Voyager upgrade trashes local modifications):

http://deskbox.gmu.edu/voyagerpersistentlink.html

Now, on with the post…

Finding a new word processor

intel.jpg

I’ve been spending a lot more time with Intel Macs lately (specifically the Mac Pro). These machines are so fast I’m really beginning to resent the sluggish performance of applications that require Rosetta (the PowerPC emulation technology).

I can’t do much about Dreamweaver (use it maybe once a week) or Photoshop (a bit more often) but I think I can do something about the third head of this bloatware hydra: Word. All I really want is a word processor. I rarely do labels, mail merge, change tracking or 90% of the other “features” a program like Word puts at my fingertips (assuming my fingertips aren’t already occupied thumbing through the manual or pounding keys in the help window). On some level it seems a real waste of electrons to launch a huge emulator-hobbled application just to type up a memo…

I gave some thought to NeoOffice (which offers both PPC and Intel versions) but it’s still a beta product (Aqua Beta 3 as I write this) and has performance issues as well. Besides, I’m not looking for a full Office replacement, just an Intel-native OS X word processor that reads and writes Word files, understands RTF, uses few resources when idle, seems snappy when working, and doesn’t complicate things by offering too much more than the features I’ll actually use.

I looked around and found several candidates.

Mellel is an interesting and well-regarded program but its real power seems to be directed at a problem I don’t often have: working with footnotes, endnotes and multiple languages. I liked a number of Mellel’s features and guess I could grow to tolerate the default brushed metal interface, but for now I’ll pass. Continuing the quest, I also took quick looks at AbiWord and Mariner Write but, for different reasons, neither made a very good first impression.

nisus.jpg

Finally, I hit on Nisus Writer Express (2.7).

As one who switched to a Mac a little over two years ago, I guess I can be forgiven for not knowing that during the Classic era, Nisus was one of the major players in the Mac word processor market. Their “Nisus Writer” never made the jump to OS X and market share/interest inevitably dwindled. Rather than port Nisus Writer to OS X, Nisus purchased Okito Composer and began development of Nisus Writer Express (NWE) based on that Cocoa foundation (which explains the speed with which a Universal binary version appeared when needed).

There’s a lot to like in NWE.

It is fast and a universal application, but it also plays well with the other 95% of the world that’s using Microsoft Word. NWE reads and writes most Word files without much difficulty, which isn’t to say there aren’t a few problem areas. I had uneven results with some complex Word documents (but not every one I tried) and lost images in any document where text flowed around embedded graphics.

Beyond those problems (neither of which affects a significant percentage of the documents I encounter each day), compatibility was very good.

Pros: a real-time thesaurus that’s unobtrusive but helpful (it displays word alternatives in a pane while you’re typing, meaning you can ignore it until you reach a spot in the document where your vocabulary begins to fail you); the ability to use regular expressions in the find/replace function; support for Perl macros; support for what I’d call OLE (if it weren’t already called LinkBack) with applications like OmniGraffle or OmniOutliner; a very simple-to-use implementation of tables; and more.

So (for now at least) Nisus Writer Express is my word processor of choice. If I receive a document that opens with problems, I’ll revert to Word…or perhaps Pages (in my tests I found that Pages had no trouble opening and displaying documents when text flowed around embedded images).

In any case, the price is right ($39 for an academic license) and development seems active (several new releases in the past year). Both Mellel ($35 for an academic license) and NWE offer 30-day demos.


Voyager CSS and stuff

Petty Larceny?

Did a bit of a CSS makeover on our OPAC this week. I doubt she realizes I “borrowed” her work, but thanks to Laura Guy at the Colorado School of Mines for the CSS code. Her post on a Voyager listserv prompted my interest in fixing our pretty-ugly standard fonts and, poking around their catalog, I found the CSS I needed to fix things.

That “view source” button is a really wonderful invention, isn’t it? Ever speculate on how much slower development/evolution of web design would have been if web authors had always been able to hide their markup? Perhaps that explains why technologies like Flash haven’t been more widely adopted and stretched: you can’t learn very much just looking at someone’s SWF file.

Intranet moving…

Moved our intranet from an old E3000 (Solaris) machine over to OS X Server today. I’m trying to free up some of our older servers so we can surplus them before moving in a week or so. Had quite a time getting .htaccess working properly with the Apple implementation of Apache. Seems Apple’s version of mod_auth is just a bit different and the “generic” mod_auth is commented out in the httpd.conf file that ships with Tiger Server 10.4.x.

Amazon -> OPAC linking

I serve on the Board of Trustees for the Loudoun County Public Library, working through my second and final four-year term. I do it both as a bit of public service to my community and as a way to help ensure that the system protects basic rights like intellectual freedom and the freedom to read. I’m also interested in the technologies that public libraries are using, and this new development at the Loudoun County Public Library system is really nice: a Greasemonkey script that enables links between Amazon and items in the library’s OPAC. I worked up a similar system for Mason’s catalog a year or so ago, but LCPL has really gone far beyond what that extension attempted. Not only do they show whether the library owns the book, they also give information on its circulation status…even a link to place a hold request for the item.

I think I’m most impressed by the tech staff’s assumption that they can put something as geeky as a Greasemonkey extension to Firefox “above the fold” on their home page.


Encrypted Sparse Images

files1.jpg

This tip is just for Mac OS X users. I had to figure this out the other day when I wanted to put a copy of my documents directory up on a network server (as a backup) but wanted to encrypt it. If you’re a Windows or Linux user, go read about TrueCrypt.

You likely have a directory somewhere on your Mac that contains files you’d prefer to keep secret, or at least files that you’d rather not be responsible for leaking if they fell into the wrong hands (e.g., the VA data scandal?). And it’s not limited to data on laptops. That really convenient USB thumb drive you use to carry important files is also really easy to lose.

Yes, you could put all your files in a password-protected zip, StuffIt or RAR archive and carry that around, but it’s hardly convenient: you must extract the file(s) from the archive before using them.

Here’s a better solution: create an encrypted sparse image file.

Huh?

A sparse image file is really nothing more than a container file that can hold within it many other files (in that way it’s like other archiving solutions). What makes this different is that this image file can be mounted and used like just another disk drive on your system. The reason it’s called a “sparse” image (as opposed to a “full image”) is that the actual “bytes taken up on the hard disk” size of the image file grows only as you add files to it, up to the maximum capacity you declared when the image file was created.

We get our security payoff from the fact that you can assign a password to this image and then all data going in is encrypted—and only by knowing the password do you get data out. Stumble upon one of these files without knowing the password and all you or your byte-level editor sees is gibberish.

A few points worth mentioning:

If you plan to put this image on a USB drive, make sure you format the drive as HFS+ (not MS-DOS). This is a requirement, but it also provides another layer of security if your lost drive turns up in the hands of a Windows user: when it is inserted, Windows will automatically suggest formatting the drive (helpfully obliterating what would have been your “at risk” data).

Don’t lose/forget your encryption password. There’s no recovering data without the password.

The maximum size of the image file can be no larger than the amount of free space available on the drive you use for your creation location (e.g., where you tell Disk Utility to save the new image file you’re creating).

If you use this on a USB drive, be sure to eject the image file before pulling out the thumb drive. If improperly unmounted, you’ll likely find your image file is corrupted and completely unusable.

GUI method: Launch Disk Utility (in /Applications/Utilities)

1. File -> New -> Blank Disk Image
2. Select a maximum size for the image file (large enough to hold all the files you’ll want to include)
3. Select Encryption -> AES-128
4. Select Format -> Sparse Disk Image
5. Once you press Create, you’ll be prompted for a password.

You can then copy the secure-disk-name.sparseimage file to a network volume or USB key. When you click on the image you’ll be prompted for a password. Get that right and you’ll see what just looks like a new disk drive on your desktop. When you’re done, just eject the image drive.

Command line method: (via Terminal)

hdiutil create -size SizeYouWant -encryption -type SPARSE -fs HFS+ ImageName

SizeYouWant is the maximum size for the image (e.g., 500m would build an image capable of holding 500 megabytes, 2g would build one for 2 gigabytes). There’s about a 10% overhead in image file size even when empty.

You can include this command-line version in scripts. For more information, type “man hdiutil” in a Terminal window. Hint: you might find “hdiutil create -help” a bit more useful, if a good bit less verbose, than the man page.
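
As one way of scripting it, here is a minimal Python sketch that assembles the same hdiutil command as an argument list. The function name, the optional volume_name parameter and the size check are assumptions added for illustration. The check is a simplification that only accepts sizes written as a number plus a one-letter suffix, like the 500m and 2g examples above. The sketch builds the command without running it, so it can be inspected on any machine.

```python
import re


def hdiutil_create_args(image_name, size, volume_name=None):
    """Build the argument list for creating an encrypted sparse image."""
    # Accept sizes such as 500m or 2g (number plus one-letter suffix).
    if not re.match(r"^\d+[bkmgtpe]$", size):
        raise ValueError("size should look like 500m or 2g, got %r" % size)
    args = ["hdiutil", "create",
            "-size", size,
            "-encryption",
            "-type", "SPARSE",
            "-fs", "HFS+"]
    if volume_name:
        # Optional: name shown for the mounted volume.
        args += ["-volname", volume_name]
    args.append(image_name)
    return args


print(" ".join(hdiutil_create_args("SecureDocs", "500m")))
```

On a Mac, the resulting list could be handed to subprocess.call, in which case hdiutil should prompt for the password in the terminal just as it does when typed by hand.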

web-based man page for hdiutil

Here’s an Apple document on creating these images using Disk Utility:

http://docs.info.apple.com/article.html?artnum=107333

