All posts by Martin MC Brown


Peter Wainwright, Pro Apache

Apache has been a stalwart of the Internet for some time. Not only is it well known as a web serving platform, but it also forms a key part of the LAMP stack (Linux-Apache-MySQL-Perl/Python/PHP) and is one of the best known open source projects. Getting an Apache installation right, though, can be tricky. In Pro Apache, Peter Wainwright hopes to help readers by taking a task-based, rather than feature-based, approach. I spoke to Peter about Apache, its supported platforms, the competition from IIS and his approach to writing such a mammoth tome.

Inflammatory questions first – Unix or Windows for Apache?

Unix. To be more precise, BSD, then Linux, then almost anything else (e.g., commercial Unixes), then Windows — if you must.

The usual technical arguments and security statistics against using Windows are readily available from a number of sources, so let me give a rather different perspective: it seems Microsoft was in discussion to buy Claria, creators of Gator (one of the more annoying strains of adware that infest Windows desktops). Coincidentally, Microsoft’s beta ‘AntiSpyware’ tool recently downgraded Claria’s products from quarantine to ignore. It seems that the deal fell through, but for reasons of bad PR rather than any concern for the customer. Call me cynical if you like, but I see little reason to place my faith in a closed-source operating system when the vendor is apparently willing to compromise the security of its customers for its own business purposes. Yes, plenty of us already knew that, but this is an example even non-technical business managers can grasp.

Having said that, yes, there are reasons why you might be required, or find it otherwise preferable, to run Apache on a Windows server. For example, you might need to make use of a Windows-specific module or extension. Apache on Windows is perfectly viable – but given a free choice, go the open source route.

Do you prefer the text-based configuration, or the GUI based configuration tools?

Text-based every time. I don’t object to the use of a GUI outright, but if I can’t easily understand the generated configuration files by direct inspection afterwards, or can’t modify the configuration without upsetting the tool, I’ve just built a needless dependency on a tool when I would have been better off maintaining the text-based configuration directly. Using a tool is not a substitute for understanding the underlying configuration.

Too many administrators, I think, use the default configuration file without considering whether it might be better to create a much simpler and more maintainable configuration from scratch. I find an effective strategy for maintaining an Apache configuration is to divide it into several simple configuration files according to function – virtual hosting, access control, SSL, proxies, and so on – and then include them into one master configuration file. If you know what your website (or websites) will be doing, you can configure only those features. A simpler configuration, in turn, generally means fewer security issues to deal with.

The default configuration file, if I make use of it at all, becomes just one of the files included into the master configuration file that takes its place. Customisations go into their own files and override the defaults as necessary. This makes it very easy to see what configuration came pre-supplied and what was applied locally. It also makes it easy to update the default configuration as new releases of Apache come out, because there are no modifications in the file to carry across.
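As a rough sketch of the layout described above (the file names are purely illustrative, not anything Apache supplies), the master file can be little more than a series of Include directives:

    # httpd.conf – master configuration file
    ServerRoot "/usr/local/apache2"

    # The pre-supplied defaults, kept unmodified so upgrades are painless
    Include conf/default.conf

    # Local customisations, split by function, overriding the defaults
    Include conf/vhosts.conf     # virtual hosting
    Include conf/access.conf     # access control
    Include conf/ssl.conf        # SSL settings
    Include conf/proxy.conf      # proxy configuration

Because each file covers a single concern, it is obvious at a glance which part of the configuration to edit for a given change.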

Can you suggest any quick ways to improve performance for a static site?

There are two main strategies for performance-tuning a server for the delivery of static content: finding ways to deliver the content as efficiently as possible, and not delivering the content at all, where possible. But before embarking on a long session of tweaking, first determine whether the load on the server or the available bandwidth is the bottleneck. There’s no point tuning the server if it’s the volume of data traffic that’s limiting performance.

Simple static content performance can be improved in Apache by employing tricks like memory-mapping static files or by caching file handles and employing the operating system’s sendfile mechanism (the same trick employed by kernel HTTP servers) to efficiently transfer static data to the client. Modules like Apache 1.3’s mod_mmap_static and Apache 2’s mod_file_cache make this easy to configure.
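For the Apache 2 case, a minimal mod_file_cache sketch looks like this (the paths are illustrative, and the listed files must exist when the server starts):

    # Load mod_file_cache (mod_mmap_static provides MMapFile on Apache 1.3)
    LoadModule file_cache_module modules/mod_file_cache.so

    # Memory-map frequently requested files at start-up
    MMapFile /var/www/html/index.html /var/www/html/logo.png

    # Or cache open file handles so sendfile can be used on each request
    CacheFile /var/www/html/downloads/archive.tar.gz

Note that the cached files are only re-read when the server is restarted, so this suits genuinely static content.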

At the platform level, many operating systems provide features and defaults out of the box that are not useful for a dedicated webserver. Removing these can benefit performance at no cost and often improve security at the same time. For instance, always shut down the mail service if the server handles no mail. Other server performance improvements can be gained by reducing the amount of information written to log files, or disabling them entirely, or disabling last access-time updates (the noatime mount option for most Unix filesystems).

If the limiting factor is bandwidth, look to trade machine resources for bandwidth with strategies like compressing server responses with mod_gzip. Also consider the simple but often-overlooked trick of reducing the byte size of the images Apache is serving (compression generally won’t help with those).
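On Apache 2, the stock mod_deflate module fills the same role as the third-party mod_gzip; a minimal sketch might be:

    # Compress text responses on the fly
    LoadModule deflate_module modules/mod_deflate.so
    AddOutputFilterByType DEFLATE text/html text/plain text/css text/xml

    # Images are already compressed, so don't waste CPU on them
    SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip

The trade-off is exactly as described: a little more CPU on the server in exchange for fewer bytes on the wire.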

Arranging not to deliver the content can actually be easier, and this reduces both server loading and bandwidth usage. Decide how often the static content will change over time, then configure caching and expiration headers with mod_cache (mod_proxy for Apache 1.3) and mod_expires, so that downstream proxies deliver content instead of the server as often as possible.

To really understand how to do this well, there is no substitute for an understanding of HTTP and the features that it provides. RFC 2616, which defines HTTP 1.1, is concise and actually quite readable as RFCs go, so I recommend that all web server administrators have a copy on hand (get it from www.w3.org/Protocols/HTTP/1.1/rfc2616.pdf). That said, it is easy to set expiry criteria for different classes of data and different parts of a site even without a firm understanding of the machinery that makes it work. Doing so will enable the site to offload content delivery to proxies wherever possible. For example, tell proxies that all (or most) of the site’s images are static and can be cached, but that the text can change and should never be cached. It may happen that most of the text is also static, but since images are generally far larger, marking them as static provides immediate benefits with a very small amount of configuration.
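As an illustration of that images-versus-text split, here is a minimal mod_expires sketch (the lifetimes are examples; match them to how often your own content actually changes):

    LoadModule expires_module modules/mod_expires.so
    ExpiresActive On

    # Images are static: let proxies and browsers cache them for a month
    ExpiresByType image/gif  "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType image/png  "access plus 1 month"

    # HTML may change at any time: expire it immediately
    ExpiresByType text/html  "access plus 0 seconds"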

Security is a key issue. What are the main issues to consider with Apache?

Effective security starts with describing the desired services and behaviour of the server (which means both Apache and the hardware it is running on). Once you know that, it is much easier to control what you don’t want the server to do. It’s hard to protect a server from unwanted attention when you don’t have a clear idea of what kind of attention is wanted.

I find it useful to consider security from two standpoints, which are also reflected in the book by having separate chapters. First is securing Apache itself. This includes not only the security-specific modules that implement the desired security policies of the server, but also the various Apache features and directives that have (sometimes non-intuitive) security implications. By knowing what features are required, you can remove the modules you don’t need.

Second, but no less important, is securing the server that Apache is running on. The security checklist in Pro Apache attempts to address the main issues with server security in a reasonably concise way, to give administrators something to start from and get them thinking in the right direction. One that’s worth highlighting is ‘Have an Effective Backup and Restore Process’ — it’s vital to know how to get your server back to a known state after a break-in, and being able to do so quickly will also stand you in good stead if a calamity entirely unrelated to security occurs, like a hard disc failure or the server catching fire (this actually happened to me). The ssh and rsync tools are very effective for making secure network backups and restores. They are readily available and already installed on most Unixes, so there’s no reason not to have this angle covered.

With the increased use of dynamic sites using PHP and Perl, how important and useful are features like SSIs and rewriting, which are built into Apache?

When designing a web application, use the right tool for each part of the job. Apache is good at handling connectivity and HTTP-level operations, so abstract these details from the application as far as possible. Rewriting URLs, which is just one of many kinds of request mapping, is one aspect of this. Similarly, don’t make a web application handle all its own security. Use Apache to handle security up front as much as possible, because it is expert at that, and if used properly it will prevent insecure or malicious requests from reaching the application. Unfortunately, rather too many web application developers don’t really understand web protocols like HTTP and so build logic into the application that properly belongs in the server. That makes it more likely that a malicious request can find a weakness in the application and exploit it. It also means the application designers are not making use of Apache to its fullest potential.

Bear in mind that it is possible, with scripting modules like mod_perl, to plug handlers into different parts of the request-response cycle. Clever use of this ability allows a flexible modular design that is easier to adapt and less likely to create hidden security issues. Apache 2 also provides new and interesting ways to construct web applications in a modular fashion using filters. These features are very powerful, so don’t be afraid to exploit them.
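As a rough sketch of what plugging into different phases looks like with mod_perl 2 (the My::* module names are hypothetical placeholders, not real modules), the configuration side is just one handler directive per phase:

    PerlTransHandler    My::UriMapper       # URI translation / rewriting phase
    PerlAccessHandler   My::AccessPolicy    # access control, before any content runs
    PerlResponseHandler My::Content         # content generation phase

    # Apache 2 filters can then post-process whatever the handler produced
    PerlOutputFilterHandler My::Filter::Annotate

Each phase can accept or decline the request independently, which is what keeps access checks out of the application code itself.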

I’ll admit to a fondness for Server Side Includes (SSIs). Even though they have been largely superseded by more advanced technologies, they are easy to use and allow for simple templating of static and dynamic content. Apache’s mod_include also knows how to intelligently cache static includes, so SSI-based pages are a lot faster than their basic mechanism would suggest, and without requiring any complex configuration. They’re a good choice for sites that have a lot of static content and need to incorporate a few dynamic elements.

Apache is facing an increasing amount of competition from Microsoft’s IIS, especially with the improvements in IIS 6.0. Ignoring the cost implications, what are the main benefits of Apache over IIS?

Trust. One of the reasons that Apache is a reliable, secure, and high-performance web server is that those qualities are the Apache developers’ end objectives. They’re not trying to sell you something. Having total flexibility to add or remove features, or to inspect and modify the code if necessary, is almost a bonus by comparison.

On a more technical note, an Apache-based solution is of course readily portable to other platforms, which ties into the choice of platform we started out with. Although there are always exceptions, if you think there’s a feature that IIS provides that Apache cannot — bearing in mind you can always run Apache on Windows — chances are you haven’t looked hard enough.

Pro Apache is a mammoth title — where do you start with something as complex as Apache?

Too many books on computing subjects tend to orient themselves around the features of a language or application, rather than the problems that people actually face, which is not much help if you don’t already have some idea what the answer is in order to look it up. I try hard in Pro Apache to start with the problems, and then illustrate the various directives and configuration possibilities in terms of different solutions to those problems.

Even though there are a bewildering number of directives available, many of them are complementary, or alternatives to each other, or are different implementations of the same basic idea. For example, take the various aliasing and redirection directives, all of which are essentially variations on the same basic theme even if they come from different modules (chiefly, but not exclusively, mod_alias and mod_rewrite). Understanding how different configuration choices relate to each other makes it easier to understand how to actually use them to solve problems in general terms. A list of recipes doesn’t provide the reader with the ability to adapt solutions to fit their own particular circumstances.

I also try to present several different solutions to the same problem in the same place, or where that wasn’t practical, provide pointers to alternative or complementary approaches in other chapters. There’s usually more than one way to achieve a given result, and it is pretty unlikely, for example, that an administrator trying to control access through directives like BrowserMatch and RewriteRule will discover that SSLRequire is actually a general-purpose access control directive that could be the perfect solution to their problem. (SSLRequire is my favourite ’secret’ directive, because no one thinks to find a directive for arbitrary access control in an SSL module.)
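For the curious, here is a sketch of SSLRequire used purely for access control; the directory, address range and hours are illustrative, mod_ssl must be loaded, and the directive applies within an SSL-enabled host:

    <Directory "/var/www/internal">
        # Allow only requests from the internal network, during office hours
        SSLRequire %{REMOTE_ADDR} =~ m/^192\.168\./ \
                   and %{TIME_HOUR} >= 8 and %{TIME_HOUR} <= 18
    </Directory>

Nothing in the expression has anything to do with SSL, which is exactly the point being made.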

Since many administrators are still happily using Apache 1.3, or have yet to migrate, the updates made to the first edition of Pro Apache (then called Professional Apache and published by Wrox) to cover Apache 2.0 do not separate coverage of the 1.3 and 2.X releases except where they genuinely diverge. The two versions are vastly more similar than they are different — at least from the point of view of an administrator — and in order to be able to migrate a configuration or understand the impact of attempting to do so, it was important to keep descriptions of the differences between the two servers tightly focused. To do this, coverage of the same feature under 1.3 and 2.X is presented on the same page wherever possible.

It seems unlikely considering the quality of the content, but was there anything you would have liked to include in the book but couldn’t squeeze in?

With a tool as flexible as Apache, there are always more problems to solve and ways to solve them than there is space to cover, but for the most part I am very happy with the coverage the book provides. Judging by the emails I have received, many people seem to agree. If there’s anything that would have been nice to cover, it would probably be some of the more useful and inventive of the many third-party modules. A few of the more important, like mod_perl, are covered by the last chapter, but there are so many creative uses to which Apache has been put that there will always be something there wasn’t the space or time to include.

What do you do to relax?

Strangely enough, even though I spend most of my working time at a computer, I’ve found that playing the odd computer game helps me wind down after a long day. I think it helps shut down the parts of my brain that are still trying to work by making them do something creative, but deliberately non-constructive. I recommend this strategy to others too, by the way; board games, or anything similar, work too.

To truly relax, I’ve found that the only truly effective technique is to go somewhere where I don’t have access to email, and determinedly avoid networks of any kind. I suspect this will cease to work as soon as mesh networks truly take hold, but for now it’s still the best option. It also helps that I have a wonderful, supportive wife.

What are you working on next?

Right now I’m gainfully employed and wielding a great deal of Perl at some interesting problems to do with software construction in the C and C++ arena. There’s been some suggestion that a book might be popular in this area, so I’m toying with that idea. I also maintain an involvement in commercial space activities, specifically space tourism, which has recently got a lot more popular in the public imagination (and about time too, some of us would say). That keeps me busy in several ways, the most obvious of which is the ongoing maintenance of the Space Future website at www.spacefuture.com.

Author Bio

Peter Wainwright is a developer and software engineer specializing in Perl, Apache, and other open-source projects. He got his first taste of programming on a BBC Micro and gained most of his early programming experience writing applications in C on Solaris. He then discovered Linux, shortly followed by Perl and Apache, and has been happily programming there ever since.

When he is not engaged in development or writing books, Wainwright spends much of his free time maintaining the Space Future website at www.spacefuture.com. He is an active proponent of commercial passenger space travel and cofounded Space Future Consulting, an international space tourism consultancy firm.

Improved application development, Part 4: Building a Web client

Part 4 of the Improved application development series, which covers development from end to end using Rational tools, is now available.

Written by Nate Schutta, it concentrates on extending the application to work on the web, using the powerful features of the Rational environment to make the development as quick and easy as possible.

Here’s the intro blurb:

In this tutorial, you’ll return to the Auction application that you developed in Part 2. You’ll add functionality to what you developed previously and connect to your entity beans via a Web-based front end. You’ll take advantage of leading-edge technologies like JavaServer Faces (JSF) and Extensible Hypertext Markup Language (XHTML) to create a dynamic Web project — and, thanks to IBM Rational Application Developer’s powerful Web design features, you’ll hardly have to touch the keyboard.

Read on for the full tutorial.

The story so far:

  1. Improved application development: Part 1, Collating requirements for an application
  2. Improved application development: Part 2, Developing solutions with Rational Application Developer
  3. Improved application development, Part 3: Incorporating changes in requirement

Improved application development, Part 3: Incorporating changes in requirement

The next article in the Improved application development series is now up at the IBM site.

This follows on from Part 2, written by Nate Schutta, and moves on to managing the project now that the application is being developed and you start getting faults and change requests into the system that need to be tracked and monitored. The main focus here is the Rational ClearQuest system and how it integrates with the other tools you’ll use in the process, including the original RequisitePro and the new Rational Application Developer and Rational Software Modeler tools.

Remember, these later tools are based on the Eclipse platform and that means that the interfacing code is written as a plug-in to the Eclipse environment.

Here’s the intro description:

The focus of this third tutorial in the “Improved application development” series is on change management. This tutorial shows how individual change requests are linked and traced back to the original requirements specification, how you manage that information from within your development environment, and how you generate a new specification.

You can read the full tutorial.

As a recap, this tutorial follows on from:

Improved application development: Part 1, Collating requirements for an application
and
Improved application development: Part 2, Developing solutions with Rational Application Developer

From Bash to Z Shell by Oliver Kiddle, Jerry Peek and Peter Stephenson

Note: This review was originally published in Free Software Magazine

If you use a free software operating system or environment, chances are one of your key interfaces will be through some kind of shell. Most people assume the bulk of the power of shells comes from the commands available within them, but some shells are actually powerful in their own right, with many of the more recent releases being more like a command-line programming environment than a command-line interface. “From Bash to Z Shell”, published by Apress, provides a guide to using various aspects of the shell. From basic command line interaction through to the more complex processes of programming, it touches on file pattern matching and command line completion along the way.

The contents

Shells are complicated – how do you start describing working with a shell without first describing how the shell works, and in doing so, don’t you end up showing readers how to use it? The book neatly covers this problem in the first chapter with what must be the best description of a shell and how the interaction works that I’ve ever read.

This first chapter leads nicely into the first of three main sections. The initial section looks at using a shell, how to interact with the programs which are executed by the shell and how to use shell features such as redirection, pipes and command line editing. Other chapters look at job and process control, the shell interface to directories and files, as well as prompts and shell history.

The real meat of the book for me lies in the two main chapters in the middle that make up the second section. The first of these chapters is on pattern matching. Everybody knows about the basics of the asterisk and question mark, but both bash and zsh provide more complex pattern matching techniques that enable you to find a very specific set of files, which can simplify your life immensely. The second chapter is on file completion: press TAB and get a list of files that match what you’ve started to type. With a little customization you can extend this functionality to also include variables, other machines on your network and a myriad of other possibilities. With a little more work in zsh, you can adjust the format and layout of the completion lists and customize the lists according to the environment and circumstances.

The third and final section covers the progression of shell use from basic interaction to programming and extending the shell through scripts. Individual chapters cover the topics of variables, scripts and functions. The penultimate chapter puts this to good use by showing you how to write editor commands – extensions to zsh that enhance the functionality of the command line editor. Full examples and descriptions are given here on a range of topics, including my favourite: spelling correction.

The final chapter covers another extension for the command-line – completion functions. Both bash and zsh provide an extension system for completion. Although the process is understandably complex, the results can be impressive.

Who’s this book for?

If you use a shell – and let’s face it, who doesn’t? – then the information provided in the book is invaluable. Everybody from system administrators through developers to plain old end users is going to find something in this book that will be useful to them.

Of all the target groups, I think administrators will get the most benefit. Most administration involves heavy use of the shell for running, configuring and organizing your machine, and the tricks and techniques in this book will go a long way toward simplifying many of the tasks and processes that take up your time. Any book that can show you how to shorten a long command line from requiring 30-40 key presses down to less than 10 is bound to be popular.

Pros

The best aspect of the book is that it provides full examples, descriptions and reasoning for the different techniques and tricks portrayed. This elevates the content from a simple guide into an essential desktop reference. The book is definitely not just an alternative way of reading the online man pages.

The only problem – although it’s a good one – is that reading the book and following the tips and advice given becomes addictive. After you’ve customized your environment, extended your completion routines and enhanced your command-line once, you’ll forever find yourself tweaking and optimizing the environment even further.

Finally, it’s nice to see a handy reference to further reading in one of the appendices – much of it online, but all of it useful.

Cons

One of the odd things about the book is that the title doesn’t really reflect the contents. If you are expecting the book to be a guide to using a range of shells ‘From Bash to Z Shell’, as the name suggests, you’ll be disappointed. Sure, a lot of the material is generic and will apply to many of the shells in use today, but the bulk of the book focuses on just the two shells described in the title, which makes the title a little misleading.

Although I’m no fan of CDs in books, I would have liked to see a CD or web link to some downloadable samples from the book.

In short
Title: From Bash to Z Shell
Author: Oliver Kiddle, Jerry Peek and Peter Stephenson
Publisher: Apress
ISBN: 1590593766
Year: 2005
Pages: 472
CD included: No
Mark: 9

Eric S Raymond, Deb Cameron, Bill Rosenblatt, Marc Loy, Jim Elliott, Learning GNU Emacs 3ed

GNU Emacs has been the editor of choice for many users for many years. Despite new operating systems, environments and applications, emacs still has a place in the toolbox for both new and old users. I talked to the authors of Learning GNU Emacs, Third Edition: Eric S Raymond, Deb Cameron, Bill Rosenblatt, Marc Loy, and Jim Elliott about the emacs religion, nervous keyboard twitches and whether emacs has a future in an increasingly IDE driven world.

Well, I guess the answer to the age-old geek question of ‘emacs’ or ‘vi’ is pretty much covered with this book?

Jim Elliott (JJE): We pretty much start with the assumption that people picking up the book want to know about Emacs. I had fun following the flame wars for a while a decade ago, but we’ve moved on. Some of my best friends and brightest colleagues swear by vi.

Bill Rosenblatt (BR): I try not to get involved in theological arguments.

Deb Cameron (DC): Like all religious questions, you can only answer that for yourself.

Eric S. Raymond (ESR): Oh, I dunno. I think we sidestepped that argument rather neatly.

Marc Loy (ML): I think the other authors have chimed in here, but this book “preaches to the choir.” We don’t aim to answer that religious debate. We just want to help existing converts! Of course I think emacs! but I’m a bit biased.

Could you tell me how you (all) got into using emacs?

ESR: I go back to Gosling Emacs circa 1982 — it was distributed with the variant of 4.1BSD (yes, that was 4.*1*) we were using on our VAX. I was ready for it, having been a LISP-head from way back.

ML: During my first programming course at college, I went to the computer lab and sat down in front of a Sun X terminal. There were two cheat-sheets for editors: emacs and vi. They were out of the vi batch at the time. So I jumped head first into emacs. By the time they had the vi batch replenished, I was hooked and never looked back.

DC: At a startup in Cambridge where I worked, vi was the sanctioned editor. But Emacs evangelists were on the prowl, offering to teach the one true editor in private sessions. Support people threw up their hands in disgust as yet another one turned to Emacs, though this was too early for GNU Emacs. It was CCA Emacs. The only problem in my opinion was the lack of a book, like O’Reilly’s Learning vi. That gap was the impetus for writing this book.

JJE: I was introduced to the mysteries when I was a co-op intern at GE’s Corporate R&D Center in upstate New York, near my undergraduate alma mater, Rensselaer Polytechnic Institute. My mentor and colleagues handed me a cheat sheet and introductory materials, and I took to it like a fish to water, after getting over the initial learning curve. We were developing graphical circuit design software on SUN workstations, creating our own object-oriented extensions to C, since there was not yet a viable C++ implementation, never mind Java.

BR: I was working as a sysadmin in the mid-1980s at a software company that did a lot of government contract work. I was on a project that required relatively little of my time, so I had a lot of time on my hands. I had some exposure to emacs from a previous job, and I decided, rather than just doing crossword puzzles all day, to spend my time learning GNU Emacs as a means of learning LISP. I ended up contributing some code to GNU emacs, such as the first floating point arithmetic library.

Emacs uses a fairly unique keyboard control mechanism (C-x C-s for save, for example). Do you think this is one of the reasons why many find emacs confusing?

ML: Certainly! But for those that can get past this (large) initial hurdle, I think the keyboard controls increase general productivity. The amount of text manipulation I can do all while “touch typing” in emacs has always impressed me.

DC: I think new users might find Emacs confusing either by reputation or because they don’t have this book or haven’t tried the tutorial. C-x C-s is like any finger habit, easy to acquire and with Emacs, easy to change if you so desire, even if you’re not a LISP hacker. And cua mode lets you use more common bindings easily if your fingers aren’t already speaking Emacs natively.

JJE: Undoubtedly. That’s a big part of the learning curve. But it’s much less of a problem than it used to be, now that keyboards have so many extra keys (like separate, reliable arrow keys, page movement keys, and the like). And, even more importantly, there is now by default a visible menu bar and icons to fall back on until you learn the more-efficient keyboard commands. Old hands will remember how much of a nightmare the heavy use of control characters (especially C-s and C-q) used to be, when using modems to dial in through text terminal servers. These almost always interacted poorly with the terminal server’s flow control, and there were usually a couple of other problem keystrokes too. Now that we’re all using TCP/IP and graphical environments, people have it easy!

BR: It tends to divide the programmers from the nonprogrammers. Programmers tend to think that control keys are more, not less, intuitive than using regular letters and numbers like vi. But then maybe I’m just showing signs of religious bigotry.

ESR: Probably the biggest single one, at least judging by the way my wife Cathy reacts to it.

Emacs is something of a legend - how much longer can we expect to see emacs as a leading editor and environment; especially when compared to IDEs like Eclipse?

ML: That’s an excellent question. I doubt it will ever disappear, but I do see it losing ground to focused IDEs. For example, I use Eclipse for my Java programming, but I have it set to use emacs keyboard shortcuts.

DC: Emacs offers infinite flexibility and extensibility. Nothing else offers that. As long as there are hackers, there will be Emacs.

ESR: There will always be an Emacs, because there will always be an ecological niche for an editor that can be specialized via a powerful embedded programming language.

JJE: To elaborate on ESR’s response, editors like Eclipse and JEdit give you powerful and flexible customization through Java, and tend to ship with better basic support for advanced language features and refactoring operations, and it’s easy to look a lot better than Emacs at first glance. But there isn’t anything that compares to its breadth, and how amazingly quickly and flexibly you can extend it if you want to. That’s something that comes from using LISP. You really can’t beat it for deep, dynamic control and power. (And I hope readers unfamiliar with LISP will take the opportunity Emacs gives to explore and learn it; the exercise of becoming a competent LISP hacker is extremely valuable in developing deep programming skills.) I use Eclipse for editing Java, but I use Emacs for most everything else.

BR: I think there will always be a role for emacs, because of its extensibility and the fact that visual programming environments are largely cosmetic rather than substantive improvements over character-oriented ones. The day when visual programming languages (as opposed to those written with ascii characters) become popular is the day when emacs will possibly become obsolete. There’s little better evidence of emacs’s longevity than the fact that you are interviewing us for a book that was originally written about 15 years ago (in fact, I am somewhat amazed that you are doing this). There are very few tech books that have been around that long. It’s because of the longevity of the software.

I find it pretty hard - and I’ve been using emacs for 15 years - to find something that emacs can’t do; is there anything that you think should be supported by emacs but currently isn’t?

JJE: The Unicode support is still very rough-edged, given the wrong approaches that were originally taken. It’s hard to work with Asian alphabets, and XML documents with mixed alphabets, without getting a little nuts. But that’s something that rarely affects me.

ML: Jim Elliott mentioned the Unicode support. Being a Java programmer, I sorely miss that feature. In every other regard, I continue to be surprised by what emacs can do or be taught to do. I suppose the quantity of .emacs files and chunks of LISP code out there are a testament to the stability of this editor.

DC: There are things I’d like to see, but what you find is they’re in the works. An easier approach to character encoding is one, and that’s coming in Emacs 22.

ESR: Not since tramp.el got integrated and made remote editing easy. That was the last item on my wishlist.

Do you think it odd that certain parts of the functionality are only available through shell commands - spelling, for example? Should these be embedded to help emacs become even more of a one-stop shop?

ML: Well, I’ve never used emacs for a text editor, so those shell-escaped features never got in my way. Features like spelling certainly would be welcome, but I don’t think that has a big influence on the folks who are picking up emacs–certainly not on the folks who continue to use it.

ESR: No opinion. I laugh at spellcheckers, and I’m not sure what other things fall in this category.

DC: Spellchecking is embedded now with ispell and flyspell.

JJE: I think we show ispell does a really good job of deeply integrating the spell checking process into the Emacs experience. There’s no reason not to take advantage of external processes for things like this. That’s always been the Emacs (and Unix) philosophy; don’t reinvent things that you can leverage instead.

BR: I think that’s really just a question of demand. If people want spell checking as a built-in command, it’s pretty easy to extend emacs to make that happen through the ability to pipe text through a process.

Emacs has itself become either the source or inspiration of a few other GNU projects (GNU info, for example). Do you see this as a dilution or an endorsement of the technology built into emacs?

ML: I see it as an endorsement, definitely.

ESR: An endorsement, fairly obviously.

DC: An endorsement, of course. Emacs is the grandaddy of ‘em all.

JJE: Endorsement, definitely! You can’t be sure something is useful until it’s been reused at least three times.

BR: Certainly it’s an endorsement. GNU emacs contains a lot of code that is quite useful elsewhere. One example of this is the process checkpointing routine (unexec). I wrote an article, about a zillion years ago in a short-lived journal called SuperUser, about interesting uses for unexec.

Emacs is something of a behemoth compared to solutions like vi and nano. Do you think this makes new users - of Linux particularly - loath to use it, when it’s often not included as part of the basic installation tool set (for example on Gentoo and others)?

ML: I’m sure it has an effect on new users. But vi isn’t a piece of cake, either! The new folks that I have seen picking up emacs are doing it to follow others they see using and enjoying it. They go looking for it. If it’s not installed, that simply adds one step to the process–a step we cover in the book for several platforms, by the way.

DC: Once upon a time Emacs was the only behemoth, but now that’s pretty common and the build process is easy for Linux if it’s not included or if the version included isn’t the latest. There are easy installs for other platforms too, so you can use Emacs no matter what platform you might be (forced into) using at the moment. I run it on three platforms.

JJE: There used to be some truth to this criticism. Remember the old jokes about what Emacs stood for, like “Eight Megs And Constantly Swapping”? But the rest of the computing world long ago swept past. Emacs is now tiny and tight compared to much of the software people encounter. Have you looked at Word lately?

ESR: Don’t ask me to think like a new user; I’m afraid I’m far too aged in evil for that.

BR: Perhaps, yes.

Does everybody here have the same nervous C-x C-s twitch while working in other non-emacs editors that I do?

ML: Daily! That’s why I had to switch the shortcuts in Eclipse.

ESR: Heh. No. Actually, I have both emacs and vi reflexes in my fingers, and I almost never cross up in either direction.

DC: Well, Emacs is so good at saving your work in a pinch that I get nervous only if I’m using something else.

JJE: I only tend to get tripped up when I encounter environments people have set up where one editor is trying to pretend to be another. Usually the context is enough for me to reach for the right keys. One thing I very much enjoy about Mac OS X is the way that the standard Apple frameworks used to build modern applications (the ones that came from NeXTStep) all support basic Emacs key bindings.

You say at the start that you weren’t able to include everything you wanted; emacs includes its own programming language, for example, which you only touch on. But is there anything that didn’t make it into the book that you really, really wanted to include?

ML: I think we managed to cover all of my big ticket items. I’m really happy with the coverage provided for folks learning Emacs. I still use it myself for reminders on the .emacs configuration and font control.

DC: Probably what I would have most liked to include and couldn’t in this rev was Emacspeak, the voice interface to Emacs.

JJE: Deb was the primary driving force behind what got into the third edition.

BR: More on LISP and extensibility, certainly. We had to stick to the fundamentals and only take it so far.

The logistics of five authors for one book must have been interesting?

ML: Actually, with Deb Cameron managing things, it was quite simple. She did a fantastic job–and did a majority of the new work in this edition herself. Jim Elliott and I both worked with her on the second edition of the Java Swing book and had no trouble jumping in to help her finish this book.

ESR: No, it was like one of those album sessions you read about where through the magic of multi-track recording the band members never actually have to be in the same studio at the same time. I only wrote two chapters, fairly cleanly separated from the rest of the book, and never had to interact with the other four authors much.

JJE: It worked very well; Deb’s great at coordinating this sort of thing, and she, Marc and I had worked together in the past on the Java Swing effort.

BR: Well, we were brought on at different times to do different pieces of the book, so it wasn’t a big deal. I wrote roughly the last half of the first edition; the only other author at the time was Deb Cameron. The other authors came along later.

Are any of you working on any new titles we should keep our eyes peeled for?

ML: I’m happily on a writing hiatus, but that never seems to last long.

DC: I’m editing the latest edition of O’Reilly’s Java Enterprise in a Nutshell, a revolutionary revision of that book that includes the best of open source tools in addition to the standard stuff. Check out this article.

JJE: I know that Hibernate: A Developer’s Notebook needs to be revised to cover Hibernate 3. I am hoping to find time to do that this summer or fall, but it’s been a hard year so far because of some health issues in my family. I miss writing! But other things are sometimes more important.

BR: I currently write a newsletter on Digital Rights Management called DRM Watch (www.drmwatch.com). It’s published by Jupitermedia, and it’s a free weekly email subscription. It provides balanced coverage of the subject; I’m somewhere to the left of Big Media and to the right of the EFF.

ESR: I’m going to do a fourth edition of “The New Hacker’s Dictionary” sometime soon.

Author Bios

Marc Loy

Marc Loy is a trainer and media specialist in Madison, WI. When he’s not working with digital video and DVDs, he’s programming in Java. He can still be found teaching the odd Perl and Java course out in Corporate America, but even on the road he’ll have his PowerBook and a video project with him.

James Elliott

James Elliott is a senior software engineer at Berbee, with fifteen years professional experience as a systems developer. He started designing with objects well before work environments made it convenient, and has a passion for building high-quality Java tools and frameworks to simplify the tasks of other developers.

Bill Rosenblatt

Bill Rosenblatt is president of GiantSteps Media Technology Strategies, a New York-based management consulting firm whose clients include content providers and media technology companies ranging from startups to Fortune 100 firms.

Bill’s other titles for O’Reilly are Learning the Korn Shell and (with Cameron Newham) Learning Bash. He is also the author of Digital Rights Management: Business and Technology (John Wiley & Sons) and editor of the Jupitermedia newsletter DRM Watch (www.drmwatch.com).

Debra Cameron

Debra Cameron is president of Cameron Consulting. In addition to her love for Emacs, Deb researches and writes about emerging technologies and their applications. She is the author of Optical Networking: A Wiley Tech Brief, published by John Wiley & Sons, which covers the practical applications and politics of optical networking.

Deb also edits O’Reilly titles, including Java Enterprise in a Nutshell, Java in a Nutshell, JavaScript in a Nutshell, Essential SNMP, Cisco IOS in a Nutshell, TCP/IP Network Administration, Java Security, Java Swing, Learning Java, and Java Performance Tuning.

Eric S Raymond

Eric is an Open Source evangelist and author of the highly influential paper “The Cathedral and the Bazaar.” He can be contacted through his website.

Linux in a Windows World by Roderick Smith

Note: This review was originally published in Free Software Magazine
Linux in a Windows World aims to solve the problems experienced by many system administrators when it comes to using Linux servers (and to a lesser extent clients) within an existing Windows environment. Overall the book is meaty, and a quick flick through shows an amazing amount of information has been crammed between the covers. There are, though, some immediately obvious omissions, given the book’s title and description, but I’m hoping this won’t detract from the rest of the content.

The contents

The book starts off with a look at where Linux fits into a Windows network, covering its use both as a server and desktop platform. Roderick makes some salient points and arguments here, primarily for, rather than against, Linux, but he’s not afraid to point out the limitations either. This first section leads on to a more in-depth discussion of deploying a Linux system into your network, promoting Linux in a series of target areas – email serving, databases and so on – as well as some strategies for migrating existing Windows desktops to Linux.

The third chapter, which opens the second section, starts to look in detail at the various systems and hurdles faced when using Linux within an existing, heavily Windows-focused environment. This entire section is primarily devoted to Samba and to sharing and using shared files and printers.

Section 3 concentrates on centralized authentication, including using LDAP and Kerberos in place of the standard Windows and Linux solutions.

Remote login, including information on SSH, Telnet and VNC, makes up the content of the fourth section. Most useful among the chapters is the one on Remote X Access, which provides vital information on X server options for Windows and on configuring XDMCP for session management.

The final section covers the installation and configuration of Linux-based servers for well-known technologies such as email, backups and network management (DNS, DHCP, etc.).

Who’s this book for?

Overall, the tone of the book is geared almost entirely towards administrators deploying Linux as a server solution and migrating their Windows clients to using the Linux server. The “integration” focus of the book concentrates on replacing Windows servers with Linux equivalents, rather than integrating Linux servers and clients into an existing Windows installation.

All these gaps make the book a “Converting your Windows World to Run on Linux Servers” title, rather than what the book’s title (and cover description) suggests. If you are looking for a book that shows you how to integrate your Linux machines into your Windows network, this book won’t help as much as you might have hoped.

On the other hand, if you are a system administrator and you are looking for a Windows to Linux server migration title then this book will prove invaluable. There are gaps, and the book requires you to have a reasonable amount of Linux knowledge before you start, but the information provided is excellent and will certainly solve the problems faced by many people moving from the Windows to a Linux platform.

Pros

There’s good coverage here of a wide range of topics. The information on installing and configuring Linux equivalents of popular Windows technologies is very nice to see, although I would have preferred more comparative information on how the Windows and Linux counterparts operate.

Some surprising chapters and topics also shine through. It’s great to see the often-forgotten issue of backups getting a chapter of its own, and the extensive information on authentication solutions is invaluable.

Cons

I found the organization slightly confusing. For example, Chapter 3 is about using Samba, but only to configure Linux as a server for sharing files. Chapter 4 then covers sharing your Linux printers to Windows clients. Chapter 6 then covers the use of Linux as a client to Windows for both printer and file shares. Similarly, there is a chapter devoted to Linux Thin Client configurations, but the use of rdesktop, which interfaces to the Windows Terminal Services system, has been tacked on to the end of a chapter on using VNC.

There are also numerous examples of missed opportunities and occasionally misleading information. Windows Server 2003, for example, has a built-in Telnet server and incorporates an extensive command-line environment and suite of administration tools, but the book fails to acknowledge this. There’s also very little information on integrating application-level software, or on the client-specific integration between a Linux desktop and a Windows server environment. A good example here is the configuration of Linux mail clients to work with an existing Exchange server, which is quite happy to work with standard IMAP clients. Instead, the book suggests you replace Exchange with a Linux-based alternative, and even includes instructions for configuring that replacement.

Finally, there are quite a few obvious errors and typos – many of which are in the diagrams that accompany the text.

In short
Title: Linux in a Windows World
Author: Roderick W Smith
Publisher: O’Reilly
ISBN: 0596007582
Year: 2005
Pages: 478
CD included: No
Mark: 8

Joseph D Sloan, High Performance Linux Clusters

Getting the best performance today relies on deploying high-performance clusters, rather than single-unit supercomputers. Building clusters can be expensive, but using Linux can be both a cheaper alternative and a way to make it easier to develop and deploy software across the cluster. I interview Joseph D Sloan, author of High Performance Linux Clusters, about what makes a cluster, how Linux clusters compete with Grid and proprietary solutions, and how he got into clustering technology in the first place.

Clustering with Linux is a current hot topic - can you tell me a bit about how you got into the technology?

In graduate school in the 1980s I did a lot of computer-intensive modeling. I can recall one simulation that required 8 days of CPU time on what was then a state-of-the-art ($50K) workstation. So I’ve had a longtime interest in computer performance. In the early 1990s I shifted over to networking as my primary interest. Along the way I set up a networking laboratory. One day a student came in and asked about putting together a cluster. At that point I already had everything I needed. So I began building clusters.

The book covers a lot of material - I felt like the book was a complete guide, from design through to implementation of a cluster - is there anything you weren’t able to cover?

Lots! It’s my experience that you can write a book for beginners, for intermediate users, or advanced users. At times you may be able to span the needs of two of these groups. But it is a mistake to try to write for all three. This book was written to help folks build their first cluster. So I focused on the approach that I thought would be most useful for that audience.

First, there is a lot of clustering software that is available but that isn’t discussed in my book. I tried to pick the most basic and useful tools for someone starting out.

Second, when building your first cluster, there are things you don’t need to worry about right away. For example, while I provide a brief description of some benchmarking software along with URLs, the book does not provide a comprehensive description of how to run and interpret benchmarks. While benchmarks are great when comparing clusters, if you are building your first cluster, to what are you going to compare it? In general, most beginners are better off testing their cluster using the software they are actually going to use on the cluster. If the cluster is adequate, then there is little reason to run a benchmark. If not, benchmarks can help. But before you can interpret benchmarks, you’ll first need to know the characteristics of the software you are using-is it I/O intensive, CPU intensive, etc. So I recommend looking at your software first.

What do you think the major contributing factor to the increase of clusters has been; better software or more accessible hardware?

Both. The ubiquitous PC made it possible. I really think a lot of first-time cluster builders start off looking at a pile of old PCs wondering what they can do with them. But, I think the availability of good software allowed clusters to really take off. Packages like OSCAR make the task much easier. An awful lot of folks have put in Herculean efforts creating the software we use with very little thought to personal gain. Anyone involved with clusters owes them a huge debt.

Grids are a hot topic at the moment, how do grids - particularly the larger toolkits like Globus and the Sun Grid Engine - fit into the world of clusters?

I’d describe them as the next evolutionary stage. They are certainly more complex and require a greater commitment, but they are evolving rapidly. And for really big, extended problems, they can be a godsend.

How do you feel Linux clusters compare to some of the commercially-sourced, but otherwise free cluster technology like Xgrid from Apple?

First, the general answer: While I often order the same dishes when I go to a restaurant, I still like a lot of choices on the menu. So I’m happy to see lots of alternatives. Ultimately, you’ll need to make a choice and stick to it. You can’t eat everything on the menu. But the more you learn about cooking, the better all your meals will be. And the more we learn about cluster technology, the better our clusters will be.

Second, the even more evasive answer: Designing and building a cluster requires a lot of time and effort. It can have a very steep learning curve. If you are already familiar with Linux and have lots of Linux boxes, I wouldn’t recommend Xgrid. If you are a die-hard Mac fan, have lots of Mac users and systems, Xgrid may be the best choice. It all depends on where you are coming from.

The programming side of a grid has always seemed to be the most complex, although I like the straightforward approach you demonstrated in the book. Do you think this is an area that could be made easier still?

Thanks for the kind words. Cluster programming is now much easier than it was a decade ago. I’m a big fan of MPI. And while software often lags behind hardware, I expect we’ll continue to see steady improvement. Of course, I’m also a big fan of the transparent approach taken by openMosix and think there is a lot of unrealized potential here. For example, if the transparent exchange of processes could be matched by transparent process creation through compiler redesign, then a lot more explicit parallel programming might be avoided.

What do you think of the recent innovations that put a 96-node cluster into a deskside case?

The six-digit price tag is keeping me out of that market. But if you can afford it and need it …

Go on, you can tell me, do you have your own mini cluster at home?

Nope - just an old laptop. I used to be a 24/7 kind of computer scientist, but now I try to leave computing behind when I go home. Like the cobbler’s kids who go without shoes, my family has to put up with old technology and a husband/father who is very slow to respond to their computing crises.

When not building clusters, what do you like to do to relax?

Relax? Well my wife says …

I spend time with my family. I enjoy reading, walking, cooking, playing classical guitar, foreign films, and particularly Asian films. I tried learning Chinese last year but have pretty much given up on that. Oh! And I do have a day job.

This is your second book - any plans for any more?

It seems to take me a couple of years to pull a book together, and I need a year or so to recover between books. You put so many things on hold when writing. And after a couple of years of not going for a walk, my dog has gotten pretty antsy. So right now I’m between projects.

Author Bio

Joseph D. Sloan has been working with computers since the mid-1970s. He began using Unix as a graduate student in 1981, first as an applications programmer and later as a system programmer and system administrator. Since 1988 he has taught computer science, first at Lander University and more recently at Wofford College where he can be found using the software described in this book.

You can find out more on the author’s website. More information on the book, including sample chapters, is available at O’Reilly.

Improved application development: Part 1, Collating requirements for an application

My latest Rational piece is up on the IBM site. This is an update of the series I co-wrote last year on using a suite of Rational tools for your development projects. The latest series focuses on the new Rational Application Developer and Rational Software Modeler, which are based on the new Eclipse 3.0 platform.

Developing applications using the IBM Rational Unified Process is a lot easier if you have the tools to help you throughout the process. The Rational family of software offers a range of tools that on their own provide excellent support for each phase of the development process. But you can also use the different tools together to build an entire application. By sharing the information, you can track components in the application from their original requirement specification through testing and release. This first part of a five-part series shows how to use Rational RequisitePro to manage and organize the requirements specification for a new project. Then, after you’ve developed your unified list of requirements, the tutorial shows how to use Rational Software Modeler to model your application based on those requirements.

You can read the full article.

If you’ve finished it and want more, check out Improved application development: Part 2, Developing solutions with Rational Application Developer.

Using HTTP Compression

I have a new article up at ServerWatch which looks at the benefits and configuration of HTTP compression within Apache and IIS. Here’s an excerpt from the intro:

There’s a finite amount of bandwidth on most Internet connections, and anything administrators can do to speed up the process is worthwhile. One way to do this is via HTTP compression, a capability built into both browsers and servers that can dramatically improve site performance by reducing the amount of time required to transfer data between the server and the client. The principles are nothing new — the data is simply compressed. What is unique is that compression is done on the fly, straight from the server to the client, and often without users knowing.

HTTP compression is easy to enable and requires no client-side configuration to obtain benefits, making it a very easy way to get extra performance. This article discusses how it works, its advantages, and how to configure Apache and IIS to compress data on the fly.

Read on for the full article.

Interview with Tom Jackiewicz, author of Deploying OpenLDAP

My first article for LinuxPlanet is an interview with the author of Deploying OpenLDAP, Tom Jackiewicz. The book is an excellent guide to using and abusing the OpenLDAP platform.

Besides the contents of the book, I talked with Tom about the uses and best environments for LDAP solutions, as well as the technical requirements for OpenLDAP. We also had a little discussion about the complexities of the LDAP system.

You can read the full interview.