Tuesday, December 14, 2004

k3b iso burning

After reading Gill's k3b story http://www.valdyas.org/fading/index.cgi/software/bright.html I thought I would chip in. I did the exact same thing the first time I wanted to burn an iso using k3b. I opened the iso image (which really just put it into the current "image") and then burned it. What I really got was a dud that had one file... the iso. Of course, once you have done it once, I too know how to burn an iso image using k3b, but this presents an interesting area of UI. As a developer you know every area of your app so well that you can't ever look at it the way a newbie would. It is surprising what you find when you plop someone who has never seen your application down in front of it and watch what they do (and don't say *anything*!). They will probably learn quickly what to do, but then your job is to go back and rework the app so newbies can learn even faster. In k3b's case, burning dud CDs is an expensive way to learn.

Tuesday, November 16, 2004

Babies

I once heard that one of babies' evolutionary advantages is that they are "cute". The other day I was thinking about it and spun it around: maybe they aren't cute at all, but we evolved to think they were cute so we wouldn't kill them.

Thursday, October 28, 2004

New Job

I have changed jobs and am now working for SIAC (Securities Industry Automation Corporation), which means I get to work on the NYSE computer systems. I take a two-hour train ride into Brooklyn every day, so I have been catching up on quite a lot of reading.

Monday, October 18, 2004

echo "awk isn't hard" | awk '{ print $0 }'

When I first got interested in Linux and was given a Linux book, I read it cover to cover. When it got to awk it had just a page or two saying something like: "Awk is Soooo cool, it can do *anything*... but we won't cover any of that here, go buy another book." So I didn't bother learning any awk. You can technically do just about everything without awk, which is what I did for years. But as the years went by I read plenty of Linux 101 books. Each and every one would barely cover awk and tell me to go learn it on my own or buy another book. Last week, while looking for a way to parse out lines 30-40 in a file, I knew I could do it in bash, but someone showed me how to do it in one line of awk. While tweaking the script I ran across a website with some more docs on awk. Seeing something I could use, I re-wrote a function in a different script, reducing its runtime from several minutes to just seconds (probably from all the overhead of calling the dozen different bash mini-apps and the piping involved)! With that discovery I was hooked. I read the awk online docs, enough to know what it can do, so when I want to do something in awk I will know that it can and just have to go back and refer to the docs for the details. Now, after less than a week of knowing awk, I find myself wondering why I never learned it before. Why don't Linux books cover awk? A small chapter could cover a whole lot and get someone started with awk. Maybe the authors themselves don't know awk because it was missing from the books they read...
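For reference, this is the kind of one-liner I'm talking about (the file name is just an example); NR is awk's current line number:

awk 'NR >= 30 && NR <= 40' somefile.log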

Thursday, September 16, 2004

Looking for a Qt job in NYC area?

If you are looking for a Qt job in the NYC area, the company I currently work for (SIAC, a subsidiary of the NYSE and the American Stock Exchange) is looking for some Qt devs (just not in my group) for X11/Unix Qt work. You can find the job postings on siac.com. Make sure to mention my name, or e-mail me and I can pass on your resume. :D

Monday, September 13, 2004

Image Caption CSS

This is a short tutorial on how to make images with captions using CSS, without having to resort to using <br> and <div style="clear:both">. I have come across a number of different ways to do this, but each website only covered part of the technique. Putting them all together, I wrote this quick tutorial for others and named it something you would search for.

The basic idea is to make the background of the <a> element the image you wish to display and extend the link area to surround the image.

First, for my example, a small list (taken from my link bar):

<div class="topbar">
<ul>
<li><a class="home" href="/">Home</a></li>
<li><a class="about" href="about/">About</a></li>
<li><a class="programs" href="programs/">Programs</a></li>
</ul>
</div>


Next, the CSS:

div.topbar ul { float: left; padding: 0; }
div.topbar li { float: left; display: inline; }
div.topbar a {
float: left;
text-align: center;
padding-top: 32px;
min-width: 32px;
background-position: center top;
background-repeat: no-repeat;
}

div.topbar a.home { background-image: url(pics/home.png); }
div.topbar a.about { background-image: url(pics/about.png); }
div.topbar a.programs { background-image: url(pics/programs.png); }


When done you end up with something like this:




There are several advantages to using this method rather than using <img> tags inside of <a> tags combined with a <br> between the image and the text.

  1. First off, you don't have to add a clear:both as the final item. This allows for cleaner HTML and a much easier time modifying the layout of the page just by editing the CSS file, without having to worry about adding or removing empty divs with a clear in them.

  2. As a list (or just standalone items) you present only one element to the user, so text readers scanning the page only see one thing: the link.

  3. No float wandering.

  4. It renders in all browsers, including MSIE for the Mac.

  5. If the page isn't generated, the images can be changed or removed just by editing the CSS file.

Sunday, September 12, 2004

Cleaning out the closet

Computers are one of those hobbies that if you do not watch yourself, you can acquire a vast amount of junk. You will find yourself spending an inordinate amount of time organizing, maintaining, and moving it around. When I first got into computers I kept everything. If someone was getting rid of computer equipment, I would take it. In addition, when I would upgrade something I would keep the old one. What follows is my account of how I purged my computer collection.

The first thing to go was a 500 MHz computer that I really did not use. By using spare parts and including some accessories, I upgraded the computer and sold it for $100. Some things I added were a 10/100 network card, a TNT2 video card, a 32X CD drive, a CD-R drive, and some memory. Installing a legal copy of Microsoft Windows 98 onto it and combining it with a mouse, keyboard, and scanner made it a decent computer to sell. I was also able to put together two more (less powerful) complete computers which I gave away for free. A 486 that had too many problems to have been kept in the first place was tossed out.

I had several hundred computer CDs that were mostly Windows software. On the first pass of my purge I was able to toss out many that I knew I would never use again, like Microsoft FrontPage 97 and many games that I never really played. I went through the games to see if they would work on Microsoft Windows XP. Out of the hundred or so original games that I had, I ended up with seventeen that worked and that I would still play. I then tackled CDs that I call "Just Cool Stuff". These dozen CDs contained things that I had found amusing over the years. Copying them onto my computer, I started deleting junk like the 320x240 trailer for "The Matrix" from 1999. In the end I was left with a single CD of things that I had created. I had seven CDs of Windows utilities so that when the time came for re-installs I had them all in one place. Upon investigation I found them to be filled with wonderful things like Microsoft Internet Explorer 4.0, Netscape 4.5, AIM 1.0 and plenty of other utilities that I will never use again. Consolidating these, I got it back down to one CD and tossed the rest away.

From time to time I buy new computer books and have several shelves worth of them. I also had many manuals from software and hardware, everything from simple stuff like how to install a CD-ROM drive to the book-sized "Understanding Office 97". Who would have thought that having computer parts would end up costing you shelf space? I quickly recovered a whole shelf by tossing out stuff like "Learn JavaScript", Java books, "Learn SoftImage", Windows manuals/intro books, game manuals/cards, and countless hardware leaflets.

Looking around my room I found myself with three printers, none of which worked very well. I got rid of both inkjet printers and kept the laser printer, saving a lot of room.

With the age of cable modems and USB drives, all of my floppies were in a box under the bed. One afternoon I went through them, looking for anything I wanted to keep and tossing the rest out. I did keep a dozen that were still in shrink-wrapped packaging for the day I would need a boot floppy.

After getting rid of so much, I found I could toss out tons of other stuff that was piling up in my house. What follows is some of the stuff that I tossed out if I couldn't find anyone who wanted it:

  • IDE and floppy cables.
  • Two power supplies (AT and ATX).
  • A dozen 10Mbps network cards (both ISA and PCI) and network cable out the wazoo.
  • ISA and PCI audio cards, all of which are inferior to the cheap ones built onto motherboards these days.
  • Video cards from defunct manufacturers or cards that had gone bad (but I still kept for some reason).
  • Video capture cards, FM tuners, and everything related.
  • 500MB-2GB hard drives that were slow, noisy, or bad.
  • 386, 486, Pentium, and AMD CPUs, old RAM, and CPU fans.
  • Two old CD-ROM drives that you could not boot off of.
  • Three 5¼" floppy drives.
  • Several 10Mbps hubs and switches.
  • A dozen different mice, keyboards, and adaptors.
  • Random stuff like a 486 CPU remover, a PCI slot fan, extra CD-to-audio-card connectors, and bags of screws.
  • Eight spare printer cables.


The amazing part about computer parts is just how quickly a dependency tree grows. For example because I had a P133 computer I kept my ISA cards, serial mice, AT keyboards and the PS/2 keyboard adaptors. The ISA cards themselves had floppies (containing the drivers) and manuals. By giving away that computer I was able to toss out everything else. By getting rid of a few key computer parts you might find you can get rid of a whole lot more.

When you acquire enough of something you have to store it somewhere or in something. I had several boxes in the closet filled with stuff, another container under the bed, a shelf for manuals, and a CD rack for drivers. In life if you have to go out and buy a container think twice about why you need that container. If you get that container it means a commitment both in physical space and in time to maintain all the stuff you wish it to contain.

Giving away stuff is harder than you might think. I know someone could use the books that I tossed out, but keeping stuff around just to give to people who might want it starts becoming a full time job! From remembering to bring stuff over to your friend's house, to just asking other people if they want your junk, it all takes a lot of time. Unless you really know someone who wants what you are going to toss out, do not keep it; just get rid of it.

I am very happy with where I am today compared to two years ago and am not going to let myself fall back to where I was. It is much easier to keep stuff than to get rid of it, and that is the trap.

Tuesday, July 13, 2004

A small bug in almost every KDE app...

One of my favorite features of KConfigXT is that an app that uses it won't dump any settings to your home directory unless you change them to something other than the default value. This same feature has been incorporated into the KMainWindow class (among others) for the window size. If you open an app and close it without changing its size, it won't save the size since it is the default. At least, that is what it should do.

Looking through my .kde/share/config/ directory this evening I noticed that the config files of applications I had simply opened to look at were empty except for the default window size of the app, something that shouldn't have been there. A quick look at KMainWindow showed that it did remove the entry if what it thought was the default was also the final size on exit. So what was going wrong? It turned out that what it read as the initial size was in fact not the default. I began looking at the applications that had this problem and right off the bat found a reason for it.

Most of the time the problem revolved around applications that had hard coded sizes. The simplest case was when apps called resize() after setAutoSaveSettings() or setupGUI() (setupGUI() calls setAutoSaveSettings()). setAutoSaveSettings(), when called, will load the last saved size, and if an application calls resize() right after that, users will never see the saved size. If the resize() was before those function calls there were problems too. If I had a theme that was larger than what the developer used, or larger default fonts, or the size was simply from when the app was smaller, or if KDE ever moves to a different default toolbar size, having resize() before those two functions will cause a problem, as the initial size wouldn't be the default size presented to the user. So simply removing resize() should solve the problem, right?

Removing hard coded anything is good, and what should happen is that the size is determined by the widget resizing itself when it is shown. In all the cases I have run across, setupGUI() is called before show() and thus nothing has automatically been resized yet. To fix this, setupGUI() will now check whether the widget hasn't been shown yet and call adjustSize(). All done, right?

If setupGUI() is called before setCentralWidget() then adjustSize() won't be applied to the main widget and the "default" size will be wrong. OK, so now that apps don't hard code the size and everything is created before setAutoSaveSettings() is called, the default size is correct, right?

Apps that have widgets that inherit from QWidget, like most games, often didn't implement sizeHint(). The default sizeHint() of QWidget is, I think, 0x0, which is not what we want for the default size of the main widget. So for those widgets a sizeHint() function needs to be added.

And finally, after all that, KMainWindow has the proper value for the default size when setAutoSaveSettings() is called. What started out as a simple fix for ktron ended up uncovering a lot of small issues all over in many different apps. Beyond the original goal, migrating the apps to automatically determine their default size is much more elegant than the myriad of different solutions that apps had implemented.

Some guidelines when fixing apps:
  • Remove hard coded resize() calls in the constructor or main() in favor of letting the automatic resizing determine the default window size.
  • Put the setupGUI() call after all widgets have been created and placed inside the main window (i.e., after setCentralWidget() for 99% of apps).
  • Widgets that inherit from QWidget (like game boards) should overload "virtual QSize sizeHint() const;" to specify a default size.

Finally, to test that it is working: simply delete the application's config file, launch the app, and exit it without resizing. If everything is working, any config file that gets written shouldn't contain the width/height entries.
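As a rough sketch of that test, using ktron as the example app and assuming its config file follows the usual appnamerc convention:

rm ~/.kde/share/config/ktronrc        # start with no saved settings
ktron                                 # open the app, then close it without resizing
# if everything is working this prints nothing:
grep -iE "width|height" ~/.kde/share/config/ktronrc 2>/dev/null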

Friday, June 25, 2004

KDE, New users and Knoppix?

Getting KDE into the hands of new users is much easier than converting current Linux users to KDE. For the most part that means having new users install Linux on a spare computer. When they do that, they will probably go with whatever the default of the selected distribution is, and whatever that desktop environment might be, they will most likely stick with it until a really compelling reason comes along. So the easiest way to get new KDE users is to get them playing with KDE before anything else, which is where Knoppix comes in.

Knoppix comes with KDE and, best of all, doesn't require an install to play around with it. You can give out disks to just about everyone and they have the ability to play around with KDE and find all the little things that they like. With Knoppix a user is much more likely to play around with Linux since the barrier to entry is so much lower. They most likely don't have any feelings about which desktop they want and will go with whatever is presented.
In the past six months I have been giving out CDs to whoever has asked for a copy. At first I gave them out to those who I worked with, but as time went by I discovered other ways. The best was a simple note posted up saying (short version): "If anyone is interested in playing with Linux, I have free copies of Knoppix, a version of Linux you don't have to install, but can run from the CD drive." Since putting that up I have continuously gotten a steady stream of people coming by for copies of Knoppix. Most of them have never played with Linux before, but were interested, wanted to learn more, and saw this as a really easy way. These are the type of people that will make up the KDE user base in the future. With Linux at only 4% of the desktop there are plenty of users left to convert, and it is a whole lot easier to show new Linux users KDE than it is to convince current users of other desktop environments.

So if you are around non-Linux users all day and have some spare CD-R's try posting up a message in a common area and see what kind of response you get. You might be surprised just how many people stop by.

Tuesday, June 01, 2004

File Encoding With KAudioCreator

In KAudioCreator there has been a long-standing wish for the ability to encode wav files. Since this is a task that KAudioCreator already does, it would seem like an easy thing to add, right? Well, a year and a dozen ideas later, I am still not too sure how to integrate it into the interface. In CVS HEAD, if you go to File/Encode File it brings up a dialog where you can select a file to encode. I am not all that happy with this solution, but it seems the best at this time. So I'm putting it out to everyone for some ideas on how this can best be implemented, or whether it should be at all.

http://bugs.kde.org/show_bug.cgi?id=49119

Tuesday, May 11, 2004

This weekend I drove up to Boston. Not the longest trip by far, but at five hours it was long enough. While driving you inevitably drive in front of or behind someone else for a long period of time. There are all sorts of good reasons for doing it, ranging from better mileage to pack behavior. The odd thing is how you can get "attached" to the other cars. Not that you know them or that they know you; often you don't even see the person. But if someone were to get between the two of you, you might hope that they can get back behind you. And when they pull off at their exit you feel almost sad. Isn't life weird?

Monday, May 03, 2004

Anyone find it interesting how in the Linux world everyone is trying to have the one true toolkit? What is most amusing about it is that Linux is the one place where there will always be new toolkits: people trying out new ideas, hacking together something, releasing internal code as GPL, or toolkits that will never ever die no matter how many people hate them. In contrast, OS X and M$ have plenty of toolkits on their OSes, but they all integrate nicely together (OK, not perfectly, but a heck of a lot better than on Linux today). I predict that in five years KDE and Gnome will still be around, but not as the user knows them now. They will work together more closely and look the same. Users today that *only* install Qt or GTK, but not the other, are stupid. It is really shortsighted to only look at applications written in your favorite toolkit. Just like developers should know more than one programming language, they should be familiar with multiple toolkits so they can take advantage of whatever is best for what they need. With all of the work at freedesktop.org, hopefully collaboration will come sooner rather than later.

Monday, April 19, 2004

DistccPPCKnoppix

DistccPPCKnoppix is a Knoppix distribution that contains distcc servers for both Linux x86 and OSX PPC compiling. With it you can utilize your extra x86 computers to build Linux x86 and OSX PPC binaries.

After running across "Building a Darwin cross compiler for use with distcc and fink" and "HOWTO: Use Gentoo Distcc Host to cross-compile for OSX" I quickly set out and built my own cross compiler to speed up my Apple OSX development. Like most people I have more x86 hardware than I do PPC. It didn't take long to get it built and a test distccd server up to show that it was working.

When I compile for Linux x86 I use distccKnoppix to utilize all the spare x86 computers I have lying around. Using the Knoppix Remastering Howto I quickly added the PPC cross compiler to distccKnoppix. Now when it boots, two distccd daemons start on separate ports. The default port (3632) has the x86 compiler and the PPC compiler is on port 3633.
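Roughly speaking, the startup boils down to something like the following; the cross compiler location and the allowed network here are only illustrative:

distccd --daemon --allow 192.168.0.0/24 --port 3632      # native x86 gcc
PATH=/opt/ppc-cross/bin:$PATH \
    distccd --daemon --allow 192.168.0.0/24 --port 3633  # finds the ppc cross gcc first in its PATH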

Now Apple developers finally have a use for their x86 hardware (that they can admit to). This is also a nice cheap way to speed up OSX compiling and development.

Check out my distcc optimizations article which can help you best utilize distcc.

XCode
I am very unhappy with XCode's distcc options. There is no option in the GUI to not compile on the local machine, no way to change the order of the boxes that compile, and no way to set the number of jobs they should run. See my article on optimizing distcc for why these options need to be there. With more than a few boxes compiling I wouldn't bother using XCode at all and would follow the steps below, even if it did work within XCode. I would have thought that at Apple, where they have more than a few boxes, they would have realized this themselves! To compile your XCode project from the command line, run:

export DISTCC_HOSTS="remotehost:3633 localhost"
xcodebuild CC=/usr/bin/distcc CPLUSPLUS=/usr/bin/distcc

Unfortunately I don't know how to get xcodebuild to do the equivalent of make -j. Any hints are welcome. Here is another way that converts to make first so you can do the -j:
To compile your XCode project with distccppcknoppix, in theory just follow these steps:

1) Grab distcc from samba.org, build it, and put it in your path ahead of the one Apple provides for command line compiling. The one that Apple gives seems to be incompatible. I'm not 100% sure this step is required, so it can't hurt to try skipping it; just let me know if it works for you!

2) Use PBTOMAKE to convert the XCode project file into a Makefile. XCode seems to hard code the compiler (/usr/bin/gcc-3.3), bad XCode, bad!

3) If you have distcc installed, hosts specified, and masquerading set up (see the distcc docs, the example above, or the sketch below), you should be able to build from the command line.
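For reference, a rough sketch of a masquerade setup (the directory name is just an example):

mkdir -p /usr/local/distcc-bin
ln -s "$(which distcc)" /usr/local/distcc-bin/gcc    # calls to gcc now go through distcc
ln -s "$(which distcc)" /usr/local/distcc-bin/g++
export PATH=/usr/local/distcc-bin:$PATH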

If anyone can build Apple's version of distcc (download from their website) on x86 Linux (hehe, or use distccppcknoppix to build it for x86), let me know and I will make a new CD with Rendezvous and it. This will remove the above three steps and make it integrate "seamlessly".

For detailed information about Distcc, Distccd, installation instructions, and FAQ visit their website at http://distcc.samba.org/

Update: I got on Slashdot! Check out the comments here.

distccPPCKNOPPIX-0.0.9.iso

Tuesday, March 30, 2004

distcc optimizations

and how to compile kdelibs from scratch in six minutes


If you don't already know about distcc I recommend that you check it out. Distcc is a tool that sits between make and gcc sending compile jobs to other computers when free, thus distributing compiles and dramatically decreasing build times. Best of all it is very easy to set up.

This, of course, leads to the fantastic idea that anyone can create their own little cluster or farm (as it is often referred to) out of their extra old computers that they have sitting about.

Before getting started: in conjunction with distcc there is another tool called ccache, a caching pre-processor for C/C++ compilers, that I won't be discussing here. For all of the tests it was turned off to properly determine distcc's performance, but developers should also know about this tool and use it in conjunction with distcc for the best results and shortest compile times. There is a link to the homepage at the end of this article.
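For reference, one common way to combine the two (turned off for all of the numbers below) is to have ccache hand its cache misses on to distcc:

export CCACHE_PREFIX="distcc"    # ccache passes cache misses to distcc
export CC="ccache gcc"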

Farm Groundwork and Setup


As is the normal circle of life for computers in a corporate environment, I was recently lucky enough to go through a whole stack of computers before they were recycled. From the initial lot of forty or so computers I ended up with twelve desktops that ranged from 500MHz to 866MHz. The main limit on my choosing was that I only had room in my cube for fifteen computers. With that in mind I chose the computers with the best CPUs. Much of the ram was evened out so that almost all of the final twelve have 256MB. Fast computers with bad components had the bad parts swapped out for good components from the slower machines. Each computer was set up to boot from the CD-ROM and not output errors when booting if there wasn't a keyboard/mouse/monitor. They were also set to turn on when connected to power.

Having enough network administration experience to know better, I labeled all of the computers and the power cord and network cord attached to each of them. I even found different colored cable for the different areas of my cube. The first label specified the CPU speed and ram size so that later, when I was given faster computers, finding the slowest machine would be easy. The second label on each machine was its name, which was one of the many female characters from Shakespeare's plays. On the server side a DHCP server was set up to match each computer with its name and IP for easy diagnosis of problems down the line.

For the operating system I used distccKNOPPIX. distccKNOPPIX is a very small Linux distribution that is 40MB in size and resides on a CD. It does little more than boot, get the machine online, and start the distcc daemon. Because it didn't use the hard disk at all, preparing the computers required little more than testing to make sure that they all booted off the CD and could get an IP.

Initially, all twelve computers (plus the build master) were plugged into a hub and switch that I had borrowed from a friend. The build master is a 2.7Ghz Linux box with two network cards. The first network card pointed to the Internet and the second card pointed to the build network. This was done to reduce the network latency as much as possible by removing other network traffic. More on this later though.

A note on power and noise: the computers all have on-board components. Any unnecessary PCI cards found in the machines were removed. Because nothing is installed on the hard disks, they were set to spin down shortly after the machines are turned on. (I debated just unplugging the hard disks, but wanted to leave the option of installation open for later.) After booting up, and after the first compile when gcc is read off the CD, the CD-ROM also spins down. With no extra components and no spinning CD-ROM or hard disk drives, the noise and heat level in my cube really didn't change any that I could notice (there were of course jokes galore by everyone about saunas and jet planes when I was setting up the system).

Optimizations


Since first obtaining the computers, I have tweaked the system quite a bit. My initial builds of kdelibs with distcc took around 45 minutes, which I was very happy with, but as time went by I discovered a number of ways to improve the compile speed. Today it takes six minutes from start to finish to compile all of kdelibs using seventeen 450MHz-866MHz computers and two 3GHz machines.

localhost/1


Depending on how large your farm is, how fast it is, the speed of your network, and the capability of your build box, playing around with the localhost entry in the host list can be well worth your while. Try putting your localhost machine first in the list, in the middle, and at the end. Normally you want to run twice as many jobs as you have processors, but if you have enough machines to feed, running 2 jobs on the localhost can actually increase your build times. Setting localhost to 1 or 0 jobs can decrease the build time significantly even though the build master's CPU might be idle part of the time. The obvious reason is that the machine can be more responsive to all of the build boxes requesting data.
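As a sketch, with host names borrowed from my Shakespeare naming scheme, limiting the build master to a single local job looks something like this:

# try localhost first, last, and limited to one job, and time each variation
export DISTCC_HOSTS="ophelia juliet desdemona localhost/1"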

Network


Because distcc has to transfer the data to the different machines before they can compile it, any time spent transferring a file is lost time. If, on average, 1/3 of the time for one file (send/compile/receive) is spent sending and receiving, it isn't quite the same as the extra computer only being 2/3 as fast megahertz-wise, but it will still decrease your farm's performance. It is much cheaper to upgrade the network than it is to upgrade all of the computers. Here are some places where network bottlenecks occur and can be easily fixed:

Other network traffic. Are the boxes on the common network where all other traffic can spill into it? Reducing traffic not related to building can help improve throughput. Putting them on a different subnet and different switches than the normal home/office network can help.

Network compression. On networks slower than 100Mbps where there is no option of upgrading, it might be worthwhile to turn on LZO compression for data transfers. Although the effective network speed may increase, watch out for the CPU usage on the server.

Enabling compression makes the distcc client and server use more CPU time, but less network traffic. The compression ratio is typically 4:1 for source and 2:1 for object code.
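Compression is switched on per host with the ,lzo option in the host list; something along these lines:

export DISTCC_HOSTS="ophelia,lzo juliet,lzo localhost/1"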

The network's interconnect. Using several 10Mbps hubs chained together is probably the worst scenario. Using a single switch that is 100Mbps or better can dramatically decrease transfer times and in turn lead to much faster compile times. Bang for your buck, this might be the best improvement that can be made.

The main box. All traffic for the compiling comes from and is sent to the main box. Using gigabit (1000Mbps) card(s) on the build master can help reduce this bottleneck. Most networking companies sell switches that have one or two 1000Mbps ports with the rest being 100Mbps. Another possibility, if your switch supports it, is to use multiple NICs as one interface (called trunking). Because the machines are independent of each other and only communicate with the main host, rather than chaining switches, multiple NICs can be installed in the main box, each one connected to a dedicated switch.

Depending on the build system, networking could have a bigger effect on compile times than it should. When using automake, the system enters a directory, builds all of the files in that directory, and then continues building recursively through the directories. Each time it enters a directory it begins to build as many files as it can. In that moment there is a huge spike of network traffic (and collisions) slowing down the overall delivery speed. Compounding that problem, because all of the files are pulled from the hard disk at the same time, preprocessed, and sent out, the computer is tasked to the fullest, often to the point of making the total compile slower than if it were doing one file at a time. Which brings us to...

Unsermake


Make doesn't know about any directory dependencies, only the order in which to build them. The simplest and best example of where this shows up is in the test directory of a project. Typically one will have a library/application and then build (let's say) thirty test applications. Each test application is in its own directory and contains one file. Make would build each directory one at a time, in linear order. Unsermake (which replaces automake) realizes they have no interdependencies and, assuming there are thirty boxes in the farm, compiling could be sped up by a factor of thirty! There are in fact a number of new tools that you can replace automake with, including SCons. Unsermake is simply the one that I have become most familiar with, but they all have the same feature of decoupling the directory dependency and should give similar results.

Even in the best case scenario for automake, every single computer on the farm is guaranteed to be idle while the last file in a directory finishes its build. Using automake, one is quick to discover that the builds can't scale much beyond a handful of computers because the returns dramatically decrease and the extra boxes sit idle most of the time.

There is yet another hidden benefit of using Unsermake, which was touched upon in the last section. As each machine finishes its build, Unsermake will give it the next file off the stack. The boxes will almost never finish at the same time, so rather than a spike in network traffic that is more than the system can handle, there is a continuous stream at the top speed of the entire system. Rather than trying to read and preprocess thirty jobs at once, the master box only has to do one at a time. With only one job to do it will read it faster, preprocess it faster, and the file will be transferred faster to the machine that builds it. On the small scale it doesn't matter much, but add that up over hundreds of files and you will see a very nice payoff from just this one small thing.

In most of the cases I have tried, using Unsermake has cut the compile time in half. Just to say that again... in Half!


For a good example of all of the problems that Unsermake takes care of, check out this image of distccmon-gnome when compiling Qt with automake. On the right hand side, one can see the red and rust colors where too much was trying to happen at once and everything was slower because of it. On the left, one can see many idle machines as they wait for the others to finish before starting a new directory.

More Machines


Of course, if you are only building with two computers, adding a third will speed it up. But what about the nineteenth or twentieth? If a project contained directories with, at most, five files in them, then (using automake) fifteen of those twenty computers would sit idle! But if Unsermake is used they won't; as many computers as there are files could be added and (to a point) benefit would be seen from the additions. Rather than always trying to have top-of-the-line processors, one could get slower machines, double the number, at half the price. Then one wouldn't have to worry that at most five would be used when build time came around. Initially, when I would use 450MHz boxes with 800MHz boxes, the 450MHz boxes actually slowed down the system while the faster boxes waited for jobs to finish. But after using Unsermake the 800MHz boxes were no longer held back and adding the 450MHz boxes improved the system as a whole, just as originally expected. Because of Unsermake, I was able to add three low-end boxes to the farm.

BinUtils


During the 2003 KDE Contributors Conference, Hewlett Packard provided laptops for the developers to use. The laptops were integrated into the already running Teambuilder compilation farm running on the computers contributed by the Polytechnic University of Upper Austria. With 52 computers attached there were huge compilation speed increases, and in comparison the linker speed was quite slow. A little bit of work later a patched binutils was created which dramatically increases the speed of the linker. In the binutils 2.14.90 changelog it is number 7, "Fix ELF weak symbol handling". Not only those who use distcc, but everyone can take advantage of this speed improvement. As of this writing, the current version of linux-binutils is 2.15.90. Make sure you are up to date.

Another place where the new binutils can really help is when distcc is used to cross compile. If, for example, a 200MHz MIPS box uses twenty P4 boxes with a cross compiler on them, having a faster linker on the really slow box would improve the total compile time more than adding additional machines.

Memory


Although it might seem obvious, making sure that the build boxes have enough memory is important. Watching a few of the boxes build with one thread each, memory usage didn't reach 60MB. Having the build boxes contain only 128MB should suffice, but the build master box should contain much more, and even more when doing only local builds. In one test I left in only 256MB and found usage reaching over 200MB most of the time. It did go over 256MB and start swapping a few times, so at minimum 512MB is recommended. Presuming a developer will also be running X11, a desktop environment, and other applications, 1GB of memory isn't out of the question. Once the build computer starts to swap, any gains that might have been had are lost.

Linux Kernel


Just a last note about the kernel. I haven't run any tests using Linux 2.6 vs 2.4 yet, but this might be an area of improvement. Given everything else out there showing the dramatic bandwidth increase going from 2.4 to 2.6, I wouldn't be surprised if it also improved compile times. If you have systems with 2.4 and 2.6 and are using distcc, let me know if it makes much of a difference and I'll make mention of it here.

Could you go back to just one box?


No way! Once you start playing around with distcc you'll find build times shrinking. Most developers probably have one really good computer and several really old computers that have been rebuilt twenty times. Your eye twinkles as you think how fast things would be if you could get a rack of dual Xeons. Coming back to reality, you settle on several sensible requirements for your build farm:

  • Small form factor: if there are going to be several of these boxes, they can't take up much space under the desk.

  • Cheap, duh.

  • Low power: running twenty P4s uses a lot of electricity.



Some options appear:

  • 1U rack-mounted computers. Way, way too expensive for what is needed (unless you are a corporation, in which case, get that rack!).

  • A lot of old computers bought off eBay or collected from friends. You will probably get full desktops, which aren't exactly small, and by the time you buy enough machines you will have spent the same as buying one low-end box that is just as fast as all of the old computers combined, which use five times the electricity. Oh, and don't forget the aggravation of working with a bunch of old computers that are all different.

  • A bunch of those cool cube machines. But there is the problem: they are cool, and so they go for top dollar for what is generally a 1GHz VIA machine.



Those options don't look too good. Maybe there is something better. Micro-ATX is a form factor that is smaller and thinner than the typical desktop, the sacrifice being that you only get three PCI slots. They are typically not sold as something "cool" or for servers, but for people who want a computer that doesn't take up much space, and they are surprisingly cheap. Put an AMD XP in the box to reduce electricity costs, and if the sweet spot in the market is picked it won't cost you an arm and a leg. Adding it up: case, motherboard, CPU & fan, 256MB ram, my total cost including shipping was only $240. Considering that a full-blown computer with the same CPU can be found for ten times as much, this is very cheap and perfect for what we want to use it for. This box gives me the same performance as five old computers, doesn't take up much space, and doesn't use much electricity. Picturing yourself getting two or three of these (for still less than $1K) is actually reasonable now! If someone sold these together as a bundle the price could go even lower (and I would pick one up, hint hint). In my particular situation, choosing the middle-of-the-road CPU permits me, in six months, to sell off the CPU and ram and get the new middle of the road for very little difference in cost. The only item I left off the price was the CD-ROM drive. I have half a dozen old CD-ROM drives lying around and, because it won't be used other than at boot time, a new CD-ROM drive isn't needed. But if you don't have a spare already, finding one for only $20 shouldn't be a problem. Also, many machines these days can net boot, so if you have enough of them it might be worthwhile to set that up.

Conclusion


distcc is a fantastic program which enables developers to cut down on waiting. Although you can just stick more machines on the network and utilize them, if you spend some time testing you might be able to get a whole lot better performance.

If there are more than three boxes in the farm, take a look at Unsermake or similar programs and see if you can use one in your compiling projects. The effort to get it to work with your project might pay you back big time. If you are using automake, you will want a faster, smaller farm instead.

The idea of building small, cheap, dedicated build boxes is a very real alternative. Investigate your alternatives. Perhaps the bare-bones computer is so cheap that it costs about the same as the difference in electricity for running five PII's for a few months. Of course, you could have both the new bare-bones box and those PII's....

Links


scons - http://www.scons.org/
distcc - http://distcc.samba.org/
ccache - http://ccache.samba.org/
distccKNOPPIX - http://opendoorsoftware.com/cgi/http.pl?cookies=1&p=distccKNOPPIX
Unsermake - http://www.kde.me.uk/index.php?page=unsermake
