Web server disk image #6 created and put on to the production system. Looks like this is what we will ship with for the Web Server 1.0 release.
ELF kernel 2.0.31 is stable and pretty much ready for release. There are still lots of ugly compiler warnings throughout but at least the kernel builds and works. Doing an in-house field trial. Source and binaries available on gothics, and will be made publicly available later this week (barring any major errors being found).
Took a stab at porting the 2.0.35 kernel, but diff3 lost many of the NetWinder specific changes so the kernel is useless. Need to track down why the local modifications were not preserved before continuing any further.
I've added information on kernel building in anticipation of a limited release of the ELF kernel.
Afternoon spent building RPMS for Apache-1.2.6, sendmail-8.8.7, and others. There are common problems with all of them relating to the changes to the database routines in glibc-2. Therefore I am off on a side project to figure out what has changed in glibc...
Fixing a bug on WS disk image #6: the time zone is set for Eastern Time and short of manually changing the /etc/localtime symlink and /etc/sysconfig/clock config file, there is no easy way to adjust it. I've built the RedHat 'timeconfig' utility and all the supporting utilities. Since the WS doesn't handle the new RPM format, I made a big tarfile that should simply be unpacked in the root (/). The timezone can then be adjusted by typing 'timeconfig'.
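For reference, the manual symlink change mentioned above can be sketched in Python. This is only an illustration under stated assumptions: the function name `set_localtime`, the zone name and the zoneinfo directory are all hypothetical (the zoneinfo location differs between libc versions), and /etc/sysconfig/clock is left alone because its format is not described here.

```python
import os
import tempfile

def set_localtime(root, zone, zoneinfo_dir="usr/share/zoneinfo"):
    """Point <root>/etc/localtime at the zone file for `zone`.

    zoneinfo_dir is an assumption; adjust it to wherever the zone files live.
    """
    target = os.path.join("/", zoneinfo_dir, zone)
    etc_dir = os.path.join(root, "etc")
    os.makedirs(etc_dir, exist_ok=True)
    link = os.path.join(etc_dir, "localtime")
    # Remove any existing link or file first, since os.symlink will not overwrite.
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(target, link)
    return link

# Demo against a throwaway directory rather than the real root.
demo_root = tempfile.mkdtemp()
demo_link = set_localtime(demo_root, "America/Vancouver")
```

Running it against an unpacked disk image directory instead of `/` keeps the experiment away from the live system.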
Stumbled across an obscure bug with recent dynamic libraries. It seems that readelf dumps core on these, and it's related to the strcmp() function referencing into the memory "hole" between the code and data segments. PatB is looking into this.
Decided on a strategy for public distribution of the ELF kernel sources. A script to update the kernel on netwinder.org automatically from the internal CVS server will be set up. The same script will tarball up the tree so that it can be built on a netwinder (a-la Tinderbox). The tarball will go onto the public FTP site and will include the CVS directories intact, making updates simpler.
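The tarball step of that plan could look roughly like the sketch below. It is not the actual update script: the function name `make_snapshot`, the tree name "linux" and the file names are placeholders, and the CVS update step itself is left out. The point is simply that no filter is applied, so the CVS directories travel with the sources.

```python
import os
import tarfile
import tempfile

def make_snapshot(tree_dir, tarball_path):
    """Pack tree_dir, including any CVS/ subdirectories, into a gzipped tarball."""
    top = os.path.basename(os.path.normpath(tree_dir))
    with tarfile.open(tarball_path, "w:gz") as tar:
        # No filter is passed, so CVS administrative directories are kept intact.
        tar.add(tree_dir, arcname=top)
    # Re-open the archive and report what went into it.
    with tarfile.open(tarball_path, "r:gz") as tar:
        return tar.getnames()

# Demo on a throwaway tree; all names here are placeholders.
work = tempfile.mkdtemp()
tree = os.path.join(work, "linux")
os.makedirs(os.path.join(tree, "CVS"))
for name in ("Makefile", os.path.join("CVS", "Entries")):
    with open(os.path.join(tree, name), "w") as f:
        f.write("placeholder\n")
names = make_snapshot(tree, os.path.join(work, "linux-snapshot.tar.gz"))
```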
Set up a trial Faq-O-Matic on my machine for testing purposes. It is pretty easy to set up and seems to work well. Consider setting this up on netwinder.org once the hardware upgrade is complete.
Built webfront into RPM format so that it can be installed along with Apache-1.2.6 directly onto a DM image.
ELF kernel (rel_981006) placed into ccc/kernel on the ftp site. See the Kernel building notes before you try it out.
Repackaged binary kernel (vmlinux-981007.gz) since it was missing /dev/rtc (real-time-clock) support. Source in CVS updated; not a big enough change to warrant another tarfile though.
Rebuilt perl-5.004, this time it is dynamic! You can find it here. It passed all self checks except the groups test (like before) and the io_sock test (which just hangs). It is built straight from the SRPM, but to compile it I had to hack usr/include/sys/sem.h to pull in the definition of union semun from usr/include/linux/sem.h. Just how are these two directories supposed to be kept in sync with each other?
Rebuilt tcl/tk 8.0, not sure if it works properly. Built without any troubles, has been RPM'ed, can be found in my RPM folder (see item above).
Discovered that the clock shaker software mod hasn't been included into the kernel. That explains _a lot_ of weirdness we've been seeing. Working with PatB to get the fix into the kernel, will release a new one as soon as it is done.
Tracked down the clock shaker problem and released new kernel with appropriate modifications. Also corrected the missing system calls in include/asm-arm/unistd.h so that glibc can be compiled correctly. Mail server is down so I cannot post an announcement... will do that on Tuesday (long weekend...).
Two kernel patches, one from Pat to up the bus clock rate, the other a minor cleanup of the log messages during bootup. Built new RPMS for ftpd, imap, inn and having trouble with pam...
Updated gcc source code in CVS repository to the most recent version; helping Scott so that he can keep it up-to-date himself. The other items under /gnu will be updated shortly.
Releasing a new kernel today, which fixes the missing system calls (nanosleep and friends). Also includes some PCI bus tweaking by PatB, a revised CPU-usage indicator (back to the old behaviour).
I'm not sure which day I should put this in... it is 3 am and I'm still here... Andrew and I did major surgery on the NFS disk image, way too many changes to list them here. We've got a three page list of things that are still broken and need to be fixed. Will be working on this over the next few days.
Another kernel release today, main change is to (finally) turn off the bright mode by default. Sound seems not to work properly on my current machine, maybe others using 981014 are also having trouble?
Final touches on the new disk image (2.0 build #8). Added Scott's latest glibc and compiler (after quite some hassles with make suddenly wanting to rebuild everything). San's 2.0.2 firmware is out and works, so we're adding that as well.
Problems with Tcl/Tk seem to be related to the version of Tcl. We have 8.0 and apparently there are significant changes in the C-language interface relative to version 7.6 and earlier. So maybe that is why a number of things that need Tcl/Tk don't work quite right.
Worked on porting the 2.0.35 kernel. There appear to be major changes in the /arch/arm/kernel directory but everything up to this point has successfully compiled.
Read about CVS 1.10 and have decided that we should upgrade to the new version as soon as possible.
Testing the 2.0 DM build #8 disk image. There are lots of little bugs, but the big one is that the scan rates in the /etc/XF86Config file are too high for most monitors. Editing the HorizSync and VertRefresh to be 30-65 and 50-75, respectively, works for the Philips monitors that most people have at CCC. Another bug is that X refuses to start for non-root users; this is probably just a permissions problem in the Xwrappers.
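The HorizSync/VertRefresh edit is easy to script. The following is a minimal sketch, assuming the config uses one keyword per line; the function name `set_monitor_ranges` and the "too high" values in the sample text are made up for illustration, and only the 30-65 and 50-75 ranges come from the note above.

```python
import re

def set_monitor_ranges(config_text, horiz="30-65", vert="50-75"):
    """Replace the values on HorizSync and VertRefresh lines; leave other lines alone."""
    def replace_value(keyword, value, text):
        # Match "<indent>Keyword<spaces>anything" on a single line, case-insensitively.
        pattern = re.compile(r"^([ \t]*%s[ \t]+).*$" % keyword,
                             re.IGNORECASE | re.MULTILINE)
        return pattern.sub(lambda m: m.group(1) + value, text)

    text = replace_value("HorizSync", horiz, config_text)
    return replace_value("VertRefresh", vert, text)

# Hypothetical Monitor section with ranges that are too wide.
sample = (
    'Section "Monitor"\n'
    "    HorizSync   30-95\n"
    "    VertRefresh 50-120\n"
    "EndSection\n"
)
updated = set_monitor_ranges(sample)
```

Commented-out lines (those starting with `#`) are not touched, since the pattern only allows whitespace before the keyword.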
Not quite sure what I did today... Decided what is going on the final disk image for 2.0, started working on implementing it. Reviewed the needs for WS/GS disk image and will work on getting the pieces together for that. Mostly a day of meetings and decisions.
Good news on the 2.0.35 kernel porting effort: I've managed to get all the changes merged and am able to build the whole tree. Now I just have to get everything to link (there are a pile of undefined symbols currently).
Finalizing the 2.0 disk image... lots of niggly issues. Also doing minor kernel patches (cleanup errors, fix quota support)
Procmail-3.10 is crashing (SIGSEGV) apparently due to a strcat() call with source=destination. The problem doesn't appear when optimization is turned off, so perhaps it's a compiler bug.
Internal release of DM disk image 2.0 build #9. We forgot to insmod the tulip driver, so eth1 doesn't work "out of the box". Also the /var/tmp directory doesn't have the right permissions.
Building RPM of glimpse-4.1 for WS/GS
Mostly a day of meetings... :(
Pat has fixed the shaker bug by moving to timer#4, suitably calibrated against the system's hardware clock during boot-up. Tested on my machines and the clock drift problems seem to be greatly improved. Made a kernel release (981029).
Tested the kernel on a variety of machines with satisfactory results. Hopefully this is the last we'll see of the mysterious drifting clock problems.
Built RPM's for webfront, glimpse (needed to have the `wgconvert' program added to the existing RPM), and looked at webglimpse. The latter looks to be difficult since it depends on knowing the hostname as a part of the configuration process, but that cannot be known when the disk image is assembled.
Began collecting pieces for the WS/GS disk. The aim is to have everything in RPM format so that the process of upgrading from a DM is simply a matter of adding/removing packages.
Built an RPM of Woody's utility programs: therm, debug, flashwrite. Wrote man pages for half of them, still two to go. Decided that the flash driver should be moved into the kernel, Woody looking into registering maj/min numbers. The initial RPM's are available on Woody's home page.
Built sendmail RPM, it had to be modified slightly since our glibc includes the DB2.0 database but sendmail is written for v1.85. Just changing all the #include <db.h> into #include <db_185.h> solves the problem. According to http://www.sleepycat.com/convert.html this is the right thing to do.
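The include change can be applied mechanically. Here is a minimal sketch of the substitution, assuming the includes are written in the usual `#include <db.h>` form; the function name `use_db185_header` and the sample source text are hypothetical.

```python
import re

# Matches "#include <db.h>" with optional spacing after "#" and "include".
DB_INCLUDE = re.compile(r"^([ \t]*#[ \t]*include[ \t]*)<db\.h>", re.MULTILINE)

def use_db185_header(source_text):
    """Rewrite every #include <db.h> to #include <db_185.h>."""
    return DB_INCLUDE.sub(lambda m: m.group(1) + "<db_185.h>", source_text)

# Hypothetical source fragment.
sample = "#include <stdio.h>\n#include <db.h>\n# include <db.h>\n"
patched = use_db185_header(sample)
```

Other headers, including ones whose names merely contain "db", are left untouched because the pattern matches `<db.h>` exactly.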
Modified webfront install to make non-critical files owned by "httpd:httpd", and added checks to the install and RPM spec file to ensure that this user exists. Couldn't find a decent way to automatically add the user and set his password, so instead we have to do it manually.
Updated master disk image for build #10. Installed samba, imap, quota, procmail, and sendmail and cleaned up afterwards. Installed new SVGA lib with the 85MHz MCLK fix, made several other small adjustments (see the ChangeLog). Fixed/verified the changes from Ming's email on 2 Nov, including loading the tulip driver and removing duplicates in the X font path.
Assembled disk image #10, resolved some issues with how the quota support files will be created. Made backups of the kde-cvs from Oct 29 up to today, and also of the web server disk images up until today.
Disk image had some problems: quota left error messages because it was first enabled on root only, and then on all filesystems. Cleaned this up, and also added the rescue kernel as /boot/vmlinux.rescue.
Working on NetConfig version 2, adding in requested features (time config, multiple DNS servers, ability to have no domain name, and access control).
Restored 'gothics' after it got reset... Did a bit of work on netconfig but mostly solved problems (or created new ones) for other people.
Build #9 (and #10 for that matter) are missing rhs-printfilters so printing doesn't work. The SRPM from RH-5.1 compiles cleanly and seems to solve the problem.
The /var/tmp directory permissions are still wrong on build #10. However the master image has it right, so we now suspect that tar is the cause of the problem. In the past Andrew did his tar'ing on an x86 but for build #10 we did it on a NetWinder. So tar is wrong too.
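A quick check for this kind of regression can be scripted and run against an unpacked image. The sketch below assumes the intended mode is the conventional 1777 (world-writable with the sticky bit), which is not stated in the note above; the helper names `has_mode` and `fix_mode` are hypothetical.

```python
import os
import stat
import tempfile

EXPECTED_MODE = 0o1777  # assumption: conventional mode for /tmp-style directories

def has_mode(path, expected=EXPECTED_MODE):
    """Return True if the permission bits of path equal `expected`."""
    return stat.S_IMODE(os.lstat(path).st_mode) == expected

def fix_mode(path, expected=EXPECTED_MODE):
    """Set the expected mode if it is wrong, then report whether it is right."""
    if not has_mode(path, expected):
        os.chmod(path, expected)
    return has_mode(path, expected)

# Demo on a throwaway directory tree standing in for an unpacked image.
work = tempfile.mkdtemp()
var_tmp = os.path.join(work, "var", "tmp")
os.makedirs(var_tmp)
os.chmod(var_tmp, 0o755)      # simulate the wrong permissions
was_ok = has_mode(var_tmp)
now_ok = fix_mode(var_tmp)
```

Running the check both on the master image and on the result of unpacking the tarball would show whether the mode is lost during tar'ing or earlier.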
Working on NetConfig (to be renamed "Config") again... adding time & date panels, and DHCP if I can figure it out.
Modified my backup scripts to automatically move the images between two separate servers. Still have to pull them to cygnus manually, but even without there is enough redundancy now that I'm happy.
More work on netconfig.. backend work for the date & time (gui is done)..
Crash course on Zaphod so I can demo it at the show
Attending the Comdex show in Las Vegas. Will try to check email periodically...
Updated the WS disk image so it would be compatible with the current production system. This meant adding a new kernel, new firmware and modifying the /etc/fstab file on the disk. Also moved /dev/therm since it has been given new major/minor numbers in the new kernel. Tested and placed on production1 server.
Built demo disk image including WP and Netscape, imaged for Ron and Roger's desktop machines.
Finishing off NetConfig with DHCP support (hopefully...)...
Re-image WS disk image due to ELF kernel problem... put back the a.out version instead (so long as you don't put in fancy nettrom options, the command line won't be too long and everything works alright.)
More work on NetConfig... to support the features, I decided to re-work the way that input is handled between the different entry fields.
Some more time on NetConfig... also investigated the Apache crash under high load problem with Mark. It looks like it is a missing physical->virtual mapping operation in the alignment fault handler. A modified kernel has been built and it boots, but I can't seem to run the web-stress tool so I can't say for sure if the problem is solved.
Tracking down the Apache crash issue. It is definitely related to the alignment fault handler, because if you remove the handler, the httpd process dies but the system keeps on running. It seems that when physical RAM is nearly exhausted, malloc() calls may fail, and occasionally one gets an unaligned access to a page of memory that is not yet swapped in.
Netconfig again... new widget set in place, now bringing in the file support section.
Helped MarkB build demo #3 for the next show... now on production server.
Nwconfig now reading defaults properly, help is synced back up...
Did some coaching for using CVS, doing RPM builds, setting up a proper disk image.
More work on nwconfig, reading and writing config files, still problems with DHCP settings (cannot configure more than one interface).
Nwconfig work continues - touched up the help.
Began working on 2.0.35 kernel - crashes if built with the frame-pointer option in 'menuconfig' turned on - so don't do that.
Tidied up the kernel over the weekend, set up "reasonable" defaults. Still need to work on sound config menu. Did CVS checkin only to find that the kernel wouldn't build upon checkout - investigating. Turns out to have been caused by CVS deleting empty directories. Fixed.
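One way to spot directories at risk before a checkin is to list those that contain nothing except CVS bookkeeping. The sketch below is only an illustration (the actual fix used is not recorded here); the function name `find_empty_dirs` and the directory names in the demo are hypothetical.

```python
import os
import tempfile

def find_empty_dirs(root, ignore=("CVS",)):
    """Return directories under root holding no files or subdirectories,
    not counting the administrative directories named in `ignore`."""
    empty = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in ignore]
        if not dirnames and not filenames:
            empty.append(os.path.relpath(dirpath, root))
    return sorted(empty)

# Demo on a throwaway tree; the layout is made up.
work = tempfile.mkdtemp()
os.makedirs(os.path.join(work, "arch", "arm", "boot"))   # truly empty
os.makedirs(os.path.join(work, "drivers", "CVS"))        # only CVS bookkeeping
os.makedirs(os.path.join(work, "include"))
with open(os.path.join(work, "include", "version.h"), "w") as f:
    f.write("/* placeholder */\n")
empty_dirs = find_empty_dirs(work)
```

Dropping a small placeholder file into each reported directory is a common way to make sure it survives a pruned checkout.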
Kernel testing - some problems with sound driver, not sure why. There are some assembly files with 26 bit code that need to be investigated.
The sound problems go away when -fsigned-char is removed from the compiler flags, back the way it used to be. Changed the default to suit.
Added raid-4 and raid-5 support options, they were missing from the configure scripts.
Discovered a serious problem: RPMs cannot be built; a really ugly kernel panic occurs during dependency checking. A similar panic occurs if the appletalk support module is loaded... need to investigate!
Made an interim release of the 2.0.35 kernel so that the GS/RM disk images could get going.
There appear to be two major parts to the kernel problem. First off, ldd fails miserably and this is why RPM fails during dependency checking. PatB has traced this to a missing offset in the ELF dynamic loader.
The second part of the problem is caused by a kernel module doing an unaligned access on behalf of a user process. The current model looks at the value of current and if it is zero, grants kernel-level access. Some device drivers, like msdos and appletalk, apparently do legitimate unaligned accesses. So we need to change the criterion.
Re-released the kernel with patches for the problems listed yesterday, and began assembling disk image #11.
Completed build #11 and released for internal testing. Starting the documentation battle (see ~ralphs/build11.html)
Working on the aftermath of build #11 - updating documentation, explaining things that have changed and chasing down various bugs (or not). There appears to be some trouble with the tulip driver. Also, nwconfig writes out the wrong value in /etc/hosts when you configure only the eth1 interface.
Found a small bug in nwconfig, it would write out an incorrect hosts file for eth1 interface. Also collecting a list of other things that are wrong with build #11, and am starting to think we should do build #12 with just those fixes. That would give an opportunity for a kernel revision, so some other loose ends could be tidied up.
Patched "Corel Computer Corp" to read "Corel Computer" as requested by the "suits". The firmware cannot be changed easily, so it is binary patched. Main kernel has been completely cleaned and changes committed. Also incorporated Grant's updates to the paride driver suite.
Worked on the kernel today, testing version 1.03s of the PARIDE driver suite, and updating the sound configuration to bring it in line with the rest of the kernel config process. Fixed the nwconfig bug where it would write the wrong values to /etc/hosts and for the ONBOOT fields (these problems were related).
Fixing the sound config menus in the kernel configuration, and looking into why xconfig doesn't work. Received another Paride patch from Grant, to be incorporated. The kernel is not to be revved for the next disk image. Collecting pieces for build #12, imaging tomorrow.
Fixing minor bug in nwconfig - enter key behaviour.
Assembled build #12 and released image for internal testing (see build12.html).
Documentation day... working on my web pages at http://www.netwinder.org/~ralphs/.
More documentation work... pages are starting to take shape!
Continuing on the documentation work; restoring the master disk image.
Reviewed the master disk again and fixed some more oversights. The difference listing is still huge, but it is almost all just timestamps on directories now.