CHAPTER 8 TROUBLESHOOTING THE LINUX SYSTEM
8.1 Troubleshooting methodology
The maintenance cycle
Monitoring: Observing system areas for problems or irregularities
Proactive maintenance: Minimizing chance of future problems, e.g., performing regular system backups
Reactive maintenance: Correcting problems when they arise
– Documenting solutions
– Developing better proactive maintenance methods
Documentation: System information stored in a log book for future reference
Troubleshooting procedures: Tasks performed when solving system problems
Common troubleshooting procedures
Two troubleshooting golden rules:
1. Prioritize problems according to severity
– Spend reasonable amount of time on each problem given its priority
2. Try to solve root of problem
– Avoid missing underlying cause
– Justify why a certain solution is successful
Two categories of problems:
– Hardware-related
– Software-related
8.2 Hardware-Related Problems
Often involve improper hardware or software configuration:
– SCSI termination
– Video card and monitor configuration
– POST (power-on self-test) alerts
– Loose hardware connections
– IRQ or I/O address conflicts: view output of the dmesg command (print or control the kernel ring buffer)
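To illustrate the dmesg step above, here is a minimal sketch that filters kernel messages for lines that often accompany IRQ or I/O address conflicts. The function name filter_hw_messages and its keyword list are assumptions made for this example; they are not part of dmesg.

```shell
#!/bin/sh
# filter_hw_messages: hypothetical helper that reads kernel log text on
# standard input and keeps lines mentioning IRQs, I/O ports, or conflicts.
# The keyword list is an assumption; adjust it to the hardware in question.
filter_hw_messages() {
    grep -i -E 'irq|i/o port|conflict'
}

# Typical use on a live system (may require root on some distributions):
#   dmesg | filter_hw_messages
# Current IRQ assignments can be inspected with:
#   cat /proc/interrupts
```

Filtering on standard input keeps the helper usable on saved log files as well as live dmesg output.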
Absence of device drivers prevents OS from using associated devices
– kudzu program: Detects and installs support for new hardware
– If hardware device not detected, device driver must be configured manually
HDDs are the most common devices to fail
– Good idea to use RAID
The kudzu welcome screen
Configuring new hardware using kudzu
If HDD containing partitions mounted on noncritical directories fails:
– Power down computer and replace failed HDD
– Boot Linux system
– Use fdisk to create partitions on replaced HDD
– Use mkfs to create filesystems
– Restore original data
– Ensure /etc/fstab has appropriate entries to mount filesystems
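The final step above, confirming that /etc/fstab has an entry for each restored filesystem, can be checked with a short script. The sketch below assumes the standard fstab layout in which the second field is the mount point. The helper name fstab_has_mount and the /data mount point are hypothetical.

```shell
#!/bin/sh
# fstab_has_mount: hypothetical helper that succeeds if the given fstab-style
# file contains a non-comment entry whose second field (the mount point)
# equals the requested directory.
fstab_has_mount() {
    fstab_file=$1
    mount_point=$2
    awk -v mp="$mount_point" '
        /^[[:space:]]*#/ { next }        # skip comment lines
        $2 == mp { found = 1 }           # field 2 is the mount point
        END { exit (found ? 0 : 1) }
    ' "$fstab_file"
}

# Example (the /data mount point is a placeholder):
#   fstab_has_mount /etc/fstab /data || echo "/data is missing from /etc/fstab"
```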
If HDD containing / filesystem fails:
– Power down computer and replace failed HDD
– Reinstall Linux on new HDD
– Restore original configuration and data files
Update your packages every time you make changes to your system (hardware or software). You can run PUP, yum, apt-get, or the GUI-based Synaptic package manager to do the update.
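Which update tool applies depends on the distribution. As a rough sketch, the script below checks whether apt-get or yum is on the PATH and prints, without running, the usual update command for it. Both function names are hypothetical, and the commands shown are the common invocations rather than the only valid ones.

```shell
#!/bin/sh
# detect_pkg_manager: hypothetical helper that prints the first package
# manager found on PATH, or "none" if neither is installed.
detect_pkg_manager() {
    for tool in apt-get yum; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "none"
    return 1
}

# print_update_command: prints (does not run) a common update invocation.
print_update_command() {
    case "$1" in
        apt-get) echo "apt-get update && apt-get upgrade" ;;
        yum)     echo "yum update" ;;
        *)       echo "no supported package manager found" ;;
    esac
}

# Example: print_update_command "$(detect_pkg_manager)"
```

The printed command must be run as root to actually apply updates.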
8.3 Software-Related Problems: Application-Related Problems
Caused by missing program libraries/files, process restrictions, or conflicting applications
Dependencies: Prerequisite shared libraries or packages required for program execution
– Programs usually check for dependencies at installation
– Package files may be removed accidentally
rpm -V command: Identify missing files in a package or package dependency
ldd command: Display shared libraries used by a program
ldconfig command: Reads /etc/ld.so.conf and updates the /etc/ld.so.cache file
/etc/ld.so.conf file: List of directories containing shared libraries
/etc/ld.so.cache file: Contains location of shared library files
Compressor/decompressor (codec) file: Contains rules to compress or decompress multimedia information
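The ldd check described above can be narrowed to just the missing libraries. The sketch below assumes the usual ldd output format, in which an unresolved library appears as "name => not found". The helper name missing_libs and the program path are placeholders.

```shell
#!/bin/sh
# missing_libs: hypothetical helper that reads ldd output on standard input
# and prints the name of every shared library reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Typical use (the program path is a placeholder):
#   ldd /usr/bin/someprogram | missing_libs
# After installing the missing library, or adding its directory to
# /etc/ld.so.conf, rebuild the cache as root with:
#   ldconfig
```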
ulimit command: Modify process limit parameters in current shell
– Can also modify max number of file handles
/var/log directory: Contains most system log files
If applications stop functioning due to difficulty gaining resources, restart them using SIGHUP or SIGKILL
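A short sketch of both ideas above: inspecting the file-descriptor limits with ulimit, and stopping a stuck process by trying SIGHUP first and falling back to SIGKILL. The helper names and the default two-second grace period are assumptions for illustration. Restarting the application afterwards is left to whatever normally launches it.

```shell
#!/bin/sh
# show_fd_limits: prints the soft and hard limits on open file descriptors
# for the current shell.
show_fd_limits() {
    echo "soft: $(ulimit -S -n)"
    echo "hard: $(ulimit -H -n)"
}

# stop_process: hypothetical helper that sends SIGHUP first and falls back
# to SIGKILL if the PID still exists after a grace period (in seconds).
stop_process() {
    pid=$1
    grace=${2:-2}
    kill -HUP "$pid" 2>/dev/null || return 0   # process already gone
    sleep "$grace"
    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid"
    fi
}

# Example (1234 is a placeholder PID):
#   show_fd_limits
#   stop_process 1234 5
```

SIGKILL cannot be caught, so the process gets no chance to clean up; trying SIGHUP first gives it that opportunity.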
8.4 Software-Related Problems: Operating System-Related Problems
Most software-related problems are related to the OS
– Boot loader, filesystem, serial device problems
LILO problems: Place “linear” in, and remove “compact” from, the /etc/lilo.conf file
GRUB problems: Typically result of missing files in /boot directory
mkbootdisk command: Create a boot floppy diskette
If filesystem on partition mounted to noncritical directory becomes corrupted:
– Unmount filesystem
– Run fsck command with -f option (forces a full check)
– If fsck command cannot repair filesystem, use mkfs command to re-create the filesystem
– Restore filesystem’s original data
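The repair steps above can be sketched as a script. Because these commands are destructive, this version defaults to a dry run that only prints what it would execute. The device name, the run and repair_filesystem helpers, and the DRY_RUN switch are all assumptions for illustration. The exit-code threshold follows the fsck manual page, where values of 4 and above indicate uncorrected errors or an operational failure.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 and
# run as root to execute them. The device name passed in is a placeholder.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# repair_filesystem: hypothetical helper following the steps listed above.
repair_filesystem() {
    device=$1
    run umount "$device" || return 1
    run fsck -f "$device"
    status=$?
    # Exit codes of 4 or higher mean errors were left uncorrected or the
    # check itself failed; lower values mean clean or repaired.
    if [ "$status" -ge 4 ]; then
        echo "fsck failed on $device (exit $status): re-create with mkfs, then restore from backup" >&2
        return "$status"
    fi
    return 0
}

# Example: repair_filesystem /dev/sdb1
```

The script deliberately stops short of running mkfs, since re-creating a filesystem erases it and should follow a confirmed backup.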
If / filesystem is corrupted:
– Boot from first Red Hat Fedora installation CD
– Type “linux rescue” at welcome screen
– Enter shell for Linux system on CD
– Create new / filesystem via mkfs command
– Restore original data to re-created / filesystem
– Reboot system
Lost root password
First, you have to reboot into recovery mode.
If you have a single-boot (Ubuntu is the only operating system on your computer), you may have to press the Escape key during bootup in order to see the boot menu. If you have a dual-boot (Ubuntu is installed next to Windows, another Linux operating system, or Mac OS X; and you choose at boot time which operating system to boot into), the boot menu should appear without the need to press the Escape key.
From the boot menu, select recovery mode, which is usually the second boot option.
After you select recovery mode and wait for all the boot-up processes to finish, you'll be presented with a few options. In this case, you want the Drop to root shell prompt option so press the Down arrow to get to that option, and then press Enter to select it.
The root account is the ultimate administrator and can do anything to the Ubuntu installation (including erase it), so please be careful with what commands you enter in the root terminal. Once you're at the root shell prompt, if you have forgotten your username as well, type
ls /home
You should then see a list of the users on your Ubuntu installation. To reset the password, type the following, where username is the username you want to reset:
passwd username
You'll then be prompted for a new password. When you type the password you will get no visual response acknowledging your typing. Your password is still being accepted. Just type the password and hit Enter when you're done. You'll be prompted to retype the password. Do so and hit Enter again.
Now the password should be reset. Type exit to return to the recovery menu. After you get back to the recovery menu, select resume normal boot, and use Ubuntu as you normally would, only this time you actually know the password!
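Listing /home shows home directory names, which usually but not always match login names. As an alternative sketch, the helper below reads the account database directly and prints every login whose home directory is under /home. The function name list_home_users is hypothetical, and the standard seven-field passwd format is assumed.

```shell
#!/bin/sh
# list_home_users: hypothetical helper that prints the login name of every
# account in a passwd-format file whose home directory is under /home.
list_home_users() {
    passwd_file=${1:-/etc/passwd}
    awk -F: '$6 ~ /^\/home\// { print $1 }' "$passwd_file"
}

# At the root shell prompt:
#   list_home_users          # find the forgotten username
#   passwd username          # reset its password (username is a placeholder)
```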