If your Linux server isn't performing to its full potential, it's likely there is an underlying issue that needs resolving.

Follow these five simple yet practical steps to troubleshoot a Linux server and reduce downtime to an absolute minimum.

1. Check the Hardware

Let's get down to the absolute basics: check the hardware. Head over to the physical rack and check for loose cables or a loss of power.

Alternatively, check the network link from the shell with the following command:

        $ sudo ethtool eth0
    

If the output includes "Link detected: yes", you know your port is talking to the network.
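For scripted checks, the same test can be wrapped in a small helper. The following is a minimal sketch, not part of the original steps: the function name link_is_up is hypothetical, and it reads ethtool output on standard input so it can be tried without touching real hardware.

```shell
# link_is_up: hypothetical helper that reads `ethtool <iface>` output on stdin
# and succeeds only if the "Link detected" field reports "yes".
link_is_up() {
    grep -q 'Link detected: yes'
}

# Example use (eth0 is the interface name used in the article; yours may differ):
#   sudo ethtool eth0 | link_is_up && echo "link up" || echo "link down"
```

The exit status makes the helper easy to chain into a larger health-check script.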

To see what the server's BIOS/UEFI reports about its installed memory, use the following command:

        $ sudo dmidecode --type memory
    

If the reported memory modules match what is installed, this isn't the problem either. If you suspect memory errors, load the EDAC (Error Detection and Correction) kernel module with the following command:

        $ sudo modprobe edac_core
    

The command prints nothing when the module loads successfully. Next, read the memory controller's corrected-error counters:

        $ sudo grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count
    

This presents you with a list of the memory controller's rows and channels along with the corrected-error count for each. When this output is combined with the dmidecode data on the memory channel, part number, and slot, you can identify the faulty memory stick.
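To show only the counters that have actually recorded errors, the check can be sketched as a small function. This is an illustrative sketch under assumptions: the name report_ce_counts is hypothetical, and the base directory is a parameter (defaulting to the EDAC sysfs path) so the function can be pointed at a test tree.

```shell
# report_ce_counts: hypothetical helper that prints every corrected-error
# counter file whose value is greater than zero.
# $1 (optional): base directory, defaults to the EDAC sysfs location.
report_ce_counts() {
    base="${1:-/sys/devices/system/edac/mc}"
    for f in "$base"/mc*/csrow*/ch*_ce_count; do
        [ -r "$f" ] || continue      # glob matched nothing, or file unreadable
        count=$(cat "$f")
        # Print only non-zero counters; non-numeric content is skipped silently.
        [ "$count" -gt 0 ] 2>/dev/null && echo "$f: $count"
    done
    return 0
}

# Example use on a live system (after loading the EDAC module):
#   report_ce_counts
```

An empty result means no corrected errors have been counted, or that the EDAC driver exposes no counters on this machine.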

2. Decipher the Exact Problem

Your server has gone down, and there are no two ways about it. Before jumping in with your tools, it is essential to define exactly what the problem is. For example, if your users face issues with a server application, you need to make sure the problem is not on the client side.

Second, as part of the problem hunt, try to narrow down the source of the problem: either the server itself or the server application. For instance, a server program can go haywire while the server itself functions like a well-oiled machine.

To check whether an application such as Apache is running and listening on a port, type the following:

        $ sudo ps -ef | grep apache2
        $ sudo netstat -plunt | grep apache2

If Apache is not running, you can start it using:

        $ sudo service apache2 start
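
Because a ps listing piped through grep can also match the grep process itself, a scripted check is often more reliable when keyed on the listening port. The following is a rough sketch: the function name listening_on_port is hypothetical, and it assumes the column layout printed by netstat -plunt (local address in the fourth column, state in the sixth).

```shell
# listening_on_port: hypothetical helper that reads `netstat -plunt` output on
# stdin and succeeds if some TCP socket is in LISTEN state on the given port.
# $1: port number to look for.
listening_on_port() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Example use (port 80 is an assumption; adjust to your Apache configuration):
#   sudo netstat -plunt | listening_on_port 80 || sudo service apache2 start
```

Matching on ":port" at the end of the address field avoids false hits such as port 8080 when looking for port 80.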
    

In short, figure out the exact problem before jumping the gun. This narrows down the list of possible issues and helps you work out a solution accordingly.

3. Using the top Command

top is one of Linux's most useful debugging commands, as it displays the load average, swap usage, and a list of the processes using the system's resources.

But the first time you use it, it can appear confusing. Here's a quick breakdown of top.

Line 1:

  • The current time
  • How long the computer has been running
  • Number of users
  • Load average (the system load averaged over the last minute, last 5 minutes, and last 15 minutes)

Line 2:

  • Total number of tasks
  • Number of running tasks
  • Number of sleeping tasks
  • Number of stopped tasks
  • Number of zombie tasks

Line 3:

  • CPU usage as a percentage by user processes
  • CPU usage as a percentage by system (kernel) processes
  • CPU usage as a percentage by low-priority (niced) processes
  • Percentage of CPU time spent idle
  • Percentage of CPU time spent waiting on I/O
  • CPU usage as a percentage by hardware interrupts
  • CPU usage as a percentage by software interrupts
  • Percentage of CPU time lost to steal time (taken by the hypervisor)

Line 4:

  • Total system memory
  • Free memory
  • Memory used
  • Buffer/cache memory

Line 5:

  • Total swap available
  • Total swap free
  • Total swap used
  • Available memory

This is followed by a line for each running process. It includes:

  • Process ID
  • User
  • Priority
  • Nice level
  • Virtual memory used by process
  • Resident memory used by process
  • Shareable memory
  • CPU used by process as a percentage
  • Memory used by process as a percentage
  • Time process has been running
  • Command

To find out which process is consuming the most memory, sort the process list by pressing M (Shift+M).

To check processes using the most CPU power, press P (Shift+P).

To filter on a specific field, press o, which will display the following prompt:

        add filter #1 (ignoring case) as: [!]FLD?VAL
    

At the prompt, you can then enter a filter on a particular field, like

        COMMAND=apache
    

This will filter and show only Apache processes.
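The load averages on line 1 of top are also available in /proc/loadavg, which makes them easy to test in a script. The sketch below is illustrative only: the function name load_is_high is hypothetical, and comparing the 1-minute load against the number of online CPUs is a common rule of thumb rather than a hard limit.

```shell
# load_is_high: hypothetical helper that succeeds when the 1-minute load
# average exceeds the CPU count.
# $1 (optional): file in /proc/loadavg format, defaults to /proc/loadavg.
# $2 (optional): CPU count, defaults to the number of online processors.
load_is_high() {
    loadavg_file="${1:-/proc/loadavg}"
    cpus="${2:-$(getconf _NPROCESSORS_ONLN)}"
    # The first field of the file is the 1-minute load average.
    awk -v cpus="$cpus" '{ exit !($1 > cpus) }' "$loadavg_file"
}

# Example use:
#   load_is_high && echo "1-minute load exceeds CPU count"
```

For a one-off snapshot without the interactive screen, top can also be run in batch mode with top -b -n 1.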

4. Tracking the Disk Space

Even with plenty of storage available, a server can run out of space, leading to a multitude of problems. In such scenarios, use the df command (short for "disk free") to pull up a complete summary of available and used disk space.

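You can use it in the following three ways. The -h option prints sizes in human-readable units, -i reports inode usage instead of block usage, and -hT adds a column showing each filesystem's type:

        $ sudo df -h
        $ sudo df -i
        $ sudo df -hT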
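
To flag nearly full filesystems automatically, the df output can be filtered by its usage column. This is a minimal sketch under assumptions: the function name full_filesystems and the default 90% threshold are hypothetical, and it expects the POSIX layout printed by df -P, with mount points that contain no spaces.

```shell
# full_filesystems: hypothetical helper that reads `df -P` output on stdin and
# prints "<mount point> <use%>" for each filesystem at or above the threshold.
# $1 (optional): threshold in percent, defaults to 90.
full_filesystems() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "95%" -> "95"
            if (use + 0 >= t) print $6 " " $5
        }
    '
}

# Example use:
#   df -P | full_filesystems 80
```

The -P flag keeps each filesystem on a single line, which is what makes the fixed column numbers safe to rely on.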

Another useful metric is %util, a column in the extended output of iostat -x (part of the sysstat package), which highlights how strained a device is. Values greater than 60% utilization indicate poor storage performance. Anything close to 100% means the drive is close to saturation.
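The %util figure appears as a column in the extended report of iostat -x, from the sysstat package. As a rough sketch, the 60% rule above can be applied to that report automatically. The function name busy_devices is hypothetical, and because the column order differs between sysstat versions, the sketch locates %util by its header name instead of assuming a position.

```shell
# busy_devices: hypothetical helper that reads `iostat -x` output on stdin and
# prints "<device> <%util>" for devices above the threshold.
# $1 (optional): threshold in percent, defaults to 60 (the figure cited above).
busy_devices() {
    threshold="${1:-60}"
    awk -v t="$threshold" '
        # Header row: remember which column holds %util.
        $1 ~ /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
        # Data rows: only lines that have at least that many columns.
        col && NF >= col && $col + 0 > t { print $1, $col }
    '
}

# Example use (requires the sysstat package):
#   iostat -x | busy_devices
```

Note that a plain iostat -x run reports averages since boot; sampling over an interval, for example iostat -x 5 2, gives a more current picture.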

5. Check the Logs for Problems

The logs give you a ton of helpful information. They live in /var/log, often in a subdirectory specific to the service. For newcomers, Linux's server logs might be the scariest place on the planet.

That does not have to be the case, mainly because the logs are divided by function. Some capture what happens on a system or program, while others record system or application error messages. Logs are usually enormous files, given the amount of information they store.

Log files can be cryptic, so it's always best to learn how to find your way around them.

If you are unsure where to start, use dmesg, which displays the kernel's messages. Piping it through tail shows only the last 10 lines by default.

        $ dmesg | tail 
    

Running the tail command with the -f option on the syslog file keeps watching it and prints each new event as it is written. There is no need to pipe dmesg into it, because tail reads the file directly:

        $ tail -f /var/log/syslog
    

This command keeps running and prints new log entries as they arrive, so possible problems show up as they occur.
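When a log is too large to read by eye, filtering the most recent entries for common failure keywords is a reasonable first pass. The following is a sketch under assumptions: the function name recent_errors, the keyword list, and the default of 200 lines are all hypothetical choices, and /var/log/syslog is the path used above (other distributions may log elsewhere).

```shell
# recent_errors: hypothetical helper that prints lines mentioning common
# failure keywords from the tail end of a log file.
# $1 (optional): log file, defaults to /var/log/syslog.
# $2 (optional): how many trailing lines to scan, defaults to 200.
recent_errors() {
    log_file="${1:-/var/log/syslog}"
    lines="${2:-200}"
    # -i: ignore case, -E: extended regex with alternation.
    tail -n "$lines" "$log_file" | grep -iE 'error|fail|critical'
}

# Example use:
#   recent_errors /var/log/syslog 500
```

The function returns a non-zero status when nothing matches, so it can also serve as a quick yes/no check in a script.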

Troubleshooting Your Linux Server Effectively

Troubleshooting your Linux server might seem a daunting feat initially, but a few basic checks are enough to get the ball rolling. If these five steps haven't helped you identify and track down the problem, it might be worthwhile to get other people involved.

Most of the time, however, one of the above troubleshooting steps should help resolve the issue at hand.