Quick analysis of a Linux system

Sometimes you need to take a look at an unfamiliar Linux system, or you have to quickly diagnose problems on a Linux server. Fortunately, Linux offers a set of commands that let you start an in-depth analysis and understand what is happening behind the scenes.

 

Looking, administering, analyzing: going down to the command line interface

Not all the commands described here are installed by default on every Linux distribution. Several are part of the GNU Coreutils package and the others come with similarly common packages, so you will quickly find that a whole analysis toolkit is already installed on your system. My first piece of advice is to try them on the systems (servers or ordinary PCs) that you normally administer and, if any are missing, to install them. I also suggest including these commands and their packages in your standard installations, because sooner or later they come in handy.
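A minimal sketch of that first check, assuming a POSIX shell. The function name missing_commands is only illustrative, and which package provides a missing command depends on your distribution.

```shell
# Print the names of the given commands that are not found in PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# The commands covered in this article; no output means all are installed.
missing_commands uptime date uname ps df free top dmesg w
```

Anything the function prints is a candidate for your standard installation list.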

If you want to go deeper into Linux system analysis, I recommend reading and studying Linux Performance Analysis, an excellent article by Brendan D. Gregg that lists the useful commands for identifying possible problems. He also maintains a page with tool maps and links to the various Linux performance material he has created.

 

$ uptime
23:51:26 up 21:31, 1 user, load average: 30.02, 26.43, 19.02

This shows how long the system has been up and running. It is also a quick way to view the load averages, which indicate the number of tasks (processes) wanting to run. The three numbers are exponentially damped moving averages with time constants of 1, 5, and 15 minutes. In the example above the 1-minute value is higher than the 15-minute value, which means the load has been rising.
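A load figure is easier to judge relative to the number of CPUs. The sketch below divides the 1-minute load average by the CPU count. The function name load_per_cpu is hypothetical, and reading a result well above 1.0 as "more runnable tasks than CPUs" is a common rule of thumb rather than something stated in this article.

```shell
# Divide the 1-minute load average by the number of CPUs.
# $1: a line in /proc/loadavg format, $2: number of CPUs
load_per_cpu() {
    echo "$1" | LC_ALL=C awk -v n="$2" '{ printf "%.2f\n", $1 / n }'
}

# On Linux, the same figures that uptime shows are available in /proc/loadavg.
if [ -r /proc/loadavg ]; then
    load_per_cpu "$(cat /proc/loadavg)" "$(nproc)"
fi
```

With the load of 30.02 from the example and, say, 8 CPUs, the result would be 3.75.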

 

$ date
So 11. Mär 11:19:02 CET 2018

A useful command to check whether the system has the correct time and which time zone it is using. This helps you avoid misreading the logs and lets you check whether a problem is caused by an incorrectly set system clock.
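When comparing timestamps across machines or against logs written in UTC, it can help to print the local time with its zone and the same instant in UTC. This is a small sketch that uses only standard ‘date’ format specifiers; the exact format string is just one possible choice.

```shell
# Local time with the zone abbreviation and the numeric UTC offset.
date '+%Y-%m-%d %H:%M:%S %Z (%z)'

# The same instant in UTC, handy when logs are written in UTC.
date -u '+%Y-%m-%d %H:%M:%S %Z'
```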

 

$ uname -a
Linux razen 4.15.0-10-generic #11-Ubuntu SMP Tue Feb 13 18:23:35 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

A quick look at the kernel version that is running, the host name, and the processor architecture. If you want more information about the hardware of the system, use other, more specific commands. Take a look at this reference: What is the Linux command to find out hardware info?
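If you only need one of those fields, for example in a script, ‘uname’ can print them individually. A short sketch of the most commonly used options:

```shell
uname -s   # kernel name, e.g. Linux
uname -n   # host name
uname -r   # kernel release
uname -m   # machine hardware name, e.g. x86_64
```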

 

$ ps ax
PID TTY STAT TIME COMMAND
1 ? Ss 0:03 /sbin/init splash
2 ? S 0:00 [kthreadd]
4 ? I< 0:00 [kworker/0:0H]
6 ? I< 0:00 [mm_percpu_wq]
7 ? S 0:00 [ksoftirqd/0]
8 ? I 0:02 [rcu_sched]
9 ? I 0:00 [rcu_bh]
10 ? S 0:00 [migration/0]
...

A command to get an idea of what is running on the system. A look at the process list is often enough to spot simple problems. The ‘a’ option tells ‘ps’ to list the processes of all users rather than just those of the current user, and the ‘x’ option includes processes that are not attached to a terminal, such as daemons.
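One simple problem the process list can reveal is a process stuck in uninterruptible sleep (state D) or a zombie (state Z), both visible in the STAT column. The sketch below filters for them; the function name stuck_processes is illustrative, and the ‘o’ option used to select columns is not covered in this article.

```shell
# Keep the header line plus processes whose state starts with D or Z.
stuck_processes() {
    awk 'NR == 1 || $2 ~ /^[DZ]/'
}

# Select only the PID, state and command name columns for all processes.
if command -v ps >/dev/null 2>&1; then
    ps axo pid,stat,comm | stuck_processes
fi
```

If only the header line is printed, no process is currently in either state.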

 

$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 3,9G 0 3,9G 0% /dev
tmpfs 788M 1,9M 787M 1% /run
/dev/sda2 234G 165G 57G 75% /
tmpfs 3,9G 91M 3,8G 3% /dev/shm
tmpfs 5,0M 4,0K 5,0M 1% /run/lock
tmpfs 3,9G 0 3,9G 0% /sys/fs/cgroup
/dev/sda1 511M 4,7M 507M 1% /boot/efi
tmpfs 788M 16K 788M 1% /run/user/121
tmpfs 788M 2,1M 786M 1% /run/user/1000
/dev/fuse 250G 0 250G 0% /run/user/1000/keybase/kbfs

The ‘df’ command displays the amount of free disk space on the specified filesystem, or on the filesystem that contains a given file; without arguments it reports on all mounted filesystems. The ‘-h’ option prints sizes in human-readable units. So, how much free space do we have? Some problems arise from (almost) full filesystems.
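To spot nearly full filesystems without reading the whole table, the output can be filtered by usage percentage. This is a sketch: the function name full_filesystems and the 90% threshold are arbitrary choices, and it assumes device names and mount points without spaces.

```shell
# Print "mountpoint usage" for filesystems at or above a threshold (default 90%).
# Expects the output of `df -P` on standard input.
full_filesystems() {
    threshold="${1:-90}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= t) print $6, $5
    }'
}

# -P selects the POSIX output format, which keeps every filesystem on one line.
df -P | full_filesystems 90
```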

 

$ free -m
total used free shared buff/cache available
Mem: 7879 2829 2917 462 2132 5103
Swap: 10239 0 10239

The ‘free’ command displays the total amount of free and used physical and swap memory in the system, as well as the buffers and caches used by the kernel. The ‘available’ column estimates how much memory is available for starting new programs without swapping; the ‘free’ column counts only memory that is not used at all. With ‘-m’ the values are in mebibytes; if you prefer human-readable units such as gibibytes, you can also use ‘free -h’.
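In a script it is usually the ‘available’ figure that matters. The sketch below extracts it from the ‘free -m’ output by looking up the column name in the header. The function name available_mb is hypothetical, and the approach assumes a version of ‘free’ that prints an ‘available’ column, as in the output above.

```shell
# Print the "available" value from the Mem: row of `free -m` output on stdin.
available_mb() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "available") col = i + 1 }
         $1 == "Mem:" && col { print $col }'
}

# LC_ALL=C keeps the column headers in English regardless of the system locale.
if command -v free >/dev/null 2>&1; then
    LC_ALL=C free -m | available_mb
fi
```

The column index is shifted by one because the data rows start with a label (Mem:, Swap:) that the header row does not have.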

 

$ top
Tasks: 244 total, 1 running, 191 sleeping, 0 stopped, 0 zombie
%Cpu(s): 6,0 us, 2,4 sy, 0,0 ni, 91,6 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem : 8068856 total, 2887832 free, 3017140 used, 2163884 buff/cache
KiB Swap: 10485756 total, 10485756 free, 0 used. 5158896 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3344 cialu 20 0 2474952 611800 192800 S 15,2 7,6 11:45.61 Web Content
2344 cialu 20 0 3514540 225424 100184 S 4,6 2,8 5:16.10 gnome-shell
3618 cialu 20 0 2535564 432428 128776 S 4,6 5,4 3:40.96 Web Content
3102 cialu 20 0 3069176 670060 222340 S 3,3 8,3 9:40.47 firefox
...

The ‘top’ command reports many metrics and refreshes them continuously, providing a dynamic real-time view of a running system. It can display system summary information as well as a list of the processes or threads currently being managed by the Linux kernel.
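For a one-off snapshot that can be saved or piped into other tools, ‘top’ can also run in batch mode (‘-b’) for a single iteration (‘-n 1’). As a sketch, the hypothetical helper task_count below pulls one counter out of the ‘Tasks:’ summary line; it assumes the summary layout shown above.

```shell
# Print one counter from the "Tasks:" summary line, e.g. running or zombie.
task_count() {
    awk -v name="$1" '/^Tasks:/ {
        for (i = 2; i <= NF; i++) if ($i ~ "^" name) print $(i - 1)
    }'
}

# -b: batch mode (plain text output), -n 1: exit after one iteration.
if command -v top >/dev/null 2>&1; then
    LC_ALL=C top -b -n 1 | task_count running
fi
```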

 

$ dmesg
[ 37.224664] Bluetooth: RFCOMM ver 1.11
[ 38.179168] rfkill: input handler disabled
[ 159.022215] show_signal_msg: 20 callbacks suppressed
[ 159.022216] deja-dup-monito[3555]: segfault at bbadbeef ip 00007f82cdbfe0b8 sp 00007ffe71ce1930 error 6 in libjavascriptcoregtk-4.0.so.18.7.7[7f82cce41000+fc9000

Invoking ‘dmesg’ without any options writes all the messages in the kernel ring buffer to standard output. As that output usually does not fit on a single terminal page, you can pipe it through text tools such as ‘less’ or ‘grep’.
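As an example of such filtering, the sketch below keeps only the lines that look like problems. The function name kernel_problems and the keyword list are assumptions to adapt to your case; on some systems reading the kernel log also requires root privileges.

```shell
# Keep only lines that mention common problem keywords (case-insensitive).
kernel_problems() {
    grep -iE 'segfault|error|fail|out of memory'
}

# Show the 20 most recent matching kernel messages, if any.
dmesg 2>/dev/null | kernel_problems | tail -n 20 || true
```

To page through the full output instead, use ‘dmesg | less’.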

 

$ w
11:59:20 up 1:19, 1 user, load average: 1,70, 1,42, 1,28
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
cialu tty2 tty2 10:40 1:19m 47:09 3:41 /usr/lib/firefox

It is a bit redundant, because the ‘w’ command combines several other Unix programs: ‘who’, ‘uptime’ and ‘ps -a’. It provides a quick summary of every user logged into the system, what each user is currently doing, and how much load all that activity is putting on the computer.