Tag: linux

Out of Memory (OOM) in Linux machines

I have not posted anything on my blog for some months; I must admit that I was really busy. Recently, a friend asked me about the Out of Memory messages in Linux. How are they generated? What are the consequences? How can they be avoided on Linux machines? There is no single answer, as each case needs an investigation to establish the root cause. Before getting into the details of OOM, let’s be clear that whenever the kernel is starved of memory, it will start killing processes. Linux administrators run into this often, and one of the fastest ways to get rid of it is to add extra swap. However, this is not a definitive way of solving the issue. A preliminary assessment needs to be carried out, followed by an action plan and a rollback methodology.

If the kernel cannot free up memory after killing some processes, the result can be a kernel panic, deadlocks in applications, kernel hangs, or several defunct processes on the machine. I know of cases where the machine changed run level. There are also cases of kernel panic in virtual machines where the cluster is not healthy. In brief, the OOM killer is a subsystem that kills one or more processes with the aim of freeing memory. In the article Linux kernel crash simulation using kdump, I explained how to activate Kdump to generate a vmcore for analysis. However, to send the debug messages from an out of memory error to the vmcore, a sysctl parameter needs to be configured. I will be using a CentOS 7 machine to illustrate the OOM parameters and configurations.

1. To activate OOM debug in the vmcore file, set the parameter vm.panic_on_oom to 1 using the following command:

sysctl -w vm.panic_on_oom=1

To verify that the configuration has been applied, run sysctl -a | grep -i oom. It is not recommended to test such parameters in a production environment.

2. To decide which process to kill, the kernel calls a function in the kernel code called badness(). The badness() function calculates a numeric value for how bad each task has been. To be precise, it accumulates “points” for each process running on the machine and returns them to a function called select_bad_process() in the Linux kernel. This eventually triggers the OOM mechanism, which kills the selected processes. The “points” are exposed in /proc/<pid>/oom_score. For example, here I have a server running Java.

As you can see, the process number is 2153. The oom_score is 21
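To see which processes currently carry the highest scores, the /proc/<pid>/oom_score files can be read in a loop. This is a minimal sketch, not a standard tool: the function name top_oom_scores and its optional first argument (an alternative proc directory, handy for trying it out on a copy) are assumptions made for illustration.

```shell
# List processes with the highest OOM scores as "score pid name".
# $1: proc directory (hypothetical override, defaults to /proc)
# $2: number of lines to show (defaults to 5)
top_oom_scores() {
    local proc_root="${1:-/proc}"
    local limit="${2:-5}"
    local dir pid score name
    for dir in "$proc_root"/[0-9]*; do
        # Skip entries that vanished or that we are not allowed to read.
        [ -r "$dir/oom_score" ] || continue
        pid="${dir##*/}"
        score="$(cat "$dir/oom_score" 2>/dev/null)" || continue
        name="$(cat "$dir/comm" 2>/dev/null)"
        printf '%s %s %s\n' "$score" "$pid" "${name:-?}"
    done | sort -rn | head -n "$limit"
}
```

Running top_oom_scores with no arguments on a live machine prints the five processes the OOM killer would currently consider first.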

3. Many factors are taken into account when calculating the badness score. Some of them are the virtual memory size (VM size), the priority of the process (nice value), the total runtime, the running user and /proc/<pid>/oom_adj. You can also set the oom_score_adj value of any PID to a number between -1000 and 1000. The lowest possible value, -1000, is equivalent to disabling OOM killing entirely for that task, since it will always report a badness score of 0.

4. Let’s assume that you want to prevent a specific process from being killed.

echo -17 > /proc/$PID/oom_adj
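On newer kernels the same protection is expressed through oom_score_adj, where -1000 plays the role that -17 plays for oom_adj. The following is a hedged sketch of a small helper that validates the range before writing; the function name set_oom_score_adj and its optional third argument (an alternative proc directory, used only for dry runs) are hypothetical.

```shell
# Write an oom_score_adj value for one PID after checking the range.
# $1: PID, $2: value between -1000 and 1000
# $3: proc directory (hypothetical override, defaults to /proc)
set_oom_score_adj() {
    local pid="$1"
    local value="$2"
    local proc_root="${3:-/proc}"
    # Accept only an optional leading minus sign followed by digits.
    case "$value" in
        ''|-|*[!0-9-]*|?*-*)
            echo "invalid value: $value" >&2
            return 1
            ;;
    esac
    if [ "$value" -lt -1000 ] || [ "$value" -gt 1000 ]; then
        echo "value out of range (-1000..1000): $value" >&2
        return 1
    fi
    echo "$value" > "$proc_root/$pid/oom_score_adj"
}
```

For example, set_oom_score_adj "$PID" -1000 run as root makes the process immune, which should be used as sparingly as the oom_adj variant.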

5. If you know the process name of the SSH daemon and do not want it to be killed, use the following command:

pgrep -f "/usr/sbin/sshd" | while read PID; do echo -17 > /proc/$PID/oom_adj; done

6. To keep sshd protected automatically, use a cron entry that runs every minute:

* * * * * root pgrep -f "/usr/sbin/sshd" | while read PID; do echo -17 > /proc/$PID/oom_adj; done

7. Let's now simulate the OOM killer messages. Use the following command to trigger an out of memory event on the machine:

echo f > /proc/sysrq-trigger

You will notice an OOM error message in /var/log/messages.
As you can see here, PID 843 was scored by the OOM killer before it was killed.
The score is 4 in our case.

Before the 'Out of memory' error, there will be a call trace which will be sent by the kernel.
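To pull the PID, process name and score out of the log, a small sed filter is enough. This sketch assumes the message wording used by the 3.x kernels shipped with CentOS 7 ("Out of memory: Kill process <pid> (<name>) score <n> or sacrifice child"); other kernel versions word the line differently, so the pattern may need adjusting. The function name oom_kills is hypothetical.

```shell
# Print "pid=... name=... score=..." for each OOM kill line.
# Reads the files given as arguments, or standard input if none are given.
# Assumes the "Out of memory: Kill process PID (NAME) score N" wording.
oom_kills() {
    sed -n 's/.*Out of memory: Kill process \([0-9][0-9]*\) (\([^)]*\)) score \([0-9][0-9]*\).*/pid=\1 name=\2 score=\3/p' "$@"
}
```

Typical use would be oom_kills /var/log/messages, or dmesg | oom_kills.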

8. To monitor how the OOM killer is generating scores, you can use the dstat command. To install the dstat package on an RPM-based machine, use:

yum install dstat

or, for a Debian-based distribution, use:

apt-get install dstat

Dstat is used to generate resource statistics. To use dstat to monitor the score from the OOM killer, use:

dstat --top-oom


  • oom_score_adj is used in newer Linux kernels. oom_adj is the deprecated interface from older kernels.
  • Disabling the OOM killer under heavy memory pressure may cause the system to kernel panic.
  • Making a process immune is not a definitive way of solving the problem, for example with a Java application. Use a thread/heap dump to analyse the situation before making a process immune.
  • Dstat is becoming an alternative to vmstat, netstat, iostat, ifstat and mpstat. For example, to monitor CPU, disk and network together with the top CPU and memory consumers, use dstat -c --top-cpu -dn --top-mem
  • Testing in a production environment should be avoided!

Hackers.mu attracted a massive crowd at the DevConMru 2017

This is yet another dazzling inspiration that hackers.mu brought to the minds of the audience today, the 1st of April 2017, at the DevConMru – Day 2. After the mesmerising speech at the DevConMru by Logan, this time Codarren Velvindron, core member of hackers.mu, filled the conference room with attendees. Fast Coding Skills – a well chosen topic, especially for the curious ones, beginners or professionals who want to remove the barrier between the code and themselves. Codarren started the presentation by giving some examples of the applications he has ventured into, for example MariaDB.

The room was full, with over fifty attendees. While some were sitting on the floor, others leaned up against the wall, focussed on Codarren. I heard someone from the crowd murmuring “I want to be a hacker”. 🙂


Several analogies were brought to the attention of the audience, such as the difficulties one encounters whilst coding. Tips and tricks to get relief from these difficulties were offered, such as playing with the code, breaking a huge task into parts and analysing each of the smaller parts. Another way to understand how the code works is by “deleting” part of it, after a backup, to see how it behaves differently. Codarren also shared his experience of the IETF hackathon in which he participated.

Here are Codarren’s slides from the DevConMru 2017:

Fast Coding Skills by Codarren Velvindron on Scribd

At the end, we thanked Codarren for the job done. Members of hackers.mu kept on responding to people from the audience who were showing interest in coding. Some questions from the audience were about the challenges faced in the IETF hackathon as well as Codarren’s favourite programming language. “Talk is cheap, show me the code” – Linus Torvalds.

Hackers.mu mesmerising speech at the DevConMru 2017

The message was clear and direct at the DevConMru 2017. With a humorous approach, Loganaden Velvindron, #2 of hackers.mu, showed the students, who made up most of the audience, how to bridge the gap between themselves and their goals in the IT industry and Linux in Mauritius.

The DevConMru is a yearly event that brings together developers, beginners, students and professionals. The goal is to bring more craftsmen under the same roof. “Mauritius has been branded “Cyber Island” in the Indian Ocean… Opinions in those matters vary but with this conference we strive to improve the general attribution of our island. Mauritius has great political stability and economical advantages for foreign investors, and the most precious resource Mauritius has to offer is people’s knowledge. The ICT sector in Mauritius is growing since years and maturing as the fourth pillar of our economy. With its geographical position Mauritius is also welcome as a business and knowledge hub between Africa and Asia.” (MSCC)

In today’s Mauritius IT industry, everyone wants to have a better standard of living. But how? How do we build a successful IT industry? Are we moving in the right direction? Are foreign investors attracted by the quality of the Mauritius IT industry? Logan did not miss those points in getting the audience on track.

Photo Credits: Hackers.mu

After giving a brief intro of the hackers.mu team, Logan explained the requirements and life cycles of IT companies and their profits, and whether these are in line with what fresh IT graduates and professionals offer. A vivid example came from analyzing the statistics of GitHub accounts in Mauritius, comparing the quality and quantity of code contributions with Singapore. Students were encouraged to publish their coding exercises on GitHub, create a blog and take part in Google Code-in.

Indeed, hackers.mu works towards such goals, for example through participation in and mentoring for the Google Code-in. Several hackathons were organised. Contributions were made to real world applications such as pfSense, OpenSSH, OpenSSL, OpenBSD, LibArchive, Firejail, Linux and others. The list is long. An award was also received during the IETF 98 Hackathon.

Logan at the DevConMru 2017

The slides can be viewed here or on the Scribd website.

I was impressed by how Avinash Meetoo, honorary member of hackers.mu, hacked the audience and shed some light to encourage the students. Avinash talked about his passion for blogging and the importance of projecting his personality with the right vision.

At hackers.mu, we invite many to join us, but one has to work hard to attain a certain level of professionalism. After the presentation, many came to congratulate us for the job done. We stayed around chatting with many people, sharing our work and job experience as well as the passion for coding. I once read a phrase in an old book as follows: “You are what you eat”. But things have changed now, because You are what You CODE!!

SAR command daily tips and tricks

As promised on Twitter a few days back, I am posting some interesting tips and tricks using the SAR (System Activity Report) Linux command. The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. The accounting system, based on the values in the count and interval parameters, writes information the specified number of times spaced at the specified intervals in seconds. If the interval parameter is set to zero, the sar command displays the average statistics for the time since the system was started. (source: die.net)

Understanding SAR and its main configuration files

The SAR command is part of the sysstat package, a multi-purpose analysis tool that is useful for pinpointing specific issues related to CPU, memory, I/O and network. The command is especially useful for plotting the output on a graph for visual analysis and reporting. One example of such a tool is GNUplot. To install GNUplot and SAR, use the command yum install sysstat gnuplot. The configuration files of SAR are located at /etc/cron.d/sysstat and /etc/sysconfig/sysstat. If you run rpm -ql sysstat | less, you will notice that there are other binaries such as iostat, mpstat, pidstat etc. that come along with the sysstat package.

Difference between SA and SAR logs

In the file /etc/cron.d/sysstat you will notice that a cron job has been set up by default to run every ten minutes. Its purpose is to write a log in the directory /var/log/sa. In this directory there are two types of files, starting with sa and sar. The sar file is a text file while the sa file is a binary. The binary sa file is updated every 10 minutes whereas the sar file is written at the end of the day. This schedule is configured in the cron file itself. By using the file command you can tell which one is a binary and which is a text file.

To open the sa file, you need to use the command sar -f . Here is an example:

The /etc/sysconfig/sysstat file allows you to configure how long you want to keep the logs, their compression, etc.

Some ways to use the SAR command

You can also watch the statistics live by giving the sar command an interval in seconds and a count. Let’s say you have started a command in the background or simply want to track the resource status for a few seconds: sar 1 3 reports every second, three times. Here is an example with the command:

sar 1 3

To check the load average, we apply the same principle but with the following command. The output includes the run queue length and the 1, 5 and 15 minute load averages.

sar -q 1 3

To check the memory being consumed each second, use the following command:

sar -r 1 3

To check the number of pages being swapped in and out of the swap space per second, use the following command:

sar -W 1 3

For the disk I/O reads and writes per second, use the following command. Read/write performance on disk also depends on the hardware.

sar -b 1 3

For information about the CPU, use the following:

sar -u 1 3

To monitor the network activity in terms of packets in and out per interface received and compressed, use the following command

sar -n DEV 1 3

If the sar file has not been generated yet from the binary, you can use the following command, let’s say to convert it to a text in the /tmp directory.

sar -A -f  sa25 > /tmp/sar25
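The same conversion can be looped over every binary file in /var/log/sa. This is only a sketch: the function name convert_sa_logs and the DRY_RUN variable are hypothetical helpers, and the sa followed by two digits file naming is assumed from the example above.

```shell
# Convert every binary saNN file in a directory to a text sarNN file.
# $1: source directory (defaults to /var/log/sa)
# $2: output directory (defaults to /tmp)
# DRY_RUN (hypothetical): if set, print the commands instead of running them.
convert_sa_logs() {
    local src="${1:-/var/log/sa}"
    local out="${2:-/tmp}"
    local f day
    for f in "$src"/sa[0-9][0-9]; do
        [ -f "$f" ] || continue
        day="${f##*/sa}"    # keep only the two-digit day, e.g. 25
        if [ -n "${DRY_RUN:-}" ]; then
            echo "sar -A -f $f > $out/sar$day"
        else
            LANG=C sar -A -f "$f" > "$out/sar$day"
        fi
    done
}
```

Running DRY_RUN=1 convert_sa_logs first shows what would be written before any file is touched.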

KSAR Graph with SAR logs

Now, in a production environment, you may need to analyze, for example, a specific time when memory or CPU usage was high. This can be done by means of a graph. I use the Ksar program. Ksar is a BSD-licensed Java-based application which creates graphs based on sar logs. You will need to install a Java runtime and launch the run.sh script that comes with Ksar. Once it is downloaded and running, just click on ‘Data’ and ‘Load from text file’. This is an example of swap usage:


  • The SAR output by default is in 12 hour clock format. To switch to a 24 hour clock format, edit the .bashrc file and add the line alias sar="LANG=C sar"
  • GNUPlot is another application for plotting your information on a graph for better analysis.

What goes on behind the Network Time Protocol?

Several times, I have had discussions with friends on how NTP works! What is the logic behind NTP and its configuration? We noticed that there are several terms and calculations to grasp, especially when it comes to debugging. So I decided to do some research on it and share some ideas on NTP – Network Time Protocol. Recently, we have also seen several vulnerabilities and attacks targeting NTP servers. NTP is a protocol designed to synchronize the clocks of computers over a network.

Photo credits: networktimefoundation.org

NTP matters wherever timestamps are used. Examples are logs, database replication and the time packets exchanged in a network. NTP uses its own binary format and runs on UDP port 123. RFC 1305 and RFC 2030 give detailed explanations of NTP.

The logic behind NTP

In brief, packets are exchanged between the NTP server and the client in the following order. It is to be noted that latency is an important issue when it comes to NTP:

  1. The NTP client will send a request with a timestamp.
  2. The NTP server will return the packet with 3 timestamps.
    • An echo of the client timestamp
    • The timestamp at which the server received the request
    • The timestamp at which the server sent the response
  3. The client will then estimate the offset (the difference in timestamps between the client and the server)

A client may have several NTP servers configured, but will synchronize with only one NTP server. A server may also take some time to respond to a client, depending on how busy it is. NTP packets are exchanged every 64 to 1024 seconds for each server. These settings are called “minpoll” and “maxpoll”.

Some basic configuration on a CentOS 7 machine

The command timedatectl can be used to check if NTP is enabled or not

[[email protected] ~]# timedatectl | egrep -i ntp
     NTP enabled: yes
NTP synchronized: yes

You can also check if the service is running using the following command:

systemctl status chronyd.service

The configuration file is located by default at /etc/chrony.conf . You will notice that the NTP servers are configured by default in that file.

[[email protected] ~]# head -n 7 /etc/chrony.conf 

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst

You can check whether your machine is synchronised with its sources using the following command. The column Name/IP address shows the servers from which the time is being synchronized.

[[email protected] ~]# chronyc sources

210 Number of sources = 4
MS Name/IP address         Stratum Poll Reach LastRx Last sample
^+ cpt-ntp.mweb.co.za            2   6   367    40    +29ms[  +29ms] +/-  151ms
^* cpto-afr-01.time.jpbe.de      2   7   377    43    +69ms[  +66ms] +/-  178ms
^+ ntp2.inx.net.za               2   6   377    43    +64ms[  +64ms] +/-  181ms
^+ ns.bitco.co.za                3   7   377   107    +79ms[  +77ms] +/-  199ms
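In this output, the line starting with ^* is the source the client is currently synchronized to. A short awk filter can extract it, for instance for a monitoring check. This is a sketch that assumes the column layout shown above; the function name selected_source is hypothetical.

```shell
# Print the name of the currently selected time source, i.e. the line
# whose first column is "^*" in the output of "chronyc sources".
# Reads the files given as arguments, or standard input if none are given.
selected_source() {
    awk '/^\^\*/ { print $2 }' "$@"
}
```

It would be used as chronyc sources | selected_source.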

A drift means a deviation. A drift happens when the hardware clock is either fast or slow compared to the NTP server clock. The drift file contains 2 values. If the first value is a positive number, the clock is fast compared to the NTP server, whereas if it is a negative number, the clock is slow compared to the NTP server. Here I have a slow clock.

[[email protected] ~]# cat /var/lib/chrony/drift 

         -870.203668          1484.980507

The Maths - How the NTP client adjusts its clock from the NTP server response

Now, we will get into the math behind the logic discussed previously: what happens when the NTP client requests the time from the NTP server.

  1. NTP client A sends a request to server X. Let’s assume A = 100, where 100 is the client’s time when the request is sent.
  2. NTP server X receives the request after some seconds. Let’s assume X = 150, where 150 is the server’s time when the request arrives.
  3. Since the request from the NTP client is not necessarily served immediately, there is a lapse of time at this point. Let’s assume the server’s time when it sends the response is Y = 160.
  4. We now have 3 values: the time the client sent the request (A), the time the server received the request (X) and the time the server sent the response (Y).
  5. The NTP client receives the response at B = 120. This value is lower than the server timestamps because the NTP client is reading its own clock.
  6. The client now determines the round-trip time using the formula (B - A) - (Y - X), which means (120 - 100) - (160 - 150) = 10 seconds.
  7. The client assumes that the response took half of that to travel from the server to the client: 10 / 2 = 5 seconds. This 5 seconds is the latency.
  8. The client adds 5 seconds to the time the server sent the response, which gives 160 + 5 = 165 seconds as the server time at the moment the response arrived.
  9. The client knows it needs to add 45 seconds to its clock: 165 - 120 = 45 seconds is the difference between the client and the server clock, so the client will set its clock forward by 45 seconds. The rate at which the clock gains or loses time is what is recorded in the drift file, in PPM – Parts Per Million.
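The arithmetic in these steps can be reproduced with shell integer arithmetic. This is a simplified sketch of the calculation described above, not the code a real NTP client runs; the function name ntp_offset is hypothetical, and A, X, Y and B are the four timestamps from the steps (client send, server receive, server send, client receive).

```shell
# Compute round-trip delay and clock offset from the four timestamps.
# $1: A, client time when the request was sent
# $2: X, server time when the request was received
# $3: Y, server time when the response was sent
# $4: B, client time when the response was received
ntp_offset() {
    local a="$1" x="$2" y="$3" b="$4"
    local delay one_way server_now offset
    delay=$(( (b - a) - (y - x) ))      # time spent on the network, both ways
    one_way=$(( delay / 2 ))            # assume the path is symmetric
    server_now=$(( y + one_way ))       # server time when the response arrived
    offset=$(( server_now - b ))        # how far the client clock is behind
    echo "delay=$delay one_way=$one_way offset=$offset"
}

ntp_offset 100 150 160 120   # prints: delay=10 one_way=5 offset=45
```

Integer division is used here for simplicity, so an odd round-trip delay is rounded down.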


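  • The iburst parameter makes the client send a quick burst of requests when it first contacts a server, which speeds up the initial synchronization. If the iburst parameters are removed, the first synchronization takes longer.
  • You can also increase the verbosity of the command chronyc sources by adding the parameter -v to it, and a detailed explanation of the values will be given, i.e. chronyc sources -v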
  • You can pick up different NTP servers from the NIST website and restart the NTP service (chronyd).
  • NTP was invented by David L. Mills in 1981 and it is based on Marzullo’s algorithm to get accurate time from several sources.
  • Timestamps of NTP are stored in seconds and it is 64 bit in size – 32 bit for number of seconds and 32 bit for fraction of seconds.
  • Number in drift file is measured in PPM – Parts per million
  • Offset – The difference in timestamps between the client and the server
  • Burst – If “burst” is used, the NTP client sends a burst of eight packets to the NTP server at each poll instead of the usual one.
  • The drift value is in PPM – Parts Per Million.
  • Converting from PPM is easy. Since we have 86,400 seconds in a day, 1 PPM corresponds to 86,400 / 1,000,000 = 0.0864 seconds per day.
  • If my drift file shows a value of Z where Z = 30.3, simply do (30.3 x 0.0864) to get the drift in seconds per day, which is about 2.62 seconds per day.
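The conversion from the last two points can be scripted with awk, which handles the decimal arithmetic that plain shell cannot. This is a minimal sketch; the function names ppm_to_seconds_per_day and drift_direction are hypothetical, and the sign convention (positive means a fast clock) is the one described earlier in this post.

```shell
# Convert a drift value in PPM to seconds gained or lost per day.
ppm_to_seconds_per_day() {
    awk -v ppm="$1" 'BEGIN { printf "%.3f\n", ppm * 86400 / 1000000 }'
}

# Say whether a drift value means a fast or a slow clock.
drift_direction() {
    awk -v v="$1" 'BEGIN {
        if (v > 0) print "fast"
        else if (v < 0) print "slow"
        else print "exact"
    }'
}

# Example with the first value of the drift file:
#   read -r rate error < /var/lib/chrony/drift
#   ppm_to_seconds_per_day "$rate"; drift_direction "$rate"
```

With the value 30.3 used above, ppm_to_seconds_per_day prints 2.618.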