Tag: centos

Out of Memory (OOM) in Linux machines

Since some months I have not been posting anything on my blog. I should admit that I was really busy. Recently, a friend asked me about the Out of Memory messages in Linux. How is it generated? What are the consequences? How can it be avoided in Linux machines? There is no specific answer to this as an investigation had to be carried out to have the Root Cause Analysis. Before getting into details about OOM, let’s be clear that whenever the Kernel is starved of memory, it will start killing processes. Many times, Linux administrators will experience this and one of the fastest way to get rid of it is by adding extra swap. However, this is not the definite way of solving the issue. A preliminary assessment needs to be carried out followed by an action plan, alongside, a rollback methodology.

If after killing some processes, the kernel cannot free up some memory, it might lead to a kernel panic, deadlocks in applications, kernel hungs, or several defunct processes in the machine. I know cases where the machine change run level mode. There are cases of kernel panic in virtual machines where the cluster is not healthy. In brief, OOM is a subsystem to kill one or more processes with the aim to free memory. In the article Linux kernel crash simulation using kdump, I gave an explanation how to activate Kdump to generate a vmcore for analysis. However, to send the debug messages during an out of memory error to the vmcore, the SYSCTL file need to be configured. I will be using a CentOS 7 machine to illustrate the OOM parameters and configurations.

1.To activate OOM debug in the vmcore file, set the parameter vm.panic_on_oom to 1 using the following command:

systctl -w vm.panic_on_oom=1

To verify if the configuration has been taken into consideration, you can do a sysctl -a | grep -i oom. It is not recommended to test such parameters in the production environment.

2. To find out which process the kernel is going to kill, the kernel will read a function in the kernel code called badness() . The badness() calculate a numeric value about how bad this task has been. To be precise, it works by accumulating some “points” for each process running on the machine and will return those processes to a function called select_bad_process() in the linux kernel. This will eventually start the OOM mechanism which will kill the processes. The “points” are stored in the /proc/<pid>/oom_score. For example, here, i have a server running JAVA.

As you can see, the process number is 2153. The oom_score is 21

3. There are lots of considerations that are taken into account when calculating the badness score. Some of the factors are the Virtual Memory size (VM size), the Priority of the Process (NICE value), the Total Runtime, the Running user and the /proc/<pid>/oom_adj. You can also set up the oom_score_adj value for any PID between -1000 to 1000. The lowest possible value, -1000, is equivalent to disabling OOM killing entirely for that task since it will always report a badness score of 0.

4. Let’s assume that you want to prevent a specific process from being killed.

echo -17 > /proc/$PID/oom_adj

5. If you know the process name of SSH Daemon and do not it from being killed, use the following command:

pgrep -f "/usr/sbin/sshd" | while read PID; do echo -17 > /proc/$PID/oom_adj; done

6. To automate the sshd from being killed through a cron which will run each minute use the following:

* * * * * root pgrep -f "/usr/sbin/sshd" | while read PID; do echo -17 > /proc/$PID/oom_adj; done
7. Let's now simulate the OOM killer messages. Use the following command to start an out of memory event 
on the machine.
echo f > /proc/sysrq-trigger 

You will notice an OOM error message in the /var/log/messages.
As you can notice here, the PID 843 was calculated by the OOM killer before killing it. 
There is also the score number which is 4 in our case.


Before the 'Out of memory' error, there will be a call trace which will be sent by the kernel.



8. To monitor how the OOM killer is generating scores, you can use the dstat command. To install the dstat 
package on RPM based machine use: 
yum install dstat 

or for debian based distribution use:
apt-get install dstat

Dstat is used to generate resource statistics. To use dstat to monitor the score from OOM killer use:
dstat -top-oom


TIPS:

  • oom_score_adj is used in new linux kernel. The deprecated function is oom_adj in old Linux machine.
  • When disabling OOM killer under heavy memory pressure, it may cause the system to kernel panic.
  • Making a process immune is not a definite way of solving problem, for example, when using JAVA Application. Use a thread/heap dump to analyse the situation before making a process immune.
  • Dstat is now becoming an alternative for vmstat, netstat, iostat, ifstat and mpstat. For example, to monitor CPU in a program, use dstat -c –top-cpu -dn –top-mem
  • Testing in production environment should be avoided!

SAR command daily tips and tricks

As promised on Twitter days back, I would post some interesting tips and tricks using the SAR (System Activity Report) linux command. The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. The accounting system, based on the values in the count and interval parameters, writes information the specified number of times spaced at the specified intervals in seconds. If the interval parameter is set to zero, the sar command displays the average statistics for the time since the system was started.die.net 

Understanding SAR and its main configuration files

The SAR command is part of the sysstat package which is a multi-purpose analysis tool and it is useful to pin point specific issue related to CPU, Memory, I/O and Network. The command is really useful especially to plot the output on a graph for visual analysis and reporting. One example of such tool is GNUplot. To install GNUplot and SAR use the command yum install sysstat gnuplot. The configuration file of SAR is located at /etc/cron.d/sysstat and /etc/sysconfig/sysstat . If you would perform a rpm -ql sysstat | less , you would noticed that there are other binaries such as iostat,mpstat,pidstat etc.. that comes along with the package sysstat.

Difference between SA and SAR logs

In the directory /etc/cron.d/sysstat you would noticed that there is a cron which have been set up by default to run every ten minutes. The purpose is to write a log in the directory /var/log/sa . In this directory there are two type of files starting with sa and sar. SAR is the text file while SA is a binary. The sa file – Binary is updated every 10 minutes whereas the sar file is written at the end of the day. This parameter have been configured in the cron itself. By using the file command you would know which one is a binary or a text file.

To open the sa file, you need to use the command sar -f . Here is an example:

The /etc/sysconfig/sysstat file allows you to configure how long you want to keep the log, compression etc..

Some ways to use the SAR command

You can also visualize sar logs live using the sar command with the start and ending second. Let’s say you have run a command on the background or simply want to track the resource status for some seconds, you can use the command sar 1 3 Here is an example with the command

sar 1 3

Checking the load average, we apply the same principle but with the following command. The load average will also include load on each processor.

sar -q 1 3

To check for memory being consumed per seconds, use the following command

sar -r 1 3

To check number of memory coming in and out of the swap space, use the following command

sar -W 1 3

For the Disk I/O read write per seconds use the following command. Read/Write on disk also depend on the hardware

sar -b 1 3

For info about the CPU use the following

sar -u 1 3

To monitor the network activity in terms of packets in and out per interface received and compressed, use the following command

sar -n DEV 1 3

If the sar file has not been generated yet from the binary, you can use the following command, let’s say to convert it to a text in the /tmp directory.

sar -A -f  sa25 > /tmp/sar25

KSAR Graph with SAR logs

Now, in production environment, you need to to analyze for example at a specific time where memory or CPU was high. This can be done by means of a graph. I use the Ksar program. Ksar is a BSD-Licensed JAVA based application which create graph based on sar logs. You will need to install JAVA Runtime and launch the run.sh script to install the Ksar program. Once downloaded, just click on ‘Data’ and ‘Load from text file’. This is an example of swap usage

TIPS:

  • The SAR output by default is in 12 hour clock format. To make it become a 24 hour clock format edit the .bashrc file and insert the parameter alias sar=”LANG=C sar
  • GNUPlot is one another application to plot your information on a graph for better analysis.

A simple and fast Memcache implementation on WordPress

Memcache is an in-memory key-value pairs data store. The data store lives in the memory of the machine. That is why it is much faster compared to other data stores. The only thing that you can put in memcache are key-value pairs. You can also put anything that is serializable into memcache either as a key or a value. In general, there are two major use cases such as Caching and Sharing Data Cross App Instances. In caching, datastore query results are kept or a user authentication token and session data. There are also APIs call that can also be cached. URL fetching can also be kept in memcache. Content of a whole page can also be kept in memcache. 

Results can be ten times much faster as explained by this benchmark test carried out by Google Developers team some times back.

Photo credits: Google Developers Team
Photo credits: Google Developers Team

1.For installation on a wordpress website, thats simple straight forward. On CLI, simply do a

yum install memcached

2. By default, memcache configurations on Centos7 is found at /etc/sysconfig/memcached as follows

[[email protected] ~]# cat /etc/sysconfig/memcached 
PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="64"
OPTIONS=" "

3. To check if your memcache is running correctly, use the following command

echo stats | nc localhost 11211

4. Now, login to the plugin area and download the WP-FFPC plugin and activate it.

Screenshot from 2016-09-04 13-35-27

5. In the wp-config file of the wordpress CMS, define the following parameters

define('WP_CACHE', TRUE);

6. Save changes in the WP-FFPC plugins settings. On your server, you can now see the caching being done

[[email protected]]# echo stats | nc localhost 11211 | grep total_items
STAT total_items 32

And, here you are. A simple and fast memcache implementation on your WordPress website.

Note: By activating a wordpress plugin, you should make sure that proper auditing have been carried out as these days there are several vulnerabilities present on WordPress plugins.

Backup, Restore and work on PostgreSQL cmd line

Last time, we have a look how to install and set up PostgreSQL on a CentOS machine as well as to create a database. Let’s now see some more details about backups and database restore as well as some commands to master the basic of psql on command line.

Screenshot from 2016-03-31 05-31-15

1.We have created a database called test. Here is an example  how you create a table on Postgres

test=# create table hackers( id int primary key not null, country char(20), number int not null, project varchar(50) );

Screenshot from 2016-03-31 05-41-13

2. You can view your table with the command \d or \d <table name>

Screenshot from 2016-03-31 05-42-38

3. To perform a dump of a database log in as the user which have access to the database and launch the following command.

pg_dump test > db_backup.sql

To restore the same backup on a postgres machine, you need to have the database created first

psql test < db_backup.sql

4. A little work around of psql around the database for example accessing directly the database test with the command psql test and to list all the database can be done with \l

Screenshot from 2016-03-31 06-04-24

5. Let’s say you want to switch between database test and toto. This can be done with \c test and \c toto

Screenshot from 2016-03-31 06-07-05

6. To show all tables inside the database can be done with the command \dt

Screenshot from 2016-03-31 06-08-30

7. The schema of the table can also be seen with the command \d+ <table name> Here, it will be \d+ hackers

Screenshot from 2016-03-31 06-11-19

8. All users and permissions can be viewed with \du

Screenshot from 2016-03-31 06-15-18

9. Databases are create and drop inside the psql command line ending with a semi column

Screenshot from 2016-03-31 06-19-19

10. You can also alter user role with superuser and non superuser

Screenshot from 2016-03-31 06-21-51

Install, Setup and Create DB on PostgreSQL

PostgreSQL is yet another open source object relational database system. Its compatible with almost all operating system including BSD, Windows and Linux. I will be using a Centos7 machine to install a PostgreSQL and set up some basics of PostgreSQL. In a next article, i will give some idea of something more robust you can do with PostgreSQL.

Photo Credits: postgresql.org
Photo Credits: postgresql.org

1.You can install PostgreSQL from the repository with the following command

yum install postgresql-server postgresql-contrib

2. Now the first action you need to perform after PostgreSQL server installation is to initialize the database by creating a new database cluster

postgresql-setup initdb

3. You can now start the postgresql service. Postgres will be listening on port 5432

systemctl start postgresql

4. Now that we have PostgreSQL installed on our machine, we can create the first super user with following command. Let’s called the user test. Its a practice to use the username having sudo privilege on the machine itself.

sudo -u postgres createuser --superuser  test

5. To connect on the Postgres command line use the following. postgres is the default user and psql is what u what to run

sudo -u postgres psql

6. You should have something similar to this.

Screenshot from 2016-03-30 21-21-11

7. Now you can set the password for the superuser that you have just created. In my case it is user ‘test’ at step 4

\password test

8. Once you have been prompted to enter the password twice means that you have already set up PostgreSQL. You can exit with the command

\q

9. To connect to the default postgres database you simply need to use the command. To quit follow step6

psql postgres

10. Lets create a database with the superuser. You might need to add your user to the group postgres with the command usermod -g test postgres as postgres will need permission to access your home directory to drop the .psql_history file

sudo -u postgres createdb test

11. To get on the command line you just need to type psql which should show you something similar to this

Screenshot from 2016-03-30 21-25-49