Debugging disk issues with blktrace, blkparse, btrace and btt in Linux environment

blktrace is a block layer I/O tracing mechanism that provides detailed information about request queue operations up to user space. It has three major components: a kernel component, a utility that records the I/O trace information from the kernel to user space, and utilities to analyse and view the trace information. The blktrace utility itself records the I/O event trace information for a specific block device to a file.

 

Photo credits : www.brendangregg.com

Limitations and Solutions with blktrace

There are several limitations and workarounds when using blktrace. We will focus mainly on its purpose and on how the blktrace tool can help us in our day-to-day tasks. Assuming that you want to debug an I/O related issue, the first command will probably be iostat. These utilities can be installed with yum install sysstat blktrace. For example:

iostat -tkx -p 1 2

The limitation of iostat is that it does not tell us which process is generating which I/O. To resolve this issue, we can use another utility such as iotop. Here is an example of the iotop output.

blktrace, iotop, blkparse, btt and btrace

Here iotop shows us exactly which process is consuming I/O and how much. The problem with this solution is that it does not give detailed information. This is where blktrace comes in handy, as it gives layer-wise information. How does it do that? It sees exactly what is going on inside the block I/O layer. When used correctly, it can generate events for every I/O request and let you monitor where each request originates. Although it extracts data from the kernel, it is not an analysis tool, and the interpretation of the data has to be carried out by you. However, you can feed the data to btt or blkparse to get the analysis done.

Before looking at blktrace, let’s check out the I/O architecture. Basically, a user process in user space reads and writes data. The process does not write directly to the disk. It first writes to the VFS page cache, from which the I/O scheduler and the device driver interact with the disk to write the data.

Photo credits: msreekan.com

The blktrace will normally capture events during the process. Here is a cheat sheet to understand blktrace event capture.

photo credits: programering.com

blkparse will parse and format events acquired from blktrace. If you do not want to run blkparse, btrace is a shortcut to generate data out of blktrace and blkparse. Finally we have btt which will analyze data from blktrace and generate time deltas for each layer.

Another tool to grasp before moving on with blktrace is debugfs, a simple-to-use RAM-based file system specially designed for debugging purposes. It exists as a simple way for kernel developers to make information available to user space. Unlike /proc, which is only meant for information about a process, or sysfs, which has strict one-value-per-file rules, debugfs has no rules at all. Developers can put any information they want there. (Source: LWN)

So the first thing to do is to mount the debugfs file system using the following command:

mount -t debugfs debugfs /sys/kernel/debug

The aim is to allow a kernel developer to make information available in user space. You can run the mount command without arguments to verify that the file system has been mounted successfully. Now that you have the debug file system, you can capture the events.
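As a small sketch of that verification step, the snippet below scans text in the format of /proc/mounts and returns the mount points of any debugfs entries. The function name debugfs_mount_points and the sample text are my own illustrations, not part of any tool.

```python
def debugfs_mount_points(mounts_text):
    """Return the mount point of every debugfs entry in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "debugfs":
            points.append(fields[1])
    return points


# Illustrative sample; on a real machine read the text from /proc/mounts instead.
SAMPLE_MOUNTS = """\
sysfs /sys sysfs rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
"""

if __name__ == "__main__":
    print(debugfs_mount_points(SAMPLE_MOUNTS))
```

An empty list would mean that debugfs is not mounted yet.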

Diving into the commands

1. Use blktrace to trace the I/O on the machine.

blktrace -d /dev/sda -o -|blkparse -i -

2. At the same time, on another console launch the following command to generate some I/O for testing purpose.

dd if=/dev/zero of=/mnt/test1 bs=1M count=1

From the blktrace console you will get an output which will end up as follows :

CPU0 (8,0):
 Reads Queued:           2,       60KiB Writes Queued:       5,132,   20,524KiB
 Read Dispatches:        2,       60KiB Write Dispatches:       61,   20,524KiB
 Reads Requeued:         0 Writes Requeued:         0
 Reads Completed:        2,       60KiB Writes Completed:       63,   20,524KiB
 Read Merges:            0,        0KiB Write Merges:        5,070,   20,280KiB
 Read depth:             1         Write depth:             7
 IO unplugs:            14         Timer unplugs:           9
Throughput (R/W): 8KiB/s / 2,754KiB/s
Events (8,0): 21,234 entries
Skips: 166 forward (1,940,721 -  98.9%)

3. The same result can also be achieved using the btrace command. Once the command has been launched, generate some I/O as in part 2.

btrace /dev/sda

4. In parts 1 and 3, the commands were launched in such a way that they run forever, without exiting. In this particular example, I will write the trace to a named output file for later analysis. Assuming that you want to run blktrace for 30 seconds, the command is as follows:

blktrace -w 30 -d /dev/sda -o io-debugging

5. On another console, launch the following command:

dd if=/dev/sda of=/dev/null bs=1M count=10 iflag=direct

6. Wait for 30 seconds to allow step 4 to be completed. I got the following results just after:

# blktrace -w 30 -d /dev/sda -o io-debugging
=== sda ===
  CPU  0:                  510 events,       24 KiB data
  Total:                   510 events (dropped 0),       24 KiB data

7. You will notice that the trace file has been created in the /mnt directory. To read it, use the blkparse command.

blkparse io-debugging.blktrace.0 | less

8. Now let’s see a simple extract from the blkparse command:

8,0    0        1     0.000000000  5686  Q   R 0 + 1024 [dd]
8,0    0        0     0.000028926     0  m   N cfq5686S / alloced
8,0    0        2     0.000029869  5686  G   R 0 + 1024 [dd]
8,0    0        3     0.000034500  5686  P   N [dd]
8,0    0        4     0.000036509  5686  I   R 0 + 1024 [dd]
8,0    0        0     0.000038209     0  m   N cfq5686S / insert_request
8,0    0        0     0.000039472     0  m   N cfq5686S / add_to_rr

The first column shows the device major,minor tuple and the second column the CPU number. Then come the sequence number, the timestamp and the PID of the process issuing the I/O. The 6th column shows the event type, e.g. ‘Q’ means the I/O is handled by the request queue code. Please refer to the diagram above for more info. The 7th column is R for Read, W for Write, D for block Discard and B for Barrier operation. Next comes the starting block number, and the number after the + is the number of blocks requested. The final field between the [ ] brackets is the name of the process issuing the request. In our case, I am running the dd command.
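To make the column layout concrete, here is a minimal Python sketch that splits one line of the default blkparse output into named fields. The BlkEvent type and the parse_blkparse_line function are names I made up for illustration. The regular expression only covers the line shapes shown in the extract above, and message lines (the ones with action m) are deliberately returned as None.

```python
import re
from typing import NamedTuple, Optional


class BlkEvent(NamedTuple):
    major: int
    minor: int
    cpu: int
    sequence: int
    timestamp: float
    pid: int
    action: str             # event type, e.g. Q, G, P, I
    rwbs: str               # R, W, D, B ... or N when not applicable
    sector: Optional[int]   # starting block, absent on some events
    nblocks: Optional[int]  # number of blocks after the "+"
    process: Optional[str]  # name between the [ ] brackets


_LINE = re.compile(
    r"^\s*(\d+),(\d+)\s+(\d+)\s+(\d+)\s+(\d+\.\d+)\s+(\d+)\s+(\S+)\s+(\S+)"
    r"(?:\s+(\d+)\s+\+\s+(\d+))?"   # optional "block + count"
    r"(?:\s+\[(.+)\])?\s*$"         # optional "[process]"
)


def parse_blkparse_line(line):
    """Parse one event line; return None for lines this sketch does not cover."""
    match = _LINE.match(line)
    if match is None:
        return None
    g = match.groups()
    return BlkEvent(
        int(g[0]), int(g[1]), int(g[2]), int(g[3]), float(g[4]), int(g[5]),
        g[6], g[7],
        int(g[8]) if g[8] is not None else None,
        int(g[9]) if g[9] is not None else None,
        g[10],
    )


if __name__ == "__main__":
    print(parse_blkparse_line("8,0    0        1     0.000000000  5686  Q   R 0 + 1024 [dd]"))
```

A custom blkparse output format would need a different pattern; this one assumes the default layout.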

9. The output can also be analyzed using the btt command. You will get almost the same information.

btt -i io-debugging.blktrace.0

Some interesting information here: D2C is the amount of time the I/O spent in the device, whilst Q2C is the total time taken from queueing to completion, which matters because several I/Os may be in flight concurrently.

A graphical user interface to make life easier

The Seekwatcher program generates graphs from blktrace runs to help visualize IO patterns and performance. It can plot multiple blktrace runs together, making it easy to compare the differences between different benchmark runs. You should install the seekwatcher package if you need to visualize detailed information about IO patterns.

The command to generate a picture for analysis is as follows, where seek.png is the name of the output PNG file.

seekwatcher -t io-debugging.blktrace.0 -o seek.png

What is also interesting is that seekwatcher can create a movie of the graph.

seekwatcher -t io-debugging.blktrace.0 -o seekmoving.mpg --movie

Tips:

  • For debugfs, you can also edit the /etc/fstab file to make the mount permanent.

  • By default, blktrace will capture all events. This can be limited with the argument -a.
  • To capture for a fixed amount of time, use the argument -w followed by the number of seconds.
  • blktrace stores the extracted data in the local directory, in files named device.blktrace.cpu, for example sda1.blktrace.0 for CPU 0.
  • At steps 1 and 3, you will need to press CTRL+C to stop the trace.
  • As seen in part 2, you have created a test file, test1. Delete it afterwards, as it consumes disk space on your machine.

  • In part 2, the size of the file created in /mnt will not exceed 1M, since this has been specified in the dd command (bs=1M count=1).
  • At part 9, you will notice several other pieces of information which can be helpful, such as D2C and Q2C.
    • Q2D latency – time from request submission to Device.
    • D2C latency – Device latency for processing request.
    • Q2C latency – total latency , Q2D + D2C = Q2C.
  • To be able to generate the movie, you will have to install mencoder with all its dependencies.
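The relation Q2D + D2C = Q2C from the tips above can be sketched in a few lines of Python. This is a simplification under my own assumptions: events arrive as (timestamp, action, sector) tuples, requests are matched on their starting sector, and only the first Q, D and C event per sector is used. The real btt also accounts for merges, splits and requeues, which this sketch ignores. The function name request_latencies and the sample timestamps are illustrative.

```python
def request_latencies(events):
    """Compute Q2D, D2C and Q2C per starting sector.

    events: iterable of (timestamp_seconds, action, sector) tuples, where
    action is a blkparse event type such as "Q", "D" or "C".
    """
    first_seen = {}
    for timestamp, action, sector in events:
        if action in ("Q", "D", "C"):
            # Keep only the first timestamp of each action for a sector.
            first_seen.setdefault(sector, {}).setdefault(action, timestamp)

    result = {}
    for sector, times in first_seen.items():
        if all(action in times for action in ("Q", "D", "C")):
            result[sector] = {
                "Q2D": times["D"] - times["Q"],  # queued -> dispatched to device
                "D2C": times["C"] - times["D"],  # time spent in the device
                "Q2C": times["C"] - times["Q"],  # total: Q2D + D2C
            }
    return result


if __name__ == "__main__":
    # Hypothetical timestamps, chosen only to make the arithmetic easy to follow.
    sample = [(0.0, "Q", 0), (0.25, "D", 0), (1.0, "C", 0)]
    print(request_latencies(sample))
```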

Linux memory analysis with Lime and Volatility

“LiME is a Loadable Kernel Module (LKM) which allows for volatile memory acquisition from Linux and Linux-based devices, such as Android. This makes LiME unique as it is the first tool that allows for full memory captures on Android devices. It also minimises its interaction between user and kernel space processes during acquisition, which allows it to produce memory captures that are more forensically sound than those of other tools designed for Linux memory acquisition.” – LiME. The Volatility framework was released at Black Hat DC for analysis of memory during forensic investigations.

Analysing memory in Linux can start with Lime, a forensic tool to dump the memory. I am using a CentOS 6 distribution installed on VirtualBox to acquire memory. Normally, before capturing the memory, the suspicious system’s architecture should be well known. You may need to compile Lime on the suspicious machine itself if you do not know the architecture. Once you compile Lime, you will have a loadable kernel object which can be inserted into the Linux kernel itself.

Linux memory dump with Lime

1. You will first need to download Lime on the suspicious machine.

git clone https://github.com/504ensicsLabs/LiME

2. Compile Lime. Once it has been compiled, you will notice that the Lime loadable kernel object has been created.

make

3. Now the kernel object has to be loaded into the kernel. Insert the kernel module, defining the location and format in which to save the memory image.

insmod lime-2.6.32-696.23.1.el6.x86_64.ko "path=/Linux64.mem format=lime"

4. You can check whether the module has been successfully loaded.

lsmod | grep -i lime

Analysis with Volatility

5. We will now analyze the memory dump using Volatility. Download it from Github.

git clone https://github.com/volatilityfoundation/volatility

6. Now, we will create a Linux profile. We will also need the DwarfDump package. Once it is installed, go to the tools/linux directory, then create the module.dwarf file.

yum install epel-release libdwarf-tools -y && make

7. To proceed further, the System.map file is needed to build the profile. The System.map file contains the locations of all the functions active in the compiled kernel. You will find it inside the /boot directory. It is also important to check that the version appended to the System.map file name matches the version and architecture of the kernel. In this example, the version is 2.6.32-696.23.1.el6.x86_64.

8. Now, go to the root of the Volatility directory using cd ../../ since I assumed that you are in the linux directory. Then, create a zip file as follows:

zip volatility/plugins/overlays/linux/Centos6-2632.zip tools/linux/module.dwarf /boot/System.map-2.6.32-696.23.1.el6.x86_64

9. The Volatility profile has now been created, as indicated in part 8, for this particular Linux distribution and kernel version. Time to have fun with some Python scripts. You can view the profile created with the following command:

python vol.py --info | grep Linux

As you can see, the LinuxCentos6-2632x64 profile has been created.

10. Volatility contains plugins to view details about the memory dump. To list the plugins or parsers, use the following command:

python vol.py --info | grep -i linux_

11. Now imagine that you want to see the processes running at the time of the memory dump. You will have to execute the vol.py script, specify the location of the memory dump, define the profile created and call the parser concerned.

python vol.py --file=/Linux64.mem --profile=LinuxCentos6-2632x64 linux_psscan

12. Another example to recover the routing cache memory:

python vol.py --file=/Linux64.mem --profile=LinuxCentos6-2632x64 linux_route_cache
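When you run several plugins against the same dump, it can help to build the command line in one place. The sketch below only assembles the argument list in the same shape as the two commands above and does not execute anything. The helper name volatility_command is hypothetical, and the vol.py location is assumed to be the current directory.

```python
import shlex


def volatility_command(dump_path, profile, plugin, vol_script="vol.py"):
    """Build the argument list for one Volatility plugin run (nothing is executed)."""
    return [
        "python",
        vol_script,
        f"--file={dump_path}",
        f"--profile={profile}",
        plugin,
    ]


if __name__ == "__main__":
    for plugin in ("linux_psscan", "linux_route_cache"):
        argv = volatility_command("/Linux64.mem", "LinuxCentos6-2632x64", plugin)
        # shlex.join (Python 3.8+) gives a copy-pasteable shell command.
        print(shlex.join(argv))
```

The returned list could be handed to subprocess.run if you want to launch the plugins from a script.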

Automating Lime using LiMEaide

I find the LiMEaide tool really interesting for executing Lime remotely. “LiMEaide is a python application designed to remotely dump RAM of a Linux client and create a volatility profile for later analysis on your local host. I hope that this will simplify Linux digital forensics in a remote environment. In order to use LiMEaide all you need to do is feed a remote Linux client IP address, sit back, and consume your favorite caffeinated beverage.” – LiMEaide

Tips:

  • Linux architecture is very important when dealing with Lime. This is probably the first question that one would ask.
  • The kernel-headers package is a must to create the kernel loadable object.
  • Once a memory dump has been created, it’s important to take a hash value. It can be done using the command md5sum Linux64.mem
  • I would also consider installing the development tools using yum groupinstall "Development Tools" -y
  • As good practice, as indicated in part 8 when creating the zip file, use a proper convention when naming the file. In my case I used the OS version and the kernel version for future reference.
  • Not all parsers/plugins will work, as some might not be compatible with your Linux system.
  • You can check out the Volatility wiki for more info about the parsers.
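The hashing tip can also be scripted. Here is a minimal sketch that hashes a file in chunks, so that a multi-gigabyte memory dump does not have to fit in RAM. The function name hash_file is my own; md5 is the default only to mirror the md5sum tip, and passing "sha256" is an option I added, not something the steps above require.

```python
import hashlib


def hash_file(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it one chunk at a time."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    # Usage: python hash_dump.py /Linux64.mem
    if len(sys.argv) > 1:
        print(hash_file(sys.argv[1]), sys.argv[1])
```

Record the digest right after acquisition, so that later copies of the dump can be checked against it.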

 


IETF 101 Hackathon by the Hackers.mu team

“We believe in rough consensus and running code” – just have a look at the IETF website and this is the motto that you will come across. This is why the IETF hackathons are so special during the year, and the hackers.mu team is proud to be the first team in Mauritius which does not only participate in this type of event but also leads the TLS working group. The IETF 101 hackathon was yet another challenge for the hackers.mu team. But once you are in, the fun begins. Compared to the IETF 100 hackathon, the hackers.mu team improved in terms of lines of code and focused on more projects. We participated remotely in projects such as TLS 1.3, DNS, and HTTP 451. A wiki was also created during that event.

Photo credits: IETF.org

We used Jabber to communicate during the IETF 101 hackathon. Other media such as Facebook also turned out to be useful. I should admit that on Friday and Saturday I went to sleep at 02:00 AM with just the testing part completed. At 23:00 hrs, Logan was asking everyone to go to sleep as we needed more energy for the next day. Selven was also working hard remotely to keep all members on track. What was most comforting was the team spirit, with everyone helping each other during the hackathon.

Photo Credits: Codarren.com

One of the interesting issues noticed was malformed TLS traffic, which we were able to detect using Wireshark. Once the patches were ready and the testing part was working fine, we held a debrief at the Flying Dodo beer brewing company at Bagatelle Mall and were ready to submit patches to their respective projects. I was assigned the “Stunnel” project and a library in “Eclipse Paho”.

Debriefing at Flying Dodo accompanied with beer and some fries

After the debriefing, Logan was getting ready for his remote presentation at the IETF. We all went through the slides that Logan created and went back home happily to watch the presentation live on YouTube.

Special thanks go to the IETF organising team for having us as Technology Champions, to Nick Sullivan, head of cryptography at CloudFlare, Charles Eckel, Barry Leiba, the Meetecho team, Cisco for sponsoring the event, and all the members of the hackers.mu team who made this hackathon a success in the history of Mauritius.

Others are also talking about the IETF 101 hackathon

“I had initially started a bit slow, as I was working on other projects in parallel. Everyone was already deeply immersed in their projects, we could see PRs and code merges flying right from the first day.” – Codarren Velvindron

“It seems that I am not the only one who feels that this hackathon was really addictive. we were hooked the moment we started working out on our tasks.” – Pirabarlen Cheenaramen

“Developers working with OpenSSL can finally start to work with TLS 1.3, thanks to the alpha version of OpenSSL 1.1.1 that landed yesterday.” – TheRegister

“I think that you guys have more better weather and more fun that we did” – Charles Eckel

“The DNS madness: 185 RFCs totaling 2781 pages. Hello DNS security flaws” – Loganaden Velvindron

“hackers.mu pioneering the internet! We made it to IETF 101 hackathon with our team members getting featured in front of thousands, followed by a round of applause by IETF members in London. Congratulations guys, we did it again!” – Yasir Auleear

“IETF Hackathons encourage developers to collaborate and develop utilities, ideas, sample code and solutions that show practical implementations of IETF standards. The IETF Hackathon in London on 17-18 March is poised to be the largest ever.” – IETF

In case you are asking yourself, “who are the hackers.mu?”, you can consider them as “a group of developers from Mauritius who love to code and are passionate about information security.” More information at https://www.hackers.mu


Auditing Linux Operating System with Lynis

Auditing a Linux system is one of the most important aspects of security. After deploying a simple CentOS 7 Linux machine on VirtualBox, I ran an audit using Lynis. It is amazing how many tiny flaws can be seen right from a fresh installation. “Lynis Enterprise performs security scanning for Linux, macOS, and Unix systems. It helps you discover and solve issues quickly, so you can focus on your business and projects again.” – Cisofy

Credits: cisofy.com

Introduction

The Lynis tool performs both security and compliance auditing. It has a free and a paid version, which comes in very handy especially if you are in a business environment. The installation of Lynis is pretty simple. You can install it from your distribution’s repository, download the tar file or clone it directly from GitHub.

 

Scanning Performed by Lynis

1. I downloaded the tar file with the following command:

wget https://cisofy.com/files/lynis-2.6.0.tar.gz

2. Then, just untar the file and get into it

tar -xzf lynis-2.6.0.tar.gz && cd lynis

3. Once into the untar directory, launch the following command:

./lynis audit system --quick

 
As you can see from the scan output, there are several suggestions at the end of the scan. Had the paid version of the application been used, more information and commands on how to remediate the situation would be given, including support from Lynis. With the free version, you can still work through the security aspects raised by the suggestions yourself.
 
Suggestions, Compliance and Improvement.

 
1. The first two suggestions were about minimum and maximum password age.

Configure minimum password age in /etc/login.defs [AUTH-9286]

Configure maximum password age in /etc/login.defs [AUTH-9286]

To check the minimum and maximum password age of an account, use the chage command, for example for root:
chage -l root
 

2. Use chage -m <days> root to set the minimum password age and chage -M <days> root to set the maximum password age.

Also, you will have to set the PASS_MIN_DAYS and PASS_MAX_DAYS parameters in the /etc/login.defs file, which apply to newly created accounts.
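To check what /etc/login.defs currently says, a short sketch like the following can pull out the two password-aging parameters. The function name password_aging is my own and the sample text is illustrative, not copied from the audited machine.

```python
def password_aging(login_defs_text):
    """Extract PASS_MIN_DAYS and PASS_MAX_DAYS from login.defs-style text."""
    wanted = ("PASS_MIN_DAYS", "PASS_MAX_DAYS")
    found = {}
    for raw in login_defs_text.splitlines():
        # Drop comments, then split "KEY value" on whitespace.
        parts = raw.split("#", 1)[0].split()
        if len(parts) >= 2 and parts[0] in wanted:
            found[parts[0]] = int(parts[1])
    return found


# Illustrative content; on a real system read /etc/login.defs instead.
SAMPLE_LOGIN_DEFS = """\
# Password aging controls
PASS_MAX_DAYS   99999
PASS_MIN_DAYS   0
PASS_WARN_AGE   7
"""

if __name__ == "__main__":
    print(password_aging(SAMPLE_LOGIN_DEFS))
```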

3. Delete accounts which are no longer used [AUTH-9288]

It is also suggested to delete accounts which are no longer in use. This suggestion appeared because I created a user account, “nitin”, during installation and had not used it yet. For the purpose of this blog, I deleted it using userdel -r nitin

4. Default umask in /etc/profile or /etc/profile.d/custom.sh could be more strict (e.g. 027) [AUTH-9328]

Default umask values are taken from the information provided in the /etc/login.defs file on RHEL (Red Hat) based distros. Debian and Ubuntu based systems also define a UMASK value in /etc/login.defs. To change the default umask value from 022 to 027, you need to modify the umask setting in the /etc/profile script.
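To see what the stricter value changes in practice, here is a short worked example of the umask arithmetic. New files are normally requested with mode 666 and new directories with mode 777, and the umask bits are cleared from that request. The function name effective_mode is my own.

```python
def effective_mode(requested, umask):
    """Permission bits left after the umask bits are cleared from the request."""
    return requested & ~umask


if __name__ == "__main__":
    for mask in (0o022, 0o027):
        files = effective_mode(0o666, mask)
        directories = effective_mode(0o777, mask)
        print(f"umask {mask:03o}: files {files:03o}, directories {directories:03o}")
```

With 027, group members lose write access and other users lose all access to newly created files and directories.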

5. To decrease the impact of a full /home file system, place /home on a separated partition [FILE-6310]

  To decrease the impact of a full /tmp file system, place /tmp on a separated partition [FILE-6310]

  To decrease the impact of a full /var file system, place /var on a separated partition [FILE-6310]

In the article Move your /home to another partition, you will have detailed explanations to sort out this issue.

6. Disable drivers like USB storage when not used, to prevent unauthorized storage or data theft [STRG-1840]

   Disable drivers like firewire storage when not used, to prevent unauthorized storage or data theft [STRG-1846]

To disable the USB and firewire storage drivers, add the following lines to /etc/modprobe.d/blacklist.conf, then unload the modules if they are already loaded with modprobe -r usb-storage && modprobe -r firewire-core

blacklist firewire-core
blacklist usb-storage

7. Split resolving between localhost and the hostname of the system [NAME-4406]

This issue is only about the hostname and localhost entries in /etc/hosts, which could confuse some applications installed on the machine. According to Cisofy, for proper resolving, the entries for localhost and the locally defined hostname can be split. With some middleware and applications, resolving the hostname to localhost might confuse the software.
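As a rough way to spot the situation the suggestion describes, the sketch below reports whether a hostname appears on the same /etc/hosts line as localhost. This is my own reading of the suggestion, not the logic Lynis itself uses, and the function name and the host name myhost are hypothetical.

```python
def hostname_shares_localhost_line(hosts_text, hostname):
    """True if some hosts line lists both 'localhost' and the given hostname."""
    for raw in hosts_text.splitlines():
        # Strip comments; the first field is the address, the rest are names.
        names = raw.split("#", 1)[0].split()[1:]
        if hostname in names and "localhost" in names:
            return True
    return False


if __name__ == "__main__":
    combined = "127.0.0.1 localhost myhost\n"
    split = "127.0.0.1 localhost\n127.0.1.1 myhost\n"
    print(hostname_shares_localhost_line(combined, "myhost"))
    print(hostname_shares_localhost_line(split, "myhost"))
```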

8. Install package ‘yum-utils’ for better consistency checking of the package database [PKGS-7384]

      Consider running ARP monitoring software (arpwatch,arpon) [NETW-3032]

The yum-utils and arpwatch packages are nice tools to perform more debugging and verification. Install them using the following command:

yum install yum-utils arpwatch -y

9. You are advised to hide the mail_name (option: smtpd_banner) from your postfix configuration. Use postconf -e or change your main.cf file (/etc/postfix/main.cf) [MAIL-8818]

You just have to uncomment the smtpd_banner line in main.cf, or set it with postconf -e. However, since this is a fresh install and I’m not using postfix, it is better to stop the service.

 10.  Check iptables rules to see which rules are currently not used [FIRE-4513]

Since I’m not in a production environment, it is very difficult to identify unused iptables rules right now. In a production environment, the situation is different. According to Cisofy, the best way is to “use iptables --list --numeric --verbose to display all rules. Check for rules which didn’t get a hit and repeat this process several times (e.g. in a few weeks). Finally remove any unneeded rules.”
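The "check for rules which didn’t get a hit" step can be sketched as a small parser over the verbose iptables listing. It assumes the usual layout, where each chain starts with a "Chain NAME" line followed by a "pkts bytes target ..." header, and where the first column of each rule is its packet counter. The function name zero_hit_rules and the sample listing are illustrative.

```python
def zero_hit_rules(listing):
    """Return (chain, rule text) for every rule whose packet counter is 0."""
    chain = None
    unused = []
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Chain":
            chain = fields[1]
        elif fields[0] == "pkts":
            continue  # column header line
        elif fields[0] == "0":
            # Skip the pkts and bytes columns, keep the rest of the rule.
            unused.append((chain, " ".join(fields[2:])))
    return unused


# Illustrative listing; on a real system capture the command output instead.
SAMPLE_LISTING = """\
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
   10   840 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0
    0     0 DROP       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:23
"""

if __name__ == "__main__":
    for chain, rule in zero_hit_rules(SAMPLE_LISTING):
        print(chain, rule)
```

A rule with zero hits today may still be needed, which is why the quoted advice is to repeat the check over a few weeks before removing anything.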

 11. Consider hardening SSH configuration [SSH-7408]

  •     - Details  : AllowTcpForwarding (YES --> NO)
  •     - Details  : ClientAliveCountMax (3 --> 2)
  •     - Details  : Compression (YES --> (DELAYED|NO))
  •     - Details  : LogLevel (INFO --> VERBOSE)
  •     - Details  : MaxAuthTries (6 --> 2)
  •     - Details  : MaxSessions (10 --> 2)
  •     - Details  : PermitRootLogin (YES --> NO)
  •     - Details  : Port (22 --> )
  •     - Details  : TCPKeepAlive (YES --> NO)
  •     - Details  : UseDNS (YES --> NO)
  •     - Details  : X11Forwarding (YES --> NO)
  •     - Details  : AllowAgentForwarding (YES --> NO)

Again, hardening SSH is one of the most important steps to fend off attacks, especially from SSH bots. It all depends on how your network infrastructure is configured and whether the machine is accessible from the internet or not. However, these details are very informative.
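The suggestions listed above can be turned into a quick self-check. The sketch below compares sshd_config-style text against the target values from that list and reports the settings that differ. The Port entry is left out because its suggested value is blank in the output. Settings that are absent from the file are not reported, since their effective defaults depend on the OpenSSH version. The names SUGGESTED and ssh_deviations are my own.

```python
# Target values taken from the Lynis details listed above (lower-cased).
SUGGESTED = {
    "allowtcpforwarding": {"no"},
    "clientalivecountmax": {"2"},
    "compression": {"delayed", "no"},
    "loglevel": {"verbose"},
    "maxauthtries": {"2"},
    "maxsessions": {"2"},
    "permitrootlogin": {"no"},
    "tcpkeepalive": {"no"},
    "usedns": {"no"},
    "x11forwarding": {"no"},
    "allowagentforwarding": {"no"},
}


def ssh_deviations(config_text):
    """Return {keyword: current value} for settings that differ from SUGGESTED."""
    current = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, value = parts
        # sshd uses the first value it sees for a keyword, so keep the first.
        current.setdefault(keyword.lower(), value.strip().lower())
    return {
        key: current[key]
        for key in SUGGESTED
        if key in current and current[key] not in SUGGESTED[key]
    }


if __name__ == "__main__":
    sample = "PermitRootLogin yes\nX11Forwarding no\n#UseDNS yes\nMaxAuthTries 6\n"
    print(ssh_deviations(sample))
```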

12. Periodic system scans with malware and ransomware scanners are now a must. According to statistics, servers are being hacked constantly. Pervasive monitoring is becoming big business for malicious software.

The Lynis Command

The Lynis documentation is pretty straightforward and comes with a cheat sheet. The arguments are self-explanatory. Here are some hints.

1. Performs a system audit, which is the most common audit.

lynis audit system

2. Provides command to do a remote scan.

lynis audit system remote <host>

3. Views the settings of default profile.

lynis show settings

4. Checks if you are using most recent version of Lynis

lynis update info

5. More information about a specific test-id

lynis show details <test-id>

6. To scan the whole system

lynis --check-all -Q

7. To see all available parameters of Lynis

lynis show options

At the end of any Lynis command, it will also tell you where the logs have been stored for future reference. It is usually /var/log/lynis.log. The systutorial page on Lynis is also a good start to grasp the command. All common systems based on Unix/Linux are supported. Examples include Linux, AIX, *BSD, HP-UX, macOS and Solaris. For package management, the following tools are supported: dpkg/apt, pacman, pkg_info, RPM, YUM, zypper.


Happy New Year 2018 from TheTunnelix

My dear friends, readers and fellow bloggers, I would like to seize this opportunity, as this is my last blog for the year 2017, to wish you and your family a Happy New Year 2018. There were lots of events in November and December 2017. For today, I’m having a drink with family and friends. Oh yeah, tomorrow will be a super party 🙂

For those who missed recent hackers.mu events: November was about Infotech 2017, where hackers.mu was present on our special stand, busy evangelising open source products. Our accomplishments were also displayed. Logan, from the hackers.mu team, also gave an amazing speech in the video conference room.

The hackers.mu team also had an end of year get together and lunch in a restaurant at Rose-Hill.

I’m happy to have completed my VMware Certified Administrator and VMware Certified Administrator Professional exams. I’m looking forward to more certifications next year. This year has marked the history of Mauritius, with lots of open source contributions carried out by Mauritians, mainly by hackers.mu. Right now, we have several things in our pipeline. Surprise soon 🙂