In the past, I wrote an article on how to analyze memory using Lime. We have also seen how to analyze a process using blktrace, blkparse, btrace, etc. Another interesting way of analyzing processes when chasing system performance problems is strace, which I have already blogged about. We also looked into some basic disk usage analysis. However, I never wrote a dedicated article on the three compute components: Memory, CPU, and Disk. This article will focus on RAM. Are you ready?
RAM: Physical memory is handed out in pages, and the pages given to a process can end up at scattered address locations. In other words, they are not contiguous (they do not sit next to each other in sequence). A good thing to remember is that applications always ask for contiguous address locations.
Memory in the virtualization environment.
In virtualization, virtual memory is introduced, which means that applications get contiguous memory. But always remember that, at the end of the day, it is all located in physical memory. Basically, a layer is added on top of the physical memory to provide virtual memory. Think about this for a while: three virtual machines are configured with 2 GB of RAM each, sitting on a 4 GB physical machine. How will this work? It's called memory overcommitment. In VMware and other virtualization environments, some technologies kick in at the hypervisor level to address this issue. They are called memory reclamation techniques. ESXi hosts apply these techniques automatically. Let's look at each of them:
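To make the overcommitment example concrete, here is a small sketch of the arithmetic. The function name and the idea of an "overcommit ratio" are my own illustration, not something ESXi exposes under these names.

```python
def overcommit(vm_sizes_gb, host_ram_gb):
    """Return (configured total, shortfall, ratio) for one host.

    Hypothetical helper: it only does the arithmetic from the example,
    it does not query any hypervisor.
    """
    configured = sum(vm_sizes_gb)                     # RAM promised to all VMs
    shortfall = max(0, configured - host_ram_gb)      # RAM that must be reclaimed
    ratio = configured / host_ram_gb                  # > 1.0 means overcommitted
    return configured, shortfall, ratio


configured, shortfall, ratio = overcommit([2, 2, 2], 4)
print(f"configured={configured} GB, shortfall={shortfall} GB, ratio={ratio:.2f}")
```

With three 2 GB VMs on a 4 GB host, 6 GB is configured, so up to 2 GB has to be covered by the techniques described next if all VMs use their full allocation at once.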
TPS – Transparent Page Sharing
This is when the hypervisor finds that two memory pages are identical: instead of storing the page twice in physical memory, it keeps only one copy. In other words, it is data deduplication (eliminating duplicate data).
Assuming TPS kicks in and more memory is still needed, ballooning comes next. In VMware, the balloon driver is called vmmemctl. It is installed when VMware Tools is installed. You can also disable this driver. ESXi does not know what is happening inside the VM. However, the guest operating system tracks two kinds of memory pages, active and idle: MRU (Most Recently Used) and LRU (Least Recently Used). As the name suggests, the balloon driver inflates itself inside the guest like a fake application asking for memory. The guest then gives up the pages that are least valuable to it, and the host can reclaim them.
In virtualization, there is also compression of memory pages. When a lot of pages need to be reclaimed, they are compressed and kept in a compression cache, and a page is decompressed immediately when it is needed again. However, it is good to know that while compression and decompression are happening, CPU load will increase.
The last resort is SWAP: VMware might start using SWAP when memory is very low. This is just a file on disk that the hypervisor uses as if it were memory. This means that, at this point, performance will be hit very badly. However, there are still ways to make SWAP performance better by using SSDs. The way to do this is by placing all the SWAP files on an SSD datastore.
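The deduplication idea can be sketched in a few lines. This is only a toy model under my own assumptions (4 KB pages, SHA-256 as the fingerprint, a plain dictionary as the page store); it is not how ESXi implements TPS internally.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for the illustration


def share_pages(pages):
    """Store identical pages only once and map every guest page to its copy."""
    store = {}    # fingerprint -> the single stored copy of that page
    mapping = []  # for each guest page, the fingerprint of its backing copy
    for page in pages:
        digest = hashlib.sha256(page).hexdigest()
        # A real implementation would also compare the full page contents
        # on a fingerprint match before sharing; omitted here for brevity.
        store.setdefault(digest, page)
        mapping.append(digest)
    return store, mapping


zero_page = bytes(PAGE_SIZE)
pages = [zero_page, b"\x01" * PAGE_SIZE, zero_page, zero_page]
store, mapping = share_pages(pages)
print(f"guest pages: {len(pages)}, physical copies: {len(store)}")
```

Four guest pages end up backed by two physical copies, because the three zero-filled pages share one.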
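Here is a rough sketch of the compression idea using Python's zlib. The 4 KB page size, the "keep it only if it shrinks to half or less" threshold, and the function name are assumptions I picked for the illustration, not values read from a hypervisor.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size


def try_compress(page, max_ratio=0.5):
    """Return the compressed page if it shrinks enough, otherwise None.

    max_ratio is an illustrative threshold: a page that does not compress
    well is not worth keeping in a compression cache.
    """
    compressed = zlib.compress(page)
    if len(compressed) <= len(page) * max_ratio:
        return compressed
    return None


text_like = (b"memory " * 600)[:PAGE_SIZE]   # repetitive, compresses well
random_like = os.urandom(PAGE_SIZE)          # random, does not compress

cached = try_compress(text_like)
print("repetitive page cached:", cached is not None)
print("random page cached:", try_compress(random_like) is not None)

# Decompression gives back the original page, at the cost of CPU time.
assert zlib.decompress(cached) == text_like
```

Both directions cost CPU cycles, which is why compression sits between ballooning and swapping rather than being used all the time.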
Some Linux commands
At this point, you should have an idea of how memory is managed by hypervisors between physical and virtual machines. However, my main goal in this article is to bring some better understanding, tools, and techniques to use for troubleshooting memory issues. Let's move on to the technical part now. We also understood that one of the main problems with memory is when SWAP starts kicking in, which may impact performance. This can be easily identified with the command vmstat. In the swap section of its output, “si” and “so” mean swap in and swap out.
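If you want to watch these two columns from a script, a minimal sketch could look like the one below. It assumes the usual vmstat column header (with "si" and "so" in it), and the sample text is made up for the illustration, not taken from a real machine.

```python
def swap_activity(vmstat_text):
    """Return a list of (si, so) tuples, one per sample line of vmstat output."""
    rows = [line.split() for line in vmstat_text.strip().splitlines()]
    # The column header is the line that contains both "si" and "so".
    header = next(cols for cols in rows if "si" in cols and "so" in cols)
    si_idx, so_idx = header.index("si"), header.index("so")
    samples = []
    for cols in rows:
        # Sample lines have one number per header column.
        if len(cols) == len(header) and all(c.isdigit() for c in cols):
            samples.append((int(cols[si_idx]), int(cols[so_idx])))
    return samples


# Made-up sample in the shape of `vmstat 1 2` output.
SAMPLE = """
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812340  10240 402112    0    0    12     8  101  210  2  1 97  0  0
 2  1  20480  10240   2048  51200  120  340   900  1500  800 1500 20 10 40 30  0
"""

for si, so in swap_activity(SAMPLE):
    state = "swapping" if si or so else "quiet"
    print(f"si={si} so={so} -> {state}")
```

Non-zero values that keep showing up in "si" and "so" across samples are the sign that the machine is actively swapping, not just holding old pages in SWAP.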
Some reasons why your machine would consume SWAP space are:
- The system needs to move pages from RAM to SWAP to make space for a process to run.
- The system tries to get rid of processes that consume a large amount of memory. Usually, the OOM (Out Of Memory) killer will kick in and kill the process. There is an article I wrote in the past about OOM.
dstat can also be a good tool to troubleshoot OOM. To know which process is likely to become a target of the OOM killer, run dstat --top-oom.
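A similar ranking can be approximated by hand, because Linux exposes a per-process badness score in /proc/<pid>/oom_score. The sketch below is my own helper (the function name and the proc_root parameter are hypothetical), and I am assuming that a higher score means a more likely OOM target, which matches the kernel documentation as far as I know.

```python
import os


def oom_candidates(proc_root="/proc", top=5):
    """Return up to `top` (oom_score, pid, name) tuples, highest score first."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux system, or /proc is not mounted
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "oom_score")) as f:
                score = int(f.read().strip())
            with open(os.path.join(base, "comm")) as f:
                name = f.read().strip()
        except (OSError, ValueError):
            continue  # the process exited or cannot be read
        results.append((score, int(entry), name))
    results.sort(reverse=True)
    return results[:top]


if __name__ == "__main__":
    for score, pid, name in oom_candidates():
        print(f"{score:>6}  {pid:>7}  {name}")
```

The proc_root parameter is only there so the function can be pointed at a fake directory tree for testing; on a real machine the default is what you want.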
In the next article, which will be part 2 of this one, I'll add some more details on memory and move on to CPU.