Displaying Linux Memory

Memory management is hard, but RAM management may be even harder.

Most people know the vague overall concept of how memory usage is displayed within Linux. You have your total memory, which is everything inside the box; then there is used and free, which is what the system is or is not using respectively. Some people might know that not all used is really used and some of it actually is free. It can be very confusing to understand, even for someone who maintains procps (the package that contains top and free, two programs that display memory usage).

So, how does the memory display work?

What free shows

The free program is part of the procps package. Its central goal is to give a quick overview of how much memory is used where. A typical output (e.g. what I saw when I typed “free -h”) could look like this:

      total   used    free   shared  buff/cache  available
Mem:    15G   3.7G    641M     222M         11G        11G
Swap:   15G   194M     15G

I’ve used the -h option for human-readable output here for the sake of brevity and because I hate typing long lists of long numbers.

People who have good memories (or old computers) may notice there is a missing “-/+ buffers/cache” line. This was intentionally removed in mid-2014 because, as the memory management of Linux got more and more complicated, the line became less relevant. It used to help with the “not used used memory” problem mentioned in the introduction, but progress caught up with it.

To explain what free is showing, you need to understand some of the underlying statistics that it works with. This isn’t a lesson on how Linux manages its memory (the honest short answer is, I don’t fully know) but hopefully just enough to understand what free is doing. Let’s start with the two simple columns first: total and free.

Total Memory

This is what memory you have available to Linux. It is almost, but not quite, the amount of memory you put into a physical host or the amount of memory you allocate for a virtual one. Some memory you just can’t have; either due to early reservations or devices shadowing the memory area. Unless you start mucking around with those settings or the virtual host, this number stays the same.

Free Memory

Memory that nobody at all is using. They haven’t reserved it, haven’t stashed it away for future use, and aren’t even just, you know, actually using it. People often obsess about this statistic but it’s probably the most useless one to use for anything directly. I have even considered removing this column, or replacing it with available (see later what that is) because of the confusion this statistic causes.

The reason for its uselessness is that Linux has memory management where it allocates memory it doesn’t use. This decrements the free counter but the memory is not truly “used”. If your application needs that memory, it can be given back.

A very important statistic to know for running a system is how much memory you have left before you either run out or start to seriously swap stuff out to swap drives. Despite its name, the free statistic will not tell you that and will probably mislead you.

My advice is unless you really understand the Linux memory statistics, ignore this one.

Who’s Using What

Now we come to the components that are using (if that is the right word) the memory within a system.

Shared Memory

Shared memory is often thought of only in the context of processes (and makes working out how much memory a process uses tricky – but that’s another story) but the kernel has this as well. The shared column lists this, which is a direct report from the Shmem field in the meminfo file.
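As a rough illustration of what “a direct report from the Shmem field” means, here is a minimal Python sketch that parses meminfo-style text and picks out that field. It is not how procps does it (procps is written in C); the function name parse_meminfo and the sample numbers are made up for the example.

```python
def parse_meminfo(text):
    """Parse meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts with no unit.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


# Illustrative sample only; on a Linux host you could pass
# open("/proc/meminfo").read() instead.
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:          650000 kB
Shmem:            227000 kB
"""

info = parse_meminfo(SAMPLE)
shared_kb = info["Shmem"]  # the value behind the shared column
print(f"shared: {shared_kb} kB")
```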

Slabs

For things used a lot within the kernel, it is inefficient to keep going to get small bits of memory here and there all the time. The kernel has this concept of slabs where it creates small caches for objects or in-kernel data structures that slabinfo(5) states “[such as] buffer heads, inodes and dentries”. So basically kernel stuff for the kernel to do kernelly things with.

Slab memory comes in two flavours: reclaimable and unreclaimable. This is important because unreclaimable cannot be “handed back” if your system starts to run out of memory. Funnily enough, not all reclaimable is, well, reclaimable. A good estimate is you’ll only get 50% back; top and free ignore this inconvenient truth and assume it can be 100%. All of the reclaimable slab memory is considered part of the Cached statistic. Unreclaimable is memory that is part of Used.
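To make the gap between the two views concrete, here is a small Python sketch. It assumes the two slab flavours are taken from the SReclaimable and SUnreclaim fields of /proc/meminfo; the function name and the numbers are hypothetical, and the 50% figure is only the rule of thumb mentioned above.

```python
def reclaimable_slab_estimates(sreclaimable_kb):
    """Return (optimistic, conservative) estimates, in kB, of slab
    memory that could be handed back under memory pressure."""
    optimistic = sreclaimable_kb         # the 100% assumption described above
    conservative = sreclaimable_kb // 2  # the rough 50% rule of thumb
    return optimistic, conservative


# Illustrative values in kB, not taken from a real system.
slab = {"SReclaimable": 800000, "SUnreclaim": 200000}

optimistic, conservative = reclaimable_slab_estimates(slab["SReclaimable"])
print(f"counted as reclaimable: {optimistic} kB")
print(f"perhaps actually reclaimable: {conservative} kB")
print(f"never reclaimable (part of Used): {slab['SUnreclaim']} kB")
```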

Page Cache and Cached

Page caches are used to read and write to storage, such as a disk drive. These are the things that get written out when you use sync and make the second read of the same file much faster. An interesting quirk is that tmpfs is part of the page cache. So the Cached column may increase if you have a few of these.

The Cached column may seem like it should only have Page Cache, but the Reclaimable part of the Slab is added to this value. Some older versions of some programs count either none or all of Slab in Cached. Both of these approaches are incorrect.

Cached makes up part of the buff/cache column with the standard options for free or has a column to itself for the wide option.

Buffers

The second component of the buff/cache column (or a separate column with the wide option) is kernel buffers. These are the low-level I/O buffers inside the kernel. Generally they are small compared to the other components and can basically be ignored or just considered part of Cached, which is the default for free.
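Putting the last two sections together, the cache-related columns can be sketched in Python as below. This assumes the page cache comes from the Cached field of /proc/meminfo, the reclaimable slab from SReclaimable and the buffers from Buffers; the function name cache_columns and the sample values are invented for illustration.

```python
def cache_columns(meminfo):
    """Combine meminfo fields (kB) into the cache-related columns."""
    # Page cache plus the reclaimable part of slab, as described above.
    cache = meminfo["Cached"] + meminfo["SReclaimable"]
    buffers = meminfo["Buffers"]
    return {
        "buffers": buffers,             # own column with the wide option
        "cache": cache,                 # own column with the wide option
        "buff/cache": buffers + cache,  # single column by default
    }


# Illustrative values in kB, not taken from a real system.
sample = {"Cached": 10200000, "SReclaimable": 800000, "Buffers": 350000}

columns = cache_columns(sample)
for name, value in columns.items():
    print(f"{name:>10}: {value} kB")
```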

Used

Unlike most of the previous statistics that are either directly pulled out of the meminfo file or have some simple addition, the Used column is calculated and completely dependent on the other values. As such it is not telling the whole story here, but it is a reasonably OK estimate of used memory.

The Used component is what you have left of your Total memory once you have removed:

  • Free memory – because free is not used!
  • The Cached value – recall this is made up of the Page Cache plus the Reclaimable part of Slab
  • The buffers

Notice that the unreclaimable part of slab is not in this calculation, which means it is part of the used memory. Also note this seems a bit of a hack because as the memory management gets more complicated, the estimates used become less and less real.
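The subtraction in the list above can be written out as a short Python sketch. The function name and the sample numbers are hypothetical, and a real tool may handle edge cases that this does not.

```python
def used_kb(total, free, cached, buffers):
    """Used is whatever remains of Total after removing Free, Cached
    (page cache + reclaimable slab) and Buffers. All values in kB."""
    return total - free - cached - buffers


# Illustrative values in kB, not taken from a real system.
total = 16000000
free = 650000
cached = 11000000   # already includes the reclaimable slab
buffers = 350000

used = used_kb(total, free, cached, buffers)
print(f"used: {used} kB")
```

Unreclaimable slab never appears as an argument, which is exactly why it ends up inside the result.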

Available

In early 2014, the kernel developers took pity on us toolset developers and gave us a much cleaner, simpler way to work out some of these values (or at least I’d like to think that’s why they did it). The available statistic is the right way to work out how much memory you have left. The commit message explains the gory details about it, but the great thing is that if they change their mind or add some new memory feature, the available value should be changed as well. We don’t have to worry about whether all of slab should be in Cached or whether it is part of Used; we just have a number directly out of meminfo.
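For a “how much have I got left” check, that means reading one field rather than adding several up. The Python sketch below assumes the statistic appears as MemAvailable in /proc/meminfo and simply returns None when the field is missing (for example on an older kernel); the function name and the numbers are made up.

```python
def memory_left(meminfo):
    """Return (available_kb, fraction_of_total), or None if the
    MemAvailable field is not present."""
    available = meminfo.get("MemAvailable")
    if available is None:
        return None
    return available, available / meminfo["MemTotal"]


# Illustrative values in kB, not taken from a real system.
sample = {"MemTotal": 16000000, "MemFree": 650000, "MemAvailable": 11500000}

left = memory_left(sample)
if left is not None:
    available_kb, fraction = left
    print(f"available: {available_kb} kB ({fraction:.0%} of total)")
    print(f"free, for comparison: {sample['MemFree']} kB")
```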

What does this mean for free?

Poor old free is now at least 24 years old and it is based upon BSD and SunOS predecessors that go back way before then. People expect that their system tools don’t change by default and show the same thing over and over. On the other side, Linux memory management has changed dramatically over those years. Maybe we’re all just sheep (see, I had to mention sheep or RAMs somewhere in this) and like things to remain the same always.

If free were written now, it would probably only need the total, available and used columns, with used merely being total minus available. Possibly with some other columns for the wide option.
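A sketch of what that imagined three-column display might compute, in Python. This is purely hypothetical and is not something free does today; it assumes the MemTotal and MemAvailable fields from /proc/meminfo, and the function name and values are invented.

```python
def simple_free_row(meminfo):
    """Build the three columns of the imagined display (values in kB)."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return {"total": total, "available": available, "used": total - available}


# Illustrative values in kB, not taken from a real system.
sample = {"MemTotal": 16000000, "MemAvailable": 11500000}

row = simple_free_row(sample)
print(f"{'':6}{'total':>12}{'available':>12}{'used':>12}")
print(f"{'Mem:':6}{row['total']:>12}{row['available']:>12}{row['used']:>12}")
```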

The code itself (found in libprocps) is not very hard to maintain, so it’s not like this change will save some time, but I’m unsure whether free is giving the right and useful result for the people who use it.

 

