A number of Linux kernel developers recently debated "swappiness" at length on the lkml, exploring when an application should or should not be swapped out versus reclaiming memory from the cache. Fortunately a run-time tunable is available through the proc interface for anyone needing to adapt kernel behavior to their own requirements. To tune, simply echo a value from 0 to 100 into /proc/sys/vm/swappiness. The higher the number set here, the more the system will swap. 2.6 kernel maintainer Andrew Morton noted that on his own desktop machines he sets swappiness to 100, further explaining:
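For example, to check the current value (the 2.6 default is 60) and then make the kernel much less eager to swap:

cat /proc/sys/vm/swappiness         # show the current setting, 60 by default
echo 10 > /proc/sys/vm/swappiness   # favor reclaiming cache over swapping (run as root)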
"My point is that decreasing the tendency of the kernel to swap stuff out is wrong. You really don't want hundreds of megabytes of BloatyApp's untouched memory floating about in the machine. Get it out on the disk, use the memory for something useful."
The other side of the argument is that if "BloatyApp" is swapped out too aggressively, the user returning to it has to wait for it to swap back in, and thus notices a distinct delay. Rik van Riel explains, "Making the user have very bad interactivity for the first minute or so is a Bad Thing, even if the computer did run more efficiently while the user wasn't around to notice... IMHO, the VM on a desktop system really should be optimised to have the best interactive behaviour, meaning decent latency when switching applications." Andrew Morton humorously replied, "I'm gonna stick my fingers in my ears and sing 'la la la' until people tell me 'I set swappiness to zero and it didn't do what I wanted it to do'."
From: Brett E. [email blocked]
To: linux-kernel mailing list [email blocked]
Subject: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 14:27:47 -0700

Same thing happens on 2.4.18. I attached sar, slabinfo and /proc/meminfo data on the 2.6.5 machine. I reproduce this behavior by simply untarring a 260meg file on a production server; the machine becomes sluggish as it swaps to disk.

Is there a way to limit the cache so this machine, which has 1 gigabyte of memory, doesn't dip into swap?

Thanks,

Brett
From: Andrew Morton [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 17:01:06 -0700

"Brett E." [email blocked] wrote:
>
> I attached sar, slabinfo and /proc/meminfo data on the 2.6.5 machine. I
> reproduce this behavior by simply untarring a 260meg file on a
> production server, the machine becomes sluggish as it swaps to disk.

I see no swapout from the info which you sent. A `vmstat 1' trace would be more useful.

> Is there a way to limit the cache so this machine, which has 1 gigabyte of
> memory, doesn't dip into swap?

Decrease /proc/sys/vm/swappiness?

Swapout is good. It frees up unused memory. I run my desktop machines at swappiness=100.
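(A note for readers following along at home: vmstat prints one line of VM statistics per interval, and it is the swap columns, not the cache numbers, that confirm real swapping.)

vmstat 1    # watch the "si"/"so" columns: pages swapped in and out per second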
From: Jeff Garzik [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 20:10:14 -0400

Andrew Morton wrote:
> Swapout is good. It frees up unused memory. I run my desktop machines at
> swappiness=100.

The definition of "unused" is quite subjective and app-dependent... I've seen reports with increasing frequency about the swappiness of the 2.6.x kernels, from people who were already annoyed at the swappiness of the 2.4.x kernels :)

Favorite pathological (and quite common) examples are the various 4am cron jobs that scan your entire filesystem. Running that process overnight on a quiet machine practically guarantees a huge burst of disk activity, with unwanted results:

1) Inode and page caches are blown away
2) A lot of your desktop apps are swapped out

Additionally, a (IMO valid) maxim of sysadmins has been "a properly configured server doesn't swap". There should be no reason why this maxim becomes invalid over time. When Linux starts to swap out apps the sysadmin knows will be useful in an hour, or six hours, or a day just because it needs a bit more file cache, I get worried.

There IMO should be some way to balance the amount of anon-vma's such that the sysadmin can say "stop taking 70% of my box's memory for disposable cache, use it instead for apps you would otherwise swap out, you memory-hungry kernel you."

Jeff
From: Nick Piggin [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Thu, 29 Apr 2004 10:21:24 +1000

Jeff Garzik wrote:
> Additionally, a (IMO valid) maxim of sysadmins has been "a properly
> configured server doesn't swap". There should be no reason why this
> maxim becomes invalid over time. When Linux starts to swap out apps the
> sysadmin knows will be useful in an hour, or six hours, or a day just
> because it needs a bit more file cache, I get worried.
>

I don't know. What if you have some huge application that only runs once per day for 10 minutes? Do you want it to be consuming 100MB of your memory for the other 23 hours and 50 minutes for no good reason?

Anyway, I have a small set of VM patches which attempt to improve this sort of behaviour if anyone is brave enough to try them. Against -mm kernels only I'm afraid (the objrmap work causes some porting difficulty).
From: Wakko Warner [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 20:50:59 -0400

> I don't know. What if you have some huge application that only
> runs once per day for 10 minutes? Do you want it to be consuming
> 100MB of your memory for the other 23 hours and 50 minutes for
> no good reason?

I keep soffice open all the time. The box in question has 512mb of ram. This is one app that, even though I use it infrequently, I would prefer never be swapped out. Mainly, when I want to use it, I *WANT* it now (i.e. not waiting for it to come back from swap).

This is just my opinion. I personally feel that cache should use available memory, not already used memory (swapping apps out for more cache).

--
Lab tests show that use of micro$oft causes cancer in lab animals
From: Jeff Garzik [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 20:53:05 -0400

Wakko Warner wrote:
> This is just my opinion. I personally feel that cache should use available
> memory, not already used memory (swapping apps out for more cache).

Strongly agreed, though there are pathological cases that prevent this from being something that's easy to implement on a global basis.

Jeff
From: Brett E. [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 17:49:43 -0700

Jeff Garzik wrote:
> There IMO should be some way to balance the amount of anon-vma's such
> that the sysadmin can say "stop taking 70% of my box's memory for
> disposable cache, use it instead for apps you would otherwise swap out,
> you memory-hungry kernel you."

Or how about "Use ALL the cache you want Mr. Kernel. But when I want more physical memory pages, just reap cache pages and only swap out when the cache is down to a certain size (configurable, say 100megs or something)."
From: Andrew Morton [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 18:00:38 -0700

"Brett E." [email blocked] wrote:
>
> Or how about "Use ALL the cache you want Mr. Kernel. But when I want
> more physical memory pages, just reap cache pages and only swap out when
> the cache is down to a certain size (configurable, say 100megs or
> something)."

Have you tried decreasing /proc/sys/vm/swappiness? That's what it is for.

My point is that decreasing the tendency of the kernel to swap stuff out is wrong. You really don't want hundreds of megabytes of BloatyApp's untouched memory floating about in the machine. Get it out on the disk, use the memory for something useful.
From: Jeff Garzik [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 21:24:45 -0400

Andrew Morton wrote:
> Have you tried decreasing /proc/sys/vm/swappiness? That's what it is for.
>
> My point is that decreasing the tendency of the kernel to swap stuff out is
> wrong. You really don't want hundreds of megabytes of BloatyApp's
> untouched memory floating about in the machine. Get it out on the disk,
> use the memory for something useful.

Well, if it's truly untouched, then it never needs to be allocated a page or swapped out at all... just accounted for (overcommit on/off, etc. here)

But I assume you are not talking about that, but instead talking about _rarely_ used pages, that were filled with some amount of data at some point in time. These are at the heart of the thread (or my point, at least) -- BloatyApp may be Oracle with a huge cache of its own, for which swapping out may be a huge mistake. Or Mozilla. After some amount of disk IO on my 512MB machine, Mozilla would be swapped out... when I had only been typing an email minutes before. BloatyApp? yes. Should it have been swapped out? Absolutely not. The 'SIZE' in top was only 160M and there were no other major apps running. Applications are increasingly playing second fiddle to cache ;-(

Regardless of /proc/sys/vm/swappiness, I think it's a valid concern of sysadmins who request "hard cache limit", because they are seeing pathological behavior such that apps get swapped out when cache is over 50% of all available memory.

Jeff
From: Andrew Morton [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 18:40:08 -0700

Jeff Garzik [email blocked] wrote:
>
> Well, if it's truly untouched, then it never needs to be allocated a
> page or swapped out at all... just accounted for (overcommit on/off,
> etc. here)
>
> But I assume you are not talking about that, but instead talking about
> _rarely_ used pages, that were filled with some amount of data at some
> point in time.

Of course. My fairly modest desktop here stabilises at about 300 megs swapped out, with negligible swapin. That's all just crap which apps aren't using any more. Getting that memory out on disk, relatively freely, is an important optimisation.

> These are at the heart of the thread (or my point, at
> least) -- BloatyApp may be Oracle with a huge cache of its own, for
> which swapping out may be a huge mistake. Or Mozilla. After some
> amount of disk IO on my 512MB machine, Mozilla would be swapped out...
> when I had only been typing an email minutes before.

OK, so it takes four seconds to swap mozilla back in, and you noticed it.

Did you notice that those three kernel builds you just did ran in twenty seconds less time because they had more cache available? Nope.

> Regardless of /proc/sys/vm/swappiness, I think it's a valid concern of
> sysadmins who request "hard cache limit", because they are seeing
> pathological behavior such that apps get swapped out when cache is over
> 50% of all available memory.

We should be sceptical of this. If they can provide *numbers* then fine. Otherwise, the subjective "oh gee, that took a long time" seat-of-the-pants stuff does not impress.

If they want to feel better about it then sure, set swappiness to zero and live with less cache for the things which need it...

Let me point out that the kernel right now, with default swappiness, very much tends to reclaim cache rather than swapping stuff out. The top-of-thread report was incorrect, due to a misreading of kernel instrumentation.
From: Rik van Riel [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 21:47:45 -0400 (EDT)

On Wed, 28 Apr 2004, Andrew Morton wrote:
> OK, so it takes four seconds to swap mozilla back in, and you noticed it.
>
> Did you notice that those three kernel builds you just did ran in twenty
> seconds less time because they had more cache available? Nope.

That's exactly why desktops should be optimised to give the best performance where the user notices it most...

--
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan
From: Rik van Riel [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 21:46:35 -0400 (EDT)

On Wed, 28 Apr 2004, Andrew Morton wrote:
> You really don't want hundreds of megabytes of BloatyApp's untouched
> memory floating about in the machine.

But people do. The point here is LATENCY: when a user comes back from lunch and continues typing in OpenOffice, his system should behave just like he left it.

Making the user have very bad interactivity for the first minute or so is a Bad Thing, even if the computer did run more efficiently while the user wasn't around to notice...

IMHO, the VM on a desktop system really should be optimised to have the best interactive behaviour, meaning decent latency when switching applications.

--
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan
From: Andrew Morton [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 18:57:20 -0700

Rik van Riel [email blocked] wrote:
>
> IMHO, the VM on a desktop system really should be optimised to
> have the best interactive behaviour, meaning decent latency
> when switching applications.

I'm gonna stick my fingers in my ears and sing "la la la" until people tell me "I set swappiness to zero and it didn't do what I wanted it to do".
From: Marc Singer [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 19:29:44 -0700

On Wed, Apr 28, 2004 at 06:57:20PM -0700, Andrew Morton wrote:
> Rik van Riel [email blocked] wrote:
> >
> > IMHO, the VM on a desktop system really should be optimised to
> > have the best interactive behaviour, meaning decent latency
> > when switching applications.
>
> I'm gonna stick my fingers in my ears and sing "la la la" until people tell
> me "I set swappiness to zero and it didn't do what I wanted it to do".

It does, but it's a bit too coarse of a solution. It just means that the page cache always loses.
From: Andrew Morton [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 19:35:41 -0700

Marc Singer [email blocked] wrote:
>
> It does, but it's a bit too coarse of a solution. It just means that
> the page cache always loses.

That's what people have been asking for. What are you suggesting should happen instead?
From: Marc Singer [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 20:10:59 -0700

On Wed, Apr 28, 2004 at 07:35:41PM -0700, Andrew Morton wrote:
>
> That's what people have been asking for. What are you suggesting should
> happen instead?

I'm thinking that the problem is that the page cache is greedier than most people expect. For example, if I could hold the page cache to be under a specific size, then I could do some performance measurements. E.g., compile a kernel with a 768K page cache, 512K, 256K and 128K. On a machine with loads of RAM, where's the optimal page cache size?
From: Andrew Morton [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 20:19:24 -0700

Marc Singer [email blocked] wrote:
>
> > That's what people have been asking for. What are you suggesting should
> > happen instead?
>
> I'm thinking that the problem is that the page cache is greedier than
> most people expect. For example, if I could hold the page cache to be
> under a specific size, then I could do some performance measurements.
> E.g., compile a kernel with a 768K page cache, 512K, 256K and 128K. On a
> machine with loads of RAM, where's the optimal page cache size?

Nope, there's no point in leaving free memory floating about when the kernel can and will reclaim clean pagecache on demand.

What you discuss above is just an implementation detail. Forget it. What are the requirements? Thus far I've seen

a) updatedb causes cache reclaim

b) updatedb causes swapout

c) prefer that openoffice/mozilla not get paged out when there's heavy pagecache demand.

For a) we don't really have a solution. Some have been proposed but they could have serious downsides.

For b) and c) we can tune the pageout-vs-cache reclaim tendency with /proc/sys/vm/swappiness, only nobody seems to know that.

What else is there?
From: Marc Singer [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 21:13:03 -0700

On Wed, Apr 28, 2004 at 08:19:24PM -0700, Andrew Morton wrote:
> Marc Singer [email blocked] wrote:
> >
> > > That's what people have been asking for. What are you suggesting should
> > > happen instead?
> >
> > I'm thinking that the problem is that the page cache is greedier than
> > most people expect. For example, if I could hold the page cache to be
> > under a specific size, then I could do some performance measurements.
> > E.g., compile a kernel with a 768K page cache, 512K, 256K and 128K. On a
> > machine with loads of RAM, where's the optimal page cache size?
>
> Nope, there's no point in leaving free memory floating about when the
> kernel can and will reclaim clean pagecache on demand.

It could work differently from that. For example, if we had 500M total, we map 200M, then we do 400M of IO. Perhaps we'd like to be able to say that a 400M page cache is too big. The problem isn't about reclaiming pagecache, it's about the cost of swapping pages back in. The page cache can tend to favor swapping mapped pages over reclaiming its own pages that are less likely to be used. Of course, it doesn't know that... which is the rub. If I thought I had a method for doing this, I'd write code to try it out.

> What you discuss above is just an implementation detail. Forget it. What
> are the requirements? Thus far I've seen

The requirement is that we'd like to see pages aged more gracefully. A mapped page that is used continuously for ten minutes and then left to idle for 10 minutes is more valuable than an IO page that was read once and then not used for ten minutes. As the mapped page ages, its value decays.

> a) updatedb causes cache reclaim
>
> b) updatedb causes swapout
>
> c) prefer that openoffice/mozilla not get paged out when there's heavy
> pagecache demand.
>
> For a) we don't really have a solution. Some have been proposed but they
> could have serious downsides.
>
> For b) and c) we can tune the pageout-vs-cache reclaim tendency with
> /proc/sys/vm/swappiness, only nobody seems to know that.

I've read the source for where swappiness comes into play. Yet I cannot make a statement about what it means. Can you?
From: Andrew Morton [email blocked]
Subject: Re: ~500 megs cached yet 2.6.5 goes into swap hell
Date: Wed, 28 Apr 2004 21:33:59 -0700

Marc Singer [email blocked] wrote:
>
> It could work differently from that. For example, if we had 500M
> total, we map 200M, then we do 400M of IO. Perhaps we'd like to be
> able to say that a 400M page cache is too big.

Try it - you'll find that the system will leave all of your 200M of mapped memory in place. You'll be left with 300M of pagecache from that I/O activity.

There may be a small amount of unmapping activity if the I/O is a write, or if the system has a small highmem zone. Maybe. Beware that both ARM and NFS seem to be doing odd things, so try it on a PC+disk first ;)

> The problem isn't
> about reclaiming pagecache, it's about the cost of swapping pages back
> in. The page cache can tend to favor swapping mapped pages over
> reclaiming its own pages that are less likely to be used. Of course,
> it doesn't know that... which is the rub.

No, the system will only start to unmap pages if reclaim of unmapped pagecache is getting into difficulty. The threshold of "getting into difficulty" is controlled by /proc/sys/vm/swappiness.

> The requirement is that we'd like to see pages aged more gracefully.
> A mapped page that is used continuously for ten minutes and then left
> to idle for 10 minutes is more valuable than an IO page that was read
> once and then not used for ten minutes. As the mapped page ages, its
> value decays.

Yes, remembering aging info over that period of time is hard. We only have six levels of aging: referenced+active, unreferenced+active, referenced+inactive, unreferenced+inactive, plus position-on-lru*2.

> I've read the source for where swappiness comes into play. Yet I
> cannot make a statement about what it means. Can you?

It controls the level of page reclaim distress at which we decide to start reclaiming mapped pages. We prefer to reclaim pagecache, but we have to start swapping at *some* level of reclaim failure. swappiness sets that level, in rather vague units.

It might make sense to recast swappiness in terms of pages_reclaimed/pages_scanned, which is the real metric of page reclaim distress. But that would only affect the meaning of the actual number - it wouldn't change the tunable's effect on the system.
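Curious readers can approximate that reclaim-distress ratio from the counters in /proc/vmstat. A rough sketch; the exact counter names vary between 2.6 releases, so this assumes a kernel exporting pgscan* and pgsteal* fields:

awk '/^pgsteal/ {r += $2} /^pgscan/ {s += $2} END {if (s) printf "reclaimed/scanned = %.2f\n", r/s}' /proc/vmstat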
Andrew is silly
I'm not sure what exactly he expects from a desktop, but i suspect it's just Yet Another Case Of A Geek Being Imprisoned In His Own Ivory Tower.
Rik is right. People expect desktops to be responsive. Yeah, you lose a few milliseconds due to your disk cache being smaller - who cares? That's not time lost when it's most needed. Currently, though, Linux swaps out apps all the time, and then the user has to wait for them to come back alive - not only annoying, but that IS time lost when it's most needed.
And don't even get me started on multimedia apps - swapping out apps in that field is criminal. But probably Andrew doesn't care about that; well, no wonder, since he stuck his fingers in his ears now - how do you expect him to enjoy multimedia content without his ears properly functioning? :-)))
Have a look at SGI IRIX - that thing is heavily optimized for multimedia, and its behaviour is pretty much like Linux's with swappiness=0. And, from a subjective p.o.v., it feels so much more responsive.
Distribution makers, please configure your desktop versions by default with swappiness=0.
Read more closely
Andrew's fingers are in his ears until someone says "swappiness=0 didn't do what I want."
He *did* put in the swappiness control (around 2.5.40, actually). While he's no fan of turning the number down himself (I think he'd turn it to 110 if that were valid), from his comments it appears he wishes more people would try it before complaining.
Andrew not very silly
Andrew was the person who wrote the original low-latency patches, in case you've forgotten. I think he knows what he's talking about when it comes to desktop responsiveness.
Why can't the sys admin adjust it?
After reading this story I came to the conclusion that the kernel developers have a hard time guessing how I use my system. So I gave my system a little help, since I too have cringed at the interactive behaviour of my desktop after leaving the system unattended for a while.

Anyway, I set up some of the interfering cron and other background jobs to change /proc/sys/vm/swappiness to 30 (in my case) while they run and change it back after they are done. I also let the system settle in after a restart with a high initial value (90) and then drop the swappiness to 60; see the sketch below.

Seems to work in my case.

On the other hand, a sysadmin should not run any cron jobs on a server that are not absolutely necessary; if he doesn't know where the files are on his server, he has a bigger problem than swap-ins and swap-outs.
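A minimal sketch of the settle-in part of that scheme, assuming an rc.local-style boot script and the at(1) daemon; the swappiness values are this commenter's choices, and the 30-minute delay is an arbitrary illustration:

# at the end of a boot script such as /etc/rc.local:
echo 90 > /proc/sys/vm/swappiness                               # swap freely while everything starts up
echo "echo 60 > /proc/sys/vm/swappiness" | at now + 30 minutes  # then settle at the day-to-day value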
What might be interesting...
What might be interesting is some sort of page-in daemon. Basically, it would keep track of what pages tend to get faulted back in for long-running processes, and during periods of I/O quiescence, bring those back in from swap if they're swapped out.
I realize the description is fuzzy, but you get the idea.
Since the swapped-back-in pages are clean and not-yet-active, they can be cheaply discarded if something else needs the RAM.
Heh.. random useful realization
If the page-out daemon does a halfway decent job of choosing which pages to discard in the right order, then the page-in daemon need only start bringing them back in the reverse order.
Hmmm... So, if you kept track of the last "N" things paged out, for some reasonable number "N", the page-in algorithm becomes pretty trivial. Just make the list a stack, and when system I/O is idle "long enough", start popping. Since tasks can "go away" or wake up and fault things in themselves, the "pop" loop would have to check to make sure the thing it popped actually can be paged back in....
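As a sketch only, here is the shape of that pop loop in shell. The kernel keeps no such eviction log and userspace cannot fault an arbitrary page back in on another process's behalf, so the log file and the page_in helper below are entirely hypothetical stand-ins:

#!/bin/sh
PAGEOUT_LOG=/tmp/pageout.log   # hypothetical "last N pages evicted", newest last

io_is_idle() {                 # crude idle test: no processes currently blocked on I/O
    [ "$(awk '/^procs_blocked/ {print $2}' /proc/stat)" = "0" ]
}

page_in() {                    # stub for "fault this page back in from swap"
    echo "would page in: $1"
}

while sleep 10; do
    io_is_idle || continue
    # pop the stack: bring pages back in the reverse of their eviction order
    tac "$PAGEOUT_LOG" 2>/dev/null | while read -r page; do
        page_in "$page"        # a real daemon would first check the page is still swapped out
    done
    : > "$PAGEOUT_LOG"         # everything popped; empty the stack
done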
Send it to lkml!
That sounds like the perfect solution to the updatedb problem. Let the devs know about it!
Please see the CK patch for this!

Which can be found here:

http://members.optusnet.com.au/ckolivas/kernel/index.html

Why does nobody mention it? It is probably a good attempt at solving this problem.
Quoting from the page for a good description:
I use that patch
And it works well for a KDE user with 1GB RAM and 2GB swap. OK, I've got lots of RAM and swap, so I may be an unusual case, but I find that when (as is usual) I have lots of buffers and cache, the thing doesn't swap. When I've got memory pressure, and thus not much in buffers and cache, I get lots of swapping.
Not enough RAM?
I think that normal usage of a desktop doesn't need 1GB of RAM - I have 1 GB of RAM at home, and I don't use swap at all. I think that having 2GB of hard-drive space dedicated to swap is insane (and I know that some distros recommend that you use twice as much swap as you have RAM).

If I need that much memory I'll create a swap file and swap into it, but I don't intend to actually run an application that requires 1GB of RAM, or anything close to that.
..I think that normal usage
..I think that normal usage of a desktop doesn't need 1GB of RAM..
Maybe then, but reading this comment now from 2009, it sounds hilarious :-D
If there is not enough RAM for the big process ...

1. Steal pages holding cached I/O data.

2. Flush all of the cache's dirty I/O data (for example, when using write-back) to the hard disk.

3. Steal pages using LRU.

4. If that is not possible, then suspend, core-dump, or kill the big process.
I think that would be silly
What we need, at most, are slightly smarter algorithms for swapping out (and that only if swapping _really_ is the problem here, which I somewhat doubt). Repeat after me: use of swap is not a bug. It's a feature. Most of what the kernel jettisons into swap went there to improve performance. You get the best performance out of a system only with some amount of swap, so that the kernel has the chance to free memory which was once needed and used, but isn't required any more. A swap-in daemon is definitely going to solve the perceived problem ("Linux's use of swap") in the wrong way ("swap things back in once IO and memory pressure get lower"), which ultimately reduces performance by nullifying the kernel's fairly intelligent decisions to free now-unused-but-earlier-allocated-and-touched memory. Chances are the kernel will simply decide to throw the same pages into swap again once memory gets as tight. And paging in and out costs.

I do not experience any swapping-related slowdowns on my systems, and swap use is really minimal. I don't know what you other guys are seeing, but I thought memory in the 256-512M range was getting common, so maybe I should boot with mem=128M or something to see how Linux would perform. My systems simply do not swap to any mentionable degree. If you think swapping is the problem, use top to see how much was in swap before you resumed using some bloaty app, and then how much was in swap after. Is there a difference worth anything?
When fact is correlated to theory
The problem is the complete lack of USEFUL metrics to determine the relative cost and worth of holding file cache pages relative to holding process pages.
First, paging out and back in costs two I/Os to reclaim memory for some other process... simply discarding file cache data requires only a single I/O, and that only *IF* the file is needed again. Paged-out data is *MUCH* more likely to get used again if the process has been active in the last N minutes. Many, many pages that get file-cached are much less likely to get used again.

Real metrics to drive this decision would require keeping file metadata containing a histogram of reuse probability over time. Keep counters that indicate the life of a file object once cached, i.e., the number of accesses from the time the file was cached, and over what period. The file cache reclaim algorithm can then rightfully discard files that have a history of being dragged in, then not used for another 72 hrs. Files that are used every few minutes should be very sticky relative to a process's memory that gets hit every few days.
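A very rough userspace analogue of that policy can be scripted today (illustrative only; the kernel tracks pages rather than whole files, and atime granularity is coarse). This lists files under a tree that nobody has read in roughly 72 hours, the poster's candidates for early eviction:

find /var -type f -atime +3    # last accessed more than 3 days ago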
When fact is correlated to theory
I've been running with swappiness 0 for quite some time now with no issues at all. Much better performance, much less IO happening. This is on a GW 3.06 GHz machine with 1.5 GB RAM. The kernel should NOT page or swap at all. EVER.
What about the autoswappiness patch?

I remember Con made such a patch exactly to fix the problem of heavy applications being swapped out during IO. Has anybody played with this patch? The latest I found is patch-2.6.4-am9.
Swapping because of IO is BAD...
If I untar 600MB of data on a 512MB RAM machine, virtually everything gets swapped out. That is not only bad, it's evil! Swap should be used ONLY if some app needs more memory, not just because the disk IO will finish 5 seconds sooner.

In my opinion, using a huge disk cache is futile. You need to do the write anyway (in 99% of cases).
There are two things that I'd like to see improved:
1) if a large disk IO operation occurs (I have no precise definition...), the actual write to disk should not be delayed until all buffers are full and half the applications are swapped out.

If I download some data from the LAN, memory gets filled to its limit, and THEN the write occurs. It should occur immediately so it doesn't need so much memory; my disk is faster than my LAN, so why fill memory at all??

2) applications should be swapped out *only* if another *app* needs the memory. Period.

-> 3) give me /proc/sys/vm/buffer_memory_min and /proc/sys/vm/buffer_memory_max switches, and I will tune the system myself (if I want to, nobody is forced). I will set it to use 200MB of disk cache at most - everything will be just as smooth but updatedb will not swap out my KDE (AKA BloatyApp). I will also tell it to swap out apps to get at least 100MB of memory for cache.
Zviratko
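No such buffer_memory_min/max tunables exist in mainline. The closest a script can get today is simply watching how much memory the cache actually holds, for example:

grep -E '^(Buffers|Cached|SwapCached)' /proc/meminfo    # current cache footprint, in kB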
strongly disagree
>> In my opinion, using huge disk cache is futile. You need to do
>> the write anyway (in 99% cases).
I'm not sure I understand your reasoning.
I often work with large log files, over which I am running scripts, less'ing, or even opening in a text editor. The first pass through a 300meg file is painful, but once idle applications have been swapped out and the file is entirely in memory, my operations on this file are several orders of magnitude faster. This, like many other operations that require disk IO, does not involve any writing at all, only reading.

As for writing to the disk, I would be very irritated if my word processor always had to wait for a shot at the disk drive in order to save, just because a background process was performing continual and sustained writes. I would much rather have that file cached, with the write queued for an opportune time.
Testing and reporting is GOOD
If you untar 600MB data on a 512MB machine and virtually everything gets swapped out, you *need* to send a bug report to lkml. That simply does not happen here.
A recent problem has been reported where doing this sort of IO over NFS causes some swapping, for example. This has been tracked down and is being fixed.
I set swappiness to 0 and it doesn't do what I want.
I had to turn off swap to get decent responsiveness. Even with swappiness at 0, stuff was being swapped out that didn't need to be. So, I've been running without swap for months now. It's working out quite well. I haven't hit the OOM condition yet.
-molo
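For anyone wanting to repeat the experiment, disabling swap on a running system takes two steps (the fstab edit is what makes it stick across reboots):

swapoff -a    # pull everything back out of swap and stop using it
# then comment out the swap line(s) in /etc/fstab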
same here, however
make sure you are using the selective OOM killer, because when mozilla allocates all your memory you don't want klogd killed!!
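Kernels of the 2.6.5 era have no such knob in mainline; later 2.6 kernels grew /proc/<pid>/oom_adj, where the value -17 exempts a process from the OOM killer entirely. A sketch, assuming a kernel new enough to have it:

echo -17 > /proc/$(pidof klogd)/oom_adj    # never OOM-kill klogd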
I don't need swappiness set to zero
I run Debian sarge on 6 boxes currently, of varying configurations: the lowest end is a P2-266 MHz and the highest an XP2400+ on an Nforce2 board. Memory sizes range from 128 to 768 MB. I wonder whether the problem here _really_ is Linux swapping, as my Linux swaps only very little. Let me give you some numbers:
- XP2000+, 768 MB: server & desktop at work: 14 MB in swap after 14 days of use. Doesn't seem to ever swap any of that in regardless of what I do.
- XP2400+, 512 MB: a desktop at home. Booted daily. Sometimes runs overnight. Swap use: minimal, maybe 0-500 kB. Yes, I've got cron jobs and the like. Updatedb, too. Never noticed.

- P4 1.8 GHz, 512 MB: had 400 kB in swap last night when I had it on. I checked it for the purpose of commenting on this thread.
- Celeron 466 MHz, 384 MB: router, ns server for LAN, squid, apt proxy, ut2004 server for LAN (though can hardly run UT2004): 4 kB in swap currently after 6 days of uptime. I usually don't much touch this box, it just runs what it runs. I just did updatedb on it and _nothing_ went in swap as consequence.
- P2 266 MHz, 128 MB: no swap. Can't measure.
All hosts run kernels from 2.6.3 to 2.6.5. In my experience these two work the same swap-wise.
Do I experience brief periods of slowdown when I use apps I've left running on the system? Yes, occasionally. But I'm fairly sure it isn't about swapping. It's probably the need to bring rarely-used pages of the program binary back into memory. I've never been annoyed by these delays (and it should be noted I also run RAID1 setups on most systems), so maybe that's the issue?

Could it be that people in this thread are mostly barking up the wrong tree? Honestly, if your systems are like mine, they don't have even 1 MB of swap in use after 1+ week of uptime. In that case you shouldn't be talking about swapping. Maybe about things like the wrong pages getting jettisoned from the pagecache during heavy IO. But the kernel only has limited resources for tracking pages and making the right decisions in any case, afaik...
Tweakin'
Ehh, you'd think a high-end workstation (I built it ground up to be one, chose the hardware quite specifically for what I do) would have good response times. Perhaps I'll take a look at this.
I remember reading somewhere that it's a good idea to have twice the swap space you have RAM. Anybody wanna comment on a 4 GB swap partition? Seems like a waste of space to me. If I have 2 GB of RAM, I don't want anything getting swapped out in the first place...
I think I'll go play with swappiness, maybe even try turning off my swap.
-ryan
Swap
AFAIK swap partitions can only be 2GiB big anyway. We need large memory space for our apps (which are being prototyped on smaller systems than their final target), but we need reliability, so overcommit is off... and so we have multiple 2GiB partitions (afaik you can only have 5 or 6 swap partitions too :-( ) for a total of 8GiB swap. Yowza!
Really, I'd like to be able to have 40GiByte swap on a 4 GiByte RAM machine. 10x physical memory is what people doing scientific and database stuff often need. Really.
swap partitions can be any size
Just make sure your mkswap is up to date. The kernel handles >2 GB swap areas just fine.
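In other words, something like the following, with a hypothetical device name (swapon -s shows what the kernel actually registered):

mkswap /dev/hdb2    # initialize the whole partition, however large
swapon /dev/hdb2
swapon -s           # verify: the Size column should show the full amount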
Are you sure?
Are you sure? My experience is that it "handles" them by using the first 2 GiB :-(
amount of swap space
Yeah, swap = 2*ram is completely over the top nowadays. I think most of the reason for the 2*ram rule of thumb was that Linux used to handle small swap regions quite poorly. That isn't true anymore.
Swap = mem*2
I'm pretty sure this came from a Sun sysadmin practice (and remember that in the early days, Linux users often either had experience on UNIX stations or were getting advice from existing UNIX admins), where physical RAM was mapped into swap to ease the swapping algorithm. So, if swap was less than physical RAM, there was no real swap.
AFAIK, Linux never adhered to that model, so while that was a common sysadmin practice Back In The Day, it was never a good metric for Linux itself. Personally, I put a decent amount of RAM in the box, then estimate what my likely largest working set is, double the difference, and make a swap partition of that size.
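As a worked example of that heuristic, with made-up numbers (1 GB of RAM, an estimated 1.5 GB peak working set):

ram=1024; peak=1536                         # MB, hypothetical values
echo "swap = $(( (peak - ram) * 2 )) MB"    # double the shortfall => 1024 MB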
Till Linux-2.2.x
Until and including Linux-2.2, the swap algorithm actually needed at least mem*2 space to work properly. Not anymore.
Anno.
Swap = mem*2
This is a holdover from when RAM was scarce.
Years ago we sold IBM AIX machines with our own business software on them. We used to run machines that had 16 - 32MB of RAM and 20 - 50 "greenscreen" users on them. You expected them to swap. (And they did.) IBM had a complicated formula for figuring out how much swap you needed, but it more or less boiled down to about 1.5 times your RAM. (I usually rounded up to 2 times on the theory that it is better to waste a few MB of disk than to have the customer's machine run out of swap and have to deal with the support calls about error messages on the console.)
Once you had that number, you wanted to create several swap partitions, on different spindles, and as close to the center of each of the hard disks as you could.
If you put your ear next to the machine you would hear the disk drives clicking away nonstop.
Thinking back, these machines did amazing things with very meager resources. Can you even boot Linux in 32MB anymore?
> and as close to the center
> and as close to the center of each of the hard disks as you could.
You must have meant the outer edge, because the transfer rate at the edge is the greatest...
Swap? We don't need no stinkin Swap
Desktop users,
Delete your swap partition. Swap is BULL. I'm running desktops with 256 and 768 megs of ram, 2.4 latest kernels. I hammer them with multimedia and games (xmame and quake3 mostly). I NEVER run out of memory, and suffer NO SWAP LATENCY ISSUES!
The server machine is a different story, of course......
--t
Swap still helpful
Remember, the basic unit of utility here is the reference. If you consider the entire working set of data to include FILES mapped into RAM, then it becomes clear why one might still want a swap routine. How often do you bring up the preferences menu in Mozilla? Somewhat at first, rarely after you've adjusted it to how you want it. How often does Mozilla reference things in its file cache? Repeatedly. Just a contrived example, but those who've studied the situation in more detail may be able to provide at least some real-life examples of the tradeoff involved.

But basically, you want to use RAM for high-speed access to repeatedly accessed data, be it in a file or in some memory region. So even though you won't run out of memory for programs, some of it may well be useless to you. Tradeoff!
The Swap is EMBARRASSING.
I'm a Linux advocate. I use Linux for everything I do.
Let me tell you:
The swapping is totally embarrassing.
I will see guests, and say, "Oh! Check out GNOME; This is awesome!" They are excited about the prospect of seeing something different.
I take them to my computer, which was functioning perfectly fine -
EMACS, Evolution, Mozilla- all open, all working just fine-
And now I come back, and nothing works. The hard drive goes into overtime. Chug chug chug chug chug. Chug chug chug chug chug.
You click to bring up a window, and it comes up blank- X doesn't have whatever it needs to fill in the window.
Chuuuuuuuug chug chug chug chug. Chuuuuug chug chug chug chug chug.
"Linux is awesome; You're going to love this. Hang on a moment..."
Chuuuuuuug chug chug chug chug. Chuuuuug chug chug chug chug.
"So, this is Evolution; See?" (click to view the Contacts.)
Chuuuuuuug chug chug chug chug. Chuuuuuug chug chug chug chug.
"And, um... Here's EMACS, over on another desktop."
(Screen goes blank.)
Chuuuug chug chug chug chug. Chuuuuuuug chug chug chug chug.
"And there's Mozilla."
Chuuuuuuuuuuuuuuuuuug chug chug chug chug.
Chuuuuuuuuuuuug chug chug chuuuuuug chug chug..
Chuuug.
(some icons appear)
Chuuuuuuuuuug chuug chug chug chug.
(some advertising blocks appear.)
Chuuuuuuuuuuuuug chug chug chug chuuug.
Chuuuuuuuug chug chug chug chug.
(page appears.)
"Um, thanks... I uh... I think I've seen enough."
This is, frankly, the most embarrassing thing about Linux right now.
After a period of use, the system starts to get "unfrozen" and usable again. Not just usable- smooth, even!
But then, leave for an hour, come back, and suddenly it's hog slow again, and you have to let it thaw, aaaalllll over again.
Something Evil happens in that hour. Something Evil that should not happen again.
/proc/sys/vm/swappiness is not a myth!
echo 0 > /proc/sys/vm/swappiness
This then doesn't happen. I've been playing for the last half day or so with this since I've got a cron-heavy box running an interactive app I'm having to use only every hour or so, and the behaviour you describe was getting to me.
THIS REALLY DOES WORK, TRY IT
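To make that setting survive a reboot, most distributions read /etc/sysctl.conf at boot, so the persistent equivalent is a one-line entry:

# /etc/sysctl.conf
vm.swappiness = 0

Or apply it immediately with: sysctl -w vm.swappiness=0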
I just went to apply that to my system....
And found that it's a Linux 2.6 thing.
Crap.
I wasn't quite ready to jump from 2.4 yet....
10 seems to be a good compromise
I have had a good experience with swappiness at 10. Application latency has improved **very** much since I started using this setting. Before that, many fat applications were swapped out after some period of inactivity. With a setting of 10 this happens only in rare cases.
Anyone else have oom-killer issues?
I find swappiness=0 quite responsive. Right up until the oom-killer kills mozilla. Thankfully, I also run session-saver, so it just takes a few seconds to reload, but on the other hand, this is also how I avoided the 30 seconds of grinding when mozilla hasn't been touched in a day or two.
Looking at the swap usage, it doesn't look like it even tried to swap an app out before killing one. What am I missing?
256mb memory, 768mb swap, 2.6.15 kernel
Humor chug chug chug
Thanks for the excellent and humorous description of what heavy swapping feels like.
TOTALLY AGREED! I like to sho
TOTALLY AGREED! I like to show off my laptop to others, but I'm sick of waiting for the system to swap stuff in/out. Oddly enough, I NEVER had this problem until I started using 2.6 kernels (2.4 was fine). Thanks for the description--I thought my laptop was nuts.
Yeah, echo 0 > /proc/sys/vm/swappiness DOES WORK. Andrew Morton is wrong about this on desktop/laptop systems. I use it myself now, and I enjoy the fast response. It makes Linux avoid swapping unless the system is running out of RAM.
Easy solution to swapping
I solved this problem on my systems a few years ago - I bought more RAM and deleted the swap partition...
adjust swappiness in your cronjob
don't like updatedb's aftereffects? use this as your cron job:

old=`cat /proc/sys/vm/swappiness`
echo 0 >/proc/sys/vm/swappiness
updatedb
echo $old >/proc/sys/vm/swappiness
With the obvious problem that
With the obvious problem that if you apply the same fix to another cron job (makewhatis or the rpm update) which starts *after* updatedb has set swappiness to 0 and ends *after* updatedb has restored it, that job will save 0 as the "old" value and you end up with a system with swappiness stuck at 0.

One fix would be to gather all these expensive jobs in one cron job and change swappiness only there.
One fix would be to gather all these expensive jobs in one cron-job and change swappiness only there.
Solving the obvious problem
Just save the desired system default in a config file, and reference that in the cron scripts.
Better still, add a column to crontab, and allow cron to set it -- or have cron tell the kernel to handle swappiness=0 ONLY for this process tree. A similar switch could be made available for other programs, a la nice...
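A minimal sketch of the config-file idea (the path and wrapper name here are made up): keep the administrator's chosen default in one place and have every expensive job restore from it, rather than from whatever value happened to be live when the job started:

#!/bin/sh
# /usr/local/bin/lowswap -- run a command with swappiness lowered,
# then restore the system-wide default from a config file.
default=`cat /etc/default/swappiness`    # the file contains e.g. "60"
echo 0 > /proc/sys/vm/swappiness
"$@"
echo $default > /proc/sys/vm/swappiness

The cron entries then become "lowswap updatedb", "lowswap makewhatis", and so on, and no ordering race can wedge the system at 0.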
Individual Application Immunity.
While digging through the thread, I had this thought: rather than changing the way the kernel works, would it be simpler to just add some option that allows a sysadmin to assign "swapping immunity" to specific individual apps?

This gets down to the way I work on the desktop. I usually have one "main app" that I'm working in: IDE, office suite, text editor, etc. Then I have a bunch of "periphery apps" that I visit infrequently. I would like my "main app" to never swap, but I don't care as much about the "peripheral apps". This is usually no big deal, since normal swap rules keep the "main app" in memory and swap the others if needed, but I'd like to switch back to my IDE from an hour of web research and not have the whole desktop stall -- even for a few seconds. 4-5 seconds is not acceptable. Likewise, I couldn't care less if some background "find" job took 4 seconds longer to run because the system couldn't swap out the "bloatapp" that I'm actually using. When I'm sitting there in front of a GUI or a command line, background (non-interactive) processes should take a back seat! Especially those ubiquitous cron jobs [and worse, anacron] that every RPM I install seems to dump into /etc/cron.daily & friends -- yes, I'm using RedHat ;-) -- maybe I should write a cron job to delete all that crap....
Good suggestion, but..
I like your suggestion. I also liked the idea of a selective OOM killer. But let's be practical for a second: this would add more to the kernel, and to compile time, than is really needed. Why not simplify this using syscalls and /proc to let the kernel know which apps get niced into swap?

If you are able to dynamically prioritize processes in the VM, then the VM will more than likely swap out the "lower priority" apps.

I'm not saying that nice is a good solution to this, just an alternative. Maybe this isn't practical either, but maybe the ability to prioritize apps' swappiness on the fly would be better than a compile-time option?
Just a suggestion
App Survivor
That sounds like an immunity challenge. Sir, you get to stay on the island.
sticky bit
Can't we use the sticky bit for this?

(chmod +t your-fav-app)
Maybe I'm outdated here, but,
Maybe I'm outdated here, but didn't/doesn't the sticky bit already mean "stick the app in memory"/"keep it loaded after exit so it loads faster"? (If it still does, well, I wouldn't mind having it also "stick" its pages in RAM.)

Also, I think the suggestion of a 'nice'-style solution would be Doing It The Philosophy Way: a tool such as 'swappiness' (plus some kernel functions, so developers can make use of it in-app) would offer a manageable way to have application-specific swappiness, while still keeping a system default/fallback swappiness in /proc/sys/vm/swappiness.
I found this when googling for the sticky bit: http://linux.about.com/od/commands/l/blcmdl1_chmod.htm says "On older Unix systems, the sticky bit caused executable files to be hoarded in swap space. This feature is not useful on modern VM systems, and the Linux kernel ignores the sticky bit on files. Other kernels may use the sticky bit on files for system-defined purposes. On some systems, only the superuser can set the sticky bit on files."
... a professor saying "use this proprietary software to learn computer science" is the same as an English professor handing you a copy of Shakespeare and saying "use this book to learn Shakespeare without opening the book itself."

- Bradley Kuhn
Your signature
I usually don't comment on signatures, but I wanted to make an exception this time.
1) Although linux is like a religion to many, in truth you can learn computer science from proprietary software just as well as you can from any other type.
2) Even if your point were valid about being unable to learn CS from proprietary software, your analogy makes no sense whatsoever.
3) The fact that someone apparently made this statement does not in any way add any validity to your argument.