I just found an all-new way to bring my Linux x86 laptop to a grinding halt ("ping" still works but it seems the kernel is totally busy with managing itself in an endless loop) via an application which allocates far too many memory segments via |mmap()|'ing /dev/zero - I am not sure whether this is a DoS attack against the MMU subsystem or just an out-of-swap condition, the machine just hangs.
OK... Linux (SuSE 8.2+Debian Sarge) seems to be vulnerable... but how would the Solaris kernel deal with a userspace application which runs amok like that (e.g. mmap()'ing millions of small memory segments between 1 byte and 80000 bytes) ? Is it possible to bring the Solaris kernel to a similar "halt" condition (the question is more of a theoretical nature for the kernel gurus here whether the Solaris kernel has extra protections for such cases) ?
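The program that triggered the hang is not shown in the thread. As a rough illustration of the allocation pattern described above, here is a minimal and deliberately bounded Python sketch: it maps a small number of private segments of /dev/zero, each between 1 and 80000 bytes, touches each one, and releases them again. The function name `map_small_segments`, the default count of 100 and the fixed random seed are assumptions made for this sketch. The count is kept low on purpose, both to stay harmless and because Python's `mmap` object duplicates the file descriptor for every mapping.

```python
import mmap
import os
import random


def map_small_segments(count=100, max_size=80000, seed=0):
    """Map `count` private segments of /dev/zero, each 1..max_size bytes long.

    Returns the number of segments that were mapped successfully.
    """
    rng = random.Random(seed)
    fd = os.open("/dev/zero", os.O_RDWR)
    segments = []
    try:
        for _ in range(count):
            size = rng.randint(1, max_size)
            try:
                seg = mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE,
                                prot=mmap.PROT_READ | mmap.PROT_WRITE)
            except OSError:
                break  # the kernel (or a resource limit) refused the mapping
            seg[0] = 1  # touch the first page so it has to be backed by memory
            segments.append(seg)
        return len(segments)
    finally:
        # Always release the mappings and the descriptor again.
        for seg in segments:
            seg.close()
        os.close(fd)
```

Calling `map_small_segments()` returns how many segments could be mapped. The scenario in the post corresponds to removing the bound and never releasing anything, which is exactly what should not be tried on a machine you care about.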
Why don't you try it and report back to us?
-- The e-mail address in our reply-to line is reversed in an attempt to minimize spam. Our true address is of the form che...@prodigy.net.

"Roland Mainz" <roland.mainz@nrubsig.org> wrote in message news:4261D870.B1AED942@nrubsig.org...
> I just found an all-new way to bring my Linux x86 laptop to a grinding
> halt ("ping" still works but it seems the kernel is totally busy with
> managing itself in an endless loop) via an application which allocates
> far too many memory segments via |mmap()|'ing /dev/zero - I am not sure
> whether this is a DoS attack against the MMU subsystem or just an
> out-of-swap condition, the machine just hangs.
>
> OK... Linux (SuSE 8.2+Debian Sarge) seems to be vulnerable... but how
> would the Solaris kernel deal with a userspace application which runs amok
> like that (e.g. mmap()'ing millions of small memory segments between
> 1 byte and 80000 bytes) ? Is it possible to bring the Solaris kernel to a
> similar "halt" condition (the question is more of a theoretical nature for
> the kernel gurus here whether the Solaris kernel has extra protections
> for such cases) ?
On Sun, 17 Apr 2005 05:30:56 +0200 in <4261D870.B1AED942@nrubsig.org>, Roland Mainz said something similar to:
:
: I just found an all-new way to bring my Linux x86 laptop to a grinding
: halt ("ping" still works but it seems the kernel is totally busy with
: managing itself in an endless loop) via an application which allocates
: far too many memory segments via |mmap()|'ing /dev/zero - I am not sure
: whether this is a DoS attack against the MMU subsystem or just an
: out-of-swap condition, the machine just hangs.
Linux overcommits memory, so this is probably what you're seeing - more anonymous memory mapped than backing store to support it, and presto, the machine starts thrashing.
On 2.6 kernels, overcommitting can be disabled (echo 2 > /proc/sys/vm/overcommit_memory), but with earlier kernels you're pretty much out of luck. The best you can do on older kernels is allocate large quantities of swap to reduce the likelihood of running out of VM.

: OK... Linux (SuSE 8.2+Debian Sarge) seems to be vulnerable... but how
: would the Solaris kernel deal with a userspace application which runs amok
: like that (e.g. mmap()'ing millions of small memory segments between
: 1 byte and 80000 bytes) ? Is it possible to bring the Solaris kernel to a
: similar "halt" condition (the question is more of a theoretical nature for
: the kernel gurus here whether the Solaris kernel has extra protections
: for such cases) ?
AFAIK, Solaris doesn't overcommit memory, so probably not.
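To check which overcommit policy a Linux machine is currently using, the /proc file named above can simply be read back. The sketch below is only an illustration: the helper name `overcommit_mode` and the wording of the descriptions are assumptions, and the meaning of the values 0, 1 and 2 is the one documented for 2.6 and later kernels, not for the older kernels discussed in this thread.

```python
# Meaning of /proc/sys/vm/overcommit_memory as documented for 2.6+ kernels.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (the default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return (numeric value, description) of the overcommit policy in `path`."""
    with open(path) as f:
        value = int(f.read().strip())
    return value, OVERCOMMIT_MODES.get(value, "unknown")
```

On a Linux box, `print(overcommit_mode())` shows the active policy. The `path` parameter exists only so the helper can be pointed at a different file when trying it out.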
Joerg Schilling 17 April 2005 15:04:44
In article <4261D870.B1AED942@nrubsig.org>, Roland Mainz <roland.mainz@nrubsig.org> wrote:
>Hi!
>
>----
>
>I just found an all-new way to bring my Linux x86 laptop to a grinding
>halt ("ping" still works but it seems the kernel is totally busy with
>managing itself in an endless loop) via an application which allocates
>far too many memory segments via |mmap()|'ing /dev/zero - I am not sure
>whether this is a DoS attack against the MMU subsystem or just an
>out-of-swap condition, the machine just hangs.
>
>OK... Linux (SuSE 8.2+Debian Sarge) seems to be vulnerable... but how
>would the Solaris kernel deal with a userspace application which runs amok

What you describe is exactly the reason why I converted the BerliOS web server from Linux to Solaris 1.5 years ago.

Linux overcommits memory allocations and later no longer knows that this happened. For this reason, it seems that Linux sometimes starts looking endlessly for the missing memory.
If you have more than one CPU in the system, the hang in Linux may happen within less than a second. With single CPU systems, there is a short time where you 'feel' that something bad is going to happen.
CJT wrote:
[snip]
> Why don't you try it and report back to us?
Because I'd like to know whether the kernel engineers added a protection for such cases. The question whether it works is a slightly different one which I cannot test this weekend... one of my SPARCs at home is down and the 2nd one is needed for Xorg work. And trying such things with the production systems at the university is a bad idea...
Mike Delaney wrote:
[snip]
> Linux overcommits memory, so this is probably what you're seeing - more
> anonymous memory mapped than backing store to support it, and presto, the
> machine starts thrashing.
>
> On 2.6 kernels, overcommitting can be disabled
> (echo 2 > /proc/sys/vm/overcommit_memory), but with earlier kernels you're
> pretty much out of luck.

Horror. Why on earth isn't this the default, e.g. overcommitting turned off ? Every Linux userspace application can force the machine to hang that way... ;-((

> The best you can do on older kernels is allocate
> large quantities of swap to reduce the likelihood of running out of VM.

Well, the laptop has 1GB with 1GB swap... I hoped this would be enough and I even added extra protection via "ulimit" - however |mmap()|'ing memory from /dev/zero does not honor any "ulimit" settings on Linux... ;-(

[snip]
> AFAIK, Solaris doesn't overcommit memory, so probably not.
Is there any reason why overcommitting is enabled in Linux by default ? Does that bring any advantage ?
On Mon, 18 Apr 2005 00:37:20 +0200 in <4262E520.D5400CAA@nrubsig.org>, Roland Mainz said something similar to:
: Mike Delaney wrote:
[snip]
: > Linux overcommits memory, so this is probably what you're seeing - more
: > anonymous memory mapped than backing store to support it, and presto, the
: > machine starts thrashing.
: >
: > On 2.6 kernels, overcommitting can be disabled
: > (echo 2 > /proc/sys/vm/overcommit_memory), but with earlier kernels you're
: > pretty much out of luck.
:
: Horror.
: Why on earth isn't this the default, e.g. overcommitting turned off ?

Since 2.4 and earlier kernels didn't even have the option of disabling overcommits, I suspect that they left the default as-is to prevent people complaining about 2.6 being slower.

: Every Linux userspace application can force the machine to hang that
: way... ;-((

Indeed.

: > The best you can do on older kernels is allocate
: > large quantities of swap to reduce the likelihood of running out of VM.
:
: Well, the laptop has 1GB with 1GB swap... I hoped this is enough and I

In this case, "large quantities of swap" -> swap >> ram (at least by a factor of 2). It's still no guarantee.

: added even extra protection via "ulimit" - however |mmap()|'ing memory
: from /dev/zero does not honor any "ulimit" settings on Linux... ;-(

The problem is that prior to 2.6, the kernel doesn't appear to properly keep track of how much anonymous memory has been allocated, so the ulimit can't do much more than restrict the max size of a single allocation.

: Is there any reason why overcommitting is enabled in Linux by default ?
: Does that bring any advantage ?
Speed? Laziness? Idiocy? I really don't know (and probably don't want to). It certainly seems to be a bloody stupid way to design a VM architecture, though.
Mike Delaney wrote:
> On Mon, 18 Apr 2005 00:37:20 +0200 in <4262E520.D5400CAA@nrubsig.org>,
> Roland Mainz said something similar to:
> : Mike Delaney wrote:
[snip]
> : > On 2.6 kernels, overcommitting can be disabled
> : > (echo 2 > /proc/sys/vm/overcommit_memory), but with earlier kernels you're
> : > pretty much out of luck.
> :
> : Horror.
> : Why on earth isn't this the default, e.g. overcommitting turned off ?
>
> Since 2.4 and earlier kernels didn't even have the option of disabling
> overcommits, I suspect that they left the default as-is to prevent people
> complaining about 2.6 being slower.

How so? My 2.4.22 kernel here has a publicly documented knob to tune (on - off) the overcommit of memory. Let's see:

$ uname -a
Linux tecra.naleco.com 2.4.22 #1 mar mar 9 18:19:51 CET 2004 i686
I downloaded this kernel from kernel.org more than a year ago, and compiled it myself for my laptop. If you care to read the Documentation file /usr/src/linux/Documentation/sysctl/vm.txt you could discover you are WRONG:
-----------------Documentation quote begins------------
This file contains the documentation for the sysctl files in /proc/sys/vm and is valid for Linux kernel version 2.4.
The files in this directory can be used to tune the operation of the virtual memory (VM) subsystem of the Linux kernel, and one of the files (bdflush) also has a little influence on disk usage.
Default values and initialization routines for most of these files can be found in mm/swap.c.
Currently, these files are in /proc/sys/vm:
- bdflush
- buffermem
- freepages
- kswapd
- max_map_count
- overcommit_memory
- page-cluster
- pagecache
- pagetable_cache
[...]
overcommit_memory:
This value contains a flag that enables memory overcommitment. When this flag is 0, the kernel checks before each malloc() to see if there's enough memory left. If the flag is nonzero, the system pretends there's always enough memory.
This feature can be very useful because there are a lot of programs that malloc() huge amounts of memory "just-in-case" and don't use much of it.
Look at: mm/mmap.c::vm_enough_memory() for more information.
---------------------Documentation quote ends---------------
You see, this is Open Source and we have the source with us.
> : Is there any reason why overcommitting is enabled in Linux by default ?
> : Does that bring any advantage ?
>
> Speed? Laziness? Idiocy? I really don't know (and probably don't want to).
> It certainly seems to be a bloody stupid way to design a VM architecture,
> though.
If you feel you can do better, by all means show us the code!
On Mon, 18 Apr 2005 09:50:51 +0200 in <d3vorc$3ei$1@news.ya.com>, z80 said something similar to:
: Mike Delaney wrote:
: > On Mon, 18 Apr 2005 00:37:20 +0200 in <4262E520.D5400CAA@nrubsig.org>,
: > Roland Mainz said something similar to:
: > : Horror.
: > : Why on earth isn't this the default, e.g. overcommitting turned off ?
: >
: > Since 2.4 and earlier kernels didn't even have the option of disabling
: > overcommits, I suspect that they left the default as-is to prevent people
: > complaining about 2.6 being slower.
:
: How so? My 2.4.22 kernel here has a publicly documented knob to tune (on
: - off) the overcommit of memory. Let's see:
[snip]
: Look at: mm/mmap.c::vm_enough_memory() for more information.
: ---------------------Documentation quote ends---------------
:
: You see, this is Open Source and we have the source with us.

Yes, we do. And if we actually look at said source, we find that the knob you refer to switches between always assuming there's enough memory, and a heuristic algorithm which attempts to *guess* at how much memory is available. The latter is the default behaviour.
As Roland discovered, there are corner cases where the heuristic fails.
Strict accounting of memory allocations, and an option to make the VM only allocate space that it knows really exists doesn't show up until the 2.5/2.6 series.
This thread has now completely left the realm of topicality for c.u.s. Over and out.
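On current kernels the strict accounting mentioned above can be observed from userspace: /proc/meminfo reports a CommitLimit line (the ceiling that is enforced only when overcommit is disabled) and a Committed_AS line (the amount of address space currently committed). The following sketch merely parses those two fields. The function name `commit_accounting` is an assumption made for illustration, and the sample numbers in the usage note are invented.

```python
def commit_accounting(meminfo_text):
    """Return (CommitLimit, Committed_AS) in kB from the text of /proc/meminfo.

    A field that is missing from the text is returned as None.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])  # first token is the number
    return fields.get("CommitLimit"), fields.get("Committed_AS")
```

For example, `commit_accounting("CommitLimit: 1500000 kB\nCommitted_AS: 900000 kB\n")` yields `(1500000, 900000)`. On a real system the input would be the contents of /proc/meminfo.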
Dimitri Maziuk 20 April 2005 22:18:16
Roland Mainz sez:
> Mike Delaney wrote:
[snip]
>> Linux overcommits memory, so this is probably what you're seeing - more
>> anonymous memory mapped than backing store to support it, and presto, the
>> machine starts thrashing.
>>
>> On 2.6 kernels, overcommitting can be disabled
>> (echo 2 > /proc/sys/vm/overcommit_memory), but with earlier kernels you're
>> pretty much out of luck.
>
> Horror.
> Why on earth isn't this the default, e.g. overcommitting turned off ?

That should've been explained to at memory management subsystem lecture of your OS 201 course. Deferred allocation is no better or worse (*) than not overcommitting, per se -- if your system doesn't have enough resources to support your job mix, you're screwed either way.
What happens if you set RLIMIT_DATA (RLIMIT_AS ?) to something reasonable (as opposed to "unlimited") and run your test?
(*) Although now that memory is cheap, underutilization is arguably a lesser evil.
Dima -- ... with the exception of January and February 1900, all Microsoft application libraries counted dates the same way. -- An Interview with Joel Spolsky of JoelonSoftware
Dimitri Maziuk <dima@127.0.0.1> wrote:
> That should've been explained to at memory management subsystem
> lecture of your OS 201 course. Deferred allocation is no better
> or worse (*) than not overcommitting, per se -- if your system
> doesn't have enough resources to support your job mix, you're
> screwed either way.

There is a big difference: If your system doesn't overcommit memory, your application will know directly that there are no resources left and will not be killed randomly (probably in the middle of a transaction) while trying to access a previously unmapped memory region.

Not overcommitting gets you determinism: If an application requests memory and gets it from the OS it can depend on it.
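The determinism argument above boils down to where the failure shows up. A minimal sketch of the "fail at request time" pattern follows, with hypothetical names (`reserve`, `run_transaction`) that are not taken from any program in this thread: the memory is requested before any work starts, and a refusal is handled cleanly instead of surfacing in the middle of the work. On a system that overcommits, the request may succeed even though the memory is not really there, so the check only gives this guarantee when the kernel does strict accounting.

```python
import mmap


def reserve(nbytes):
    """Try to map `nbytes` of anonymous memory; return the mapping or None."""
    try:
        return mmap.mmap(-1, nbytes,
                         flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    except (OSError, OverflowError, ValueError):
        return None  # the request was refused up front


def run_transaction(nbytes):
    """Reserve the working memory first, then do the (placeholder) work."""
    buf = reserve(nbytes)
    if buf is None:
        return "refused: not enough memory, nothing was started"
    try:
        buf[:2] = b"ok"  # stand-in for the real work on the buffer
        return "done"
    finally:
        buf.close()
```

`run_transaction(4096)` returns "done", while a request the kernel cannot possibly satisfy is turned down before anything has been started.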
Dimitri Maziuk 22 April 2005 03:17:54
D. Rock sez:
> Dimitri Maziuk <dima@127.0.0.1> wrote:
>> That should've been explained to at memory management subsystem
                       you --------^
>> lecture of your OS 201 course. Deferred allocation is no better
>> or worse (*) than not overcommitting, per se -- if your system
>> doesn't have enough resources to support your job mix, you're
>> screwed either way.
>
> There is a big difference: If your system doesn't overcommit memory,
> your application will know directly that there are no resources left and
> not killed randomly (probably in the middle of a transaction) while trying
> to access a previously unmapped memory region.
>
> Not overcommitting gets you determinism: If an application requests memory
> and gets it from the OS it can depend on it.
Sure, and a mission-critical transaction batch processing system would probably prefer that guarantee.
A Berkeley PDP that provides umpteen students with simultaneous vi sessions -- and if a session crashes, the BOFH can gleefully tell the poor student to "save often" (and call her "luser" behind her back) -- would probably prefer better resource utilization over that guarantee. Esp. since with proper limits on NPROC and DATA/AS the crashes won't happen too often.
And then some bright spark writes something along the "malloc 90% of available vm, sleep for 24 hours" lines, fires up a few of them in a loop on an OS that does not overcommit, and posts an "$OS kernel is full of bugs! It's starving my vi sessions!!!" message in an unrelated usenet newsgroup.
Dima (sheesh) -- The speed at which a mistyped command executes is directly proportional to the amount of damage done. -- Joe Zeff
Dimitri Maziuk wrote:
[snip]
> > Horror.
> > Why on earth isn't this the default, e.g. overcommitting turned off ?
>
> That should've been explained to at memory management subsystem
> lecture of your OS 201 course. Deferred allocation is no better
> or worse (*) than not overcommitting, per se -- if your system
> doesn't have enough resources to support your job mix, you're
> screwed either way.
>
> What happens if you set RLIMIT_DATA (RLIMIT_AS ?) to something
> reasonable (as opposed to "unlimited") and run your test?
On Linux it seems to be impossible to control the memory usage of an application via "ulimit" when the memory allocations are done via |mmap()| ... fun... ;-(
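For anyone who wants to repeat the RLIMIT_AS experiment suggested above, here is a small sketch. It rests on one assumption that the thread itself does not confirm: the getrlimit(2) manual page for current Linux kernels describes RLIMIT_AS as covering mmap(), so a mapping that would exceed the limit should fail with ENOMEM. Whether the 2.4 and 2.6 kernels discussed here behaved that way is not something the sketch can show. The function name, the 256 MB headroom and the 1 GB request are arbitrary choices for illustration. Only the soft limit is lowered, and it is restored afterwards.

```python
import mmap
import resource


def mmap_refused_under_as_limit(headroom=256 << 20, request=1 << 30):
    """Lower the soft RLIMIT_AS, then try one large anonymous mmap().

    Returns True if the mapping was refused, False if it succeeded.
    """
    # Current virtual size of this process: first field of statm, in pages.
    with open("/proc/self/statm") as f:
        vm_bytes = int(f.read().split()[0]) * resource.getpagesize()

    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    new_soft = vm_bytes + headroom
    for cap in (soft, hard):  # never raise a limit that is already in place
        if cap != resource.RLIM_INFINITY:
            new_soft = min(new_soft, cap)

    resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))
    try:
        try:
            mapping = mmap.mmap(-1, request,
                                flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
        except OSError:
            return True
        mapping.close()
        return False
    finally:
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))  # restore
```

If `mmap_refused_under_as_limit()` returns False on some system, the limit was not applied to the mapping there, which is the behaviour reported in the post above.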
Dimitri Maziuk 25 April 2005 00:23:34
Roland Mainz sez:
> Dimitri Maziuk wrote:
[snip]
>> What happens if you set RLIMIT_DATA (RLIMIT_AS ?) to something
>> reasonable (as opposed to "unlimited") and run your test?
>
> On Linux it seems to be impossible to control the memory usage of an
> application via "ulimit" when the memory allocations are done via
>|mmap()| ... fun... ;-(
Hmm, interesting. Trying to prevent an overcommit bomb via rlimits is a PITA anyway, esp. if someone is trying to kill the system deliberately.
The bottom line is, if your application calls for non-deferred allocation, a system that overcommits is a wrong tool for the job. There are applications where deferred allocation works "better" (for various values of), processing multi-million dollar currency transactions (e.g.) is not one of them. Hammer, screw...
Dima -- Sufficiently advanced incompetence is indistinguishable from malice.
Dimitri Maziuk wrote:
[snip]
> >> What happens if you set RLIMIT_DATA (RLIMIT_AS ?) to something
> >> reasonable (as opposed to "unlimited") and run your test?
> >
> > On Linux it seems to be impossible to control the memory usage of an
> > application via "ulimit" when the memory allocations are done via
> >|mmap()| ... fun... ;-(
>
> Hmm, interesting. Trying to prevent an overcommit bomb via
> rlimits is a PITA anyway, esp. if someone is trying to kill
> the system deliberately.
>
> The bottom line is, if your application calls for non-deferred
> allocation, a system that overcommits is a wrong tool for the
> job. There are applications where deferred allocation works
> "better" (for various values of), processing multi-million
> dollar currency transactions (e.g.) is not one of them.
> Hammer, screw...
Well, the application is not a transaction system... just the plain "Xorg" Xserver. Seems I may need an additional way to prevent it from hanging the system if the underlying kernel is based on Linux... ;-(