Everybody does Resource Management

Oil Refinery as Resource Management

«C programmers think memory management is too important to be left to the computer. Lisp programmers think memory management is too important to be left to the user.»

Ellis and Stroustrup, The Annotated C++ Reference Manual.

I’m a C++ developer; that’s where I get paid from. I started with C++ early. My first step in programming was Harvard’s fantastic CS50 online course, initially in C, and then I moved on to two more sources: Dan Grossman’s Programming Languages materials and the CS106B Programming Abstractions materials from Stanford. This last course was taught in C++, and it clicked with me for no particular reason; perhaps just the stubbornness of doing things the hard way, or the arrogance of thinking that Java and C# are unnecessary simplifications, who knows. Anyway, it clicked with me, and I stayed with it.

I have worked with old C codebases [1] and I have worked with interoperability between runtimes. I’ve seen C++ done the C way, and .NET interop with native libraries done the… no, in no reasonable way. And there is one pattern I keep seeing all over, at least in my limited experience in the field: Resource Management is most often a mystery.

The part that pains me the most is seeing C++ done the C way: it makes me think we never got the zero-cost abstractions that C++ offered over C, and that we walked away thinking OOP was what C++ was about. The part that worries me is seeing garbage-collected interop code not being understood, and the Garbage Collector left as a happy enigma.

So, what is resource management, and how is it done?

Resource Management is the problem of, well, managing resources. Computers have a limited set of resources; programs are led to believe resources are virtually infinite, but in reality there are walls that can be hit. Memory is the most obvious one, and without loss of generality we can reduce the whole problem to it. There are also file handles (operating systems limit the number of files that can be open at once), the sockets behind a connection, you name it. Resources are acquired and, for the health of the system, must not be misused: give them back as soon as you don’t need them any more. If you don’t, in the best case only you will suffer; in the worst case you can take an entire system down. Hopefully, a proper operating system will simply kill your abusive program in order to protect the rest of the system.

But returning resources is not that simple. It’s clear that if you don’t return things you no longer need, you have a resource leak. But if you return something while you still need it, you have a runtime error: your program expects a resource to be available and doesn’t understand why it is not there. And there’s the chance of returning a resource twice: the second time you ask for a resource to be returned, the system will be a bit lost: what are you trying to return to me? Hmm… you must be out of your mind, I’d better kill you now before you make more of a mess. Again, that’s what a good OS will hopefully do. You have a worse problem if it doesn’t.

Time to get technical.

Without loss of generality, I’ll reduce the case to memory management. Once memory is acquired, three dangers arise:

  • forget-to-free: This is where everything starts. Don’t return memory, and your system will run out of it. Performance can also be severely hurt: not just because at some point swapping memory out to the far slower disk will start to happen, but also because your allocators will suffer from maintaining an unnecessarily large heap.
  • use-after-free: free a resource too early, that is, free it and then try to use it again. What happens here depends on your system and your underlying runtime. You might have programmed your system to throw an exception and recover what was lost, if recovery is even possible, with the huge performance penalty that comes with it. In an unprotected world like that of C and C++, dereferencing an invalid pointer might make the kernel segfault your program if your allocator actually returned that memory to the OS: in that case it is not yours any more, so touch it and the kernel will kill you. If it is still yours, it might have been overwritten with garbage or given to a different piece of your system, which leads to memory corruption, and to wishing for a crash as quickly as possible so that you realise your system went nuts before it sets the world on fire.
  • free-after-free: return a resource, and then return it again. This depends again on your system and your underlying runtime. A paranoid checker could make sure this memory belongs to you, but such checks can be extremely expensive to compute. A naive performance-oriented runtime would just add whatever you hand it to its free list. But if you already freed this memory before, someone else might have obtained it through a different acquisition request in the meantime: when you free it a second time, you have basically corrupted someone else’s memory. Another crash coming.
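To make the free-after-free scenario concrete without invoking undefined behaviour, here is a minimal sketch of a naive free list that hands out slot indices instead of raw pointers. It is a toy model, not how any real allocator is implemented; `NaiveSlotPool` and `second_release_aliases_two_owners` are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Toy allocator over slot indices (hypothetical, for illustration only).
// Indices stand in for pointers so the hazard can be observed safely.
class NaiveSlotPool {
public:
    explicit NaiveSlotPool(std::size_t slots) {
        // Initially every slot sits on the free list; slot 0 ends up on top.
        for (std::size_t i = slots; i > 0; --i) free_list_.push_back(i - 1);
    }

    // Hands out the most recently freed slot (LIFO order).
    std::size_t acquire() {
        if (free_list_.empty()) throw std::bad_alloc();
        std::size_t slot = free_list_.back();
        free_list_.pop_back();
        return slot;
    }

    // Performance-oriented and trusting: no check that `slot` is in use
    // or that the caller still owns it.
    void release(std::size_t slot) { free_list_.push_back(slot); }

private:
    std::vector<std::size_t> free_list_;
};

// Returns true when a second release of the same slot causes two
// different parties to be handed the very same slot.
bool second_release_aliases_two_owners() {
    NaiveSlotPool pool(4);
    std::size_t mine = pool.acquire();
    pool.release(mine);                   // first release: legitimate
    std::size_t theirs = pool.acquire();  // someone else receives that slot
    pool.release(mine);                   // free-after-free: accepted silently
    std::size_t third = pool.acquire();   // a third party gets it as well
    return theirs == third;
}
```

In this model, calling `second_release_aliases_two_owners()` returns `true`: after the second release, two live "owners" hold the same slot, which is the corruption described in the last bullet.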

Nothing here should be new to any intermediate programmer. You seriously have a problem if this is new to you in a production environment!

Note the repeating pattern: in use-after-free, someone else might have acquired the memory just before you reuse it, and in free-after-free you freed something someone else could have acquired in the meantime. There’s an idea of multiple players and of ownership of resources: whoever expects to own a resource should not be surprised by the wrongdoings of those who were not supposed to know about it!
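One way to make "you no longer own this" something a runtime can check is to tag every handle with a generation number, so that a stale handle is recognised and refused. The sketch below is an assumption-laden toy, not something the text above prescribes: `Handle`, `CheckedPool` and their members are hypothetical names, the pool stores plain `int` values, and generation wrap-around is ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical handle: a slot index plus the generation it was issued in.
struct Handle {
    std::size_t index;
    std::uint32_t generation;
};

// Toy pool that refuses stale handles instead of trusting the caller.
class CheckedPool {
public:
    explicit CheckedPool(std::size_t slots) : slots_(slots) {
        for (std::size_t i = slots; i > 0; --i) free_list_.push_back(i - 1);
    }

    Handle acquire(int value) {
        if (free_list_.empty()) throw std::bad_alloc();
        std::size_t index = free_list_.back();
        free_list_.pop_back();
        slots_[index].value = value;
        slots_[index].live = true;
        return Handle{index, slots_[index].generation};
    }

    // Returns nullptr for a stale handle: use-after-free is refused.
    int* get(Handle h) {
        return owns(h) ? &slots_[h.index].value : nullptr;
    }

    // Returns false for a stale handle: free-after-free is refused.
    bool release(Handle h) {
        if (!owns(h)) return false;
        slots_[h.index].live = false;
        ++slots_[h.index].generation;  // invalidates every older handle
        free_list_.push_back(h.index);
        return true;
    }

    // Slots acquired and never released: the forget-to-free count.
    std::size_t live_count() const {
        std::size_t n = 0;
        for (const Slot& s : slots_) n += s.live ? 1 : 0;
        return n;
    }

private:
    struct Slot {
        int value = 0;
        std::uint32_t generation = 0;
        bool live = false;
    };

    bool owns(Handle h) const {
        return h.index < slots_.size() && slots_[h.index].live &&
               slots_[h.index].generation == h.generation;
    }

    std::vector<Slot> slots_;
    std::vector<std::size_t> free_list_;
};
```

With this design, a handle that was released can neither read the slot nor release it again, even after the slot has been handed to a new owner. The price is one extra comparison per access and a few bytes per slot, which is far cheaper than a full ownership search but still not free.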

Don’t be so naive as to think that this wouldn’t happen in a single-threaded environment: your runtime often runs separate threads you didn’t program, your OS libraries often do work in the background, and your OS might as well play with your allocators in the meantime. Even in a truly single-threaded program, there are still often multiple objects sharing resources and allocators. There’s no such thing as a single-threaded world. Even if you’re doing embedded work on tiny memories without any kernel administration, there is still the interface with your allocators: you don’t know what they are doing with your memory unless you implement them yourself. And have you implemented your own malloc and free?

OK, so we’re facing a complicated problem that needs a solution. What are the options? Historically, it’s all described in the quote at the beginning of the post: the C way, that is, the manual way, or the Lisp way, that is, the automatic, garbage-collected way. There’s a third one, pioneered by C++ but only really exploited recently by the most modern C++ standards and by Rust: Ownership Semantics.
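As a small taste of what ownership can look like in modern C++, here is a minimal sketch built on the standard library's `std::unique_ptr`. The `Tracked` type, the `alive_count` counter, and the `consume` and `ownership_demo` functions are hypothetical helpers invented for this illustration; the sketch assumes C++14 or later for `std::make_unique`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical counter of live Tracked objects, used only to observe releases.
int alive_count = 0;

struct Tracked {
    Tracked() { ++alive_count; }
    ~Tracked() { --alive_count; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};

// Takes ownership by value: the resource is released when `owned`
// goes out of scope at the end of this function.
bool consume(std::unique_ptr<Tracked> owned) {
    return owned != nullptr;
}

int ownership_demo() {
    {
        std::unique_ptr<Tracked> a = std::make_unique<Tracked>();
        assert(alive_count == 1);
        // std::unique_ptr<Tracked> b = a;          // would not compile: no copies
        std::unique_ptr<Tracked> b = std::move(a);  // explicit ownership transfer
        assert(a == nullptr);                       // a moved-from unique_ptr is null
        bool received = consume(std::move(b));      // released once, inside consume
        assert(received && alive_count == 0);
        (void)received;
    }  // a and b are both empty here, so nothing is freed twice
    return alive_count;
}
```

There is no explicit `delete` anywhere, so there is nothing to forget. The only way to hand the resource to someone else is a visible `std::move`, after which the previous owner holds a null pointer rather than a dangling one.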

  1. And I don’t like them at all!
