Blocks, Episode 2: Life Cycles

10 Jul 2009 in Programming on Objective-c

Blocks are quite special constructs. The chief reason for this is the way that they are able to capture the lexical scope in which they were defined, keeping the values of variables defined on the stack preserved with them. While this is very powerful, it leads to some questions of memory management, and therefore some new rules to learn. To begin with, we’ll look at a block’s life-cycle.

Blocks start out on the stack

All blocks are initially created on the stack. If they aren’t kept around anywhere, for instance if they’re only passed into a synchronous API, then they will remain there for their entire lifetime, and will simply go away when the enclosing stack frame returns. While stack-based, a block has no effect on the storage or lifetime of anything it accesses.

Blocks can be copied to the heap

If you want to keep a block around, it can be copied to the heap. This is an explicit operation, which will not happen for you. As a result of being copied to the heap, a block gains proper reference counting in the manner of all Objective-C or CoreFoundation objects. When a block is first copied, it takes its captured scope with it, retaining any objects as necessary to keep them around.

Blocks have their own private const copies of stack variables

If a block references a stack variable, then when the block is initialized it is given its own copy of that variable, with one twist: the copied variable is declared const. This means that you cannot change the values of referenced variables directly from inside your blocks (without a little extra sauce at least, see below). However, pointer types remain pointers: you cannot assign a new address value to a pointer, but the value it points to can still be modified. This applies to Objective-C objects, so these can be modified directly. The same holds for CoreFoundation types: the typedefs for most of those with mutable and immutable variants define those types as [type] * and const [type] * respectively, but the const added by the capture applies to the pointer itself rather than to what it points at, so your CFMutableStringRefs and CFMutableArrayRefs remain mutable from within your blocks without any type-casting.

Mutable stack variables must be declared with the `__block` keyword

If you have an integer counter which must be modified by the code in a block, then you will need to declare it as __block int x rather than just int x. Doing so makes the compiler arrange for the variable to live in storage that can be shared by every block that uses it (and moved to the heap once one of those blocks is copied), and it is no longer declared const inside the block itself. A side-effect of this is that the variable is shared by all blocks which access it, so modifications made by one block will be seen by another; they do not get their own mutable copies of the variable’s initial value. This also applies to the stack frame containing that variable, so modifications made later in that stack frame will also be visible to your blocks.

Globals ‘just work’

Globals are globals. They exist once in memory, whether it be on the heap or in the mutable data segment of a binary file. Blocks don’t need to do anything special with them. Blocks accessing them concurrently from different threads still need to synchronize their access, just like everything else.

How does it work?

Imagine we have a simple function which uses a block, passing it to a hypothetical repeat() function, which takes an iteration count and a block to perform. Our simple function is written like this:

void doSomething( void )
{
    int local = 1;
    __block int shared = 0;
    repeat( 20, ^{
        shared += local;
    });
}

When this function is called, its stack frame will look something like this:

---------Stack-------                ---------Heap--------
int local;
=====================
__block int shared;
=====================
const int _local;
^{
    shared += _local;
}
---------------------                ---------------------

You can see that the chunk of memory for the block includes a new const int variable: this is the const copy of the stack’s local variable. It does not have a copy of the shared variable from the stack, since that was declared using the __block storage specifier.

If the block is never copied, then nothing here changes. Once the block is copied, however, a copy of the block is created on the heap, and the __block variable it references is moved to the heap. Your memory now looks something like this:

---------Stack-------                ---------Heap--------
int local;
---------------------                ---------------------
                                     __block int shared;
---------------------                =====================
const int _local;                    const int _local;
^{                                   ^{
    shared += _local;                    shared += _local;
}                                    }
---------------------                ---------------------

Two copies of the block now exist, each with their own const copy of the local variable. The shared variable still exists in one place, but that place has now moved from the stack to the heap. Variables moved to the heap in this way have a reference count, and are retained by each copy of a block which references them. In this case the shared variable has a retain count of 2: one for the stack-based block, and one for the heap-based block.

If the stack unwinds, the block there will release its reference on the shared variable automatically, leaving it with a reference count of 1. Likewise if the heap-based block is deallocated first, it will do the same, again leaving the shared variable with a reference count of 1. When both blocks have been deallocated, the shared variable will also be released.

Objects and `__block`

For the most part, you won’t ever need to worry about storage classes for Objective-C objects themselves. They are pointers to memory allocated on the heap, so your blocks will be able to use them automatically. When a block is copied, the runtime magic will retain any Objective-C types (which also includes CoreFoundation types thanks to a little of the deep magic in the guts of the Foundation framework) automatically, and it will release them when the block is deallocated.

The only time you’ll need to use the __block storage specifier for an ObjC type is when you want to assign a new value to the variable itself from inside a block. For comparison, look at the two statements inside the following block:

id object = [MyApp makeAnObject];
repeat( [object count], ^{
    [object setValue: @"hello" forKey: @"greeting"]; // ok
    object = [object nextObject];                    // error
});

The first statement is fine. Even though it is modifying the object, it’s not modifying the value of the variable: that still contains the address of the object on the heap. The second statement, however, actually modifies that variable, making it point to another object, and that falls foul of the copied variable’s new const declaration. To make this work, we use __block like so:

__block id object = [MyApp makeAnObject];
repeat( [object count], ^{
    [object setValue: @"hello" forKey: @"greeting"]; // ok
    object = [object nextObject];                    // ok
});

Block memory management

The Blocks runtime includes support for both the C and Objective-C languages. Blocks are implemented in such a way that they are also first-class citizens in the Objective-C runtime: they all have an isa pointer which references an Objective-C class through which the runtime can access methods and storage. The block objects don’t implement everything that you might want to use, however; they implement only the memory-management methods: -copy, -release and -autorelease, plus -retain in order to play nicely with collections. You should always use -copy rather than -retain. Why? Because a stack-based block will not be copied to the heap if you send it a -retain message; it will only be copied upon receipt of the -copy message. A heap-based block, on the other hand, will simply respond to a -copy message by incrementing its retain count.

The C variants of these functions provided by the Blocks runtime are Block_copy() and Block_release(). These work in a similar manner to their Objective-C counterparts, with the exception that they must occur in matched pairs: every Block_copy() has to be balanced by a Block_release(). As such, it is highly recommended that you use the Objective-C methods instead of these wherever possible.

That’s all for tonight; it’s now 12:30am and if I keep this up much longer my wife will never have sex with me again. I may be a serious language-runtime geek but sorry, sex will always come first (yay, a pun!).

If you have any questions on this, take a look at the plblocks-devel mailing list and specifically my reply here.

AQBlog
