macOS CFRunLoop Internals: Scheduling High-Precision Timers and Recurring Tasks

We’re building Meld Studio, a live streaming and video recording desktop app that aims to bring high-quality effects and overlays into a real-time context for live streaming. To achieve a rock-solid broadcast on consumer hardware, we often have to dive deep into system internals. CFRunLoop is a macOS API we’ve spent a significant amount of time working with. Most recently, we optimized our preview window rendering to avoid main thread CFRunLoop stalls introduced by AppKit during NSMenuItem events and NSWindow re-ordering events. In this first post of a technical series, we’ll explore CFRunLoop in detail, learn how it can be used, and later learn about its performance characteristics.

Background

Core Foundation’s CFRunLoop is an instrumental component of every macOS and iOS application written since it was introduced in the late 90s. It is one of the lowest level APIs macOS provides to schedule timers, run event processing loops, and dispatch cross-thread tasks.

While the “run loop” terminology is somewhat specific to CoreFoundation (WebKit also uses it), it’s generally synonymous with “event loop”. The concepts and duties take on different names in different codebases:

Chromium refers to this as a MessagePump
Flutter calls this a MessageLoop
Qt calls this an EventDispatcher

All of the above use CFRunLoop under the hood, or the target platform’s equivalent.
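To make the shared idea concrete, here is a minimal sketch of an event loop in portable C++. It is a toy model only: `ToyEventLoop`, `Post`, `Run` and `Stop` are hypothetical names, not APIs from CoreFoundation, Chromium, Flutter or Qt, and a real loop would block waiting for new work rather than return when its queue is empty.

```cpp
#include <deque>
#include <functional>
#include <utility>

// Toy event loop: a queue of tasks drained in FIFO order.
// `ToyEventLoop` is a hypothetical name, not an API from any framework.
class ToyEventLoop {
 public:
  using Task = std::function<void()>;

  // Queue a task to run on a later turn of the loop.
  void Post(Task task) { tasks_.push_back(std::move(task)); }

  // Ask the loop to return once the current task finishes.
  void Stop() { running_ = false; }

  // Run queued tasks until the queue is empty or Stop() is called.
  // A real loop would sleep here waiting for new work instead of returning.
  void Run() {
    running_ = true;
    while (running_ && !tasks_.empty()) {
      Task task = std::move(tasks_.front());
      tasks_.pop_front();
      task();  // Tasks may call Post() or Stop() re-entrantly.
    }
    running_ = false;
  }

 private:
  std::deque<Task> tasks_;
  bool running_ = false;
};
```

Timers, input sources and cross-thread wakeups are all layered on top of this basic "pull the next piece of work and run it" cycle.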

It also backs a number of the Swift concurrency primitives, and a cross-platform, open source implementation of CoreFoundation has been released as their backing implementation. That source code is invaluable in gaining a better understanding of how CFRunLoop works. At just under 5k lines of quite readable C code, one could grok it at a high level in a few hours.

What do the iOS timer app and the setTimeout JavaScript API in Safari and Chromium have in common? They’re all serviced by CFRunLoop.

I’m most familiar with C++, so I’ll use a combination of C++ / Objective-C++ in the following code samples. As we go through each concept, I’ll share a bit of code that builds on top of the CFRunLoop API and provides higher level primitives similar to what is done by Flutter, Chromium and Qt.

How a thread gets a CFRunLoop

The Apple Documentation tells us there’s a sort of Law of Conservation of CFRunLoop at play here: “There is exactly one run loop per thread. You neither create nor destroy a thread’s run loop.” So CoreFoundation is going to take care of ensuring a valid CFRunLoop is available anytime we call CFRunLoopGetCurrent. This takes away the difficulties of managing the lifetime of the CFRunLoop and cleaning up before the thread exits.

Let’s take this bit of Objective-C++ code typically used to create an NSApplication. The CFRunLoop gets created in the +initialize class method of NSApplication, which runs the first time the class is messaged, so it’s available early on.

int main(int argc, const char* argv[]) {
  @autoreleasepool {
    auto* app = [NSApplication sharedApplication];
    [app run];
  }
  return 0;
}

We can confirm this by running lldb with a breakpoint as follows:

(lldb) br set -n "_CFRunLoopGet0"

Which we should see break after running the app:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x000000019323af28 CoreFoundation`_CFRunLoopGet0
    frame #1: 0x0000000196460820 AppKit`+[NSApplication initialize] + 116
    frame #2: 0x0000000192dfeff0 libobjc.A.dylib`CALLING_SOME_+initialize_METHOD + 24
    frame #3: 0x0000000192dfec9c libobjc.A.dylib`initializeNonMetaClass + 904
    frame #4: 0x0000000192e190e8 libobjc.A.dylib`initializeAndMaybeRelock(objc_class*, objc_object*, locker_mixin<lockdebug::lock_mixin<objc_lock_base_t> >&, bool) + 156
    frame #5: 0x0000000192dfe5c4 libobjc.A.dylib`lookUpImpOrForward + 884
    frame #6: 0x0000000192dfdf64 libobjc.A.dylib`_objc_msgSend_uncached + 68
    frame #7: 0x0000000100004b7c main`main + 52
    frame #8: 0x0000000192e3ff28 dyld`start + 2236

From the open source version of the CFRunLoop code, we can see that calling _CFRunLoopGet0 with nullptr for the thread ref will cause a new CFRunLoop to be initialized.

Note also that calling CFRunLoopGetMain() at any point in the program will initialize the main CFRunLoop if it hasn’t been already. So we can assume that CFRunLoopGetMain will always return a valid pointer.

In fact, this doesn’t only apply to the main thread. We can call CFRunLoopGetCurrent on any thread and CoreFoundation will create one under the hood if it doesn’t exist:

auto thread1 = std::thread([&]() {
  EXPECT_TRUE(CFRunLoopGetCurrent());
  EXPECT_NE(CFRunLoopGetMain(), CFRunLoopGetCurrent());
});

EXPECT_TRUE(CFRunLoopGetMain());

thread1.join();

Great, so there is no setup needed – we can create a new thread, call CFRunLoopGetCurrent() and start attaching timers and other sources. Take care to manage the ref counts of the CFRunLoopRef this returns. If you’re holding on to it, you’ll want to take a ref via CFRetain and release it when you’re done.
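The "one run loop per thread, created on first use" behavior can be modeled in portable C++ with a function-local `thread_local`. This is only a sketch of the observable behavior, under the assumption that lazy creation and per-thread identity are all we care about; `ToyRunLoop` and `ToyRunLoopGetCurrent` are hypothetical names, and CoreFoundation's actual bookkeeping is different.

```cpp
#include <thread>

// Hypothetical stand-in for a run loop object; not CoreFoundation's type.
struct ToyRunLoop {
  // Records which thread created (and therefore owns) this loop.
  std::thread::id owner = std::this_thread::get_id();
};

// Returns the calling thread's loop, creating it on first use.
// The function-local thread_local gives each thread its own instance,
// which is destroyed automatically when that thread exits.
inline ToyRunLoop& ToyRunLoopGetCurrent() {
  thread_local ToyRunLoop loop;
  return loop;
}
```

As with the real API, callers never construct or destroy the loop themselves; they only ask for "the current one".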

Creating timers

Timers get me all fired up. They are an essential function of CFRunLoop, and one area where Apple’s ecosystem really shines. According to the docs, CoreFoundation timers offer sub-millisecond precision. Compare that to Windows, where the default timer granularity is a lackluster 10-16ms. Chromium has documented in depth a rather creative solution they use on Windows to implement high-precision timers.

Let’s take a look at how we can schedule timers via the CFRunLoop API. To start, let’s create a wrapper to allow callers to easily schedule a repeating task:

struct RepeatingTimer {
  using Callback = std::function<void()>;

  RepeatingTimer(TimeDelta interval, Callback callback)
      : callback_holder_(std::move(callback)), interval_(interval) {
    CFRunLoopTimerContext timerContext = {
        .version = 0,
        .info = &callback_holder_,
        .retain = nullptr,
        .release = nullptr,
        .copyDescription = nullptr,
    };

    timer_ref_ = AdoptCFRef(CFRunLoopTimerCreate(
        /* allocator */ kCFAllocatorDefault,
        /* fireDate  */ CFAbsoluteTimeGetCurrent(),
        /* interval  */ interval_.ToSecondsF(),
        /* flags     */ 0,
        /* order     */ 0,
        /* callout   */ &TimerCallbackTrampoline,
        /* context   */ &timerContext));

    CFRunLoopAddTimer(CFRunLoopGetMain(), timer_ref_, kCFRunLoopCommonModes);
  }

  ~RepeatingTimer() {
    // Detach the timer, the `ScopedCFTypeRef` will release the ref on the timer after.
    CFRunLoopRemoveTimer(CFRunLoopGetMain(), timer_ref_, kCFRunLoopCommonModes);
  }

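  // Neither copyable nor movable: the timer context stores a raw pointer to
  // `callback_holder_`, which would dangle if this object were moved.
  RepeatingTimer(RepeatingTimer&&) = delete;
  RepeatingTimer& operator=(RepeatingTimer&&) = delete;
  RepeatingTimer(const RepeatingTimer&) = delete;
  RepeatingTimer& operator=(const RepeatingTimer&) = delete;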

private:
  struct CallbackHolder {
    void Run() { cb_(); }
    Callback cb_;
  };

  // Called every `interval_` seconds by the CFRunLoop
  static void TimerCallbackTrampoline(CFRunLoopTimerRef timer, void* info) {
    // Cast the opaque pointer back to our callback, and run.
    static_cast<CallbackHolder*>(info)->Run();
  }

  CallbackHolder callback_holder_;
  TimeDelta interval_;
  ScopedCFTypeRef<CFRunLoopTimerRef> timer_ref_;
};

CFRunLoopTimerCreate is what creates the timer and allows us to configure whether it is a repeating timer (by setting interval), or a single shot timer that will fire once at a future time. In this case, we configure it to begin firing now and then continue every interval_ seconds.
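It helps to have a mental model for when a repeating timer fires next. Apple's NSTimer documentation describes repeating timers as rescheduling from the scheduled fire time rather than the actual one, and as firing only once if several fire times were missed. Assuming CFRunLoopTimer (which is toll-free bridged with NSTimer) behaves the same way, the schedule can be sketched as follows. `NextFireTime` is a hypothetical helper, not a CoreFoundation function.

```cpp
#include <cmath>

// Next scheduled fire time for a repeating timer, in seconds.
// Assumption: fires stay on the grid first_fire + n * interval, and fire
// times that were missed while the loop was busy are skipped, not replayed.
inline double NextFireTime(double first_fire, double interval, double now) {
  if (interval <= 0.0) return first_fire;   // One-shot: no rescheduling.
  if (now < first_fire) return first_fire;  // Has not fired yet.
  const double elapsed_intervals = std::floor((now - first_fire) / interval);
  return first_fire + (elapsed_intervals + 1.0) * interval;
}
```

For example, a 1 second timer first scheduled at t = 0 whose run loop was blocked until t = 3.5 would next be due at t = 4, not three times in quick succession.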

Once created, we hold on to the CFRunLoopTimerRef since we will need it to remove the timer from the run loop in the destructor. For those unfamiliar with the ref counting practices used by CoreFoundation, the functions that have Create in the name generally return a newly created object with a retain count of 1. You can verify this (for debugging only!) with CFGetRetainCount. Since we are taking ownership here, we adopt the existing object into a smart pointer, keeping the retain count at 1 (ScopedCFTypeRef has similar semantics to std::unique_ptr). When our smart pointer goes out of scope, it will release the ref and (assuming we didn’t vend the CFRef to any others) free the memory.
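The ScopedCFTypeRef and AdoptCFRef helpers are not shown in this post. As a rough idea of what such a wrapper involves, here is a minimal sketch that is generic over the retain and release functions so it compiles without CoreFoundation. All names (`ScopedRef`, `Adopt`, `Retain`, the `Traits` parameter) are hypothetical; on macOS the traits would presumably forward to CFRetain and CFRelease.

```cpp
#include <utility>

// Minimal owning wrapper for a manually ref-counted handle.
// `Traits` supplies static Retain/Release; all names here are hypothetical.
template <typename T, typename Traits>
class ScopedRef {
 public:
  ScopedRef() = default;
  ~ScopedRef() { reset(); }

  ScopedRef(ScopedRef&& other) noexcept : ref_(other.release()) {}
  ScopedRef& operator=(ScopedRef&& other) noexcept {
    if (this != &other) reset(other.release());
    return *this;
  }
  ScopedRef(const ScopedRef&) = delete;
  ScopedRef& operator=(const ScopedRef&) = delete;

  // Takes over a +1 reference (e.g. from a Create function) without retaining.
  static ScopedRef Adopt(T ref) {
    ScopedRef scoped;
    scoped.ref_ = ref;
    return scoped;
  }

  // Shares a borrowed reference (e.g. from a Get function) by retaining it.
  static ScopedRef Retain(T ref) {
    if (ref) Traits::Retain(ref);
    return Adopt(ref);
  }

  T get() const { return ref_; }
  explicit operator bool() const { return ref_ != T{}; }

  // Gives up ownership without releasing.
  T release() { return std::exchange(ref_, T{}); }

  // Releases the held reference and optionally adopts a new one.
  void reset(T ref = T{}) {
    if (ref_) Traits::Release(ref_);
    ref_ = ref;
  }

 private:
  T ref_{};
};
```

The split between `Adopt` and `Retain` mirrors the Create and Get naming conventions: adopt what a Create function hands you, retain what a Get function lends you.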

With the newly created timer, we must now add it to a run loop for it to take effect. We accomplish that with CFRunLoopAddTimer in the final line of the constructor.

The RepeatingTimer class can be tested as follows:

int demo_timer() {
  auto call_me_maybe = [] { println("Hello, World!"); };

  auto t = RepeatingTimer(Seconds(1), call_me_maybe);

  CFRunLoopRun();

  return 0;
}

Run the above and “Hello, World!” gets printed every second.

We could use something like this to drive a render loop at roughly 60fps:

int demo_render() {
  auto draw_a_cat = [] {
    println(R"(
   /\_/\
  ( o.o )
   > ^ <
  )");
  };

  auto t = RepeatingTimer(Milliseconds(16), draw_a_cat);

  CFRunLoopRun();

  return 0;
}

But driving a render loop with this approach has several pitfalls. More on that later.
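One small detail is visible already: a 16ms interval is not exactly 60fps. The arithmetic can be sketched with std::chrono; `FrameMs`, `FrameInterval` and `FramesPerSecond` are hypothetical helpers introduced only for this illustration.

```cpp
#include <chrono>

// Frame interval expressed as fractional milliseconds.
using FrameMs = std::chrono::duration<double, std::milli>;

// Interval between frames for a target frame rate.
constexpr FrameMs FrameInterval(double frames_per_second) {
  return FrameMs(1000.0 / frames_per_second);
}

// Effective frame rate when a timer fires once every `interval`.
constexpr double FramesPerSecond(FrameMs interval) {
  return 1000.0 / interval.count();
}
```

A timer that fires every 16ms runs at 1000 / 16 = 62.5 frames per second, while a true 60fps cadence needs an interval of about 16.67ms, so a whole-millisecond interval slowly drifts against a 60Hz display.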

Run Loop Modes

A CFRunLoop mode is essentially a collection of input sources, timer sources, and run loop observers that can be scheduled to run in a particular run loop. Each mode has a name that’s represented by a string (CFStringRef), and each mode acts as an independent entity. Only one mode runs at a time, and when a run loop is running in a particular mode, it will only process input sources and timers associated with that mode.

Think of it as a very primitive way to group together activities that you may want to run exclusively. Say you normally service events in the default run loop mode but need to perform some occasional cleanup duties. You could break from the current run loop and enter CFRunLoopRunInMode(kCleanupMode, /* seconds */ 0.05, false);. This would allow you to perform cleanup duties while ignoring the incoming events and timers firing in the default mode.
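A toy model may help here. The sketch below is portable C++ and not CoreFoundation's implementation: `ToyModalLoop` and its methods are hypothetical names, timers are reduced to plain callbacks, and a "run" is a single pass. It only illustrates the filtering rule, including the idea of a placeholder that stands for a set of "common" modes.

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model of run loop modes. All names are hypothetical.
class ToyModalLoop {
 public:
  using Callback = std::function<void()>;
  static constexpr const char* kCommonModes = "CommonModes";

  // Marks `mode` as a member of the set of common modes.
  void AddCommonMode(const std::string& mode) { common_modes_.insert(mode); }

  // Registers a callback under one mode name. Passing kCommonModes
  // registers it for every mode that is marked as common.
  void AddTimer(const std::string& mode, Callback cb) {
    timers_[mode].push_back(std::move(cb));
  }

  // Runs one pass in `mode` and returns how many callbacks fired. Only
  // callbacks registered for that mode run, plus the common-mode callbacks
  // when `mode` itself is a common mode.
  int RunOnce(const std::string& mode) {
    int fired = Fire(mode);
    if (common_modes_.count(mode)) fired += Fire(kCommonModes);
    return fired;
  }

 private:
  int Fire(const std::string& mode) {
    auto it = timers_.find(mode);
    if (it == timers_.end()) return 0;
    for (auto& cb : it->second) cb();
    return static_cast<int>(it->second.size());
  }

  std::map<std::string, std::vector<Callback>> timers_;
  std::set<std::string> common_modes_;
};
```

In this model a callback registered under a custom mode stays silent while the loop runs in its default mode, and a callback registered under the common placeholder stays silent while the loop runs in a custom mode that was never marked as common.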

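The typical modes used are kCFRunLoopDefaultMode and kCFRunLoopCommonModes. Note that kCFRunLoopCommonModes is not a mode a run loop can actually run in: it is a placeholder for the set of modes registered as “common” (the default mode is one of them), so a timer added to it fires in any of those modes. Both constants are defined by CoreFoundation as simply the stringified name. Conceptually:

const CFStringRef kCFRunLoopDefaultMode = CFSTR("kCFRunLoopDefaultMode");
const CFStringRef kCFRunLoopCommonModes = CFSTR("kCFRunLoopCommonModes");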

Run Loop modes are very important when working with AppKit GUIs as they might be used to enter a custom mode that only handles events for a modal while it has focus.

macOS frameworks have a number of private modes that can be entered in the context of your application run loop and block your application’s run loop temporarily.

Perhaps the most common case is when the user is interacting with the application menu bar. Here the run loop can enter several different modes, such as the private com.apple.hitoolbox.windows.windowfadingmode or AppKit’s NSEventTrackingRunLoopMode. These can interrupt (and block!) your main thread CFRunLoop in order to process window system events or animations that need to be performed synchronously.

Let’s focus in for a moment on the line from the example above that adds the timer to the run loop:

CFRunLoopAddTimer(CFRunLoopGetMain(), timer_ref_, kCFRunLoopCommonModes);

It has the following signature:

void CFRunLoopAddTimer(CFRunLoopRef runLoopRef, CFRunLoopTimerRef timerRef, CFStringRef modeName);

The first argument takes a ref to the run loop. The timer passed in the second will be attached to the given run loop.

The modeName argument benefits from a closer inspection.

Let’s introduce our own mode, the “SelfishMode” which only runs tasks and timers we’re concerned with. Modifying the timer code above:

static const auto kSelfishMode = CFSTR("SelfishMode");

...

CFRunLoopAddTimer(CFRunLoopGetMain(), timer_ref_, kSelfishMode);

If we run this, we find that nothing happens. Our timer never fires. So we will need to perform some additional setup.

Rather than call CFRunLoopRun() as we do above, we need to specifically ask for our run loop to run in our mode. Let’s update our entry point:

int demo_modes() {
  static int counter = 0;
  auto increment_counter = [] {
    // Print first, then increment, so the first tick reports 0.
    println("count: {}", counter);
    counter++;
  };

  // Timer ticks every 100ms and prints the latest count
  auto t = RepeatingTimer(Milliseconds(100), increment_counter);

  // Enter the selfish run loop and stop being selfish after 1 second
  CFRunLoopRunInMode(
      /* mode                        */ kSelfishMode,
      /* seconds to run in this mode */ Seconds(1).ToSecondsF(),
      /* returnAfterSourceHandled    */ false);

  return 0;
}

And with this, we see that our counter increments at our interval until the run loop expires one second later:

count: 0
count: 1
count: 2
count: 3
count: 4
count: 5
count: 6
count: 7
count: 8
count: 9
count: 10

By the way, both CFRunLoopRun and CFRunLoopRunInMode call CFRunLoopRunSpecific under the hood. The difference is that CFRunLoopRun always uses kCFRunLoopDefaultMode and keeps re-entering the loop until it is stopped or finished. It looks something like this:

void CFRunLoopRun()
{
  int32_t result;
  do {
    result =
        CFRunLoopRunSpecific(CFRunLoopGetCurrent(), kCFRunLoopDefaultMode, 1.0E10, false);
  } while (kCFRunLoopRunStopped != result && kCFRunLoopRunFinished != result);
}

Take note of the result type returned here. Its value indicates whether the run loop was stopped, timed out, handled a source, or finished because the mode has no sources or timers left. While CFRunLoopRunSpecific is private to CoreFoundation, we can get the same run result from CFRunLoopRunInMode. This can be helpful when you want to process all pending events and exit the run loop when the work is done.
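A "drain everything that is pending, then return" helper built on that result could look roughly like the sketch below. To keep it compilable without CoreFoundation, the run call is passed in as a callable and the result codes are mirrored in a local enum. `RunResult` and `DrainPending` are hypothetical names; the numeric values are intended to match CFRunLoopRunResult.

```cpp
#include <cstdint>

// Mirrors the values of CoreFoundation's CFRunLoopRunResult enum.
enum class RunResult : std::int32_t {
  kFinished = 1,       // kCFRunLoopRunFinished: mode has no sources or timers
  kStopped = 2,        // kCFRunLoopRunStopped: CFRunLoopStop was called
  kTimedOut = 3,       // kCFRunLoopRunTimedOut: the time limit elapsed
  kHandledSource = 4,  // kCFRunLoopRunHandledSource: one source was handled
};

// Calls `run_once` until it stops reporting handled sources and returns the
// number of sources handled. On macOS, `run_once` could wrap
//   CFRunLoopRunInMode(kCFRunLoopDefaultMode, 0, true)
// and cast the result to RunResult (an assumption, not shown in this post).
template <typename RunOnce>
int DrainPending(RunOnce&& run_once, RunResult* last = nullptr) {
  int handled = 0;
  RunResult result;
  while ((result = run_once()) == RunResult::kHandledSource) ++handled;
  if (last) *last = result;  // Report why the drain ended.
  return handled;
}
```

Passing a zero timeout together with returnAfterSourceHandled set to true is what makes each call return promptly, so the helper never blocks waiting for new work.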

Conclusion

In this initial post, we’ve introduced the basics of CFRunLoops and shared some background on what they are commonly used for. We then stepped through examples for scheduling timers, and how run loop modes are used. This covers a large aspect of what CFRunLoop is used for on the main thread: providing the application event loop, and servicing timers and async tasks.

In the upcoming post, I will share how we can establish CFRunLoops in multiple threads and use this to dispatch tasks between each one. I’ll share some benchmarks on how latency looks when posting cross-thread tasks, and compare to some alternative implementations.