Tyson Henning and Charles Munger
Application start is where your app makes the first impression on users, and users don’t like to wait until the app is ready. In this article, you’ll learn how prewarming affects app startup time, and how to manage it.
When an app starts, its features share many resources:
- Main thread CPU time
- Aggregate CPU time
- Local storage I/O throughput (“disk”)
- Network throughput
- Various global locks (Java classloader lock, singleton initialization)
The application becomes responsive only after each of these resources finishes all the work assigned to it during startup.
Device resources are finite, and each feature that gets initialized at startup increases startup time. The more features an app initializes at startup, the longer startup will take.
We have seen apps eagerly initialize features or libraries, “prewarming” them. While this might help to launch a specific feature faster, the shared resource cost of warming up must be paid. Paying that price at app startup comes at a cost to user retention.
Cold-starting a well-programmed and optimized app on a flagship device can be fast enough that the user won’t perceive it as slow.
The devices in many users’ hands, particularly in emerging markets, have a fraction of the single-threaded performance of a flagship device. Disk throughput, network throughput, and aggregate CPU time are similarly restricted. The same well-developed app can take much longer to start on a low-spec device. That slowdown can severely affect user happiness and business metrics.
“Fast is better than slow”, but slow startup is the lived experience of many users.
An app does five things at startup:
- Initializes ContentProviders
- Creates the Application
- Creates the starting Activity
- Resumes the starting Activity
- Draws the first frame
The app becomes responsive to user input only after these things are finished.
Any code executed before the first frame that isn’t essential to drawing that frame is prewarming, and any non-essential use of resources before that frame is drawn and the app is responsive increases app startup time.
When a feature or library prewarms, it can consume as much of any shared resource as it likes. In practice, each feature prewarming itself adds somewhere between a few milliseconds and a few seconds to app startup time, depending on the feature’s size.
From one perspective, implementing prewarming costs almost no engineering resources, and the feature starts faster.
Unfortunately, it doesn’t necessarily work from the user’s perspective. Prewarming consumes shared resources in a zero-sum game: each feature that prewarms slows down startup. Unless the user is waiting for exactly the feature that prewarmed, the effort is wasted and slows down the journey the user is actually taking.
Features and libraries usually initialize at startup by doing one or more of these things:
These are some but not all of the ways that an application can run code before it draws its first frame.
The best way to quantify what happens at startup is by tracing and profiling a production build of the application. You can watch this video to learn how to profile an Android app. You can also automate app startup performance measurement using the Macrobenchmark library.
The negative effects of prewarming on performance can be diffuse, incremental, and difficult to analyze.
Here are some of the emergent effects of prewarming.
Prewarming code executed at startup runs before anything else, and on the application’s main thread.
As a result, a bug in prewarming code can prevent uptake from remote configuration systems (such as Firebase Remote Config) or experiment systems (such as Firebase A/B testing), or simply crash the application before monitoring systems are initialized.
Whenever prewarming crashes the application before its first Activity has successfully shown, the result is usually an outage, with users unable to use the application. If the crash prevents uptake of a new remote configuration that would hotfix the problem, resolving a prewarming crash usually requires deploying a new version of the app to the Play Store and waiting for users to update. Both the crash and the required update can impact user retention.
Hidden Dependency Risk
Prewarming is often thought of as a pure optimization with no effect on app behavior. Unfortunately, it usually isn’t.
Over time in a large and continuously changing codebase, Hyrum’s Law applies: all observable behaviors of your system will be depended on by somebody.
Hypothetically, removing existing prewarming should leave the app behaving exactly the same way with slightly worse latency in the previously warmed path.
In reality, by Hyrum’s Law, features become dependent on prewarming, making the prewarming difficult to adjust or remove later.
This can manifest in at least two ways:
- An asynchronous prewarming task happens to always be complete when checked, so as the program evolves and the timing changes, data race bugs manifest.
- A prewarming task runs at startup as a way to execute frequently or to keep its data recent; unrelated changes to the app cause the process to start more often, so the prewarming task runs more often, causing excessive load on servers, drained batteries, or drained data plans.
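The first failure mode above can be sketched in plain Java. This is a minimal illustration, not code from any real app: `PrewarmRaceDemo`, `readAssumingWarm`, and `readAwaiting` are hypothetical names, and a `CompletableFuture` stands in for whatever asynchronous prewarming task a feature might start.

```java
import java.util.concurrent.CompletableFuture;

/** Sketch: code that silently depends on an async prewarming task having finished. */
final class PrewarmRaceDemo {

    /** Fragile: assumes the prewarming task is already done; returns null if it is not. */
    static String readAssumingWarm(CompletableFuture<String> prewarmed) {
        return prewarmed.getNow(null);
    }

    /** Explicit: declares the dependency by waiting for the task's result. */
    static String readAwaiting(CompletableFuture<String> prewarmed) {
        return prewarmed.join();
    }

    public static void main(String[] args) {
        // Simulate a prewarming task that has not completed when it is first checked.
        CompletableFuture<String> prewarmed = new CompletableFuture<>();
        System.out.println("fragile read: " + readAssumingWarm(prewarmed));

        // Once the task completes, both styles of read see the value.
        prewarmed.complete("user-profile");
        System.out.println("awaiting read: " + readAwaiting(prewarmed));
    }
}
```

The fragile read works for as long as the task happens to win the race, which is exactly how the hidden dependency forms. Note that `join()` blocks the calling thread, so in a real app the explicit version would more likely chain a callback than wait on the main thread.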
Hidden dependency risk makes it more difficult both to optimize and to evolve the application. It’s a form of technical debt.
The complexity leads to prewarming spot-fixes to unblock launches, adding more complexity and compounding the next problem that arises.
Android apps run on resource-constrained devices and can’t be scaled like servers.
An Android process doesn’t have a single `main()` method. Instead, one operating system process plays host to many different components — Activities, Services, BroadcastReceivers, WorkManager Workers — and Android will initialize the Application process in order to use any (or several) of those components.
Initialization can be thought of as a tree. The root of the tree is the process launch. The Application is always initialized. Beneath that are various components that the user (or the operating system) might use.
If the application is loading and the user is entering ShareVideoActivity, but never enters the WatchVideoActivity, prewarming FeedFragment will slow down the user’s journey to ShareVideoActivity.
The same applies to BroadcastReceivers and WorkManager Workers. If the process is launched for a background operation, prewarming a user feature wastes resources.
In large applications with multiple features, application startup time is a tragedy-of-the-commons problem.
When many features prewarm at startup in parallel, removing any one prewarming step doesn’t always measurably improve startup time. It can sometimes make startup slower.
Initialization cost of a shared dependency — often a library — gets paid by the first feature that initializes it. All the subsequent features using that dependency don’t have to initialize it, so they get it for “free”.
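A toy model makes this accounting problem concrete. The sketch below is an assumption-laden illustration: `SharedCostDemo`, the feature names, and the cost of 100 abstract units are all hypothetical, and real costs would come from profiling rather than a counter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: the first feature to touch a shared library is charged its whole init cost. */
final class SharedCostDemo {
    static final int INIT_COST = 100; // hypothetical cost units for initializing the library

    private boolean initialized = false;
    private final Map<String, Integer> costByFeature = new LinkedHashMap<>();

    /** Records the cost a feature appears to pay when it uses the shared library. */
    void useSharedLibrary(String feature) {
        int cost = 0;
        if (!initialized) {
            initialized = true;
            cost = INIT_COST; // only the first caller pays
        }
        costByFeature.merge(feature, cost, Integer::sum);
    }

    Map<String, Integer> costs() {
        return costByFeature;
    }

    public static void main(String[] args) {
        // Both features run at startup: "feed" happens to go first and pays.
        SharedCostDemo both = new SharedCostDemo();
        both.useSharedLibrary("feed");
        both.useSharedLibrary("share");
        System.out.println(both.costs()); // {feed=100, share=0}

        // Remove "feed" from startup: the cost does not vanish, it moves to "share".
        SharedCostDemo shareOnly = new SharedCostDemo();
        shareOnly.useSharedLibrary("share");
        System.out.println(shareOnly.costs()); // {share=100}
    }
}
```

In this model, removing one feature's prewarming leaves the total unchanged, which is one reason removing a single prewarming step may not measurably improve startup time.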
This emergent complexity means that apportioning prewarming cost among features using profiling can become intractable. A complex and multithreaded startup sequence can make it infeasible to track and accurately apportion use of shared locks, I/O, thread pool congestion, CPU, or classloading among the application’s features.
Without an accurate measurement of the relative performance cost of features, optimizing startup and other critical user journeys becomes a guessing game. Features can become unaccountable to their own performance cost.
Here are the two strategies we have found most effective for keeping app startup fast.
Optimally, don’t prewarm anything: avoid using any of the ways features and libraries initialize at startup.
Instead, ensure that each application feature’s classes are loaded, constructed, and used only once the user begins using the feature.
If a feature has an entry point UI within some other feature, build that entry point UI as a standalone package that won’t fully class load the feature it links to, or start any asynchronous work until the user accesses the feature through its entry point.
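One way to picture such an entry point, sketched in plain Java rather than with Android UI classes: the entry point holds only a label and a deferred action, so the feature's class is not initialized until the user clicks. `EntryPointDemo`, `FeedFeature`, and `EntryPoint` are hypothetical names, and the static initializer stands in for the feature's real loading and setup cost.

```java
import java.util.function.Supplier;

/** Sketch: an entry point that does not initialize the feature it links to. */
final class EntryPointDemo {
    // Set by FeedFeature's static initializer, so we can observe when the class initializes.
    static boolean feedFeatureInitialized = false;

    /** Hypothetical heavy feature; its static initializer models expensive setup. */
    static final class FeedFeature {
        static {
            feedFeatureInitialized = true;
        }

        static String open() {
            return "feed opened";
        }
    }

    /** Lightweight entry point: a label plus an action that runs only on click. */
    static final class EntryPoint {
        final String label;
        private final Supplier<String> onClick;

        EntryPoint(String label, Supplier<String> onClick) {
            this.label = label;
            this.onClick = onClick;
        }

        String click() {
            return onClick.get();
        }
    }

    public static void main(String[] args) {
        // The lambda body references FeedFeature, but it does not run until click().
        EntryPoint button = new EntryPoint("Feed", () -> FeedFeature.open());
        System.out.println("initialized after showing button: " + feedFeatureInitialized);
        System.out.println(button.click());
        System.out.println("initialized after click: " + feedFeatureInitialized);
    }
}
```

The design choice is that the entry point's own code path never touches the feature's classes; only the deferred action does. The same idea applies when the deferred action starts an Activity or loads a module instead of calling a static method.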
Limit “proactive” features and asynchronous work.
If a library needs initialization, set it up to initialize lazily, on demand. If a library doesn’t support lazy initialization, implement lazy initialization for it by loading the library asynchronously and in a thread-safe way when the user enters a feature that uses it.
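As a minimal sketch of wrapping a library that has no lazy initialization of its own, the holder below runs its initializer at most once, on first use, even when several threads ask for it at the same time. `Lazy` and `VideoCodecLibrary` are hypothetical names used for illustration, and the sketch assumes the initializer never returns null. It initializes on whichever thread calls `get()` first, so to load asynchronously you would make that first call from a background thread.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Sketch: on-demand, thread-safe initialization for a library without built-in lazy init. */
final class LazyInitDemo {

    /** Memoizing holder: runs the initializer at most once, on the first get(). */
    static final class Lazy<T> {
        private final Supplier<T> initializer;
        private volatile T value; // volatile makes the double-checked locking below safe

        Lazy(Supplier<T> initializer) {
            this.initializer = initializer;
        }

        T get() {
            T result = value;
            if (result == null) {
                synchronized (this) {
                    result = value;
                    if (result == null) {
                        result = initializer.get(); // assumed to return non-null
                        value = result;
                    }
                }
            }
            return result;
        }
    }

    /** Hypothetical stand-in for a library that is expensive to initialize. */
    static final class VideoCodecLibrary {
        static final AtomicInteger INIT_COUNT = new AtomicInteger();

        VideoCodecLibrary() {
            INIT_COUNT.incrementAndGet();
        }

        String encode(String clip) {
            return "encoded:" + clip;
        }
    }

    // Declaring the holder costs almost nothing; the library is not constructed here.
    static final Lazy<VideoCodecLibrary> CODEC = new Lazy<>(VideoCodecLibrary::new);

    public static void main(String[] args) throws InterruptedException {
        System.out.println("inits after startup: " + VideoCodecLibrary.INIT_COUNT.get());

        // Several threads enter the feature at once; only one initialization happens.
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> CODEC.get().encode("clip"));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("inits after first use: " + VideoCodecLibrary.INIT_COUNT.get());
    }
}
```

Startup pays only for creating the empty holder, and the initialization cost moves to the moment the user first enters a feature that needs the library.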
In large app projects, prewarming is a policy problem, not a technical problem.
Engineers can always figure out how to start something at application startup time, and may choose to do so on a deadline.
Empowering a product or engineering group to measure and control prewarming is the best policy for applications to avoid piling on prewarming tasks and to keep startup time low as the application expands.