Description
When initializing the OpenTelemetry Android SDK with disk buffering enabled, we
discovered that synchronous disk space checks were causing ANRs in production.
These checks occur during the creation of disk buffering exporters, specifically
in `DiskManager.getMaxFolderSize()`, which makes blocking IPC calls through
`StorageManager.getAllocatableBytes()` on the main thread. The issue manifests
in the following ANR stacktrace:
```
android.os.BinderProxy.transact (BinderProxy.java:662)
android.os.storage.IStorageManager$Stub$Proxy.getAllocatableBytes (IStorageManager.java:2837)
android.os.storage.StorageManager.getAllocatableBytes (StorageManager.java:2414)
android.os.storage.StorageManager.getAllocatableBytes (StorageManager.java:2404)
io.opentelemetry.android.internal.services.CacheStorage.getAvailableSpace (CacheStorage.java:66)
io.opentelemetry.android.internal.services.CacheStorage.ensureCacheSpaceAvailable (CacheStorage.java:50)
io.opentelemetry.android.internal.features.persistence.DiskManager.getMaxFolderSize (DiskManager.kt:58)
io.opentelemetry.android.OpenTelemetryRumBuilder.createStorageConfiguration (OpenTelemetryRumBuilder.java:338)
io.opentelemetry.android.OpenTelemetryRumBuilder.build (OpenTelemetryRumBuilder.java:286)
```
Our Solution
To fix this, we moved exporter initialization onto a background executor and
buffer telemetry in memory until that initialization completes.
The process works like this:
1. Initialize the SDK with `BufferDelegatingExporter` instances that can
immediately accept telemetry data.
2. Move exporter initialization off the main thread.
3. Once async initialization completes, flush buffered signals to initialized
exporters and delegate all future signals.
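
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual `BufferDelegatingExporter` from this change: the class name `BufferingDelegate`, its methods, and the use of a plain `Consumer` in place of a real exporter interface are all assumptions made to keep the example self-contained. It also keeps an unbounded buffer, which a real implementation would probably want to cap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: buffers items in memory until a real delegate is attached. */
final class BufferingDelegate<T> {
    private final Object lock = new Object();
    private List<T> buffer = new ArrayList<>();
    private Consumer<List<T>> delegate; // null until async initialization completes

    /** Safe to call from the main thread: it only touches memory, never disk or IPC. */
    void export(List<T> items) {
        Consumer<List<T>> target;
        synchronized (lock) {
            target = delegate;
            if (target == null) {
                buffer.addAll(items); // step 1: accept telemetry immediately
                return;
            }
        }
        target.accept(items); // step 3: delegate all future signals
    }

    /** Called once from the background thread after the real exporter is built (step 2). */
    void setDelegate(Consumer<List<T>> initialized) {
        synchronized (lock) {
            // Flush while holding the lock so buffered items are delivered
            // before any item exported after the delegate is set.
            if (!buffer.isEmpty()) {
                initialized.accept(buffer);
            }
            buffer = new ArrayList<>();
            delegate = initialized;
        }
    }
}
```

In this sketch, the slow work (such as the disk space check) would run on a background executor, and only its final step would call `setDelegate(...)`, so the main thread never waits on it.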
The primary goal of this solution is to be unobtrusive and prevent ANRs caused
by initialization of disk exporters, while preventing signals from being
dropped.
Testing
We have added unit tests covering the buffering, delegation, and RUM
building. We've also verified the change with disk buffering both enabled
and disabled.