Project Proposal: Browser Phase 1 #2751
@@ -0,0 +1,166 @@ | ||||||
# Browser Instrumentation (Phase 1) Proposal | ||||||
|
||||||
## Background and Description | ||||||
|
||||||
OpenTelemetry is rebooting telemetry for the Browser as a sequence of small, highly focused projects. The first project will focus on creating instrumentation for the browser runtime and a small set of browser libraries, plus the prerequisite API and data modeling work needed to create this instrumentation. | ||||||
|
||||||
## Current Challenges | ||||||
|
||||||
OpenTelemetry currently has a NodeJS-focused JavaScript implementation that is also capable of running in the browser. However, the requirements for a successful browser observability solution differ enough from those of NodeJS that a specialized solution is required.
|
||||||
* Loading and unloading in the browser are specialized tasks that are very different from booting and shutting down a NodeJS application. Unloading in particular is under strict time and resource constraints. | ||||||
* Memory, CPU, and networking resources are severely constrained, especially on mobile devices.
* A lack of compression and gRPC in browsers | ||||||
Review comment: Just a note: the Compression Streams API is already available in most browsers (https://developer.mozilla.org/en-US/docs/Web/API/Compression_Streams_API) and is also leveraged by existing RUM tools.
Reply: Thanks, good to know!
||||||
* Package and dependency sizes | ||||||
* All activity occurring in the program relates to a single user instead of many independent transactions. So far this has led to a number of spec developments, such as mutable resources for session management and a better logging model for recording events, and it may lead to more browser-specific design requirements.
* Sessions in particular are a very important lifecycle for clients, and they do not exist as a concept in server-side programs.
* Sessions persist across multiple page loads, which is another browser-specific problem that is difficult to design a solution for.
* NodeJS has built-in facilities for tracking context across async boundaries, but browsers do not have an equivalent concept.
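
The session points above can be made concrete with a small sketch. It is illustrative only: `KeyValueStore`, `SessionState`, `getOrCreateSession`, the storage key, and the 30-minute inactivity timeout are all assumptions made for this example, not part of any OpenTelemetry API or of this proposal. In a browser, the store could be backed by `sessionStorage` or `localStorage`, which expose the same `getItem`/`setItem` methods.

```typescript
// Minimal subset of the Web Storage interface (getItem/setItem), so the
// sketch runs without a browser. Hypothetical name, not an OpenTelemetry type.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface SessionState {
  id: string;
  lastActivityMs: number;
}

const SESSION_KEY = "example.session"; // assumed storage key
const INACTIVITY_TIMEOUT_MS = 30 * 60 * 1000; // assumed 30-minute timeout

// Reads the persisted session, treating missing or corrupted entries as absent.
function readSession(store: KeyValueStore): SessionState | null {
  const raw = store.getItem(SESSION_KEY);
  if (raw === null) return null;
  try {
    const parsed = JSON.parse(raw) as SessionState;
    if (typeof parsed.id === "string" && typeof parsed.lastActivityMs === "number") {
      return parsed;
    }
    return null;
  } catch (_err) {
    return null;
  }
}

// Called on every page load: reuses the stored session if it is still
// active, otherwise starts a new one. The result is written back to the store.
function getOrCreateSession(
  store: KeyValueStore,
  nowMs: number,
  newId: () => string
): SessionState {
  const previous = readSession(store);
  const stillActive =
    previous !== null && nowMs - previous.lastActivityMs <= INACTIVITY_TIMEOUT_MS;
  const next: SessionState =
    previous !== null && stillActive
      ? { id: previous.id, lastActivityMs: nowMs }
      : { id: newId(), lastActivityMs: nowMs };
  store.setItem(SESSION_KEY, JSON.stringify(next));
  return next;
}

// In-memory stand-in for browser storage, used only to exercise the sketch.
function createMemoryStore(): KeyValueStore {
  const data: Record<string, string> = {};
  return {
    getItem(key: string): string | null {
      const value = data[key];
      return value === undefined ? null : value;
    },
    setItem(key: string, value: string): void {
      data[key] = value;
    },
  };
}
```

Because the session outlives any single page load, the state has to live outside the page's JavaScript heap, which is why the sketch takes the store as a parameter rather than keeping the session in a variable.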
|
||||||
## Goals, Objectives, and Requirements | ||||||
|
||||||
OpenTelemetry needs a JavaScript implementation optimized for the browser. Sharing code with NodeJS is helpful when possible, but going forward we want to prioritize running well in the browser over code reuse with the NodeJS project. If there are places where code reuse with NodeJS is a hard requirement, those places should be identified explicitly.
|
||||||
We also have a goal of taking an incremental approach, focused on obtaining the highest initial value for the least amount of work. Among the companies participating in the OpenTelemetry community, there are already multiple solutions for browser observability, and the current OTel JS SDK does run in the browser.
|
||||||
Rather than start from scratch with a new SDK, we would like to focus our browser work on instrumentation. OTel instrumentation has only one dependency, the OTel instrumentation API. There is no need to do SDK work in order to stabilize the instrumentation packages we want to provide for our community. | ||||||
|
||||||
We propose that the various existing browser clients bind to the new OTel Browser API, and we delay work on providing an optimized SDK solution until after we have stabilized a key set of instrumentation packages. This will allow the community to begin receiving value from our OTel browser work much faster than if we were to start by optimizing the SDK. | ||||||
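
As a rough illustration of why instrumentation can be stabilized independently of any SDK, the sketch below writes an instrumentation helper against a small interface and then binds a trivial recording implementation to it. The interfaces `SpanLike` and `TracerLike`, the helper `instrumentHandler`, and the attribute name `example.error` are hypothetical, locally defined stand-ins; they are not the actual OpenTelemetry API surface.

```typescript
// Hypothetical, reduced API surface that the instrumentation depends on.
interface SpanLike {
  setAttribute(key: string, value: string | number): void;
  end(): void;
}

interface TracerLike {
  startSpan(name: string): SpanLike;
}

// Instrumentation: depends only on the interfaces above, never on a concrete SDK.
// Wraps a handler so that every invocation is recorded as one span.
function instrumentHandler<T>(
  tracer: TracerLike,
  spanName: string,
  handler: (arg: T) => void
): (arg: T) => void {
  return (arg: T): void => {
    const span = tracer.startSpan(spanName);
    try {
      handler(arg);
    } catch (err) {
      span.setAttribute("example.error", String(err)); // placeholder attribute name
      throw err;
    } finally {
      span.end();
    }
  };
}

// One possible binding. An existing browser client could supply its own
// TracerLike in the same way without the instrumentation changing.
interface RecordedSpan {
  name: string;
  attributes: Record<string, string | number>;
  ended: boolean;
}

function createRecordingTracer(): { tracer: TracerLike; spans: RecordedSpan[] } {
  const spans: RecordedSpan[] = [];
  const tracer: TracerLike = {
    startSpan(name: string): SpanLike {
      const record: RecordedSpan = { name: name, attributes: {}, ended: false };
      spans.push(record);
      return {
        setAttribute(key: string, value: string | number): void {
          record.attributes[key] = value;
        },
        end(): void {
          record.ended = true;
        },
      };
    },
  };
  return { tracer: tracer, spans: spans };
}
```

The instrumentation function never imports the recording tracer, so swapping in a different client only requires providing another `TracerLike`.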
|
||||||
|
||||||
## Deliverables | ||||||
|
||||||
### Step 1: Browser Fundamentals | ||||||
|
||||||
Review the current Tracing, Metrics, and Logs API packages. Identify what, if anything, would need to be changed to make these packages optimal for the browser. Discuss the pros and cons of implementing these changes by sharing the existing packages with NodeJS, or forking them to have a separate set of API packages just for the Browser. Please note that the goal of this review is not to deviate from the OpenTelemetry API specification, but to evaluate the practical limitations that the current API packages may place on browser instrumentation. | ||||||
|
||||||
OpenTelemetry API review: | ||||||
* Package size | ||||||
* Dependencies | ||||||
* Other browser requirements that differ from NodeJS | ||||||
|
||||||
Data Modelling: | ||||||
|
||||||
* Sessions | ||||||
* Resources / Entities | ||||||
* Navigation | ||||||
* Anonymous user ids (Manager) | ||||||
|
||||||
The SIG will also evaluate how best to maintain browser-specific packages, in terms of teams, repos, and other GitHub code management policies. | ||||||
|
||||||
### Support | ||||||
|
||||||
We will also decide our compatibility story. | ||||||
* Which versions of which browsers do we plan to officially support? | ||||||
Review comment: For context: the JS SIG discussed this, resulting in open-telemetry/opentelemetry-js#5393.
Reply: Thanks David. If a feature-based approach is better we can explore that. I think it would still be good to publish the resulting version matrix.
||||||
* Will we aim to support Electron at this time?
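
If the feature-based approach mentioned in the discussion above is explored, support could be expressed as a list of required platform features rather than browser versions. The following is a minimal sketch under that assumption. The helper names and the contents of `ASSUMED_REQUIRED_FEATURES` are hypothetical; which features are actually required is one of the open decisions of this step.

```typescript
// Resolves a dotted path such as "navigator.sendBeacon" against a
// global-like object and reports whether something is defined there.
function hasFeature(globalLike: unknown, path: string): boolean {
  let current: unknown = globalLike;
  for (const part of path.split(".")) {
    const isContainer =
      (typeof current === "object" && current !== null) || typeof current === "function";
    if (!isContainer) return false;
    current = (current as Record<string, unknown>)[part];
  }
  return current !== undefined && current !== null;
}

// Assumed example list; the real list would come out of the support proposal.
const ASSUMED_REQUIRED_FEATURES: string[] = [
  "fetch",
  "PerformanceObserver",
  "navigator.sendBeacon",
];

// Returns the required features that are missing, in the order given.
function detectMissingFeatures(globalLike: unknown, required: string[]): string[] {
  return required.filter((path) => !hasFeature(globalLike, path));
}
```

In a page, the global object (for example `window`) would be passed as `globalLike`. Running the same check across a set of browsers is one way the resulting version matrix could be derived rather than maintained by hand.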
|
||||||
### Step 2: Core Instrumentation | ||||||
|
||||||
Once we are clear on what API packages we plan on using, we plan to implement an initial set of instrumentation packages for high priority browser libraries. The goal of this step is to provide enough of an initial ecosystem that end users and vendors can validate the decisions made in Step 1, along with the Semantic Conventions, so that we can stabilize these decisions based on real world feedback. | ||||||
|
||||||
During this phase, the SIG will not be focused on implementing a browser-optimized SDK. Instead, we will continue to use the existing OTel JS SDK as our reference implementation. Third party SDKs may also choose to bind to the OTel Browser API at this time. | ||||||
|
||||||
In some cases, browser instrumentation already exists but may be different from the new instrumentation we want to provide. This instrumentation should be treated as "de facto stable" and should not experience breaking changes until we are ready to issue a stable v1.0. New unstable versions of this instrumentation should be managed in a way that allows the SIG to move quickly without destabilizing any instrumentation currently in production.
|
||||||
Instrumentation for the browser runtime: | ||||||
Review comment: Some of these instrumentations already exist. Is the plan to update them in their current location or to make a new copy in a separate one?
Reply: That is a good question. I think it will need to be on a case-by-case basis, depending on whether the new instrumentation will be unstable or need to be packaged differently.
||||||
* Page load/unload | ||||||
* User events (clicks, etc) | ||||||
* Resource timing | ||||||
* Errors | ||||||
* Web vitals | ||||||
* Long tasks | ||||||
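
To give a feel for what one of the runtime items above involves, here is a minimal sketch for resource timing. It maps a subset of the fields that browsers expose on `PerformanceResourceTiming` entries to a flat attribute bag. The function name and every `example.*` attribute name are placeholders invented for this sketch; the real names would be defined by the Semantic Conventions work.

```typescript
// Subset of the fields exposed on the browser's PerformanceResourceTiming entries.
interface ResourceTimingLike {
  name: string; // resource URL
  initiatorType: string; // e.g. "script", "img", "fetch"
  startTime: number; // milliseconds
  responseStart: number; // milliseconds
  responseEnd: number; // milliseconds
  transferSize: number; // bytes transferred over the network
}

// Converts one timing entry into attributes. All attribute names are placeholders.
function resourceTimingToAttributes(
  entry: ResourceTimingLike
): Record<string, string | number> {
  return {
    "example.resource.url": entry.name,
    "example.resource.initiator": entry.initiatorType,
    "example.resource.duration_ms": entry.responseEnd - entry.startTime,
    "example.resource.time_to_first_byte_ms": entry.responseStart - entry.startTime,
    "example.resource.transfer_size": entry.transferSize,
  };
}

// In a page, entries would typically arrive through a PerformanceObserver, roughly:
//   new PerformanceObserver((list) => {
//     for (const e of list.getEntries()) { /* convert and record e */ }
//   }).observe({ type: "resource", buffered: true });
```

Keeping the conversion a pure function makes it testable outside a browser, while the observer wiring stays a thin, browser-only layer.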
|
||||||
Example list of library instrumentation: | ||||||
* Fetch / XHR Requests | ||||||
* Websockets | ||||||
* React | ||||||
* Next | ||||||
* Svelte | ||||||
* Angular | ||||||
* Vue.js | ||||||
|
||||||
### Next Steps | ||||||
|
||||||
Once we are satisfied that we have achieved the above goals, this SIG will review the remaining work and available resources and choose a new project for the SIG to work on. | ||||||
|
||||||
Options include: | ||||||
* Continue implementing instrumentation packages. | ||||||
* Implement a browser-specific SDK. | ||||||
* Implement a more efficient client protocol. | ||||||
Review comment: Not sure if we need it to be explicit, but just for clarity: what does this mean in practice? There are existing techniques used in browsers that work by shortening the field names, but now with the Compression Streams spec the need for that is reduced. These techniques were done without shipping a gRPC-specific package to the client, as that would influence the package size shipped to users.
Reply: We're talking about OTLP here. For example, OTLP currently requires different exporters and connections per signal (i.e. one for traces, one for logs, etc.). These connections would be coalesced if they go to the same origin, but regardless, certain aspects like Resource attributes need to be duplicated in the batches exported for each of those signals, whereas if we had multi-signal OTLP we could do it in one batch. This is just one example, but I expect there will be more things that can be further optimised.
Reply: Thanks for the extra context, makes sense 👍🏽
Reply: Yes, there's a general feeling that the current OTLP protocol and exporter model is not as efficient as it could be for resource-constrained environments like the browser. At the same time, there are browser-specific facilities, like the Beacon API, that we would want to leverage. And there are techniques such as short codes for reducing payload sizes. But we shouldn't just copy-paste every single technique that has been used in the past, because the landscape is always changing and new browser facilities are becoming available that make some of the older techniques unnecessary. So we should have a review process where we decide what tradeoffs we want to make. The only decision we have made so far is that it would be acceptable for OpenTelemetry to add an additional protocol for sending client telemetry over the public internet, as OTLP was never designed for this use case.
||||||
* Implement a public gateway for client protocols. | ||||||
* TBD | ||||||
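
As one illustration of the browser-specific facilities raised in the discussion above, the sketch below buffers telemetry and flushes it in a single payload through a sender shaped like `navigator.sendBeacon(url, data)`, which returns `false` when the browser cannot queue the data. The names `BeaconSender` and `createTelemetryBuffer`, the JSON payload format, and the endpoint in the comment are assumptions for this example only; they do not describe OTLP or any protocol decided by this project.

```typescript
// Same shape as navigator.sendBeacon(url, data): returns false if the
// payload could not be queued for delivery.
type BeaconSender = (url: string, body: string) => boolean;

interface TelemetryBuffer {
  add(item: unknown): void;
  flush(): boolean;
  size(): number;
}

// Collects items and sends them as one JSON array. Items are kept if the
// sender rejects the payload, so a later flush can retry.
function createTelemetryBuffer(url: string, send: BeaconSender): TelemetryBuffer {
  let pending: unknown[] = [];
  return {
    add(item: unknown): void {
      pending.push(item);
    },
    flush(): boolean {
      if (pending.length === 0) return true; // nothing to send
      const accepted = send(url, JSON.stringify(pending));
      if (accepted) pending = [];
      return accepted;
    },
    size(): number {
      return pending.length;
    },
  };
}

// In a page, this could be wired up roughly as follows (hypothetical endpoint):
//   const buffer = createTelemetryBuffer("/hypothetical/telemetry", (u, b) =>
//     navigator.sendBeacon(u, b));
//   document.addEventListener("visibilitychange", () => {
//     if (document.visibilityState === "hidden") buffer.flush();
//   });
```

Flushing when the page becomes hidden, rather than waiting for an unload event, reflects the strict time constraints on unloading noted under Current Challenges.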
|
||||||
## Staffing / Help Wanted | ||||||
|
||||||
### Industry Outreach | ||||||
|
||||||
There are a number of existing RUM/Client observability implementations. Representatives from a wide selection of companies that have experience with browser observability are present in the SIG. This includes Microsoft, New Relic, Dynatrace, Honeycomb, Grafana Labs, and Cisco, among others. We plan for further outreach among end users once we have working code for them to review.
Review comment: Some clear leaders in the area are missing. My concern is that we'll, once again, design something without hearing opinions from people we know to be relevant in the market. I realize that some of those voices, like Sentry, have not been exactly friendly towards OTel in the past, which is yet another reason to reach out to them right now: they clearly have opinions on how to improve what we have right now in the browser use case. I know nothing about this world of RUM, but a quick search also shows me that some of the apparent leaders in the area aren't in the list above. Companies like Datadog, Sematext, PostHog, Akamai, ... I'm confident you know who's relevant in the market, given how close you are to the SIG and this area, so I'd be interested in hearing what they (and others) said when asked to join this effort.
||||||
|
||||||
### Required Staffing | ||||||
|
||||||
**Project Lead:** Ted Young | ||||||
|
||||||
**GC Liaison:** Daniel Gomez Blanco | ||||||
|
||||||
**Sponsors:** | ||||||
* Ted Young - Spec Maintainer Sponsor | ||||||
* Daniel Dyla - JS Maintainer Sponsor | ||||||
Review comment: I don't think implementation SIG maintainers are necessarily allowed to sponsor other SIGs. I'm offering in my capacity as spec sponsor. I just think being the JS maintainer makes me particularly relevant for this role.
|
||||||
* Carlos Alberto Cortez - TC Escalating Sponsor
|
||||||
**Implementation Engineers:** | ||||||
Review comment: I understand that Martin is from ServiceNow and Purvi from Honeycomb. Didn't the other companies in your list commit to help in the implementation? I see Marco (Grafana) and David (Elastic) in the list of NodeJS maintainers and approvers here (apparently, it doesn't match with the list I see on opentelemetry-js?): are they also committing to work on the implementation?
||||||
* Martin Kuba | ||||||
* Purvi Kanal | ||||||
* Nev Wylie (Microsoft) | ||||||
* Karlie L (Microsoft) | ||||||
|
||||||
|
||||||
**NodeJS Maintainers and Approvers:** | ||||||
|
||||||
|
||||||
When bootstrapping our browser work, we want to make sure that the existing OpenTelemetry Javascript community is involved. Existing OTel JS maintainers and approvers should participate in this project to help ensure that it is successful. | ||||||
Review comment: What's the path to get more folks involved? Is it similar to the existing process?
Reply: Anyone can join!
Reply: Is there specific work you are interested in taking on?
Reply: Thanks. No strong preference. I have been working in the RUM field for a very long time and am a maintainer of the Elastic RUM agent. Would love to collaborate and contribute.
||||||
* Martin Kuba | ||||||
* Purvi Kanal | ||||||
* Hector Hernandez (Microsoft) | ||||||
* David Luna | ||||||
* Marco Schäfer | ||||||
|
||||||
### Additional SIG Members | ||||||
|
||||||
**Design review:** | ||||||
* Ram Thiru (Microsoft) | ||||||
* Santosh Cheler (Cisco / Splunk) | ||||||
|
||||||
## Timeline | ||||||
|
||||||
### Step 1 Deliverables | ||||||
|
||||||
Expected due dates for step 1 proposals: | ||||||
* OpenTelemetry API review and changes proposal: TBD | ||||||
* Browser Compatibility and Support proposal: TBD | ||||||
* Data modeling and additional features proposal: TBD | ||||||
|
||||||
Expected due dates for code changes based on the above proposals: | ||||||
* API changes: TBD | ||||||
* Data Modelling changes: TBD | ||||||
|
||||||
### Step 2 Deliverables | ||||||
|
||||||
Expected due dates for step 2 proposals: | ||||||
* Browser runtime instrumentation proposal: TBD | ||||||
* Initial library instrumentation proposal: TBD | ||||||
|
||||||
Expected due dates for code changes based on the above proposals: | ||||||
* Browser runtime instrumentation: TBD | ||||||
* Library instrumentation: TBD | ||||||
|
||||||
## Labels | ||||||
|
||||||
`spec-browser` for all PRs and Issues related to this project. | ||||||
|
||||||
## Project Board | ||||||
|
||||||
TBD | ||||||
|
||||||
## SIG Meetings and Other Info | ||||||
|
||||||
**SIG Meeting time:** | ||||||
Proposing a weekly 30 min meeting, Thursdays 8:30am Pacific (please confirm if this works for you) | ||||||
|
||||||
Once the Browser Instrumentation SIG begins, the current Client SIG will be retired. |