[clickhouse] Convert OTel traces model to native format #6935

zhengkezhou1
Contributor

@zhengkezhou1 zhengkezhou1 commented Mar 26, 2025

Which problem is this PR solving?

  • Part of jaegertracing#5058

Description of the changes

  • Based on the ch-go wire protocol, convert the OTel traces model to the ClickHouse native format for batch insertion.

How was this change tested?

  • unit tests

Checklist

@zhengkezhou1 zhengkezhou1 requested a review from a team as a code owner March 26, 2025 19:54
@dosubot dosubot bot added area/storage go Pull requests that update go code labels Mar 26, 2025

codecov bot commented Mar 26, 2025

Codecov Report

Attention: Patch coverage is 99.44134% with 3 lines in your changes missing coverage. Please review.

Project coverage is 96.11%. Comparing base (fe2ad2e) to head (1ecacd5).
Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
...age/v2/clickhouse/tracestore/dbmodel/to_dbmodel.go 99.17% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6935      +/-   ##
==========================================
+ Coverage   96.03%   96.11%   +0.07%     
==========================================
  Files         355      357       +2     
  Lines       20993    21544     +551     
==========================================
+ Hits        20161    20707     +546     
- Misses        626      631       +5     
  Partials      206      206              
Flag Coverage Δ
badger_v1 9.95% <ø> (-0.02%) ⬇️
badger_v2 2.06% <ø> (-0.01%) ⬇️
cassandra-4.x-v1-manual 14.97% <ø> (-0.02%) ⬇️
cassandra-4.x-v2-auto 2.05% <ø> (-0.01%) ⬇️
cassandra-4.x-v2-manual 2.05% <ø> (-0.01%) ⬇️
cassandra-5.x-v1-manual 14.97% <ø> (-0.02%) ⬇️
cassandra-5.x-v2-auto 2.05% <ø> (-0.01%) ⬇️
cassandra-5.x-v2-manual 2.05% <ø> (-0.01%) ⬇️
elasticsearch-6.x-v1 19.99% <ø> (+0.05%) ⬆️
elasticsearch-7.x-v1 20.07% <ø> (+0.05%) ⬆️
elasticsearch-8.x-v1 20.25% <ø> (+0.06%) ⬆️
elasticsearch-8.x-v2 2.06% <ø> (-0.01%) ⬇️
grpc_v1 11.50% <ø> (-0.02%) ⬇️
grpc_v2 2.06% <ø> (-0.01%) ⬇️
kafka-3.x-v1 10.23% <ø> (-0.02%) ⬇️
kafka-3.x-v2 2.06% <ø> (-0.01%) ⬇️
memory_v2 2.06% <ø> (-0.01%) ⬇️
opensearch-1.x-v1 20.12% <ø> (+0.05%) ⬆️
opensearch-2.x-v1 20.12% <ø> (+0.05%) ⬆️
opensearch-2.x-v2 2.06% <ø> (-0.01%) ⬇️
query 2.06% <ø> (-0.01%) ⬇️
tailsampling-processor 0.56% <ø> (-0.01%) ⬇️
unittests 94.91% <99.44%> (+0.10%) ⬆️


scopeVersion proto.ColStr
spanAttributesKeys = new(proto.ColStr).LowCardinality().Array()
spanAttributesValues = new(proto.ColStr).LowCardinality().Array()
duration proto.ColUInt64
Member

no equivalent to timestamp for duration?

timestamp                = new(proto.ColDateTime64).WithPrecision(proto.PrecisionNano)

Contributor Author

proto.ColDateTime64 is used as an accurate instant timestamp. Why change the duration?

Member

because duration and timestamp as strong types usually go together, if CH has that ability to represent duration natively I would use it. Otherwise as int64 you don't know what units those are, etc.
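
To make the units concern concrete, here is a minimal Go sketch that uses only the standard library. The helper names (`durationToNanos`, `nanosToDuration`, `misreadAsMillis`) are hypothetical and not part of this PR, and the choice of nanoseconds as the stored unit is an assumption made for illustration.

```go
package main

import "time"

// durationToNanos flattens a strongly typed duration into a bare integer.
// The unit (nanoseconds) is now only a convention, not part of the type.
func durationToNanos(d time.Duration) int64 {
	return d.Nanoseconds()
}

// nanosToDuration restores the strong type. It is only correct if the
// stored integer really was written in nanoseconds.
func nanosToDuration(n int64) time.Duration {
	return time.Duration(n) * time.Nanosecond
}

// misreadAsMillis shows what happens when a reader assumes a different
// unit for the same integer: the value is off by a factor of one million.
func misreadAsMillis(n int64) time.Duration {
	return time.Duration(n) * time.Millisecond
}
```

With these helpers, the same stored value 1500 decodes to 1.5 microseconds or 1.5 seconds depending on which unit the reader assumes, which is the ambiguity a native duration type would avoid.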

@yurishkuro yurishkuro added the changelog:exprimental Change to an experimental part of the code label Mar 29, 2025
Comment on lines 18 to 20
ValueTypeEmpty = pcommon.ValueTypeEmpty
ValueTypeMap = pcommon.ValueTypeMap
ValueTypeSlice = pcommon.ValueTypeSlice
Contributor Author

@zhengkezhou1 zhengkezhou1 Apr 1, 2025

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/common/README.md#attribute
It is specified that the Value cannot be nil and must only be a basic data type or its homogeneous array.

For Empty and Slice, the value types are uncertain; they could be { } or other basic types. To ensure that the type is not lost, JSON storage seems to be the only viable solution. Map is neither a basic data type nor a homogeneous array (it is pcommon.Map). What should we do?

Contributor Author

@zhengkezhou1 zhengkezhou1 Apr 1, 2025

	ss.Scope().Attributes().PutEmpty("empty").FromRaw("1")
	ss.Scope().Attributes().PutEmptyMap("a").FromRaw(make(map[string]any))
	ss.Scope().Attributes().PutEmptyBytes("bytes").FromRaw([]byte("as you can see."))
	ss.Scope().Attributes().PutEmptySlice("slice").FromRaw([]any{1, 2, 3})

Member

I think we always converted map and slice to JSON representation. I am not sure we need to support ValueTypeEmpty, it's ok to drop such attributes.
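
The rule described above (serialize maps and slices as JSON, drop empty values) could be sketched roughly as follows. This is only an illustration: plain Go values stand in for `pcommon.Value`, the helper `encodeComplex` is hypothetical, and the output is plain JSON from `encoding/json`, not OTLP/JSON.

```go
package main

import "encoding/json"

// encodeComplex is a hypothetical helper. Plain Go values stand in for
// pcommon.Value: nil plays the role of ValueTypeEmpty, map[string]any of
// ValueTypeMap, and []any of ValueTypeSlice.
// It returns the JSON text and true for maps and slices. It returns false
// for nil (the attribute is dropped) and for any non-complex value, which
// a caller would store in a typed column instead.
func encodeComplex(v any) (string, bool) {
	switch v.(type) {
	case map[string]any, []any:
		b, err := json.Marshal(v)
		if err != nil {
			return "", false
		}
		return string(b), true
	default:
		return "", false
	}
}
```

For example, `encodeComplex([]any{1, 2, 3})` yields the string `[1,2,3]`, while `encodeComplex(nil)` reports false so the caller can skip the attribute.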

Contributor Author

@zhengkezhou1 zhengkezhou1 Apr 3, 2025

ClickHouse also cannot directly store []byte. I see that ch-go provides support for []byte (but it actually converts it to a string for storage):

https://github.com/ClickHouse/ch-go/blob/bd582c5836a06845fddd71d5c7024eeb5c7632c5/proto/col_str.go#L185-L188

However, the []byte here has already been Base64 encoded into a string. When using ColBytes, we first explicitly convert the string to []byte, but it is ultimately stored as ColStr:

https://github.com/open-telemetry/opentelemetry-collector/blob/564818fd7f6291e18c1e1881d1596b72782199bd/pdata/pcommon/value.go#L386-L395

So, should we actually use ColUInt8 for storage? Why not store the Base64 string directly? After all, strings are more readable.
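
As a rough illustration of the Base64 option, the sketch below stores bytes as a Base64 string and decodes them on read. The helper names are hypothetical, and the use of the standard Base64 alphabet is an assumption; it uses only `encoding/base64`, not ch-go or pdata.

```go
package main

import "encoding/base64"

// bytesToColumnString is a hypothetical helper that turns a bytes attribute
// into the text that would be written to a String column.
func bytesToColumnString(b []byte) string {
	return base64.StdEncoding.EncodeToString(b)
}

// columnStringToBytes reverses bytesToColumnString. It returns an error if
// the stored text is not valid Base64, so a reader can surface a warning
// instead of silently returning corrupted bytes.
func columnStringToBytes(s string) ([]byte, error) {
	return base64.StdEncoding.DecodeString(s)
}
```

The stored value stays readable in query results, and the round trip back to `[]byte` is lossless as long as the reader knows the column holds Base64 text.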

@zhengkezhou1 zhengkezhou1 force-pushed the define-clickhouse-db-model-and-deserialization branch from 938515c to 7b58c84 Compare April 1, 2025 17:12
Contributor Author

@zhengkezhou1 zhengkezhou1 left a comment

I have implemented support for writing complex structures in Attributes (it works correctly in local tests). I will add unit tests later.


func TestFromPtrace(t *testing.T) {
trace := jsonToTrace(t, "ptrace.json")
input := FromPtrace(trace)
Member

nit: we usually try to do a roundtrip from-to-from, to ensure that the translation to dbmodel and back is lossless (as that's the primary requirement for storage format).
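
A round-trip check of this kind might look like the sketch below. The `span` and `dbSpan` types and the `toDB`/`fromDB` functions are hypothetical, heavily reduced stand-ins for `ptrace.Span` and the PR's dbmodel; the point is only the shape of the from-to-from comparison.

```go
package main

// span is a hypothetical, much-reduced stand-in for ptrace.Span.
type span struct {
	Name       string
	StartNanos int64
	EndNanos   int64
}

// dbSpan is a hypothetical stand-in for the storage row, which keeps a
// start timestamp and a duration instead of an end timestamp.
type dbSpan struct {
	Name      string
	Timestamp int64
	Duration  int64
}

// toDB converts the in-memory model to the storage model.
func toDB(s span) dbSpan {
	return dbSpan{Name: s.Name, Timestamp: s.StartNanos, Duration: s.EndNanos - s.StartNanos}
}

// fromDB converts the storage model back to the in-memory model.
func fromDB(d dbSpan) span {
	return span{Name: d.Name, StartNanos: d.Timestamp, EndNanos: d.Timestamp + d.Duration}
}

// roundTripLossless reports whether converting to the storage model and
// back reproduces the original value exactly.
func roundTripLossless(s span) bool {
	return fromDB(toDB(s)) == s
}
```

In a real test the comparison would be done on full `ptrace.Traces` values rather than on a comparable struct, but the assertion is the same: the value read back must equal the value written.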

@zhengkezhou1 zhengkezhou1 force-pushed the define-clickhouse-db-model-and-deserialization branch 3 times, most recently from dc79552 to 4c2585e Compare April 7, 2025 12:58
Member

@yurishkuro yurishkuro left a comment

I have a high level question - what is different about the implementation you're trying to write compared to otel-contrib's implementation? I assume some of the value encoding is different, can you summarize it somewhere, perhaps in a README file in the package.

// For basic data types (such as bool, uint64, and float64), type safety is guaranteed when converting back from a string.
// For non-basic types (such as Map, Slice, and []byte), they are serialized and stored as JSON-formatted strings,
// ensuring that the original type is preserved when reading
result[typ][k] = v.AsString()
Member

not sure I follow. If you convert everything to strings, it means we cannot do analytics queries on the numeric attributes, which are the main reason we want to use CH. Also, isn't AsString() a debugging feature of ptrace model? In other words it does not guarantee how the value is being returned. Nothing requires it to return a valid JSON, for instance.

Contributor Author

Currently, the processing only involves complex types like map, slice, and bytes. For basic types, I will convert them back before writing. For example, for an int64 that was converted to a string, I would use strconv.ParseInt(v, 10, 64).

If you don't think AsString() is a good choice, another implementation would be to use map[pcommon.ValueType]map[string]pcommon.Value instead of map[pcommon.ValueType]map[string]string for grouping and collecting. Then, complex types would be converted to JSON, while basic types would remain unchanged.
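
The alternative described above, grouping attributes by type while keeping basic values typed, could be sketched as follows. Everything here is an assumption for illustration: `attributesGroup` only mirrors the key/value pairing seen in the PR's column names, `groupAttributes` is hypothetical, and a `map[string]any` stands in for `pcommon.Map`.

```go
package main

import "sort"

// attributesGroup is a hypothetical, reduced stand-in for the PR's
// AttributesGroup: one pair of parallel slices per basic value type.
type attributesGroup struct {
	BoolKeys     []string
	BoolValues   []bool
	IntKeys      []string
	IntValues    []int64
	DoubleKeys   []string
	DoubleValues []float64
	StrKeys      []string
	StrValues    []string
}

// groupAttributes splits attributes by their Go type without converting
// numeric values to strings, so they stay usable in numeric queries.
// Values of any other type (maps, slices, bytes, nil) are skipped here.
func groupAttributes(attrs map[string]any) attributesGroup {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic column order for tests

	var g attributesGroup
	for _, k := range keys {
		switch v := attrs[k].(type) {
		case bool:
			g.BoolKeys = append(g.BoolKeys, k)
			g.BoolValues = append(g.BoolValues, v)
		case int64:
			g.IntKeys = append(g.IntKeys, k)
			g.IntValues = append(g.IntValues, v)
		case float64:
			g.DoubleKeys = append(g.DoubleKeys, k)
			g.DoubleValues = append(g.DoubleValues, v)
		case string:
			g.StrKeys = append(g.StrKeys, k)
			g.StrValues = append(g.StrValues, v)
		}
	}
	return g
}
```

Because each value keeps its Go type, no `strconv.ParseInt` step is needed before writing, and the integer and double slices map naturally onto numeric array columns.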

Member

so what is the direction you are going in? I think OTLP/JSON is a good representation for complex types, but you mentioned you can't implement it over pdata.Value without an extension in OTEL? Did you open a ticket there? Or did you find an alternative solution?

Contributor Author

Yes, I will only convert complex types to OTLP JSON strings. For basic types, it is not necessary to first convert them to strings and then back. I created a feature request: open-telemetry/opentelemetry-collector#12826.

I am currently trying other solutions, which are quite tricky but don't depend on the OTel collector pipeline data (pdata).

  1. Based on common.pb.go, establish the same processing logic as pcommon.Value. In other words, we need to implement another pcommon.Value; only in this way can we generate data in OTLP/JSON format. This requires us to copy a significant amount of duplicate code.
  2. Convert the complex types in pcommon.Value to our implemented pcommon.Value type. In this way, we can directly add the MarshalValue method, thereby obtaining the corresponding OTLP/JSON data.

Member

If I can make a suggestion - put this issue on a back burner. There's lots of other work to be done to support CH, and support for these complex types is not the highest priority, it can be improved later (especially if the OTEL ticket is solved as you asked for). Aside from complex types support is this PR ready for review?

Member

btw I would suggest creating a child ticket under the main CH Support issue to keep track of these deferred tasks.

Contributor Author

OK, I dropped the complex types and only handle basic types here. The PR is ready for review now.

@zhengkezhou1
Contributor Author

I have a high level question - what is different about the implementation you're trying to write compared to otel-contrib's implementation? I assume some of the value encoding is different, can you summarize it somewhere, perhaps in a README file in the package.

Yes, I will do that.

Contributor Author

@zhengkezhou1 zhengkezhou1 left a comment

I'm not sure if Otel will accept my proposal.

@zhengkezhou1 zhengkezhou1 force-pushed the define-clickhouse-db-model-and-deserialization branch from ac49a71 to 8560ef6 Compare April 9, 2025 23:51
resourceAttributesIntKey = new(proto.ColStr).LowCardinality().Array()
resourceAttributesIntValue = new(proto.ColInt64).Array()
resourceAttributesStrKey = new(proto.ColStr).LowCardinality().Array()
resourceAttributesStrValue = new(proto.ColStr).LowCardinality().Array()
Member

how do we know values are low cardinality? they are user-provided values, can be anything.

Contributor Author

We don't know, but is it common to store over 10,000 values? https://clickhouse.com/docs/sql-reference/data-types/lowcardinality#description
Also, they are very helpful when testing; otherwise, I would need to handle the byte stream myself (which I am trying to do)

Contributor Author

[Image: Capture]

return trace
}

func AttributesGroupToMap(span ptrace.Span, group AttributesGroup) pcommon.Map {
Member

Suggested change
- func AttributesGroupToMap(span ptrace.Span, group AttributesGroup) pcommon.Map {
+ func attributesGroupToMap(group AttributesGroup, warn func(string)) pcommon.Map {
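
One way to read the suggested signature is that the conversion function no longer needs to know which entity receives a warning; the caller passes a `warn` callback and decides where the message goes. The sketch below assumes a reduced, hypothetical `attributesGroup` with only integer attributes and returns a plain Go map instead of `pcommon.Map`.

```go
package main

import "fmt"

// attributesGroup is a hypothetical, reduced stand-in for the PR's
// AttributesGroup, holding only integer attributes as parallel slices.
type attributesGroup struct {
	IntKeys   []string
	IntValues []int64
}

// attributesGroupToMap rebuilds a map from the parallel slices. Problems
// are reported through warn rather than written to a specific span, so the
// function does not depend on where warnings are stored.
func attributesGroupToMap(group attributesGroup, warn func(string)) map[string]int64 {
	n := len(group.IntKeys)
	if len(group.IntValues) != n {
		warn(fmt.Sprintf("int attributes: %d keys but %d values",
			len(group.IntKeys), len(group.IntValues)))
		if len(group.IntValues) < n {
			n = len(group.IntValues) // keep only complete key/value pairs
		}
	}
	out := make(map[string]int64, n)
	for i := 0; i < n; i++ {
		out[group.IntKeys[i]] = group.IntValues[i]
	}
	return out
}
```

A caller converting resource attributes could pass a closure that records the message on the resource, while a caller converting span attributes could pass one that records it on the span; this sketch does not settle which of those the reviewer intended.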

Contributor Author

Does this mean we should write warn directly into the attributes of the relevant entity (like Span Event, Resource, Scope, etc.) instead of writing it into the Span attributes?

@zhengkezhou1 zhengkezhou1 force-pushed the define-clickhouse-db-model-and-deserialization branch from 41987b5 to f852e0b Compare April 20, 2025 13:39
Comment on lines 330 to 519
boolKeyCol := attrColumnsMap[ValueTypeBool].keyCol
boolKeyCol.(*proto.ColArr[string]).Append(group.BoolKeys)
Member

I am not sure I follow the data ownership here. AllAttributeColumns is a static value. So are attrColumnsMap and boolKeyCol. What happens when we call boolKeyCol.Append(group.BoolKeys) where group.BoolKeys is a dynamic value for a given trace only?
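
The ownership concern can be illustrated with plain slices instead of ch-go columns. In this sketch, `sharedKeys`, `appendShared`, and `appendFresh` are hypothetical names; the package-level slice plays the role of a static column buffer, and the behavior of real ch-go columns is not modeled.

```go
package main

// sharedKeys mimics a package-level (static) column buffer that every
// conversion call appends to.
var sharedKeys []string

// appendShared appends one trace's keys to the shared buffer and returns
// the number of rows the buffer now holds. Rows from earlier calls are
// still there, so data from different traces is mixed together.
func appendShared(keys []string) int {
	sharedKeys = append(sharedKeys, keys...)
	return len(sharedKeys)
}

// appendFresh allocates a buffer owned by this call only, so the result
// contains just the keys of the trace being converted.
func appendFresh(keys []string) int {
	col := make([]string, 0, len(keys))
	col = append(col, keys...)
	return len(col)
}
```

Calling `appendShared` for a second trace returns a row count that includes the first trace's keys, whereas `appendFresh` always reflects only the current input. A shared buffer would also need synchronization if conversions run concurrently.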

@zhengkezhou1 zhengkezhou1 marked this pull request as draft April 22, 2025 23:43

// ToDBModel Converts the OTel pipeline Traces into a ClickHouse-compatible format for batch insertion.
// It maps the trace attributes, spans, links and events from the OTel model to the appropriate ClickHouse column types
func ToDBModel(td ptrace.Traces) proto.Input {
Member

please move to the top of the file

If ToDBModel() returns proto.Input, what value does dbmodel.go provide?

Contributor Author

proto.Input is the format compatible with ClickHouse batch insertion. The conversion path could be ptrace -> dbmodel.Trace -> proto.Input, but since the final target is proto.Input, converting to the intermediate structure dbmodel.Trace introduces unnecessary complexity. Why not convert directly from ptrace to proto.Input? Furthermore, dbmodel.go is also used for handling read operations (with clickhouse-go), where it maps ClickHouse table fields to the corresponding Go structs.

Signed-off-by: zhengkezhou1 <[email protected]>
@zhengkezhou1 zhengkezhou1 force-pushed the define-clickhouse-db-model-and-deserialization branch from 1976d7a to b885c23 Compare April 25, 2025 13:53
@zhengkezhou1 zhengkezhou1 marked this pull request as ready for review April 25, 2025 13:54
@dosubot dosubot bot added the v2 label Apr 25, 2025
Comment on lines +31 to +54
timestampCol := traceColumnSet.span.timestamp.Col
timestampCol.(*proto.ColDateTime64).Append(span.StartTimestamp().AsTime())
traceIDCol := traceColumnSet.span.traceID.Col
traceIDCol.(*proto.ColLowCardinality[string]).Append(traceIDToHexString(span.TraceID()))
spanIDCol := traceColumnSet.span.spanID.Col
spanIDCol.(*proto.ColLowCardinality[string]).Append(spanIDToHexString(span.SpanID()))
parentSpanIDCol := traceColumnSet.span.parentSpanID.Col
parentSpanIDCol.(*proto.ColLowCardinality[string]).Append(spanIDToHexString(span.ParentSpanID()))
traceStateCol := traceColumnSet.span.traceState.Col
traceStateCol.(*proto.ColLowCardinality[string]).Append(span.TraceState().AsRaw())
spanNameCol := traceColumnSet.span.name.Col
spanNameCol.(*proto.ColLowCardinality[string]).Append(span.Name())
spanKindCol := traceColumnSet.span.kind.Col
spanKindCol.(*proto.ColLowCardinality[string]).Append(span.Kind().String())
scopeNameCol := traceColumnSet.scope.name.Col
scopeNameCol.(*proto.ColLowCardinality[string]).Append(scope.Name())
scopeVersion := traceColumnSet.scope.version.Col
scopeVersion.(*proto.ColLowCardinality[string]).Append(scope.Version())
durationCol := traceColumnSet.span.duration.Col
durationCol.(*proto.ColDateTime64).Append(span.EndTimestamp().AsTime())
statusCodeCol := traceColumnSet.span.statusCode.Col
statusCodeCol.(*proto.ColLowCardinality[string]).Append(span.Status().Code().String())
statusMessageCol := traceColumnSet.span.statusMessage.Col
statusMessageCol.(*proto.ColLowCardinality[string]).Append(span.Status().Message())
Contributor Author

I understand that not all fields should be LowCardinality. They are all LowCardinality here because String and LowCardinality(String) are written differently, which makes testing and comparison very difficult, as I mentioned before.


resourceAttributes, err := AttributesGroupToMap(dbTrace.Resource.Attributes)
if err != nil {
jptrace.AddWarnings(span, fmt.Sprintf("failed to decode bytes value: %v", err))
Member

nit: there's no indication at this point in the code that the error is about "failed to decode bytes". Such error can be returned near the code that actually attempted decoding byte values. The most we can say here is "failed to decode attributes: %w"
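
The layering suggested here could be sketched as below: the specific message is produced next to the decoding, and the caller adds only the context it actually knows. Both function names are hypothetical, and a map of Base64 strings stands in for the PR's attribute storage.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeBytesValue is a hypothetical low-level helper. The specific
// message lives here, next to the code that attempts the decoding.
func decodeBytesValue(s string) ([]byte, error) {
	b, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, fmt.Errorf("failed to decode bytes value: %w", err)
	}
	return b, nil
}

// decodeAttributes is a hypothetical caller. It only knows that attribute
// decoding failed, so that is all it adds; %w keeps the underlying error
// available to errors.Is and errors.As.
func decodeAttributes(values map[string]string) (map[string][]byte, error) {
	out := make(map[string][]byte, len(values))
	for k, v := range values {
		b, err := decodeBytesValue(v)
		if err != nil {
			return nil, fmt.Errorf("failed to decode attributes: %w", err)
		}
		out[k] = b
	}
	return out, nil
}
```

With this split, the final message reads from the outermost context down to the root cause, and each layer states only what it can verify.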

scope := scopeSpans.Scope()
sc, err := FromDBScope(dbTrace.Scope)
if err != nil {
jptrace.AddWarnings(span, fmt.Sprintf("failed to decode bytes value: %v", err))
Member

same issue with error message

span.SetStartTimestamp(pcommon.NewTimestampFromTime(s.Timestamp))
traceId, err := hex.DecodeString(s.TraceId)
if err != nil {
return span, err
Member

"failed to decode trace ID: %w"

"github.com/jaegertracing/jaeger/internal/jptrace"
)

func FromDBModel(dbTrace Trace) ptrace.Traces {
Member

how would dbTrace object be created before this function is called? Do you have a pointer in the old (full) PR as example?

Contributor Author

Signed-off-by: zhengkezhou1 <[email protected]>
@yurishkuro yurishkuro added this pull request to the merge queue Apr 27, 2025
Merged via the queue into jaegertracing:main with commit 1e691d5 Apr 27, 2025
57 checks passed
amilbcahat pushed a commit to amilbcahat/jaeger that referenced this pull request May 4, 2025
…g#6935)

## Which problem is this PR solving?
- Part of jaegertracing#5058

## Description of the changes
- Based on the `ch-go` wire protocol, convert the OTel traces model to
the ClickHouse native format for batch insertion.

## How was this change tested?
-  unit tests

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `npm run lint` and `npm run test`

---------

Signed-off-by: zhengkezhou1 <[email protected]>
amilbcahat pushed a commit to amilbcahat/jaeger that referenced this pull request May 4, 2025
…g#6935)
Signed-off-by: zhengkezhou1 <[email protected]>
Signed-off-by: amol-verma-allen <[email protected]>
Labels
area/storage changelog:exprimental Change to an experimental part of the code go Pull requests that update go code v2