@pan3793 pan3793 commented Nov 4, 2025

What changes were proposed in this pull request?

Bump gRPC from 1.67 to 1.76, with additional Python package upgrades for consistency:

  • googleapis-common-protos==1.71.0
  • protobuf==6.33.0

And buf v33.0

Fix the shading leaks of the spark-connect jar

before

$ jar tf spark-connect_2.13-4.1.0-preview3.jar | grep '.class$' | grep -v 'org/apache/spark' | grep -v 'org/sparkproject' | grep -v 'META-INF'
javax/annotation/Generated.class
...
javax/ejb/EJB.class
...
javax/persistence/PersistenceContext.class
...
javax/xml/ws/WebServiceRef.class
...
com/google/shopping/type/Price$Builder.class
...
com/google/apps/card/v1/Widget$DataCase.class
...

after

$ jar tf spark-connect_2.13-4.2.0-SNAPSHOT.jar | grep '.class$' | grep -v 'org/apache/spark' | grep -v 'org/sparkproject' | grep -v 'META-INF'
<no-output>
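
The before/after check above can also be scripted, for example to run it in CI. The following is a minimal sketch, not part of this PR: the function names and the in-memory demo jar are hypothetical, and it assumes prefix matching on entry names, which is slightly stricter than the substring matching done by `grep -v`.

```python
import io
import zipfile

# Prefixes treated as Spark-owned or properly shaded; these mirror the
# `grep -v` filters in the command above (assumption: prefix match).
ALLOWED_PREFIXES = ("org/apache/spark", "org/sparkproject", "META-INF")


def find_unshaded_classes(jar, allowed=ALLOWED_PREFIXES):
    """Return the .class entries of a jar that fall outside the allowed prefixes."""
    with zipfile.ZipFile(jar) as zf:
        return sorted(
            name
            for name in zf.namelist()
            if name.endswith(".class") and not name.startswith(allowed)
        )


def build_demo_jar(entries):
    """Build an in-memory jar with empty entries (illustrative contents only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for entry in entries:
            zf.writestr(entry, b"")
    buf.seek(0)
    return buf


demo = build_demo_jar([
    "org/apache/spark/sql/connect/Foo.class",   # hypothetical Spark class
    "org/sparkproject/connect/grpc/Bar.class",  # hypothetical relocated class
    "META-INF/MANIFEST.MF",
    "javax/annotation/Generated.class",         # leaked class from the listing above
])
leaks = find_unshaded_classes(demo)
print(leaks)  # ['javax/annotation/Generated.class']
```

Passing a real jar path instead of the demo buffer should work the same way, since `zipfile.ZipFile` accepts both paths and file-like objects. An empty result corresponds to the `<no-output>` case.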

Why are the changes needed?

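For Python:

  • grpcio v1.75.1 adds official Python 3.14 support
  • googleapis-common-protos v1.71.0 adds official Python 3.14 support, see googleapis/google-cloud-python#14699

For Java:

  • v1.74 removes the dependency on Tomcat's annotation API, see grpc/grpc-java#9179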

Check full release notes at: https://github.com/grpc/grpc/releases

Does this PR introduce any user-facing change?

Possibly. It reduces the risk of class conflicts between Spark and user classes.

How was this patch tested?

Pass GHA, plus manual checks (see above sections).

Was this patch authored or co-authored using generative AI tooling?

No.

@pan3793 pan3793 marked this pull request as ready for review November 4, 2025 11:39
@pan3793
Member Author

pan3793 commented Nov 4, 2025

</execution>
</executions>
</plugin>
<plugin>
Contributor

After removing this, the configuration from the parent's pom.xml will be inherited. A double check is needed to see if there are any negative impacts.

Member Author

@pan3793 pan3793 Nov 4, 2025

After removing org.apache.tomcat:annotations-api, the only included jar left is org.spark-project.spark:unused, so it is safe to remove this configuration and use the global rules inherited from the parent's pom.xml. In addition, in the Spark binary distribution, the spark-connect-common jar exists in neither jars/ nor jars/connect-repl/.

The above conclusion is incorrect; details are explained at #52918.

<pattern>org.checkerframework</pattern>
<shadedPattern>${spark.shade.packageName}.org.checkerframework</shadedPattern>
</relocation>
<relocation>
Contributor

Could you provide relevant links to confirm that it's really not needed?

Member Author

This was mentioned in the PR description. What really matters here is javax.annotation.Generated; see details at grpc/grpc-java#9179.

Contributor

@LuciferYang LuciferYang left a comment

+1, LGTM
Thanks @pan3793

@dongjoon-hyun dongjoon-hyun changed the title from "[SPARK-54177][BUILD] Upgrade gRPC 1.76.0" to "[SPARK-54177][BUILD] Upgrade gRPC to 1.76 and protobuf to 6.33" Nov 4, 2025
Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @pan3793 and @LuciferYang .

Merged to master/4.1.

dongjoon-hyun pushed a commit that referenced this pull request Nov 4, 2025
### What changes were proposed in this pull request?

Bump gRPC from 1.67 to 1.76, with additional Python package upgrades for consistency:

- `googleapis-common-protos==1.71.0`
- `protobuf==6.33.0`

And `buf v33.0`

Fix the shading leaks of the `spark-connect` jar

before
```
$ jar tf spark-connect_2.13-4.1.0-preview3.jar | grep '.class$' | grep -v 'org/apache/spark' | grep -v 'org/sparkproject' | grep -v 'META-INF'
javax/annotation/Generated.class
...
javax/ejb/EJB.class
...
javax/persistence/PersistenceContext.class
...
javax/xml/ws/WebServiceRef.class
...
com/google/shopping/type/Price$Builder.class
...
com/google/apps/card/v1/Widget$DataCase.class
...
```

after

```
$ jar tf spark-connect_2.13-4.2.0-SNAPSHOT.jar | grep '.class$' | grep -v 'org/apache/spark' | grep -v 'org/sparkproject' | grep -v 'META-INF'
<no-output>
```

### Why are the changes needed?

For Python:

- [grpcio v1.75.1](https://github.com/grpc/grpc/releases/tag/v1.75.1) adds official Python 3.14 support
- googleapis-common-protos v1.71.0 adds official Python 3.14 support, see googleapis/google-cloud-python#14699

For Java:

- v1.74 removes dependency on Tomcat's annotation API, see grpc/grpc-java#9179

Check full release notes at: https://github.com/grpc/grpc/releases

### Does this PR introduce _any_ user-facing change?

Possibly. It reduces the risk of class conflicts between Spark and user classes.

### How was this patch tested?

Pass GHA, plus manual checks (see above sections).

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #52874 from pan3793/SPARK-54177.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit e55333c)
Signed-off-by: Dongjoon Hyun <[email protected]>

Member

@dongjoon-hyun dongjoon-hyun left a comment

Oh, this seems to break the commit builders of both the master and branch-4.1 branches. Let me revert this.

#12 2.368 ERROR: Cannot install grpcio-status==1.71.2 and protobuf==6.33.0 because these package versions have conflicting dependencies.
#12 2.368 
#12 2.368 The conflict is caused by:
#12 2.368     The user requested protobuf==6.33.0
#12 2.368     grpcio-status 1.71.2 depends on protobuf<6.0dev and >=5.26.1
#12 2.368 
#12 2.368 Additionally, some packages in these conflicts have no matching distributions available for your environment:
#12 2.368     protobuf
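
The conflict reported above comes down to a range check: the requested `protobuf==6.33.0` lies outside the `>=5.26.1, <6.0dev` range that grpcio-status 1.71.2 declares. A minimal sketch of that check follows. It is a simplification and not how pip resolves dependencies: the helper names are hypothetical, the `dev` marker is simply dropped rather than handled per PEP 440, and `5.29.5` is only an illustrative in-range version.

```python
def parse_release(version):
    """Parse a dotted version such as '6.33.0' into a tuple of ints.

    Simplification: a trailing 'dev' marker (as in '6.0dev') is dropped,
    so this is not a full PEP 440 implementation.
    """
    cleaned = version.replace("dev", "")
    return tuple(int(part) for part in cleaned.split("."))


def in_range(version, lower, upper):
    """Return True if lower <= version < upper, comparing release tuples."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


# Range declared by grpcio-status 1.71.2, taken from the error log above.
LOWER, UPPER = "5.26.1", "6.0dev"

print(in_range("6.33.0", LOWER, UPPER))  # False: the pin requested by the build
print(in_range("5.29.5", LOWER, UPPER))  # True: an illustrative 5.x version
```

For real tooling, a proper version-specifier parser would be the safer choice; the tuple comparison here is only meant to make the failure in the log easy to read.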

@dongjoon-hyun
Member

It seems that some leftovers still exist.

@pan3793
Member Author

pan3793 commented Nov 4, 2025

@dongjoon-hyun I see the issue: Python 3.14 uses a different grpc version, so it was not covered by the search-and-replace. I will send a fixed version soon.
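
Leftovers of this kind can be caught by collecting every pin of a package across the build files and flagging packages that end up with more than one version. The sketch below assumes `name==version` style pins; the file names and their contents are hypothetical and do not reflect the actual Spark repository layout.

```python
import re
from collections import defaultdict

# Matches pins such as `grpcio==1.76.0` or `grpcio-status==1.71.2`.
PIN_PATTERN = re.compile(r"\b(grpcio(?:-status)?|protobuf)==([0-9]+(?:\.[0-9]+)*)")


def collect_pins(files):
    """Map each package name to {version: [file names]} across all given files."""
    pins = defaultdict(lambda: defaultdict(list))
    for name, text in files.items():
        for package, version in PIN_PATTERN.findall(text):
            pins[package][version].append(name)
    return pins


def inconsistent(pins):
    """Return only the packages that are pinned to more than one version."""
    return {pkg: dict(versions) for pkg, versions in pins.items() if len(versions) > 1}


# Hypothetical file names and contents, for illustration only.
files = {
    "requirements.txt": "grpcio==1.76.0\ngrpcio-status==1.76.0\nprotobuf==6.33.0\n",
    "Dockerfile.py314": "pip install grpcio==1.76.0 grpcio-status==1.71.2 protobuf==6.33.0\n",
}

for package, versions in inconsistent(collect_pins(files)).items():
    print(package, versions)
```

In this sample only grpcio-status is reported, because it is the one package whose pin differs between the two files.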

@dongjoon-hyun
Member

Thank you!

pan3793 added a commit to pan3793/spark that referenced this pull request Nov 4, 2025

Closes apache#52874 from pan3793/SPARK-54177.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Nov 5, 2025
Second attempt of #52874, fixed the grpcio deps version for Python 3.14, which was missed in the previous attempt.

Closes #52879 from pan3793/SPARK-54177-2.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Nov 5, 2025
Second attempt of #52874, fixed the grpcio deps version for Python 3.14, which was missed in the previous attempt.

Closes #52879 from pan3793/SPARK-54177-2.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 83e49b7)
Signed-off-by: Dongjoon Hyun <[email protected]>
huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025

Closes apache#52874 from pan3793/SPARK-54177.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025
Second attempt of apache#52874, fixed the grpcio deps version for Python 3.14, which was missed in the previous attempt.

Closes apache#52879 from pan3793/SPARK-54177-2.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>