Commit 52bbff3

Misc fixes and docs update (#195)
* Don't force test runs in ndarray and framework
  Signed-off-by: Ryan Nett <[email protected]>
* gitignore bazel config files
  Signed-off-by: Ryan Nett <[email protected]>
* Add CONTRIBUTING.md
  Signed-off-by: Ryan Nett <[email protected]>
* Add note about code generation
  Signed-off-by: Ryan Nett <[email protected]>
* Updates
  Signed-off-by: Ryan Nett <[email protected]>
* Use set TF_CUDA_COMPUTE_CAPABILITIES by default
  Signed-off-by: Ryan Nett <[email protected]>
* Add dedicated native builds section
  Signed-off-by: Ryan Nett <[email protected]>
* Fix quoting
  Signed-off-by: Ryan Nett <[email protected]>
1 parent d8e212e commit 52bbff3

6 files changed, +121 -22 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -53,3 +53,5 @@ gradleBuild
 .classpath
 
 **/target
+.tf_configure.bazelrc
+.clwb/

CONTRIBUTING.md

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
# Building and Contributing to TensorFlow Java

## Building

To build all the artifacts, run `mvn install` at the root of this repository (or the Maven command of your choice). You can also build artifacts with MKL support using `mvn install -Djavacpp.platform.extension=-mkl`, with CUDA support using `mvn install -Djavacpp.platform.extension=-gpu`, or with both using `mvn install -Djavacpp.platform.extension=-mkl-gpu`.

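The three extension values follow one pattern, which the sketch below makes explicit. The helper names `platform_extension` and `maven_install_command` are hypothetical and not part of this repository; the script only prints the Maven command rather than running it.

```shell
# Hypothetical helper: map "want MKL" / "want GPU" flags (1 or 0) to the
# javacpp.platform.extension value described above.
platform_extension() {
  ext=""
  if [ "$1" = "1" ]; then ext="${ext}-mkl"; fi
  if [ "$2" = "1" ]; then ext="${ext}-gpu"; fi
  printf '%s' "$ext"
}

# Hypothetical helper: print (not run) the Maven command for the chosen options.
maven_install_command() {
  ext="$(platform_extension "$1" "$2")"
  if [ -n "$ext" ]; then
    printf 'mvn install -Djavacpp.platform.extension=%s\n' "$ext"
  else
    printf 'mvn install\n'
  fi
}

# MKL and GPU together.
maven_install_command 1 1
```

Printing the command first is a cheap way to check the flags before starting a build that may take hours.
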
When building this project for the first time in a given workspace, the build script will attempt to download the [TensorFlow runtime library sources](https://github.com/tensorflow/tensorflow) and build all of the native code for your platform. This requires a valid environment for building TensorFlow, including the [bazel](https://bazel.build/) build tool and a few Python dependencies (see the [TensorFlow documentation](https://www.tensorflow.org/install/source) for more details).

This step can take multiple hours on a regular laptop. However, you can skip the native build completely if you are working on a version that already has pre-compiled native artifacts for your platform [available on Sonatype OSS Nexus repository](#Snapshots). To do so, activate the `dev` profile in your Maven command to use those artifacts instead of building them from scratch (e.g. `mvn install -Pdev`).

Modifying the native op generation code (not the annotation processor) or the JavaCPP configuration (not the abstract Pointers) requires a complete build to reflect the changes; otherwise `-Pdev` should be fine.

### Native Builds

In some cases, such as when adding GPU support or re-generating op classes, you will need to re-build the native library. Nearly all of this work is building TensorFlow, which by default is configured for the [CI](.github/workflows/ci.yml). The build configuration can be customized using the same methods as TensorFlow, so if you're building locally, you may need to clone the [tensorflow](https://github.com/tensorflow/tensorflow) project, run its configuration script (`./configure`), and copy the resulting `.tf_configure.bazelrc` to `tensorflow-core-api`. This overrides the default options, and you can add to it manually (e.g. adding `build --copt="-g"` to build with debugging info).

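The copy-and-extend step could be scripted roughly as follows. This is a minimal sketch: `install_bazelrc` is a hypothetical helper, and the paths in the usage comment are placeholders for wherever your TensorFlow checkout and this repository live.

```shell
# Hypothetical helper: copy a configured .tf_configure.bazelrc from a TensorFlow
# checkout ($1) into tensorflow-core-api ($2), then append any extra bazel
# options (remaining arguments), one per line.
install_bazelrc() {
  tf_checkout="$1"
  api_dir="$2"
  shift 2
  cp "$tf_checkout/.tf_configure.bazelrc" "$api_dir/.tf_configure.bazelrc" || return 1
  for option in "$@"; do
    printf '%s\n' "$option" >> "$api_dir/.tf_configure.bazelrc"
  done
}

# Example (placeholder paths; run ./configure in the TensorFlow checkout first):
#   install_bazelrc ~/src/tensorflow tensorflow-core/tensorflow-core-api 'build --copt="-g"'
```
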
### GPU Support

Currently, due to build time constraints, the GPU binaries only support compute capabilities 3.5 and 7.0. To use an unsupported GPU, you have to build the binaries yourself after changing the value [here](tensorflow-core/tensorflow-core-api/build.sh#L27), setting the environment variable `TF_CUDA_COMPUTE_CAPABILITIES`, or configuring it in a bazel rc file (e.g. `build --action_env TF_CUDA_COMPUTE_CAPABILITIES="6.1"`). While this is far from ideal, we are working on getting more build resources, and for now this is the best option.

To build for GPU, pass `-Djavacpp.platform.extension=-gpu` to maven. By default, the CI options are used for the bazel build; see the above section for more info. If you add `bazelrc` files, make sure the `TF_CUDA_COMPUTE_CAPABILITIES` value in them matches the value set elsewhere, as it will take precedence if present.

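The environment-variable route relies on shell default expansion: `build.sh` keeps a value you exported and only falls back to `3.5,7.0` when none is set. The sketch below illustrates that behaviour in isolation; the function names are hypothetical, and the build command is printed rather than run.

```shell
# Mirrors the default-with-override pattern used in build.sh: keep a value the
# caller already set, otherwise fall back to the CI default "3.5,7.0".
resolve_compute_capabilities() {
  printf '%s\n' "${TF_CUDA_COMPUTE_CAPABILITIES:-3.5,7.0}"
}

# Hypothetical helper: print (not run) a GPU build command for one capability.
gpu_build_command() {
  printf 'TF_CUDA_COMPUTE_CAPABILITIES=%s mvn install -Djavacpp.platform.extension=-gpu\n' "$1"
}

# Example: a GPU with compute capability 6.1.
gpu_build_command 6.1
```
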
## Running Tests

`ndarray` can be tested using the maven `test` target. `tensorflow-core` and `tensorflow-framework`, however, should be tested using the `integration-test` target, due to the need to include native binaries. Their tests will **not** be run when using the `test` target of parent projects, but will be run by `install` or `integration-test`. If you see a `no jnitensorflow in java.library.path` error from tests, it is likely because you're running the wrong test target.

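As a quick reference, the rule above can be written as a small lookup. `test_goal` is a hypothetical helper, not something shipped with the project; it only prints the goal name to pass to Maven from the module's directory.

```shell
# Hypothetical helper: print the Maven goal to use for a module's tests,
# following the rule described above.
test_goal() {
  case "$1" in
    ndarray) printf 'test\n' ;;
    tensorflow-core*|tensorflow-framework) printf 'integration-test\n' ;;
    *) printf 'unknown module: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Print (not run) the command to use inside tensorflow-framework.
printf 'mvn %s\n' "$(test_goal tensorflow-framework)"
```
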
### Native Crashes

Occasionally tests will fail with a message like:

```
Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.22.0:test(default-test)on project tensorflow-core-api:There are test failures.

Please refer to C:\mpicbg\workspace\tensorflow\java\tensorflow-core\tensorflow-core-api\target\surefire-reports for the individual test results.
Please refer to dump files(if any exist)[date]-jvmRun[N].dump,[date].dumpstream and[date]-jvmRun[N].dumpstream.
The forked VM terminated without properly saying goodbye.VM crash or System.exit called?
Command was cmd.exe/X/C"C:\Users\me\.jdks\adopt-openj9-1.8.0_275\jre\bin\java -jar C:\Users\me\AppData\Local\Temp\surefire236563113746082396\surefirebooter5751859365434514212.jar C:\Users\me\AppData\Local\Temp\surefire236563113746082396 2020-12-18T13-57-26_766-jvmRun1 surefire2445852067572510918tmp surefire_05950149004635894208tmp"
Error occurred in starting fork,check output in log
Process Exit Code:-1
Crashed tests:
org.tensorflow.TensorFlowTest
org.apache.maven.surefire.booter.SurefireBooterForkException:The forked VM terminated without properly saying goodbye.VM crash or System.exit called?
Command was cmd.exe/X/C"C:\Users\me\.jdks\adopt-openj9-1.8.0_275\jre\bin\java -jar C:\Users\me\AppData\Local\Temp\surefire236563113746082396\surefirebooter5751859365434514212.jar C:\Users\me\AppData\Local\Temp\surefire236563113746082396 2020-12-18T13-57-26_766-jvmRun1 surefire2445852067572510918tmp surefire_05950149004635894208tmp"
Error occurred in starting fork,check output in log
Process Exit Code:-1
Crashed tests:
org.tensorflow.TensorFlowTest
at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:671)
at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:533)
at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:278)
at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:244)
```

This happens because the native code crashed (e.g. because of a segfault). The crash should have created a dump file somewhere in the project that you can use to tell what caused the issue.

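One way to locate such files is a recursive search from the project root. In this sketch, `find_crash_dumps` is a hypothetical helper; the `*.dump` and `*.dumpstream` patterns come from the Surefire message above, while `hs_err_pid*.log` (the HotSpot fatal error log name) is an assumption, since other JVMs such as OpenJ9 name their dumps differently.

```shell
# Hypothetical helper: list crash dump candidates under a directory (default: .).
# Adjust the patterns for the JVM you are actually running.
find_crash_dumps() {
  find "${1:-.}" -type f \
    \( -name '*.dump' -o -name '*.dumpstream' -o -name 'hs_err_pid*.log' \) \
    -print
}

# Example: search the whole repository from its root.
#   find_crash_dumps .
```
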
## Contributing

### Formatting

Java sources should be formatted according to the [Google style guide](https://google.github.io/styleguide/javaguide.html). The style can be imported into [IntelliJ](https://github.com/google/styleguide/blob/gh-pages/intellij-java-google-style.xml) and [Eclipse](https://github.com/google/styleguide/blob/gh-pages/eclipse-java-google-style.xml). [Google's C++ style guide](https://google.github.io/styleguide/cppguide.html) should also be used for C++ code.

### Code generation

Code generation for `Ops` and related classes is done during `tensorflow-core-api`'s `compile` phase, using the annotation processor in `tensorflow-core-generator`. If you change or add any operator classes (annotated with `org.tensorflow.op.annotation.Operator`) or endpoint methods (annotated with `org.tensorflow.op.annotation.Endpoint`), or change the annotation processor, be sure to re-run `mvn install` in `tensorflow-core-api` (`-Pdev` is fine for this; it just needs to run the annotation processor).

### Working with Bazel generation

`tensorflow-core-api` uses Bazel-built C++ code generation to generate most of the `@Operator` classes. See [Native Builds](#native-builds) for instructions on configuring the bazel build. To run the code generation, use the `//:java_op_generator` target. The resulting binary has good help text (viewable in [op_gen_main.cc](tensorflow-core/tensorflow-core-api/src/bazel/op_generator/op_gen_main.cc#L31-L48)). Generally, it should be called with arguments that are something like:

```
bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/libtensorflow_cc.so --output_dir=src/gen/java --api_dirs=bazel-tensorflow-core-api/external/org_tensorflow/tensorflow/core/api_def/base_api,src/bazel/api_def
```

(called in `tensorflow-core-api`).
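
The argument line has a fixed shape: the TensorFlow shared library, an output directory, and a comma-separated list of API definition directories. A minimal sketch for assembling it is shown below; `op_generator_args` is a hypothetical helper that only prints the arguments, and the location of the generator binary itself is left out because it depends on your bazel output layout.

```shell
# Hypothetical helper: print the generator arguments in the shape shown above.
# $1 = TensorFlow shared library, $2 = output directory, rest = API definition dirs.
op_generator_args() {
  library="$1"
  output_dir="$2"
  shift 2
  api_dirs=""
  for dir in "$@"; do
    # Join directories with commas, without a leading comma.
    api_dirs="${api_dirs:+$api_dirs,}$dir"
  done
  printf '%s --output_dir=%s --api_dirs=%s\n' "$library" "$output_dir" "$api_dirs"
}

op_generator_args \
  bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/libtensorflow_cc.so \
  src/gen/java \
  bazel-tensorflow-core-api/external/org_tensorflow/tensorflow/core/api_def/base_api \
  src/bazel/api_def
```
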

README.md

Lines changed: 8 additions & 19 deletions
@@ -34,26 +34,17 @@ The following describes the layout of the repository and its different artifacts
 * Intended audience: any developer who needs a Java n-dimensional array implementation, whether or not they
 use it with TensorFlow
 
-## Building Sources
 
-To build all the artifacts, simply invoke the command `mvn install` at the root of this repository (or
-the Maven command of your choice). It is also possible to build artifacts with support for MKL enabled with
-`mvn install -Djavacpp.platform.extension=-mkl` or CUDA with `mvn install -Djavacpp.platform.extension=-gpu`
-or both with `mvn install -Djavacpp.platform.extension=-mkl-gpu`.
+## Communication
 
-When building this project for the first time in a given workspace, the script will attempt to download
-the [TensorFlow runtime library sources](https://github.com/tensorflow/tensorflow) and build of all the native code
-for your platform. This requires a valid environment for building TensorFlow, including the [bazel](https://bazel.build/)
-build tool and a few Python dependencies (please read [TensorFlow documentation](https://www.tensorflow.org/install/source)
-for more details).
+This repository is maintained by TensorFlow JVM Special Interest Group (SIG). You can easily join the group
+by subscribing to the [[email protected]](https://groups.google.com/a/tensorflow.org/forum/#!forum/jvm)
+mailing list, or you can simply send pull requests and raise issues to this repository.
+There is also a [sig-jvm Gitter channel](https://gitter.im/tensorflow/sig-jvm).
 
-This step can take multiple hours on a regular laptop. It is possible though to skip completely the native build if you are
-working on a version that already has pre-compiled native artifacts for your platform [available on Sonatype OSS Nexus repository](#Snapshots).
-You just need to activate the `dev` profile in your Maven command to use those artifacts instead of building them from scratch
-(e.g. `mvn install -Pdev`).
+## Building Sources
 
-Note that modifying any source files under `tensorflow-core` may impact the low-level TensorFlow bindings, in which case a
-complete build could be required to reflect the changes.
+See [CONTRIBUTING.md](CONTRIBUTING.md#building).
 
 ## Using Maven Artifacts
 
@@ -162,6 +153,4 @@ This table shows the mapping between different version of TensorFlow for Java an
 
 ## How to Contribute?
 
-This repository is maintained by TensorFlow JVM Special Interest Group (SIG). You can easily join the group
-by subscribing to the [[email protected]](https://groups.google.com/a/tensorflow.org/forum/#!forum/jvm)
-mailing list, or you can simply send pull requests and raise issues to this repository.
+Contributions are welcome, guidelines are located in [CONTRIBUTING.md](CONTRIBUTING.md).

ndarray/pom.xml

Lines changed: 0 additions & 1 deletion
@@ -80,7 +80,6 @@
 <forkCount>1</forkCount>
 <reuseForks>false</reuseForks>
 <argLine>-Xmx2G -XX:MaxPermSize=256m</argLine>
-<skipTests>false</skipTests>
 <includes>
   <include>**/*Test.java</include>
 </includes>

tensorflow-core/tensorflow-core-api/build.sh

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ fi
 
 if [[ "${EXTENSION:-}" == *gpu* ]]; then
     export BUILD_FLAGS="$BUILD_FLAGS --config=cuda"
-    export TF_CUDA_COMPUTE_CAPABILITIES="3.5,7.0"
+    export TF_CUDA_COMPUTE_CAPABILITIES="${TF_CUDA_COMPUTE_CAPABILITIES:-"3.5,7.0"}"
     if [[ -z ${TF_CUDA_PATHS:-} ]] && [[ -d ${CUDA_PATH:-} ]]; then
         # Work around some issue with Bazel preventing it from detecting CUDA on Windows
         export TF_CUDA_PATHS="$CUDA_PATH"

tensorflow-framework/pom.xml

Lines changed: 0 additions & 1 deletion
@@ -94,7 +94,6 @@
 <forkCount>1</forkCount>
 <reuseForks>false</reuseForks>
 <argLine>-Xmx2G -XX:MaxPermSize=256m</argLine>
-<skipTests>false</skipTests>
 <includes>
   <include>**/*Test.java</include>
 </includes>
