Apache Iceberg C++ library
You do not need Java to use Apache Iceberg™.
It's an alternative to iceberg-cpp. The library is part of our extension for Greenplum, which lets Greenplum read Iceberg data from S3-compatible storage using HMS and Nessie catalogs. The extension is not open source yet, but we are considering releasing it.
Source: https://iceberg.apache.org/status/
Data type | Iceberg version | Cxx | Java |
---|---|---|---|
boolean | 2 | + | + |
int | 2 | + | + |
float | 2 | + | + |
double | 2 | + | + |
decimal | 2 | + | + |
date | 2 | + | + |
time | 2 | + | + |
timestamp | 2 | + | + |
timestamptz | 2 | + | + |
timestamp_ns | 3 | + | + |
timestamptz_ns | 3 | + | + |
string | 2 | + | + |
uuid | 2 | + | + |
fixed | 2 | + | + |
binary | 2 | + | + |
variant | 3 | - | + |
list | 2 | + | + |
map | 2 | - | + |
struct | 2 | - | + |
unknown | 3 | + | ? |
Datetime restrictions are defined in the Iceberg spec. For `date`, the underlying type is `int32`. For `time*`, it's `int64`. `timestamp` and `timestamptz` store microseconds from `1970-01-01 00:00:00.000000`. `timestamp_ns` and `timestamptz_ns` store nanoseconds from `1970-01-01 00:00:00.000000000`. The `tz` suffix means the time is adjusted to UTC.
File format | Cxx | Java |
---|---|---|
Parquet | + | + |
ORC | - | + |
Puffin | + | + |
Avro | - | + |
File IO | Cxx | Java |
---|---|---|
Local Filesystem | + | + |
Hadoop Filesystem | - | + |
S3 Compatible | + | + |
GCS Compatible | - | + |
ADLS Compatible | - | + |
Not implemented
Not implemented
Operation | Iceberg version | Cxx | Java |
---|---|---|---|
Plan with data file | 1,2 | + | + |
Plan with position deletes | 2 | + | + |
Plan with equality deletes | 2 | + | + |
Plan with puffin statistics | 1,2 | - | + |
Read data file | 1,2 | + | + |
Read with position deletes | 2 | + | + |
Read with equality deletes | 2 | + | + |
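To make "read with position deletes" concrete: in the Iceberg spec, a position delete file lists zero-based row positions within a data file, and a reader skips those rows. The sketch below shows the idea on plain row positions. It is a simplified illustration, not this library's implementation: `LiveRowPositions` is a hypothetical name, and the delete positions are assumed to be already filtered to a single data file and sorted in ascending order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; not this library's API.
// Returns the positions in [0, row_count) that are not listed in
// deleted_positions. deleted_positions must be sorted ascending; duplicates
// and out-of-range entries are tolerated and ignored.
inline std::vector<std::int64_t> LiveRowPositions(
    std::int64_t row_count, const std::vector<std::int64_t>& deleted_positions) {
  assert(row_count >= 0);
  assert(std::is_sorted(deleted_positions.begin(), deleted_positions.end()));

  std::vector<std::int64_t> live;
  live.reserve(static_cast<std::size_t>(row_count));

  auto next_deleted = deleted_positions.begin();
  for (std::int64_t pos = 0; pos < row_count; ++pos) {
    // Advance past delete entries that refer to earlier positions.
    while (next_deleted != deleted_positions.end() && *next_deleted < pos) {
      ++next_deleted;
    }
    // Skip the row if the current delete entry matches this position.
    if (next_deleted != deleted_positions.end() && *next_deleted == pos) {
      continue;
    }
    live.push_back(pos);
  }
  return live;
}
```

With five rows and deletes at positions 1 and 3, the function returns positions 0, 2 and 4. Because both inputs are walked in order, the work is linear in the number of rows plus the number of delete entries.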
Operation | Iceberg version | Cxx | Java |
---|---|---|---|
Append data | 1,2 | + | + |
Write position deletes | 2 | - | + |
Write equality deletes | 2 | - | + |
Write deletion vectors | 3 | + | + |
Table Operation | Rest | Glue | HMS |
---|---|---|---|
listTable | - | - | - |
createTable | - | - | - |
dropTable | - | - | - |
loadTable | +- | - | +- |
updateTable | - | - | - |
renameTable | - | - | - |
tableExists | +- | - | +- |
createView | - | - | - |
dropView | - | - | - |
listView | - | - | - |
viewExists | - | - | - |
replaceView | - | - | - |
renameView | - | - | - |
listNamespaces | - | - | - |
createNamespace | - | - | - |
dropNamespace | - | - | - |
namespaceExists | - | - | - |
updateNamespaceProperties | - | - | - |
loadNamespaceMetadata | - | - | - |
- C++20 compliant compiler
- CMake 3.20 or higher
- OpenSSL
You have to download Apache Arrow dependencies first.
mkdir _deps && cd _deps
git clone --single-branch -b maint-15.0.2 https://github.com/apache/arrow.git
cd arrow && git apply ../../vendor/arrow/fix_c-ares_url.patch && cd ..
./arrow/cpp/thirdparty/download_dependencies.sh ./arrow-thirdparty
cd ..
mkdir _build
cd _build
ln -s ../_deps/arrow-thirdparty arrow-thirdparty
cmake -GNinja ../
ninja
cd tests/
../iceberg/iceberg-cpp-test
../iceberg/common/fs/iceberg_common_fs_test
./iceberg_local_test