COPY command relies on column order in DataFrame

The write operation currently generates a `COPY` command like the following:

```sql
COPY "PUBLIC"."some_table" FROM 's3://some-bucket/tmp/manifest.json' CREDENTIALS 'aws_access_key_id=__;aws_secret_access_key=__' FORMAT AS CSV NULL AS '@NULL@' manifest 
```

This relies on the DataFrame having its columns in the same order as the target table, if that table already exists. However, the `COPY` command supports an explicit column list or JSONPaths expressions to map source fields to table columns ([documentation](http://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-column-mapping.html)). It would be nice to support at least the column list, potentially as an option on the write operation, such as:

```scala
df.write
  .format("com.databricks.spark.redshift")
  .option("url", "jdbc:redshift://redshifthost:5439/database?user=username&password=pass")
  .option("dbtable", "my_table_copy")
  .option("tempdir", "s3n://path/for/temp/data")
  .option("include_column_list", "true")
  .mode("error")
  .save()
```

Looks like this should be fairly straightforward to add [here](https://github.com/databricks/spark-redshift/blob/master/src/main/scala/com/databricks/spark/redshift/RedshiftWriter.scala#L89-L103).
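To make the idea concrete, here is a minimal sketch of how the statement could be assembled with an optional column list. It is not the actual `RedshiftWriter` code: `CopySqlSketch`, `quoteIdent`, `copySql`, and the `includeColumnList` flag are hypothetical names used only for illustration, and the rest of the statement simply mirrors the `COPY` command shown above.

```scala
// Hypothetical sketch, not the real RedshiftWriter implementation.
object CopySqlSketch {
  // Wrap an identifier in double quotes, doubling any embedded quotes.
  def quoteIdent(name: String): String =
    "\"" + name.replace("\"", "\"\"") + "\""

  // Build a COPY statement; the column list is emitted only when requested.
  def copySql(
      table: String,              // already-quoted table name, e.g. "PUBLIC"."some_table"
      manifestUrl: String,
      credentials: String,
      columns: Seq[String],       // assumed to come from the DataFrame schema
      includeColumnList: Boolean  // hypothetical "include_column_list" option
  ): String = {
    val columnList =
      if (includeColumnList && columns.nonEmpty)
        columns.map(quoteIdent).mkString(" (", ", ", ")")
      else ""
    s"COPY $table$columnList FROM '$manifestUrl' CREDENTIALS '$credentials' " +
      "FORMAT AS CSV NULL AS '@NULL@' manifest"
  }
}
```

In the writer, `columns` would presumably be `df.schema.fieldNames`, so the column list follows the DataFrame's own order and Redshift matches fields to table columns by name rather than by position.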


COPY command relies on column order in DataFrame #340
