I have searched the issues and found no similar issues.
Description
If the amount of data in a partition is greater than the INSERT_BARCH_SIZE parameter, each task submits multiple StreamLoad tasks. If the task fails and is retried, all data in the partition is resubmitted through StreamLoad, including the batches that were already written successfully. This causes duplicate data.
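The failure mode described above can be reproduced with a small simulation. This is only a sketch of the reported behavior, not the connector's code: `write_partition`, `run_task_with_retry`, the in-memory `table` list, and the simulated failure are all hypothetical names and assumptions made for illustration. Each `sink.extend(...)` call stands in for one StreamLoad request that has been committed.

```python
def write_partition(rows, batch_size, sink, fail_on_batch=None):
    """Write one partition in batches; each batch models one committed StreamLoad.

    fail_on_batch is a hypothetical hook that raises before the given
    batch index is sent, simulating a task failure partway through.
    """
    for start in range(0, len(rows), batch_size):
        batch_index = start // batch_size
        if batch_index == fail_on_batch:
            raise RuntimeError(f"simulated failure on batch {batch_index}")
        sink.extend(rows[start:start + batch_size])


def run_task_with_retry(rows, batch_size, sink, fail_on_batch):
    """Model a task retry: the whole partition is replayed from the start."""
    try:
        write_partition(rows, batch_size, sink, fail_on_batch)
    except RuntimeError:
        # The retry has no record of which batches already succeeded,
        # so it sends every row in the partition again.
        write_partition(rows, batch_size, sink)


table = []             # stands in for the target table
rows = list(range(10)) # one partition with 10 rows
run_task_with_retry(rows, batch_size=4, sink=table, fail_on_batch=2)

duplicates = sorted(r for r in set(table) if table.count(r) > 1)
print(len(table), duplicates)
```

In this run the first two batches (rows 0 to 7) are committed before the simulated failure, and the retry then writes all 10 rows again, so the table ends up with 18 rows and rows 0 to 7 appear twice.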