Pipeline: get all caught errors

baruch.machluf · July 22, 2019, 9:49am

Hi,

I’m loading GZIP files from S3 into a MemSQL cluster using a pipeline.
At first I got many ERROR 1262 ER_WARN_TOO_MANY_RECORDS errors, so I added IGNORE ALL ERRORS to the pipeline and reran it.

Now, after it loads the data, I want to know how many rows it didn’t load.
Where can I find this info?

Thanks!

hanson · July 22, 2019, 8:00pm

This table has pipelines error info:

information_schema.PIPELINES_ERRORS Table

But if the pipeline skips some rows because of ERROR 1262, it’s possible that not all of those rows will be flagged as errors in that table.

Consider using a SELECT query to count how many rows got loaded, based on a timestamp or other criteria. Then copy your S3 files to a Linux-visible directory and pass them through wc on the command line to count the lines. The difference between the two counts is the number of rows that were not loaded.
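
As a rough sketch of the comparison described above, the following Python counts the lines in the downloaded GZIP files without decompressing them to disk first, then subtracts the number of rows the SELECT query reported. The function names, the `*.gz` pattern, and the assumption of one row per line with no header lines are illustrative, not taken from the thread; the loaded-row count is expected to come from your own query.

```python
import gzip
from pathlib import Path


def count_gzip_lines(path):
    """Count the lines in one gzip file by streaming it.

    Unlike `wc -l`, a final line without a trailing newline is counted too.
    """
    count = 0
    with gzip.open(path, "rb") as fh:
        for _ in fh:
            count += 1
    return count


def count_source_rows(directory, pattern="*.gz"):
    """Sum the line counts of every file in `directory` matching `pattern`.

    Assumes one row per line and no header lines (an assumption; adjust
    if your files differ).
    """
    total = 0
    for path in sorted(Path(directory).glob(pattern)):
        total += count_gzip_lines(path)
    return total


def rows_not_loaded(source_rows, loaded_rows):
    """Difference between rows present in the files and rows in the table."""
    return source_rows - loaded_rows
```

For example, `rows_not_loaded(count_source_rows("/tmp/s3_copy"), loaded)` gives the shortfall, where `/tmp/s3_copy` is a hypothetical local copy of the bucket contents and `loaded` is the result of your SELECT COUNT(*) query.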