I’m loading GZIP files from S3 into a MemSQL cluster using a pipeline.
At first I got many ERROR 1262 ER_WARN_TOO_MANY_RECORDS errors, so I added IGNORE ALL ERRORS to the pipeline and reran it.
Now, after it loads the data, I want to know how many rows it didn’t load.
Where can I find this info?
This table has pipelines error info:
But if it actually does not process some rows because of ERROR 1262 then it’s possible that not all rows will be flagged as errors.
Consider using a SELECT query to count how many rows were loaded, based on a timestamp or other criteria. Then copy your S3 files to a Linux-visible directory and pass them through wc on the command line to count the lines (decompress the GZIP files first, for example with zcat, because wc on the compressed data does not count rows). The difference between the two totals is the number of rows that were not loaded.
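If you would rather script the comparison than run it by hand, here is a minimal Python sketch. It assumes the GZIP files have already been copied from S3 to a local directory and that each row ends with a newline. The function names (`count_gzip_lines`, `count_source_rows`, `missing_rows`) and the `*.gz` pattern are made up for illustration; the loaded row count is whatever your own SELECT COUNT(*) query returns.

```python
import gzip
from pathlib import Path


def count_gzip_lines(path):
    """Count newline characters in one gzip file (like `zcat file | wc -l`)."""
    total = 0
    with gzip.open(path, "rb") as fh:
        # Read in 1 MiB chunks so large files are never held in memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            total += chunk.count(b"\n")
    return total


def count_source_rows(directory, pattern="*.gz"):
    """Sum the line counts of every matching file in a local directory.

    The directory and pattern are assumptions; adjust them to your own layout.
    """
    return sum(count_gzip_lines(p) for p in sorted(Path(directory).glob(pattern)))


def missing_rows(source_rows, loaded_rows):
    """Rows present in the source files but not in the table."""
    return source_rows - loaded_rows
```

For example, `missing_rows(count_source_rows("/data/s3_copy"), loaded)` gives the gap, where `/data/s3_copy` is a placeholder path and `loaded` is the number from your SELECT. Like wc, this counts newlines, so a header line in each file or a final row without a trailing newline will shift the result slightly.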