HDFS Pipeline is not working

Hello all,
I'm trying to load a CSV file from HDFS into a table. The "CREATE PIPELINE" command ran without errors.
But after starting the pipeline, "SELECT * FROM" the table returns no rows.

If the "CREATE PIPELINE" command reports no error, does that mean SingleStore connected to HDFS correctly and can read the file? I double-checked the table DDL and the input data, because loading the same file from S3 worked well.
Is there anything else I should check or test when loading from HDFS?

Hello Iris, can you please share the syntax you used to create the HDFS pipeline? Here are the docs for it: HDFS Pipeline Scenario · SingleStore Documentation

Thank you for replying.

Here's the syntax you asked for.

CREATE PIPELINE diabetes_hdfs
AS LOAD DATA HDFS 'hdfs://hadoop-namenode:8020/user/iris-kang/diabetes_noheader.csv'
CONFIG '{
"hadoop.security.authentication": "kerberos",
"kerberos.user": "iris-kang@EXAMPLE.COM",
"kerberos.keytab": "/home/singlestore/iris-kang.keytab",
"dfs.client.use.datanode.hostname": true,
"dfs.datanode.kerberos.principal": "hdfs/_HOST@EXAMPLE.COM",
"dfs.namenode.kerberos.principal": "hdfs/_HOST@EXAMPLE.COM"
}'
INTO TABLE diabetes_hdfs
FIELDS TERMINATED BY ',';
(I replaced the in-house host names with "hadoop-namenode" and "EXAMPLE.COM".)

There was no error when creating the pipeline, but it doesn't load any data.
Thank you again, Mr. Kumar.
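
A successful CREATE PIPELINE does not by itself show that the file was found and loaded, so a few diagnostic statements may help narrow down where the load stops. The following is only a sketch, assuming the pipeline name diabetes_hdfs from the post above and the standard information_schema pipeline views; the exact columns returned can differ between SingleStore versions.

```sql
-- Check the pipeline's current state (Running, Stopped, or Error).
SHOW PIPELINES;

-- List the files the pipeline has discovered and their load state.
-- An empty result suggests the HDFS path or permissions are the problem.
SELECT *
FROM information_schema.PIPELINES_FILES
WHERE PIPELINE_NAME = 'diabetes_hdfs';

-- Look for extraction or load errors recorded for this pipeline,
-- for example Kerberos authentication or CSV parsing failures.
SELECT *
FROM information_schema.PIPELINES_ERRORS
WHERE PIPELINE_NAME = 'diabetes_hdfs';

-- Pull one row through the pipeline without writing to the table.
TEST PIPELINE diabetes_hdfs LIMIT 1;

-- Run the pipeline in the foreground so errors are returned to the client.
-- The pipeline must be stopped first if it is already running.
STOP PIPELINE diabetes_hdfs;
START PIPELINE diabetes_hdfs FOREGROUND;
```

If PIPELINES_FILES is empty and PIPELINES_ERRORS shows nothing, the foreground run is usually the quickest way to surface a connection or authentication error directly.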