ERROR 1970 ER_SUBPROCESS_TIMEOUT_ERROR: Subprocess timed out when creating a pipeline

Running into a bit of an issue when creating pipelines:

ERROR 1970 ER_SUBPROCESS_TIMEOUT_ERROR: Subprocess timed out. Truncated stderr:

Upon failure, the command.log on the aggregator says:

1958097856 2019-02-05 09:58:25.114  ERROR: write() system call (fd=11) failed with errno: 32 (Broken pipe)
1958097903 2019-02-05 09:58:25.114  ERROR: NotifyAndClose(): Failed writing back to the engine

Any ideas of what might be wrong?

Hello,

Can you share an example of how you are creating the pipeline?

Thanks

Basically any create pipeline statement fails, including the one in the docs:

CREATE PIPELINE library
AS LOAD DATA S3 '<my bucket>'
CONFIG '{"region": "eu-west-1"}'
CREDENTIALS '{"aws_access_key_id": "<key>", "aws_secret_access_key": "<secret>"}'
INTO TABLE `test`
FIELDS TERMINATED BY ',';

We recently had to detach and re-attach some leaves, but before that, everything was working well. The error seems to be more on the OS side?

Hi,

Are you using an on-prem MemSQL cluster or an AWS MemSQL cluster?

Thanks

On-prem, centos-based.

Hi,

You are trying to create the pipeline with the following statement, which will not work because it references an S3 bucket and AWS credentials.

CREATE PIPELINE library AS LOAD DATA S3 '<my bucket>' CONFIG '{"region": "eu-west-1"}' CREDENTIALS '{"aws_access_key_id": "<key>", "aws_secret_access_key": "<secret>"}' INTO TABLE test FIELDS TERMINATED BY ',';

Are you trying to create filesystem pipeline?

Thanks

No, I’m trying to create an S3 pipeline. This was working fine a couple of days ago, after which we detached/attached some leaves. The status for all nodes seems to be ok, but somehow, adding new pipelines no longer works.

Hi,

You cannot create an S3 pipeline on-prem. You need a MemSQL cluster on AWS to create a pipeline with S3.

Thanks

Clarification: I’m using S3 pipelines, but pointing to our on-prem ceph storage. This has worked without a hitch in the past.

E.g. this used to work just fine on-prem:

CREATE PIPELINE pipeline_test
AS LOAD DATA S3 '<bucket>'
CONFIG '{"endpoint_url": "<our ceph endpoint"}'
CREDENTIALS '{"aws_access_key_id": "<key>", "aws_secret_access_key": "<secret>"}'
MAX_PARTITIONS_PER_BATCH 1
REPLACE
INTO TABLE test
LINES TERMINATED BY '\n';

I also just tested running a cluster-in-a-box on my laptop, and S3 pipelines (also against AWS S3) work perfectly fine.

I’ll do a full reinstall tomorrow and report back.

Hi,

Let me double check on my side.

Thanks

Hey Max,

First of all, can we quickly confirm you are on 6.5 or later? If you are on an earlier version, it might be worth upgrading, since S3 pipelines have been significantly improved since then. If you can’t upgrade, you might have to bump pipelines_extractor_get_offsets_timeout_ms, but in 6.5 or later, this should not be necessary.
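
For reference, if you do end up needing to raise it, it's an engine variable, so something along these lines should work. The 60000 ms value is just an illustrative figure, and on older versions you may need to set it in memsql.cnf on the aggregators instead of using SET GLOBAL:

-- check the current extractor timeout
SHOW VARIABLES LIKE 'pipelines_extractor_get_offsets_timeout_ms';

-- raise it cluster-wide; 60000 ms is only an example value
SET GLOBAL pipelines_extractor_get_offsets_timeout_ms = 60000;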

Otherwise, it could be a bona fide connectivity issue. Make sure the Master Aggregator can connect to your Ceph node.
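
If it helps to narrow things down, a filesystem pipeline takes S3 and the network out of the picture, so it's a quick way to check whether the extractor subprocess machinery itself is healthy. A rough sketch, where the pipeline name and path are just placeholders:

-- placeholder pipeline name and path; use any directory readable by the nodes
CREATE PIPELINE fs_sanity_check
AS LOAD DATA FS '/tmp/pipeline_test/*'
INTO TABLE test
FIELDS TERMINATED BY ',';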

Did this work for anyone? I am seeing this suddenly come up out of the blue on our 7.3 cluster.