Showing Data Lineage Using Data Pipelines

Hello,

I am using SingleStore 7.1.5 free edition, and I have a few Kafka pipelines running into my 2-node cluster.
Basically, I am injecting API data into Kafka and from there transferring it to SingleStore using data pipelines and procedures. The Kafka messages are in JSON format, and my data pipeline statement is as follows:

CREATE OR REPLACE PIPELINE deposittransactions_pipeline AS 
LOAD DATA KAFKA '*******/savingTransactions'
CONFIG '{"kafka_version":"0.11.0.0"}'
INTO PROCEDURE savingtransactions_procedure
FORMAT JSON
(	
	@transactiondate <- entityData::transactionDate,
	entityname <- entityName,
	globalid <- globalId,
	processtype <- processType,
	transactionid <- entityData::savingTransactionId,
	accountId <- entityData::accountId,
	paymentTypeId <- entityData::paymentTypeId,
	paymentType <- entityData::paymentType,
	transactionType <- entityData::transactionType,
	description <- entityData::note,
	dateFormat <- entityData::dateFormat,
	locale <- entityData::locale,
	type <- entityData::type,
	amount <- entityData::transactionAmount,
	customerId <- entityData::customerId,
	currencycode <- entityData::currency,
	balanceamount <- entityData::runningBalance,
	submittedby <- entityData::submittedByUsername,
	completejson <- %
)
SET transactiondate = TO_DATE(@transactiondate,'DD MM YYYY'), 
transactionpostingdate = TO_DATE(@transactiondate,'DD MM YYYY');

I want to show column-level data lineage for the pipeline above, but I cannot find proper information about the data pipelines created in SingleStore. I also looked through the SingleStore documentation for anything on data lineage, but unfortunately I did not find any documents.

Is there any solution for this? Does SingleStore support showing data lineage for pipelines? Or is there a tool I can use to derive lineage information from the data pipeline statement?

What do you mean by “column level data lineage on above pipelines?”

Hello @hanson1,

I mean, if we consider the data pipeline above, the source is Kafka and the target is the MemSQL procedure. Here we want to create a lineage that shows the column-to-column mapping from source to target,
like source.entityname - target.entityname, source.globalid - target.globalid … and source.entityData::runningBalance - target.balanceamount, etc.

Hope you understand the concern.
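
For illustration, the column-to-column mapping described above can be recovered from the pipeline statement text itself. This is only a rough sketch, not a SingleStore feature: the function name `extract_lineage` and the regular expressions are assumptions made up for this example, and it only understands the simple `target <- source::path` form used in the statement above. The statement text is pasted in by hand here; it could presumably also be obtained with `SHOW CREATE PIPELINE`, if that is available in your version.

```python
import re

# Matches one mapping entry such as "amount <- entityData::transactionAmount".
# A leading "@" marks a pipeline variable; "%" stands for the whole JSON message.
MAPPING_RE = re.compile(r"(@?\w+)\s*<-\s*(%|\w+(?:::\w+)*)")

# Captures the parenthesised mapping clause that follows FORMAT JSON.
CLAUSE_RE = re.compile(r"FORMAT\s+JSON\s*\(([^)]*)\)", re.IGNORECASE)


def extract_lineage(pipeline_sql):
    """Return (source_path, target_name) pairs from the JSON mapping clause.

    Hypothetical helper for illustration only; it does not follow SET clauses.
    """
    clause = CLAUSE_RE.search(pipeline_sql)
    if clause is None:
        return []
    return [(source, target) for target, source in MAPPING_RE.findall(clause.group(1))]


# Shortened copy of the pipeline statement from the question.
sample = """
CREATE OR REPLACE PIPELINE deposittransactions_pipeline AS
LOAD DATA KAFKA '*******/savingTransactions'
INTO PROCEDURE savingtransactions_procedure
FORMAT JSON
(
    @transactiondate <- entityData::transactionDate,
    entityname <- entityName,
    balanceamount <- entityData::runningBalance,
    completejson <- %
)
SET transactiondate = TO_DATE(@transactiondate,'DD MM YYYY');
"""

for source, target in extract_lineage(sample):
    print(f"source.{source} - target.{target}")
```

Targets that start with `@` are pipeline variables, not columns; to trace them to their final columns (here `transactiondate` and `transactionpostingdate`), the `SET` clause would have to be parsed as well, which this sketch does not attempt.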