MemSQL Pipelines CSV with Transform process with R

Hi, can anyone help with an example of a pipeline that uses R as the transform step? Thanks a lot

Hi @edwin.florescastello

Could you give us more information on using R as a transform step? Do you have any code written that we could help with?

Sure Jacky. This is a basic example that I am trying to implement.

R code:
#!/usr/bin/env Rscript

##------Input files from CSV, to be changed to MemSQL input----------
##setwd("/home/memsql/TestFiles/Sample.csv")

## Read sample data

sampledata <- read.csv("Sample.csv")
sampledata$Test_Calc <- sampledata$Sample * 3
##sampledata
##-----------------------------------------------------------------------
Test_CalcData <- sampledata$Test_Calc

##---Output to file, to be changed to MemSQL table-------
##write.csv(Test_CalcData, 'TestCalc.csv.log')
write.csv(sampledata, "Sample.csv")
##-------------------------------------------------------

MemSQL Pipeline code:
CREATE AGGREGATOR PIPELINE tp2 AS
LOAD DATA FS '/home/memsql/TestFiles/Sample.csv'
WITH TRANSFORM ('file://localhost/home/memsql/TestCode_Pipe.R','','')
INTO TABLE Test_Raw FIELDS TERMINATED BY ',';

CSV
Sample,Test_Calc,
26.48416667,
102.265,
321.6091667,
1,
158.5141667,
1699.884167,
-32.9225,
145.8925,
20.50833333,
4454.87,
533.4758333,
2.32,
26.2875,
1538.288333,
Thanks for the help

I’m not familiar enough with R to know exactly how to do this, but a MemSQL transform reads the input file on stdin and writes its output (for example, CSV) to stdout. Your script currently reads Sample.csv from disk and writes back to the same file, so it never touches stdin or stdout. You should research how to do this in R; I assume it’s possible. I don’t know how good R is as a scripting language, though.

To test your program, you can do

cat your_input_file.csv | ./TestCode_Pipe.R

and, if it prints CSV to the terminal, you have won.
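
For reference, here is a minimal sketch of what a stdin-to-stdout version of the script might look like. It is untested against a real pipeline and rests on several assumptions that are not confirmed in this thread: that the first field of each row is the numeric Sample value, that non-numeric rows such as the header should be dropped, and that Test_Raw expects two columns (Sample, Test_Calc). The helper name transform_lines is made up for illustration. Under Rscript, piped input is read through file("stdin").

```r
#!/usr/bin/env Rscript

# Hypothetical helper: take raw CSV lines, keep the numeric first field
# (assumed to be Sample), and return "Sample,Test_Calc" rows.
transform_lines <- function(lines) {
  # Keep only the text before the first comma on each line.
  first_field <- trimws(sub(",.*$", "", lines))
  # Non-numeric fields (header, blank lines) become NA and are dropped.
  sample <- suppressWarnings(as.numeric(first_field))
  sample <- sample[!is.na(sample)]
  test_calc <- sample * 3
  paste(sample, test_calc, sep = ",")
}

# Read everything the pipeline sends on stdin, then print CSV to stdout.
con <- file("stdin")
input <- readLines(con)
close(con)
writeLines(transform_lines(input))
```

The script probably also needs to be executable (chmod +x TestCode_Pipe.R) before the cat test above will run it. The parsing lives in a function so it can be checked on its own, without a pipeline.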