Whenever we need to add a new column or modify an existing column in the pipeline, we have to:
- Open the pipeline DDL and make the required changes
- Drop the pipeline, or replace it with a CREATE OR REPLACE PIPELINE command
- We are currently hard-coding the pipeline data source file with either a pattern or a specific name. Instead of that, is there an alternate way to pass the data source file name as a parameter when invoking the pipeline? (A sketch of our current setup follows this list.)
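
For context, our current setup looks roughly like the sketch below. The pipeline name, table, and file path are made up for illustration; the point is that the data source file name is baked into the DDL.

```sql
-- Hypothetical sketch of the current setup; orders_pipeline, orders,
-- and the file path are illustrative names, not our real objects.
CREATE PIPELINE orders_pipeline AS
LOAD DATA FS '/data/incoming/orders_2024_01.csv'  -- hard-coded file name
INTO TABLE orders
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

START PIPELINE orders_pipeline;
```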
The issue is that whenever we drop and recreate the pipeline, it picks up all the files matching the file data source pattern we specify, which corrupts the data badly and makes it very difficult to trace. We exercise utmost caution by archiving the previously processed files, but in some instances residual files remain in the data source path, and when we recreate the pipeline and then trigger the job, it picks up all the files in the path.
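
In other words, a sequence like the one below (again with hypothetical names) re-ingests every file still matching the pattern, because dropping the pipeline also discards its record of which files were already loaded:

```sql
-- Illustration of the problematic sequence; names and path are hypothetical.
DROP PIPELINE orders_pipeline;

-- Recreating with a pattern: any residual file in the path matches again.
CREATE PIPELINE orders_pipeline AS
LOAD DATA FS '/data/incoming/orders_*.csv'
INTO TABLE orders
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

-- The new pipeline has no memory of previously loaded files,
-- so every matching file is treated as new and loaded again.
START PIPELINE orders_pipeline;
```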
I would need help with the following:
- Is there a mechanism to pass the data source file name dynamically, as a parameter?
- Or is there a way to just alter the pipeline, without dropping and recreating it, when we need to modify some elements of the pipeline DDL?
- Is there any metadata we can inspect, by running a query or through some other mechanism, to see the pipeline and file name information without opening the pipeline DDL?
I know that once we run the pipeline we can get the processed file names by querying the pipelines_files metadata table.
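
For example, something along these lines; the column names are my assumption based on SingleStore's information_schema.PIPELINES_FILES, and the database name is hypothetical, so please verify against your version:

```sql
-- Sketch: list processed files per pipeline from the metadata table
-- mentioned above. Column names are assumed from SingleStore's
-- information_schema.PIPELINES_FILES; verify against your version.
SELECT PIPELINE_NAME, FILE_NAME, FILE_STATE
FROM information_schema.PIPELINES_FILES
WHERE DATABASE_NAME = 'mydb'  -- hypothetical database name
ORDER BY PIPELINE_NAME, FILE_NAME;
```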
We want to ensure that there is a one-to-one mapping between the pipeline and the data source file name.
I would really appreciate a solution to the above points.
Prathap Reddy G