Some time you want to update a single table from different Flat File Sources.
This post shows how to carry out such logic with the assumption all the file are stored in a single directory and they all have the same structure and file formatting.
I have implemented this using two option
1. Using a Temp Data store
2. Using Procedure.
|LOOP||USED TO ASSIGN AND INCREMENT NO OF TIME FLOW HAS TO BE PROCESSED|
|GET_INTERFACE||SOURCE FILE TO BE PROCESSED BY INTERFACE|
|CHECK||USED TO CHECK WHETHER ALL FILES ARE PROCESSED OR NOT AT EVERY LOOP|
All the variable are Alphanumeric and Historize property.
Using Temp Data store
Temporary Interface – FILE_LIST
Get the list of files in the Temporary Datastore in Work schema, so that we can process Interface sequentially based on the list of files.
Next Step is to Assign the variable ‘ LOOP ‘ = 0 (Zero )
Next Step is to get the File Name to be processed the Interface . Refresh the Variable.
Apply the required query so it fetches the next record in the loop.
Here iam getting the ‘ File name ‘ to be processed based on the Loop Number so the First loop will fetch the First file , second loop next file and keep doing so for others.
SELECT FILE_NAME FROM <%=odiRef.getInfo(“DEST_WORK_SCHEMA”)%>.FILE_LIST
The Next step is to use that File and process the Interface. Here in the Model i have used the ‘ Resource Name – #GET_INTEFACE ‘ as i want to get the right file for processing.
Next Step is to Refresh the Variable ‘ Check ‘ ,which checks if all the interface is been processed or not.
In this Step iam checking whether all the interface is been processed or not.
The logic is pretty simple to subtract the Loop value from the Total count , so if i have 10 files in the first loop i would have
Count(*) = 10
Loop = 1 so remaining interface 10-1 =9 to be process and when its Zero i stop the Flow .
SELECT COUNT(*)-#LOOP FROM <%=odiRef.getInfo(“DEST_WORK_SCHEMA”)%>.FILE_LIST
Next Step is to increment the Value of ‘ LOOP’ as there are still interface to be processed.
This is summary of the overall flow.
1. Get the File list into the Temporpary Datastore ( ‘FILE_LIST’)
2. Assign the Loop to Zero
3. Get the Next File to processed.
4. Process the Interface by passing the value from #GET_INTERFACE variable
5. Refresh the Variable Check
6. Check if there are still Files to be processed or not. If so Next step , else Exit .
7. Increment the Loop by 1.
Using ODI Procedure
You can perform the above logic , using an ODI Procedure , instead of a temporary datastore.
I personally would prefer this method as its easy to maintain and modify when required.
The steps are very simple and as follows.
1. Drop the Temporary Table.
2. Create the Temporary table in the Work Schema.
3. Insert the Required File and Values.
DROP TABLE <%=odiRef.getInfo(“DEST_WORK_SCHEMA”)%>.FILE_LIST
CREATE TABLE <%=odiRef.getInfo(“DEST_WORK_SCHEMA”)%>.FILE_LIST
INTO <%=odiRef.getInfo(“DEST_WORK_SCHEMA”)%>.FILE_LIST VALUES (1,’FILE1.TXT’)
INTO <%=odiRef.getInfo(“DEST_WORK_SCHEMA”)%>.FILE_LIST VALUES (2,’FILE2.TXT’)
INTO <%=odiRef.getInfo(“DEST_WORK_SCHEMA”)%>.FILE_LIST VALUES (3,’FILE3.TXT’)
SELECT * FROM dual
[ Note : Please note its a best practise to use a unique work_schema and use in all the Physical schema defination . I have used a common work_schema called ‘ WORK_SCHEMA’ , so when i create all the temporary data store or procedure they are all in the Temporary work schema ‘ WORK_SCHEMA’ . The other good point will be that if any error occurs during the Interface and due to some reason the temporary objects like C$_,I$_ etc dont get dropped and they will be seen in the main table.]