ODIExperts.com

The blog for Oracle Data Integrator ( ODI )

Why create a separate Work Schema in ODI?

Before jumping into the details of this topic, let me make the definitions of Schema and Work Schema with respect to ODI Topology clear:

SCHEMA – The schema where ODI inserts into and updates the target tables.

WORK_SCHEMA – The schema where all the ODI temporary tables (C$_, I$_, etc.) are created and dropped by ODI.

The Work Schema is an important part of the ODI architecture. As you all know, ODI creates temporary tables (C$_, I$_, etc.) for its loading, integration and transformation from the various sources to the target. In the long run, these temporary tables can leave a lot of junk and ugliness in your target schema.

The best solution is to create a separate database schema or user called “WORK_SCHEMA”, “TEMP” or “ODI_TEMP”, or whatever name fits your environment; the most important part is how you use it.

I very strongly recommend setting this TEMP schema as the ‘Schema (Work Schema)’ option in the Physical Schema for all technologies. The result is that whenever ODI creates temporary tables (C$_, I$_, etc.), they land in this schema, which you can drop or clean out at any time, thus avoiding clutter from temporary ODI tables in the target schema.
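As a rough sketch, on Oracle such a dedicated work schema could be created like this (the user name ODI_TEMP, the password and the tablespace are assumptions; adjust them to your environment and security policy):

```sql
-- Dedicated schema/user to hold the ODI temporary objects (C$_, I$_, etc.)
CREATE USER ODI_TEMP IDENTIFIED BY change_me
  DEFAULT TABLESPACE users
  QUOTA UNLIMITED ON users;

-- The connection user defined in the ODI data server must be able to
-- create and drop tables in this schema; the exact grants depend on
-- whether ODI connects as ODI_TEMP itself or as another user.
GRANT CREATE SESSION, CREATE TABLE, CREATE VIEW TO ODI_TEMP;
```

In Topology, you would then pick this user in the ‘Schema (Work Schema)’ drop-down of the physical schema.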

Let’s see this option diagrammatically.

Option 1 – Keeping the Schema and the Work Schema in Topology the same.


In such a scenario, ODI will create the temp tables (C$_, I$_) in the same schema, and the result, as you can see, is lots of junk. Would you love to have such a scenario in your production box?

SYSTEM (Schema)

EMPLOYEE (Table)
SALES (Table)
C$_employee
I$_employee
C$_sales
I$_sales

Option 2 – Keeping the Schema and the Work Schema in Topology different.


In such a scenario, ODI will create the temp tables (C$_, I$_) in the separate schema, and the result, as you can see, is no junk in the target; the DBA can drop these WORK_SCHEMA tables at regular intervals.

SYSTEM (Schema)

EMPLOYEE (Table)
SALES (Table)

WORK_SCHEMA (Schema)

C$_employee
I$_employee
C$_sales
I$_sales

I know what you are all thinking: doesn’t ODI drop all the temporary tables? Well, the answer is yes, but how many times have you created interfaces

just for testing purposes,

or had a failed interface that never reached the Drop table step,

or had ODI fail to drop the table due to some error?

The result: these tables stay in the database permanently unless dropped manually.
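With a dedicated work schema, such leftover tables are easy to spot and remove. A sketch against the Oracle data dictionary (the schema name ODI_TEMP is an assumption; E$_ is included for error tables):

```sql
-- List leftover ODI temporary tables in the work schema.
-- The backslash escapes '_' so it is matched literally, not as a wildcard.
SELECT table_name
  FROM all_tables
 WHERE owner = 'ODI_TEMP'
   AND (table_name LIKE 'C$\_%' ESCAPE '\'
     OR table_name LIKE 'I$\_%' ESCAPE '\'
     OR table_name LIKE 'E$\_%' ESCAPE '\');

-- A DBA can then drop them at a regular interval, e.g.:
-- DROP TABLE ODI_TEMP."C$_EMPLOYEE" PURGE;
```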

I have seen companies rely instead on the “Staging Area Different From Target” option on each interface, which is not a good practice for this purpose, because developers will generally forget to set the option. Moreover, it is easier and more reliable to implement this once in Topology than across hundreds of interfaces, and with the Topology setting the Knowledge Modules still create the temporary tables in the Work Schema before doing the insert and update.


A second advantage of defining it in Topology is that temporary target tables are also usually defined against the Work Schema, so all the temporary tables end up in one place.

This option can be considered an important aspect of ODI best practices.

11 Comments

  1. Hi,
    Nice post, but I think we have an option in 12c that says “Remove temporary objects on error”.
    Can we use that here?

  2. Hi Dev,

    I appreciated your article. Very nice.

    Actually, I was trying to implement simple CDC in ODI 11g. As you said, I selected the work schema and the target schema separately; due to that, the J$ table is created in the work schema, but while creating the JV$ & JV$D views, ODI failed to find the source table in the work schema.

    Kindly advise on this.

    Thanks in advance.

  3. Nice post, but after setting the work schema to a temporary schema, ODI requests a specific KM. Which KM must be used with it?

    Thanks

  4. Hi Kshitiz,
    Thanks for the quick reply. It’s very helpful for me.
    Actually, I was working in Data Junction and am now migrating to ODI. In Data Junction we can select the source as a TABLE directly or as a SELECT STATEMENT.

    As far as I know, ODI can read a table, a view or a file. But in my job I have a SELECT STATEMENT containing a lot of sub-queries, nearly 80 lines of code at least.

    This is the problem where I am stuck. I don’t know if it’s a silly question or what. I was thinking I would create one VIEW for the select statement in the source schema and then read it, but I don’t have permission to create a view.

    Is there a better way?

    Source schema—>> Database 1

    Temp_work,Target Schema—>> Database2

    I am helpless.
    Kindly suggest

  5. Hi, I am new to ODI. I saw this post and it made temporary tables very clear to me. But I have one question:
    1. Instead of flat files or tables, if the source is one SQL query (a query containing tables from different schemas), is it necessary to create a view for reading the source? If yes, would I create it in the temp_work schema or elsewhere?

    2. Again if yes, now I have 3 schemas: one for the source, another for the target and the temp_work schema. As I mentioned, my select query has a lot of tables from different schemas. One more thing: some schemas in that select query may belong to SQL SERVER and some to ORACLE.

    It seems very complicated to me.

    In this situation, how can I do one efficient mapping?

    Please suggest.

    • Hi Bhabani ,

      Thanks for visiting us.

      To answer your queries: it’s very simple logic. We create a temporary work schema in the target database; as you know, ODI creates the C$ and I$ tables at the target unless you specify another schema using “Staging Area Different From Target”.
      This is irrespective of the number of sources you have; you can pull and join from different source systems.
      Let’s take a simple example: I have three source tables in my interface. One is a file, another is from SQL Server and the third is from DB2. You can join these three tables on the respective columns in the staging area.
      ODI will use three different LKMs, so three different C$ tables will be created to pull the data, and in the I$ table it will combine them as per the joins specified in the interface and load them into the target table.
      Here the C$ and I$ tables are all created in the target database, so we need to create the work schema only at the target.

      Hope this is clear. Please let us know if you still have questions.

      Thanks
      Kshitiz Devendra

  6. Hi Cezar,

    We are seeing high sort space usage for a particular scenario which is described below:

    We are loading data into a Staging Area by getting the information from 5 tables, 4 of which is located in the ODI Work Repository Schema and 1 table in another schema within the same database.
    ODI creates a C$ table based on this particular table (located on the other schema) and then joins them internally to generate a query.

    insert /*+ APPEND */ into PROD_SNPW."I$_7310249TBL_INF_MRT_STG"
    (
    LOAD_ID,
    RECORD_ID,
    LINE_ID,
    MRT_ID,
    IRI_ITEM_ID,
    MATCHED_MRQ_FLAG,
    TASK_TYPE,
    IND_UPDATE
    )
    select
    TBL_INF_MRT_STG.LOAD_ID,
    TBL_INF_MRT_STG.RECORD_ID,
    TBL_EPYX_ITEM_LINE_DTLS_STG.ROW_ID,
    TBL_INF_MRT_STG.MRT_ID,
    TBL_INF_MRT_STG.IRI_ITEM_ID,
    'Y',
    TBL_EPYX_ITEM_LINE_DTLS_STG.TASK_TYPE,

    'I' IND_UPDATE

    from PROD_SNPW.TBL_EPYX_ITEM_LINE_DTLS_STG TBL_EPYX_ITEM_LINE_DTLS_STG, PROD_SNPW.TBL_INF_MRT_STG TBL_INF_MRT_STG, PROD_SNPW.TBL_EPYX_TRANSACTION_HDR_STG TBL_EPYX_TRANSACTION_HDR_STG, PROD_SNPW.TBL_INF_MRQ_STG INF_MRQ_STG, PROD_SNPW."C$_73102490TBL_INF_MRT_STG"
    where (1=1)
    And (((TBL_EPYX_ITEM_LINE_DTLS_STG.BATCH_ID=TBL_EPYX_TRANSACTION_HDR_STG.BATCH_ID) AND
    TBL_EPYX_ITEM_LINE_DTLS_STG.TRANSACTION_NO=TBL_EPYX_TRANSACTION_HDR_STG.TRANSACTION_NO))
    AND ((TBL_INF_MRT_STG.LOAD_ID=TBL_EPYX_ITEM_LINE_DTLS_STG.BATCH_ID) AND
    /*TBL_EPYX_ITEM_LINE_DTLS_STG.TASK_TYPE=TBL_INF_MRT_STG.TASK_TYPE) AND */
    TBL_EPYX_ITEM_LINE_DTLS_STG.ONELINK_JOB_CODE=TBL_INF_MRT_STG.IRI_ITEM_ID)
    AND ((TBL_INF_MRT_STG.MRQ_MRQ_ID=INF_MRQ_STG.MRQ_ID) AND INF_MRQ_STG.RECORD_ID=TBL_INF_MRT_STG.RECORD_ID)
    AND (TBL_EPYX_ITEM_LINE_DTLS_STG.ONELINK_JOB_CODE=C1_ITEM_ID)
    AND (((TBL_EPYX_TRANSACTION_HDR_STG.BATCH_ID=INF_MRQ_STG.LOAD_ID) AND
    TBL_EPYX_TRANSACTION_HDR_STG.ROW_ID=INF_MRQ_STG.RECORD_ID))
    And (TBL_EPYX_TRANSACTION_HDR_STG.BATCH_ID=#PROJ_BTFLEET_EPYX_INV_LOAD.var_batch_id)
    And (TBL_EPYX_ITEM_LINE_DTLS_STG.BATCH_ID=#PROJ_BTFLEET_EPYX_INV_LOAD.var_batch_id)
    And (TBL_INF_MRT_STG.LOAD_ID=#PROJ_BTFLEET_EPYX_INV_LOAD.var_batch_id)
    And (/*(TBL_EPYX_TRANSACTION_HDR_STG.PROCESS_STATUS != 'E'
    or
    TBL_EPYX_TRANSACTION_HDR_STG.PROCESS_STATUS is null )*/
    TBL_EPYX_TRANSACTION_HDR_STG.PROCESS_STATUS is null)
    And (INF_MRQ_STG.LOAD_ID=#PROJ_BTFLEET_EPYX_INV_LOAD.var_batch_id)
    And (TBL_EPYX_ITEM_LINE_DTLS_STG.TASK_TYPE='#PROJ_BTFLEET_EPYX_INV_LOAD.var_task_other')
    And (INF_MRQ_STG.SESSION_ID=#PROJ_BTFLEET_EPYX_INV_LOAD.var_session)
    And (TBL_EPYX_TRANSACTION_HDR_STG.SESSION_ID=#PROJ_BTFLEET_EPYX_INV_LOAD.var_session)
    And (TBL_INF_MRT_STG.SESSION_ID=#PROJ_BTFLEET_EPYX_INV_LOAD.var_session)

    This query generates huge SORT SPACE usage in the database, and the usage grows exponentially with the number of records.
    The numbers of records that the tables contain at any point of time are on the order of:

    TBL_EPYX_ITEM_LINE_DTLS_STG, TBL_INF_MRT_STG, TBL_EPYX_TRANSACTION_HDR_STG, TBL_INF_MRQ_STG and C$_73102490TBL_INF_MRT_STG with approximately 1500 records each.

    For the above-mentioned record counts, we see temp space usage in the database on the order of 20 GB.

    Could you please suggest whether performance would improve if we stored the information from the C$ table beforehand within the work repository schema?
    Or is there any other option/best practice to follow in order to minimize the temp space usage?

    Regards,
    Chary

    • Hi Chary,

      If I understood you the right way, all the tables have around 1500 records, is that it?

      If yes, of course 20 GB doesn’t seem reasonable…. Are you sure the joins aren’t generating a “Cartesian product”?

      Please send me a direct email (cezar.santos@odiexperts.com); it’s a faster way to discuss this….

      Once again, thank you for visiting odiexperts.com

      Cezar Santos

  7. Hi Uli,

    I suggest you take a look at http://odiexperts.com/?p=672

    I believe it will help you….

    Thanks for coming to our site!!!

    Cezar Santos

  8. As always excellent post and great explanation.

    Do you have any best-practice recommendations with regards to permissions? I mean, if I create the temp C$, I$, etc. tables in the work schema, I need to grant select and create table permissions to the data schema user to be able to access the data from the data schema. Do you use synonyms, or am I missing something here?

    Cheers
    uli

  9. Hi Dev
    this is excellent…
    this kind of stuff will help us a lot.

    Thanks a lot
    yugi
