Basic operation of ODI

Time:2019-12-31

Basic operation of ODI

[TOC]

Summary

ODI (Oracle data integrator) is a data integration tool launched by Oracle after the acquisition of sunopis in October 2006 and the integration of sunopis active integration platform. Now it is a component of Oracle Fusion Middleware. Whether ETL or e-lt, its basic goal is to integrate the source data into the target database / data warehouse through the extraction / transformation / load processes. The basic idea of ETL is to extract the source data through the independently developed transformation engine, complete the transformation and integration process in the transformation engine, and finally load the data into the target database or data warehouse. The basic idea of e-lt is to make full use of RDBMS (relational database) to achieve the goal of transformation, that is to use the functions and SQL statements provided by RDBMS manufacturers to complete the transformation, rather than the transformation process in the transformation engine developed by itself, similar to ETL completed by manual programming. The transformation of e-lt occurs not only at the destination, but also at the source data. Its principle is to complete data conversion at the end with the highest conversion efficiency.

  1. Oracle data integrator architecture

Race data integrator is integrated in a modular data warehouse, and can be accessed by a graphical development management tool written based on Java and a planning agent.
Di’s data warehouse includes a master repository, which is used to store user and role information, connect other databases or data sources, and different versions of projects. A master warehouse contains one or more work repositories, contains specific data, and establishes relationships between users and consolidated data. Data warehouse can be built on Oracle database or other databases, and can be managed through development tools and accessed through planning agent.

  1. ODI provides the following management tools:

1) Designer is used to define data transformation logic, which is the most commonly used development tool. Most of the development tasks, including the definition of data store, the creation of interface (data mapping relationship) and package (equivalent to workflow), are completed in designer.
2) The operator is used to manage and monitor the execution of data conversion tasks. In the design phase, it can also be used for debugging
3) Topology manager is used to define the physical and logical infrastructure, such as the creation and management of work repository.
4) Security manager is used to manage user rights and a scheduling agent:
5) The schedule agent schedules agents to perform data conversion tasks. The plan agent also has a data conversion engine, but the ODI adopts the e-lt architecture, so basically the plan agent just passes tasks to the target database, and its data conversion engine is rarely used.

  1. Knowledge modules
    The reason why race data integrator can adapt to different and diverse data sources and complete data extraction flexibly and effectively/
    The process of transformation / loading is based on its knowledge model system.
    Nowledge modules are similar to plug-ins in programs. Oracle data integrator abstracts the task of data integration into six parts
    1) [Reverse Engineering] reverse engineering knowledge modules: used to read tables and other objects from data sources.
    2) [journal] journaling knowledge modules: used to record new and modified data for a single or a group of tables / views. ODI supports the change data capture (CDC) function of some data sources, provided that the module is enabled in the ODI project.
    3) [load] loading knowledge modules: used to extract data from the data source.
    4) [check] check knowledge modules: used to check the validity of the extracted source data.
    5) [integration] integration knowledge modules: used to convert data in staging area to target table, and generate corresponding conversion SQL based on target database.
    6) [service] service knowledge modules: provides the function of presenting data in the form of Web services.

1. Basic preparations

  • 1.1 installation of odi12c
    Ensure native java version 1.8 and above
    Enter the bin directory of native Java and execute the Java jar path fmw_12.2.1.0.0_odi.jar
    Click next all the way to select independent installation
  • 1.2 connecting to working space
    Basic operation of ODI
  • 1.3 new data server

Right click Oracle in the topology and select new data server
Basic operation of ODI

Enter the name and database instance (service name of the database), user name and password
Basic operation of ODI

Enter the jdbc driver (usually oracle. JDBC. Oracledriver) and JDBC address in JDBC. When using private synonyms, synonyms will be synchronized to, but there are no fields. You need to add attributes to the data server to reverse: key = includesynyms value = true, and then resynchronize, which may cause the reverse speed to be particularly slow (the reason is to be confirmed, the test environment has not With this problem)
Basic operation of ODI

Click test connection in the upper left corner, select local (no agent) in the first pop-up box, click test, and a successful connection will be successful.
Basic operation of ODI

  • 1.4 new physical scheme

Right click the new data server and select new physical scheme
Basic operation of ODI

Select the user of the data source in the scheme (scheme) drop-down box, and select the target user to import in the scheme (work scheme)
Basic operation of ODI

Click save and the following pop-up box will pop up

  • 1.5 new logic scheme

Right click Oracle in logical architecture and select new logical scheme
Basic operation of ODI

Enter the name of the logical scheme, select the newly created physical scheme in the physical scheme, and click save

  • 1.6 new model

After the logical scheme is created, in the designer, select new model
Basic operation of ODI

Create the following four model folders

Right click the file and click new model to create a new model
Basic operation of ODI

Select Oracle for technology, and choose fusion? Dev for logic scheme. Click Save
Basic operation of ODI

Open the model just built, and select reverse engineering. In “types of objects to be reverse engineered”, if you select a table, you can only synchronize the table according to the contents in the mask. If you select a table and a view, you can synchronize the table and a view. The common synonyms cannot be synchronized, because (all database objects (tables, views, synonyms, over) managed by the Oracle data integrator (ODI) Procedure, function, trigger, index…) Should be prefixed with the database schema name. Since the public synonym does not have any schema name in front of it, it cannot be reverse engineered by ODI using the standard reverse engineering mode). To solve this problem, users can be directly granted access to the objects referenced by the public synonym, or use simple / private synonyms instead of public synonyms.
When using private synonyms, synonyms will be synchronized to, but there are no fields. You need to add attributes on the data server: key = includesynyms value = true, and then resynchronize, which may cause slow reverse speed
Basic operation of ODI

After the reverse is completed, it is possible to reverse a synonym without a primary key. You need to add a right-click constraint
Basic operation of ODI

1.7 comparison of various synchronization modes

Model folder Synchronization type Scope of application Other requirements
DemoTestSync synchronization Single table / view with small amount of data Allow to delete all data and insert new data
DemoTestSimple Simple CDC synchronization It is applicable to single table synchronization. The changes of each CDC table are captured independently. It is not necessary to consider the data consistency between multiple tables with primary and foreign key reference relationships Database performance loss caused by trigger of update / insert / delete allowed to be created on the table
DemoTestTrigger Consistent CDC synchronization It is suitable for the synchronization of multiple related tables. The concept of change set is introduced in consistency CDC. A change set can include multiple tables with related relationships (such as primary and foreign key reference relationships). CDC can ensure data consistency when capturing and publishing changes in a change set Database performance loss caused by trigger of update / insert / delete allowed to be created on the table
DemoTestCDCUpdateDate Consistent CDC synchronization (based on last update time) It is applicable to the upgraded version of multiple associated table synchronization and consistency CDC synchronization, which reduces the loss of database performance on the basis of data consistency, but requires higher requirements The database performance loss caused by the trigger that allows the delete to be created on the table ensures that there must be lastupdatedate field (date or timestamp type) in the field of the table that changes with any change, and the corresponding schema must also have DBA and create file permission

2. Simple synchronization data

Right click the map in the project to create demotestsync
Basic operation of ODI

Drag the source and target tables into the map and wire them
Basic operation of ODI

If you need operation fields, you can select expression editing in the target table
Basic operation of ODI

If the source table and target table are in the same schema, click physical to display as follows
Basic operation of ODI

To clear the old data before inserting data, click the target table, and select truncate target table or delete all as true,Truncate? Target? Table cannot take effect on synonyms
Basic operation of ODI

If the source table and target table are not in the same schema, click physical to display as follows
Basic operation of ODI

Click the connection component and select LKM SQL to Oracle as the loading knowledge module
Basic operation of ODI

After editing the mapping, click execute to check whether it is successful in the database

End of one synchronization data

All actions performed in the operator
Basic operation of ODI

The steps page contains all the steps of the operation. Double click to open the details to view the SQL of the specific operation
Basic operation of ODI

Basic operation of ODI

3. Data synchronization in simple CDC mode

Modify the knowledge module of the model to JKM Oracle simple mode and reverse the data
Basic operation of ODI

Add CDC to source table
Basic operation of ODI

Model add subscription
Basic operation of ODI

Enter the subscriber, click OK, and the subscriber can define himself / herself
Basic operation of ODI

Source table opening Journal
Basic operation of ODI

Select the subscriber you just created when opening
Basic operation of ODI

After opening the journal, the following table and two views will be generated in the database. If they are not available, the journal cannot be opened successfully, and a check process is required
select * from j$odi_test_simple_source_synon;
select * from jv$dodi_test_simple_source_syn;
select * from jv$odi_test_simple_source_syno;

Insert two pieces of data into ODI ﹣ test ﹣ simple ﹣ source ﹣ synym, and you will find that two pieces of data are also inserted into J $ODI ﹣ test ﹣ simple ﹣ source ﹣ synym
Basic operation of ODI

Drag the source and target tables into the map and wire them
Basic operation of ODI

Select the source table, modify the journal data filter, and change the subscriber to the subscriber zero
Basic operation of ODI

Modify the integration type of the target table to incremental update
Basic operation of ODI

In physics, check only the data that has been journaled
Basic operation of ODI

Click the connection component and select LKM SQL to Oracle as the loading knowledge module
Basic operation of ODI

Modify the IKM integrated knowledge module to IKM Oracle incremental update
Basic operation of ODI

After editing the mapping, click execute to check whether it is successful in the database
Basic operation of ODI
Data inserted successfully

End of data synchronization in a simple CDC mode

4. Consistent CDC mode synchronization data

Modify the knowledge module of the model to JKM Oracle consistent and reverse the data
Basic operation of ODI

The following steps are the same as simple CDC
: source table add CDC
: model add subscription
: enter the subscriber and click OK. The subscriber can be defined by itself
: source table open Journal
: drag the source and target tables into the map and wire them
: select the source table, modify the journal data filter, and change the subscriber to the subscriber zero
: modify the integration type of the target table to incremental update
: in physics, check only the data that has been journaled
: click the connection component and select LKM SQL to Oracle as the loading knowledge module
: modify the IKM integrated knowledge module to IKM Oracle incremental update

After opening the journal, the following table and two views will be generated in the database. If they are not available, the journal cannot be opened successfully, and a check process is required
SELECT * FROM J$ODI_TEST_T_SOURCE_SYNONYM;
SELECT * FROM JV$DODI_TEST_T_SOURCE_SYNONYM;
SELECT * FROM JV$ODI_TEST_T_SOURCE_SYNONYM;

But the structure of the checklist is different from that of the simple CDC

There are only two fields in J $ODI ﹣ test ﹣ t ﹣ source ﹣ synonym, and the field window ﹣ ID is empty after inserting data
Basic operation of ODI

This is because the consistency CDC provides a mechanism to lock and access journal data by window extension and user, which is needed to ensure that the consistency is not violated when capturing and publishing change data

To see the data, you need to perform the following additional operations:

Extension window
Basic operation of ODI

After expanding the window, you can query the data
Basic operation of ODI

Lock subscriber
Basic operation of ODI

Select the subscriber you just created when opening
Basic operation of ODI

After editing the mapping, click execute to check whether it is successful in the database
Basic operation of ODI
Data inserted successfully

After data synchronization

Clear diary
Basic operation of ODI

Unlock user
Basic operation of ODI

End of data synchronization in a consistent CDC mode

##5. Consistency CDC (last update date) synchronization data
Modify the knowledge module of the model to JKM Oracle consistent (update date) and reverse the data
Basic operation of ODI
Modify update date col name to last updated date in the table, and ensure that last updated date will be updated with any data update
Validate select true / true

Other operations are the same as consistency CDC

: source table add CDC
: model add subscription
: enter the subscriber and click OK. The subscriber can be defined by itself
: source table open Journal
: drag the source and target tables into the map and wire them
: select the source table, modify the journal data filter, and change the subscriber to the subscriber zero
: modify the integration type of the target table to incremental update
: in physics, check only the data that has been journaled
: click the connection component and select LKM SQL to Oracle as the loading knowledge module
: modify the IKM integrated knowledge module to IKM Oracle incremental update
: extended window
: Lock subscribers
: after editing the mapping, click execute to check whether it is successful in the database
: data inserted successfully
: Unlock user
: clear Journal

6. Query in ODI

Drag a lookup, filter or join component from the component to query
Basic operation of ODI

Lookup two tables or views associated query use
Basic operation of ODI

Filter single table or view filtering
Basic operation of ODI

Join two tables or out of view associations. The following figure shows all rows including lookup
Basic operation of ODI

Field value of target table
Basic operation of ODI

7. package

After the mapping is built, it needs to be called according to the actual situation. Right click the package
Basic operation of ODI

Drag two TESTTRIGGER models, demotesttrigger mapping, odiwaitforlogdata and odisleep components into the package, and connect them as follows
Basic operation of ODI

The first TESTTRIGGER model is configured as follows

Set as the first step. There is a green triangle in the lower right corner
Basic operation of ODI

general information
Basic operation of ODI

option

The second TESTTRIGGER model is configured as follows

general information
Basic operation of ODI

option

Odiwaitforlogdata configuration information
Basic operation of ODI

CDC set is found in select * from SNP > CDC > set
Polling interval is the interval between odiwaitforlogdata to regularly check whether there is any change in the CDC set

After configuration, click Run to start running. You can view the running status in the step page

8. Scheduling (timing / triggering)

After the package is created, right-click the package just created, generate the scenario, and click OK
Basic operation of ODI

Find the new scenario and right-click new schedule under scheduling
Basic operation of ODI

Logical agent select oraclediagent
Basic operation of ODI

implementPackage execution time under the tag
Choicestart-up, the package is called when the schedule is started
ChoiceHourly / day / week, it is executed every hour / day / week, which can be executed in thestateSelect start time, end time, and daily execution time. You can manage the packages or mappings executed every hour / day / week, but this configuration is not valid for closed-loop packages, because only errors or administrator forced shutdown occur after the start of closed-loop packages, otherwise it will not end

After scheduling, click Update scheduling in oraclediagent

Basic operation of ODI

9. Problems encountered

Caused By: java.sql.BatchUpdateException: ORA-01691: unable to extend lob segment TESTUSER.SYS_LOB0000178918C00006$$ by 1024 in tablespace TESTUSER_DATA
Insufficient table space

Caused By: java.sql.SQLSyntaxErrorException: ORA-00904: “TARG”.”LAST_DATE”: invalid identifier
JKM did not select Update? Date? Col? Name

Odi-1217: session test physical (10951) failures with return code odi-1298
Di-1226: step physics ﹣ step failures after 1 attempt (s)
DI-1227: Task SERIAL-MAP_MAIN- fails on the source connection <Empty Value>.
DI-1298: Serial task “SERIAL-MAP_MAIN- (10)” failed because child task “SERIAL-EU-MD_UNIT (20)” is in error.
DI-1298: Serial task “SERIAL-EU-MD_UNIT (20)” failed because child task “Create Oracle directory on SOURCE-LKM Oracle to Oracle (datapump)- (30)” is in error.
DI-1227: Task Create Oracle directory on SOURCE-LKM Oracle to Oracle (datapump)- fails on the source connection WC_SMEC.
aused By: java.sql.SQLSyntaxErrorException: ORA-01031: insufficient privileges
LKM selects LKM Oracle to Oracle (datapump), which will create files in the target schema

ODI-1228: Task Drop work table-LKM SQL to Oracle- fails on the target connection WCDEV_SMEC.
Caused By: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
Table or does not exist, previous operation error, resulting in table or view not built

ODI-1226: Step Physical_STEP fails after 1 attempt(s).
DI-1240: Flow Physical_STEP fails while performing a Insert new rows-IKM Oracle Insert-Load WO_FASTREPAIR_TEST_DEMO_1 operation. This flow loads target table MNT_WO_FASTREPAIR_TEST_DEMO.
DI-1298: Serial task “SERIAL-MAP_MAIN- (10)” failed because child task “SERIAL-EU-FUSION_DEV_UNIT (90)” is in error.
DI-1298: Serial task “SERIAL-EU-FUSION_DEV_UNIT (90)” failed because child task “Insert new rows-IKM Oracle Insert-Load WO_FASTREPAIR_TEST_DEMO_1 (100)” is in error.
DI-1228: Task Insert new rows-IKM Oracle Insert-Load WO_FASTREPAIR_TEST_DEMO_1 fails on the target connection WCDEV_SMEC.
aused By: java.sql.SQLIntegrityConstraintViolationException: ORA-00001: unique constraint (FUSION.MNT_WO_FASTREPAIR_TEST_DEMO_PK) violated
Duplicate primary key, IKM does not select incremental synchronization or clear the target table data before inserting data

CDC (last update date) synchronization failed to synchronize the data copied from this table
The data source of the incoming data is the JV $view. The CDC (last update date) mode has a condition in the view that the update time must be later than the update time of the last ODI record, so the copied data time does not match, and the result cannot be synchronized

The schedule needs to be regenerated after the package and map are modified, otherwise the previous package and map will be executed

A map can only capture changes of one table at the same time

10. appendix

It is recommended to create a new schema and give DBA permission for synchronization

Create user ODI identified by XXX; / / create ODI schema with password of XXX
Grant connect, resource, DBA to ODI; / / grant permission
Grant create any directory to ODI; / / grant create folder permission

To select a consistent CDC synchronization (lastupdatedate), please note that when updating lastupdatedate, only the rows whose lastupdatedate is later than the last record time of the ODI CDC table will be recorded. If lastupdatedate is later than the current time, it may not be recorded
Select * from function. SNP? CDC? Sub / / after adding a subscription, a piece of data will be added to the table, and the subscription will be cancelled. The record will be cancelled
select * from FUSION.SNP_CDC_SET_TABLE
select * from FUSION.SNP_CDC_OBJECTS

Select * from I $ODI ﹣ test ﹣ UD ﹣ source ﹣ synonym; / / insert the table temporarily, create it at the beginning of data synchronization, and delete it at the end of data synchronization

Select * from J $ODI ﹣ test ﹣ UD ﹣ source ﹣ / / the data synchronization table will insert data into the table when trigger is triggered based on the trigger table, and insert data into the table when lock window is last updated

Select * from JV $ddodi ﹣ test ﹣ UD ﹣ source ﹣ synonym; / / when the data synchronization view is lastupdatedate, you can find the data after locking the user

Select * from JV $ODI ﹣ test ﹣ UD ﹣ source ﹣ synonym; / / data synchronization view. There are things in this view. Then synchronization is basically OK. When you lock the user on lastupdatedate, you can find the data

Recommended Today

Elasticsearch learning one (basic introduction)

1、 Introduction to elasticsearch You can describe elasticsearch as follows: A distributed real-time document storage, each field can be indexed and searched A distributed real time analysis search engine It is capable of extending hundreds of service nodes and supports Pb level structured or unstructured data Elasticsearch is a real-time distributed search and analysis engine […]