Data warehouse
There are three aspects to the warehouse:
1) Data warehouse table definitions
2) Data element and data hierarchy definitions
3) The system that takes the definitions and executes them
The first two are your primary concern. Here is how they work:
(1) Define the tables for the data warehouse.
(2) The data warehouse components.xml file defines the base warehouse tasks. Each base task links to the related table structures, the required manager beans, and its child task. A base warehouse task requires a class that pulls the beans you want to warehouse. It is standard, but not required, to have one base task per manager. We set up each manager with a specific method that returns all the beans it manages.
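As a rough illustration of the manager and base task relationship described above, here is a minimal Java sketch. All names (Customer, CustomerManager, getAllForWarehouse, CustomerBaseTask, pullBeans) are hypothetical and are not taken from the actual code base; the real base class and method signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical bean that we want to warehouse.
class Customer {
    private final String name;
    Customer(String name) { this.name = name; }
    public String getName() { return name; }
}

// Hypothetical manager with one dedicated method that returns every bean it manages.
class CustomerManager {
    private final List<Customer> customers = new ArrayList<>();
    void add(Customer customer) { customers.add(customer); }
    public List<Customer> getAllForWarehouse() { return new ArrayList<>(customers); }
}

// Hypothetical base task: one per manager, responsible only for pulling the beans.
class CustomerBaseTask {
    private final CustomerManager manager;
    CustomerBaseTask(CustomerManager manager) { this.manager = manager; }
    List<Customer> pullBeans() { return manager.getAllForWarehouse(); }
}
```

The base task only knows how to obtain the beans. What gets written to which table is left to the child task.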
The child tasks list the fields to be captured by the data warehouse, the insert statement (the ? placeholders are in the order of the fields listed), the clear statement for cleaning the warehouse, and any complex fields. The fields are linked to the bean properties via the get(property) methods. The insert and clear statements allow the child task to interact with the tables in the data warehouse. Complex fields are for offshoot child tasks, that is, for any properties that are not base types (Strings, ints, dates, etc.) but objects or lists of objects. Each item in a list is run through the sub-child task.
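The link between the field list, the get(property) methods, and the order of the ? placeholders can be sketched as follows. This is an assumption-laden illustration, not the actual child task implementation: the Order bean, the dw_order table, and the OrderChildTask class are all made up, and the real system may resolve getters differently.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical bean with two base-type properties.
class Order {
    public String getCode() { return "A-17"; }
    public int getQuantity() { return 3; }
}

// Hypothetical child task: a field list plus SQL whose ? placeholders follow the same order.
class OrderChildTask {
    static final List<String> FIELDS = List.of("code", "quantity");
    static final String INSERT_SQL = "INSERT INTO dw_order (code, quantity) VALUES (?, ?)";
    static final String CLEAR_SQL = "DELETE FROM dw_order";

    // Resolve each field through its get<Property>() method, in field order,
    // so that the i-th value lines up with the i-th ? in INSERT_SQL.
    static List<Object> parametersFor(Object bean) {
        List<Object> params = new ArrayList<>();
        for (String field : FIELDS) {
            String getter = "get" + Character.toUpperCase(field.charAt(0)) + field.substring(1);
            try {
                Method method = bean.getClass().getMethod(getter);
                params.add(method.invoke(bean));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("No usable getter for field " + field, e);
            }
        }
        return params;
    }
}
```

Because the values are collected in field order, reordering the field list without reordering the columns in the insert statement would silently write values into the wrong columns.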
(3) There are various ways of accessing data. The usual one is bean property access. Some bean properties, however, are Ids, and we can't store the Id object itself. With Id property access, the access function pulls the value out of the Id so that it can be stored in the DW. There are other access classes, so get familiar with them.
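The difference between plain bean property access and Id property access might look roughly like this. The types below (RecordId, Invoice, PropertyAccess, Accessors) are invented for the sketch and do not correspond to the real access classes, which should be checked in the code base.

```java
import java.util.function.Function;

// Hypothetical Id wrapper; the warehouse cannot store this object directly.
class RecordId {
    private final long value;
    RecordId(long value) { this.value = value; }
    public long getValue() { return value; }
}

// Hypothetical bean with one Id property and one base-type property.
class Invoice {
    public RecordId getId() { return new RecordId(42L); }
    public String getLabel() { return "March"; }
}

// Hypothetical access strategy: turns a bean into a value that can be stored.
interface PropertyAccess<B> {
    Object read(B bean);
}

class Accessors {
    // Plain bean property access: store whatever the getter returns.
    static <B> PropertyAccess<B> bean(Function<B, ?> getter) {
        return b -> getter.apply(b);
    }

    // Id property access: call the getter, then pull the raw value out of the Id.
    static <B> PropertyAccess<B> id(Function<B, RecordId> getter) {
        return b -> getter.apply(b).getValue();
    }
}
```

With these definitions, Accessors.id(Invoice::getId) yields the underlying long rather than the RecordId object, which is the form the DW can store.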
Test, test, test. That's it.
To recap:
- create the base task class
- create the warehouse tables
- create the base task bean
- create the child tasks
FYI, the order of the task and child task beans is important. The base task should be at the bottom. Each additional level of object nesting should push the bean further up, toward the top of the file.