The Data Warehouse Staging Area is a temporary location where data from
source systems is copied. A staging area is mainly required in a Data
Warehousing Architecture for timing reasons. In short, all
required data must be available before data can be
integrated into the Data Warehouse.
Due to varying business cycles, data processing cycles, hardware and
network resource limitations, and
geographical factors, it is not feasible to extract all the data from all
operational databases at exactly the same time.
For example, it might be reasonable to extract sales data on a daily
basis; however, daily extracts might not be suitable for financial data that
requires a month-end reconciliation process. Similarly, it might be
feasible to extract "customer" data from a database in Singapore at noon
Eastern Standard Time, when it is the middle of the night in Singapore, but
this would not be feasible for "customer" data in a Chicago database, which
is in heavy use at that hour.
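The timing rule above (integrate only once all required data is available) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the StagingArea class, its method names, the "sales" and "finance" source names, and the sample rows are all hypothetical, and a plain dictionary stands in for the Data Warehouse.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class StagingArea:
    """Hypothetical staging area: holds extracts until every required source has arrived."""
    required_sources: set
    # Maps source name -> (extract date, rows). Names and layout are illustrative only.
    extracts: dict = field(default_factory=dict)

    def stage(self, source, extracted_on, rows):
        """Copy one source extract into staging, replacing any older copy."""
        self.extracts[source] = (extracted_on, list(rows))

    def missing_sources(self):
        """Return the required sources that have not been staged yet."""
        return self.required_sources - set(self.extracts)

    def load_into(self, warehouse):
        """Integrate staged data only if all required sources are present."""
        if self.missing_sources():
            return False
        for source, (_, rows) in self.extracts.items():
            warehouse.setdefault(source, []).extend(rows)
        # Assumption: staging is emptied after each successful load.
        self.extracts.clear()
        return True


warehouse = {}
staging = StagingArea(required_sources={"sales", "finance"})

# The daily sales extract arrives first; finance is still being reconciled.
staging.stage("sales", date(2024, 1, 31), [{"order_id": 1, "amount": 120.0}])
loaded_early = staging.load_into(warehouse)  # refused: finance is missing

# The month-end finance extract arrives later; now the load can proceed.
staging.stage("finance", date(2024, 1, 31), [{"account": "4000", "balance": 120.0}])
loaded_late = staging.load_into(warehouse)
```

In this sketch the first load attempt is refused because the finance extract has not arrived, and the second succeeds once both sources are staged. A real staging area would normally be a set of database tables or files rather than an in-memory object.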
Data in the Data Warehouse can be either persistent (i.e., it is kept for a
long period) or transient (i.e., it is kept only temporarily).
Not all businesses require a Data Warehouse Staging Area. For many
businesses it is
feasible to use ETL to copy data directly from operational databases into
the Data Warehouse.