Posted by sqldbs on March 30th, 2007
http://www.dmreview.com/…
By nature, data warehouse projects are costly endeavors where many resources are consumed - both hardware and software. Although, experienced database developers may use 90 percent of the same skills on their last projects, they may find themselves approaching development issues differently!
While the project team as a whole usually understands the risks and benefits of building a data warehouse, I find that not enough IT managers are initially aware of the unique challenges a data warehouse effort poses for database developers. Operating in a very large database (VLDB) multigigabyte to terabyte decision support system (DSS) data environment will often require unique approaches for database developers, where strategic value can be added to the warehouse development cycle. Understanding and anticipating the kinds of challenges developers will face in a DSS database is essential. This article will address a few of these challenges.
Read the rest of this entry »
Posted in Data Warehouse | No Comments »
Posted by sqldbs on March 30th, 2007
http://www.tdwi.org/research/…
Designing the physical path of data movement from transaction databases into a data mart or warehouse can range from very simple to very complex. In an environment where there are one or two homogenous sources and a single data mart, introducing an intermediate “staging area” (also called an “operational data store”) may add little value. However, as the complexity increases—by involving multiple database types or different platforms for different sources (e.g., ERP data on Unix, historical data on MVS, a data mart on Windows NT)—the need for a staging area to integrate all data types into one format becomes apparent, even critical
Read the rest of this entry »
Posted in Data Warehouse | No Comments »
Posted by sqldbs on March 30th, 2007
http://www.dbmsmag.com/…
The data staging area is the data warehouse workbench. It is the place where raw data is brought in, cleaned, combined, archived, and eventually exported to one or more data marts. The purpose of the data staging area is to get data ready for loading into a presentation server (a relational DBMS or an OLAP engine). We assume that the data staging area is not a query service. In other words, any database that is used for querying is assumed to be physically downstream from the data staging area.
Perhaps you don?t even realize you have a data staging area. Maybe your data just does a “touch and go” landing in between the legacy system and the presentation server. (That?s an airplane metaphor.) You bring the data in briefly, assign a surrogate key, check the records for consistency, and send them on to the DBMS loader that is the presentation database.
Read the rest of this entry »
Posted in Data Warehouse | No Comments »
Posted by sqldbs on March 9th, 2007
http://www.microsoft.com/technet/…
Relational Data Warehouse or Reporting work loads are characterized by low volumes of very large transactions. These applications are often identified as having predominately read workloads (e.g. Decision Support, Analysis, and Reporting) with periodic feeds or batch loads. It is important to keep these characteristics in mind as we examine the significance of database design, resource utilization and system performance. The top performance bottlenecks or gotchas to avoid for Data Warehouse or Reporting applications are outlined below.
Read the rest of this entry »
Posted in SQL Server, Data Warehouse, Tuning | No Comments »
Posted by sqldbs on February 28th, 2007
http://olapmonkey.blogspot.com/…
A question I often hear asked is “How do I get my time dimension to default to the current time period”? The first problem with the question is what is meant by “current time period”. In my experience, 8 of 10 times when folks say “current time period” that means the latest member of time dimension at it’s leaf level that happens to have data in the fact table.
For instance, if you use Foodmart 2000 it has time periods for all the months in 1997 and 1998, but the data is only associated with 1997 for the fact table in the Sales cube. So if by “period” one happens to mean “month”, then the most recent period with data in the fact table is [Time].[1997].[Q4].[12].
Read the rest of this entry »
Posted in SSAS, MDX | No Comments »
Posted by sqldbs on February 6th, 2007
http://www.developersdex.com/gurus/…
The usual example of a tree structure in SQL books is called an adjacency list model and it looks like this:
CREATE TABLE OrgChart
(emp CHAR(10) NOT NULL PRIMARY KEY,
boss CHAR(10) DEFAULT NULL REFERENCES OrgChart(emp),
salary DECIMAL(6,2) NOT NULL DEFAULT 100.00);
OrgChart
emp boss salary
===========================
‘Albert’ ‘NULL’ 1000.00
‘Bert’ ‘Albert’ 900.00
‘Chuck’ ‘Albert’ 900.00
‘Donna’ ‘Chuck’ 800.00
‘Eddie’ ‘Chuck’ 700.00
‘Fred’ ‘Chuck’ 600.00
Another way of representing trees is to show them as nested sets. Since SQL is a set oriented language, this is a better model than the usual adjacency list approach you see in most text books. Let us define a simple OrgChart table like this, ignoring the left (lft) and right (rgt) columns for now. This problem is always given with a column for the employee and one for his boss in the textbooks. This table without the lft and rgt columns is called the adjacency list model, after the graph theory technique of the same name; the pairs of emps are adjacent to each other.
Read the rest of this entry »
Posted in SQL | No Comments »
Posted by sqldbs on February 2nd, 2007
http://blogs.conchango.com/christianwade/…
I’m approaching the end of an enjoyable data warehousing project and we have collected some size/volume statistics around our Analysis Services 2005 cube. We have:
* 48 Dimensions. This includes role-playing dimensions.
* 751 Attributes and user hierarchies (!). So if a dimension has 5 attributes and 2 user hierarchies, this counts as 7.
* 21 Measure groups
* 68 Measures (including calculated ones)
This is all in one mother cube.
Read the rest of this entry »
Posted in SSAS, Tuning | No Comments »