Star schema

The star schema (sometimes referenced as star join schema) is the simplest data warehouse schema where a single "fact table" containing a compound primary key, with one segment for each "dimension" and with additional columns of additive, numeric facts.

The star schema makes multi-dimensional database (MDDB) functionality possible using a traditional relational database. Because relational databases are the most common data management system in organizations today, implementing multi-dimensional views of data using a relational database is very appealing. Even if you are using a specific MDDB solution, its sources likely are relational databases. Another reason for using star schema is its ease of understanding. Fact tables in star schema are mostly in third normal form (3NF), but dimensional tables in de-normalized second normal form (2NF). If you want to normalize dimensional tables, they look like snowflakes (see snowflake schema) and the same problems of relational databases arise - you need complex queries and business users cannot easily understand the meaning of data. Although query performance may be improved by advanced DBMS technology and hardware, highly normalized tables make reporting difficult and applications complex.

Missing image
Star_schema.png
Typical data warehousing star join schema

Contents

Example SQL

SELECT
  sum (f_sales.units_sold)
 FROM
  f_sales, d_customer, d_time, d_store, d_product
 WHERE
  f_sales.customer_id = d_customer.customer_id AND
  f_sales.date_id = d_time.date_id AND
  f_sales.store_id = d_store.store_id AND
  f_sales.product_id = d_product.product_id AND
  d_time.year_id = 1997 AND
  d_product.category_id = "tv"
 GROUP BY
  d_product.brand, d_store.country_iso_id
 

Equivalent ANSI SQL-92 Example

SELECT
  sum (f_sales.units_sold)
 FROM
  f_sales
 INNER JOIN 
  d_customer ON d_customer.customer_id = f_sales.customer_id
 INNER JOIN
  d_time ON d_time.date_id = f_sales.date_id
 INNER JOIN
  d_store ON d_store.store_id = f_sales.store_id
 INNER JOIN
  d_product ON d_product.product_id = f_sales.product_id
 WHERE
  d_time.year_id = 1997 AND
  d_product.category_id = "tv"
 GROUP BY
  d_product.brand, d_store.country_iso_id
 

Alternate ANSI SQL-92 Example

SELECT
  sum (f_sales.units_sold)
 FROM
  f_sales
 NATURAL JOIN 
  d_customer
 NATURAL JOIN
  d_time
 NATURAL JOIN
  d_store
 NATURAL JOIN
  d_product
 WHERE
  d_time.year_id = 1997 AND
  d_product.category_id = "tv"
 GROUP BY
  d_product.brand, d_store.country_iso_id
 

See also

External Links

http://c2.com/cgi/wiki?StarSchema

See also: Star schema, Data warehouse, Database normalization, Fact table, Multidimensional database, Primary key, Relational database, Snowflake schema