The road to analytics
- OLTP: 
- consistent and reliable record-keeping.
 - look at detailed individual records (small part of database)
 - lots of writes (thousands of people write together, like new customers registering)
 - fully interactive (<1s)
 - value Consistency.
 
 - OLAP: web analytics, sales analytics, management support.
- look at historical summarized consolidated data ( analysis over large part of database)
 - lots of reads (even not writing)
 - slow interactive, like MapReduce.
 - Redundancy, we denormalized everything.
 
 
OLAP
A dataware house is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process.
- subject-oriented: customers, sales, products
 - integrated: databases are in hundreds/thousands/or even more. Hard to join these isolated databases. When to analyze ,copy these databases from all over isolated machines in a single machine.
 - time-variant
 - non-volatile: copy without any update.
 
Architecture
From different sources -> ETL-> get the derived data -> analyze/report/mine.
Data Model
Cubes
- Dimensions: what the data looks like
 - aggregation: the first thing we can do with data
 - slicing: take alongside the dimension
 - Dicer: take all of the dimensions: what you want to look into details. Organize the rows and columns.
 
ETHLing
- extract: triggers, gateways, log extraction
 - transform: derivation, value transformation, cleaning, filter, split, merge, join
 - load: integrity constraints, sorting, build indices, partition
 
Implementation
- ROLAP: physically stored in RDMS. Fact table: one value in each row, with multi measure. Star schema. snow-flake schema.
 - MOLAP: the physical layer would be the memory-order harddrive.
 
Query
MDX(multi-dimensional expressions): the language for cubes. Actually, a lot of people use SQL to query cubes.
GROUPING SETS, ROLLUP, CUBE: ROLLUP: from left to right.
A cube is a list of dimensions indexing a list of measures.
- hierarchies in dimensions: dimension values are organized in hierarchies. MDX does know this hierarchy.
 
Statements
The feeling: XML schema can be used to describe graphs, cubes.
Further Reading
Database Systems: The Complete Book 2nd Edition Chapter 10.6, 10.7