Series:
RDBMS? NoSQL? Data warehouse?
ER? Dimensional? Star? Snowflake? Constellation?
Answer: a mix of data repositories, each with its own modelling approach.
· Understand the traits of data integrity.
  o Do you need ACID properties? (Atomicity, Consistency, Isolation, Durability)
  o Do you need BASE properties? (Basically Available, Soft state, Eventual consistency)
  o Have you considered the CAP theorem?
  o Are you using different isolation levels?
  o Do you need an in-memory repository? (E.g. memcached)
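To make the atomicity part of ACID concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `account` table, the `transfer` helper and the sample balances are hypothetical, chosen only for illustration. The transfer credits one account and then debits the other; if the debit violates the `CHECK` constraint, the whole transaction is rolled back, so the credit disappears as well.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing unit."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit fails the CHECK constraint when funds are insufficient.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM account"))


# Hypothetical schema: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # enough funds: both updates commit
failed = transfer(conn, "alice", "bob", 500)  # debit fails: the credit is undone too
print(ok, failed, balances(conn))
```

If the second call left bob's balance increased while alice's stayed the same, the store would not be atomic. A store offering only BASE guarantees may not give you this all-or-nothing behaviour across records.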
· Choose the traits from the CAP theorem.
  o Consistency: every read returns the most recent write (or an error).
  o Availability: every node that has not failed returns a response to every query.
  o Partition tolerance: the system keeps operating even if the connections between nodes are down. While such a partition lasts, only one of the other two promises (A or C) can be fully kept.
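The trade-off during a network partition can be sketched with a toy two-replica store. This is an illustration under simplifying assumptions, not how any real database is implemented: the `Cluster` and `Replica` classes and the `"CP"` / `"AP"` mode names are hypothetical. In CP mode the store refuses writes it cannot replicate, giving up availability. In AP mode it accepts them locally, so the other replica serves stale data.

```python
class Replica:
    """One node holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Toy two-replica store whose behaviour under partition depends on `mode`."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def _peer(self, replica):
        return self.b if replica is self.a else self.a

    def write(self, replica, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse a write that cannot be replicated.
                raise RuntimeError("write refused: peer unreachable")
            # Availability first: accept locally, the peer becomes stale.
            replica.data[key] = value
            return
        # Healthy network: replicate synchronously to both nodes.
        replica.data[key] = value
        self._peer(replica).data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)


def demo(mode):
    """Write, cut the link, write again; report (accepted, read_a, read_b)."""
    cluster = Cluster(mode)
    cluster.write(cluster.a, "x", 1)
    cluster.partitioned = True
    try:
        cluster.write(cluster.a, "x", 2)
        accepted = True
    except RuntimeError:
        accepted = False
    return accepted, cluster.read(cluster.a, "x"), cluster.read(cluster.b, "x")


cp_result = demo("CP")  # write rejected, both replicas still agree
ap_result = demo("AP")  # write accepted, replicas now disagree
print(cp_result, ap_result)
```

The sketch leaves out what happens when the partition heals. An AP store then needs some reconciliation strategy, which is where eventual consistency comes in.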
· Understand the properties of the data repository.
  o E.g. for a DW: subject-oriented, integrated, time-variant, and nonvolatile (from one source)
· OLTP (online transaction processing) vs OLAP (online analytical processing)?
· Type of NoSQL (schemaless, or rather implicit-schema) store
  o Key-value
  o Document
  o Column-family
  o Graph (some graph databases, e.g. Neo4j, support ACID transactions)
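The four NoSQL families differ mainly in the shape of the data and in what the store can look inside. The sketch below mimics each shape with plain Python structures. It is not any product's API, and the sample records and helper names are hypothetical.

```python
import json

# Key-value: the value is an opaque blob, reachable only through its key.
kv_store = {"user:42": json.dumps({"name": "Ada", "city": "London"})}

# Document: the store understands nested fields and can filter on them.
documents = [
    {"_id": 42, "name": "Ada", "city": "London", "orders": [{"sku": "A1", "qty": 2}]},
    {"_id": 43, "name": "Linus", "city": "Helsinki", "orders": []},
]

# Column-family: row key -> column family -> columns, read one family at a time.
column_family = {
    "user:42": {
        "profile": {"name": "Ada", "city": "London"},
        "stats": {"logins": 7},
    },
}

# Graph: nodes plus typed edges; queries follow relationships.
nodes = {42: {"name": "Ada"}, 43: {"name": "Linus"}}
edges = [(42, "FOLLOWS", 43)]


def kv_get(key):
    """Fetch by key, then decode the blob in the application."""
    return json.loads(kv_store[key])


def find_by_city(city):
    """Filter documents on a field inside the document."""
    return [doc["name"] for doc in documents if doc["city"] == city]


def read_columns(row_key, family):
    """Read a single column family of one row."""
    return column_family[row_key][family]


def followed_by(node_id):
    """Traverse FOLLOWS edges starting at `node_id`."""
    return [
        nodes[dst]["name"]
        for src, rel, dst in edges
        if src == node_id and rel == "FOLLOWS"
    ]


print(kv_get("user:42"), find_by_city("London"))
print(read_columns("user:42", "stats"), followed_by(42))
```

None of these structures declares a schema up front, yet every helper assumes particular field names. That is the "implicit schema": it lives in the application code instead of in the store.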
Suggested data repository selection based on CAP model:
Decision makers
Notes:
· Modelling follows the DB selection.
· An RDBMS entails ER modelling. Determine the level of normalization to adhere to. Usually third normal form (3NF) or Boyce-Codd normal form (BCNF, colloquially called 3.5NF) is the highest level recommended, although you can go up to sixth normal form (6NF).
· Dimensional modelling is a better approach for a data warehouse than a standard (normalized ER) data model.
· You will end up using one or more RDBMSs together with one or more NoSQL stores, along with a DW.
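As an illustration of the dimensional modelling mentioned in the notes, here is a minimal star schema sketched with Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_date`, `dim_product`), the columns and the sample rows are all hypothetical. One central fact table holds the measures, and each dimension table describes one axis of analysis.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes, one row per member.
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,
        year     INTEGER,
        month    INTEGER
    );
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        name        TEXT,
        category    TEXT
    );
    -- Fact table: numeric measures plus one foreign key per dimension.
    CREATE TABLE fact_sales (
        date_key    INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity    INTEGER,
        amount      INTEGER
    );
    INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
    INSERT INTO dim_product VALUES
        (1, 'Keyboard', 'Hardware'),
        (2, 'Editor', 'Software');
    INSERT INTO fact_sales VALUES
        (20240101, 1, 2, 100),
        (20240101, 2, 1, 30),
        (20240201, 1, 1, 50);
    """
)


def sales_by_category(conn, year):
    """Typical OLAP-style query: join the fact to its dimensions, then aggregate."""
    cursor = conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        WHERE d.year = ?
        GROUP BY p.category
        ORDER BY p.category
        """,
        (year,),
    )
    return cursor.fetchall()


print(sales_by_category(conn, 2024))
```

Keeping `category` inside `dim_product` is what makes this a star. Moving it into its own table referenced from `dim_product` would turn the design into a snowflake, which is closer to the normalized ER style.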