Keep Growing
BD12 Querying
XML Navigation. Operators: slash: /; descendant axis: //; attribute axis: @attr-name; atomization: data(...); filter: [], e.g. @
2019-01-26
BD9 Performance at large scale
Measurements. Prefixes: milli: $m$, 0.001, $10^{-3}$; micro: $\mu$, 0.000 001, $10^{-6}$; nano: $n$, $10^{-9}$; pico: $p$, $10^{-12}$; femto: $f$, $10^{-15}$;
2019-01-25
BD8 Spark
Spark: It is for full-DAG query processing. Its first-class citizen is the RDD. 8 nodes, 16 cores per node and 128 GB of memory
2019-01-25
BD6 Map reduce
Process: Input data (key-value pairs) -> split -> Map -> shuffle -> Intermediate data (key-value pairs) ->
2019-01-25
BD7 YARN
JobTracker. Responsibilities: Resource Management (RM). Scheduling: like ten tasks per machine, who is in charge of reducing
2019-01-25
BD5 Wide Column Store
HBase: designed to run on a scalable cluster of commodity hardware, built on HDFS. Founding paper: Google’s BigTable. Desi
2019-01-24
BD4 File system
Use Cases: Billions of TB files: the files are relatively small, but their number is large. Object Storage (technology) Ke
2019-01-24
BD3 Storage
Old Local File System: File = Content + Metadata; fixed metadata -> fixed schema; organized in a hierarchy; files are sto
2019-01-23
BD2 Database Basics
Table equivalent concepts: table, collection; attribute, column, field, property; row, business object, item, entity, document, record; p
2019-01-22
BD1 Introduction
Database prehistory: speaking/singing (expressing information), writing (recording), accounting (processing infor
2019-01-22
BD11 Data Model
Type System: edge vs. node labeling: labels are on the edges (JSON) / nodes (XML). Shared properties: distinction between atomic
2019-01-22