Data Management and Data Science

This is my cheat sheet for the EPFL course CS460, Systems for Data Management and Data Science.

Contents

1. Storage Hierarchies and Data Layout
2. Query Execution & Optimization
3. Transactions & Distributed Transactions
4. Batch Processing & MapReduce
5. Gossip Protocols
6. DHT + Consistency Models
7. Key-value Stores, CAP Theorem
8. Scheduling
9. Stream Processing
10. Distributed Learning Systems