
Data Management and Data Science

This is my cheat sheet for the course CS460 Systems for Data Management and Data Science at EPFL. Contents 1. Storage Hierarchies and Data Layout 2. Query Execution & Optimization 3. Transactions & Distributed Transactions 4. Batch Processing & MapReduce 5. Gossip Protocols 6. DHT + Consistency Models 7. Key-value Stores, CAP...


Information Security and Privacy

This is my cheat sheet for the course COM402 Information Security and Privacy at EPFL. Contents 1. Privacy 2. Network Security 3. Mobile Security 4. Automated Testing (Fuzzing) 5. Threats 6. Data Security 7. Web and Software Bugs 8. Access Control 9. Machine Learning Security 10. Trusted Computing 11. Crypto Basics 12. Programming L...


UML

1. Class Notation 2. Generalization & Realization public class A extends B { ... } public class A implements B { ... } 3. Dependency & Association [reference], [reference2] 3.0 Cardinality In UML, cardinality is used to specify the number of instances of one class that can be associated with the instances of another class ...
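
As a rough illustration of the generalization, realization and association notation listed above, here is a small Java sketch. All names (`Shape`, `Drawable`, `Circle`, `Canvas`) are hypothetical and do not come from the original post. `Circle extends Shape` is a generalization, `Circle implements Drawable` is a realization, and `Canvas` holding a list of `Drawable` models an association with cardinality 1 to 0..*.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names, chosen only for illustration.
abstract class Shape {
    abstract double area();
}

interface Drawable {
    String draw();
}

// Generalization: Circle extends Shape. Realization: Circle implements Drawable.
class Circle extends Shape implements Drawable {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    double area() {
        return Math.PI * radius * radius;
    }

    @Override
    public String draw() {
        return "circle(r=" + radius + ")";
    }
}

// Association with cardinality 1 to 0..*: one Canvas refers to any number of Drawables.
class Canvas {
    private final List<Drawable> items = new ArrayList<>();

    void add(Drawable d) {
        items.add(d);
    }

    int size() {
        return items.size();
    }
}
```

A `Canvas` only knows its items through the `Drawable` interface, so any class that realizes `Drawable` can be added without changing `Canvas`.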


Machine Learning 3

Clustering I. Defining Clusters A cluster is a group of similar examples. Define cluster $k$ by a prototype $\mu_k$. $r_{nk} \in \{0, 1\}$ is an indicator variable: 1 means example $n$ is in cluster $k$, and 0 means it is not. The restriction is that every example must be in exactly one cluster; $r$ is going to be a matrix ...
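
To make the indicator matrix $r$ concrete, here is a minimal Java sketch. It assumes each example is assigned to its nearest prototype by squared Euclidean distance, as in k-means; the excerpt above does not state the assignment rule, so treat that as an assumption. The class and method names are hypothetical.

```java
// Hypothetical helper, not taken from the original post.
final class ClusterAssignment {
    // Returns r where r[n][k] == 1 if example n is assigned to cluster k, else 0.
    // Assumption: nearest prototype by squared Euclidean distance (k-means style).
    static int[][] assign(double[][] examples, double[][] prototypes) {
        int[][] r = new int[examples.length][prototypes.length];
        for (int n = 0; n < examples.length; n++) {
            int best = 0;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int k = 0; k < prototypes.length; k++) {
                double dist = 0.0;
                for (int d = 0; d < examples[n].length; d++) {
                    double diff = examples[n][d] - prototypes[k][d];
                    dist += diff * diff;
                }
                if (dist < bestDist) {
                    bestDist = dist;
                    best = k;
                }
            }
            r[n][best] = 1; // exactly one 1 per row: each example is in one cluster
        }
        return r;
    }
}
```

Each row of the returned matrix sums to 1, which is the restriction that every example belongs to exactly one cluster.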


Machine Learning 2

Perceptron We have talked about [ref] [ref2] Linear Regression: square loss, MSE = $\frac{1}{n}\sum_{i=1}^{n} (y_i-\hat{y}_i)^2$, where $y_i$ is the actual value and $\hat{y}_i$ is the predicted value. Logistic Regression: log loss, $L(w) = \sum_{i=1}^{n}-y\text{log}...
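
The square loss above can be checked numerically with a few lines of Java. This is only a sketch; the class name `Losses` and method name `mse` are hypothetical. The log loss is left out because its formula is cut off in the excerpt.

```java
// Hypothetical helper, not taken from the original post.
final class Losses {
    // Mean squared error: (1/n) * sum over i of (y[i] - yHat[i])^2.
    static double mse(double[] y, double[] yHat) {
        if (y.length != yHat.length || y.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < y.length; i++) {
            double diff = y[i] - yHat[i];
            sum += diff * diff;
        }
        return sum / y.length;
    }
}
```

For actual values (1, 2, 3) and predictions (1, 2, 5), only the last example contributes an error of $2^2 = 4$, so the MSE is $4/3$.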


Machine Learning

Regression Regularization Regularization refers to techniques that calibrate machine learning models by minimizing an adjusted (penalized) loss function, mainly to prevent overfitting. L2 Ridge Regression vs. L1 Lasso Regression vs. Ordinary least squares linear regression When would you want to use (i) Ridge regression, (...


Summary of Agile Object-oriented Software Development

Week 1 JVM & JRE Primitive data types vs. Non-primitive data types There are 8 primitive data types. Logical: boolean. Textual: char. Integral: byte, short, int, long. Floating point: float, double. Non-primitive (reference) data types: The non-primitive data types include Strings, Classes, Interfaces,...
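
A small Java sketch of the distinction above, with hypothetical class and method names. It reads the bit widths of the fixed-size primitive types from the `SIZE` constants of their wrapper classes (`Boolean` has no such constant, so `boolean` is left out), and shows that two reference-type values can be equal in content while being distinct objects.

```java
// Hypothetical helper, not taken from the original post.
final class PrimitiveWidths {
    // Bit widths in the order: byte, short, int, long, char, float, double.
    static int[] widths() {
        return new int[] {
            Byte.SIZE, Short.SIZE, Integer.SIZE, Long.SIZE,
            Character.SIZE, Float.SIZE, Double.SIZE
        };
    }

    // Reference types: equals() compares content, == compares object identity.
    static boolean equalContentDistinctObjects() {
        String a = new String("hi");
        String b = new String("hi");
        return a.equals(b) && a != b;
    }
}
```

For primitives, `==` compares the values themselves, which is why the identity check only matters for reference types such as `String`.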
