What do you consider big data?: Datasets that are hundreds of TBs in size or larger. Big data is characterized by the "5V"s.
What are the "5V"s?: Volume, Velocity, Variety, Veracity, and Value.
Explain the main differences in data science from theory to practice.: Main points: in practice, data is spread across distributed databases, is unclean, contains many outliers, and has to be versioned.
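To make the "unclean data, many outliers" point concrete, here is a minimal sketch in Python. It is only an illustration: the function name `clean_and_flag`, the sample readings, and the choice of the 1.5 * IQR rule for outliers are assumptions made for this example, not something prescribed by these notes.

```python
import statistics


def clean_and_flag(raw_values, k=1.5):
    """Drop unparseable entries, then flag outliers with the IQR rule.

    Hypothetical helper for illustration; k=1.5 is a common convention.
    """
    cleaned = []
    for value in raw_values:
        try:
            cleaned.append(float(value))
        except (TypeError, ValueError):
            # Missing (None) or malformed ("n/a") entries are skipped.
            continue

    # Quartiles of the cleaned data (default "exclusive" method).
    q1, _, q3 = statistics.quantiles(cleaned, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    outliers = [x for x in cleaned if x < low or x > high]
    return cleaned, outliers


# Made-up sensor readings with typical real-world problems.
raw = ["10.5", "11.0", None, "n/a", "10.8", "11.2", "250.0", "10.9", "10.7", "11.1"]
cleaned, outliers = clean_and_flag(raw)
print(len(cleaned), outliers)
```

In a textbook dataset neither the parsing step nor the outlier check would be needed; in practice both usually run before any modelling.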
What are the responsibilities of "Data Engineers", "Data Analysts", and "Data Scientists", respectively?:
Data Engineers: They own the entire pipeline architecture: handling logged errors, agile testing, building fault-tolerant pipelines, administering databases, and keeping the pipeline stable. This role includes database engineers and database administrators.
Data Analysts: Analyse the data using queries and analysis tools.
Data Scientists: Build models that can make predictions from the data.