• What do you consider big data?: Datasets of hundreds of terabytes or more. Big data is commonly characterized by the 5 Vs.

  • What are the "5V"s?:

    [Figure: slide screenshot of the 5 Vs, commonly given as Volume, Velocity, Variety, Veracity, and Value]

  • Explain the main differences in data science from theory to practice.: Main points: in practice the data is spread across distributed databases, is unclean, contains many outliers, and needs data versioning.

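    To make "unclean data" and "many outliers" concrete, here is a minimal sketch in plain Python. It is only an illustration: the function name `clean_and_flag`, the sample readings, and the choice of the 1.5 * IQR rule are assumptions, not something taken from the slides.

```python
import statistics


def clean_and_flag(raw_values):
    """Drop unparseable entries, then split the rest with the 1.5 * IQR rule."""
    cleaned = []
    for value in raw_values:
        try:
            cleaned.append(float(value))
        except (TypeError, ValueError):
            continue  # unclean entry such as None or "n/a": skip it

    # Quartiles of the cleaned data (needs at least two valid values).
    q1, _, q3 = statistics.quantiles(cleaned, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    inliers = [v for v in cleaned if low <= v <= high]
    outliers = [v for v in cleaned if v < low or v > high]
    return inliers, outliers


# Hypothetical sensor readings with missing values and one extreme value.
readings = ["10.1", "9.8", None, "n/a", "10.3", "9.9", "250.0", "10.0"]
inliers, outliers = clean_and_flag(readings)
print("kept:", inliers)
print("flagged:", outliers)
```

    In practice this step would usually run inside the pipeline, and whether a flagged value is dropped or kept depends on the use case.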

  • What are the responsibilities of "Data Engineers", "Data Analysts", and "Data Scientists", respectively?:

    1. Data Engineers: Own the entire data pipeline: handling logged errors, agile testing, building fault-tolerant pipelines, administering databases, and keeping the pipeline stable. The role includes database engineers and database administrators.

    2. Data Analysts: Analyze the data using queries and analysis tools.

    3. Data Scientists: Build models that make predictions from the data.

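    The three roles can be sketched as three small steps on the same toy data. This is a rough illustration in plain Python, not a real pipeline: all function names, the simulated outage, and the sample rows are hypothetical.

```python
import statistics


def ingest_with_retry(fetch, attempts=3):
    """Data engineer: make ingestion fault-tolerant by retrying and logging."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except ConnectionError as err:
            print(f"attempt {attempt} failed: {err}")
    raise RuntimeError("source unavailable after retries")


def average_by_key(rows, key, value):
    """Data analyst: a group-by style query over the ingested rows."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row[value])
    return {k: statistics.mean(v) for k, v in groups.items()}


def fit_line(xs, ys):
    """Data scientist: fit y = slope * x + intercept by least squares."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


calls = {"n": 0}


def flaky_fetch():
    """Hypothetical data source that fails on the first call only."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("simulated outage")
    return [
        {"region": "north", "ad_spend": 1.0, "sales": 3.0},
        {"region": "north", "ad_spend": 2.0, "sales": 5.0},
        {"region": "south", "ad_spend": 3.0, "sales": 7.0},
        {"region": "south", "ad_spend": 4.0, "sales": 9.0},
    ]


rows = ingest_with_retry(flaky_fetch)
averages = average_by_key(rows, "region", "sales")
slope, intercept = fit_line(
    [r["ad_spend"] for r in rows], [r["sales"] for r in rows]
)
predicted = slope * 5.0 + intercept
print(averages, predicted)
```

    The split mirrors the list above: the engineer keeps data flowing despite failures, the analyst summarizes what is there, and the scientist uses it to predict unseen values.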