Semester 2: Big Data Technologies

  1. Hadoop Ecosystem
    • Introduction to Hadoop
    • Hadoop Distributed File System (HDFS)
    • MapReduce programming model
  2. Apache Spark
    • Introduction to Spark
    • Spark architecture and components
    • Spark programming in Scala or Python
  3. NoSQL Databases
    • Types of NoSQL databases (e.g., document stores such as MongoDB, wide-column stores such as Cassandra)
    • Data modeling and querying in NoSQL databases

Reference Books:

  1. “Hadoop: The Definitive Guide” by Tom White
  2. “Spark: The Definitive Guide” by Bill Chambers and Matei Zaharia
  3. “Big Data: Principles and best practices of scalable realtime data systems” by Nathan Marz and James Warren
  4. “Kafka: The Definitive Guide” by Neha Narkhede, Gwen Shapira, and Todd Palino
  5. “HBase: The Definitive Guide” by Lars George
  6. “Big Data Analytics with R and Hadoop” by Vignesh Prajapati
  7. “Big Data Technologies and Applications” edited by Borko Furht and Flavio Villanustre