DATS 6450.13

Applied Big Data Analytics

George Washington University
Spring 2026

Published

Thursday Mar 5, 2026 at 04:51 pm

Course Information

Course Code: DATS 6450.13
Credits: 3
Schedule: Thursdays 6:10 PM - 8:40 PM
Location: Phillips 416

Instructor

Abhijit Dasgupta
Adjunct Professor

Email: abhijit.dasgupta@gwu.edu
Office Hours: Thursdays 5:10 PM - 6:10 PM (Phillips 416) or by appointment (online)

Course Description

Applied Big Data Analytics (DATS 6450.13) teaches practical, hands-on methods for building scalable analytics pipelines that start on a single machine and scale to distributed clusters. Students learn to develop local analytical workflows with DuckDB, translate and scale them with Apache Spark, and evaluate performance tradeoffs through comparative benchmarking, query tuning, partitioning, and memory strategies. The course covers modern tooling (Polars, Ray, RAPIDS), the Spark SQL and DataFrame APIs, Spark NLP and MLlib, and efficient visualization of very large datasets with Datashader. It emphasizes reproducible end-to-end workflows and culminates in a final project that demonstrates design and performance decisions.

Content 2026
Abhijit Dasgupta
All content licensed under a Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0)
