Unless otherwise indicated, a grade of C or higher is required for all prerequisite courses.
Introduction to the field of Big Data, its concepts and technologies, and current programming environments such as R and Python. Students will explore the roles of a data scientist in terms of network architecture, data analytics, and predictive analysis. Fundamental questions of data science, and the scenarios appropriate for each, will be discussed. The differences among raw, clean, and tidy data will be covered, along with tools for converting data between these forms. Effective management of large data in single and distributed computing environments, including managing data redundancy and failures, will be covered. Data mining and machine learning techniques will be introduced, including classification, correlation, cluster analysis, frequent patterns, and data visualization. Intended for students with previous programming experience.