DSDA 1995: Data Science and Society using R

Course Information

Course Location    Meeting Days    Time
SSH 308            Monday          4:00-6:30pm

Instructor Information

Instructor            E-Mail                   Office Location    Hours
Dr. Jason S. Byers    jason.byers@uconn.edu    SSH 433            Monday 11:00am - 1:00pm

Syllabus

Course Home
Everything you need for this class (announcements, resources, assignments and other activities) will be posted on HuskyCT or the course website. Please plan to check the pages regularly.

Course Description
This course introduces students to the foundational concepts of data science and its impact on modern society. It develops students’ technical skills in data literacy, wrangling, analysis, and visualization, while fostering an understanding of how data science can be applied to real-world issues.

Learning Outcomes
Together, we will strive for your individual and collective success in achieving the learning outcomes of this course. At the conclusion of this course, students will be able to:

  • Develop a foundational understanding of data science principles and their societal implications.

  • Gain proficiency in data wrangling, analysis, visualization, and basic programming.

  • Cultivate data, computer, and programming literacy to support further study in data science.

  • Analyze real-world issues through data-driven approaches.

  • Prepare for advanced studies and careers by understanding the skills and knowledge required in data science.

Prerequisites
There are no prerequisites for this course. In particular, I will assume no previous experience with R, computer programming, or statistics.

Course Materials
To maximize access to this class, we will use some textbooks, videos, and other resources, with a focus on the following:

  • Primary Text (R4DS): Wickham, Hadley, Mine Cetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd Edition. O’Reilly Media. This book is freely available online. It is also available in paperback, if you prefer a hard copy. Warning: some content and the numbering system differ between the print and online versions; I will refer exclusively to the free online version.

  • Primary Text (GCR): Lovelace, Robin, Jakub Nowosad, and Jannes Muenchow. 2019. “Geocomputation with R.” CRC Press. This book is freely available online. It is also available in paperback, if you prefer a hard copy. Warning: some content and the numbering system differ between the print and online versions; I will refer exclusively to the free online version.

  • Primary Text (USDR): Engel, Claudia A. 2019. “Using Spatial Data with R.” This book is freely available online.

  • Primary Text (MSR): Wickham, Hadley. 2021. “Mastering Shiny: Build Interactive Apps, Reports and Dashboards Powered by R.” O’Reilly Media. This book is freely available online.

Reference Texts

  • Reference Text (HOPR): Grolemund, Garrett. 2014. Hands-On Programming with R: Write Your Own Functions and Simulations. O’Reilly Media. This book is freely available online. It is also available in paperback, if you prefer a hard copy. Warning: some content and the numbering system differ between the print and online versions; I will refer exclusively to the free online version.

  • Reference Text (DSB): Cetinkaya-Rundel, Mine. 2021. Data Science in a Box. This book is freely available online.

Computer Requirements
We will be conducting data analysis in class so that you can practice the skills that you’ve learned from the textbook and lectures. To conduct data analysis, we will be using the R statistical computing environment on your computer. Please bring your laptops with R installed to every class. You are responsible for having a reliable computer and internet connection throughout the course.

Reusing/Sharing Code. Many of the datasets we will discuss and analyze are publicly available, so they may already have been extensively discussed and analyzed elsewhere. Unless explicitly instructed otherwise, you may use available code and resources for course activities (e.g., GitHub repos, Stack Overflow answers), but you must cite the source of the code or resource within your program files and/or document. Reused code that is not properly cited may be considered plagiarism. When working in groups on class assignments, you are welcome to discuss problems together and ask for general advice, but you may not share code with or use code from another group.
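As an illustration of the citation requirement, here is one possible way to credit borrowed code inside an R script. The comment layout, the placeholder fields, and the `column_means()` helper are hypothetical; they are not a required course format, and the function simply stands in for whatever code you adapted.

```r
# Source: adapted from a Stack Overflow answer
#   URL: <paste the link to the answer here>
#   Accessed: <date you viewed it>
#   Changes: renamed variables; added na.rm = TRUE
column_means <- function(df) {
  # Keep only the numeric columns, then average each one, ignoring NAs.
  numeric_cols <- df[vapply(df, is.numeric, logical(1))]
  colMeans(numeric_cols, na.rm = TRUE)
}

# Small made-up data frame to show the function running.
example_df <- data.frame(x = c(1, 2, NA), y = c(4, 5, 6), z = c("a", "b", "c"))
column_means(example_df)
```

Placing the citation directly above the borrowed code keeps the credit attached to it even if the file is later reorganized.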

Software
You will use two freely available programs, R and RStudio, in order to complete the assignments for this course.
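Once R and RStudio are installed, a short check like the sketch below can confirm that your setup is ready for class. The package names are an assumption based on the course texts (R4DS relies on tidyverse, GCR on sf, and MSR on shiny); this is not an official list of required packages.

```r
# Packages assumed from the course texts (not an official course list).
course_pkgs <- c("tidyverse", "sf", "shiny")

# Report the version of R that is currently running.
cat("Running", R.version.string, "\n")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path.
installed <- vapply(course_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- course_pkgs[!installed]

if (length(missing_pkgs) > 0) {
  cat("Not yet installed:", paste(missing_pkgs, collapse = ", "), "\n")
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  cat("All listed packages are installed.\n")
}
```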

University Policies
The University of Connecticut is committed to protecting the rights of individuals with disabilities and assuring that the learning environment is accessible. If you anticipate or experience physical or academic barriers based on disability or pregnancy, please let me know immediately so that we can discuss options. Students who require accommodations should contact the Center for Students with Disabilities, Wilbur Cross Building Room 204, (860) 486-2020 or http://csd.uconn.edu/.

Missed and/or Late Work Policy
Missed and/or late work will not be accepted.

Policy Against Discrimination, Harassment and Related Interpersonal Violence
The University is committed to maintaining an environment free of discrimination or discriminatory harassment directed toward any person or group within its community – students, employees, or visitors. Academic and professional excellence can flourish only when each member of our community is assured an atmosphere of mutual respect. All members of the University community are responsible for the maintenance of an academic and work environment in which people are free to learn and work without fear of discrimination or discriminatory harassment. In addition, inappropriate amorous relationships can undermine the University’s mission when those in positions of authority abuse or appear to abuse their authority. To that end, and in accordance with federal and state law, the University prohibits discrimination and discriminatory harassment, as well as inappropriate amorous relationships, and such behavior will be met with appropriate disciplinary action, up to and including dismissal from the University. Additionally, to protect the campus community, all non-confidential University employees (including faculty) are required to report sexual assaults, intimate partner violence, and/or stalking involving a student that they witness or are told about to the Office of Institutional Equity. The University takes all reports with the utmost seriousness. Please be aware that while the information you provide will remain private, it will not be confidential and will be shared with University officials who can help. More information is available at equity.uconn.edu and titleix.uconn.edu.

Absences from Class Due to Religious Observances and Extra-Curricular Activities
Faculty and instructors are expected to reasonably accommodate individual religious practices unless doing so would result in fundamental alteration of class objectives or undue hardship to the University’s legitimate business purposes. Such accommodations may include rescheduling an exam or giving a make-up exam, allowing a presentation to be made on a different date or assigning the student appropriate make-up work that is intrinsically no more difficult than the original assignment. Faculty and instructors are strongly encouraged to allow students to complete work missed due to participation in extra-curricular activities that enrich their experience, support their scholarly development, and benefit the university community. Examples include participation in scholarly presentations, performing arts, and intercollegiate sports, when the participation is at the request of, or coordinated by, a University official. Students should be encouraged to review the course syllabus at the beginning of the semester for potential conflicts and promptly notify their instructor of any anticipated accommodation needs. Students are responsible for making arrangements in advance to make up missed work. For conflicts with final examinations, students should contact the Dean of Students Office.

Office of Emergency Management on Emergency Preparedness
In case of inclement weather, a natural disaster, or a campus emergency, the University communicates through email and text message. Students are encouraged to sign up for alerts through http://alert.uconn.edu. Students should be aware of emergency procedures, and further information is available through the Office of Emergency Management at http://publicsafety.uconn.edu/emergency/.

Course Organization
Modes of learning in this class (whether assessed directly or indirectly) require a range of skills and abilities. Every student’s success is important to me, and I am happy to work with you to develop strategies for success in this class.

Learning R and using it to illuminate and analyze issues within political science and the social sciences requires regular and repeated practice. This course will be an active learning environment, which consists of a mix of lectures, discussions, short demonstrations, presentations, and in-class activities.

  • In-Class Activities. Each class day will involve a significant amount of hands-on work with R and data. In order to learn from these activities, you must do the assigned readings and videos before you come to class, and be prepared to ask (and answer) questions before diving into a small group discussion or activity.

  • Problem Sets. Weekly assignments (due approximately every Monday at 4:00 pm Eastern) will provide you with regular practice using quantitative methods in R and applying these methods to issues within political science. These assignments will build on the material presented in class and require you to apply the basic concepts in new ways. Collaboration is encouraged on the problem sets, but all code must be your own work. You may get help on problem sets from me and by searching for existing advice on the internet. You may not ask any other person, whether at the University of Connecticut or elsewhere (including the internet), to help you solve a problem.

  • Quizzes. In general, weeks without problem sets will have quizzes released on Monday and due the following Monday at 4:00 pm Eastern. Each quiz covers all course material presented since the previous quiz, with an emphasis on techniques used in the previous problem set. Quizzes are to be completed individually, with no help from anyone, in a limited amount of time. Quizzes are open book and open notes.

  • Final Group Project. The goal of the final project is for you to apply the data science skills learned in this course to real data to answer a question that helps illuminate and analyze real world issues. You will work in teams of four to build a dashboard with graphics, text explanations and interpretations. Intermediate deadlines will entail submitting your research questions, data sources, design sketch, and initial maps and graphs.

Grading

Category Points
Problem Sets 40 Points
Quizzes 40 Points
Final Project 20 Points

Your grade will be determined according to the following system:

Grade Points
A 94 or above
A- 90 - 93
B+ 87 - 89
B 84 - 86
B- 80 - 83
C+ 77 - 79
C 74 - 76
C- 70 - 73
D+ 67 - 69
D 64 - 66
D- 60 - 63
F 0 - 59
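The two tables above can be combined into a short R sketch that adds up points and looks up the letter grade. The earned points below are made up for illustration, and treating each grade band as starting at its lower bound (so that a fractional total such as 89.5 would map to B+) is an assumption; the syllabus does not state a rounding rule.

```r
# Maximum points per category, from the grading table (total: 100).
max_points <- c(problem_sets = 40, quizzes = 40, final_project = 20)

# Hypothetical points earned in each category (illustrative values only).
earned <- c(problem_sets = 36, quizzes = 34, final_project = 19)

total <- sum(earned)  # 36 + 34 + 19 = 89

# Lower bound of each letter grade, from the grade table.
cutoffs <- c(0, 60, 64, 67, 70, 74, 77, 80, 84, 87, 90, 94)
grades  <- c("F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A")

# findInterval() returns the index of the highest cutoff not exceeding the total.
letter_grade <- function(points) {
  grades[findInterval(points, cutoffs)]
}

letter_grade(total)  # "B+"
```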

Schedule

A tentative class schedule of topics, readings, and due dates is available below. Minor adjustments will be made as needed and posted on the course web page. Please double-check the web page before starting each reading assignment.

Week 1

Topics

  • Introduction
  • Downloading R/RStudio
  • Introduction to R and RStudio
  • Introduction to Quarto

Date    Readings
1/27    Introduction
        HOPR Appendix A
        QSS Chapter 1.3-1.3.8
        R4DS Chapter 28
        R4DS Chapter 29

Assignments

  • Download and install R and RStudio on your personal machines

Week 2

Topics

  • Introduction to R and RStudio
  • Getting and Loading Data
  • Dealing with Messy Data

Date    Readings
2/3     HOPR Appendix D
        R4DS Chapter 4
        R4DS Chapter 6
        R4DS Chapter 7
        R4DS Chapter 8
        R4DS Import

Assignments

  • Problem Set 1 Assigned

Week 3

Topics

  • Data Visualization

Date    Readings
2/10    R4DS Chapter 1
        R4DS Chapter 11
        R4DS Chapter 9
        Data Visualization in R

Assignments

  • Problem Set 1 DUE
  • Quiz 1 Assigned

Week 4

Topics

  • Transforming Data

Date    Readings
2/17    R4DS Chapter 3

Assignments

  • Quiz 1 DUE
  • Problem Set 2 Assigned

Week 5

Topics

  • Exploring Data

Date    Readings
2/24    R4DS Chapter 10
        R4DS Chapter 12
        R4DS Chapter 13

Assignments

  • Problem Set 2 DUE
  • Quiz 2 Assigned

Week 6

Topics

  • Tidy Data
  • Joins I

Date    Readings
3/3     R4DS Chapter 5
        R4DS Chapter 19.1 - 19.2

Assignments

  • Quiz 2 DUE
  • Problem Set 3 Assigned

Week 7

Topics

  • Joins II

Date    Readings
3/10    R4DS Chapter 19.3 - 19.6
        R4DS Chapter 14
        R4DS Chapter 16
        R4DS Chapter 17

Assignments

  • Problem Set 3 DUE
  • Quiz 3 Assigned

Week 8

Topics

  • Spring Break

Date    Readings
3/17    Spring Break

Week 9

Topics

  • Quiz 3

Date    Readings
3/24    Quiz 3

Assignments

  • Quiz 3 DUE

Week 10

Topics

  • Geospatial Data I

Date    Readings
3/31    USDR Chapter 1
        GCR Chapter 1
        USDR Chapter 3.1-3.4
        GCR Chapter 8.1-8.2

Assignments

  • Problem Set 4 Assigned

Week 11

Topics

  • Geospatial Data II
  • R Shiny

Date    Readings
4/7     USDR Chapter 3.5-3.6
        GCR Chapter 8.3-8.6
        MSR Chapter 1-4
        MSR Chapter 12

Assignments

  • Problem Set 4 DUE
  • Quiz 4 Assigned

Week 12

Topics

  • R Shiny

Date    Readings
4/14    MSR Chapter 5
        MSR Chapter 6

Assignments

  • Quiz 4 DUE
  • Problem Set 5 Assigned

Week 13

Topics

  • Topics in Data Science

Date    Readings
4/21    TBD

Assignments

  • Quiz 5 Assigned
  • Problem Set 5 DUE

Week 14

Topics

  • Presentations

Date    Readings
4/28    Presentations

Assignments

  • Quiz 5 DUE
  • Presentations DUE