These notes take a distinctly practical, hands-on approach to learning both pandas and SQL. Traditional database courses and introductory data science courses may touch on these two topics, but they often give lower priority to writing efficient, modern code with these tools. These notes approach the basics of data management from a different angle: you need to be comfortable manipulating data directly before you can easily internalize additional concepts.
This course has been taught at undergraduate, graduate, and executive certificate levels in a variety of course structures.
This page contains the most recent version of the notes, currently used to teach students in the MS-CAPP Program at the University of Chicago.
The data used in this course can be found in the repository here. Below is the table of contents with links to specific chapters. Please note that some sections are works in progress or TBD.
A combined PDF with all the notes is available here. Specific chapters are listed below, but please be aware that links in the chapter-specific PDFs are not currently functional.
.Table of Contents
Introduction and Errata
Introduction

Relational Databases
Rows and Columns | Chapter 1
Basic Manipulations | Chapter 2
Subqueries, Distinct & Case | Chapter 3
Database Internals: Transactions | Chapter 4
Aggregations | Chapter 5
Dates and Types | Chapter 6
Averages | Chapter 7
Joins | Chapter 8
Advanced Joins | Chapter 9
Analytic Functions & CTE's | Chapter 10
Database Internals: Performance Evaluation | Chapter 11
Extensions [TBD] | Chapter 12
Interview Hints | Chapter 13

Pandas
Introduction | Chapter 14
More Manipulations and Types | Chapter 15
Aggregations | Chapter 16
Joins | Chapter 17
Window Functions | Chapter 18

Appendix
Data Dictionaries | Appendix A
Connecting SQL to Python or R | Appendix B

Assignments
Example Exams