Data Science
Courses and Projects
Beginning in 2019, I took online courses through Coursera, Udemy, and Udacity to build the programming and analytical skills of a data scientist. Certificates of completion for these courses and course bundles (along with links to the course sites) can be found on my LinkedIn profile.
I began my journey with Coursera, learning to program in Python:
Learn to Program: The Fundamentals (University of Toronto)
Learn to Program: Crafting Quality Code (University of Toronto)
Through Coursera, I also endeavored to establish a good foundation in statistics with the following sequence of courses using R and RStudio:
Introduction to Probability and Data (Duke University)
Inferential Statistics (Duke University)
Linear Regression and Modeling (Duke University)
To expand my understanding of R as a programming language, I took Udemy's course:
R Programming A-Z
My knowledge of the tools and techniques of data science grew considerably with two course bundles (called "nanodegrees") by Udacity:
Programming for Data Science with Python
Data Analyst
In these course bundles, I was introduced to SQL, Git, and GitHub as well as various Python libraries: NumPy and pandas for data wrangling, random and statsmodels for statistics, and Matplotlib and seaborn for data visualization. Each course in a bundle ended with a project, which Udacity specialists evaluated and provided feedback on. All criteria in the project rubric had to be met in order to pass the course. To view my course projects, see my GitHub Project Portfolio.
Next on my agenda in my data science education is machine learning.
Collaborative Research on the "Changing Use of Formal Methods in Philosophy"
During the COVID-19 pandemic, from August 2020 until the publication of our results in November 2021, I collaborated remotely via Zoom with three other philosophers on a project investigating how the use of formal methods (i.e., methods involving symbolic notation and special techniques for reasoning and calculating) has been changing in philosophy. We found an increased use of probabilistic methods alongside a stable use of logical methods. To arrive at this conclusion, we gathered a sample of roughly 900 articles from the journal Philosophical Studies, drawn from two different ranges of years. After an initial screening by a team of undergraduates at the University of Minnesota, the four of us classified and coded the articles that passed. We then computed the statistical results and collaboratively wrote an article presenting our conclusions (using the Overleaf online LaTeX editor), which was published.
Besides proficiency in logic, all four of us had expertise in either statistics or data science, and I brought my developing data science skills to the mix. So, in addition to providing input on the design of the project and contributing an equal share to article classification and coding, I made the following special contributions to the project:
Taking primary responsibility for data management, data cleaning, and data wrangling (using Python in a Jupyter notebook with NumPy and pandas)
Creating three tables and, alongside Joshua Knobe, all but the first of the data visualizations that appear in the paper (using Python's matplotlib and seaborn libraries as well as R's ggplot2)
Writing substantial portions of three sections of the paper (“Classification process,” “Classification resolution and final data set,” and “Results”)
Python F-Strings Cheat Sheets
In Python, F-Strings are strings prefixed with 'f' or 'F' that contain Python expressions inside curly braces, which are evaluated at run time. They are more formally known as "formatted string literals" and were introduced in Python 3.6. They provide a concise, readable way to include the value of a Python expression, with formatting control, inside a string.
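As a minimal sketch of the idea, the snippet below shows an expression evaluated inside curly braces and a few common format specifiers. The variable names and values are made up for illustration and are not taken from the cheat sheets.

```python
# Illustrative values only; these names are hypothetical.
name = "Ada"
count = 3
price = 1234.5678
ratio = 0.4567

# A plain variable inside curly braces
greeting = f"Hello, {name}!"                        # 'Hello, Ada!'

# Any expression can be evaluated, then formatted to 2 decimal places
total = f"{count} items cost {count * price:.2f}"   # '3 items cost 3703.70'

# Thousands separator combined with 2 decimal places
thousands = f"{price:,.2f}"                         # '1,234.57'

# Percentage with 1 decimal place
percent = f"{ratio:.1%}"                            # '45.7%'

# Right-align within a field 8 characters wide
padded = f"[{name:>8}]"                             # '[     Ada]'

# An uppercase 'F' prefix works the same way; method calls are allowed
shout = F"{name.upper()}"                           # 'ADA'

for line in (greeting, total, thousands, percent, padded, shout):
    print(line)
```

The part after the colon inside the braces is the format specifier, which is where the formatting control comes from.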
I had originally learned an older Python string formatting technique, but wanted to switch once I learned about F-Strings. As I looked into them, I thought I could organize the information about F-Strings better than other presentations I had seen, so I created my own pair of cheat sheets on Cheatography, an online cheat sheet creation and publishing tool. One covers the basics of F-Strings and the other focuses on number formatting.
These can be found online at the Cheatography website.
Unfortunately, a change in the engine that converts the web version of a cheat sheet into a handy PDF corrupted the associated PDF files, making them less readable than they originally were. So, I make the original PDFs available below. (I have been unable to get a response from Cheatography about how to address this issue.)