
Data Analytics & Visualization
In December 2022, I began a Data Analytics & Visualization Bootcamp with George Washington University. During this bootcamp, I completed professional-level projects using a wide variety of technologies, including Python, Pandas, Matplotlib, SQL, NoSQL, Tableau, and machine learning.
Business Data Analysis
Queries in SQL
This project shows the creation of an SQL database from six related CSV datasets describing an organization with over 300,000 employees. I also wrote queries to obtain specific, pre-defined results.
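As an illustration only, the load-then-query workflow described above can be sketched with Python's built-in `sqlite3` module. This is not the project's code: the two tables, their columns, the sample rows, and the helper names `load_csv`, `build_database`, and `headcount_by_department` are all hypothetical, and SQLite stands in for whichever database engine the project actually used.

```python
import csv
import io
import sqlite3

# Hypothetical CSV contents standing in for two of the related datasets.
DEPARTMENTS_CSV = """dept_no,dept_name
d001,Marketing
d002,Finance
"""

EMPLOYEES_CSV = """emp_no,last_name,dept_no
1,Rivera,d001
2,Okafor,d002
3,Lindqvist,d002
"""


def load_csv(conn, table, text):
    """Insert every data row of a CSV string into an existing table."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)


def build_database():
    """Create the schema in an in-memory database, then load the CSV data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE departments ("
        "dept_no TEXT PRIMARY KEY, dept_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE employees ("
        "emp_no INTEGER NOT NULL, last_name TEXT NOT NULL, "
        "dept_no TEXT NOT NULL REFERENCES departments(dept_no))"
    )
    # Parent table first so the foreign key on employees is satisfied.
    load_csv(conn, "departments", DEPARTMENTS_CSV)
    load_csv(conn, "employees", EMPLOYEES_CSV)
    return conn


def headcount_by_department(conn):
    """Example query: number of employees per department name."""
    query = """
        SELECT d.dept_name, COUNT(*) AS headcount
        FROM employees AS e
        JOIN departments AS d ON d.dept_no = e.dept_no
        GROUP BY d.dept_name
        ORDER BY d.dept_name
    """
    return conn.execute(query).fetchall()
```

Calling `headcount_by_department(build_database())` returns one `(dept_name, headcount)` tuple per department. Loading the parent table before the child table matters once foreign keys are enforced.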
School Resource Outcomes
Analysis with Python & Pandas in Jupyter Notebook
Using Pandas and Jupyter Notebook, I created a report that analyzes student and school data together to show how student education outcomes relate to a variety of school-level factors.
Crowdfunding Extract Transform & Load (ETL)
Team project with Python, Pandas, RegEx and PostgreSQL
My team built an ETL pipeline using Python, Pandas, and regular expressions to extract and transform the data. We then created four CSV files and used the CSV file data to create an ERD and a table schema. Finally, we uploaded the CSV file data into a Postgres database.
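The extract-and-transform step can be sketched as follows. This is a minimal, dependency-free illustration rather than the team's pipeline: it uses the standard `re` and `csv` modules instead of Pandas, and the raw record layout, the field names, and the helpers `transform` and `to_csv` are assumptions made for the example.

```python
import csv
import io
import re

# Hypothetical raw records; the real source layout is not shown on this page.
RAW_ROWS = [
    "contact_id: 101, name: Ana Moreno, email: ana.moreno@example.org",
    "contact_id: 102, name: Ravi Patel, email: ravi.patel@example.org",
]

# Named groups become the column names of the transformed records.
CONTACT_PATTERN = re.compile(
    r"contact_id:\s*(?P<contact_id>\d+),\s*"
    r"name:\s*(?P<first_name>\S+)\s+(?P<last_name>[^,]+),\s*"
    r"email:\s*(?P<email>\S+)"
)

FIELDNAMES = ["contact_id", "first_name", "last_name", "email"]


def transform(raw_rows):
    """Extract structured fields from raw strings with a regular expression."""
    records = []
    for row in raw_rows:
        match = CONTACT_PATTERN.search(row)
        if match is None:
            continue  # skip rows that do not fit the expected layout
        record = match.groupdict()
        record["contact_id"] = int(record["contact_id"])
        records.append(record)
    return records


def to_csv(records):
    """Serialize the transformed records as CSV text, ready to load into a database."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

`to_csv(transform(RAW_ROWS))` yields CSV text with one header row and one row per matched record. Rows that do not match the pattern are dropped silently here; a production pipeline would more likely log or quarantine them.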
Tableau Project
Data Visualization in Tableau
I built a story that features a series of dashboards of related visualizations exploring three key questions. These visualizations look at birth year and time of day in relation to a variety of factors and, finally, highlight a discrepancy along with a hypothesized cause.
Credit Risk Classification
Using Supervised Learning and Logistic Regression
The purpose of this analysis is to generate a model that can correctly identify 'healthy loan' and 'high risk loan' applicants based on application factors, which include loan size, interest rate, borrower income, debt-to-income ratio, number of accounts, pre-defined derogatory marks, and total debt. We will then test this model to determine how effective it is at classifying both categories.
Cryptocurrency Cluster Analysis
Unsupervised Learning: K-Means with StandardScaler & PCA
I analyzed 42 cryptocurrencies in order to determine the effect of price changes over different periods of time. This analysis used unsupervised machine learning, specifically K-Means, PCA, and StandardScaler.
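The combination of StandardScaler, PCA, and K-Means named above is usually chained in that order: scale the features, reduce their dimensions, then cluster. The sketch below shows that chain with scikit-learn on synthetic data. The function name `cluster_price_changes`, the number of components and clusters, and the generated sample are assumptions for illustration, not values from the project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def cluster_price_changes(price_changes, n_components=2, n_clusters=2, seed=0):
    """Scale features, project them onto principal components, then cluster."""
    # Put every price-change column on the same scale (mean 0, variance 1).
    scaled = StandardScaler().fit_transform(price_changes)
    # Keep only the leading principal components.
    reduced = PCA(n_components=n_components).fit_transform(scaled)
    # Assign each row (one coin) to a cluster in the reduced space.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(reduced)
    return reduced, labels


# Synthetic stand-in data: 20 hypothetical coins, 4 price-change periods each.
rng = np.random.default_rng(0)
calm = rng.normal(loc=0.0, scale=1.0, size=(10, 4))
volatile = rng.normal(loc=25.0, scale=1.0, size=(10, 4))
sample = np.vstack([calm, volatile])

reduced, labels = cluster_price_changes(sample)
```

In practice the number of clusters is often chosen by comparing inertia across several values of `n_clusters` (the elbow method) rather than fixed in advance.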
Deep Learning: Neural Network
Analyzing Corporate Funding Allocation for Prediction
I analyzed over 34,000 businesses that received funding in order to build a neural network model that predicts effective allocation of funding. Over multiple models, the prediction accuracy improved from 72.7% to 73.6%.
Autism Prediction with Machine Learning
Supervised Learning Model Comparison with F1 and Accuracy
In this project, I attempted to predict autism using machine learning. I generated 12 machine learning models using 6 analysis types, each both with and without Random Oversampling. The best model had an accuracy of 91% with F1 scores of 0.81 and 0.94.
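To make the terms above concrete, here is a small standard-library sketch of random oversampling and of the two metrics. It assumes that the two F1 scores are per-class scores (one for each outcome), which this page does not state explicitly. The function names and sample labels are hypothetical, and the project itself may well have used library implementations instead.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are the same size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, count in counts.items():
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - count):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, cls):
    """Harmonic mean of precision and recall, treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = positive screening result, 0 = negative.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
```

Oversampling should be applied to the training split only; duplicating rows before the train/test split leaks copies of training rows into the test set and inflates the scores. Reporting F1 per class alongside accuracy shows whether a model performs well on the smaller class or only on the larger one.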