My Assignments

Assignment 0: Test Dummy

This assignment is for testing. I created a file called tasks that uses Python's sys module to read numbers from the command line. Command-line arguments are strings by default, so I cast them to ints, added them together, and printed the result.
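
A minimal sketch of that idea, not the actual assignment file. The function name `sum_args` and the usage message are made up for illustration:

```python
import sys


def sum_args(args):
    """Cast each command-line string to int and add them up."""
    return sum(int(a) for a in args)


if __name__ == "__main__":
    # sys.argv[0] is the script name, so the numbers start at index 1.
    try:
        print(sum_args(sys.argv[1:]))
    except ValueError:
        print("usage: python tasks.py INT [INT ...]", file=sys.stderr)
```

Run as `python tasks.py 1 2 3`, this would add the three arguments as integers rather than concatenating them as strings.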

Github Link

Assignment 1: Elevator Position

For this assignment I collected data on when the elevators opened in the CDS building, then wrote small Python scripts to read the data and compute the optimal place to stand to minimize the distance you need to walk.
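
One way to frame the computation, as a sketch under assumptions that are not from the assignment: the elevator doors sit along a straight line, and each door has a count of how often it opened. Under that model, the spot that minimizes the expected walking distance is the weighted median of the door positions. The positions and counts below are hypothetical.

```python
def expected_distance(stand, positions, counts):
    """Average distance walked from `stand`, weighting each door by its count."""
    total = sum(counts)
    return sum(c * abs(p - stand) for p, c in zip(positions, counts)) / total


def best_position(positions, counts):
    """Weighted median: the first door where the running count reaches half."""
    total = sum(counts)
    running = 0
    for p, c in sorted(zip(positions, counts)):
        running += c
        if running * 2 >= total:
            return p


positions = [0.0, 2.5, 5.0, 7.5]  # hypothetical door positions in meters
counts = [10, 30, 15, 5]          # hypothetical number of times each door opened
best = best_position(positions, counts)
print(best, expected_distance(best, positions, counts))
```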

Github Link

Assignment 2: Kmeans Algorithm WebApp

For this assignment I developed a web app using Flask for the back end and JavaScript for front-end interaction with the page. The app lets the user choose the number of clusters k and the method used to initialize the centroids. It also lets them select the initial centroids manually.
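
The page does not list which initialization methods the app offers. As an illustration only, here is a sketch of two common choices, uniform random selection and k-means++ seeding. The function names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_random(points, k, rng):
    """Pick k distinct data points uniformly at random."""
    return rng.sample(points, k)


def init_kmeans_pp(points, k, rng):
    """k-means++ seeding: favor points far from the centroids chosen so far."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Weight each point by its squared distance to the nearest centroid.
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids


points = [(0, 0), (0, 1), (1, 0), (8, 8), (8, 9), (9, 8), (0, 9), (1, 9)]
rng = random.Random(0)
print(init_random(points, 3, rng))
print(init_kmeans_pp(points, 3, rng))
```

Manual selection would simply replace either function with the list of points the user clicked.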

Github Link

Demo Video

Assignment 3: SVD Implementation and Analysis

For this assignment I implemented SVD on the images in the MNIST database of handwritten digits 0-9. I plotted the accuracy and the time for different numbers of retained SVD components, then analyzed the results.
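
A rough sketch of what retaining a number of SVD components means. It uses `np.linalg.svd` rather than a hand-written decomposition, and random data stands in for the flattened images, so it is not the assignment's own code.

```python
import numpy as np


def truncate_svd(X, n_components):
    """Keep the top singular directions; return reduced data and its reconstruction."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :n_components], s[:n_components], Vt[:n_components]
    X_reduced = U_k * s_k        # shape: (n_samples, n_components)
    X_approx = X_reduced @ Vt_k  # back in the original feature space
    return X_reduced, X_approx


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))   # stand-in for 100 flattened 8x8 images

for k in (5, 20, 64):
    _, X_approx = truncate_svd(X, k)
    rel_err = np.linalg.norm(X - X_approx) / np.linalg.norm(X)
    print(k, rel_err)
```

Fewer components mean a smaller input for the classifier and a larger reconstruction error, which is the trade-off behind the accuracy and time plots.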

Github Link

Assignment 4: LSA Search Engine

For this assignment I developed a web app that applies an implementation of LSA to a dataset of documents. It works as a search engine: the user types in keywords and gets back related documents, ranked by cosine similarity.
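
The retrieval step can be sketched roughly as follows. The four toy documents, the raw word-count weighting, and the `search` helper are all assumptions for illustration; the real app runs on its own dataset.

```python
import numpy as np

docs = [
    "cats chase mice",
    "dogs chase cats",
    "stocks fall as markets drop",
    "markets rally as stocks rise",
]
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}


def to_counts(text):
    """Bag-of-words count vector over the toy vocabulary."""
    v = np.zeros(len(vocab))
    for w in text.split():
        if w in index:
            v[index[w]] += 1
    return v


A = np.array([to_counts(d) for d in docs])        # documents x terms
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                             # number of latent topics kept
doc_vecs = U[:, :k] * s[:k]                       # documents in topic space


def search(query, top_n=2):
    """Project the query into topic space and rank documents by cosine similarity."""
    q = to_counts(query) @ Vt[:k].T
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-12
    sims = doc_vecs @ q / norms
    order = np.argsort(-sims)[:top_n]
    return [(docs[i], float(sims[i])) for i in order]


for doc, score in search("stocks markets"):
    print(round(score, 3), doc)
```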

Github Link

Demo Video

Assignment 5: KNN Bank Churn

For this assignment I implemented KNN from scratch and used the model to predict client churn from the provided data. I preprocessed the data by computing the correlation between exit status and each feature. I also oversampled the minority class (clients who exited) to balance the dataset.
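
A compact sketch of both pieces, assuming Euclidean distance, majority voting, integer 0/1 labels, and random duplication of minority rows. None of those details are confirmed by the description above, and the toy arrays are made up.

```python
import numpy as np


def knn_predict(X_train, y_train, X_test, k=3):
    """Label each test point by majority vote among its k nearest training points."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dists)[:k]
        votes = np.bincount(y_train[nearest])
        preds.append(int(np.argmax(votes)))
    return np.array(preds)


def oversample_minority(X, y, rng):
    """Duplicate random minority-class rows until both classes have equal counts."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_extra = counts.max() - counts.min()
    idx = rng.choice(np.flatnonzero(y == minority), size=n_extra, replace=True)
    return np.vstack([X, X[idx]]), np.concatenate([y, y[idx]])


X_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [6, 5]], dtype=float)
y_train = np.array([0, 0, 0, 0, 1, 1])  # 1 = exited (the minority class here)

X_bal, y_bal = oversample_minority(X_train, y_train, np.random.default_rng(0))
print(knn_predict(X_bal, y_bal, np.array([[0.2, 0.1], [5.5, 5.5]])))
```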

Github Link

Assignment MovieReview: MovieReview

In this project, I developed a predictive model to determine Amazon review star ratings based on a variety of features, including text data processed with TF-IDF vectorization. I built a stacked ensemble model using Random Forest, LightGBM, and XGBoost, leveraging each model's strengths for improved accuracy. To optimize the model's performance, I tuned hyperparameters, experimented with feature engineering, and tested methods to handle class imbalance, including SMOTEENN and custom class weights. The final model achieved 60% accuracy on unseen data, highlighting effective strategies for handling large, imbalanced datasets.
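
How the custom class weights were chosen is not stated. One common starting point, shown here purely as an assumption, is inverse-frequency weighting, where each class gets weight n_samples / (n_classes * class_count). The ratings list is hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


ratings = [5] * 6 + [4] * 2 + [1] * 2  # hypothetical star ratings, skewed toward 5
weights = inverse_frequency_weights(ratings)
print(weights)
```

Rare ratings end up with larger weights, so misclassifying them costs more during training.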

Github Link

Assignment 7: Linear Regression Simulations++

For this website I added hypothesis testing and confidence intervals for linear regression. These tools let users test whether an observed slope is statistically significant and view a range within which the true slope likely falls, adding depth to the regression analysis.
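
The description does not say how the test and the interval are computed. As one possible simulation-based sketch, assumed here rather than taken from the site, a permutation test gives a p-value for the slope and a bootstrap gives a percentile confidence interval. The synthetic data uses a true slope of 2.

```python
import numpy as np


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def permutation_p_value(x, y, n_sims, rng):
    """Two-sided p-value: how often shuffled data gives a slope at least as extreme."""
    observed = abs(slope(x, y))
    hits = sum(abs(slope(x, rng.permutation(y))) >= observed for _ in range(n_sims))
    return (hits + 1) / (n_sims + 1)


def bootstrap_ci(x, y, n_sims, rng, level=0.95):
    """Percentile interval for the slope from resampled (x, y) pairs."""
    n = len(x)
    slopes = []
    for _ in range(n_sims):
        idx = rng.integers(0, n, size=n)
        slopes.append(slope(x[idx], y[idx]))
    alpha = (1 - level) / 2
    lo, hi = np.quantile(slopes, [alpha, 1 - alpha])
    return float(lo), float(hi)


rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + rng.normal(0, 1, size=50)  # synthetic data, true slope = 2

p_value = permutation_p_value(x, y, 200, rng)
ci_low, ci_high = bootstrap_ci(x, y, 200, rng)
print(slope(x, y), p_value, (ci_low, ci_high))
```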

Github Link

Demo Video

Assignment 8: Logistic Regression With Cluster Shifting

For this website I implemented logistic regression with cluster shifting. It helps analyze how the relationship between the variables, as captured by the decision boundary, changes as the clusters are shifted.
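
A small sketch of the experiment, under assumed details: two Gaussian clusters, the second one shifted along the diagonal, and logistic regression fitted by plain gradient descent. The function names, shift values, and learning rate are illustrative only.

```python
import numpy as np


def make_clusters(shift, n, rng):
    """Class 0 around the origin, class 1 moved to (shift, shift)."""
    a = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n, 2))
    b = rng.normal(loc=[shift, shift], scale=1.0, size=(n, 2))
    return np.vstack([a, b]), np.concatenate([np.zeros(n), np.ones(n)])


def add_bias(X):
    """Prepend a column of ones so the first weight acts as the intercept."""
    return np.hstack([np.ones((len(X), 1)), X])


def fit_logistic(X, y, lr=0.1, steps=2000):
    """Gradient descent on the mean log-loss; returns [bias, w1, w2]."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w


def accuracy(w, X, y):
    """Fraction of points on the correct side of the boundary w . [1, x] = 0."""
    return float(np.mean((add_bias(X) @ w > 0) == (y == 1)))


rng = np.random.default_rng(0)
results = {}
for shift in (0.5, 6.0):
    X, y = make_clusters(shift, 100, rng)
    w = fit_logistic(X, y)
    results[shift] = accuracy(w, X, y)
    print(shift, w, results[shift])
```

The decision boundary is the line where bias + w1*x1 + w2*x2 = 0, so printing the weights for each shift shows how the boundary moves as the clusters separate.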

Github Link

Demo Video

Assignment 10: Image Search

For this website I implemented three types of queries. Each uses the CLIP model to convert images and text into embeddings, searches through the stored image embeddings, and returns the five results with the highest cosine similarity.
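
The ranking step might look like the sketch below. It does not call CLIP: random vectors stand in for the precomputed image embeddings and for the query embedding, and `top_k_by_cosine` is a hypothetical helper name.

```python
import numpy as np


def top_k_by_cosine(query, image_embeddings, k=5):
    """Return indices and scores of the k embeddings most similar to the query."""
    q = query / np.linalg.norm(query)
    E = image_embeddings / np.linalg.norm(image_embeddings, axis=1, keepdims=True)
    sims = E @ q                      # cosine similarity, since rows are unit length
    order = np.argsort(-sims)[:k]     # highest similarity first
    return order, sims[order]


rng = np.random.default_rng(0)
image_embeddings = rng.normal(size=(20, 8))  # stand-in for CLIP image embeddings
query = image_embeddings[7]                  # stand-in for an embedded query

indices, scores = top_k_by_cosine(query, image_embeddings)
print(indices, scores)
```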

Github Link

Demo Video