Data Scientist with 5+ years of experience building models and pipelines that bridge academic rigor and real-world impact.
I specialize in statistical inference, machine learning, and optimization — with projects ranging from predicting NFL draft picks
and modeling NCAA brackets, to portfolio optimization and experimental lab evaluations.
My toolkit includes Python (Pandas, Scikit-learn, Plotly, Statsmodels), R (for teaching Probability and Statistics), and cloud tools like GCP and Looker Studio.
Whether it's identifying undervalued prospects or building AI-powered draft agents, I turn data into decisions.
Objective: Develop a model for trading challenges that selects the optimal
portfolio each week, maximizing profit in a return-risk trade-off.
- Placed among the top 3 portfolios for multiple weeks in Reto Actinver 2022.
- 5th Place National Award for Bloomberg Trading Challenge 2024.
- Served as an advisor for stock selection in trading challenges.
- The methodology (sketched in code after this list) was:
- Download the stock prices with the yfinance library
- Clean the data and calculate:
- Mean Returns
- Log Returns
- Portfolio Risk
- Portfolio Returns
- Sharpe Ratio
  - Create a function that generates random weights to build a 10-stock
    portfolio.
- Create 100,000 random portfolios.
  - Save the best portfolio found so far, given its return.
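A minimal sketch of this random-portfolio search. The tickers and the zero risk-free rate are placeholders, not the actual challenge universe:

```python
import numpy as np
import yfinance as yf

# Hypothetical 10-stock universe (not the challenge's actual stock list)
tickers = ["AAPL", "MSFT", "AMZN", "GOOGL", "META",
           "NVDA", "TSLA", "JPM", "V", "KO"]
prices = yf.download(tickers, period="1y")["Close"]

# Log returns, annualized mean returns and covariance
log_returns = np.log(prices / prices.shift(1)).dropna()
mu = (log_returns.mean() * 252).to_numpy()
cov = (log_returns.cov() * 252).to_numpy()

rng = np.random.default_rng(42)
best = {"sharpe": -np.inf}

for _ in range(100_000):
    weights = rng.random(len(tickers))
    weights /= weights.sum()                     # random weights summing to 1
    port_return = weights @ mu
    port_risk = np.sqrt(weights @ cov @ weights)
    sharpe = port_return / port_risk             # risk-free rate assumed to be 0
    if sharpe > best["sharpe"]:
        best = {"sharpe": sharpe, "return": port_return,
                "risk": port_risk, "weights": weights}

print(best["sharpe"], best["weights"].round(3))
```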
Objective: Identify players whose Combine measurements suggest they should have
been drafted but weren't, in a search for hidden talent.
- The goal was to identify the variables scouts weigh most when drafting a
  prospect.
- Developed hypotheses comparing positions and their measurement distributions.
- Found a statistically significant difference between positions in their
  Combine measurements.
- Compared Logistic Regression against a Random Forest Classifier to select the
  better model.
- Logit with a Lasso penalty was the best model (sketched below).
- 40-yard dash time and weight are the most important features.
- AUC=0.72
- The model recommended drafting UDFAs such as Jaylen Warren and Cameron Dicker.
- Brock Purdy was a strong recommendation to draft.
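A minimal sketch of the lasso-penalized logistic regression; the file name and feature columns are illustrative stand-ins for the Combine dataset:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import roc_auc_score

combine = pd.read_csv("combine_results.csv")         # hypothetical Combine measurements file
features = ["forty_yd", "weight", "bench", "vertical", "broad_jump"]
X, y = combine[features], combine["drafted"]          # 1 = drafted, 0 = undrafted

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# The L1 (lasso) penalty needs a solver that supports it, e.g. liblinear or saga
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
)
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"AUC: {auc:.2f}")
```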
📑 Retrieval Augmented Generation (RAG) and Agent with the Gemini API for drafting players in the NFL
- As part of the Gen AI Intensive Course Capstone 2025Q1
- Developed an LLM with RAG grounded in all the publicly available scouting reports for the 2025 NFL Draft.
- The LLM is able to give an assessment as well as compare and contrast players at the same position.
- Added an agent capable of playing a mock draft with the user, making picks from the big board
  based on the rankings and team needs (a minimal retrieval sketch follows).
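A minimal retrieval sketch assuming the `google-generativeai` SDK; the embedding and generation model names and the tiny report list are placeholders, not the capstone's actual corpus or pipeline:

```python
import os
import numpy as np
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Hypothetical scouting-report snippets standing in for the 2025 Draft corpus
reports = [
    "QB prospect A: strong arm, quick release, limited mobility.",
    "QB prospect B: elite athlete, accuracy wavers under pressure.",
]

def embed(text: str) -> np.ndarray:
    resp = genai.embed_content(model="models/text-embedding-004", content=text)
    return np.array(resp["embedding"])

report_vecs = np.array([embed(r) for r in reports])

def answer(question: str) -> str:
    q_vec = embed(question)
    sims = report_vecs @ q_vec / (
        np.linalg.norm(report_vecs, axis=1) * np.linalg.norm(q_vec)
    )
    top = np.argsort(sims)[-2:]                  # retrieve the two most similar reports
    context = "\n".join(reports[i] for i in top)
    prompt = f"Scouting reports:\n{context}\n\nQuestion: {question}"
    model = genai.GenerativeModel("gemini-1.5-flash")
    return model.generate_content(prompt).text

print(answer("Compare and contrast the two quarterback prospects."))
```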
- Leveraging the R programming language, students gained a deeper understanding
  of Probability and Statistics concepts such as:
- Conditional Probability
- Discrete Probability Distributions
- Continuous Probability Distributions
- Analysis of Variance
- Experimental Design
- Upcoming term: students will have a reference guide; you can read the WIP here.
Predicting NFL Matches with different ML Models and variables
- Used publicly available data, including tables scraped from Pro Football Reference,
  Sports History Odds, and NFLFastR.
- Trained on all games since 1999 to predict the 2023 season.
- 72% Accuracy Score
- Variables referenced in Delen (2012) were also relevant in our results.
- Beat many state-of-the-art algorithms for game prediction (see the sketch below).
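A hedged sketch of the model comparison; the games file, feature names, and model choices are illustrative stand-ins for the Pro Football Reference / NFLFastR features actually used:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

games = pd.read_csv("games_1999_2023.csv")           # hypothetical merged game-level dataset
features = ["home_elo", "away_elo", "spread", "rest_days_diff", "turnover_diff_l5"]

train = games[games["season"] < 2023]                # train on 1999-2022
test = games[games["season"] == 2023]                # hold out the 2023 season

for name, model in [
    ("logit", LogisticRegression(max_iter=1000)),
    ("random_forest", RandomForestClassifier(n_estimators=500, random_state=0)),
]:
    model.fit(train[features], train["home_win"])
    acc = accuracy_score(test["home_win"], model.predict(test[features]))
    print(f"{name}: {acc:.3f}")
```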
MCMC to test whether the lottery is strictly random or an underlying distribution can be found
- Worked with Mexican Powerball-style lotteries such as Chispazo and Melate Retro.
- Ran Markov Chain Monte Carlo (MCMC) simulations to test this claim (simplified sketch below).
- Log-likelihoods of -146 and -93 for the simulated distributions compared with
  the real distribution of draws.
- Failed to reject the hypothesis that the lottery draws are completely random.
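A simplified simulation sketch of the likelihood comparison rather than the full MCMC setup; the ball count follows Chispazo, but the draw history is a placeholder and within-draw without-replacement dependence is ignored:

```python
import numpy as np
from scipy.stats import multinomial

rng = np.random.default_rng(0)
n_balls, n_picks = 28, 2000                          # 28 balls as in Chispazo; hypothetical number of picks
uniform_p = np.full(n_balls, 1 / n_balls)

# Observed counts of how often each ball appeared (placeholder: simulated here,
# in the project these come from the real draw history)
observed = rng.multinomial(n_picks, uniform_p)

# Log-likelihood of the observed counts under a perfectly uniform lottery
observed_ll = multinomial.logpmf(observed, n=n_picks, p=uniform_p)

# Reference distribution: log-likelihoods of purely random simulated histories
sim_lls = np.array([
    multinomial.logpmf(rng.multinomial(n_picks, uniform_p), n=n_picks, p=uniform_p)
    for _ in range(5000)
])

# If the observed log-likelihood is not extreme relative to the simulations,
# we fail to reject the hypothesis that the lottery is random
p_value = np.mean(sim_lls <= observed_ll)
print(observed_ll, p_value)
```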
- Developed an end-to-end data pipeline providing ad hoc analytics for
  evaluating lab workers on correctly identifying blood cells through a specific
  methodology.
- The whole architecture is hosted on GCP, with the final product delivered
  through Looker Studio.
- Participants in this quality program are evaluated in two ways: by a monthly
  expert and by an expert consensus, to ensure an unbiased assessment.
- This project was entered in the March Machine Learning Mania 2022 - Men’s
  competition to predict the bracket.
- Logistic Regression with cross-validation was used to predict the bracket (sketched below).
- Average log loss of the algorithm: 0.68492.
- Calculated the probability of a win for each team.
- Beat the auto bracket (where all teams have an equal probability of winning).
- Predicted the St. Peter’s Peacocks upsets over No. 2 seed Kentucky and No. 3
  seed Purdue.
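A sketch of the cross-validated logistic regression; the matchup file and features are illustrative, not the exact Kaggle feature set:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import log_loss

matchups = pd.read_csv("ncaa_matchups.csv")          # hypothetical historical matchup file
features = ["seed_diff", "elo_diff", "win_pct_diff"]
X, y = matchups[features], matchups["team1_won"]

# Logistic regression with built-in cross-validation over the regularization strength
model = LogisticRegressionCV(cv=5, max_iter=1000).fit(X, y)

probs = model.predict_proba(X)[:, 1]                 # win probability for team1 in each matchup
print(f"log loss: {log_loss(y, probs):.5f}")
```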
- Took the Amazon Reviews dataset (you can find the dataset here).
- Looked at the most common reviews
- EDA
- Created a topic classifier with Latent Dirichlet Allocation (LDA), sketched below.
- Classified positive topics into 10 different categories based on their sentiment score.
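A sketch of the LDA topic model; the file and column names are assumptions and the preprocessing is simplified:

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

reviews = pd.read_csv("amazon_reviews.csv")          # hypothetical local copy of the dataset
positive = reviews[reviews["sentiment_score"] > 0]["review_text"]

# Bag-of-words representation of the positive reviews
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
doc_term = vectorizer.fit_transform(positive)

lda = LatentDirichletAllocation(n_components=10, random_state=0)  # 10 positive-topic categories
topics = lda.fit_transform(doc_term)                 # per-review topic distribution

# Top words per topic, to label the categories
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[-8:][::-1]]
    print(f"Topic {i}: {', '.join(top)}")
```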
***
- The most recommended Kaggle competition for getting hands-on with ML.
- Voting Classifier that combined distinct methodologies (sketched below).
- Feature engineering on the deck to which each passenger was assigned.
- Filled null values in numeric columns with the column average.
- This submission scored 0.77751 on accuracy.
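A sketch of the voting ensemble and the imputation step, using the standard Kaggle Titanic columns; the chosen base estimators are an assumption:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

train = pd.read_csv("train.csv")                     # Kaggle Titanic training file

# Deck feature: first letter of the cabin, "U" when unknown
train["Deck"] = train["Cabin"].str[0].fillna("U")
train["Sex"] = (train["Sex"] == "female").astype(int)

numeric = ["Age", "Fare", "Pclass", "SibSp", "Parch"]
train[numeric] = train[numeric].fillna(train[numeric].mean())   # mean imputation on numeric columns

X = pd.get_dummies(train[numeric + ["Sex", "Deck"]], columns=["Deck"])
y = train["Survived"]

# Soft-voting ensemble over a few distinct model families
ensemble = VotingClassifier(
    estimators=[
        ("logit", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=300, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="soft",
)
ensemble.fit(X, y)
print(ensemble.score(X, y))
```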
- Analyzed a dataset to predict whether a user would renew their insurance
  policy.
- After EDA, we identified that users without a license were the most likely
  not to buy insurance.
- Users with older vehicles were more inclined to buy insurance policies
- A Decision Tree was the best option for classifying users prone to buy an
  insurance policy (sketched below).
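A sketch of the decision tree classifier; the file and column names are illustrative stand-ins for the insurance dataset described above:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

policies = pd.read_csv("insurance_renewals.csv")     # hypothetical dataset
features = ["has_license", "vehicle_age", "previously_insured", "annual_premium"]
X, y = policies[features], policies["renewed"]

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

# Inspect which splits drive the renewal prediction (e.g. license, vehicle age)
print(export_text(tree, feature_names=features))
```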
📅 Schedule a Call
If you’d like to chat or collaborate, feel free to book a time with me: