About Me

My name is Jevan Singh Chahal, and I recently graduated from UC San Diego with a degree in data science. I am currently an intern at the United Sikh Movement, and I also run my own e-commerce business. Through my education and personal projects, I have developed a broad range of data science skills, from machine learning to database management!

Skills

Here are some of the skills I have to offer!

  • Python
  • SQL
  • Machine Learning
  • Data Analysis
  • Natural Language Processing
  • Data Engineering
  • Feature Engineering
  • Dashboard Development

Work Experience

Here is my work experience!


Data Analyst Internship

● Cleaned and processed raw Eventbrite sales data to ensure accuracy and consistency before analyzing the dataset
● Designed and implemented SQL schemas to structure the database for efficient querying
● Developed an interactive Streamlit dashboard that visualizes key performance indicators from the database, enabling data-driven decisions for the business

ValueVantage

● Drove over $75,000 in revenue within three months by launching and managing an e-commerce business
● Oversaw the end-to-end selling process, including product sourcing, listing, and customer fulfillment
● Optimized pricing strategies to maximize profitability while maintaining competitive market positioning
● Streamlined accounting processes and implemented organizational systems to ensure efficient business operations

Personal Projects

Here are some notable projects I have worked on!


Credit Worthiness Language Models Project

● Received multiple raw bank transaction datasets from PrismData, cleaned them, and engineered features from them using Python
● Utilized NLP and LLM machine learning models to categorize bank transactions
● Derived user-level data such as income from raw bank data using attributes such as regularity, category, and frequency
● Created cash scores and reason codes for users based on previous users' transaction and default history to determine how creditworthy users are and whether they qualify for a loan
● Created 400+ features to measure users' income and spending habits (category-specific and temporal)
● Built multiple gradient-boosted decision tree models (LightGBM, HistGB, CatBoost), which together made up our final weighted ensemble
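
As a rough illustration of how a weighted ensemble like the one above can combine model outputs, here is a minimal sketch. The function name, the weights, and the example probabilities are all made up for illustration and are not the project's actual values or code.

```python
from typing import List, Sequence


def weighted_ensemble(
    predictions: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Blend per-model probabilities using normalized weights.

    predictions[m][i] is model m's predicted probability for sample i.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must score the same samples")
    # Weighted average across models, computed sample by sample.
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Hypothetical default probabilities from three models for two users.
model_a = [0.2, 0.8]
model_b = [0.4, 0.6]
model_c = [0.6, 0.4]
blended = weighted_ensemble([model_a, model_b, model_c], weights=[2, 1, 1])
```

In practice the weights would typically be tuned on a validation set rather than chosen by hand.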

🔗 Here is the link to my poster

PR Sentiment Agent

● Built a LangChain agent with a Flask frontend to analyze company PR sentiment using live news and web data
● Created a NewsAPI-based tool with SQLite caching and zero-shot sentiment classification using BART
● Used TavilySearch to fill missing context and Gemini 2.0 Flash for summary generation
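
To give a sense of what SQLite caching for a news tool can look like, here is a minimal sketch using only the Python standard library. The table name, the `cached_fetch` helper, and the stand-in `fake_fetch` function are assumptions for illustration; the real project calls NewsAPI and may structure its cache differently.

```python
import json
import sqlite3
import time
from typing import Callable, List


def make_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a SQLite database holding cached query results."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news_cache ("
        "query TEXT PRIMARY KEY, fetched_at REAL NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def cached_fetch(
    conn: sqlite3.Connection,
    query: str,
    fetch: Callable[[str], List[dict]],
    max_age_seconds: float = 3600.0,
) -> List[dict]:
    """Return cached articles for `query`, calling `fetch` only when stale."""
    now = time.time()
    row = conn.execute(
        "SELECT fetched_at, payload FROM news_cache WHERE query = ?", (query,)
    ).fetchone()
    if row is not None and now - row[0] < max_age_seconds:
        return json.loads(row[1])  # cache hit: skip the network call
    articles = fetch(query)
    conn.execute(
        "INSERT OR REPLACE INTO news_cache (query, fetched_at, payload) "
        "VALUES (?, ?, ?)",
        (query, now, json.dumps(articles)),
    )
    conn.commit()
    return articles


# Stand-in for a real news API call (hypothetical data).
calls = []


def fake_fetch(query: str) -> List[dict]:
    calls.append(query)
    return [{"title": "Example headline about " + query}]


conn = make_cache()
first = cached_fetch(conn, "Acme Corp", fake_fetch)
second = cached_fetch(conn, "Acme Corp", fake_fetch)  # served from the cache
```

Caching this way keeps repeated questions about the same company from spending API quota on identical requests.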

🔗 Here's the GitHub Repository

NFL QB Analysis


NFL Quarterback Analysis (1999–2025)

Part 1 – QB Playstyle Clustering
    • Engineered features like sack rate, air yards, and rushing stats from 25+ years of QB data
    • Applied PCA to reduce dimensionality while retaining 94% of the variance
    • Clustered QBs into 5 distinct playstyle groups using K-Means

Part 2 – QB Tier List Ranking
    • Created composite scores using PCA on 9 advanced performance metrics
    • Ranked over 1,000 QB seasons from 1999 to 2025 by overall efficiency
    • Visualized top-rated (e.g., Rodgers 2020) and lowest-rated QB seasons

🔗 View Part 1: Playstyle Clustering Interactive Image
🔗 View Part 2: Tier Ranking Interactive Image
🔗 View GitHub Repository

Cloned Google Homepage
Static clone of the Google homepage generated via LLM pipeline

Website Cloning with LLM, Next.js & FastAPI

● Used Playwright to programmatically extract website DOM and capture layout screenshots.
● Applied prompt engineering with multiple LLMs to generate static HTML/CSS that accurately replicates site style and structure.
● Built a full-stack app with Next.js frontend and FastAPI backend to automate scraping, LLM calls, and cloning workflows.
● Delivered static site clones preserving responsive design and visual fidelity without JavaScript.
● Designed for scalability and extensibility with modular prompt templates and multi-LLM support.

🔗 Here's the GitHub Repository

Stocks Project

● Built a Python program that gathers live second-by-second stock data and computes indicators such as RSI, EMA, and Bollinger Bands
● Used multiple APIs to automate live stock trades based on those indicators and user-defined parameters
● Created a stock screener that filters every US ticker based on the parameters given
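
For readers unfamiliar with the indicators named above, here is a minimal sketch of how they can be computed in plain Python. It is an assumption-laden illustration, not the project's code: the EMA is seeded with the first price, the RSI uses simple averages of gains and losses rather than Wilder's smoothing, and the Bollinger Bands use the population standard deviation.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def ema(prices: Sequence[float], period: int) -> List[float]:
    """Exponential moving average, seeded with the first price."""
    alpha = 2 / (period + 1)
    values = [float(prices[0])]
    for price in prices[1:]:
        values.append(alpha * price + (1 - alpha) * values[-1])
    return values


def rsi(prices: Sequence[float], period: int = 14) -> float:
    """Relative Strength Index over the last `period` price changes."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    recent = prices[-(period + 1):]
    changes = [b - a for a, b in zip(recent[:-1], recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)


def bollinger_bands(
    prices: Sequence[float], period: int = 20, num_std: float = 2.0
) -> Tuple[float, float, float]:
    """Return (lower, middle, upper) bands for the last `period` prices."""
    if len(prices) < period:
        raise ValueError("need at least `period` prices")
    window = prices[-period:]
    middle = mean(window)
    spread = num_std * pstdev(window)
    return middle - spread, middle, middle + spread


# Hypothetical closing prices.
closes = [2, 4, 4, 4, 5, 5, 7, 9]
lower, middle, upper = bollinger_bands(closes, period=8)
```

A trading rule might then compare the latest price with the bands, or the RSI with fixed thresholds, before placing an order.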

🔗 Here's the GitHub Repository

Recipe Analysis Project & Machine Learning Model

● Received two raw datasets, cleaned them, and merged them into a single dataset for analysis
● Created graphs showing a range of information, including the overall distribution of ratings and the correlation between a recipe's preparation time and its rating
● Engineered new columns to improve the predictive power of the dataset
● Used regression modeling to accomplish the project goal of predicting the sugar content of recipes

🔗 View Part 1: Recipe Analysis
🔗 View Part 2: Sugary Content Prediction Model

Contact Me

If you'd like to get in touch, feel free to reach out via any of the methods below: