Kanika Chopra

about me



I am a third-year undergraduate student studying Mathematical Finance and Statistics at the University of Waterloo. I started competing in math competitions when I was 8, so you can tell I'm passionate about what I study. I'm interested in data science, machine learning, and understanding how math and data science can be applied across a wide range of fields.

This term, I am the Vice-President of Events & Education of the Waterloo Data Science Club. One of my main goals is to foster a more welcoming environment for minorities to enter the field, so if you are a minority in data science or know someone who would be willing to speak at an event, shoot me a message!

In my previous co-op, I worked as a Data Science Intern at Goldspot Discoveries Inc. in Toronto. Goldspot is revolutionizing the mining industry by applying machine learning techniques to this niche field.

Beyond my interests in math and data science, I find that my hobbies contribute to my skills as well. I love pen sketching and watercolor painting, and I have found that the attention to detail and focus these pieces require also helps when digging into data. If you would like to see some of my art, you can check out my portfolio (COMING SOON).

I am currently looking for Summer 2020 Data Science positions. If you have any opportunities available, I can be contacted via e-mail or LinkedIn!

skills



Languages

Python

R

SQL

MATLAB

HTML/CSS

Tools

pandas, NumPy, rasterio, SQLAlchemy

matplotlib, Plot.ly, Tableau

fast.ai, spaCy, NLTK, scikit-learn, scikit-image

Beautiful Soup, Selenium, Requests

Microsoft Azure

Microsoft Excel

experience


Goldspot Discoveries Inc

Data Science Intern, Sept 2019 - Present

Toronto, ON

  • Trained an NLP classifier using fast.ai to perform sentiment analysis on 18,000+ gold-related tweets with 85% accuracy
  • Visualized the sentiment distribution, U.S. geographic segmentation and top Twitter users over a selected date range using Plot.ly
  • Applied PCA, K-Means Clustering, SLIC and multi-band ratios to satellite images for mineral composition analysis using scikit-learn, scikit-image and rasterio
  • Grouped insider score data into 20 bins and calculated the population stability index for a Plot.ly multi-plot visualization, deployed on an internal website using Flask
  • Built interactive time-series plots for stock price technical analysis with 20-day, 50-day and 200-day moving averages and relative indexes using Plot.ly
  • Automated paper trading by processing stock price data and implementing a Reinforcement Learning model that paper trades through the Interactive Brokers API
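For readers unfamiliar with the population stability index mentioned above, here is a minimal Python sketch of the usual calculation: bin a baseline sample and a recent sample the same way, then sum (actual - expected) * ln(actual / expected) over the bins. This is not the code used at Goldspot. The function names, the equal-width binning, the small floor for empty bins and the sample scores are all assumptions made for illustration.

```python
import math


def bin_proportions(values, lo, hi, n_bins=20):
    """Share of values in each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = int((v - lo) / width)
        # Clamp so values at or beyond the edges land in the end bins.
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def population_stability_index(expected, actual, eps=1e-6):
    """PSI between two lists of bin proportions of equal length."""
    psi = 0.0
    for e, a in zip(expected, actual):
        # Floor empty bins at eps (an assumption) to avoid log(0).
        e = max(e, eps)
        a = max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


# Made-up scores in [0, 1), purely for illustration.
baseline_scores = [i / 100 for i in range(100)]
recent_scores = [min(s + 0.3, 0.99) for s in baseline_scores]

baseline_bins = bin_proportions(baseline_scores, 0.0, 1.0)
recent_bins = bin_proportions(recent_scores, 0.0, 1.0)
psi_value = population_stability_index(baseline_bins, recent_bins)
```

A PSI of zero means the two binned distributions are identical, and the value grows as the recent sample drifts away from the baseline.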
Royal Bank of Canada

Strategic Initiatives Analyst Co-op, Jan 2019 - Apr 2019

Toronto, ON

  • Automated team reporting by deriving metrics from GitHub verbiage stored in SQL server and visualizing them in a Tableau dashboard to support management decisions
  • Developed Python scripts using Selenium, Beautiful Soup and Requests for dynamic web scraping to find and download all article files from a website into a directory
  • Built a Tableau storyboard of event registration data for marketing purposes, showing demographic and geographic analysis of registrants
projects



News Category Classifier

News-category classifier trained on 60k headlines to be used for categorizing news-related tweets.

  • Trained a news category classifier using Multinomial Naïve Bayes, SVM (Linear, Polynomial, Gaussian), Multinomial Logistic Regression, and Random Forest based on 60,000 HuffPost headlines
  • Tuned each model's hyperparameters with GridSearchCV, reaching a maximum accuracy of 88% with Linear SVM
  • Engineered features using word-count vectors and TF-IDF-weighted word-frequency vectors
  • Currently collecting and labelling Mississauga News tweets in order to apply transfer learning and categorize news-related tweets
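To illustrate the TF-IDF features mentioned above, here is a minimal standard-library sketch. It is not the project's code, and it uses the plain textbook weighting (term frequency times ln(N / document frequency)) rather than the smoothed variant that scikit-learn applies by default. The function name and the three headlines are made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Terms that occur in every document get an IDF of zero.
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {term: (count / len(tokens)) * idf[term] for term, count in counts.items()}
        )
    return vectors


# Made-up headlines, not taken from the HuffPost dataset.
headlines = [
    "stocks rally as markets open",
    "team wins final as fans cheer",
    "markets fall as rates rise",
]
vectors = tfidf_vectors(headlines)
```

Words that appear in every headline carry no weight, while words unique to one headline are weighted most heavily, which is what makes these vectors useful inputs for the classifiers listed above.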
Posturizer

Posturizer is a web app that takes a photo of a user, identifies their posture and speaks back providing appropriate advice

  • Preprocessed the collected training images with OpenCV, detecting and blurring faces to preserve privacy and improve the model's accuracy
  • Trained a Microsoft Azure AI Custom Vision model on the preprocessed training data to classify five postures (leaning forward, back, right, left and good) with 91.4% accuracy
Project # 3

This project implements PCA, K-Means Clustering and Band Ratios via command-line interfaces.

  • Principal Component Analysis (PCA): Takes a .TIF image filepath, the number of components, an output filepath and the list of bands the PCA should be applied to, then exports the image to the output filepath
  • Band Ratio Application: Takes a .TIF image filepath, ratio equations referring to the bands (n) and an output filepath, then exports the n-band image to the output filepath
  • K-Means Clustering: Takes a .TIF image filepath, the number of clusters, an output filepath and the list of bands to use for clustering, then exports the image to the output filepath
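As a rough idea of what the PCA command-line interface described above could look like, here is a minimal argparse sketch. The argument names (input_path, output_path, --n-components, --bands) and the example filenames are hypothetical, not the project's actual interface, and the image reading and PCA step are deliberately left out.

```python
import argparse


def build_parser():
    """Argument parser for a hypothetical PCA command (names are assumptions)."""
    parser = argparse.ArgumentParser(
        description="Apply PCA to selected bands of a .TIF image (sketch only)."
    )
    parser.add_argument("input_path", help="Path to the input .TIF image")
    parser.add_argument("output_path", help="Path the result is exported to")
    parser.add_argument(
        "--n-components",
        type=int,
        required=True,
        help="Number of principal components to keep",
    )
    parser.add_argument(
        "--bands",
        type=int,
        nargs="+",
        required=True,
        help="Band indices the PCA is applied to",
    )
    return parser


# Parsing an explicit argument list, as the shell would pass it.
args = build_parser().parse_args(
    ["scene.tif", "scene_pca.tif", "--n-components", "3", "--bands", "1", "2", "4"]
)
```

The band ratio and K-Means commands could follow the same pattern, swapping --n-components for the ratio equations or the number of clusters.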
Optimal Dimensionality Reduction Algorithm Report

A report comparing PCA and LDA to determine when each algorithm is the better choice

volunteering


Waterloo Data Science Club

VP of Education & Events, Jan 2020 - Present

Waterloo, ON

  • Starting a “Diversity in Data Science” series to encourage women, LGBTQ+ and other minorities to enter the field
Big Brothers Big Sisters of Canada

Teen Mentor/Leader, Sept 2014 - Present

Mississauga/Waterloo, ON

  • Mentored students from the Peel and Waterloo regions and organized fun activities to develop their interpersonal skills
hobbies



Coming Soon...

Watercolor Painting, Pen Sketching, Hiking and Biking