MADHUR KAPOOR
Student - Data Enthusiast - Ever Learning

Know Me!

Hello There! Welcome to my humble abode on the internet.

I am a Computer Science graduate student at the University of California, San Diego, specializing in Data Science, Distributed Systems, and Machine Learning.

I am enthusiastic about:

  • Java and Python Development
  • Large scale data analytics
  • Healthcare analytics, Adtech
  • Hadoop ecosystem and other distributed
    processing systems
  • Data Science and Visualization
  • Scikit-Learn
  • Fullstack (MEAN) Development
  • Budding JavaScript Libraries
  • Trekking & Traveling

My Skills

Java

Python

Hadoop Ecosystem

MySQL

JavaScript

Node.js

C++

C

PHP

PowerShell

CouchDB

Linux

DATA SCIENCE, MACHINE LEARNING, ACADEMIC RESEARCH

My Work So Far

    Yahoo! Inc.

    (Intern)
  • Built an ad-inventory forecasting model for guaranteed ads based on stochastic streaming algorithms (sketches), which delivered a 3-fold improvement (accuracy, disk space, infrastructural simplicity) over the existing sampling-based system.
  • Collaborated directly with the tech lead and architect on this project, which culminated in a paper submission to Yahoo! Tech Pulse 2016.

    San Diego Supercomputer Center

    (Graduate Student Researcher)
  • Analyzed census-level data across 5 years to aid public health officers in data-driven policy-design decisions. My work was funded by the Robert Wood Johnson Foundation.
  • Developed a data pipeline to extract the most meaningful factors affecting health outcomes such as diabetes and life expectancy, and evaluated those factors by comparing them from a theoretical standpoint. The work involved similarity analysis, feature engineering and selection schemes, and predictive analytics.

    Qulinary

    (Part-time Freelancing Software Engineer)
  • Working on an Ionic and Cordova app for real-time driver tracking and route optimization, using AngularJS and Leaflet.js.
  • Implemented an on-demand route optimization service to suggest an optimal order of traversal to drivers and aid them in deliveries.

    Citrix R&D

    (Software Test Engineer I)
  • Developed automation for the Personal Virtual Disk and Personalization product line from scratch up to 42%, reducing the testing time of each build by minimizing manual work.
  • Designed and implemented the UPM Troubleshooter (released as a hotfix), which lets customers and admins debug their UPM setup and significantly reduced the round-trip time for addressing customer issues.

    Artoo

    (Software Engineering Intern)
  • Engineered an admin dashboard for the client, Ujjivan Financial Services, to track the activity of their field agents in providing loans to Bottom of Pyramid entrepreneurs.
  • Incorporated a lazy module loader into the application to significantly improve boot time.

    Indian Statistical Institute

    (Research Intern)
  • Devised a model to predict spurious citations in research work submitted to conference proceedings. The predictions were made using text-processing methods such as TF-IDF.

Projects

Implemented a distributed file system from scratch, powered by a self-made RMI library to support remote calls. Supported operations include file CRUD, naming, auto-replication, and locking to ensure consistency. Know more

Predicted the top pick-up zones at any given hour, using 6 months of Uber pickup data from NYC. Know more

Predicted the helpfulness of a review (regression) and the rating that a user would give to an item (using latent factor models).

Handwriting-based collaboration framework: multiple remote parties collaborate using pen and paper in real time. Click to know more

A QR code-based system to provide educational solutions for students who can't afford private tutoring. Click to know more

A mini search engine that queries an organization's textual data and returns the related documents, using a single-node Hadoop cluster.

WORK EXPERIENCE