Shubham Agrawal


Introduction

  • I am currently a Master of Science in Computer Science student at Columbia University. Before this, I worked as a Software Engineer at Goldman Sachs. I completed my undergraduate degree at NITK in May 2021.
  • My interests are broadly in Software Engineering, Machine Learning, Computer Vision, and Natural Language Processing.
  • I have experience with full-stack web development and with designing, testing, deploying, and maintaining software products.
  • I have worked on problems in the domains of medical imaging, information retrieval, visual question answering, and data mining.

Education

Columbia University in the city of New York

Aug 2022 - Dec 2023 (Expected)

MS in Computer Science (Machine Learning track)

CGPA: 4.25/4.0

National Institute of Technology Karnataka, Surathkal (NITK)

Jul 2017 - May 2021

B.Tech in Information Technology

CGPA: 9.79/10.00, Institute Gold medalist

Secondary High School - Ahmedabad Public School, Gujarat (CBSE)

Class XII - 93.2% (studied computer science)

High School - Kendriya Vidyalaya No. 1, Gandhinagar, Gujarat (CBSE)

Class X - CGPA: 10.0/10.0

Technical Experience

A10 Networks

May 2023 - August 2023

Senior Software Engineering Intern

  • Building an auto-scaling platform for A10 vThunder load balancers, driven by network data insights and user-defined thresholds and implemented as a finite state machine.
  • Testing algorithms to predict traffic flow and scale ADC automatically using platform metadata and logs.

Columbia University, New York

Sep 2022 - Present

Graduate Research Assistant (GRA)

  • Internet Real Time Lab (IRT), (advised by Dr. Henning Schulzrinne)
  • Researching and formulating evaluation techniques for Secure Digital Identity Options for Humanitarian Purposes in collaboration with Mastercard.

Goldman Sachs, Bangalore

Jun 2021 - Aug 2022

Software Engineer (Analyst)

  • Machine Learning & Automation team
  • Developed a No-Code platform leveraging Spring Boot, Angular, and MongoDB to automate business processes by deploying BPMNs.
  • Handled end-to-end testing, health-check, and deployment of the software product.
  • Integrated product with Prometheus, Grafana, internal Analytics, Alerting, and Ticketing platforms.
  • Worked on decommissioning a legacy platform containing more than 18k schemas and migrated them to alternative platforms.

Edumeister, California

Sep 2020 - Dec 2020

Research Intern

  • Structured content extraction from preformatted documents and study materials.
  • Collaborated with startup founders to devise a Machine Learning ensemble model to identify and classify 60,000 study documents.
  • Analyzed Natural Language Processing, Google Vision, and Amazon Textract OCR to extract questions & answers from unstructured documents.

Goldman Sachs, Bangalore

May 2020 - Jun 2020

Software Engineer, Intern (Summer Analyst)

  • Created new REST APIs to take a screenshot on a host server and stream it back to the user.
  • Developed REST endpoints to compress the log files and stream the zip as a byte array.
  • Added UI features such as real-time screenshots and log-file downloads using Angular.

IIIT Hyderabad

May 2019 - Jul 2019

Research Intern (Summer Research Fellow)

  • Created an annotation tool to automate the peacock feather's eye (ocelli) counting process using OpenCV, Flask, and Python.
  • Tested deep learning models such as CSRNet and MCNN, and worked with a biomedical scientist to uniquely identify peacocks by ocelli position and count to help prevent their extinction.

Central Research Laboratory, Bangalore

Dec 2018

Internship Project

  • Developed software to recognize faces using OpenCV, Haar cascades, and the LBPH algorithm in Python, and built an artificial neural network in MATLAB R2018b.

Bharat Electronics Ltd. Bangalore

May 2018 - Jun 2018

Project Trainee

  • Developed software to automate the testing of the Radar Warning Receiver (RWR) system of the Su-30 fighter aircraft.
  • Built a GUI in Visual Basic for the ATE (automatic test equipment), which automated the testing of the Dual Switch Amplifier (a component of the RWR).

Skills

Skills are listed in decreasing order of proficiency.

Languages/Markup and Scripts:

  • C/C++
  • Python
  • Java
  • HTML
  • CSS
  • JavaScript
  • Spring Boot

Technologies/Frameworks:

  • AngularJS
  • NodeJS
  • Kubernetes
  • Apache Kafka
  • Elasticsearch
  • AWS
  • Git
  • STL
  • OpenCV/OpenGL
  • MATLAB
  • MySQL
  • VMware vSphere

Machine Learning Tools:

  • Keras
  • TensorFlow
  • Scikit-Learn
  • Pandas
  • Hugging Face

Publications

Blockchain based Framework for Student Identity and Educational Certificate Verification

2nd IEEE International Conference on Electronics and Sustainable Communication Systems (ICESC)

  • Improved the security of the certificate verification system by using a student's unique ID and secret phrase.
  • Documents are linked to the student to add another layer of verification. Implemented the platform using smart-contracts, Metamask, Truffle, Ganache-cli.

Categorizing Relations via Semi-Supervised Learning using a Hybrid Tolerance Rough Sets and Genetic Algorithm Approach

Soft Computing for Data Analytics, Classification Model, and Control. Studies in Fuzziness and Soft Computing, vol 413. Springer, Cham.

  • Project in collaboration with Dr. Sheela Ramanna, University of Winnipeg, Canada.
  • Categorized relational words in text-based data, represented as pairs, using a semi-supervised learning algorithm in which a genetic algorithm optimizes the variables used in the TPL algorithm.

Content-based Medical Image Retrieval System for Lung Diseases using Deep CNNs

International Journal of Information Technology (BJIT)

  • Used deep learning and transfer learning to identify features and classify diseases from chest X-ray images.
  • Retrieved the closest image using metrics such as Chi-squared, Euclidean, and Cosine distance.

Machine Learning based COVID-19 Mortality Prediction using Common Patient Data

IEEE 7th International Conference for Convergence in Technology

  • Mortality prediction using chest X-rays and common, easy-to-obtain patient data such as age and gender. The proposed method achieved a classification accuracy of 92.6% and an AUPRC of 0.95.

Utilizing Deep Learning Models and Transfer Learning for COVID-19 Detection from X-ray Images

SN Computer Science Journal, Springer Nature

  • Developed a ResNet50-based transfer learning model with a multi-class classification accuracy of 87.2%.
  • Compared model interpretation methods such as GradCAM and LIME for reducing false positives.

Projects

Columbia Connect Application

  • Developed a scalable, distributed web application for university students using AWS services such as Cognito, Chime SDK, SQS, EventBridge, and Lambda, hosted using CloudFront, Route 53, and Certificate Manager.
  • Devised an intelligent student matching system using LinkedIn profiles and BERT model to generate keywords.

Smart Photo Album

  • Built a photo album web application that can be searched using natural language through text and voice. An intelligent search layer built with AWS Rekognition and Lex automatically adds image tags to OpenSearch.

Visual Question Answering

  • Developed an ML model that uses an image and a corresponding question to generate a relevant answer.
  • Used a CNN (pre-trained VGG-16) for image recognition and GloVe (Global Vectors) embeddings with an LSTM for the questions.

Doctors Prescription Handwriting Recognition

  • The project aims to automate the recognition of medicines prescribed by doctors, and with it the job of a pharmacist, using CNN models instead of existing OCR methods.
  • An annotation tool was also built to generate the data of a particular hospital.

Crop Cycle Parameters Extraction using Multi Temporal Data

  • Developed a semi-automated approach using Indian Remote Sensing satellite data and NDVI values to extract annual cropping patterns. Performed a pixel-wise study of parameters such as date of sowing, date of harvesting, and number of harvests based on the temporal profile.

DDoS detection and mitigation

  • Implemented van Emde Boas tree-based priority queues in Python to address the problem of DDoS attacks.

Achievements and Awards

  • Awarded the Institute Gold Medal for securing the highest Cumulative Grade Point Average in B.Tech (IT)
  • Selected for the Indian Academy of Sciences - Summer Research Fellowship ’19, awarded to 365 students nationwide in India
  • Ranked in top 4 at Smart India Hackathon Grand Finale ’19 (Software Edition) held at IIT Roorkee.
  • Ranked 3rd in Internal hackathon (Smart India Hackathon 2020) held at NITK Surathkal
  • Awarded CBSE Certificate of excellence from MHRD in class 10
  • All India Rank JEE (Advanced / Main): 6542 / 7983

Extracurricular Activities

  • Technical Team Lead - HackVerse 2020-2021, one of the largest student-organized hackathons in India
  • Executive Member at Institution of Engineers, NITK
  • Mentor for IE Summer Mentorship - Program to introduce concepts of Machine Learning to freshers
  • Project co-ordinator - Technites, college technical fest.
  • Teaching - Gave several talks for juniors on topics such as Git, an internship guide, and an ML roadmap.
  • Good speaking and writing skills
  • Other hobbies include playing badminton, sketching and coding

Contact

Location:

4C, 157 West, 106 Street, New York, NY, 10025

Last Updated on 30th March, 2023