I am a Master's candidate at Carnegie Mellon University's School of Computer Science, majoring in Computational Data Science. I am currently looking for full-time roles in Deep Learning and Machine Learning.

My major areas of interest include Natural Language Generation and Recommendation Systems. I have worked on applications of Machine Learning in multiple areas, such as Natural Language Processing, Computer Vision and Finance.

As part of my master's capstone project, I am currently working with Bosch Research on generating natural language from unstructured datasets such as a list of words.

I am also part of Professor Fei Fang's team at CMU, where we are building a petition recommendation model for social media posts. Our team won the best paper award for this work at Harvard's AI for Social Impact Workshop 2020.

EXPERIENCE
2020-2020

Data Science Intern

CLOUDFLARE

I work with the Business Intelligence team, developing models to recommend products to users based on their traffic statistics and demographics.

2017-2019

Quantitative Associate

GOLDMAN SACHS

I worked with the Risk Informatics team based in Bengaluru, India. My job included developing models to identify patterns in trading and market data and to identify avenues of market risk, with a special focus on the Commodities asset class.

2017-2018

Senior Member Technical

ARCESIUM (D.E. SHAW GROUP)

I worked with the Trade Reconciliation team as a full-stack developer, focusing on data engineering for the trade reconciliation platform. A side project of mine aimed to deploy a blockchain for simulating smart contract transactions.

EDUCATION
2019-2020

Master of Science

CARNEGIE MELLON UNIVERSITY

SCHOOL OF COMPUTER SCIENCE

Computational Data Science

2011-2015

Bachelor of Technology

INDIAN INSTITUTE OF TECHNOLOGY (BHU) VARANASI

Computer Science and Engineering

COURSEWORK

Deep Learning

Cloud Computing

Computational Ethics for NLP

Multilingual Natural Language Processing

Machine Learning

Neural Networks for NLP

Multimodal Machine Learning

Interactive Data Science

PROJECTS
Contextual Natural Language Generation 
  • Implementing UniLM for sentence generation for a set of concepts (from CommonGen dataset) with commonsense injection 

  • Seq2seq modeling for text generation from structured datasets (such as WikiBio) using autoencoders & prototype edit methods

Topic Classification in Speech Processing 
  • Trained a feedforward neural network to determine phoneme states from mel spectrogram frames of speech recordings 

  • Trained a CNN in PyTorch and tuned its hyperparameters for topic classification using GloVe word embeddings of the generated text

Anger to Constructive Criticism on Social Media 
  • The project aims to capture public anger on Twitter regarding social issues and convert it into constructive criticism by recommending relevant petitions to users. Awarded Best Poster at Harvard's AI for Social Impact Workshop 2020

  • Trained an ensemble model of SVM, Naïve Bayes and CNN to classify tweets and recommended petitions using Bag of Words

Contextual Natural Language Generation 
  • Deploying a language model (UniLM) for sentence generation from a given set of words from the CommonGen dataset, using attention-based approaches for commonsense injection

  • Seq2seq modeling for text generation from structured datasets (such as WikiBio) using autoencoders & prototype edit methods

Attention-based Speech-to-Text Generation
  • Trained a Pyramidal Bi-LSTM based Encoder-Decoder architecture to generate text for given speech utterances

  • Experimented with concepts like attention injection, Gumbel noise, teacher forcing and beam search

Anger to Constructive Criticism on Social Media 
  • Mined tweets regarding social issues and trained a BERT-based neural model to classify tweets for hate speech / toxicity

  • Performed topic modelling for theme detection and implemented a Petition recommender system based on Bag of Words

  • Awarded best Poster at Harvard's AI for Social Impact Workshop 2020

Face Classification and Verification
  • Trained & comparatively analyzed CNN-based architectures (MobileNet, AlexNet, ResNet variants) for face classification

  • Experimented with Cross Entropy, Triplet & Center loss functions, with a max verification accuracy of 0.93 on the CelebA dataset

Big Data Analytics on Twitter data
  • Designed a scalable friend recommender system on ~1 TB of user data, hosted on a SQL DBMS & tested on live queries

  • Performed MapReduce on AWS EMR for ETL & used TF-IDF for tweet analysis & PageRank for social graph analysis

Bias Identification and Mitigation in Text
  • Identified social bias in MultiNLI & SNLI datasets by PMI scoring & obfuscated bias using context-based unigram replacement

  • Evaluated bias in word embeddings (GloVe & polyglot) using WEAT & proposed adversarial training to debias embeddings

Ride-sharing Service ML Pipeline
  • Implemented an end-to-end ML Pipeline to match cab riders with drivers by deploying GCP ML APIs on Google App Engine

  • Predicted cab fares by training an XGBoost model with feature engineering, and tuned hyperparameters on Google AI Platform

Multilingual POS Tagging
  • Implemented a BiLSTM for POS tagging across 8 languages and experimented with GloVe, FastText & polyglot word embeddings to improve performance in a multilingual setup

YouTube Trend Analytics
  • Designed and implemented an analytical model to perform exploratory & statistical analysis on YouTube trending data 

  • The model used linear regression to analyze the factors that cause videos to trend and to identify biases in the data

  • Integrated visualization for data results using Tableau and developed a website showcasing the study results 

CONTACT ME

Pulkit Goel

Language Technologies Institute

School of Computer Science - Carnegie Mellon University

Phone:

412-628-2010

Email:

pulkitgo@cs.cmu.edu

pulkit.26mar@gmail.com
