I am a computer scientist from BITS Pilani Dubai with a Master's in Applied Data Science from New York University. My academic and professional career has focused on data, and I have worked in all stages of the data lifecycle, from ingestion to visualization.
-
Experience
Data Engineer (Amplo Global Inc.)
• Building and testing an audio-to-text classification component of the product to simplify the customer experience.
• Building and automating the ETL process using Azure Data Factory triggers and pipelines.
Graduate Student Researcher (McDevitt Research Group, New York University)
• Automated the download and storage of more than 20,000 trauma-related Web of Science PDFs using Python and Selenium.
• Consolidated data from Web of Science, Wiley, PubMed, ScienceDirect and bioRxiv into an SQLite database.
• Filtered out foreign articles and classified the remaining PDFs as animal- or human-related using Python.
Graduate Student Researcher (Sounds of New York City, New York University)
• Compared the trend in noise complaints related to the Washington Square Park water main construction with noise levels recorded by sensors in the area, using Python.
• Created a dashboard with Dash and Python that lets users investigate trends in noise complaints and identify nearby street construction permits that could be a contributing factor.
Product Development Analyst (DialoggBox)
• Built an interactive, macro-driven Excel dashboard that served as a template for the front end of the application.
• Developed functions linking the back end of the application on Amazon EC2 to an external API using SQL and Flask.
• Developed a text classifier to distinguish professional from personal conversations using Python.
Urban Planning Intern (DRAWBrooklyn)
• Proposed a data-driven approach to estimating the impact of a rezoning on Citi Bike, taxi and subway ridership using publicly available data and Python.
Business Operations Intern (Merck Sharp & Dohme)
• Created and maintained a data warehouse for sales and product data using SQL, Excel and VBA macros.
• Created monthly summary dashboards and quarterly forecast-setting dashboards using Tableau.
• Worked with 13 managers in Saudi Arabia to create a common yearly forecast dashboard using Excel and Tableau.
Business Analyst Intern (Scan Technology L.L.C.)
• Teamed with interns to build, clean and visualize a dataset of colleges around the world that would allow prospective students to pick their ideal college.
• Used Tableau to create dashboards highlighting KPIs from different data sets.
-
Projects
Noise Inspection Scheduling (New York University)
Worked with the Department of Environmental Protection (DEP) to improve the percentage of enforced 311 construction noise complaints from 3% with the help of inter-agency data such as After Hours Construction permits. Created a dashboard using Python, machine learning models, ArcGIS and Tableau.
Correlation of drug-related arrests and Twitter data (New York University)
Used PySpark to classify 100 million geocoded tweets (18 GB) by the presence of drug-related keywords and checked for a correlation with drug-related arrests in the neighborhood.
Graffiti Art vs Vandalism (New York University)
Used 311 complaints, NYPD Arrests and Flickr images to classify whether graffiti is perceived as a positive or negative influence and correlated this with crime in the neighborhood. We observed a Spearman’s rank correlation coefficient of -0.68, using Hadoop MapReduce, Hive and Python.