I recently completed my Master's in Information Technology and Management, with a focus on Business Intelligence and Analytics, at The University of Texas at Dallas, which gave me a solid quantitative background and a strong foundation in computing. I also completed a 2-year technology entrepreneurship program at The Indian School of Business, so I bring a focused, entrepreneurial mindset to solving real-world challenges. Success in science fairs during high school first drew me toward data. I have over 3 years of experience handling and analyzing data as a Data Scientist, Data Engineer, and Business Intelligence Analyst, and I understand how work gets done in different organizations, having worked in student organizations, start-ups, and large multinational companies.
-
Experience
tekVizion, Plano, TX, USA, from May 2019 to December 2019 as a Data Science (Python) Intern:
• Simulated network-switch level optimizations to identify categories with maximum impact using Machine Learning models.
• Extracted clean, real-time data from Oracle SQL and SAP databases via ETL and built a data collection system.
• Performed exploratory analysis on terabytes of data from network switches and routers using Big Data tools such as Sqoop, Hadoop, and Hive.
• Performed end-to-end analysis, from data extraction to dashboard creation. Created an automated report generator.
• Coordinated product launches with cross-functional teams, resulting in a 50% reduction in issues. Spread best practices to
analytics and product teams; monitored key product metrics and investigated the root causes of changes in those metrics.
Netcracker Technology Solutions, from July 2016 to July 2018 as a Data Engineer:
• Delivered MRE (Marketing Rules Engine), a software product, for AT&T. Automated the regression testing of
Rating and Billing using Python and SQL, enabling account creation and the rating and billing of events in under 3 minutes.
• Worked on Equipment Data Warehousing (eCDW) of transactions; performed data modeling using Erwin.
• Utilized Sqoop and Flume to extract data from the Oracle production database and stored the transformed data in HDFS.
• Aggregated and analyzed large-scale structured and unstructured data using SQL and Python. Experienced in data mining and in
communicating summary metrics and reports from a dataset. Automated weekly report generation and visualization using Tableau.
• Coordinated customer calls, gathered requirements, and analyzed them. Participated in customer and financial analysis of the product.
• Led 31 Rapid Deployments and delivered them with zero production defects. Honored with the ‘You’ve Made the Difference’
award for outstanding effort. Strong knowledge of STLC and SDLC, with practical experience in Agile.
-
Projects
Sentiment Analysis on Demonetization using Big Data:
• I analyzed public opinion on the effects of demonetization using tweets collected from Twitter.
• After importing the data from Twitter, I classified each tweet as strong positive, weak positive, neutral, weak negative, or strong negative based on its sentiment (a classification sketch follows the steps below).
• In this particular project:
1. Utilized Sqoop to import the data from Twitter.
2. Utilized HDFS (Hadoop Distributed File System) to store the data.
3. Used Hive-QL, which is similar to SQL, to analyze the data.
4. Stored the results back in HDFS.
5. Finally, used Tableau to visualize the results by connecting it to HDFS.
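The five-way bucketing in the steps above can be illustrated with a short Python sketch. This is a minimal, hypothetical example: it assumes the tweets have already been pulled out of HDFS into a local list of strings, and it uses the TextBlob library's polarity score with illustrative thresholds, not the project's actual classifier.

```python
# Minimal sketch of five-way sentiment bucketing; thresholds are illustrative.
from textblob import TextBlob

def classify_tweet(text: str) -> str:
    """Map TextBlob polarity (-1.0 .. 1.0) to one of five sentiment buckets."""
    polarity = TextBlob(text).sentiment.polarity
    if polarity >= 0.5:
        return "strong positive"
    if polarity > 0.0:
        return "weak positive"
    if polarity == 0.0:
        return "neutral"
    if polarity > -0.5:
        return "weak negative"
    return "strong negative"

# Hypothetical tweets standing in for data pulled from HDFS.
tweets = [
    "Demonetization is a bold and brilliant move!",
    "Standing in ATM queues all day, this is a disaster.",
    "Banks reopened today.",
]
for tweet in tweets:
    print(f"{classify_tweet(tweet):>15} | {tweet}")
```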
Audit Risk Calculation in a financial firm using Machine Learning:
• I calculated audit risk in a financial firm using Machine Learning techniques.
• I used a real Bank of America dataset from Kaggle.
• Initially, I cleaned the data; roughly 70% of my time went into cleaning it to meet the requirements.
• Later, I analyzed the pre-processed data using Machine Learning techniques such as regression, classification, and deep learning models in Python.
• Finally, I optimized the results using techniques such as gradient boosting, dimensionality reduction, ensemble learning, and model evaluation (see the sketch below).
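A minimal sketch of what the modeling stage could look like in scikit-learn. The file name and the binary "Risk" label column are assumptions for illustration, and gradient boosting stands in for the broader set of techniques listed above.

```python
# Hypothetical modeling step: cleaned CSV in, classification report out.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report

df = pd.read_csv("audit_risk_clean.csv")   # hypothetical cleaned dataset
X = df.drop(columns=["Risk"])              # feature columns
y = df["Risk"]                             # assumed binary risk label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Gradient boosting, one of the optimization techniques named above.
model = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05)
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test)))
```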
Storytelling using Tableau: Flight Delays Visualization:
• Visualized, using Tableau, how the percentage of delayed flights changes with time and location, and how many delays each cause accounts for.
• The dataset contains information on United States flight delays and performance from 2013 through August 2017.
• There are 5 main causes of delay: air carrier, National Aviation System (NAS), weather, late-arriving aircraft, and security.
• The airports are grouped by size and location to show how both factors influence flight delays (an aggregation sketch follows this list).
• Readers can click on the map to interact with the charts.
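As an illustration of the aggregation behind the dashboard, here is a minimal pandas sketch. The file name and column names are assumptions, not the actual schema of the dataset used in the project.

```python
# Hypothetical aggregation behind the "delay over time" and "by cause" views.
import pandas as pd

flights = pd.read_csv("flight_delays.csv")   # hypothetical file and columns

# Share of flights delayed per month; "delayed" is assumed to be a 0/1 flag.
monthly = (
    flights.groupby("month")["delayed"]
           .mean()
           .mul(100)
           .rename("pct_delayed")
)

# Count of delays attributed to each of the five causes.
causes = ["carrier_ct", "nas_ct", "weather_ct", "late_aircraft_ct", "security_ct"]
by_cause = flights[causes].sum().sort_values(ascending=False)

print(monthly.head())
print(by_cause)
```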
Modeling and Data Warehousing of a Movie Database using MySQL:
• Designed and developed a movie database using the MySQL relational database.
• Designed the Conceptual, Logical, and Physical Data Models. Executed data modeling using the ERwin data modeling tool and created ER diagrams.
• Analyzed source data to determine what metadata was to be included in the Logical Data Model.
• Engaged in data profiling to integrate data from different sources.
• Created a data warehouse architecture using a Star schema that can be used for Analytical Processing. Created the data models for OLTP and analytical systems (a schema sketch follows this list).
• Designed and developed Data Marts organized by movie genre.
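A minimal star-schema sketch in the spirit of this design. The project used MySQL, but the example below uses Python's built-in sqlite3 module so it runs anywhere; the table and column names are illustrative, not the actual project schema.

```python
# Hypothetical star schema: one fact table keyed to three dimension tables.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_movie (
    movie_key    INTEGER PRIMARY KEY,
    title        TEXT,
    genre        TEXT,        -- genre drives the data marts
    release_year INTEGER
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    year      INTEGER,
    month     INTEGER
);
CREATE TABLE dim_theater (
    theater_key INTEGER PRIMARY KEY,
    name        TEXT,
    city        TEXT
);
-- Fact table at the grain of one screening: foreign keys to every
-- dimension plus additive measures, the classic star-schema layout.
CREATE TABLE fact_screening (
    movie_key    INTEGER REFERENCES dim_movie(movie_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    theater_key  INTEGER REFERENCES dim_theater(theater_key),
    tickets_sold INTEGER,
    revenue      REAL
);
""")
print("star schema created")
```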