As an accomplished data professional with a background in the consulting and insurtech industries, I am passionate about developing products with data-driven insights. With over 2 years of experience, I have delivered multiple data integration and analytics projects in the healthcare and insurance domains.
Currently, I am working as a Data Analyst intern on the data services team at Terrene Labs, LLC. My work involves statistical analysis using Python and MS Excel to identify client risk profiles and support better underwriting decisions. My role also involves wrangling and cleansing raw data to create client profile reports and preparing data to improve the firm's analytic engine for identifying risks.
Previously, I worked at Deloitte Consulting US-India Pvt. Ltd. as a Business Technology Analyst in the strategy and analytics service area, where I delivered big data integration projects in the healthcare and insurance domains using SQL, Apache Spark, Hadoop, Hive, and MS Excel. In this role, I analyzed healthcare claims data and developed Spark applications in Scala that loaded data into Hive tables used in client reports. Additionally, I developed test plans and test cases using MS Excel and Hive, achieving zero defects in client reports.
I have also delivered multiple academic projects in R, SAS, Tableau, and Power BI during my Master’s in Information Systems at the University of Cincinnati, where I gained in-depth knowledge of Data Analytics and Business Intelligence. The curriculum, with specialized courses such as Data Warehousing and Business Intelligence, Big Data, and Data Wrangling, has exposed me to the latest conceptual developments and industry best practices.
Proficiency and skills include:
• Languages: SQL, Python, R, SAS, HiveQL, Spark SQL, Core Java, PySpark, Scala
• Analytical/Reporting tools: R, Python, Tableau, SAS, Power BI, Apache Spark, Hive, MS Excel
• Databases/Technologies: MS Excel, Oracle DB, MySQL, Microsoft Access, Hue, Hadoop, HDFS
• Data Analysis: Reporting, Data Wrangling, Data Visualization, ETL, Business Intelligence, Validation
• Business Analysis/Other tools: Requirements gathering, Agile, JIRA, MS Project, Jupyter Notebook, Eclipse IDE, PuTTY
-
Experience
Terrene Labs, LLC
Data Analyst Intern
• Performed statistical analysis to identify risk profiles of clients and provide better underwriting decisions and results
• Extracted and wrangled web-scraped data using Python to create client profile reports with an accuracy of 90%
• Prepared data mappings from multiple sources to load data to the firm's analytic engine for identifying risks
Deloitte Consulting US-India Pvt. Ltd.
Business Technology Analyst
• Applied agile practices to report recovered health care fraud, waste & abuse claims, generating $XXX M and a 30% increase in new customers
• Liaised between business and technical teams by documenting source-to-target mappings and process flows
• Performed cleansing, wrangling, and profiling using MS Excel prior to transformations, optimizing loads by 60%
• Created Spark applications using Scala to load target tables for data consumption by client reports with 75% efficiency
• Built and triggered required Spark scripts utilizing Scala shell commands to perform ETL loads on Hive tables
• Contributed to the development of a Hadoop test automation framework using Spark, accelerating QA tasks by 50%
• Prepared test plans and test cases and created HiveQL/SQL queries to perform SIT, achieving 95% on data quality checks
• Identified, logged in the JIRA Agile application, and resolved multiple defects, reducing the defect count in client reports to 0
• Mentored and trained team members on Spark, Hadoop and other big data tools used in analytical processing
-
Projects
Telecom churn prediction – R Studio
• Performed cleansing and exploratory data analysis to determine the key drivers which influence the churn rate
• Built logistic regression and ANN models to predict customer churn with accuracies of 88% and 93%, respectively
Online retail customer clustering – R Studio
• Cleansed and visualized the dataset to prepare customer data based on RFM (recency, frequency, and monetary value)
• Segmented RFM data by K-Means and Hierarchical clustering to target customer groups efficiently
Spotify - music tracks according to mood – R Studio
• Gained key insights into the Spotify music catalog by identifying the most popular tracks and their artists in each genre based on mood factors like danceability, energy level, and positivity, with an accuracy of 95%
Humanitarian aid – Power BI
• Created visualizations using join queries and complex calculations to gain insights into countries in need of relief based on child mortality, life expectancy, GDP, inflation rate, etc.
International students in the US – Tableau
• Designed interactive dashboards to gain insights into student migration statistics in the US over 5 years (2013-2018)
• Analyzed factors like students by academic level, % enrollment deviation, top countries, etc. to identify patterns by 85%
House price prediction – Python (Jupyter Notebook)
• Analyzed house sales by wrangling and visualizing features like bedrooms, bathrooms, square footage, floors, etc.
• Fitted a linear regression model to predict house prices with a model fit accuracy of 87%