I am a Computer Science (Big Data Systems) graduate student skilled in Machine Learning, Data Structures and Algorithms, ETL,
Python, and SQL. I am actively looking for internship/co-op opportunities in software development for Spring 2021 (starting January), as well as full-time opportunities from August 2021.
-
Experience
Tata Consultancy Services, India Jun 2018 – Dec 2019
Worked as a Data Engineer; responsibilities included:
● Automated jobs; developed code, mappings, and Unix and Python scripts; loaded data after the cube refresh; and
migrated analytic workloads from on-premises servers to the cloud.
● Converted existing Informatica PowerCenter mappings and ETL to be compatible with Snowflake Database.
● Converted PL/SQL procedures (SCD1, SCD2, etc.) to Informatica mappings and, at the same time, created Snowflake
functions at the database level for optimum performance of the mappings.
● Analyzed dependencies across source systems, tables, and data sets, and coordinated with other system owners
on day-to-day ETL progress monitoring. Created, executed, and documented unit test scenarios for ETL.
-
Projects
Gesture Recognition – [Python – Pandas, TensorFlow], ASU Jan 2020 – Apr 2020
● Developed an application to convert videos of ASL signs to JSON data and developed a RESTful API to predict the sign.
● Trained several machine learning models (Support Vector Machines, Random Forest, Linear Discriminant Analysis)
on the JSON data of the recorded videos to predict each sign, achieving an accuracy of 78%.
Mobile Offloading – [Java], ASU Jan 2020 – Apr 2020
● Developed master and slave applications along with the distributed computing infrastructure to multiply two matrices
in a distributed manner on the slave applications, and integrated the results into the master application concurrently.
● Implemented monitoring of the battery level and location of slave mobiles in the master application, disconnecting a slave
mobile based on threshold values.
BigTable-like DBMS – [Java, SQL], ASU Jan 2020 – Apr 2020
● Created heap files and index files to organize the data, using Minibase as a foundation.
● Implemented a command-line program supporting batch insert, map insert, row join, sort, row sort, and queries with equality
search and range search, along with sorting on different columns.
Visualization System – [Java, Python, D3, HTML, CSS, JavaScript, Tableau] May 2020 – Jun 2020
● Performed data analysis, aggregation, and transformation, and extracted patterns.
● Built a visualization system that visually encodes the extracted patterns and sequences for analyzing clickstream data.
Geospatial Operations – [Scala, Hadoop, Apache Spark, MongoDB] May 2020 – Jun 2020
● Performed geospatial operations on a distributed data processing system using a cluster spanning 3 machines.
● Processed more than 3 GB of the NYC taxi-trip dataset and identified taxi hotspots using the MapReduce algorithm.