I am a graduate student pursuing a Master's in Computer Science at the University of Texas at Dallas. I enjoy solving problems and building useful applications. I am particularly interested in applying Data Science tools and techniques.
Programming Languages : C, Java, Python (Scikit-learn, SciPy, Pandas, NumPy, Matplotlib, TensorFlow, Keras), R, Scala
Tools/IDEs : Databricks, Power BI, Eclipse, IntelliJ, Visual Studio, MATLAB, LabVIEW
Web Languages/Technologies : HTML5, CSS, Bootstrap, RESTful API, XML, JavaScript, jQuery
Big Data Technologies Used : Microsoft Azure, Amazon S3, HDFS, Apache Spark, Apache Hive, Apache Kafka
-
Experience
Software Developer (R&D) Intern, ANSYS Inc. JAN 2020 – APRIL 2020
• Contributed towards the Research and Development of the Analytics Workbench for the Electronics Business Unit.
• Utilized PySpark on Azure Databricks' Apache Spark engine to process and clean up raw data. Reduced the run-time of most dashboards by 20%-50% and the run-time of the main dashboard by 85% by optimizing queries.
• Utilized Power BI and Databricks to create visualizations.
• Developed maintainable, readable code for running experiments and proofs-of-concept.
• Created new workflows and documented all key changes.
• Worked with very large, high-dimensional datasets using Apache Spark.
Graduate Teaching Assistant, UT Dallas : SEP 2019 – DEC 2019
• TA for the course CS 3340 at UT Dallas (Undergraduate Computer Architecture).
• Graded assignments and projects and performed other student evaluations.
• Helped students with queries and provided additional materials and guidance during office hours.
Graduate Student Instructor, UT Dallas : SEP 2018 – AUG 2019
• Instructed school students in programming fundamentals and mathematics through vibrant teaching methodologies.
• Helped organize and conduct coding camps at the college for students in Grades 1-10 (levels 1, 2 and 3).
• Developed new materials for the Robotics camps. Taught camps on basic Python and Arduino as well.
-
Projects
Text Summarization (Apache Spark, Scala, Amazon EMR, Amazon S3) SEP 2019 – DEC 2019
• Represented sentences in vectorized format and found the similarities between them using cosine similarity.
• Implemented PageRank algorithm to assign ranks to all sentences.
• Classified the sentences as either summary text or non-summary text using decision trees.
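The first two steps of this project (vectorizing sentences, comparing them with cosine similarity, and ranking them with PageRank) can be illustrated with a minimal Python sketch. It is not the project's Spark/Scala code: the bag-of-words vectorization, the function names (vectorize, cosine, rank_sentences), the damping factor of 0.85 and the iteration count are all assumptions chosen for illustration, and the decision-tree classification step is left out.

```python
import math
from collections import Counter


def vectorize(sentence):
    """Bag-of-words term counts (an assumed, simple vectorization)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank over a similarity graph."""
    vectors = [vectorize(s) for s in sentences]
    n = len(sentences)
    # Edge weight between two different sentences is their cosine similarity.
    sim = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge j -> i. Sentences with no edges
            # pass nothing on (a simplification in this sketch).
            incoming = sum(
                scores[j] * sim[j][i] / out_weight[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


if __name__ == "__main__":
    example = [
        "spark processes large data",
        "spark processes data quickly",
        "large data needs spark",
        "bananas are yellow",
    ]
    for score, sentence in sorted(zip(rank_sentences(example), example), reverse=True):
        print(f"{score:.4f}  {sentence}")
```

In this sketch, a sentence that shares no words with the others receives only the baseline score, so it ranks last; the higher-ranked sentences are the candidates a later classification step would consider as summary text.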
Dog/Cat Image classifier (Python, Databricks) AUG 2019 – SEP 2019
• Utilized Deep Learning pipelines (API) in Apache Spark. Trained the model using a large dataset of dog and cat images.
• Stored large training datasets on Amazon S3 and read them through Databricks to train the classifier.
• Achieved a test accuracy of 98% with the classifier.