Roshini Pulishetty

Projects

Transformer-based 3D Single Object Tracking (Mar-May 2024)

In this study, we evaluated transformers as the feature-fusion module and reimplemented the feature extractor in the classic Siamese-based architecture. Across object categories, the mean success/precision score improved to 66.0/45.5, compared with 60.0/42.4 for P2B (point-to-box), with a substantial gain on non-rigid objects such as pedestrians. However, P2B outperformed this model on rigid objects.

CODE REPORT

Breaking Bias: Quantify and Reduce LLMs Bias Towards Specific Options (Mar-May 2024)

This project examines the behavior of Large Language Models (LLMs) when answering multiple-choice questions. We studied two types of inherent bias on 8 models across 3 families (Gemma, Mistral, and Llama): "token bias", the tendency to choose an answer because of the option token it is associated with, and "position bias", the tendency to choose an answer because of the option's position. By running inference on permuted questions and applying a statistical evaluation framework that gauges incorrect likelihoods, recall imbalance, and proportion of plurality agreement, we measured the bias with 99.5% confidence. We concluded that token and position biases exist independently and that their complex interplay leads the LLM to choose an incorrect answer. We also showed that bias varies across models even within the same family, but is domain-independent. Our experiments showed that LoRA fine-tuning with permuted training data mitigates biases on out-of-domain data and is sample-efficient.
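As a rough illustration of the permutation idea described above, the sketch below rotates the option contents under fixed labels and records which label a model picks each time. The stand-in model `always_first`, the function names, and the cyclic permutation scheme are assumptions made for illustration; they are not the project's actual evaluation framework.

```python
from collections import Counter

LABELS = ["A", "B", "C", "D"]

def cyclic_permutations(options):
    """Yield every rotation of the option contents (labels stay fixed)."""
    n = len(options)
    for shift in range(n):
        yield [options[(i + shift) % n] for i in range(n)]

def position_pick_rates(model, question, options):
    """Fraction of permutations in which the model picks each label."""
    picks = Counter()
    perms = list(cyclic_permutations(options))
    for perm in perms:
        picks[model(question, dict(zip(LABELS, perm)))] += 1
    return {label: picks[label] / len(perms) for label in LABELS}

def recall_per_position(model, question, options, answer):
    """For each slot the correct answer lands in, record whether the model found it."""
    hits = {}
    for perm in cyclic_permutations(options):
        labelled = dict(zip(LABELS, perm))
        gold = next(label for label, text in labelled.items() if text == answer)
        hits[gold] = model(question, labelled) == gold
    return hits

# Hypothetical stand-in for an LLM: it always answers with the first option.
def always_first(question, labelled_options):
    return "A"

rates = position_pick_rates(always_first, "2 + 2 = ?", ["3", "4", "5", "6"])
recall = recall_per_position(always_first, "2 + 2 = ?", ["3", "4", "5", "6"], "4")
```

An unbiased model would find the correct answer in every slot; here recall is perfect only when the answer sits under "A". Because the labels stay attached to their positions in this sketch, it cannot separate token bias from position bias; doing so would also require permuting the label tokens.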

CODE REPORT

Image Colorization Using Auto-encoders (Nov 2023)

I designed and implemented auto-encoders for image colorization: given a gray-scale image, the task is to predict the colors for each segment. While linear auto-encoders tended to underfit the training data, the neural architecture improved performance. I trained a 5M-parameter neural architecture with Leaky ReLU and Tanh non-linearities and a Mean Squared Error loss function. I then tuned the hyper-parameters (learning rate, epochs, optimizer) using grid search.
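The grid-search step can be sketched as an exhaustive loop over the three hyper-parameters named above. The candidate values and the `toy_validation_loss` function are hypothetical placeholders; in practice the scoring function would train the auto-encoder and return its validation MSE.

```python
from itertools import product

def grid_search(train_and_score, grid):
    """Try every combination in `grid`; return the config with the lowest loss."""
    names = list(grid)
    best_cfg, best_loss = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        loss = train_and_score(**cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

# Hypothetical stand-in for "train the model and return validation MSE".
def toy_validation_loss(learning_rate, epochs, optimizer):
    penalty = 0.0 if optimizer == "adam" else 0.05
    return abs(learning_rate - 1e-3) * 100 + 1.0 / epochs + penalty

# Assumed candidate values, chosen only for illustration.
grid = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "epochs": [10, 50],
    "optimizer": ["sgd", "adam"],
}
best_cfg, best_loss = grid_search(toy_validation_loss, grid)
```

Grid search is simple and reproducible, but its cost grows multiplicatively with each added hyper-parameter (here 3 x 2 x 2 = 12 training runs).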

CODE

Mocking HDFS Cluster (Oct - Nov 2023)

We built a cluster of 5 data nodes, with a name node as the leader, in a Docker container to analyze fault tolerance, data consistency, and throughput. By concurrently sending around 100,000 read and write requests with different sets of nodes down, we observed how the cluster tolerates faults compared to a standard file system. The cluster provides consistent writes, though it does not guarantee consistent reads.

PRESENTATION

Anomaly Detection In Time Series Multivariate Data (Jan - May 2019)

To detect outliers in streaming time series, such as traffic data from 9 lanes collected over a year, I first conducted exploratory data analysis to understand the data, followed by data cleaning to fill in missing values. I then designed a data-driven clustering architecture that combines Self-Organizing Maps (SOM) with new Gray Relational Coefficients. We backed the Gray Coefficients with mathematical proofs and also built an intuition for why these coefficients are appropriate. Finally, we compared the model's precision, recall, and F1 score against a traditional SOM.
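For intuition, the sketch below uses the standard gray relational coefficient (Deng's formulation, with distinguishing coefficient `rho`) to pick a SOM's best-matching unit in place of Euclidean distance. This is the textbook form, not the new coefficients developed in the project, and the function names and sample values are assumptions. Inputs are assumed to be normalized to comparable scales.

```python
def gray_relational_grades(reference, candidates, rho=0.5):
    """Standard gray relational grade of each candidate against the reference.

    coefficient(k) = (d_min + rho * d_max) / (d(k) + rho * d_max),
    where d(k) is the absolute difference in dimension k and d_min / d_max
    are taken over all candidates and dimensions. The grade is the mean
    coefficient across dimensions.
    """
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:  # every candidate is identical to the reference
        return [1.0] * len(candidates)
    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades

def best_matching_unit(sample, weights, rho=0.5):
    """Index of the neuron whose weight vector has the highest grade."""
    grades = gray_relational_grades(sample, weights, rho)
    return max(range(len(weights)), key=grades.__getitem__)

# Hypothetical two-dimensional sample and three neuron weight vectors.
sample = [1.0, 2.0]
weights = [[1.0, 2.0], [3.0, 5.0], [0.0, 0.0]]
grades = gray_relational_grades(sample, weights)
bmu = best_matching_unit(sample, weights)
```

A higher grade means a closer match, so the best-matching unit maximizes the grade rather than minimizing a distance.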

Breakout Game Development (Sept - Oct 2018)

In this project, I developed a GUI-enabled interactive game where the model learns from the users' moves. Building on Q-Learning, the model mimics human behavior, with good actions rewarded and bad actions penalized.
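The reward-and-penalty idea corresponds to the standard tabular Q-learning update, sketched below. The state names, action names, and hyper-parameter values are hypothetical; how the game encodes its states and learns from user moves is not specified here.

```python
import random
from collections import defaultdict

class QTable:
    """Minimal tabular Q-learning agent with epsilon-greedy action selection."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.actions = list(actions)
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.epsilon = epsilon       # exploration probability

    def choose(self, state, rng=random):
        """Explore with probability epsilon, otherwise act greedily."""
        if rng.random() < self.epsilon:
            return rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

# Hypothetical paddle actions and state labels.
agent = QTable(["left", "stay", "right"], alpha=0.5, gamma=0.9, epsilon=0.0)
agent.update("s0", "right", reward=1.0, next_state="s1")   # good move rewarded
agent.update("s0", "left", reward=-1.0, next_state="s1")   # bad move penalized
greedy_action = agent.choose("s0")
```

With exploration switched off (`epsilon=0.0`), the agent prefers the action that has been rewarded in that state.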

CODE REPORT

Creation Of Music Player (Mar - Apr 2018)

I designed a music player with a range of functions: playing songs of choice, customizing playlists, and sorting and searching by genre, artist, and title. It was tested concurrently by 10 users with multiple customized playlists.

CODE PROPOSAL

TCL Type Checker (Aug - Dec 2018)

This project provides a robust type checker for the Tcl language. It involved designing the rules for structural, name, or internal equivalence among variable types, implementing them, and validating them against complex test cases.
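To make the distinction between structural and name equivalence concrete, here is a minimal sketch over a toy type representation. The `Prim`, `ListOf`, and `Named` classes and the strict treatment of anonymous types are assumptions for illustration, not the project's design. Internal equivalence is left out because its rules are not described here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Prim:
    name: str            # built-in type such as "int" or "string"

@dataclass(frozen=True)
class ListOf:
    element: object      # type of the list's elements

@dataclass(frozen=True)
class Named:
    name: str            # user-declared type name
    definition: object   # the type it stands for

def resolve(t):
    """Strip declared names until an underlying type is reached."""
    while isinstance(t, Named):
        t = t.definition
    return t

def structurally_equal(a, b):
    """Two types match if their underlying shapes match, ignoring names."""
    a, b = resolve(a), resolve(b)
    if isinstance(a, Prim) and isinstance(b, Prim):
        return a.name == b.name
    if isinstance(a, ListOf) and isinstance(b, ListOf):
        return structurally_equal(a.element, b.element)
    return False

def name_equal(a, b):
    """Declared types match only by declared name; primitives by built-in name.

    Anonymous composite types never match under this strict reading.
    """
    if isinstance(a, Named) or isinstance(b, Named):
        return isinstance(a, Named) and isinstance(b, Named) and a.name == b.name
    return isinstance(a, Prim) and isinstance(b, Prim) and a.name == b.name

int_list = ListOf(Prim("int"))
ids = Named("ids", int_list)        # hypothetical declared types
scores = Named("scores", int_list)
```

Here `ids` and `scores` are structurally equivalent because both resolve to a list of `int`, but they are not name-equivalent because they were declared under different names.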

CODE DOCUMENTS