Research

Advanced Sentiment Analysis and Topic Modelling on Indian Union Budget 2024

Reading the Indian Union Budget 2024, I was disheartened to see yet another budget pass without addressing Rajasthan’s water scarcity crisis, a critical issue for the farming community. Determined to bring this oversight to light, I analyzed over 4,500 budget-related tweets to capture the public’s response, especially around underrepresented issues. I built the Budget 2024 Tweets dataset by collecting tweets labeled as positive, negative, or neutral, and cleaned each tweet to remove noise and irrelevant content.
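The cleaning step described above could look roughly like the sketch below. The source does not list the exact cleaning rules, so the choices here (lowercasing, dropping URLs and @mentions, keeping hashtag words, stripping punctuation) are assumptions, and `clean_tweet` is a hypothetical helper name.

```python
import re

# Assumed noise patterns; the original pipeline's rules are not specified.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_TEXT_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, mentions, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")      # keep the hashtag word, drop the symbol
    text = NON_TEXT_RE.sub(" ", text)  # drop punctuation, emoji, other symbols
    return " ".join(text.split())      # collapse repeated whitespace


print(clean_tweet("RT @FinMin: #Budget2024 ignores water!! https://t.co/abc"))
# prints "rt budget2024 ignores water"
```

Note that the last pattern also removes non-Latin scripts, so a corpus containing Hindi tweets would need a different character filter.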

I trained five different sentiment classification models, from traditional approaches like Logistic Regression and Naive Bayes to advanced deep learning models like DistilBERT. DistilBERT achieved 81% accuracy, capturing nuanced sentiment patterns across different sectors. This accuracy validated its effectiveness, particularly in short, expressive formats like tweets.
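To illustrate the traditional end of that model range, here is a minimal multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. It is a sketch of the general technique, not the models trained for this project; the class name and the four training tweets are made up for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in text.split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, not taken from the Budget 2024 Tweets dataset.
texts = [
    "great relief for taxpayers",
    "good push for infrastructure",
    "no help for farmers",
    "disappointing budget ignores water crisis",
]
labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("water crisis ignored again"))  # prints "negative"
```

A bag-of-words model like this ignores word order and context, which is one reason a transformer such as DistilBERT can pick up sentiment that it misses.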

Using Non-negative Matrix Factorization (NMF), I identified six main themes from the public discourse, including middle-class concerns, tax policies, youth employment, and infrastructure. Each theme revealed specific reactions, and a sentiment-based word cloud underscored how certain demographics felt sidelined. Rajasthan’s water crisis, for instance, resonated strongly in the data but was missing in the actual budget discourse.
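NMF factorizes a non-negative document-term matrix V into W (documents by topics) and H (topics by terms), so each row of H can be read as a theme. The sketch below uses the classic multiplicative update rules in plain Python. It is an assumption about how such a factorization can be computed, not the implementation used in this project (which a library would normally provide), and the vocabulary and matrix are hypothetical.

```python
import random


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V (n x m, non-negative) into W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H


def reconstruction_error(V, W, H):
    """Squared Frobenius distance between V and W @ H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, WH) for v, r in zip(rv, rr))


def top_terms(H, vocab, n=2):
    """For each topic (row of H), return the n highest-weighted terms."""
    return [[vocab[j] for j in sorted(range(len(row)), key=lambda j: -row[j])[:n]]
            for row in H]


# Hypothetical document-term counts: two tax tweets, two jobs tweets.
vocab = ["tax", "income", "jobs", "youth"]
V = [[2.0, 2.0, 0.0, 0.0],
     [4.0, 4.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 3.0],
     [0.0, 0.0, 1.0, 1.0]]

W, H = nmf(V, k=2)
print(top_terms(H, vocab))
print(reconstruction_error(V, W, H))
```

In practice V would be a TF-IDF matrix over the cleaned tweets, and k would be set to the number of themes being sought (six in the analysis above).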

I want to ensure that farmers’ overlooked voices are heard. By translating public sentiment into actionable insights, I hope to provide policymakers with a clearer understanding of grassroots needs, encouraging them to address issues like water scarcity that are vital to my community’s future.

Enhancing Security and Privacy in Large Language Model-based Approaches: A Comprehensive Investigation

Over the last five years, Natural Language Processing has seen significant developments, including the deployment of advanced large language models (LLMs) such as ChatGPT, Bard, and Llama. These models are useful for generating text and designing content, and they have applications across many industries. However, they can memorize and reveal malicious content and personal information from their training data, which includes an enormous amount of text from the internet. This can compromise the privacy and security of users whose personal information is available online, whether directly or through third parties.

To address this issue, the proposed research investigates these challenges in depth and puts forward a prompt-design-based solution. In this method, I built a custom training dataset and used it to fine-tune a pre-trained model (Llama-2) so that it returns a harmless response, ‘I can’t provide you with this information’, to prompts that try to extract personal information or malicious content. Experimental results show that the approach achieves an accuracy of 63%, with a precision of 0.706 and a recall of 0.571. In these experiments, the fine-tuned model leaked almost no private information, which strengthens it against extraction attacks.
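For reference, accuracy, precision, and recall all come from the four cells of a binary confusion matrix. The sketch below shows the arithmetic. The counts are hypothetical, chosen only because they reproduce the reported figures after rounding; the source gives neither the actual confusion matrix nor which class was treated as positive.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for illustration only (not the experiment's real data).
accuracy, precision, recall = classification_metrics(tp=12, fp=5, fn=9, tn=12)
print(f"{accuracy:.3f} {precision:.3f} {recall:.3f}")
# prints "0.632 0.706 0.571"
```

Precision above recall means the model is more often right when it flags a prompt than it is complete in catching every prompt of that kind.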

Material Science Student Research

Over the past decade, advancements in material science have introduced promising materials for environmental remediation, particularly in water purification. Reduced graphene oxide (rGO) has gained attention for its high surface area, exceptional adsorption capabilities, and potential for removing various water contaminants. My peers and I synthesized rGO and evaluated its efficacy in adsorbing and removing common water pollutants, such as heavy metals and organic dyes, through a series of rigorous experimental analyses.

To synthesize rGO, we followed a multi-step reduction process starting with graphene oxide (GO), prepared from graphite powder. GO was synthesized using the modified Hummers' method, which involved oxidizing graphite powder in a highly acidic environment with a strong oxidizing agent. This process introduced oxygenated functional groups on the graphite layers, which was crucial for achieving the high reactivity required in subsequent steps. Characterization of both GO and rGO was performed using Scanning Electron Microscopy (SEM) to assess morphological changes and surface texture before and after reduction. Structural and functional group transformations were further verified through Fourier-transform infrared spectroscopy (FTIR) and Attenuated Total Reflectance (ATR) spectroscopy.

The efficacy of rGO in removing water contaminants was tested by exposing it to model pollutant solutions under controlled laboratory conditions. We used spectrophotometric methods to monitor the concentration of contaminants, such as lead ions and methylene blue dye, in solution before and after treatment with rGO. Adsorption kinetics and isotherm models were applied to quantify the adsorption capacity and assess the mechanism of pollutant interaction with rGO.
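The quantities involved can be sketched as follows: equilibrium adsorption capacity q_e = (C0 - Ce) * V / m, percentage removal, and a least-squares fit of the linearized Langmuir isotherm Ce/q_e = Ce/q_max + 1/(K_L * q_max). The source does not say which isotherm or kinetic models were applied, so the Langmuir form is an assumption, and the data points are synthetic (generated from known parameters), not measurements from this study.

```python
def adsorption_capacity(c0, ce, volume_l, mass_g):
    """q_e in mg/g, from initial and equilibrium concentrations in mg/L."""
    return (c0 - ce) * volume_l / mass_g


def removal_percent(c0, ce):
    """Percentage of the pollutant removed from solution."""
    return 100.0 * (c0 - ce) / c0


def fit_langmuir(ce_values, qe_values):
    """Fit Ce/qe = Ce/q_max + 1/(K_L*q_max) by ordinary least squares.

    Returns (q_max, k_l).
    """
    xs = list(ce_values)
    ys = [ce / qe for ce, qe in zip(ce_values, qe_values)]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    q_max = 1.0 / slope          # slope = 1 / q_max
    k_l = slope / intercept      # intercept = 1 / (K_L * q_max)
    return q_max, k_l


# Synthetic isotherm generated from assumed parameters (illustration only).
TRUE_Q_MAX, TRUE_K_L = 100.0, 0.05
ce_data = [5.0, 10.0, 20.0, 40.0, 80.0]
qe_data = [TRUE_Q_MAX * TRUE_K_L * ce / (1 + TRUE_K_L * ce) for ce in ce_data]

q_max, k_l = fit_langmuir(ce_data, qe_data)
print(q_max, k_l)  # recovers the assumed parameters up to rounding error
print(removal_percent(50.0, 10.0))
```

With real measurements the points will not fall exactly on the line, and comparing the fit quality against another model such as Freundlich is the usual way to judge which adsorption mechanism is more plausible.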

Results demonstrated a notable reduction in pollutant concentration, indicating that rGO effectively adsorbs and removes both metal ions and organic compounds. These findings suggest that rGO could be a viable material for water treatment applications, particularly in regions with limited access to advanced filtration systems.

Inner Product Approach to generalize the notion of Pythagoras Theorem for normed spaces

The Pythagorean Theorem, a fundamental result in Euclidean geometry, traditionally relates the lengths of the sides of a right-angled triangle. In this paper, I extended the classical Pythagorean Theorem to normed vector spaces using the concept of inner products. I explored how the theorem manifests in higher-dimensional spaces and provided a generalized version applicable to normed spaces beyond two dimensions. This generalization reinforces the geometric interpretation of the theorem and connects it to broader mathematical frameworks such as vector spaces, norms, and inner products. The results demonstrate the versatility of the Pythagorean Theorem and its relevance in both theoretical and applied mathematics.
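The core identity behind such a generalization can be written out briefly. This is the standard textbook derivation for a real inner product space with the induced norm, where the squared norm of x is the inner product of x with itself; the paper's exact statement is not reproduced in this summary, so the form below is an assumption about what it generalizes.

```latex
\begin{align*}
\|x+y\|^2 &= \langle x+y,\; x+y\rangle \\
          &= \langle x,x\rangle + 2\langle x,y\rangle + \langle y,y\rangle \\
          &= \|x\|^2 + 2\langle x,y\rangle + \|y\|^2 .
\end{align*}
Hence $\|x+y\|^2 = \|x\|^2 + \|y\|^2$ if and only if $\langle x,y\rangle = 0$.
More generally, for pairwise orthogonal vectors $x_1,\dots,x_n$,
\[
\Bigl\|\sum_{i=1}^{n} x_i\Bigr\|^2 = \sum_{i=1}^{n} \|x_i\|^2 .
\]
```

The classical theorem is the case of two perpendicular vectors in the Euclidean plane. The argument relies on the norm being induced by an inner product, which holds exactly when the norm satisfies the parallelogram law.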