Research
Agentic AI · AI Applications · Multi-Agent Systems · Alignment & Safety · LLM Security · Agentic Architectures
Spring '26 – Present
LLM Agent Safety & Alignment — Ongoing Directions
Master's Research · Texas A&M University–Corpus Christi

My current work investigates alignment failures and safety vulnerabilities, including prompt injection and deceptive alignment, in compromised communication between LLM agents. In parallel, I am probing goal misgeneralization and unsafe behaviors at the perception-action interface of LLM-based computer-use agents. These are general directions; the specifics are still being refined.

Specific research direction actively being scoped.

2 NSF REUs · Fall '24 – Fall '25
Towards Capable and Secure Autonomous Computer Use Agents

In this project, funded by two cycles of the NSF Computing Alliance of Hispanic-Serving Institutions REU program, I investigated autonomous computer-use agents (ACUAs) — systems powered by large language models that can operate a computer end-to-end. Unlike traditional chatbots, ACUAs navigate interfaces, execute tasks, and make independent decisions, raising important questions about their reliability and security.

I designed and introduced one of the first systematic evaluation frameworks for ACUAs, testing agents from OpenAI, Anthropic, and open-source projects across five task domains of increasing complexity. To measure complexity, I adapted principles from an IBM HCI quantitative UI/UX assessment. The study identified two classes of agents: agents with full computer access and browser-based agents.

Performance was measured with a seven-factor rubric assessing accuracy, adaptability, efficiency, robustness, security, relevance, and consistency. Quantitative data such as completion rates, time, failed interactions, and remediation percentages were collected for evaluation.
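As an illustration only, the rubric and the quantitative measures above could be aggregated along the following lines. This is a minimal sketch, not the framework used in the study: the `TaskRun` record, the `summarize` helper, and the 1-to-5 scoring scale are all assumptions made for the example, and remediation percentages are left out because their definition is not given here.

```python
from dataclasses import dataclass
from statistics import mean

# The seven rubric factors named in the text.
RUBRIC_FACTORS = (
    "accuracy", "adaptability", "efficiency", "robustness",
    "security", "relevance", "consistency",
)


@dataclass
class TaskRun:
    """One agent attempt at one task (hypothetical record layout)."""
    agent: str
    completed: bool
    seconds: float
    failed_interactions: int
    scores: dict  # factor name -> score on an assumed 1-5 scale


def summarize(runs):
    """Aggregate completion rate, time, failures, and per-factor means."""
    if not runs:
        raise ValueError("at least one run is required")
    for run in runs:
        missing = set(RUBRIC_FACTORS).difference(run.scores)
        if missing:
            raise ValueError(f"run for {run.agent} lacks scores: {sorted(missing)}")
    return {
        "completion_rate": sum(r.completed for r in runs) / len(runs),
        "mean_seconds": mean(r.seconds for r in runs),
        "failed_interactions": sum(r.failed_interactions for r in runs),
        "rubric_means": {
            f: mean(r.scores[f] for r in runs) for f in RUBRIC_FACTORS
        },
    }
```

Grouping runs by agent class (full computer access versus browser-based) before calling `summarize` would give the kind of side-by-side comparison described in the findings.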

Findings revealed significant limitations: full computer-access agents often failed due to hallucinations, navigation errors, and unauthorized system changes, while browser-based agents achieved higher success rates but still showed vulnerabilities to prompt injection and inconsistent security awareness.

These results currently guide the development of an ACUA that integrates multi-agent orchestration, machine learning, RAG, and security frameworks, including access control and prompt verification. Potential approaches for addressing LLM limitations, including chain-of-thought (CoT) reasoning, are under investigation to improve decision making.

This work has been recognized with multiple honors, including a GMiS Student Poster Scholarship, an NSF LSAMP Scholarship, a second NSF CAHSI REU award, and second place, with a scholarship, at the international WiCyS Student Poster Competition.

NSF REU · Summer–Fall '25
Developing and Optimizing LLM Pipelines for Smart City Safety Analysis
Team of Dr. Jee Woong Park, UNLV  ·  Projects of Unmesa Ray and Niloy Das

Funded by the NSF Smart Cities REU program, I investigated large language model optimization techniques for analyzing unstructured construction safety data to enhance urban infrastructure development. LLM-pipeline optimization enables automated extraction of critical insights from accident reports, raising important questions about scalability and real-world implementation in smart city frameworks.

Project I: As the computer scientist in a team of civil engineers, I developed novel LLM-pipeline optimization methods to analyze construction accident reports, creating a prototype that processes unstructured safety data and extracts meaningful insights. My work included researching optimized RAG and machine learning pipelines, and evaluating strategies to improve efficiency while enhancing narrative tone extraction.

Project II: As the only computer scientist on the team, I collaborated on research conducted under an NDA, focused on machine learning for system analysis. I worked on techniques designed to enhance LLM performance and improve the accuracy of pattern extraction.

I have been invited to continue collaborating as a research assistant. Project I has evolved quickly, and I now lead its LLM-driven analytical methods and contribute to manuscript development. The experience has also taught me to integrate quickly into a new interdisciplinary team.