I have extensive research experience through three NSF REUs and multiple assistantships,
specializing in agentic AI.
I bring strong experience both in designing research projects from the ground up and in joining ongoing efforts
where I can contribute immediately and effectively.
Towards Capable and Secure Autonomous Computer Use Agents - Fall '24 to Present
Project Website
Conducted as an NSF REU research scholar under the mentorship and guidance of
Dr. Carlos Rubio-Medrano.
In this project, funded by two cycles of the NSF Computing Alliance of Hispanic-Serving Institutions REU program,
I investigated autonomous computer-use agents (ACUAs) — systems powered by large language models that
can operate a computer end-to-end. Unlike traditional chatbots, ACUAs navigate interfaces, execute tasks,
and make independent decisions, raising important questions about their reliability and security.
I designed and introduced one of the first systematic evaluation frameworks for ACUAs, testing agents from
OpenAI, Anthropic, and open-source projects across five task domains of increasing complexity, adapting
principles from an IBM HCI quantitative UI/UX assessment to measure complexity. The study identified two classes of agents:
full-computer-access agents and browser-based agents.
Performance was measured with a seven-factor rubric assessing accuracy, adaptability, efficiency,
robustness, security, relevance, and consistency. Quantitative data such as completion rates, completion time, failed
interactions, and remediation percentages were collected for evaluation.
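To make the rubric concrete, the sketch below shows one way a per-trial record could be structured; the field names, the 1-5 scale, and the scoring helper are illustrative assumptions, not the study's exact schema.

```python
# Illustrative sketch only: a minimal record for one agent-task trial, combining the
# seven rubric factors with the quantitative metrics described above. Field names and
# the assumed 1-5 scale are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class TrialRecord:
    agent: str                  # e.g., a full-computer-access or browser-based agent
    task_domain: str            # one of the five task domains
    # Rubric factors, each scored on an assumed 1-5 scale
    accuracy: int
    adaptability: int
    efficiency: int
    robustness: int
    security: int
    relevance: int
    consistency: int
    # Quantitative metrics
    completed: bool
    completion_time_s: float
    failed_interactions: int
    remediation_pct: float      # share of failed interactions the agent recovered from

    def rubric_mean(self) -> float:
        """Average of the seven rubric factors, for quick cross-agent comparison."""
        factors = (self.accuracy, self.adaptability, self.efficiency, self.robustness,
                   self.security, self.relevance, self.consistency)
        return sum(factors) / len(factors)
```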
Findings revealed significant limitations: full computer-access agents often failed due to hallucinations,
navigation errors, and unauthorized system changes, while browser-based agents achieved higher success
rates but still showed vulnerabilities to prompt injection and inconsistent security awareness.
These results now guide the development of an ACUA that integrates multi-agent orchestration,
machine learning, retrieval-augmented generation (RAG), and security frameworks including access control and prompt verification.
Potential approaches for addressing LLM limitations, including chain-of-thought (CoT) reasoning, are under investigation
to improve decision making.
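As a rough illustration of how an access-control and prompt-verification layer could sit in front of agent actions, here is a minimal sketch; the interface, injection patterns, allow-list, and deny-list are all hypothetical placeholders, not the project's actual implementation.

```python
# Minimal sketch of a guard layer for an ACUA, assuming a hypothetical orchestrator that
# routes each proposed agent action through policy checks before execution.
# All names, patterns, and policies here are illustrative only.
import re

INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"disregard your (system )?prompt",
]

ALLOWED_ACTIONS = {"click", "type", "scroll", "read"}   # assumed allow-list
BLOCKED_TARGETS = {"/etc/passwd", "system settings"}    # assumed deny-list

def verify_prompt(text: str) -> bool:
    """Reject page content or instructions that look like prompt injection."""
    return not any(re.search(p, text, re.IGNORECASE) for p in INJECTION_PATTERNS)

def authorize_action(action: str, target: str) -> bool:
    """Simple access-control check on the action the agent proposes to take."""
    return action in ALLOWED_ACTIONS and target.lower() not in BLOCKED_TARGETS

def guarded_step(action: str, target: str, context: str) -> str:
    """Run verification before letting the agent act; otherwise block the step."""
    if not verify_prompt(context):
        return "blocked: possible prompt injection in context"
    if not authorize_action(action, target):
        return "blocked: action not authorized"
    return f"execute {action} on {target}"
```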
This work has been recognized with multiple honors, including a GMiS Student Poster Scholarship, an NSF Louis Stokes Alliance
for Minority Participation (LSAMP) Scholarship, a second NSF CAHSI REU award, and a scholarship and second-place finish at the
international WiCyS Student Poster Competition.
The Cost of Being Helpful: Limitations and Vulnerabilities in RLHF-Trained Agents - Fall '25 to Present
Conducted as a research assistant in the Cybersecurity Research and Innovation Laboratory at TAMU-CC, with PI
Dr. Carlos Rubio-Medrano, collaborating closely with Ph.D. student Jennifer Mondragon.
Large language models increasingly serve as the foundation for agentic intelligence. However, our lab's prior evaluations revealed a critical flaw: when unsure what action to take, LLMs tend to
commit rather than omit. This research investigates the root of this error more closely, which we hypothesize is heavily influenced by human intervention during RLHF.
I authored the research proposal for this work, which introduces the Commission-Induced Overreach (CIO) framework, a failure mode in which
RLHF-trained agents over-optimize for helpfulness and perform unreasonable, unsafe actions. The framework includes the Commission-Induced Access Vulnerabilities (CIAV) taxonomy, which categorizes three vulnerability types in
RLHF-trained agents: authorization overreach, policy circumvention, and hallucinated asset injection.
To evaluate our hypothesis, we are currently designing an experimental methodology comparing baseline RLHF against negative sample reinforcement, cost-constrained safe fine-tuning,
and KL penalties.
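For context, the textbook KL-penalized RLHF objective that these variants build on can be written as follows; this is the standard formulation rather than our exact training setup, with r denoting the reward model, \pi_{\mathrm{ref}} the frozen reference policy, and \beta the penalty weight.

```latex
% Standard KL-penalized RLHF objective (textbook form; shown for context only).
\max_{\theta}\;
\mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_{\theta}(\cdot \mid x)}\!\left[ r(x, y) \right]
\;-\; \beta\, \mathbb{D}_{\mathrm{KL}}\!\left( \pi_{\theta}(\cdot \mid x) \,\middle\|\, \pi_{\mathrm{ref}}(\cdot \mid x) \right)
```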
This work has been submitted for presentation at the International Women in Cybersecurity Conference 2026.
Developing and Optimizing LLM Pipelines for Smart City Safety Analysis - Summer '25 to Fall '25
Conducted as an NSF REU research scholar and research assistant on the team of
Dr. Jee Woong Park of UNLV,
working on the projects of Unmesa Ray and Niloy Das.
In this project, funded by the NSF Smart Cities Research Experience for Undergraduates program, I investigated
large language model optimization techniques for analyzing unstructured construction safety data to enhance urban
infrastructure development. Unlike traditional data analysis approaches, LLM-pipeline optimization enables automated extraction
of critical insights from accident reports, raising important questions about scalability and real-world implementation in smart city frameworks.
Project I: As the computer scientist in a team of civil engineers, I developed novel LLM-pipeline optimization methods
to analyze construction accident reports, creating a prototype that processes unstructured safety data and extracts meaningful insights.
My work included researching optimized retrieval-augmented generation (RAG) and machine learning pipelines, and evaluating strategies
to improve efficiency while enhancing narrative tone extraction.
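To illustrate the general shape of such a pipeline, the sketch below outlines a bare-bones RAG flow over accident reports; `embed`, `llm_generate`, and the brute-force dot-product retrieval are hypothetical stand-ins assumed for illustration, and do not reflect the project's actual models or infrastructure.

```python
# Bare-bones RAG sketch for accident-report analysis, under stated assumptions:
# `embed` and `llm_generate` are hypothetical callables for an embedding model and an LLM.
from typing import Callable, List, Tuple

def build_index(reports: List[str],
                embed: Callable[[str], List[float]]) -> List[Tuple[List[float], str]]:
    """Embed each unstructured accident report once and keep (vector, text) pairs."""
    return [(embed(r), r) for r in reports]

def retrieve(query: str, index: List[Tuple[List[float], str]],
             embed: Callable[[str], List[float]], k: int = 3) -> List[str]:
    """Return the k reports most similar to the query by dot-product similarity."""
    q = embed(query)
    scored = sorted(index, key=lambda item: -sum(a * b for a, b in zip(q, item[0])))
    return [text for _, text in scored[:k]]

def analyze(query: str, index: List[Tuple[List[float], str]],
            embed: Callable[[str], List[float]],
            llm_generate: Callable[[str], str]) -> str:
    """Ground the LLM's answer in the retrieved reports rather than the full corpus."""
    context = "\n---\n".join(retrieve(query, index, embed))
    prompt = f"Using only the accident reports below, {query}\n\n{context}"
    return llm_generate(prompt)
```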
Project II: I contributed to research conducted under an NDA, focused on machine learning for system analysis, collaborating
on techniques designed to enhance LLM performance and improve the accuracy of pattern extraction. As the only computer scientist
on the team, my role was to implement the AI aspects of the experiments.
While I cannot share specifics or findings prior to publication, I have been invited to continue collaborating as a research assistant. Project I
has evolved quickly, and I now lead its LLM-driven analytical methods. In addition, I have contributed to manuscript development
and gained valuable experience quickly integrating into a new interdisciplinary team.