Adversarial Testing of LLMs in Less Common Languages to Test and Analyze Vulnerabilities in ChatGPT and Gemini
James Mardi
Co-Presenters: Individual Presentation
College: The Dorothy and George Hennings College of Science, Mathematics and Technology
Major: Computer Science
Faculty Research Mentor: Yulia Kumar
Abstract:
This study investigates the vulnerabilities of multiple large language models (LLMs), including ChatGPT-4o Mini, Gemini Flash, ChatGPT-4o, and others. The models are tested using a series of adversarial prompts across multiple languages to assess their ability to handle ethically and legally complex queries. Initial testing is conducted in English, followed by evaluation in less commonly represented languages, specifically French and Haitian Creole. A total of over 20 prompts, iteratively modified based on model responses, are used to systematically evaluate the models' security measures and responsiveness. Additionally, the response speed of each model is analyzed to identify performance variations. This research builds on prior studies (Paredes et al., 2024) and introduces the Adversarial Response Scoring System (ARSS), an approach designed to measure the models' security awareness and judgment. The findings aim to uncover potential weaknesses in LLM security, particularly in low-resource languages, and provide insights into their susceptibility to adversarial misuse.