From pixels to personality: a cross-model comparison of hotel brand personality recognition

Tuesday, March 24, 2026 - 03:30 pm
online

 DISSERTATION DEFENSE

Author: Ningqiao Li
Advisor: Dr. Yan Tong
Date: March 24th, 2026
Time: 3:30 pm
Place: Virtual
Link: https://teams.microsoft.com/meet/27045487833541?p=PtaQgpwzZvxJkYDZYu
Meeting ID: 270 454 878 335 41
Passcode: TP2Wm3dw
Abstract

 

Images have become a critical strategy for hospitality businesses to position and differentiate their brands online, and the brand personality projected by brands and perceived by prospective consumers is a key factor influencing booking decisions. This thesis therefore investigates an understudied area: how brand personality perceived from hotel images can be automatically assessed using advanced computational models.

Specifically, this study systematically compares four model architectures (i.e., YOLO26 Nano, ResNet50, Swin-Small Transformer, and CLIP) across multiple label sources, including human rater annotations, labels generated by a single large language model (LLM) (GPT-4o), and averaged labels generated by multiple LLMs (GPT-4o, Gemini 2.5 Flash, and Claude Sonnet). A dataset of 2,182 hotel-generated images posted on a social media platform was annotated and evaluated across six brand personality dimensions: relaxing, hospitable, lively, distinctive, sophisticated, and wholesome.

The results demonstrate that CLIP, trained on multi-LLM averaged labels, achieves the highest performance, outperforming all image-only architectures as well as models trained on human annotations or labels from a single LLM. This study contributes to a better understanding of how affective semantics can be effectively recognized by comparing different deep learning models and examining performance differences between models trained on human-labeled data and those trained on generative AI–labeled data. It further extends the discussion on the effectiveness of LLM-generated labels in contexts that require domain knowledge and higher-level semantic interpretation.

EEG P300 AI Classification and Accessibility Keyboard Control Using Open BCI's Ultracortex Mark IV for Reactive Brain-Computer Interaction

Tuesday, March 24, 2026 - 10:00 am
Room 2267, Innovation Building

 DISSERTATION DEFENSE

Author: Grant King
Advisor: Dr. Homayoun Valafar
Date: March 24th, 2026
Time: 10:00 am
Place: Room 2267, Innovation Building

Abstract

 

Brain-computer interfaces provide people who cannot feasibly control a keyboard and mouse with an alternative way to type, move a cursor, or more generally issue commands and make selections. Electroencephalography (EEG) is a non-invasive recording modality for brain-computer interfaces, and the OpenBCI Ultracortex Mark IV is an EEG headset that enables relatively low-cost EEG acquisition using wet or dry electrodes. The P300 is an event-related potential, a deflection in an EEG channel's voltage typically occurring 300 milliseconds after the onset of an "oddball" stimulus; a person's P300 response can be used to control a selector application. This study investigates classification of the visual P300 using the Mark IV, an example selector application, and finally a novel approach to flashing screen options in random subsets and determining a user's desired option, demonstrated using the macOS Accessibility Keyboard. A neural network trained on data from a visual P300 task with a 10% oddball probability and evaluated on a separate test session achieved a 0.85 F1 score, demonstrating cross-session generalization. The same model achieved a 0.89 F1 score on data from a selector P300 task that flashed one of 10 options with uniform probability, demonstrating cross-task generalization. Another neural network, pretrained using the previous model, fine-tuned on data from the selector task, and evaluated on a live deployment session, achieved a 0.85 F1 score but a typing accuracy of only 41.7% at 5.19 characters per minute, demonstrating the effect of false positives. A further neural network trained on data from another selector task based on the macOS Accessibility Keyboard and evaluated within-subject achieved a 0.85 average F1 score across four participants. This work contributes to the field of brain-computer interaction and could be applied to empower people with muscular or spinal disabilities to directly control any program on their computer.
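The P300 paradigm above hinges on epoching the EEG around stimulus onsets and averaging, which cancels background noise while preserving the time-locked deflection. The following is a minimal sketch on synthetic data, not the study's Mark IV pipeline; the sampling rate, window length, and injected Gaussian "ERP" are illustrative assumptions:

```python
import numpy as np

def extract_epochs(eeg, onsets, fs=250, tmin=0.0, tmax=0.6):
    """Slice fixed-length epochs from a (channels, samples) EEG array
    around each stimulus onset (given in samples)."""
    n = int((tmax - tmin) * fs)
    start = int(tmin * fs)
    return np.stack([eeg[:, o + start:o + start + n] for o in onsets])

# Synthetic illustration: target trials carry a deflection ~300 ms post-onset.
fs = 250
rng = np.random.default_rng(0)
eeg = rng.normal(0.0, 1.0, (8, fs * 60))          # 8 channels, 60 s of noise
target_onsets = np.arange(fs, fs * 50, fs * 2)    # 25 target stimuli
p300 = np.exp(-0.5 * ((np.arange(int(0.6 * fs)) / fs - 0.3) / 0.05) ** 2)
for o in target_onsets:
    eeg[:, o:o + len(p300)] += 2.0 * p300         # inject the synthetic ERP

epochs = extract_epochs(eeg, target_onsets, fs)   # (trials, channels, samples)
erp = epochs.mean(axis=0)                         # averaging reveals the P300
peak_ms = 1000 * np.argmax(erp.mean(axis=0)) / fs # peak latency, near 300 ms
```

In a real pipeline a classifier (such as the neural networks in this study) would operate on the single-trial epochs rather than the average, but the averaged ERP is the standard way to verify the response is present at all.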

Advancing Edge AI through Integrated Neuromorphic Algorithms and Hardware

Monday, March 23, 2026 - 01:00 pm
Online/Room 2277, Storey Innovation Center

DISSERTATION DEFENSE

Author: Peyton Chandarana

Advisor: Ramtin Zand

Date: March 23, 2026

Time: 1:00 PM

Place: Online/Room 2277, Storey Innovation Center

Remote join (MS Teams):

Link: Peyton Chandarana: Dissertation Defense | Microsoft Teams

Meeting ID: 263 606 597 033 12

Passcode: DD6ts7Mg


Abstract

The pursuit of energy-efficient intelligence for constrained and always-on sensing environments has positioned neuromorphic computing as a pivotal alternative to conventional von Neumann architectures through its adoption of asynchronous and event-based computing inspired by the biological brain. Additionally, outside of these constrained environments, neuromorphic design principles can help alleviate the power and efficiency challenges posed by the rapidly growing AI industry. This dissertation presents research focused on the hardware-software co-design of spiking neural networks (SNNs), progressing from foundational signal encoding techniques to the deployment of complex, heterogeneous, and hybrid systems. We start by focusing on the deployment of practical workloads, such as American Sign Language recognition, on Intel’s Loihi neuromorphic platform. Benchmarking against standard edge accelerators demonstrates that neuromorphic paradigms achieve significant gains in energy efficiency and power reduction, maximizing runtime on edge devices deployed as assistive technologies and reducing the overall energy footprint with minimal accuracy degradation. We then explore the integration of spiking and non-spiking domains to leverage the unique advantages of each. We present an end-to-end co-design framework that utilizes SNNs for temporal feature extraction and artificial neural networks (ANNs) for high-precision classification. To facilitate this integration, we propose custom interface hardware, specifically an accumulator circuit, designed to synchronize asynchronous spike streams for synchronous edge processing.
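The accumulator interface described above can be illustrated in software. The sketch below is a hedged software analogue, not the dissertation's hardware circuit; the event format and window size are illustrative assumptions. It bins asynchronous spike events into fixed time windows so a synchronous ANN can consume dense count vectors:

```python
import numpy as np

def accumulate_spikes(spike_events, n_neurons, window_ms, t_end_ms):
    """Bin asynchronous (neuron_id, time_ms) spike events into fixed windows,
    yielding one dense spike-count vector per window for synchronous processing."""
    n_windows = int(np.ceil(t_end_ms / window_ms))
    counts = np.zeros((n_windows, n_neurons), dtype=np.int32)
    for neuron, t in spike_events:
        counts[int(t // window_ms), neuron] += 1
    return counts

# Three neurons, 20 ms of activity, accumulated into two 10 ms windows.
events = [(0, 1.2), (0, 3.7), (2, 4.9), (1, 12.0)]
counts = accumulate_spikes(events, n_neurons=3, window_ms=10.0, t_end_ms=20.0)
```

Each row of `counts` then plays the role of a synchronous activation vector handed from the spiking front-end to the ANN classifier.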
These co-design principles provide a blueprint for the next generation of neuromorphic capabilities, highlighting areas for improvement and showing how they can be extended to create more capable and reliable autonomous systems while alleviating the problems posed by the immense scale and energy consumption of AI workloads.

Deep Learning Algorithms for Generative Materials Design and Composition Based Property Prediction

Monday, March 23, 2026 - 10:00 am
Online/Room 2265, Storey Innovation Center

DISSERTATION DEFENSE

Author: Rongzhi Dong

Advisor: Dr. Jianjun Hu

Date: March 23, 2026

Time: 10:00 AM

Place: Online/Room 2265, Storey Innovation Center

Remote join (ZOOM):

Link: https://sc-edu.zoom.us/j/4997546955

 

 

Abstract

The accelerated discovery of novel functional materials is critical for advancing transformative technologies in energy storage, electronics, and catalysis, yet current strategies remain fundamentally constrained by the limited size of existing materials databases and the difficulty of building predictive models that generalize to unseen compounds. This dissertation addresses these challenges through five interconnected deep learning and machine learning studies. First, a diffusion language model framework is proposed for the generative design of novel inorganic materials, with DFT validation confirming the thermodynamic stability of newly identified compounds. Second, generative modeling is extended to two-dimensional (2D) materials discovery, producing diverse and stable candidates that substantially expand the known structural landscape of this emerging materials class. Third, CondADiT, a composition-conditioned latent diffusion framework, is introduced for crystal structure prediction directly from chemical composition, achieving state-of-the-art performance on multiple benchmarks. Fourth, DeepXRD is presented as a deep learning framework for predicting X-ray diffraction spectra directly from composition, enabling scalable structural inference without costly simulations or experimental measurements. Fifth, domain adaptation techniques are systematically evaluated for materials property prediction under realistic distribution shifts, demonstrating significant improvements in out-of-distribution generalization. Together, these contributions establish a comprehensive data-driven framework that integrates generative modeling, structure learning, and domain-adaptive prediction to accelerate the discovery of stable, synthesizable, and functionally diverse materials.

 

From Experience to Reasoning: Offline RL Subroutines and LLM-Based Grounding for Sample-Efficient Reinforcement Learning

Thursday, March 19, 2026 - 11:40 am
Online/Room 2267, Storey Innovation Center

DISSERTATION DEFENSE

Author: Jianhai Su

Advisor: Dr. Qi Zhang

Date: March 19, 2026

Time: 11:40 am - 1:40 pm (ET)

Place: Online/Room 2267, Storey Innovation Center

Remote join (Microsoft Teams):

Link: https://teams.microsoft.com/meet/22389270607188?p=XPFrAyxA5Qo0IIh3tV

Meeting ID: 223 892 706 071 88

Passcode: YC7bg7zH

 

 

Abstract

Improving the learning efficiency of reinforcement learning (RL) agents remains a fundamental challenge, particularly in environments characterized by sparse rewards, long horizons, or partial observability. This dissertation investigates how RL agents can learn more efficiently through two complementary forms of guidance: mechanisms derived purely from an agent’s own experience and mechanisms that leverage reasoning priors from pretrained large language models (LLMs).

 

   On the experience-driven side, the first study develops a general framework for incorporating offline RL algorithms as subroutines within an online RL process. In this framework, an agent periodically repurposes its replay buffer as an offline dataset and applies offline optimization methods such as Implicit Q-Learning (IQL) or Calibrated Q-Learning (Cal-QL). Through systematic empirical analysis across diverse benchmark environments, this study characterizes when such experience-driven guidance improves policy quality under fixed interaction budgets and identifies several practical factors that influence its effectiveness.
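The framework in this first study can be sketched as a training loop that periodically snapshots the replay buffer as an offline dataset. The sketch below is illustrative: the `agent`/`env` interface is hypothetical, and `offline_update` stands in for an IQL or Cal-QL optimization step:

```python
def train_with_offline_subroutine(env, agent, total_steps,
                                  offline_every=10_000, offline_epochs=5):
    """Online RL loop that periodically treats the replay buffer as an
    offline dataset and applies an offline RL subroutine to it.
    Assumes `agent` exposes act/online_update/offline_update and a
    replay_buffer list; `env` exposes reset/step."""
    obs = env.reset()
    for step in range(total_steps):
        action = agent.act(obs)
        next_obs, reward, done = env.step(action)
        agent.replay_buffer.append((obs, action, reward, next_obs, done))
        agent.online_update()                    # usual off-policy update
        if (step + 1) % offline_every == 0:
            dataset = list(agent.replay_buffer)  # snapshot as offline data
            for _ in range(offline_epochs):
                agent.offline_update(dataset)    # e.g. an IQL or Cal-QL step
        obs = env.reset() if done else next_obs
```

The key design choice this exposes is the interaction budget: the offline subroutine consumes extra computation but no extra environment steps, which is why it can pay off under a fixed interaction budget.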

 

   On the LLM-based side, the dissertation presents two complementary grounding approaches. The second study investigates implicit grounding, where a Flamingo-style vision–language model with an embedded pretrained language model acts as the high-level policy in a hierarchical RL agent. The agent processes multimodal interaction histories and proposes subgoals for a library of pretrained low-level skills, grounding pretrained language priors through policy learning.

 

   The third study introduces an explicit grounding framework in which reasoning traces produced by an external LLM are distilled into a latent reasoning module within a value-based RL agent. A potential function defined over this latent space is then learned from the agent’s trajectories and used for potential-based reward shaping. This dual-track framework combines reasoning transfer with interaction-driven learning to improve both learning efficiency and final policy performance.
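The reward-shaping step follows the standard potential-based form (Ng et al., 1999), which is known to preserve optimal policies; with a potential function Φ learned over the latent reasoning space, the shaping term and shaped reward are:

```latex
F(s, s') = \gamma\,\Phi(s') - \Phi(s), \qquad
\tilde{r}(s, a, s') = r(s, a, s') + F(s, s')
```

Because the shaping term telescopes along any trajectory, it changes the speed of learning without changing which policies are optimal.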

 

   Together, these studies provide a structured investigation of how experience-driven learning and LLM-based grounding—both implicit and explicit—can guide reinforcement learning under realistic interaction constraints and offer practical insights for designing more sample-efficient RL agents.

Machine Learning Toward Materials Discovery: From Crystal Mapping & OOD Property Prediction to Radiation Detection & Emerging Foundation Models

Wednesday, March 18, 2026 - 01:20 pm
Online/Room 2267, Storey Innovation Center

DISSERTATION DEFENSE

Author: Qinyang Li

Advisor: Dr. Jianjun Hu

Date: March 18, 2026

Time: 1:20 PM

Place: Online/Room 2267, Storey Innovation Center

Remote join (ZOOM):

Link: https://sc-edu.zoom.us/j/4997546955

 

Abstract

The discovery and optimization of advanced materials are central to addressing global challenges in energy, healthcare, and sustainability. This dissertation develops representation-aware and distribution-aware machine learning frameworks to improve robustness, generalization, and interpretability in materials informatics and radiation detection. The work spans crystal structure mapping, adversarial learning for out-of-distribution prediction, foundation-model-based property prediction, and deep neural classification of photon interactions. A global mapping framework is first introduced to analyze the inorganic materials space using compositional, structural, physical, and neural descriptors derived from the Materials Project database. By embedding materials into low-dimensional manifolds, the framework reveals clustering behavior and structure–property relationships, enabling systematic exploration of underrepresented material families.

To address distributional fragility in materials property prediction, the Crystal Adversarial Learning (CAL) algorithm is developed. CAL synthesizes adversarial samples in high-uncertainty regions and incorporates stability-aware training objectives, improving generalization under covariate, prior, and relation shifts. Experimental results demonstrate enhanced robustness in data-scarce regimes typical of experimental materials research.

 

The dissertation further investigates in-context foundation models for data-efficient property prediction. By integrating a pretrained tabular transformer with compositional descriptors and graph-derived structural embeddings, the proposed framework achieves competitive performance on the MatBench benchmark suite and on lattice thermal conductivity prediction without task-specific fine-tuning. Representation analyses indicate that foundation-model adaptation reorganizes latent feature spaces to better align with physical property gradients, particularly in small-to-medium data regimes.

 

Finally, a deep learning framework is applied to gamma-photon interaction classification in room-temperature semiconductor detectors. The proposed model distinguishes Compton scattering and photoelectric events from pulse waveforms with high accuracy and robustness across varying noise and energy conditions, demonstrating the transferability of representation-learning principles to signal-level scientific data.

 

Collectively, this work advances machine learning methodologies that integrate representation geometry, distributional robustness, and physical interpretability across heterogeneous scientific domains. The developed approaches provide scalable and interpretable tools for accelerating materials discovery and improving radiation detection systems.

Enhancing V2V Network Communication Reliability under Severe Weather

Tuesday, March 17, 2026 - 12:30 pm
Online/Room 2267, Storey Innovation Center

DISSERTATION DEFENSE

Author: Jian Liu

Advisor: Dr. Chin-Tser Huang

Date: March 17, 2026

Time: 12:30 PM

Place: Online/Room 2267, Storey Innovation Center

Abstract

In this dissertation, we address a key reliability challenge in connected vehicle networks: V2V links can degrade sharply under adverse weather, especially in 5G mmWave channels, where environmental attenuation can be severe in regions with dust and sandstorms. Because conducting controlled field experiments in extreme weather is costly and difficult, this dissertation develops simulation-driven solutions that characterize weather-induced degradation. First, it introduces the first open-source NS-3 weather simulator for studying adverse weather impacts on 5G mmWave V2V communications, enabling systematic evaluation under diverse environmental conditions. Building on this capability, the dissertation investigates predictive models such as ARIMA, Prophet, LSTM, and GRU to forecast weather-related performance degradation. These predictions are then used to design a proactive channel-switching strategy that transitions from 5G mmWave to 4G LTE before major reliability loss occurs. Next, it advances beyond prediction-based control by developing a deep reinforcement learning (DRL) channel-switching approach that learns optimal switching decisions online using cumulative throughput as feedback, enabling vehicles to adapt autonomously to real-time environmental changes. Finally, this dissertation proposes a weather-aware, reinforcement learning–based open-loop power control method for decentralized sidelink V2V communication. Each vehicle learns how to adjust its transmit power using only locally measurable information together with the extra path loss caused by weather. In simulations ranging from clear weather to severe rain, this approach achieves a higher packet reception ratio (PRR) than the baseline 3GPP strategy and existing open-loop power control methods.
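The proactive channel-switching idea can be sketched as a simple forecast-driven policy; the throughput floor and hysteresis margin below are illustrative assumptions, not the dissertation's learned parameters:

```python
def choose_link(forecast_mbps, current="mmwave",
                mmwave_floor=50.0, hysteresis=10.0):
    """Proactive channel selection: drop from 5G mmWave to 4G LTE when the
    forecast (e.g. from an ARIMA or LSTM predictor) falls below a throughput
    floor, and return to mmWave only once the forecast clears the floor plus
    a hysteresis margin, to avoid link flapping during marginal weather."""
    if current == "mmwave" and forecast_mbps < mmwave_floor:
        return "lte"
    if current == "lte" and forecast_mbps > mmwave_floor + hysteresis:
        return "mmwave"
    return current

# A storm front degrades the forecast, triggering a switch before the link fails.
link = "mmwave"
for forecast in [80.0, 55.0, 45.0, 30.0, 52.0, 65.0]:
    link = choose_link(forecast, current=link)
```

The DRL approach in the dissertation replaces these fixed thresholds with a policy learned online from cumulative throughput, but the decision interface (forecast in, link choice out) is the same.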

Rating AI Models for Robustness through a Causal Lens

Wednesday, February 4, 2026 - 09:30 am

DISSERTATION DEFENSE
Department of Computer Science and Engineering


Author: Kausik Lakkaraju
Advisor: Dr. Biplav Srivastava
Date: Feb 4th, 2026
Time: 9:30 am
Place: Room 529, AI Institute

Abstract

 

This dissertation examines how to assess and rate instability and bias in black-box AI models, with particular attention to large language models (LLMs) and composite AI models used in finance, healthcare, and other decision-sensitive contexts. Prior studies show that small changes in input or protected attributes (sensitive user information) can cause large shifts in model outputs, an issue that becomes more pronounced when multiple models are chained together to form a composite AI model.


The work introduces a causality-based rating method that tests black-box models to quantify sensitivity, statistical bias, and confounding effects under controlled input variations. Beyond measurement, the rating method converts raw metric scores into comparable ratings that aid users in model selection, provide holistic explanations when used in conjunction with traditional explanation methods to cater to the needs of multiple stakeholders, and support the assessment and construction of robust and efficient composite AI models when integrated with probabilistic planning methods. The rating method helps users make trade-offs among fairness, utility, and computational cost when choosing a model for a task based on the data in hand.

To support practical adoption, the dissertation presents ARC (AI Rating through Causality), a tool that applies the method across multiple tasks, supports Pareto analysis, and allows users to evaluate their own models within a fixed causal setup. User studies show that ratings reduce the effort required to understand model behavior and help users build efficient composite chatbots. This work also underpins a forthcoming Springer Nature book, Assessing, Explaining, and Rating AI Systems for Trust, With Applications in Finance.

Safe AI for Seniors

Friday, November 14, 2025 - 09:00 am
1112 Green St, Columbia, South Carolina 29208

Join us for a free event, AIx: Safe AI for Seniors, on Friday, November 14, 2025. The event will feature panels with leading academics, professionals, and community members on AI, cybersecurity, law, and public health; open discussions on AI and what it means for seniors; AI demos; games; and lunch! More information and RSVP here.

Multi-Perspective Feature Learning for Facial Expression Recognition in the Wild

Friday, October 24, 2025 - 12:30 pm

DISSERTATION DEFENSE

Author: Xiangyu Hu
Advisor: Dr. Yan Tong
Date: Oct 24th, 2025
Time: 12:30 pm
Place: Teams
Link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_Zjk5ZGM3NzctMzZm…

Abstract

With the rapid progress of deep learning, Facial Expression Recognition (FER) has seen substantial performance improvements, particularly "in the wild," i.e., under real-world conditions. Despite these advances, most existing methods extract features from facial images as the sole emotional cues, which limits a model's ability to capture the full complexity of human emotional expressions.

In reality, facial expressions are composed of diverse and multi-perspective information, including appearance-based cues and geometric structural deformations due to activations of facial muscles. Depending exclusively on one type of representation may fail to exploit the complementary nature of these cues, an issue that becomes especially pronounced under real-world conditions involving poor image quality, occlusion, varying head poses, and diverse personal attributes.

To overcome this limitation, we investigate how multiple perspectives of information, such as multiple levels of semantic patterns, facial geometry captured in facial landmarks, and multimodal representation of facial expressions, can be effectively extracted from the same facial image and integrated to enrich expression-discriminative feature representations. This multi-perspective feature learning strategy not only provides a holistic interpretation of facial expressions, but also encourages the learning of robust, multi-level representations that enhance generalization.

Motivated by this, we introduce three novel models designed to extract and fuse complementary features across different representations from facial images, thereby improving both the accuracy and robustness of FER systems.

First, we propose a Cascaded Feature Fusion Network (CFFN) that leverages low-level semantic features to refine predictions typically dominated by high-level semantic information. CFFN utilizes a multi-branch architecture featuring Semantic Feature Fusion Blocks (SFFB) to enable effective communication between neighboring branches. Additionally, Multi-Branch Fusion Blocks (MBFB) integrate multi-scale semantic features, facilitating predictions from multilevel features. Experimental results demonstrate that the proposed model achieves state-of-the-art performance, with further cross-dataset evaluations highlighting its generalization capability.

Second, we propose a Context-Aware Multi-cue Model (CAMM) to enhance FER by jointly leveraging appearance, geometric, and semantic information. The framework utilizes two coordinated CNN backbones to extract complementary facial appearance and geometry features, while a pretrained vision–language model generates descriptive captions that are encoded into semantic embeddings. These embeddings are incorporated into both visual branches through a Text Fusion Block (TFB) built upon Adaptive Instance Normalization, enabling adaptive modulation of visual representations guided by global semantic context. In addition, a Weighted Dilated Block (WDB) is introduced to aggregate multi-scale spatial information with learnable attention weights, thereby enhancing contextual perception. By aligning high-level semantics with spatial structure and visual appearance, CAMM produces robust and discriminative representations, achieving state-of-the-art performance under real-world conditions.
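The Adaptive Instance Normalization underlying the TFB can be sketched as follows. This is an illustrative sketch of the standard AdaIN operation, not the authors' block: the feature shapes are arbitrary, and the constant gamma/beta used here would, in an actual TFB, be predicted from the caption embedding:

```python
import numpy as np

def adain(visual, gamma, beta, eps=1e-5):
    """Adaptive Instance Normalization: normalize each channel of a
    (C, H, W) visual feature map, then re-scale and shift it with
    per-channel parameters derived from another modality."""
    mu = visual.mean(axis=(1, 2), keepdims=True)
    sigma = visual.std(axis=(1, 2), keepdims=True)
    norm = (visual - mu) / (sigma + eps)
    return gamma[:, None, None] * norm + beta[:, None, None]

rng = np.random.default_rng(1)
feat = rng.normal(3.0, 2.0, (16, 7, 7))  # one sample's visual feature map
gamma = np.full(16, 0.5)                 # in a TFB, gamma and beta would be
beta = np.full(16, 1.0)                  # linear maps of the text embedding
out = adain(feat, gamma, beta)           # channel stats now follow gamma/beta
```

The effect is that the text embedding steers the first- and second-order statistics of the visual features, which is the sense in which the semantic context "modulates" the visual branches.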

Third, we introduce a Semantic-Consensus Multi-Modal Learning (SC-MML) framework to address the challenge of noisy labels in in-the-wild FER datasets. SC-MML incorporates high-level textual descriptions generated by a pretrained vision–language model as an auxiliary modality, providing robust semantic cues that capture nuanced facial attributes and contextual emotion. The framework comprises two coordinated components: a Consensus Branch that constructs noise-robust soft labels by aggregating mutual nearest neighbors across visual and textual embedding spaces, and a Discriminative Branch equipped with a Query-Guided Gated Fusion (QGGF) module. The QGGF adaptively fuses semantic and visual representations through a gating mechanism that highlights consistent and informative cues while suppressing noisy or redundant information. By grounding supervision in cross-modal semantic consensus rather than potentially corrupted categorical annotations, SC-MML effectively decouples learning from noisy labels and enhances representation reliability. This consensus-driven design strengthens the robustness to annotation noise and improves generalization in complex real-world scenarios. Extensive evaluations on multiple benchmark FER datasets demonstrate that SC-MML surpasses existing noise-robust methods, offering a principled and efficient paradigm for multimodal learning under noisy supervision.