Shengguang Cui

AI Engineer & Researcher

Education

University of California, Los Angeles

M.S. in Electrical and Computer Engineering

Sep 2024 – Jun 2026 · Los Angeles, CA

GPA: 4.0/4.0

Courses: Deep Learning, Natural Language Processing, Neural Signal Processing, Information Theory, Trustworthy AI, AI on Chips

The Chinese University of Hong Kong

B.Eng. in Electronic Information Engineering — Computer Engineering

Sep 2020 – May 2024 · Shenzhen, China

GPA: 3.7/4.0 · Rank: 5/100 in major

Courses: Computer Architecture, Operating System, Parallel Computing, Database Systems, Algorithm Analysis, Machine Learning

Experience

Videospace, Inc.

Intern, Engineering and AI Research

Jun 2025 – Present · Los Angeles, CA
  • Built a live AI captioning system on Wowza Streaming Engine by developing custom Java modules that capture and resample audio, stream it to a speech-to-text service via WebSocket, and inject recognition results into HLS live streams, reducing end-to-end latency from 6 s to 1 s
  • Designed and implemented real-time translation and AI highlight services for live webcasts, wrapping LLM API call chains with streaming output support; implemented a REST API configuration delivery path for per-session source language, target language, and custom vocabulary setup
  • Deployed in-house Whisper, PaddleOCR, and LLaMA models on Lightning AI with GPU auto-scaling and API serving, replacing external API dependencies; evaluated LoRA and RAG as knowledge injection strategies to drive continuous domain adaptation for client-specific use cases
  • Built a Slack integration module in Django handling OAuth authorization, message data modeling, and REST API endpoints, enabling users to retrieve and browse Slack workspace discussions within video session context
  • Adapted the internal video processing engine into an asynchronous message processing engine, using Celery to receive tasks dispatched from Django and execute message collection, conversation topic segmentation, and LLM-powered summary and title generation

University of California, Los Angeles

Research Assistant

Mar 2025 – May 2025 · Los Angeles, CA
  • Proposed Query-Aware Contrastive Decoding (QACD), a training-free hallucination suppression method for LVLMs that extends Visual Contrastive Decoding by having the LVLM generate query-conditional corruption in a single forward pass, with target regions derived from cross-attention
  • Implemented QACD in PyTorch on LLaVA-1.5, designing adversarial planner prompts to elicit structured corruption recipes, extracting target regions from cross-attention, and curating an image corruption operation set; evaluated on POPE against VCD and SelfAug-VCD baselines

Duke Kunshan University

Research Intern, Federated Learning

Feb 2023 – Dec 2023 · Shenzhen, China
  • Proposed HeteroPruneFL, a Federated Learning framework for device-heterogeneous settings that assigns each edge device a customized subnetwork via importance-based pruning, and introduces dynamic sparse training in local client training to adapt network topology to local data distributions, maintaining accuracy comparable to the full model under strict resource constraints
  • Built a reproducible PyTorch experiment environment including server-client orchestration, automated run scripts, and result analysis tooling, validating effectiveness across a 4-dataset × 3-baseline experiment matrix with consistent accuracy gains under all constraint settings

Projects

RAG-Based Emoji Toxicity Detection System

Course project for Natural Language Processing, UCLA

Jan 2026 – Mar 2026 · Los Angeles, CA
  • Built a context-aware emoji toxicity detection system for online content moderation using LangChain: constructed an emoji slang vector knowledge base on Pinecone, implemented hybrid retrieval combining exact symbol fetch with dense semantic search via query expansion, and fed retrieved entries into an LLM for structured toxicity classification, implementing both workflow and agent inference modes
  • Constructed an evaluation set testing the same emojis across harmful and benign contexts; compared raw LLM, workflow, and agent approaches using GPT-5, with the workflow mode achieving 94.8% accuracy, outperforming the raw LLM by 19 pp and surpassing the agent mode in both accuracy and efficiency
  • Designed and implemented a dynamic knowledge base update pipeline based on Reddit corpus collection, LLM-assisted slang extraction, and multi-source cross-validation, with incremental indexing and automated regression testing

Automated Privacy Testing for LLMs through Fuzzing

Course project for Trustworthy AI, UCLA

Jan 2025 – Mar 2025 · Los Angeles, CA
  • Built an automated privacy attack testing framework for LLMs, evaluating the risk that personally identifiable information can be extracted from them; extended PROMPTFUZZ with privacy-oriented mutators and an HTML-aware template engine for automated adversarial prompt generation and evaluation
  • Achieved a +7 pp lift in PII extraction attack success rate (85% → 92%) on GPT-4o through iterative mutation and response analysis

Technical Skills

Programming Languages

Python · C/C++ · Java · SQL · JavaScript · MATLAB

Frameworks & Tools

PyTorch · CUDA · HuggingFace · Scikit-Learn · MySQL · Django · REST APIs · Git · Celery

Shengguang Cui — Resume 2026