Shengguang Cui
AI Engineer & Researcher
Education
University of California, Los Angeles
M.S. in Electrical and Computer Engineering
Sep 2024 – Jun 2026 · Los Angeles, CA
GPA: 4.0/4.0
Courses: Deep Learning, Natural Language Processing, Neural Signal Processing, Information Theory, Trustworthy AI, AI on Chips
The Chinese University of Hong Kong
B.Eng. in Electronic Information Engineering — Computer Engineering
Sep 2020 – May 2024 · Shenzhen, China
GPA: 3.7/4.0 · Rank: 5/100 in major
Courses: Computer Architecture, Operating Systems, Parallel Computing, Database Systems, Algorithm Analysis, Machine Learning
Experience
Videospace, Inc.
Intern, Engineering and AI Research
Jun 2025 – Present · Los Angeles, CA
- Built a live AI captioning system on Wowza Streaming Engine by developing custom Java modules that capture and resample audio, stream it to a speech-to-text service over WebSocket, and inject recognition results into HLS live streams, reducing end-to-end latency from 6 s to 1 s
- Designed and implemented real-time translation and AI highlight services for live webcasts, wrapping LLM API call chains with streaming output support; implemented a REST API configuration delivery path for per-session source language, target language, and custom vocabulary setup
- Deployed in-house Whisper, PaddleOCR, and LLaMA models on Lightning AI with GPU auto-scaling and API serving, replacing external API dependencies; evaluated LoRA and RAG as knowledge injection strategies to drive continuous domain adaptation for client-specific use cases
- Built a Slack integration module in Django handling OAuth authorization, message data modeling, and REST API endpoints, enabling users to retrieve and browse Slack workspace discussions within the video session context
- Adapted the internal video processing engine into an asynchronous message processing engine, using Celery to receive tasks dispatched from Django and execute message collection, conversation topic segmentation, and LLM-powered summary and title generation
University of California, Los Angeles
Research Assistant
Mar 2025 – May 2025 · Los Angeles, CA
- Proposed Query-Aware Contrastive Decoding (QACD), a training-free hallucination suppression method for LVLMs that extends Visual Contrastive Decoding by having the LVLM generate a query-conditional corruption in a single forward pass, with target regions derived from cross-attention
- Implemented QACD in PyTorch on LLaVA-1.5: designed adversarial planner prompts to elicit structured corruption recipes, extracted target regions from cross-attention, and curated an image corruption operation set; evaluated on POPE against VCD and SelfAug-VCD baselines
Duke Kunshan University
Research Intern, Federated Learning
Feb 2023 – Dec 2023 · Shenzhen, China
- Proposed HeteroPruneFL, a Federated Learning framework for device-heterogeneous settings that assigns each edge device a customized subnetwork via importance-based pruning and applies dynamic sparse training during local client training to adapt network topology to local data distributions, maintaining accuracy comparable to the full model under strict resource constraints
- Built a reproducible PyTorch experiment environment including server-client orchestration, automated run scripts, and result analysis tooling, validating effectiveness across a 4-dataset × 3-baseline experiment matrix with consistent accuracy gains under all constraint settings
Projects
RAG-Based Emoji Toxicity Detection System
Course project for Natural Language Processing, UCLA
Jan 2026 – Mar 2026 · Los Angeles, CA
- Built a context-aware emoji toxicity detection system for online content moderation using LangChain: constructed an emoji slang vector knowledge base on Pinecone, implemented hybrid retrieval combining exact symbol fetch with dense semantic search via query expansion, and fed retrieved entries into an LLM for structured toxicity classification, with both workflow and agent inference modes
- Constructed an evaluation set testing the same emojis across harmful and benign contexts; compared raw LLM, workflow, and agent approaches using GPT-5, with the workflow achieving 94.8% accuracy, outperforming the raw LLM by 19 pp and surpassing the agent in both accuracy and efficiency
- Designed and implemented a dynamic knowledge base update pipeline based on Reddit corpus collection, LLM-assisted slang extraction, and multi-source cross-validation, with incremental indexing and automated regression testing
Automated Privacy Testing for LLMs through Fuzzing
Course project for Trustworthy AI, UCLA
Jan 2025 – Mar 2025 · Los Angeles, CA
- Built an automated privacy attack testing framework for LLMs that evaluates the risk of personally identifiable information being extracted from them; extended PROMPTFUZZ with privacy-oriented mutators and an HTML-aware template engine for automated adversarial prompt generation and evaluation
- Raised the PII extraction attack success rate on GPT-4o by 7 pp (85% → 92%) through iterative mutation and response analysis
Technical Skills
Programming Languages
Python, C/C++, Java, SQL, JavaScript, MATLAB
Frameworks & Tools
PyTorch, CUDA, HuggingFace, Scikit-Learn, MySQL, Django, REST APIs, Git, Celery