Research Experience
Machine Learning, Perception, and Cognition Lab, University of California, San Diego 07/2022–11/2022
Advisor: Prof. Zhuowen Tu
Open-Vocabulary Image Segmentation
- Proposed a universal two-stage framework for panoptic, instance and semantic segmentation.
- Focused on adapting CLIP’s feature space for fine-grained segmentation, and identified an efficient and effective way to fine-tune the query projections of CLIP’s visual branch.
- Outperformed prior work on all three tasks by at least 25%. Co-first-authored a paper to appear at ICCV 2023.
Graphics and Geometric Computing Group, Tsinghua University 04/2021–07/2022
Advisors: Prof. Shimin Hu and Prof. Songhai Zhang
Write A Video: Automatic Video Editing System
- Proposed a three-step approach to generate well-edited short movies from input videos, reducing user input to text descriptions and simple choices.
- Utilized CLIP for multi-label theme mapping and video clustering.
Automatic Commercial Place Planning System
- Parameterized design plans for commercial places into a set of patterns and constraints, and designed an optimization algorithm based on simulated annealing.
- Achieved results comparable to those of professional designers in scene plausibility and shopping efficiency in a user study. Contributed to a paper submitted to IEEE VR 2023.
Semi-supervised Image Cropping
- Applied GANs to crop images into aesthetically composed sub-images. Resolved mode collapse by adding saliency detection as a feature of image structure and designing a specialized loss based on prior knowledge of photography.