I am Chenhao Qiu (邱晨浩), an AI researcher studying misalignment in agentic systems — and increasingly in agentic reinforcement learning: why an agent’s actions and answers drift away from the evidence and objectives it is meant to follow, and how to pull them back into alignment. My first-author work on decoupling answer authority in long-video agents appears at ICML 2026.

My path into research ran through competition rather than the lab. I spent most of my master’s deep in algorithm contests, working across nearly the full stack of machine learning — from traditional data mining and time-series modeling to computer vision, NLP, and, more recently, large language and multimodal models. Competing across so many directions is where my taste for research took shape: I kept hitting questions the leaderboard could not answer. It was only after starting full-time work that I finally had the room to chase those questions properly and turn them into papers — ICML 2026 is the first.

Today I am a researcher in the Foundation Model Research Group at Mango TV, after algorithm internships on the foundation-model SFT teams at Tencent (Hunyuan) and Meituan (Longcat); details are in Work Experience and Internships below. That competitive breadth still shows on the record: 8 First-Place and 8 Second-Place finishes in major AI contests, with cumulative prizes of approximately $530,000.

I am applying to PhD programs in North America for Fall 2027, and I am actively seeking research collaborations and research internship opportunities. Feel free to reach out — my contact details are at the bottom of this page.

🔥 News

2026.05: 🎉 Our paper Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority was accepted to ICML 2026 as a Poster.
2026.03: 🏆 Won 1st Place at the CVPR 2026 NTIRE Challenge on X-AIGC Quality Assessment in Image Editing.
2025.11: 🏆 Won 1st Place at the Tencent Advertising Algorithm Competition (1 / 8300+, prize ≈ $280,000).
2025.09: 🏆 Won 1st Place at the ICCV 2025 Challenge on Visual Question Answering with Spatial Awareness.
2025.07: 🚀 Started as an AI Researcher in the Foundation Model Research Group at Mango TV.

📝 Publications

ICML 2026

Architectural comparison between the coupled agent and the decoupled agent.

Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority

Chenhao Qiu, Yechao Zhang, Xin Luo, Shien Song, Xusheng Liu

The 43rd International Conference on Machine Learning (ICML 2026), Seoul, South Korea — Poster.

Identifies a structural bottleneck — evidence misalignment — in current MLLM agents for long-video QA: agents commit to answers that drift away from the visual evidence they cite.
Proposes an authority-decoupling framework that separates evidence retrieval from answer generation, so the answering module no longer overrides the agent’s grounded findings.
Distilled from real-world industrial long-video pipelines and validated through extensive experiments across multiple long-video QA benchmarks.

Selected Manuscripts / In Progress

WISDOM: Progressive Curriculum Synthesis Makes LLMs Better Mathematical Reasoners — Chenhao Qiu et al. Submitted to ICLR 2025 (review scores 6/6/6/1). A 3-stage iterative curriculum synthesis pipeline (Weak Teacher Guiding → Critical Expert Teaching → Experts Consistency Voting); WISDOM-7B reaches 62.4% on MATH and 2/30 on AIME 2024; trained on an 88×A100 cluster.

🎖️ Honors and Awards

Competition record: 8 × 1st Place, 8 × 2nd Place, cumulative prizes ≈ $530,000. Selected highlights below.

2026 — 1st Place, CVPR NTIRE Challenge on X-AIGC Quality Assessment in Image Editing.
2025 — 1st Place, Tencent Advertising Algorithm Competition. Rank 1 / 8300+, prize ≈ $280,000.
2025 — 1st Place, ICCV Challenge on Visual Question Answering with Spatial Awareness.
2024 — 1st Place, Mango TV Large Model Competition — Logical Reasoning Track. Prize ≈ $33,000.
2024 — 1st Place, ATEC 2024 Online Competition — Track 4.
2024 — 1st Place, ATEC Challenge on Large Model Application and Security. Prize ≈ $140,000.
2024 — 1st Place, 4th SEED Competition — Healthcare Track. Prize ≈ $11,000.
2025 — 1st Place, China Telecom Cloud “Xirang Cup” Collegiate AI Competition — Online Round, LLM Mathematical Reasoning Track.
2025 — 2nd Place, China Telecom Cloud “Xirang Cup” Collegiate AI Competition — National Finals.
2025 — 2nd Place, Pazhou Algorithm Competition; recognized with an individual Letter of Appreciation issued as an official government document by the People’s Government of Haizhu District, Guangzhou.
2023 — 2nd Place, ATEC Challenge on LLM-Generated News Detection.
2023 — 2nd Place, ICDM Challenge (Ant Group TuGraph) on Pretrained-Model-based Community Detection and Gang Mining.
2023 — 2nd Place, Digital China Innovation Contest (DCIC) — Data Development Track.
2023 — 2nd Prize (National), Baidu Business AI Technology Innovation Competition — National Finals.
2023 — 2nd Prize (National), China Collegiate Computing Contest — Big Data Challenge.
2022 — 2nd Place, ATEC Tech Elite Competition — Digital Security Track.

📖 Education

2022.09 – 2025.06, M.Eng. in Software Engineering, Huazhong University of Science and Technology (HUST), Wuhan, China.
2018.09 – 2022.06, B.Eng. in Software Engineering, Nanchang University, Nanchang, China.

💼 Work Experience

2025.07 – Present, AI Researcher, Foundation Model Research Group, Mango TV, Changsha.
- Multimodal Video Understanding: Lead the R&D of multimodal understanding algorithms for intelligent media-asset management; built a shot-level structured parsing framework for long-form videos integrating face recognition, visual semantic modeling, and MLLMs.
- Independent Academic Research: Turned insights from real-world industrial pipelines and ongoing tracking of state-of-the-art literature into a first-author ICML 2026 paper on evidence misalignment in MLLM agents.
- Competition Organization: Played a key role in organizing the 2026 Mango TV Algorithm Competition, covering task design, data construction and annotation standards, and baseline solutions. Proposal submitted to ACM MM.
- International Challenge Leadership: Sole contributor leading the department’s international competition track, with 1st Place finishes in both the ICCV 2025 Challenge on Visual Question Answering with Spatial Awareness and the CVPR 2026 NTIRE Challenge on X-AIGC Quality Assessment in Image Editing.

💻 Internships

2024.06 – 2024.09, Algorithm Intern, Foundation Model SFT Team, Tencent (Hunyuan), Shenzhen.
- Optimized the role-playing capabilities of the Hunyuan LLM, taking the base model to #1 domestic / #2 overall on the SuperCLUE role-playing benchmark.
- Built a data synthesis pipeline on top of the Hunyuan 7×8B MoE model; raised the perfect-score rate from 23% to over 50% while holding the zero-score rate under 15%, one month ahead of schedule.
2024.02 – 2024.05, Algorithm Intern, Foundation Model SFT Team, Meituan (Longcat), Beijing.
- Curated high-quality SFT datasets blending open-source and GPT-4-generated data; dynamically tuned mixing ratios to improve both domain-specific and general metrics.
- Contributed to an in-house Megatron-based distributed training framework that achieved a 4× training speedup; fine-tuned Qwen1.5-72B, Yi-34B, and Llama2-70B.
- Built an RLHF pipeline (DPO, PPO) on top of SFT models and a robust bad-case analysis loop for continuous iteration.

🛠️ Skills & Interests

AI-assisted Engineering: daily user of Claude / Codex for prototyping, refactoring, and full-system development, pairing AI-generated code with careful design, testing, and engineering discipline.
Full-stack independent projects: designed, built, and shipped PaperAgent end-to-end (algorithmic pipeline, backend, database, iOS client) for AI-assisted paper retrieval, citation mining, and personalized daily literature recommendations.
Beyond research: long-distance solo travel, which has sharpened my independence, adaptability, and problem-solving in unfamiliar environments.

Chenhao Qiu