Publications
For the most recent publications, please see my Google Scholar profile.
2025
- ACL
Antileak-bench: Preventing data contamination by automatically constructing benchmarks with updated real-world knowledge. arXiv preprint arXiv:2412.13670, 2025
- AAAI
Towards Verifiable Text Generation with Generative Agent. Proceedings of the AAAI Conference on Artificial Intelligence, 2025
- R2-FM@ICML
Guardreasoner-vl: Safeguarding vlms via reinforced reasoning. arXiv preprint arXiv:2505.11049, 2025
- Preprint
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? arXiv preprint arXiv:2507.12415, 2025
- Preprint
Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts. arXiv preprint arXiv:2508.06361, 2025
- PRAL@ICML
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization. arXiv preprint arXiv:2505.23387, 2025
- Preprint
Benchmarking LLMs for Unit Test Generation from Real-World Functions. arXiv preprint arXiv:2508.00408, 2025
- JMIR
Unraveling Online Mental Health Through the Lens of Early Maladaptive Schemas: AI-Enabled Content Analysis of Online Mental Health Communities. Journal of Medical Internet Research, 2025
- NeurIPS
EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code. arXiv preprint arXiv:2505.13004, 2025
- ICSE
Measuring the Influence of Incorrect Code on Test Generation. The International Conference on Software Engineering 2026, 2025
- NeurIPS
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization. arXiv preprint arXiv:2505.23387, 2025
- NeurIPS
Guardreasoner-vl: Safeguarding vlms via reinforced reasoning. arXiv preprint arXiv:2505.11049, 2025
2024
- arXiv
Rethinking the Influence of Source Code on Test Case Generation. arXiv preprint arXiv:2402.07844, 2024
- ACL
- arXiv
Committee: Mitigating Language Model Bias via Weak Supervision. arXiv preprint, 2024
- AAAI
Chain-of-Thought Improves Text Generation with Citations in Large Language Models. In Proceedings of the AAAI Conference on Artificial Intelligence, 2024
2023
- TLDK
Constituency-Informed and Constituency-Constrained Extractive Question Answering with Heterogeneous Graph Transformer. In Transactions on Large-Scale Data- and Knowledge-Centered Systems LIII, 2023
- WWW
Identifying Checkworthy Cure Claims on Twitter. In Proceedings of the ACM Web Conference 2023, 2023
2022
- CLEF
NUS-IDS at CheckThat! 2022: identifying check-worthiness of tweets using CheckthaT5. Working Notes of CLEF, 2022