My research interests are in the areas of Natural Language Processing (NLP), Machine Learning (ML), and systems and architectural design for large-scale text understanding, mining and retrieval.
News
I'm looking for research interns to work on Large Language Models (LLMs), Retrieval Augmented Generation (RAG), Agents, and NLP in general. Please feel free to email me for more information.
2013/7-2022/8: Raytheon Engineering Fellow / Lead Scientist, Principal Investigator (PI), Head of Text group, Raytheon BBN Technologies
Led a group of scientists and engineers to develop novel NLP tools, the associated ML models, and the underlying software and computing infrastructure.
Research in NLP, automatic knowledge base construction, multilingual/cross-lingual NLP and information retrieval.
2012/6-2012/9: Research Intern, DeepQA team, IBM T.J. Watson Research Center.
Research in Deep Question Answering for Watson. Primarily worked to enrich Watson’s knowledge base with information extracted from unstructured text.
Amir Pouran Ben Veyseh, Franck Dernoncourt, Bonan Min, Thien Nguyen. Generating Labeled Data for Relation Extraction: A Meta Learning Approach with Joint GPT-2 Training. In Findings of ACL 2023.
Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang. Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning. In Proceedings of ACL 2023.
Jin Zhao, Nianwen Xue, Bonan Min. Cross-Document Event Coreference Resolution: Instruct Humans or Instruct GPT?. In Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL) 2023.
John Hungerford, Yee Seng Chan, Jessica MacBride, Benjamin Gyori, Andrew Lee Zupon, Zheng Tang, Egoitz Laparra, Haoling Qiu, Bonan Min, Yan Zverev, Caitlin Hilverman, Max Thomas, Walter Andrews, Keith Alcock, Zeyu Zhang, Michael Reynolds, Mihai Surdeanu, Steven Bethard, Rebecca Sharp. Taxonomy Builder: a Data-driven and User-centric Tool for Streamlining Taxonomy Construction. To appear at the 2nd HCI + NLP Workshop at NAACL-HLT 2022.
Minh Van Nguyen, Bonan Min, Franck Dernoncourt, Thien Nguyen. Learning Cross-Task Dependencies for Joint Extraction of Entities, Events, Event Arguments, and Relations. In Proceedings of EMNLP 2022.
Nghia Ngo, Bonan Min, Thien Nguyen. Unsupervised Domain Adaptation for Joint Information Extraction. In Findings of EMNLP 2022.
Amir Pouran Ben Veyseh, Franck Dernoncourt, Bonan Min, Thien Huu Nguyen. Generating Complement Data for Aspect Term Extraction with GPT-2. In Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing at NAACL-HLT 2022.
Minh Van Nguyen, Nghia Trung Ngo, Bonan Min, Thien Huu Nguyen. FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction. In Proceedings of NAACL-HLT 2022 (Demonstrations).
Oscar Sainz, Haoling Qiu, Oier Lopez de Lacalle, Eneko Agirre, Bonan Min. ZS4IE: A toolkit for Zero-Shot Information Extraction with simple Verbalizations. In Proceedings of NAACL-HLT 2022 (Demonstrations).
Oscar Sainz, Itziar Gonzalez-Dios, Oier Lopez de Lacalle, Bonan Min, Eneko Agirre. Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning. In Proceedings of NAACL-HLT 2022 (Findings).
Jiarui Yao, Nianwen Xue, Bonan Min. Modal Dependency Parsing via Language Model Priming. In Proceedings of NAACL-HLT 2022.
Minh Van Nguyen, Bonan Min, Franck Dernoncourt, and Thien Huu Nguyen. Joint Extraction of Entities, Relations, and Events via Modeling Inter-Instance and Inter-Label Dependencies. In Proceedings of NAACL-HLT 2022.
Amir Pouran Ben Veyseh, Minh Van Nguyen, Franck Dernoncourt, Bonan Min, Thien Huu Nguyen. Document-Level Event Argument Extraction via Optimal Transport. In Proceedings of ACL 2022 (Findings).
Bonan Min. Exploring Pre-Trained Transformers and Bilingual Transfer Learning for Arabic Coreference Resolution. To appear in Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC) at EMNLP, 2021.
Bonan Min, Benjamin Rozonoyer, Haoling Qiu, Alexander Zamanian, Nianwen Xue and Jessica MacBride. ExcavatorCovid: Extracting Events and Relations from Text Corpora for Temporal and Causal Analysis for COVID-19. To appear at EMNLP 2021 (demonstration).
Minh Van Nguyen, Tuan Ngo Nguyen, Bonan Min and Thien Huu Nguyen. Crosslingual Transfer Learning for Relation and Event Extraction via Word Category and Class Alignments. To appear at EMNLP 2021.
Amir Pouran Ben Veyseh, Minh Van Nguyen, Nghia Ngo Trung, Bonan Min and Thien Huu Nguyen. Modeling Document-Level Context for Event Detection via Important Context Selection. To appear at EMNLP 2021.
Manaj Srivastava, David Trupiano, David Akodes, Ilana Heintz, Hannah Provenza, Bonan Min, Jay DeYoung, Lance A Ramshaw, Roger Bock. Adept Automatic Knowledge Discovery System for Cold Start Knowledge Base Population. In Proceedings of the Text Analysis Conference (TAC) 2017.
Bonan Min and Marjorie Freedman. BBN's 2016 System for Cold Start Knowledge Base Population. In Proceedings of the Text Analysis Conference (TAC), November 2016.
Bonan Min, Marjorie Freedman and Constantine Lignos. BBN’s 2015 System for Cold Start Knowledge Base Population. In Proceedings of the Text Analysis Conference (TAC), November 2015.
Bonan Min and Marjorie Freedman. BBN System for Cold Start Knowledge Base Population. In Proceedings of the Text Analysis Conference (TAC), November 2014.
Nguyen Tran, Bonan Min, Jinyang Li, and Lakshminarayanan Subramanian. Sybil resilient online content rating. In Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI), April 2009.
Mingzhong Xiao, Xiaoxiao Hou, and Bonan Min. A Practical Scheme for Content Filter in P2P File Sharing System. Computer Engineering (Chinese), 18, 2008.
Mingzhong Xiao, Jiacong Wang, and Bonan Min. Matrix Bloom Filter on Dynamic Set. Application Research of Computers (Chinese), 7, 2008.
Bonan Min. Peer-Assisted Traffic Optimization for P2P Networks. Outstanding Master's Thesis. Peking University. 2008.
Bonan Min, Mingzhong Xiao, Qinyuan Feng, Jiacong Wang, Jing Jiang. A simple, universal and scalable approach to migrate applications to hybrid network using ALG. In Proceedings of International Workshop on NGI and P2P Systems (INPS), 2006.
Jiacong Wang, Mingzhong Xiao, Jing Jiang, and Bonan Min. i-DBF: an Improved Bloom Filter Representation Method on Dynamic Set. In Proceedings of International Workshop on NGI and P2P Systems (INPS), 2006.
XIAO Ming-zhong, MIN Bo-nan, WANG Jia-chong, DAI Ya-fei. Practical Hashing Function for URLs Set. MINI-MICRO SYSTEMS, 2006 Vol.27 No.3 P.538-541 (In Chinese)
Technical Report
Bonan Min, Nguyen Tran, Jinyang Li, and Lakshminarayanan Subramanian. Routepourri: Defensive BGP Route Selection. New York University CS Department. 2009.
Patents
Bonan Min, Rabih Zbib, Zhongqiang Huang. Cross-Lingual Information Retrieval and Information Extraction. US Patent 11,531,824
Bonan Min, Yee Seng Chan, Ilana Heintz. Linguistically Rich Cross-Lingual Text Event Embeddings. US Patent 11,227,128
Projects
IARPA Human Interpretable Attribution of Text Using Underlying Structure (HIATUS) (PI leading the winning proposal)
IARPA COVID-19 Research (PI)
IARPA Better Extraction from Text Towards Enhanced Retrieval (BETTER) (PI)
DARPA World Modelers (PI)
DARPA Causal Exploration (PI)
DARPA Deep Exploration and Filtering of Text (DEFT): as Tech Lead
DARPA Low Resource Languages for Emergent Incidents (LORELEI): as contributor
ONR Data-to-Decision: as contributor
DARPA Machine Reading: as contributor
Current/Past Research Interns
Haoling Qiu (Northeastern University), 2017 spring
Wenlin Yao (Texas A&M University), 2018 summer
Hayley Ross (Brandeis University), 2019 summer
Rashmi Sankepally (University of Maryland), 2019 summer
Suraj Nair (University of Maryland), 2020 summer
Md Mosharaf Hossain (University of North Texas), 2020 summer
Professional Activities
Associate Editor of:
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 2019-2021
Conference Organizing Committee:
LREC-COLING 2024 (Senior Area Chair for Information Extraction, Knowledge Extraction, and Text Mining)
ACL 2023 (Area Chair for Information Extraction)
NAACL 2022 (Co-Chair for Industry Track)
EMNLP 2018 (Video Chair)
NAACL 2018 (Area Chair for Information Extraction)