Feng Li - PhD Student

About me

My name is Feng. I am a first-year PhD student in Informatics at Indiana University Bloomington, where I am advised by Prof. Samantha Wood at the Building a Mind Lab.

I investigate the origins of biological intelligence by raising newborn animals and artificial agents in parallel controlled environments. My work uses Digital Twins to separate the effects of learning machinery (e.g., curiosity-driven reinforcement learning) from those of training data (e.g., embodied experience), with the goal of uncovering the computational basis of social and cognitive development.

Previously, I worked as a Software Engineer at AI Rudder, focusing on conversational AI and NLP.

I earned my master’s degree in Computer Science from the University of Illinois at Urbana-Champaign in Winter 2021. Prior to that, I worked as a Research Intern at Westlake University (Prof. Zhenzhong Lan), following my bachelor’s degree from Ohio State University in 2020.

Outside of work, I enjoy working out, reading, and exploring fields like cognitive science, neuroscience, psychology, and physics. These activities continuously expand my horizons and fuel my curiosity. At my core, I aim to minimize the free energy of my internal world model, seeking the simplest way to understand and express the world around me.

Publications

Toward the Limitation of Code-Switching in Cross-Lingual Transfer

Y Feng, F Li, P Koehn. EMNLP.

Multilingual pretrained models have shown strong cross-lingual transfer ability. Some works used code-switching sentences, which consist of tokens from multiple languages, to enhance the cross-lingual representation further, and have shown success in many zero-shot cross-lingual tasks. However, code-switched tokens are likely to cause grammatical incoherence in newly substituted sentences, and negatively affect the performance on token-sensitive tasks, such as Part-of-Speech (POS) tagging and Named-Entity-Recognition (NER). This paper mitigates the limitation of the code-switching method by not only making the token replacement but considering the similarity between the context and the switched tokens so that the newly substituted sentences are grammatically consistent during both training and inference. We conduct experiments on cross-lingual POS and NER over 30+ languages, and demonstrate the effectiveness of our method by outperforming the mBERT by 0.95 and original code-switching method by 1.67 on F1 scores.

Paper Code

December 2022

Learn To Remember: Transformer With Recurrent Memory For Document-Level Machine Translation

Y Feng, F Li, Z Song, B Zheng, P Koehn. NAACL.

The Transformer architecture has led to significant gains in machine translation. However, most studies focus on only sentence-level translation without considering the context dependency within documents, leading to the inadequacy of document-level coherence. Some recent research tried to mitigate this issue by introducing an additional context encoder or translating with multiple sentences or even the entire document. Such methods may lose the information on the target side or have an increasing computational complexity as documents get longer. To address such problems, we introduce a recurrent memory unit to the vanilla Transformer, which supports the information exchange between the sentence and previous context. The memory unit is recurrently updated by acquiring information from sentences, and passing the aggregated knowledge back to subsequent sentence states. We follow a two-stage training strategy, in which the model is first trained at the sentence level and then finetuned for document-level translation. We conduct experiments on three popular datasets for document-level machine translation and our model has an average improvement of 0.91 s-BLEU over the sentence-level baseline. We also achieve state-of-the-art results on TED and News, outperforming the previous work by 0.36 s-BLEU and 1.49 d-BLEU on average.

Paper Code

July 2022

What's New

[Aug 2025] I joined the Building a Mind Lab at Indiana University Bloomington as a Graduate Research Assistant, advised by Prof. Samantha Wood.
[Sep 2024] I joined the Laureate Institute for Brain Research as a Research Assistant, focusing on decision-making models to understand human behavior in cognitive tasks.
[Jun 2024] I left AI Rudder to pursue an academic role dedicated to enhancing the reasoning abilities of machine learning models.
[May 2024] I reached my target body weight in just 49 days! 🎉🎉 (graph starts from the second week, with one measurement per day).
[Mar 2024] I started a weight loss plan with the goal of losing 8 kg in 60 days through diet control and exercise.
[Feb 2023] I joined AI Rudder as a Software Engineer, focusing on developing voice bots for phone calls.
[Dec 2022] Our paper "Toward the Limitation of Code-Switching in Cross-Lingual Transfer" has been accepted to EMNLP!
[Dec 2022] I received my master's degree in Computer Science from the University of Illinois at Urbana-Champaign.
[Dec 2022] I moved to the Bay Area, California!
[Jul 2022] Our paper "Learn To Remember: Transformer With Recurrent Memory For Document-Level Machine Translation" has been accepted to NAACL!
