Hi, I'm Kayo
Bonjour, je suis Kayo
今日は、かよと申します
你好，我是殷綺妤

[kajo iɴ]
she/her
kayoy🥸cs.cmu.edu
Curriculum Vitae


I am currently a research intern at DeepMind, working with Chris Dyer on the Language Team. Next fall, I am starting my PhD in Computer Science at UC Berkeley with Dan Klein and Jacob Steinhardt.

I obtained my Master's in Language Technologies in 2022 from Carnegie Mellon University where I was advised by Graham Neubig and was a member of NeuLab. I received my Bachelor's degree in Mathematics & Computer Science in 2020 from École Polytechnique. Before that, I went to a bilingual high school and studied music and dance at the conservatoire.

My research interests right now are probably as difficult to pin down as my music taste. I am generally motivated by ideas that (1) improve our understanding of how things work (e.g. black-box models, natural languages, human cognition, the universe...), (2) build tools that benefit various communities and make the world more inclusive, or (3) offer elegant solutions to really hard research problems that work efficiently and robustly in the long run.

I come from Akashi, Japan and grew up speaking four languages in Paris, France. Outside research, I enjoy playing music, martial arts, hiking, rock climbing, solo traveling, and making oddly specific Spotify playlists.

  • 2022-06-06 I started my internship at DeepMind! If you're in London this summer, let's meet up :)
  • 2022-05-19 I was invited to the NLP Highlights Podcast for their episode on Including Signed Languages in NLP!
  • 2022-04-15 I will join UC Berkeley for my PhD next Fall!
  • 2021-11-05 Gave an invited talk at DeepMind on Natural Language Processing for Signed Languages
  • 2021-10-07 Gave an invited talk at University of Pittsburgh on Extending Neural Machine Translation to Dialogue and Signed Languages
  • 2021-09-23 Extremely honored to be selected as a Siebel Scholar Class of 2022!
  • 2021-09-17 Gave an invited talk at SIGTYP on Understanding, Improving and Evaluating Context Usage in Context-aware Machine Translation
  • 2021-08-26 Super happy to have 2 papers accepted to EMNLP 2021!
  • 2021-07-25 1 paper accepted to the AT4SSL workshop at MT Summit 2021!
  • 2021-07-05 Extremely thrilled to receive the Best Theme Paper award at ACL 2021!
  • 2021-05-06 Super excited to have 3 papers accepted to ACL 2021!
  • 2021-03-01 Gave an invited talk at Unbabel on Do Context-Aware Translation Models Pay the Right Attention?
  • 2020-10-18 Gave an invited talk at Computer Vision Talks on Sign Language Translation with Transformers
  • 2020-09-30 My first conference submission was accepted to COLING 2020!
  • 2020-09-21 Extremely honored to be awarded Global Winner in Computer Science at The Global Undergraduate Awards 2020!
  • 2020-08-31 Started my Master's degree at CMU LTI!
  • 2020-07-25 My undergraduate paper has been accepted to the SLRTP workshop at ECCV'20!
    Research

    My research so far has been motivated by my goal to use NLP to break down communication barriers between people and give all language users access to technology. Here are some topics that I have worked on:

  • Model Interpretability: I am interested in understanding how current NLP models use contextual information to make decisions during language generation (arxiv'22) and using this information to guide models to process context more efficiently. I am also interested in how neural networks process language on a fundamental level and how machine intelligence compares with human cognition.

  • Context-aware Machine Translation: I am interested in when context, whether at an intra-sentential (within the current sentence), inter-sentential (across multiple sentences), or extra-linguistic (e.g. social, temporal, cultural) level, is required during translation, and in how to model these features in machine translation. My work examines when translation requires context and how well models perform on such translations (arxiv'21), how much context-aware models actually use context (ACL'21), whether models use the type of context we expect them to (ACL'21), and how to encourage models to use more and better context.

  • Sign Language Processing: I am interested in modeling signed languages from a linguistic perspective and extending existing language technologies to signed languages. I have argued for the importance, both social and scientific, of including signed languages in NLP and outlined how to get involved (ACL'21). I have also researched how to translate a signed language into a spoken language (ECCV'20, COLING'20), how to perform data augmentation for sign language translation (MTSummit'21), and how to perform coreference resolution of pronominal pointing signs (EMNLP'21).

  • In addition to the above topics, I am also curious about ML robustness and safety, vision × language, reward specification, computational linguistics, psycholinguistics, AI music, and animal NLP.
    Please do not hesitate to reach out if you'd like to chat or collaborate! I am generally responsive to emails and Twitter messages, but not as responsive as I'd like to be on LinkedIn.

    Publications


    * = equal contribution

    2022

    • Interpreting Language Models with Contrastive Explanations
      Kayo Yin and Graham Neubig.
      In submission.
      PDF

    2021

    • Signed Coreference Resolution
      Kayo Yin, Kenneth DeHaan and Malihe Alikhani.
      Conference on Empirical Methods in Natural Language Processing (EMNLP). November 2021.
      PDF Code/Data Video

    • When is Wall a Pared and when a Muro?: Extracting Rules Governing Lexical Selection
      Aditi Chaudhary, Kayo Yin, Antonios Anastasopoulos and Graham Neubig.
      Conference on Empirical Methods in Natural Language Processing (EMNLP). November 2021.
      PDF Code/Data

    • When Does Translation Require Context? A Data-driven, Multilingual Exploration
      Kayo Yin*, Patrick Fernandes*, André F. T. Martins and Graham Neubig.
      In submission.
      PDF Code/Data

    • Including Signed Languages in Natural Language Processing (🏆 Best Theme Paper)
      Kayo Yin, Amit Moryossef, Julie Hochgesang, Yoav Goldberg and Malihe Alikhani.
      Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP). August 2021.
      PDF Video

    • Do Context-Aware Translation Models Pay the Right Attention?
      Kayo Yin, Patrick Fernandes, Danish Pruthi, Aditi Chaudhary, André F. T. Martins and Graham Neubig.
      Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP). August 2021.
      PDF Code/Data Video

    • Measuring and Increasing Context Usage in Context-Aware Machine Translation
      Patrick Fernandes, Kayo Yin, Graham Neubig and André F. T. Martins.
      Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP). August 2021.
      PDF Code/Data

    • Data Augmentation for Sign Language Gloss Translation
      Amit Moryossef*, Kayo Yin*, Graham Neubig and Yoav Goldberg.
      18th Biennial Machine Translation Summit (MT Summit), 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL). August 2021.
      PDF

    2020

    • Better Sign Language Translation with STMC-Transformer
      Kayo Yin and Jesse Read.
      Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020).
      PDF Code/Data

    • Attention is All You Sign: Sign Language Translation with Transformers
      Kayo Yin and Jesse Read.
      European Conference on Computer Vision (ECCV) Workshop on Sign Language Recognition, Translation and Production (SLRTP).
      PDF Code/Data Video

    Copyright © Kayo Yin 2021-2022
    Last updated June 22 2022