Tutorials

The following six tutorials will be held at AACL-IJCNLP 2020. More details will be posted later.

[T1] Natural Language Processing in Financial Technology Applications

Hsin-Hsi Chen, Hen-Hsen Huang and Chung-Chi Chen

Financial Technology (FinTech) is an emerging and popular topic in both the financial and engineering domains, and recent advances in natural language processing (NLP) have played a major role in its progress. Workshops and shared tasks, including FNP, ECONLP, FinNLP, FiQA, and FinNum, have been introduced as venues for interdisciplinary researchers to share ideas, underscoring the importance of textual analysis for the information embedded in financial data. This tutorial introduces recent FinTech research from the perspective of NLP. We categorize the related work into three groups. The audience will gain an overview of NLP applications in FinTech and identify promising research directions.

[T2] NLP for Healthcare in the Absence of a Healthcare Dataset

Sarvnaz Karimi and Aditya Joshi

Since MYCIN in the 1970s, deploying artificial intelligence for healthcare has been an attractive proposition. With the rise of digital textual data, natural language processing (NLP) for healthcare has received increasing attention. However, due to privacy and access restrictions, official healthcare datasets such as hospital records are not always available. This cutting-edge tutorial discusses past work in NLP for healthcare under a realistic constraint: official healthcare datasets are unavailable.

[T3] Self-Supervised Deep Learning for NLP

William Yang Wang and Xin Wang

Self-supervised deep learning (SSDL) methods have recently emerged as a promising learning paradigm in computer vision. The approach cleverly formulates supervised learning problems using dense learning signals, without the need for external human annotations. Beyond vision, it is a general framework that underlies a variety of learning models, including deep reinforcement learning, as demonstrated by the success of AlphaGo Zero. In NLP, SSDL has also achieved promising results in representation learning, notably masked language models such as BERT and XLNet.
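
As a concrete illustration of the masked-language-model objective mentioned above, the following minimal sketch uses the Hugging Face transformers library (our assumption; the tutorial does not prescribe a specific toolkit) to predict a masked token from its context:

```python
# Minimal sketch of the masked-language-model objective, assuming the
# Hugging Face `transformers` library; not part of the tutorial materials.
from transformers import pipeline

# BERT was pretrained by masking tokens and predicting them from context,
# a dense self-supervised signal that requires no human labels.
unmasker = pipeline("fill-mask", model="bert-base-uncased")
for prediction in unmasker("Self-supervised learning needs no human [MASK]."):
    print(f"{prediction['token_str']:>12}  {prediction['score']:.3f}")
```

The model ranks candidate fillers for the blank purely from patterns learned during self-supervised pretraining, with no task-specific labels involved.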

In this tutorial, we provide a gentle introduction to the foundations of self-supervised deep learning, as well as some practical problem formulations and solutions in NLP. We describe recent advances in self-supervised deep learning for NLP, with a special focus on generation and language models. We provide an overview of the research area, categorize different types of self-supervised learning models, and discuss their pros and cons, aiming to offer interpretations and practical perspectives on the future of self-supervised learning for solving real-world NLP problems.

[T4] Explainability for Natural Language Processing

Shipi Dhanorkar, Christine Wolf, Kun Qian, Anbang Xu, Lucian Popa and Yunyao Li

This cutting-edge tutorial investigates the issues of transparency and interpretability as they relate to NLP. Both the research community and industry have been developing new techniques to render black-box NLP models more transparent and interpretable. Presented by an interdisciplinary team of social science, human-computer interaction (HCI), and NLP researchers, the tutorial has two components: an introduction to explainable AI (XAI) and a review of the state of the art in explainability research for NLP; and findings from a qualitative interview study of individuals working on real-world NLP projects at a large, multinational technology and consulting corporation. The first component will introduce core concepts related to explainability in NLP, discuss explainability for specific NLP tasks, and report on a systematic review of the state-of-the-art literature from AI, NLP, and HCI conferences. The second component reports on our qualitative interview study, which identifies practical challenges and concerns that arise in real-world development projects involving NLP.
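
To make the notion of a post-hoc explanation concrete, here is a minimal, hypothetical sketch using LIME, one common technique of the kind such a tutorial surveys; the toy classifier and the choice of LIME are our illustration, not the tutorial's own material:

```python
# Hypothetical sketch: explaining a text classifier's prediction with LIME.
# The toy model stands in for a black-box NLP system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

# Tiny stand-in sentiment classifier.
texts = ["great movie", "terrible plot", "loved the acting", "awful pacing"]
labels = [1, 0, 1, 0]
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# LIME perturbs the input text and fits a local linear model, attributing
# the prediction to individual words.
explainer = LimeTextExplainer(class_names=["negative", "positive"])
explanation = explainer.explain_instance(
    "great acting but terrible pacing", model.predict_proba, num_features=4)
print(explanation.as_list())  # per-word weights for the predicted class
```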

[T5] A Hitchhiker's Guide to Using Transformers for Multiple Scenarios and Languages

Said Bleik, Miguel Fierro, Hong Lu, Daisy Deng, Yijing Chen, Heather Spetalnick, Tao Wu and Sharat Chikkerur

Natural language processing (NLP) has undergone tremendous changes in recent years, with breakthroughs happening at an unprecedented pace over the past two years. However, as the performance of NLP models improves, the models and techniques become more complex, making these new developments harder for practitioners, including software engineers and data scientists, to understand and apply. This tutorial reviews the recent developments in NLP models and aims to make them accessible to practitioners.

At the center of the recent developments in NLP is the transformer architecture. The BERT (Bidirectional Encoder Representations from Transformers) paper published in late 2018 has spurred a wave of transformer-based, pretrain-then-fine-tune NLP techniques, such as XLNet and RoBERTa, which have set new state-of-the-art (SOTA) results on many NLP tasks. In this tutorial, we will take a close look at these transformer-based approaches and apply them to a few standard NLP tasks across multiple languages. We will use the open-source repository (https://github.com/microsoft/nlp) to illustrate the use of transformers across different scenarios and languages. Special attention will be paid to applying these advances to non-English (and non-left-to-right) languages.
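
To give a flavor of what "multiple languages" means in practice, the following minimal sketch (assuming the Hugging Face transformers library as a stand-in; the tutorial itself works from the repository linked above) runs one multilingual encoder over both English and Arabic input:

```python
# Minimal sketch: one pretrained multilingual transformer applied across
# languages, assuming the Hugging Face `transformers` library.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Multilingual BERT shares a single vocabulary and encoder across ~100
# languages, so the same model can be fine-tuned for any of them.
name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

# English and Arabic (a right-to-left script) in one batch.
batch = tokenizer(["This movie was great.", "هذا الفيلم رائع."],
                  padding=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits
# NOTE: the classification head is freshly initialized; the scores are
# meaningless until the model is fine-tuned on task data.
print(logits.softmax(dim=-1))
```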

[T6] Advances in Debating Technologies: Building AI That Can Debate Humans

Roy Bar-Haim, Yonatan Bilu, Liat Ein-Dor and Noam Slonim

Argumentation and debating are fundamental capabilities of human intelligence. They are essential for a wide range of everyday activities that involve reasoning, decision making, or persuasion. Debating Technologies are defined as "computational technologies developed directly to enhance, support, and engage with human debating" (Gurevych et al., 2016). A recent milestone in this field is Project Debater, the first demonstration of a live competitive debate between an AI system and a human debate champion. Project Debater, an IBM Research AI grand challenge, was developed over more than six years by a large team of NLP and ML researchers and engineers, and was demonstrated in February 2019, attracting massive media coverage. This significant research effort has resulted in nearly 40 scientific papers and many datasets.

In this tutorial, we aim to answer the question: what does it take to build a system that can debate humans? Our main focus is on the scientific problems that such a system must tackle. These intriguing problems include argument retrieval for a given debate topic, argument quality assessment and stance classification, identifying relevant principled arguments to be used in conjunction with corpus-mined arguments, organizing the arguments into a compelling narrative, and recognizing the arguments made by the human opponent in order to make a rebuttal. For each of these problems, we will present relevant scientific work from various research groups as well as our own.
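
As one concrete illustration, stance classification, one of the subproblems listed above, can be sketched with zero-shot NLI from the Hugging Face transformers library; this pairing is our illustrative stand-in, not Project Debater's actual method:

```python
# Hypothetical sketch: zero-shot stance classification of an argument
# toward a debate topic, via an NLI model from `transformers`.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

topic = "We should ban fossil fuels"
argument = "Renewable sources cannot yet meet baseline demand reliably."

# The NLI model scores hypotheses such as "This argument supports ...".
result = classifier(
    argument,
    candidate_labels=["supports the topic", "opposes the topic"],
    hypothesis_template="This argument {} '" + topic + "'.")
print(result["labels"][0], round(result["scores"][0], 3))
```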

A complementary goal of the tutorial is to provide a holistic view of a debating system. Such a view is largely missing in the academic literature, where each paper typically addresses a specific problem in isolation. We will present the complete pipeline of a debating system and discuss the information flow and interaction between the various components. We will also share our experience and lessons learned from developing such a complex, large-scale NLP system. Finally, the tutorial will discuss practical applications and future challenges of debating technologies.
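
To make the notion of a pipeline and its information flow concrete, here is a purely hypothetical skeleton; every name and interface below is our invention for illustration and does not describe Project Debater's actual components:

```python
# Purely hypothetical skeleton of a debating-system pipeline; component
# names and interfaces are invented for illustration only.
from dataclasses import dataclass

@dataclass
class Argument:
    text: str
    stance: str      # "pro" or "con" the debate topic
    quality: float   # higher means more persuasive

def retrieve_arguments(topic: str, corpus: list[str]) -> list[Argument]:
    """Hypothetical: argument retrieval plus stance/quality scoring."""
    return [Argument(text=t, stance="pro", quality=0.5) for t in corpus]

def build_narrative(args: list[Argument], k: int = 3) -> str:
    """Hypothetical: organize the top-k arguments into an opening speech."""
    best = sorted(args, key=lambda a: a.quality, reverse=True)[:k]
    return " ".join(a.text for a in best)

def rebut(opponent_speech: str, args: list[Argument]) -> str:
    """Hypothetical: match opponent claims against counter-arguments."""
    cons = [a.text for a in args if a.stance == "con"]
    return cons[0] if cons else "No rebuttal found."

# Information flow: topic -> retrieval -> narrative -> listen -> rebuttal.
```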