Good Morning!
NLP in Action (5 ECTS)
Summer 2021
What's it all about?
The processing and understanding of natural language is one of the most important aspects of Artificial Intelligence. This is due not only to, e.g., the simulation of natural human conversation as a cognitive process, but also to everyday text-related applications that underline the importance of AI as a supporting instrument ("AI is for humans"). Concrete examples include chatbot technology, text generation, the application of machine learning in text-intensive environments, and the retrieval of the right information by search engines. Note that ambiguity remains an open issue that even deepl.com and translate.google.com are unable to solve.
Aims of the course
The course is both theoretical and practical. It aims to discuss participants' own ideas related to Natural Language Processing/Understanding, including less mainstream aspects. To this end, each participant will be assigned a research question to explore and develop further. Examples of themes:
- How can secret information be hidden in texts?
- How can you discover themes in texts?
- How can new Shakespeare texts be produced? How to create poems?
- How to detect plagiarism?
- How can a text be automatically summarised?
- How could a resocialisation chatbot be realised?
- How to train a system to translate from one language into another?
- How can we use speech processing for Parkinson's disease?
- ...
Please see the research papers under Resources. The purpose of these papers is to initiate, motivate and support the discussion and exploration of your research question. Presenting these papers does not mean adopting their reflections 1:1. Start and find your own foundations and your own way!
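To give a first feel for one of the themes above (plagiarism detection), here is a minimal sketch that compares two texts by the overlap of their word trigrams (Jaccard similarity). It is only one simple baseline among many possible approaches, not a method prescribed by the course; the function names `shingles` and `jaccard` and the example sentences are assumptions made for illustration.

```python
import re
from typing import Set, Tuple


def shingles(text: str, n: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of word n-grams (shingles) found in a text."""
    # Lowercase and keep only simple word-like tokens (illustrative tokenizer).
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of the n-gram sets of two texts (0.0 to 1.0)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa and not sb:
        # Both texts are too short to form any n-gram: treat as no evidence.
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical example sentences, chosen only to illustrate the idea.
original = "The processing of natural language is an important aspect of artificial intelligence."
suspect = "The processing of natural language is a central aspect of artificial intelligence."
unrelated = "Thursday's seminar takes place in room 046."

print("original vs. suspect:  ", round(jaccard(original, suspect), 2))
print("original vs. unrelated:", round(jaccard(original, unrelated), 2))
```

A high score only hints at copied wording. Paraphrased or translated plagiarism would need other techniques, which is exactly the kind of question a research theme could explore.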
Course organisation
The course takes place on Thursdays, 08h30-10h00, from 15 April to 15 July 2021. 13 May is a public holiday. The course is held either on-site (Seminar Room 046, Takustraße 9) or virtually (to be clarified). We will start on 15 April with a course overview via Webex:
Entrance: https://unilu.webex.com/unilu/j.php?MTID=md1991ae193e417c63f12cad6a988dafd
Please register for the presentations by 22 April 2021 (see also my email to you). Poll: https://doodle.com/poll/hunvdvrsf87riizi?utm_source=poll&utm_medium=link
- 15 April : Course overview (2h)
- 22 April : Intro NLP (Lecture) (2h)
Deadline for registrations. Please let me know which theme you want to examine.
- 29 April : Intro NLP (Lecture) (2h)
- 06 May : Intro NLP (Lecture) (2h)
- 13 May : public holiday (Ascension Day) - no course
- 20 May : -
- 27 May : -
- 03 June : Intermediates I (2h) - on demand
- 10 June : Intermediates II (2h) - on demand
- 17 June : Intermediates III (2h) - on demand
- 24 June : Intermediates IV (2h) - on demand
- 01 July : -
- 08 July : -
- 15 July : Full Day Workshop* (8h) / on-site or virtually; Open to Public
Intermediates refer to interim meetings in which participants report on their current status. Please discuss with me if and when you would like to make use of such a meeting. We can do it individually or on a group basis.
PRELIMINARY AGENDA
No. | Name | ID | Date | Time | Chair | I-check 1 | I-check 2 | Research question |
WELCOME | | | 15-Jul | 08h10 | | | | |
A-1 | Adrian Gruszczynski | AG | 15-Jul | 08h15 | EC | 03-Jun | 17-Jun | How to anonymise sensitive text data? |
A-2 | Juri Torhoff | JT | 15-Jul | 08h35 | AG | 03-Jun | 24-Jun | Stylometry: how to identify the author of a text based on its content and writing style? / How to prevent such systems from identifying authors, to preserve privacy and anonymity? |
A-3 | Emel Comak | EC | 15-Jul | 08h55 | JT | 03-Jun | 24-Jun | How to predict transcription factor binding sites from genome sequencing? |
Discussion | | | | 20min | all | | | |
B-1 | Mara Kortenkamp | MK | 15-Jul | 09h35 | LJ | 03-Jun | 24-Jun | Linguistic Steganography / How can secret information be hidden in texts? |
B-2 | Bernadeta Chisarau | BC | 15-Jul | 09h55 | MK | 03-Jun | 17-Jun | Machine translation: how to translate from English to Romanian? |
B-3 | Lilli Joppien | LJ | 15-Jul | 10h15 | BC | 03-Jun | 17-Jun | How can emotion detection and sentiment analysis of song lyrics be used for recommending music? |
Discussion | | | | 20min | all | | | |
C-1 | Fabrizio Kuruc | FK | 15-Jul | 10h55 | CS | 10-Jun | 24-Jun | How to build a graph structure on a large-scale knowledge document database? |
C-2 | Fang Lin | FL | 15-Jul | 11h15 | FK | 10-Jun | 24-Jun | Emotion Detection in Suicide Notes |
C-3 | Carlo Schmitt | CS | 15-Jul | 11h35 | FL | 10-Jun | 24-Jun | How to detect plagiarism? |
Discussion | | | | 20min | all | | | |
D-1 | Manish Baral | MB | 15-Jul | 12h15 | AT | 10-Jun | 24-Jun | How to detect sentiment in real and fake news? |
D-2 | Andreas Timmermann | AT | 15-Jul | 12h35 | MB | 10-Jun | 24-Jun | How can new Shakespeare texts be produced? How to create poems? |
Discussion | | | | 10min | all | | | |
END | | | 15-Jul | 12h45 | | | | |
Evaluation
50% presentation + 50% written summary of the presentation. The written summary resembles an extended abstract of ca. 2000 words, in which the selected research question is discussed. Comments made during the presentation should be included; the submission deadline is therefore the following Monday, 19 July.
Requirements
Successful participation in the Natural Language Processing course is advantageous, but not a requirement. Rather, this course is intended to foster participants' own interest in Natural Language Processing.
Selected literature
- David Jurafsky, James Martin: "Speech and Language Processing". Source: see https://web.stanford.edu/~jurafsky/slp3/
- The Natural Language Toolkit (NLTK). Source: http://www.nltk.org/
- J. Allen: Natural Language Understanding (Pearson)
- C. Manning, H. Schütze: Foundations of Statistical Natural Language Processing (MIT Press)
- S. Russell, P. Norvig: Artificial Intelligence: A Modern Approach (Pearson)