Proseminar: Information Retrieval S20

Description

   INFORMATION RETRIEVAL  

(Pro)Seminar, Summer Term 2020

Christoph Schommer, Last Update: 1 September 2020

Preliminary Discussion (Vorbesprechung): 1 September 2020
An Online Meeting invitation has been sent to you by 18 August 2020:

WEBEX

Where: https://unilu.webex.com/unilu/j.php?MTID=m43e46106464f5b1c008f00df14a8edf6

Meeting number (access code): 163 924 4216
Meeting password: IR1418PrelDisc

 

ORGANISATION

The course is a (pro)seminar with a main focus on Information Retrieval. We will have a preliminary discussion (German: Vorbesprechung) on Tuesday, 1 September 2020, at 10h00 c.t. An invitation to the Webex Online Meeting has been sent to you and can be found above.
 

The course takes place from 14 - 18 September 2020 as follows (see the final presentation schedule below under "PAPERS"):

Meeting Link (Monday - Friday):
https://unilu.webex.com/unilu/j.php?MTID=m6cfe93599ac5d1909719991a0a1311e0
Meeting number: 163 902 0702
Password: 3yJ739gb3rm
Host key: 886605
  • Monday: Lecture from 09h15 - 10h45
    • What is Information Retrieval?
    • Boolean Retrieval
    • Posting lists and Inverted Index Construction
  • Tuesday: Lecture from 09h15 - 10h45; Talks from 11h00 - 15h00
    • Natural Language Processing: Tokenization, Lemmatization, Stemming; Porter Stemmer.
    • Word Repair: n-grams and Jaccard, Soundex.
  • Wednesday: Lecture from 09h15 - 10h45; Talks from 11h00 - 15h00
    • Word Repair: Levenshtein (Edit) Distance
    • Wildcards and data structures (B-tree; reverse tree).
    • Ranking: tf and idf, document frequency, calculation of a score.
    • Vector Space model: representing documents and queries as points in the space (-> vectors).
    • Vector Space model: use of the angle/cosine to find out the similarity/distance between 2 vectors; dot product.
  • Thursday: Lecture from 09h15 - 10h45; Talks from 11h00 - 15h00
    • PageRank idea
    • Evaluation with Precision and Recall; F-measure; van Rijsbergen's alpha.
    • kappa-model
  • Friday: Lecture from 09h15 - 10h45; Talks from 11h00 - 15h00
    • Query expansion; user feedback.
    • In brief: Apriori and Association Discovery for Query Expansion.
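
As a rough illustration of the Monday topics (Boolean Retrieval, posting lists, and the inverted index), the following is a minimal sketch and not course material. The function names, the whitespace tokenizer, and the four toy documents are assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted posting list of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lower-casing and whitespace splitting stand in for real tokenization.
        for token in text.lower().split():
            index[token].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(p1, p2):
    """Merge two sorted posting lists and keep the common document IDs (Boolean AND)."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def boolean_and(index, *terms):
    """Answer a conjunctive query, starting with the shortest posting list."""
    postings = sorted((index.get(t.lower(), []) for t in terms), key=len)
    if not postings:
        return []
    result = postings[0]
    for posting in postings[1:]:
        result = intersect(result, posting)
    return result


# Toy collection (hypothetical data for illustration only).
docs = {
    1: "new home sales top forecasts",
    2: "home sales rise in july",
    3: "increase in home sales in july",
    4: "july new home sales rise",
}
index = build_inverted_index(docs)
print(boolean_and(index, "home", "july"))
```

Processing the shortest posting list first keeps the intermediate result small, which is the usual motivation for ordering the terms of a conjunctive query by document frequency.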

Please note that the course and the presentations will take place in English.

CONTENT

In the lecture part, we discuss selected aspects of search engines, the role of Natural Language Processing, and typical topics such as Ranking, Evaluation, the role of Feedback, Query Extensions, Quality aspects, and more.
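
To make the Ranking topic more concrete, here is a minimal sketch of tf-idf weighting with cosine similarity in the vector space model. It is only an illustration: the log-scaled tf variant, the base-10 idf, the whitespace tokenizer, and the three toy documents are assumptions, and other weighting schemes are equally common.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse tf-idf vector (dict) per document, plus the idf table."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(t for tokens in tokenized for t in set(tokens))
    idf = {t: math.log10(n / df_t) for t, df_t in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: log-scaled term frequency, (1 + log10 tf) * idf.
        vectors.append({t: (1 + math.log10(c)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zero."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return document indices ordered by decreasing cosine similarity to the query."""
    vectors, idf = tf_idf_vectors(docs)
    q_tf = Counter(query.lower().split())
    # Query terms that never occur in the collection are ignored.
    q_vec = {t: (1 + math.log10(c)) * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [cosine(q_vec, v) for v in vectors]
    return sorted(range(len(docs)), key=lambda i: -scores[i])


# Toy collection (hypothetical data for illustration only).
docs = [
    "boolean retrieval with posting lists",
    "ranking documents with tf idf weights",
    "cosine similarity in the vector space model",
]
print(rank("tf idf ranking", docs))
```

Because cosine similarity normalizes by vector length, a long document is not favored merely for containing more words.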

With the talks T1-T16, each candidate contributes once to the course by giving a talk (Paper Review). For this purpose, each candidate has to deliver the following for a selected paper of his/her choice:

  • A talk of up to 30 minutes (where the paper review is presented) plus Q&A.
  • A written version of the paper review of 1000 - 1200 words (in PDF format).

Each paper review should discuss the following points:

  • Give a short summary of the paper. What is the paper about and what is the main contribution of the paper?
    (As mentioned in the Vorbesprechung, you may use the abstract of the paper, but you should write in your own words and extend this part with additional aspects, for example the structure of the paper. Important: this part has to be objective, without any subjective comments!)
  • Discuss the clarity of writing, i.e., the style, the presentation of figures and tables, the usage of acronyms (fluent reading guaranteed?), spelling errors, etc. Do you believe that the presented topic has been understood?
  • Is the paper technically sound? What is the scientific novelty? Have tests and experiments been carried out, and are they convincing? Is the data sufficiently explained? Are the results sufficiently discussed? Is related work sufficiently cited? Do you see any plagiarism?
  • Do you see a relevance of the work to other fields (cross-usage)?
  • Is there a critical reflection of the content given by the authors (e.g., a SWOT)? Are there convincing suggestions and explanations regarding future work?
  • Which audience do you see as appropriate (industry and/or academia; practitioners or theorists; only Computer Science or multi-disciplinary work)? Why?
  • What are the best and the weakest points in the paper? State one further positive and one further negative point.
  • How do you evaluate the speaker's presentation (see the link to the speaker's presentation behind each paper)?
  • Please include your scores:
    • Overall Rating for the paper: 5 (Full Accept), 4 (Weak Accept), 3 (Borderline), 2 (Weak Reject), 1 (Reject).
    • What is your confidence in *your* rating: 5 (I am an expert), 4 (I am very convinced), 3 (I am confident), 2 (I am not sure), 1 (I am not confident at all).
    • Also: do you recommend the paper as a poster, as a short presentation, or as a full presentation?
    • Do you recommend the paper for a Best Paper Award? Please explain your answer!

Submission Deadline: Friday, 25 September 2020, 11h00 CEST (confirmed)

 

 

EVALUATION

  • 40% Presentation of your paper review.
  • 50% Executive summary of your paper review.
  • 10% Presence during the course.

The main reference is the book by Manning, C., Schütze, H.: Introduction to Information Retrieval, Cambridge University Press. See https://nlp.stanford.edu/IR-book/information-retrieval-book.html

PAPERS

The following papers were presented at the 42nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in Paris, France. ACM SIGIR is the premier scientific conference in the broad area of Information Retrieval. URL: https://sigir.org/sigir2019/ Each paper has 10 pages. Please note that a video is available for each paper (a recording of the speaker's presentation): http://www.sigir.org/sigir2019/program/schedule/

TUESDAY, 15 September

  • 11h00 - 11h45 Glenn Schneider: #A Hate Speech Detection is Not as Easy as You May Think
  • 11h45 - 12h30 Evghenii Orenciuc: #D Relational Collaborative Filtering - Modeling Multiple Item Relations for Recommendation
  • 12h30 - 13h15 Juri Torhoff: #H ENT Rank - Retrieving Entities for Topical Information Needs through Entity-Neighbor-Text Relations

WEDNESDAY, 16 September

  • 11h00 - 11h45 Victoriya Kralewa: #Q An Efficient Adaptive Transfer Neural Network for Social-aware Recommendation
  • 11h45 - 12h30 Nils Thiele: #G Asking Clarifying Questions in Open-Domain Information-Seeking Conversation
  • 12h30 - 13h15 Tim Kluge: #K Teach Machine How to Read: Reading Behavior Inspired Relevance

THURSDAY, 17 September

  • 11h00 - 11h45 Samuel Enderwitz: #C Adaptive Multi-Attention Network Incorporating Answer Information for Duplicate Question Detection
  • 11h45 - 12h30 Jonas Schäfer: #L Context-Aware Intent Identification in Email Conversations
  • 12h30 - 13h15 Fritz Cremer: #J Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos

FRIDAY, 18 September

  • 11h00 - 11h45 Shiho Onitsuka: #O Health Cards for Consumer Health Search 
  • 11h45 - 12h30 -
  • 12h30 - 13h15  -

_____________________________________________________________________________________________________________________

NOT ASSIGNED (as of 1 September; Deadline of interest: Thursday, 10 September):

  • #B Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs
  • #E Neural Graph Collaborative Filtering
  • #F Context Attentive Document Ranking and Query Suggestion
  • #I Transparent, Scrutable and Explainable User Models for Personalized Recommendation
  • #M DivGraphPointer - A Graph Pointer Network for Extracting Diverse Keyphrases
  • #N Personalized Fashion Recommendation with Visual Explanations based on Multimodal Attention Network - Towards Visually Explainable Recommendation
  • #P Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction
  • #R Online User Representation Learning Across Heterogeneous Social Networks

Contact

By Email: christoph.schommer@fu-berlin.de (or christoph.schommer@uni.lu)

 

Basic Course Info

Course No Course Type Hours
19319510 Proseminar 2

Time Span 01.09.2020 - 18.09.2020
Instructors
Christoph Schommer

Study Regulation

0086c_k150 2014, BSc Informatik (Mono), 150 LPs
0086d_k135 2014, BSc Informatik (Mono), 135 LPs
0087b_k90 2009, BSc Informatik (Kombi), 90 LPs
0088b_m60 2006, BSc Informatik (Kombi), 60 LPs
0132b_m30 2006, BSc Informatik (Kombi), 30 LPs
0207b_m37 2015, MSc Informatik (Lehramt), 37 LPs
0208b_m42 2015, MSc Informatik (Lehramt), 42 LPs
0458a_m37 2015, MSc Informatik (Lehramt), 37 LPs
0471a_m42 2015, MSc Informatik (Lehramt), 42 LPs
0496a_MA120 2016, MSc Computational Science (Mono), 120 LPs
0556a_m37 2018, M-Ed Fach 1 Informatik (Lehramt an Integrierten Sekundarschulen und Gymnasien), 37 LPs
0557a_m42 2018, M-Ed Fach 2 Informatik (Lehramt an Integrierten Sekundarschulen und Gymnasien), 42 LPs


Main Events

Day Time Location Details
Daily  9-13 T9/049 Seminarraum 2020-09-14 - 2020-09-18
Daily 14-16 T9/049 Seminarraum 2020-09-15 - 2020-09-17
