Multilingual Tweet Intimacy Analysis is the task of predicting the intimacy of tweets written in different languages. Recognizing intimacy is a useful benchmark for testing how well computational models understand social information. To promote further computational modeling of textual intimacy, a competition was organised around the Multilingual textual intimacy dataset (MINT). The MINT training data covers tweets in six languages (English, Spanish, French, Portuguese, Italian, and Chinese), which together are used by over 3 billion people across the Americas, Europe, and Asia. A total of 12,000 tweets are annotated across these six languages. To test model generalizability in zero-shot settings, small test sets of 500 tweets each are provided for Dutch, Korean, Hindi, and Arabic.
SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis
Year: 2023