Datathon and Machine Learning Competition 2025

ANNOUNCING THE WINNERS OF THE 2025 DATATHON & MACHINE LEARNING COMPETITION:

The ISCA Datathon & Machine Learning Competition on Antisemitism 2025, hosted by the Institute for the Study of Contemporary Antisemitism (ISCA), Indiana University, brought together high school and undergraduate students to explore the technical and conceptual challenges of detecting antisemitic hate speech online.

The Datathon was coordinated by Rachel Kelly (Project Manager), who managed all correspondence with our partners and social media communications. Daniel Miehling (Computational Research Coordinator) created and redesigned the challenges and evaluation framework based on iterations of previous Datathons. He also led the second workshop on practical methods, including scraping, annotation, coding foundations, and evaluation. Günther Jikeli (ISCA Associate Director) provided the conceptual framing in the opening session, and Damir Cavar introduced automated text analysis in Workshop 3.

We gratefully acknowledge our partners and sponsors — The Bright Initiative by Bright Data, Indiana University, World Jewish Congress, TECHRI – Technology and Human Rights Institute, Jewish Federation of Greater Indianapolis, and Diane M. Druck — all of whom made this competition possible.

Challenge Overview

Challenge #1: Dataset Creation & Annotation (July 13–20, 2025)

Participants used Bright Data to scrape posts from X (formerly Twitter), defined sampling strategies, and annotated their datasets using the ISCA Annotation Portal. Annotation followed the IHRA Working Definition of Antisemitism (IHRA-WDA), with teams allowed to adapt alternative frameworks if they provided clear justifications. Reports included dataset documentation, label definitions, and reflections on ambiguity. Teams could earn bonus points by reporting inter-annotator agreement (IAA).
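Inter-annotator agreement between two annotators is commonly reported as Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The following is a minimal sketch in plain Python, not the competition's own scoring code. The function name `cohens_kappa`, the binary label scheme, and the ten example annotations are illustrative assumptions.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations: 1 = antisemitic, 0 = not antisemitic.
annotator_1 = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
annotator_2 = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this made-up example the annotators agree on 8 of 10 posts, but because chance agreement is already 0.52, kappa comes out near 0.58. Under the widely used Landis and Koch scale, values between 0.41 and 0.60 are read as moderate agreement.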

Challenge #2: Modeling & Evaluation (July 20–27, 2025)

Using ISCA’s gold-standard antisemitism datasets^[1]^[2], teams fine-tuned transformer models (e.g., RoBERTa, DeBERTa, HateBERT, BERTweet). Submissions included performance metrics, confusion matrices, error analyses, and reproducible code (in Google Colab). Bonus points were awarded for testing on newly collected and annotated unseen data.
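To illustrate why submissions reported confusion matrices and Macro F1 rather than accuracy alone, here is a small sketch in plain Python. It is not the evaluation code used in the competition. The helper names `confusion_counts` and `macro_f1` and the ten example predictions are hypothetical, and the class balance is invented for illustration.

```python
def confusion_counts(y_true, y_pred, labels):
    """Return {(true_label, predicted_label): count} for all label pairs."""
    counts = {(t, p): 0 for t in labels for p in labels}
    for t, p in zip(y_true, y_pred):
        counts[(t, p)] += 1
    return counts


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    counts = confusion_counts(y_true, y_pred, labels)
    f1_scores = []
    for c in labels:
        tp = counts[(c, c)]
        fp = sum(counts[(t, c)] for t in labels if t != c)
        fn = sum(counts[(c, p)] for p in labels if p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical imbalanced test set: 8 non-antisemitic (0), 2 antisemitic (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred, labels=[0, 1])
print(f"accuracy={accuracy:.2f}  macro_f1={score:.2f}")
```

Here a model that misses one of the two positive posts still reaches 90% accuracy, while Macro F1 drops to about 0.80 because each class counts equally. This is one reason accuracy and Macro F1 figures from different teams are not directly comparable.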

Rank #1: Team 6 – MagenCode (Oriel Atias, Giulio Zuckermann, Eliana Woolf, and Asher Rosenfeld). The team retained all four members, created a 355-tweet dataset aligned with IHRA-WDA, reported Cohen’s Kappa = 0.54 (moderate agreement), and fine-tuned RoBERTa to achieve Macro F1 ≈ 0.899. Strong methodology and full documentation secured their top placement.

Rank #2: Team 3 – Bias Busters (Dvir Sacho-Tanzer, Jacob Neuer, and Jennifer Cronkright). The team collected a diverse dataset, adapted IHRA-WDA with justification, and documented its methodology transparently. Their RoBERTa-hate model achieved Macro F1 ≈ 0.617. Despite a reduced team size, they completed both challenges comprehensively.

Rank #3: Team 2 – Code4Clarity (Syed Afnan Adit, Mark Vinokur, and Saisha Siram). The team applied IHRA-WDA directly, reported its IAA transparently even though agreement was low, and built a well-structured dataset. Their RoBERTa-offensive model achieved Accuracy ≈ 87%, supported by clear error analysis and additional testing on unseen data.

Honorable Mention

Team 4 (Dena Shink and Daniel Maccholl) delivered high-quality results comparable to those of larger teams despite working with only two members throughout the Datathon.

View Webinar Here

Antisemitism has long been prevalent on social media, but it surged dramatically following the October 7 Hamas attack in Israel and the ongoing war in Gaza.

The 2025 Datathon & Machine Learning Competition offers high school and college students worldwide a unique opportunity to engage in cutting-edge research at the Institute for the Study of Contemporary Antisemitism (ISCA). Over three weeks, participants will attend a series of interactive workshops exploring:

Understanding Antisemitism: How is it defined? How has it manifested online before and after October 7?

Machine Learning & Online Hate: How can AI help identify antisemitism on social media? What are its limitations?

Intelligent Data Scraping & Sampling: How can Bright Initiative’s tool be leveraged to collect and analyze social media data effectively?

The Role of Manual Annotation: Why is human review essential for training automated detection models? How do you accurately annotate tweets?

Building Detection Models: What computing techniques and machine learning algorithms can be used to predict online content? How do you develop these models?

After completing the workshops, teams will use the Bright Initiative’s tool to create data samples and manually annotate a dataset of tweets related to Jewish life and antisemitism. They will then be provided with larger annotated datasets to train their own machine learning models for automatically identifying antisemitic content. Teams will be evaluated based on the innovation of their data sampling methods, the accuracy of their manual annotations, and the performance of their trained models.

For more details, check out last year’s competition.

How to Apply

You can apply as a team of 4-5 people or as an individual, and we’ll match you with a team. Most applicants apply individually.

Prizes

The top three teams will receive:

1st place: $1,000 gift certificate*

2nd place: $600 gift certificate*

3rd place: $400 gift certificate*

*Management and distribution of all prizes are generously provided by the World Jewish Congress. Prizes will be split equally amongst teammates.

Eligibility & Requirements

Free to enter – no participation fee
No prior experience required (basic programming knowledge is helpful)
Open to all high school and undergraduate students worldwide
Must have access to a computer & internet (all workshops are conducted via Zoom)
Participants under the age of 18 must have parent or guardian permission – find the permission slip here

Workshop Schedule

📅 July 13, 2025 @ 12 PM ET – What is Online Antisemitism Before and After October 7?
📅 July 20, 2025 @ 12 PM ET – Data Scraping with Bright Data and Manual Annotation for Automated Content Detection: Processes and Challenges
📅 July 27, 2025 @ 12 PM ET – Basics of Automated Content Detection

Don't miss this chance to develop real-world skills in data science, machine learning, and social media analysis—all while contributing to research on a critical global issue.

The deadline for submitting your application is June 30, 2025!

Apply Now!

Contact

Project Manager - Rachel Kelly (rk18@iu.edu)

Computational Research Coordinator - Dr. Daniel Miehling (damieh@iu.edu)

ISCA Associate Director - Dr. Günther Jikeli (gjikeli@iu.edu)
