Benjamin Charles Germain Lee

Journal Publications

LIMEADE: From AI Explanations to Advice Taking
Benjamin C.G. Lee, Doug Downey, Kyle Lo & Daniel S. Weld
ACM Transactions on Interactive Intelligent Systems (TiiS)
Special Issue: Human-Centered Explainable AI (*conditional acceptance*)

The "Collections as ML Data" Checklist for Machine Learning and Cultural Heritage
Benjamin C.G. Lee
Journal of the Association for Information Science and Technology (JASIST)
Special Issue: Conceptual Models of the Sociotechnical (*conditional acceptance*)

Towards a Speculative Bibliography of Hemispheric Reconstruction Newspapers
Joshua Ortiz Baco*, Benjamin C.G. Lee*, Sarah Salter* & Jim Casey* (equal contribution)
Criticism: A Quarterly for Literature and the Arts
Special Issue: New Approaches to Critical Bibliography and the Material Text (*forthcoming*)

Grappling with the Scale of Born Digital Government Publications: Toward Pipelines for Processing and Searching Millions of PDFs
Benjamin C.G. Lee & Trevor Owens
International Journal of Digital Humanities, Volume 3, 2022
DOI, ArXiv

Compounded Mediation: A Data Archaeology of the Newspaper Navigator Dataset
Benjamin C.G. Lee
Digital Humanities Quarterly, Volume 15, Issue 4, 2021
DOI, Humanities Commons

Machine Learning and the Social Studies
Benjamin C.G. Lee, Ilene R. Berson & Michael J. Berson
Social Education, Volume 85, Issue 2, 2021

Machine Learning, Template Matching, and the International Tracing Service Archive:
Automating the Retrieval of Death Certificate Reference Cards from 40 Million Document Scans
Benjamin C.G. Lee
Digital Scholarship in the Humanities, Volume 34, Issue 3, 2019

Improved Point-source Detection in Crowded Fields Using Probabilistic Cataloging
Stephen K.N. Portillo, Benjamin C.G. Lee, Tansu Daylan & Douglas Finkbeiner
The Astronomical Journal, Volume 154, Number 4, 2017
DOI, ArXiv

Galaxy Redshifts from Discrete Optimization of Correlation Functions
Benjamin C.G. Lee, Tamás Budavári, Amitabh Basu & Mubdi Rahman
The Astronomical Journal, Volume 152, Number 6, 2016
DOI, ArXiv

Conference Publications

Navigating the Mise-en-Page: Interpretive Machine Learning Approaches to the Visual Layouts of Multi-Ethnic Periodicals
Benjamin C.G. Lee*, Joshua Ortiz Baco*, Sarah Salter* & Jim Casey* (equal contribution)
Computational Humanities Research (CHR) 2021 (accepted)
DOI, ArXiv

LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis
Zejiang Shen, Ruochen Zhang, Melissa Dell, Benjamin C.G. Lee, Jacob Carlson & Weining Li
ICDAR 2021
DOI, ArXiv, Preview Video, Presentation Video

The Newspaper Navigator Dataset: Extracting Headlines and Visual Content from 16 Million Historic Newspaper Pages in Chronicling America
Benjamin C.G. Lee, Jaime Mears, Eileen Jakeway, Meghan Ferriter, Chris Adams, Nathan Yarasavage, Deborah Thomas, Kate Zwaard & Daniel S. Weld
CIKM 2020
DOI, ArXiv, GitHub, Dataset Website
*Best Resource Paper Runner-up (92 submissions)*
*Best Digital Humanities Dataset, 2020 DH Awards*

Book Chapters

Identity, Personhood, and Material Culture:
Personal Effects Confiscated from Prisoners Upon Arrival at Dachau Concentration Camp
Gabriel Pizzorno* & Benjamin C.G. Lee* (equal contribution)
The Material Culture of Difficult Histories, Chapter 10
Cornell University Press (forthcoming)

The Digital Humanities and the Ladino Press:
Using Machine Learning to Extract and Analyze Visual Content in Historic Ladino Newspapers
Benjamin C.G. Lee
Jewish Studies in the Digital Age
Studies in Digital History and Hermeneutics Series, Chapter 10
De Gruyter Press, 2022

Computer Science Research and Digital Humanities Questions
Benjamin C.G. Lee
The Digital Futures of Graduate Study in the Humanities, Chapter 30
Debates in the Digital Humanities Series (forthcoming)

In Preparation

The European Clergy in Dachau: A Digital Humanities Research Approach to a Concentration Camp Prisoner Population
Benjamin C.G. Lee* & Andrew Kloes* (equal contribution)

User Refinement of an AI’s Explanatory Vocabulary with Interactive Machine Learning
Prithvi Tarale, Cindy Su, Benjamin C.G. Lee, Gagan Bansal & Daniel S. Weld

Commissioned Reports

A Landscape of Data Sources: Findings & Recommendations
A Report Commissioned by the Library of Congress
Benjamin C.G. Lee
In partnership with the Digital Strategy Directorate, Strategic Planning & Performance Management, and the Financial Services Directorate
February 1, 2021
Delivered to the Deputy Librarian of Congress (available as internal report only)

Workshop Publications

Line Detection in Binary Document Scans:
A Case Study with the International Tracing Service Archives
Benjamin C.G. Lee
2nd Computational Archival Science Workshop
2017 IEEE International Conference on Big Data


Newspaper Navigator: Open Faceted Search for 1.5 Million Images
Benjamin C.G. Lee & Daniel S. Weld
UIST 2020
Paper DOI, Preview Video, Short Talk Video


Newspaper Navigator: Putting Machine Learning in the Hands of Library Users
Benjamin C.G. Lee, Jaime Mears, Eileen Jakeway, Meghan Ferriter & Abigail Potter
October 16, 2020
EuropeanaTech Insight (Issue 16)

Undergraduate Thesis

Probabilistic Cataloging of the Globular Cluster Messier 2:
Improved PSF Photometry of Crowded Stellar Fields
Benjamin C.G. Lee
April 7, 2017

Public Code Repositories

Newspaper Navigator
Benjamin C.G. Lee (2020)
GitHub (199 stars)

Paired Sequence File Comparison:
Fast Validation of FASTQ Files Containing Paired-end Reads
Benjamin C.G. Lee (2012)