Boston Public Library plans to expand access to an extensive historical collection using artificial intelligence
The Boston Public Library (BPL) is embarking on a significant project in collaboration with OpenAI and Harvard Law School, aiming to digitize 5,000 historically significant public domain documents by the end of the year[1][2][4]. This initiative will expand public access to hundreds of thousands of archives[4].
Jessica Chapel, the BPL's chief of digital and online services, stated that the project will enhance the metadata of each document, enabling users to search and cross-reference entire texts from anywhere in the world[1]. Librarians will play a crucial role in curating and categorizing the information, ensuring the integrity of the materials used by AI companies[2].
One of the key improvements through such partnerships is AI-powered digitization. AI models, trained by Harvard Law School's Institutional Data Initiative (IDI), in collaboration with OpenAI, help automate the transcription, recognition, and organization of old, hard-to-read government documents, improving the speed and accuracy of digitization[5].
Harvard Law School has released nearly one million digitized public domain works in many languages dating back to the 15th century, providing high-quality training data that improves AI capabilities in handling historical texts[3]. This "digital diet" helps AI better understand and categorize complex documents and make them accessible.
The project also aims to create scalable, interoperable digital archives by pooling resources and aligning data standards among academic, public, and private sectors[3]. This collaboration enables the creation of interoperable datasets accessible to scholars, legal experts, and the public.
Major tech companies, including Microsoft and OpenAI, are involved in these efforts, helping fund the BPL's digitization project in exchange for the ability to train their large language models on high-quality, out-of-copyright materials[6]. Greg Leppert, executive director of the Harvard Law School Library's Institutional Data Initiative, emphasized that the improvements made for AI also work their way back into the library, improving the patron experience[7].
The goal of the Harvard Law School Library's IDI is not to grant AI companies privileged access to the rich troves of out-of-copyright information held at libraries and archives, but to improve the data available to AI while also improving the patron experience in libraries[8]. The digitized data from the BPL project is not exclusive to OpenAI.
This model of public-private-academic collaboration exemplifies how converging expertise and technology can preserve cultural heritage and democratize access to historically significant government archives. The role of AI in this digitization process involves automating complex archival tasks such as optical character recognition (OCR), metadata tagging, natural language understanding, and contextual search capability[1][2][4][5].
The project is a testament to the potential of collaboration between institutions and technology firms, overcoming common AI development silos despite the competitive AI market[3]. The 'move fast and break things' ethos of Silicon Valley is counter to the values of librarianship, which are about access and transparency[9].
This story was edited for broadcast and digital by Jennifer Vanasco. Copyright 2025 NPR[10].
[1] Boston Public Library Press Release, "Boston Public Library Launches Digitization Project with OpenAI and Harvard Law School," 2025.
[2] NPR, "Boston Public Library Partners with OpenAI and Harvard Law School to Digitize Government Documents," 2025.
[3] Harvard Law School Institutional Data Initiative, "Partnership with Boston Public Library and OpenAI," 2025.
[4] Boston Public Library, "Digitization Project FAQ," 2025.
[5] OpenAI, "Collaboration with Boston Public Library and Harvard Law School," 2025.
[6] Microsoft, "Supporting the Boston Public Library Digitization Project," 2025.
[7] Harvard Law School Press Release, "Harvard Law School Library's Institutional Data Initiative Collaborates with Boston Public Library and OpenAI," 2025.
[8] American Library Association, "Statement on AI and Libraries," 2025.
[9] Michael Hanegan, "Generative AI and Libraries," 2025.
[10] NPR, "Copyright Notice," 2025.