The Smithsonian Puts 2.8 Million Images in the Public Domain

February 28, 2020 NewsEditor archives, museums, Smithsonian, Wikipedia

A Charlie Parker alto sax. The original patent model for the Singer sewing machine. Around 75,000 specimens of bees. All of these live in the archives of the Smithsonian Institution, the sprawling cultural organization that comprises 19 museums, nine research centers, and one 163-acre zoo. As of this week, images of these artifacts, part of a trove of 2.8 million digital pictures and 3-D models, will be in the public domain for the first time under the Smithsonian's new Open Access program.

The Smithsonian is not the first organization to take the public domain plunge. More than 500 cultural heritage institutions have already done so, including heavy hitters like the Dutch Rijksmuseum and the Metropolitan Museum of Art. But the Smithsonian’s contribution to the commonweal still stands out, not only for its breadth but for its permissiveness. You can download all of these images and models for free, and use them however you want. There are no strings attached.

“This is much more than about access,” Lonnie G. Bunch III, secretary of the Smithsonian Institution, said at an Open Access launch event Tuesday night. “We are empowering our audiences, empowering them to remix, to repurpose, to reimagine all the richness we offer. We’re inviting our viewers to become collaborators.” Fittingly, Smithsonian Magazine first reported on the effort; you can search through the collection yourself right here.

The Smithsonian could have been far more restrictive; it could easily have disallowed commercial use, for instance, or derivative interpretations. Instead, it opted for a Creative Commons Zero license, which puts no limitations on what the public can do with a given work. If you want to sell a T-shirt imprinted with an 18th-century painting of George Washington from the National Portrait Gallery, happy hawking.

The Smithsonian also hopes for loftier applications. The stash it released includes not just media, but data that it hopes will fuel educational and research efforts. As part of its Open Access effort, it launched a public application programming interface and put its collection data in a GitHub repository. It’s already working with Google, for instance, to use machine learning to surface overlooked stories of women in science. And artist Amy Karle has used early access to the platform to create a series of sculptures based on Hatcher, a 66-million-year-old triceratops housed at the National Museum of Natural History. By opting for CC0, the Smithsonian places no limits on what forms that inspiration might take.

“The desire to restrict any kind of commercial use, to restrict any kind of mash-up, is very understandable,” says James Boyle, cofounder of Duke Law School’s Center for the Study of the Public Domain and one of the founding board members of Creative Commons. “The trouble is the most exciting reuses of this work may well be things we can’t imagine, that restrictions like non-commercial and non-derivative would foreclose. We don’t know what we don’t have because we don’t have it.”

The Wikimedia Foundation, which oversees Wikipedia, applauded the Smithsonian’s decision, citing in particular hopes that having so much hi-res art and mineable research data available online will help better balance representation. “Women are historically underrepresented everywhere that we go, not just on the internet but in the world,” said Wikimedia executive director Katherine Maher at Tuesday’s event. “On Wikipedia in particular, only 18 percent of biographies are of women. Representation matters.” Projects like Google’s will help fill out that roster, and the stash of high-quality images will ideally help bring those entries to life.

The commercial and artistic possibilities abound as well. “It will have a positive impact on those who develop works that depend on archival images, including myself—researchers, teachers, historians, documentary producers,” says Marina Amaral, a photo colorization and restoration specialist. “I'm already exploring the photographs and selecting a few of them to use in my next projects.”

While 2.8 million images should keep you plenty busy, they still represent just a fraction of the Smithsonian Institution’s archive of 155 million items. Over time, Smithsonian Open Access will add more tranches of hi-res imagery. The digitization process started with some of the smaller museums and institutions, and continues to work its way up to the massive collections of, for example, the National Museum of American History.

The entire archive, though, will likely never enter the public domain. That’s in part because the Smithsonian Institution doesn’t necessarily own the copyrights on everything it houses. The decision of whether to apply a CC0 license to certain items also depends on cultural and historical context.

“We have things in our collections that support stereotypes of different cultures, because that’s a product of our world, and we want to capture that so we understand it better. But we don’t want to perpetuate those stereotypes,” says Smithsonian senior digital program officer Effie Kapsalis, who spearheaded the Open Access project. The Smithsonian also works directly with indigenous groups across the US, for instance, to make sure it doesn’t inadvertently release sensitive materials. “If there’s uncertain provenance around something, or if it’s an item that is really for the eyes only of that culture, we are not going to put that latter category online.”

Then there’s the matter of what happens to the images that the Smithsonian does post. Cultural heritage institutions tend to have the same concerns about putting their treasures in the public domain, says Duke's Boyle, recalling some examples of the most common fears he hears: “You’re going to kill our gift shop. Nazis are going to take it. Someone’s going to put it in a porn.”

Those aren't unreasonable concerns to have; this is the modern internet, after all. But they also largely miss the point. “Those people can violate your copyright anyway,” adds Boyle. “The people who you want are the people who care, who would not use an image without permission and would be respectful.”

It took years for the Smithsonian to come around to that idea, Kapsalis says; fortunately, she had the research to help make the case. In 2016, Kapsalis published a series of case studies about the impact Open Access programs had on various cultural institutions like the Cleveland Museum of Art and the New York Public Library. “Fears about loss of intellectual control of collections, or reductions in the number of in-person visits, due to open access policies are largely unfounded,” she wrote at the time. “With an open access policy, revenue from rights and reproduction activities are reduced, but retaining more restrictive terms of use may cost organizations in funding opportunities, staff time, and reputation.”

It helped too that the Smithsonian has a stated goal of reaching 1 billion people annually through its digital efforts. Liberating so much of its archive makes that much more achievable; as Wikimedia’s Maher noted Tuesday, Wikipedia alone sees at least that many visitors every single month.

The technical side of bringing Smithsonian Open Access together presented its own difficulties. The museums and research centers under its umbrella all collect different kinds of data, and apply different standards to them. Getting them to talk to one another was a feat, as was finding a way to store all those high-resolution images in the cloud so as not to bring down the Smithsonian’s websites. (Amazon Web Services provides the hosting as part of its public data set program.) More important, though, was settling on an implementation that squared with the Smithsonian’s values.

“The challenging part of implementing this is not the technology,” Kapsalis notes. “It is how we do this responsibly.”

The response so far has been encouraging; Kapsalis says Smithsonian Open Access saw around 4 million image requests within the first six hours or so of going live. The Smithsonian also maintains a dashboard that keeps a running tally of the number of assets downloaded and the percent of its collections that currently have open asset items (7,774 and 23, respectively, as of this writing). But the impact of the project will likely extend even further.

If as august an institution as the Smithsonian has embraced the public domain, what excuses remain for everyone else?