Earlier this month CERN (yes, that CERN) announced version 2.0 of their open hardware licenses (announcement and additional context from them). Version 2.0 of the license comes in three flavors of permissiveness and marks a major step forward in open source hardware (OSHW) licensing. It is the result of seven (!) years of work by a team led by Myriam Ayass, Andrew Katz, and Javier Serrano. Before getting to what these licenses are doing, this post will provide some background on why open source hardware licensing is so complicated in the first place.

While the world of open source software licensing is full of passionate disputes, everyone more or less agrees on one basic point: software is fully protected by copyright. Software is ‘born closed’ because the moment it is written it is automatically protected by copyright. If the creator of that software wants to share it with others, she needs to affirmatively give others permission to build on it. In doing so she can be confident that her license covers 100% of the software.

At least at an abstract level, that makes open source software licenses fairly binary: either there is no license or there is a license that covers everything.

Things are not as clean in open source hardware. Hardware includes software (sometimes). It also includes actual hardware, along with documentation that is distinct from both. Hardware’s software is protected by copyright. The hardware itself could be protected by an idiosyncratic mix of rights (more on that in a second) that include copyright, patent, and even trademark. The result is that, at a minimum, an OSHW license needs to account for the fact that there may be many moving intellectual property pieces connected to a single piece of hardware - a fairly stark contrast to open source software’s ‘everything covered by copyright’ situation.

OSHW Licenses are Hard 2: Coverage is Hard to Generalize

The (at least superficially) straightforward relationship between software and copyright makes it easy to give generalized advice about licensing and to develop licenses that are useful in a broad range of situations. A lawyer can be fairly confident that the advice “you need a copyright license” is correct for any software package even without having to look at the software itself. That, in turn, means it is safe for non-lawyers to adopt “I need a copyright license for my software” as a rule of thumb, confident that it will be correct in the vast majority of cases. It also means that software developers can be confident that the obligations they impose on downstream users - like an obligation to share any contributions to the software - are legally enforceable.

As suggested above, hardware can be much more idiosyncratic. The physical elements of hardware might be protected by copyright - in whole or in part - or they might not. That means that the hardware might be born closed like software, or it might be born open, free of automatic copyright protection, and available for sharing without the need for a license. The flip side of this ambiguity is that a creator may be able to enforce obligations on future users (such as the classic copyleft sharing obligations) for some hardware, but not for other hardware. Expectations misalignment with regards to these kinds of obligations can create problems for creators and users alike.

All of this means that it can be hard to create a reliable software-style licensing rule of thumb for OSHW creators. Many OSHW creators end up following the practices of projects that went before them and hoping for the best. In fact, this ‘follow others’ model is the premise for the educational guidance that the Open Source Hardware Association (OSHWA) makes available.

OSHWA’s Approach

One of the many questions all of this sets up is whether to take a bundled or a broken-out approach to licensing. Is it better to try and create an omni-license that covers the IP related to software, hardware, and documentation for OSHW, or to suggest users pick three licenses - one for software, one for hardware, and one for documentation? A creator could make very different choices about sharing the three elements, so the omni approach could get complicated fast. At the same time, having three distinct licenses is a lot more complicated than just having one.

OSHWA ultimately decided to go with the three license approach in its certification program. This was driven in part by the realization that there were already mature licenses for software (OSI-approved open source software licenses) and documentation (Creative Commons licenses). That allowed OSHWA to take a “don’t do anything new if you can avoid it” approach to licensing education. It also required OSHWA to recommend licenses for hardware.

Existing OSHW Licenses

While many open source hardware creators use software (such as the GPL) or documentation (Creative Commons) licenses for hardware, neither of those types of licenses was really written with hardware in mind. Fortunately, there were three existing hardware licenses. OSHWA provided a quick comparison of the three licenses: CERN 1.2, Solderpad, and TAPR. Although all of these licenses were good first steps, they were all developed fairly early in the history of open source hardware. Solderpad and TAPR in particular were essentially designed to add hardware wrappers to existing open source software licenses.

CERN 2.0

CERN’s 2.0 licenses have been informed by all of the developments and thinking around open source hardware and licensing in the seven years between the release of 1.2 and today. In recognition that creators may be interested in multiple types of openness and obligations on downstream users, they come in three flavors: the strongly reciprocal S variant, the weakly reciprocal W variant, and the permissive P variant. While this structure makes it hard to mix reciprocities (by, for example, requiring strong reciprocity on documentation and weak reciprocity on the hardware itself), it provides a clear way for hardware creators to license the hardware portion of their projects. This is a deeply reasonable approach.

CERN’s ‘Available Components’

One evergreen question for open source hardware is ‘open down to what?’ Your design may be open, but does that mean that all of your components have to be open as well? Did you have to use open source software to create the design? Running on an open source operating system? Running on open source silicon?

OSHWA’s certification program addressed this question with the concept of the ‘creator contribution.’ The idea is that the creator must make available and openly license everything within her power to make available and open. Generally those will be her designs, code, and documentation. It is fine to include components sourced from third parties (even non-open components) as long as they are generally available without requiring an NDA to access.

CERN’s ‘available component’ definition achieves much the same goal. As long as a component is generally and readily available, and described with enough information to understand its interfaces, it does not itself have to be open. Of course, the contours of both the creator contribution and the available component may vary from hardware to hardware. Hopefully time and experience will help give us all a better sense of how to draw the lines.

Let’s See How it Goes

This post has mostly focused on the CERN license’s role in helping make ‘born closed’ components more open through licensing. There is a flip side to all of this: what happens when a license is applied to a ‘born open’ piece of hardware? That can give both users and creators a distorted sense of their obligations when using a piece of hardware. However, that is probably a problem for public education, not license design.

This is an exciting time for open source hardware. CERN’s new license is a big step forward in licensing. As it is adopted and used we will learn what works, what doesn’t, and what tweaks might be helpful. The best way to do that is to use it yourself and see how it fits.

How We Made the Open Hardware Summit All Virtual in Less Than a Week

Side by side virtual fashion comparison

Virtual conferences call for virtual fashion.

This post originally appeared on the OSHWA blog and Engelberg Center blog in slightly different but substantially identical versions.

First, thank you again to everyone - speakers, participants, and sponsors - for a fantastic 10th anniversary Open Hardware Summit. We knew the 10th anniversary Summit would be one for the ages, although we didn’t quite expect it to be because it became the first virtual Summit.

Thanks to its timing, the 10th anniversary Summit ended up being many people’s first virtual summit of the Covid-19 era (that includes the organizers). Unfortunately it looks like it is unlikely to be the last. In the hopes of helping event organizers struggling with the same challenges, this blog post outlines the decisions we made and the steps we took to make it happen.

Quick Context

The Open Hardware Summit is an annual gathering of the open source hardware community held by the Open Source Hardware Association (OSHWA). This year the Engelberg Center partnered with OSHWA to host the event in New York City. The event usually brings together hundreds of community members and speakers from around the world. It was scheduled for March 13, 2020.

While the situation has been evolving for some time, as recently as March 5th (8 days before the Summit) we thought that holding a reduced in-person version of the event was the right decision. By March 8 (5 days before the Summit) that was no longer tenable and we announced that the Summit was going all virtual. That was the right decision, but what does going all virtual mean?

Priorities

We had two major priorities for the virtual Summit:

  1. Online streaming video of all of the speakers and panels.
  2. A community space for discussions and coming together.

Video

The live stream of the Summit had to be both accessible to our viewers and easy to join for our speakers and panelists. After considering some options and consulting with experts in our community (huge thank you to Phil Torrone at Adafruit for the guidance), we concluded that a combination of YouTube and StreamYard would be the best option.

YouTube worked for our community because it is easily accessible on a wide range of platforms in most of the world. That meant that just about everyone would be able to see the Summit from wherever they were.

StreamYard made it easy to manage the backend. Speakers could join a virtual green room before their talk, and our technical testing the day before the Summit made it clear that it was easy for them to share their slide presentations as well. One member of the Summit team could easily bring people (and their screens) into and out of the live feed, along with stills and slides for introductions, sponsors, and everything else.

Community Space

We also looked at a number of options for online discussions. We decided that a Discord server would be the best option for the open source hardware community. Discord allowed us to open the space to anyone who wanted to join, while at the same time giving us moderation control over the discussion. Many community members were already comfortable with Discord, which was also a bonus.

We also decided to use Discord for a version of Q&A with the speakers. One option would have been to try and integrate video questions from the audience into the live stream. That would have been technically possible with StreamYard (probably…), but it seemed like an unnecessary logistical complication for the organizers. As an alternative we decided to set up separate Discord channels for each of the speakers. That allowed the speakers to end their talk and move to their Discord channel for further discussions.

One unexpected and welcome development was that the Discord server grew into a larger community hub, with channels devoted to solutions to Covid-19, community announcements, and even hacking the conference badge. We may decide to maintain the server well beyond the Summit as a community space.

It Mostly Worked

We scheduled brief runthroughs with all of the speakers the day before the Summit. Everyone had a chance to get comfortable with the process and work out any last minute problems. On the day of the Summit we embedded the livestream in the Summit site, along with a link to the discord server for discussion. There were a few audio glitches where speakers had to briefly drop out, but all things considered it went pretty smoothly.

Once the Summit was over the entire livestream of the Summit was posted automatically to OSHWA’s YouTube channel. Within a day or two we had broken out all of the individual talks into a video playlist and pulled the audio from our panel discussion into a stand alone podcast episode.

To the extent that things worked, one of the big reasons was the nature of the OSHWA community. Besides being generally great and supportive (no small thing), the open source hardware community already sees itself as a community and is already comfortable with connecting via online tools. That made it easy for them to enthusiastically watch the live stream and jump into the online discussion. Not all types of events have this starting point, which may suggest that they are not great candidates for this type of virtual structure.

If you are reading this because you are working on your own virtual event, good luck! We are happy to answer questions if you have them. Email us at info@oshwa.org. StreamYard also has a referral program, so if you drop us a line at info@oshwa.org we can give you a $10 credit if you want it.

The Smithsonian Goes Open Access

It doesn’t get a lot bigger than this. On February 25th the Smithsonian went in big on open access. With the push of a button, 2.8 million 2D images and 3D files (3D files!) became available without copyright restriction under a CC0 public domain dedication. Perhaps just as importantly, those images came with 173 years of metadata created by the Smithsonian staff. How big a deal is this? The site saw 4 million image requests within the first six hours of going live. People want access to their cultural heritage.

While this is all very exciting, I wanted to take a moment to dive a bit deeper into what I see in the licensing portion of this announcement. While there are many important parts of this announcement - like the API to actually access it, and fully downloadable data that is already being turned into interesting visualizations - the licensing decisions are worth considering as well. The Smithsonian has helped to set a new standard for how open access can work at big institutions, although there are still a few things that could use some improvement.

I also want to reflect on how this moment is the result of many years of effort and advocacy by a wide range of people. Some relevant moments in that process are Carl Malamud’s 2007 “What Would Luther Burbank Do?” effort (original and archive) (one rule of thumb about big moments in openness is that Carl was usually there years earlier laying the groundwork), Michael Edson’s work on the Smithsonian Commons (the best links I have are here and here, although I’m happy to update if anyone has something better), the Cooper Hewitt’s decision to release its metadata under CC0 (followed by the 3D scan of the entire building and their font (that I used quite recently) to boot), and the Smithsonian’s own study on the impact of open access on galleries, libraries, museums, and archives (not surprisingly, written by Effie Kapsalis, who would go on to spearhead this open access move by the Smithsonian). The Smithsonian’s decision to start making 3D models of its collection available online (led by Vince Rossi) also helped lay the groundwork for the inclusion of 3D in this release. While these efforts are worth mentioning for many reasons, one is as a reminder that advocacy takes a long time and is made up of many smaller steps. Big things don’t just happen.

Make it Easy for Good Actors to be Good

Some people will see an announcement like this and immediately think of all of the bad things that could be done with these objects. While I do not dispute that bad things are possible, letting (the relatively small number of) bad actors guide thinking about open access policies does a disservice to (the relatively large number of) good actors. Copyright restrictions or terms of service are unlikely to stop bad actors from doing bad things with cultural artifacts. However, they create significant barriers to good actors doing good things with them. Access regimes should be designed to empower good actors, not to try and slow down every possible fringe bad actor. That seems to be largely how the Smithsonian approached this effort.

CC0 By Default

CC0 is a public domain dedication that clarifies that the Smithsonian is not making any claim of ownership over the digital files it is releasing. The cultural objects included in this release are all in the public domain, so the use of CC0 is not intended to address copyrights attaching to the objects themselves. Instead, CC0 is a way for the Smithsonian to indicate that it does not have any additional right in the digital file as distinct from the object it represents.

This is important in both the 2D and 3D context. In the US it is fairly clear that a digital copy of a 2D work does not get its own copyright. That is also true for 3D scans in the US. The EU is taking steps in that direction as well. The legal status of 2D images of 3D objects is a bit more ambiguous, as is that of 3D models (created in CAD instead of by scanning the object) of cultural artifacts. There is also a lingering possibility that some jurisdictions could take the law in a completely different direction.

While the weight of legal and logical authority suggests to me that the vast majority of digitizations of public domain objects do not get their own copyright protection, CC0 waives away that ambiguity and comes down clearly on the side of openness. In addition to being right on the law, I believe that this decision is right on the theory. Creating an accurate reproduction of a work in the public domain should not give you a right over it.

In the 3D context I really appreciate that the Smithsonian is applying CC0 to scans and reproductions. See, for example, the scan of the Apollo 11 Hatch:

That file is pretty unambiguously in the public domain and released under CC0. The copyright status of the CAD model of the same hatch is slightly less clear. Nonetheless, the Smithsonian decided to clear up any ambiguity by using CC0.

Not Everything is Open

2.8 million files is a lot of files, but it is far from everything in the Smithsonian’s collection. As this slide from the Smithsonian’s 3D Digitization Team makes clear, there are still many objects left to digitize:

Slide showing that 1 million objects are currently on display while 154 million objects are hidden

Some objects have not been digitized yet because they simply have not made it to the front of the queue. Others have been digitized but have not been included as part of the full open access program. In many cases, that is fine too.

One example is this scan of the “Project Egress” Apollo 11 hatch reproduction.

This is a scan of a replica of the Apollo 11 hatch created by Adam Savage as part of the 50th anniversary of Apollo 11. Unlike the original hatch, there is at least an argument to be made that the reproduction is protected by copyright. If the underlying object is protected by copyright the Smithsonian may not have the legal ability to release the files under CC0. So it didn’t.

It is OK to Keep Some Things Out of the Open Access Program

The more interesting example is that of the Sculpin Hat.

Sculpin Hat

The hat was a ceremonial object of the Tlingit clan of Sitka, Alaska. It was purchased in 1884, which means that, from a copyright standpoint, it is in the public domain. The Smithsonian scanned the damaged hat in order to create a restored replica for the clan in 2019. That means that they have the scan. And, while the scan is up on the 3D portal for viewing, it has not been released under CC0 and is not even downloadable.

Why not? Because there is more to an open access program than copyright considerations. As the digitization team notes, there are cultural reasons why an object might not be included:

Slide showing that some objects will not be open because of cultural or other reasons

These are complex questions without easy answers, and it is quite reasonable to want to engage in good faith dialogs about them with all of the stakeholders before releasing the digital file without restriction. The Traditional Knowledge labels project is another interesting attempt to begin to engage with these questions.

If Works are Kept Out of the Open Access Program, The Smithsonian Needs to Explain the Rules

While the Smithsonian’s instinct to hold some files back is a reasonable one, it needs to do a much better job of explaining those rules to the public.

The Sculpin Hat has a notice that ‘Usage Conditions Apply’:

image of usage conditions

The same notice appears, somewhat unexpectedly, on the 3D scans of the gloves worn by Neil Armstrong on the Apollo 11 mission:

image of usage conditions

There are at least two problems with this state of affairs. First, the Smithsonian’s use conditions allow for “non-commercial, educational, and personal uses”. However, the files are not actually available for download on the portal. That means even uses within the Smithsonian’s rules are not possible yet.

Second, the popup notice makes it exceedingly unclear how the Smithsonian is imposing these conditions on users. Are these restrictions based in copyright law? If so, and there is no copyright in either the scanned object or the scan file, does that mean that these restrictions are not legally enforceable?

Alternatively, the restrictions may be based in the Smithsonian’s Terms of Use. Assuming the Smithsonian structured the download in a way that required users to agree to those Terms, those Terms could be considered a contract between the Smithsonian and the downloader that governs the use of the files. Basically, the Smithsonian could say that, as a condition of accessing the files, a downloader has to agree to its terms - that would allow the Smithsonian to impose rules without relying on copyright law. However, as currently written, the Terms of Use also seem to frame the Smithsonian’s control over the files as a copyright issue, not an access issue. The usage conditions section of the terms reads in part:

All other Content is subject to usage conditions due to copyright and/or other restrictions and may only be used for personal, educational, and other non-commercial uses consistent with the principles of fair use under Section 108 of the U.S. Copyright Act. All rights not expressly granted herein by the Smithsonian are reserved…

It is fine for the Smithsonian to reserve rights that exist. But framing the use restriction in the context of copyrights that do not exist is exceedingly confusing, if not legally invalid.

As discussed earlier, the Smithsonian may have valid reasons to want to limit access to some digital files. That being said, it also has an obligation to create and describe those limitations in a legally coherent way.


As I said at the outset, this is an exciting time for open access. The Smithsonian’s decision to release a large number of objects and to include 3D objects should help set the standard for open access going forward. While this effort - like all open access efforts - is a work in progress (I can’t help but notice that the Presidential Portraits collection is missing at least one portrait that we know exists, and I know of a few more works that people want to get in the 3D scan queue), it is largely being done with intentionality and thoughtfulness.

While I know that there were many, many people involved in this effort at the Smithsonian, I want to say a special thank you to Effie Kapsalis and Vince Rossi for the crazy amount of work and persistence they put into making this happen. I’m also heartened that my Engelberg Center colleague Neal Stimler was involved in making all of this happen. When an institution as big as the Smithsonian does something like this it makes a huge splash, but that does not mean getting it to happen is easy.

And one last thing - if you want to start imagining what you can do with all of this new culture at your fingertips, there’s a whole page of examples of things that talented artists have done so far.* You could even start with this book.

*I could write a whole other blog post about how important it is to go beyond releasing objects in an open access program and actually model use of those objects by recruiting creators. And maybe I will. But not today. This post is already way too long.

Feature image: Copying in the Louvre by Alfred Henry Maurer

How Explaining Copyright Broke the YouTube Copyright System

This post originally appeared on the Engelberg Center blog.

This is a story about how the most sophisticated copyright filter in the world prevented us from explaining copyright law. It doesn’t involve TikTok dance moves or nuanced 90s remixes featuring AOC. No, it involves a debate at a law school conference over how and when one song can infringe the copyright of another and how exactly one proves in a courtroom if the accused song is “substantially similar” enough to be deemed illegal. In the end, because it was blocked by one of the music companies that owns the song, it also became a textbook study in how fair use still suffers online and what it takes to push back when a video is flagged. A copyright riddle wrapped up in an algorithmic enigma, symbolic of the many current content moderation dilemmas faced by online platforms today.

If you want to watch the video it is available here. If you prefer to listen to it you can subscribe to the Engelberg Center Live Events podcast here. And if you are curious about how new European laws about copyright filtering may impact this sort of situation in the future both inside and outside of Europe, you might be interested in our upcoming conference examining online copyright liability A New Global Copyright Order? on April 20, 2020. You can find out more information about that conference here.

The Video

The video in question was a recording of the “Proving Similarity” panel, which was part of the Engelberg Center’s Proving IP symposium in May of 2019. The panel, which was moderated by Professor Joseph Fishman, featured presentations and discussions by Judith Finell and Sandy Wilbur. Ms. Finell and Ms. Wilbur were the musicologist experts for the opposing parties in the high profile Blurred Lines copyright infringement case. In that case the estate of Marvin Gaye accused Robin Thicke and Pharrell Williams of infringing on Gaye’s song “Got to Give it Up” when they wrote the hit song “Blurred Lines.”

The primary purpose of the panel was to have these two musical experts explain to the largely legal audience how they analyze and explain songs in copyright litigation. The panel opened with each expert giving a presentation about how they approach song analysis. These presentations included short clips of songs, both in their popular recorded version and versions stripped down to focus on specific musical elements.

The Takedown

screenshot from the YouTube copyright summary and status page listing the songs included in the video

Shortly after the video was posted, it received copyright claims on the songs included in it. The video used clips of those songs to illustrate specific points about how they were analyzed in the context of copyright infringement litigation. As such, we were confident that our use of the songs was covered by fair use and disputed the claims using YouTube’s internal system.

Shortly thereafter we received notice that the rightsholder was rejecting our dispute on multiple songs.

screenshot from email alerting us that UMG has decided that the claim is still valid

The Decision and the Question

Still confident that our uses were covered by fair use, we researched the YouTube counternotification process. We discovered that if we continued to challenge the accusation of infringement and lost, our video would be subject to copyright strikes. If the account was subject to multiple strikes, its ability to live stream could be restricted or the account could be terminated. While our colleagues in the communications department were highly supportive of our efforts, they were concerned that one misstep could wipe NYU Law’s entire YouTube presence off the internet.

In deciding how we could continue to press our case, one question remained unclear. Our single video was subject to multiple copyright infringement claims. If we failed to prevail, would the account be subject to one copyright strike, because all of the claims were against a single video, or to multiple strikes, one tied to each claim against that single video? As there were four remaining claims against our video, and three strikes could result in the termination of the account, the distinction was highly relevant to us.

screenshot from the appeal dispute page

Unfortunately, we still do not know the answer to that question. This page seems to come the closest to having an answer, but it does not address our specific question. We tried using the ‘Was this helpful?’ link at the bottom to get additional information, but YouTube did not respond.

The Resolution

This would have been a dead end for most users. Unable to understand how the already opaque dispute resolution process might impact the status of their account, they would have to decide if it was worth gambling their entire YouTube account on the chance that some combination of YouTube and the rightsholder would recognize their fair use claim.

Since we are the center at NYU Law focused on technology and innovation, it was not a dead end for us. We reached out to YouTube through private channels to try to get clarity around the copyright strike rules. While we never got that clarity, some weeks later we were informed that the claims against our video had been removed.

The Takeaway

What lessons can be learned from this process?

First, it highlights how challenging it can be for users with strong counter-arguments to dispute an allegation of infringement by large rightsholders. The Engelberg Center is home to some of the top technology and intellectual property scholars in the world, as well as people who have actually operated the notice and takedown processes for large online platforms. We had a level of legal confidence in our position that would cost an average user tens of thousands of dollars (if not more) to obtain. Even all of those advantages were not enough to allow us to effectively resolve this dispute. Instead, we also had to rely on our personal networks to trigger a process - one that is still unclear - that resulted in the accusations being removed. This is not a reasonable expectation to place on average users.

Second, it highlights the imperfect nature of automated content screening and the importance of process when automation goes wrong. A system that assumes any match to an existing work is infringement needs a robust process to deal with the situations where that is not the case. Our original counterclaim included a clear explanation of the nature of the video and the reasons for using the clips. It is hard to imagine someone with any familiarity with copyright law watching the video, reviewing our claim, and then summarily rejecting it. Nonetheless, that is what happened. No matter how much automation allows you to scale, the system will still require informed and fair human review at some point.

Third, it highlights the costs of things going wrong. The YouTube copyright enforcement system is likely the most expensive and sophisticated copyright enforcement system ever created. If even this system has these types of flaws, it is likely that the systems set up by smaller sites will be even less perfect.

Nonetheless, we are happy that the video has been restored. You can watch it - along with all of the other videos from Proving IP - on the NYU Law YouTube channel. You can also listen to the audio from it and all of the Engelberg Center’s events by subscribing to our live events podcast.

Finally, Europe has recently passed legislation designed to oblige more websites to implement automated copyright filters. Our event A New Global Copyright Order? on April 20 will examine how that legislation will impact Europe and the worldwide conversation around copyright law. We hope to see you there.

Easy Public Domain Picture Frame with the Cleveland Museum of Art Open Access API

In celebration of Public Domain Day 2020 I decided to try to turn the old monitor in my office into a picture frame to display a rotating collection of public domain works. The Cleveland Museum of Art (CMA) launched a robust Open Access program in 2019, so I decided to use their API to power it. This blog post explains all of the steps in creating the project so you can make one too.

This is a fairly lightweight project, so all you need to make it happen is:

  1. A monitor
  2. A raspberry pi (or any other computer)
  3. Some code

Most of this post is about the code. The theory behind this project is that there is a website that regularly pulls a new image from the CMA’s API and displays it along with some information like the work’s title and creator. The raspberry pi boots into a fullscreen browser displaying that page. The screen also needs to automatically turn off at night because it is a waste to keep the monitor on all night when there is no one around to see it.

The entire project is a double celebration of openness. In addition to the works being displayed, the only reason I could even begin to build it is that the open nature of the internet’s architecture allows me to peek at better-designed sites to learn from them. Open educational resources like the Coding Train have taught me just enough javascript to be able to put something like this together.

The Site

I decided (guessed?) that the easiest way to make all of this work was to create a website that displayed the rotating set of pictures. I’m bad at javascript, so this gave me a chance to learn a little bit more about it.

The self-contained site is available in this repo. If you don’t care about how it works, you can just access a live version of it here.

index.html

This file is minimal and straightforward - it is essentially just a container with pointers to a stylesheet and the script. The one thing to note is that the script is inside of a container div:

        <div class='container'>

         <script src='script.js'></script>

        </div>

This allows me to overlay the text descriptions on top of the image.

script.js

This file is the heart of the action. I will walk through each section to explain what it does. All of the console.log lines are just for my own troubleshooting and can basically be ignored.

//function to generate a random number
function getRndInteger(min, max) {
  return Math.floor(Math.random() * (max - min) ) + min;
}

This initial function is used to generate a random number. The random number is needed in two places: first to pick the image from the collection, and second to determine how long the image will stay up before the page refreshes.

//uses the function to pick a random image in the collection
var offset = getRndInteger(1, 31278);
//inserts that random number into the request url, returning a json file
var target_json_url = "https://openaccess-api.clevelandart.org/api/artworks/?limit=10&indent=1&cc0=1&has_image=1&skip=" + offset;

This block of code is used to access the entry via the CMA’s API. I believe that there are 31,277 entries in the CMA’s open access catalog that have an image. The first line picks a random number between 1 and 31,277. The second line uses the API’s syntax to jump to the work that corresponds to that number.

The limit=10&indent=1 elements in the URL are probably unnecessary. The cc0=1&has_image=1 elements are important - they limit results to ones that have a CC0 license and have an image associated with the entry. Those are the open access entries that I care about.

//create new request object instance
let request = new XMLHttpRequest();
//opens the file
request.open('GET', target_json_url);
request.responseType = 'json'
request.send();

This block of code creates an object to hold the JSON file that the API returns at the URL and then loads the JSON file into it. Basically it creates and fills the container for the JSON file that corresponds to the object that we randomly selected above.
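As an aside, the same request could also be made with the newer fetch() API. The script sticks with XMLHttpRequest, but a rough sketch of a fetch() version would look something like this:

//a sketch only - script.js itself uses XMLHttpRequest as shown above
fetch(target_json_url)
    .then(response => response.json())
    .then(response_json => {
        //response_json here is the same parsed JSON that request.response holds in the next block
        console.log(response_json);
    });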

request.onload = function() {
    const response_json = request.response;
    //gets the image URL + tombstone of a random image from the collection and turns it into an array assigned to a variable
    var found_image_info = grabImageInfo(response_json);

    var picked_image_URL = found_image_info[0];
    var picked_image_tombstone = found_image_info[1];
    var picked_image_title = found_image_info[2];
    var picked_image_author = found_image_info[3];
    var picked_image_date = found_image_info[4];

    //creates the image to be  posted
    var img = document.createElement("img");
    img.src = picked_image_URL;

    img.alt = picked_image_tombstone;

    //creates the text
    var tomb_text = document.createTextNode(picked_image_tombstone)

    //creates the linebreak
    var linebreak = document.createElement('br');

    let item = document.createElement('div');
    item.classList.add('item');
    item.innerHTML = `<div class="container"><img class="beach-image"  src="${picked_image_URL}" alt="beach image"/><div class="textStyle">${picked_image_title}<br>${picked_image_author}<br>${picked_image_date}</div></div>`;
    document.body.appendChild(item);

    //set up the refresh
    //time is in ms
    //this sets the range
    var refresh_interval = getRndInteger(5000, 20000)
    console.log("refresh rate = " + refresh_interval);
    //this uses the range to reset the page
    setTimeout(function(){
        location = ''
    },refresh_interval)
}

This block is where most of the work happens, so I’ll break it down into smaller pieces. The reason it is all tucked into a request.onload function is that the code in this block waits to run until the request has successfully loaded the data from the API in the background.

    const response_json = request.response;
    //gets the image URL + tombstone of a random image from the collection and turns it into an array assigned to a variable
    var found_image_info = grabImageInfo(response_json);

This first section assigns the contents of the JSON file to a variable and then sends the JSON file to the grabImageInfo function described below. That function pulls all of the data I care about out of the JSON file and puts it in an array that can be accessed with bracket notation (see next block).

    var picked_image_URL = found_image_info[0];
    var picked_image_tombstone = found_image_info[1];
    var picked_image_title = found_image_info[2];
    var picked_image_author = found_image_info[3];
    var picked_image_date = found_image_info[4];

This section assigns a variable to each element in the found_image_info array.
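If you prefer, the same five assignments could be collapsed into a single array destructuring statement; this is just an alternative sketch, not what the script actually does:

//equivalent to the five assignments above, using array destructuring
var [picked_image_URL, picked_image_tombstone, picked_image_title,
     picked_image_author, picked_image_date] = found_image_info;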

    //creates the image to be  posted
    var img = document.createElement("img");
    img.src = picked_image_URL;

    img.alt = picked_image_tombstone;

This section creates an image element. The source is the URL that comes from the JSON file and the alt text is the tombstone text from the JSON file.

    let item = document.createElement('div');
    item.classList.add('item');
    item.innerHTML = `<div class="container"><img class="beach-image"  src="${picked_image_URL}" alt="beach image"/><div class="textStyle">${picked_image_title}<br>${picked_image_author}<br>${picked_image_date}</div></div>`;
    document.body.appendChild(item);

This section creates the HTML to be added to the index.html file. The item.innerHTML section creates an HTML payload with the image and the title, author, and date overlaid on top of it. If you want to change what is displayed over the image, this is where you should start messing around.
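For example, a hypothetical tweak that overlays the full tombstone text instead of the separate title, author, and date lines might look something like this:

//hypothetical variation: show the full tombstone text in the overlay
item.innerHTML = `<div class="container"><img class="beach-image" src="${picked_image_URL}" alt="${picked_image_tombstone}"/><div class="textStyle">${picked_image_tombstone}</div></div>`;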

    //set up the refresh
    //time is in ms
    //this sets the range
    var refresh_interval = getRndInteger(5000, 20000)
    console.log("refresh rate = " + refresh_interval);
    //this uses the range to reset the page
    setTimeout(function(){
        location = ''
    },refresh_interval)

This is the section that sets up the page refresh. The arguments you pass to the getRndInteger function determine the bounds of the refresh rate. Remember that the numbers are in ms. I decided to make this slightly random instead of a fixed number to add a bit of variability to the display.

function grabImageInfo(jsonObj) {

    //pulls the elements of each piece and assigns it to a variable
    var data_url = jsonObj['data'][0]['images']['web']['url']
    var data_tombstone = jsonObj['data'][0]['tombstone']
    console.log(data_tombstone)
    var data_title = jsonObj['data'][0]['title']
    //the author info sometimes doesn't exist, which screws up the function. Pulling this part out of the function fixes it because the jsonObj is not evaluated before the try/catch. I am not sure what that means but it works.
    try {
         data_author = jsonObj['data'][0]['creators'][0]['description']
     }
     catch (e) {
         data_author = ''

     }
    var data_creation_date = jsonObj['data'][0]['creation_date']

    console.log("url = " +data_url)

    //creates an array with the URL, tombstone, title, author, and creation date of the random object picked
    var function_image_data = [data_url, data_tombstone, data_title, data_author, data_creation_date]
    //returns that array
    return function_image_data;
}

This is the function that extracts data from the JSON file. It pulls each relevant element and then adds it to an array. Each of the var data_url = jsonObj['data'][0]['images']['web']['url'] style assignments is essentially the same, the only difference being where in the JSON file it looks for the relevant data.

try {
     data_author = jsonObj['data'][0]['creators'][0]['description']
 }
 catch (e) {
     data_author = ''

 }

The author variable works slightly differently. Sometimes the author data does not exist in the record, and trying to read it throws an error. The try/catch structure lets the script fall back to an empty string instead of crashing.
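In browsers that support optional chaining, the same fallback could probably be written without the try/catch, though the version above works fine:

//a possible alternative to the try/catch, assuming optional chaining support
var data_author = jsonObj['data'][0]['creators']?.[0]?.['description'] ?? '';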

var function_image_data = [data_url, data_tombstone, data_title, data_author, data_creation_date]
//returns that array
return function_image_data;

Finally, each element of the data is put into an array and returned out of the function. The order of how the data is added to the array is arbitrary, but it is consistent so if you move something around here make sure to change how you pull them out at the top of the script.

style.css

This is also a fairly straightforward css file. The .textStyle section is what you use to style the text. I also believe that the .container section needs position set to relative in order for the overlay to work.

The most interesting part of the file is probably the @font-face section. That loads the custom font. The font is the fantastic font that the Cooper Hewitt made available as part of their open access project a few years ago. I always like using the font for open access-related projects. The fonts live in the /data folder. They are applied to all of the text in the * section.

The Pi

Once you have everything up and running you can access it from any browser. You can try it here, press F11, and just let it happen in full screen.

If you want to run it constantly as a picture frame it makes sense to devote a computer to the task. A Raspberry Pi is a perfect candidate because it is inexpensive and draws a relatively small amount of electricity.

You could set things up so the pi hosts the file locally and then just opens it. I decided not to do that, mostly because that would involve automatically starting a local server on the pi, which was one more thing to set up. Since the service needs to be online to hit the API anyway, I thought it would be easier to just set up the page on my own domain. I have no idea if that is actually easier.

There are two and a half things you need to do in order to set the pi to automatically boot into displaying the site in fullscreen mode as a full time appliance.

Start in Fullscreen Mode

You can start Chromium in fullscreen mode from the command line. That means you can add the line to the pi’s autostart file. Assuming your username is just ‘pi’ (the default when you start raspbian), open a terminal window and type:

nano /home/pi/.config/lxsession/LXDE-pi/autostart

This will allow you to edit the autostart file directly. Add this line to the file (which is probably otherwise blank):

@chromium-browser --start-fullscreen michaelweinberg.org/cma_pd

You can change the final URL to whatever you like. If you are hosting your own version of this page, that is where to make the switch.

You may find that your fullscreen display still gets a scroll bar on one side. If that’s the case, the half thing you need to do is open Chromium and type chrome://flags in the address bar. Once you are looking at the flags, search for overlay scrollbars and enable it. That will hide the scroll bars.

Turn off the Screen

The final thing you might want to do is turn off the screen of the display at night. In order to do this you need to make two entries in cron, the Linux utility that allows you to schedule commands. Here is a nice intro to cron.

The commands you end up scheduling may vary based on your particular setup. This is a helpful tutorial laying out options to make this happen. The ones that worked for me were the vcgencmd ones.

In order to schedule those I opened a terminal window and typed crontab -e. I then added two lines. This line turned off the display: vcgencmd display_power 0 and this line turned it back on: vcgencmd display_power 1. Use crontab to schedule these at appropriate times.
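For example, the two crontab entries might look something like this (the times here are just placeholders for whenever you want the display to turn off and back on):

# example schedule - display off at 23:00, back on at 07:00
0 23 * * * vcgencmd display_power 0
0 7 * * * vcgencmd display_power 1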


That’s that. This will let you set up a rotating set of public domain images on any display you might have access to. Good luck with your own version.

List image: The Biglin Brothers Turning the Stake, Thomas Eakins, 1873