Pi-Powered Berlin BVG Alerts

Moving from NYC to Berlin gave me an excuse to update my old Pi-Powered MTA Subway Alerts project for the BVG. Now, as then, the goal of the project is to answer the question “if I leave my house now, how long will I have to wait for my subway train?” This time, though, it answers that question for trams as well as subway trains.

The full repo is available here.

The project uses a Raspberry Pi Zero to connect the BVG real-time arrival data to neopixels. The neopixels give you an indication of how far away trains are from the station. Importantly, the alerts are not based on the absolute time until the train arrives at the station (“A train will arrive at the station in 5 minutes”). Instead, the alerts account for how long it takes to walk to the station from my apartment, so they are based on how long you will wait once you get there (“If you leave now and walk to the station, there will be a train arriving in the station 5 minutes after you get there.”).

image of indicator light, with led strips for each line and text explaining what some of the lights represent

For example, the strip coming down from the top represents the southbound M10 tram. The light at the bottom of that strip (the light closest to the center) will be on when the tram is coming “now,” with “now” being defined as “if you leave the apartment now and start walking towards the station, the tram will be at the platform when you get there.” Similarly, “2 minutes” means “you can wait 2 minutes to leave” or “you will wait at the station for 2 minutes if you leave now.”

Everything is basically the same as the MTA version of the project, except that I am using a strip of LEDs instead of individually soldering them. This is much easier!

Install Neopixels on Pi

There are some tricks to setting up the neopixels to work on the Pi. Here are the steps, which are spread across a few Adafruit explainers:

  1. Install the Blinka library so you can use CircuitPython: https://learn.adafruit.com/circuitpython-on-raspberrypi-linux/installing-circuitpython-on-raspberry-pi (You need to activate the virtual environment every time: source env/bin/activate)

  2. Install the neopixel library: https://learn.adafruit.com/neopixels-on-raspberry-pi/python-usage

  3. Do the things required to use sudo with the neopixel library: https://learn.adafruit.com/python-virtual-environment-usage-on-raspberry-pi/usage-with-sudo (sudo -E env PATH=$PATH python3 neo_test.py)

  4. Make it run at startup: https://learn.adafruit.com/python-virtual-environment-usage-on-raspberry-pi/automatically-running-at-boot

The Code

The code starts with a bunch of settings, including the station you are pulling data about, how long it takes to get from wherever you are to the platforms, and lists for the lines you are tracking.

grabber()

After that, the code is basically two functions. grabber() gets all of the departure times related to the lines and puts them in the lists. Before putting each departure time in the list, it adjusts it based on the travel time. For example, if your tram_walk_time is 6 and a tram is scheduled to leave in 10 minutes, grabber() will add it to the list as 4 because, by the time you walk to the station, the train will be leaving in 4 minutes.
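Here is a minimal sketch of what that adjustment could look like. The post does not show get_modified_departure_time() itself, so the body below is a guess: it assumes the departure time arrives as an ISO 8601 string with a UTC offset, that trams are identified by a productName of 'Tram', and that there is a second setting (called subway_walk_time here, a made-up name) for the subway platforms. The optional now argument is also an addition, only there to make the example repeatable.

```python
from datetime import datetime, timezone

# tram_walk_time is named in the post; subway_walk_time is a hypothetical name.
tram_walk_time = 6
subway_walk_time = 8


def get_modified_departure_time(when, product_name, now=None):
    """Return minutes until departure, minus the walk to the platform."""
    # Assumption: `when` is an ISO 8601 timestamp that includes a UTC offset.
    departure = datetime.fromisoformat(when)
    if now is None:
        now = datetime.now(timezone.utc)
    minutes_away = int((departure - now).total_seconds() // 60)
    # Assumption: trams are labeled 'Tram'; anything else is treated as subway.
    walk_time = tram_walk_time if product_name == 'Tram' else subway_walk_time
    return minutes_away - walk_time


# The example from the text: a tram leaving in 10 minutes with a 6 minute walk.
now = datetime.fromisoformat('2024-11-01T12:00:00+01:00')
tram_minutes = get_modified_departure_time('2024-11-01T12:10:00+01:00', 'Tram', now)
print(tram_minutes)  # 4
```

A negative result means the train will have left before you reach the platform.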

If you are customizing this, you will need to update all of the if statements that parse the train info so they are pulling data for the correct lines and directions. For example, here is the if statement for the U5:

if i['line']['name'] == 'U5':
     if i['direction'] == 'Hönow':
          u5_eastbound.append(get_modified_departure_time(i['when'], i['line']['productName']))
     elif i['direction'] in ("S+U Hauptbahnhof", "Hauptbahnhof"):
          u5_westbound.append(get_modified_departure_time(i['when'], i['line']['productName']))
     else:
          error_direction = i['direction']
          print(f'unexpected U5 direction: {error_direction}')

First, it finds all of the entries for the U5:

if i['line']['name'] == 'U5':

Then it looks for trains in the direction of Hönow:

if i['direction'] == 'Hönow':

Then it runs the get_modified_departure_time() function to get the modified departure time (the actual departure time modified by how long it takes to get to the station) and appends it to the u5_eastbound list created at the top of the script.

The same thing happens for trains headed towards Hauptbahnhof, with an error message if something goes wrong.

Once grabber() is done, each of the line lists is full of modified times for trains.
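If you are tracking a lot of lines, one possible alternative to a long chain of if statements is a lookup table keyed on line and direction. This is not how the script in the repo works, just a sketch of the same sorting step. ROUTES, sort_departures(), and the sample data are all made up for illustration; the sample entries only copy the shape that the if statements above expect.

```python
# Hypothetical lookup table: (line name, direction) -> name of the list to fill.
ROUTES = {
    ('U5', 'Hönow'): 'u5_eastbound',
    ('U5', 'S+U Hauptbahnhof'): 'u5_westbound',
    ('U5', 'Hauptbahnhof'): 'u5_westbound',
}


def sort_departures(departures, modify):
    """Group modified departure times into one list per line and direction."""
    lists = {name: [] for name in set(ROUTES.values())}
    for i in departures:
        key = (i['line']['name'], i['direction'])
        target = ROUTES.get(key)
        if target is None:
            # Same idea as the else branch in the if statement version.
            print(f'unexpected line/direction: {key}')
            continue
        lists[target].append(modify(i['when'], i['line']['productName']))
    return lists


# Made-up sample data in the shape the parsing code expects.
sample = [
    {'line': {'name': 'U5', 'productName': 'U'}, 'direction': 'Hönow',
     'when': '2024-11-01T12:10:00+01:00'},
    {'line': {'name': 'U5', 'productName': 'U'}, 'direction': 'Hauptbahnhof',
     'when': '2024-11-01T12:12:00+01:00'},
    {'line': {'name': 'U5', 'productName': 'U'}, 'direction': 'Somewhere else',
     'when': '2024-11-01T12:14:00+01:00'},
]


def keep_when(when, product_name):
    # Stand-in for get_modified_departure_time() so the example runs on its own.
    return when


result = sort_departures(sample, keep_when)
print(result['u5_eastbound'])
print(result['u5_westbound'])
```

Adding a line or a direction then means adding a row to the table instead of writing another if block.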

lighter()

The second function, lighter(), uses the lists to light up the neopixels. The logic for which lights turn on based on the train time is in this block of code:

for i in arrival_list:
     if 0 <= i <= 1:
          #light the corresponding light
          pixels[light_1] = light_color
          #remove the light from the list so it does not go black
          if light_1 in light_list: light_list.remove(light_1)
     elif 2 <= i <= 3:
          pixels[light_2] = light_color
          if light_2 in light_list: light_list.remove(light_2)
     elif 4 <= i <= 7:
          pixels[light_3] = light_color
          if light_3 in light_list: light_list.remove(light_3)
     elif 8 <= i <= 12:
          pixels[light_4] = light_color
          if light_4 in light_list: light_list.remove(light_4)
     elif 13 <= i <= 20:
          pixels[light_5] = light_color
          if light_5 in light_list: light_list.remove(light_5)
     else:
          pass 

You can set the bands to be whatever you want by changing the values on the if and elif lines. For example:

elif 2 <= i <= 3:
     pixels[light_2] = light_color
     if light_2 in light_list: light_list.remove(light_2)

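means that if the train arrival time is between 2 and 3 minutes away (elif 2 <= i <= 3:), the second light for the line will turn on in the appropriate color (pixels[light_2] = light_color). You could change elif 2 <= i <= 3: to elif 1 <= i <= 10: if you wanted a wider band or whatever. Just keep in mind that the checks run in order and the first match wins, so a wider band will swallow part of the bands after it unless you adjust those too.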
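To see the whole idea in one place, here is a rough, hardware-free sketch of a lighter()-style function. It keeps the bands in a table so they are easy to change, and it uses a plain Python list in place of the real neopixel object so it can run on any computer. The color values, the LINE_COLORS name, and the step that turns off the leftover lights are assumptions based on the comments in the snippet above, not the actual code from the repo.

```python
# The same bands as the if/elif chain, as (low, high) minute ranges.
BANDS = [(0, 1), (2, 3), (4, 7), (8, 12), (13, 20)]

# Hypothetical colors; the real script defines its own.
LINE_COLORS = {'u5': (120, 60, 20), 'm10': (0, 120, 0)}
OFF = (0, 0, 0)

# Plain list standing in for the neopixel strip.
number_of_pixels = 30
pixels = [OFF] * number_of_pixels


def lighter(arrival_list, line, *lights):
    """Light the pixel for each band that has a train in it; blank the rest."""
    light_color = LINE_COLORS[line]
    # Start by assuming every light for this line should go dark.
    light_list = list(lights)
    for minutes in arrival_list:
        for (low, high), light in zip(BANDS, lights):
            if low <= minutes <= high:
                pixels[light] = light_color
                # Remove the light from the list so it does not go black.
                if light in light_list:
                    light_list.remove(light)
                break
    # Anything left in the list has no train in its band.
    for light in light_list:
        pixels[light] = OFF


pixels[16] = (9, 9, 9)  # pretend this light was left on from the last pass
lighter([0, 5, 30], 'u5', 15, 16, 17, 18, 19)
print(pixels[15:20])
```

In this run the trains at 0 and 5 minutes light the first and third pixels for the line, the train 30 minutes out is ignored, and the stale second pixel is switched off.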

The Loop

Now that the functions are set, the script just loops.

First it checks the current time:

now_time = datetime.now()

I only need the lights running during the day, so it then checks to see if the current time is between 8am and 10pm:

if 8 <= now_time.hour < 22:

If it is, it runs grabber() and then lighter() for all of the lines. In addition to giving lighter() the line argument, it identifies which actual pixel in the strip is the “first,” “second,” etc. pixel for that color. The strip is continuous, so the 15th pixel in an absolute sense might be the first pixel in the U5 westbound set of lights.

lighter(u5_westbound, 'u5', 15, 16, 17, 18, 19)

If it is not during the day, the pixels all turn off:

else:	
     #turn them off 
     for i in range(number_of_pixels):
          pixels[i] = (0,0,0)
     print('lights out')

Finally, the script waits for 10 seconds before doing it all again:

time.sleep(10)

That seems long enough to keep the data reasonably accurate without hammering the BVG servers.
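Put together, the loop looks roughly like the sketch below. The helper names (is_daytime, run_once, main_loop) are made up for this example, and the update and lights_out arguments stand in for "run grabber() and lighter() for every line" and "turn every pixel off" so that the logic can be tried without a Pi attached.

```python
import time
from datetime import datetime


def is_daytime(now_time, start_hour=8, end_hour=22):
    """True from 8:00 up to (but not including) 22:00."""
    return start_hour <= now_time.hour < end_hour


def run_once(now_time, update, lights_out):
    """One pass of the loop: refresh the lights by day, blank them at night."""
    if is_daytime(now_time):
        update()
        return 'updated'
    lights_out()
    return 'lights out'


def main_loop(update, lights_out, delay=10):
    """Repeat forever, pausing between passes to go easy on the servers."""
    while True:
        run_once(datetime.now(), update, lights_out)
        time.sleep(delay)


# Try a single pass with stand-in functions instead of real hardware.
calls = []
morning = run_once(datetime(2024, 11, 1, 9, 30),
                   lambda: calls.append('update'),
                   lambda: calls.append('off'))
night = run_once(datetime(2024, 11, 1, 23, 0),
                 lambda: calls.append('update'),
                 lambda: calls.append('off'))
print(morning, night, calls)
```

Splitting the time check out into its own function makes it easy to change the hours, or to test the behavior at 7:59 and 22:00 without waiting around.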

New Open GLAM Toolkit & Open GLAM Survey from the GLAM-E Lab

This post originally appeared on the Engelberg Center blog

Today the GLAM-E Lab, a collaborative project between the Engelberg Center and the University of Exeter (UK), is releasing a number of tools and resources for the open GLAM (Galleries, Libraries, Archives, and Museum) community.

First, the GLAM-E Lab has launched an Open GLAM Toolkit! This suite of tools, developed directly with GLAM organizations, can be used by any cultural organization to develop their own access program and release collections for public reuse. The toolkit even includes templates for model internal and external open access policies for setting up new workflows and website policies.

Second, today the GLAM-E Lab has also launched a website-based version of the Open GLAM Survey. The Survey’s new format makes it much easier to find, explore, and analyze open GLAM organizations around the world than was previously possible via the Google Spreadsheet format.

Third, both of these are only possible because of our collaborators’ engagement. The GLAM-E Lab model is to work directly with GLAM organizations to remove legal barriers to creating open access programs, and convert that work into the standard toolkits that other organizations can use. We set a goal to work with 24 different GLAM organizations by the end of 2024, and we’ve even exceeded that goal!

Finally, all of this work led to the GLAM-E Lab winning Wikimedia UK’s Partnership of the Year Award for 2024!

You can watch our announcement video on YouTube and find more details below on these announcements. Of course, if you or someone else would be interested in working with us in 2025, please let us know!

OpenGLAM Toolkit

The Open GLAM Toolkit is built on everything that we have learned from working with GLAM-E Lab collaborators. When used together, the toolkit resources will help cultural organizations identify, prepare, and publish their digital collections for open access using public domain or other machine readable statements. It includes:

OpenGLAM Survey 2.0

Version 2.0 of the OpenGLAM Survey brings the OpenGLAM Survey to a new, more user-friendly interface. You can sort organizations by type, licenses and the platforms used. The new interface also makes it easier for us to expand the survey and keep its data up to date.

We’ve Collaborated with More than 24 Organizations!

The GLAM-E Lab model is simple: work directly with individual organizations to remove legal barriers to open access programs, and turn what we learn during that work into standard tools and documents that organizations of any size can use.

Of course, all of this depends on having organizations that are open to tackling collections management issues with us in the first place. That’s why we are so excited to wrap up 2024 having worked with over 24 organizations on rights-related issues and questions on open access. You can find the list of collaborators on the GLAM-E site.

hero image: Gereedschappen voor het vervaardigen van een mezzotint from the Rijksmuseum collection.

What Does an Open Source Hardware Company Owe The Community When it Walks Away?

This week Prusa Research, once one of the most prominent commercial members of the open source hardware community, announced its latest 3D printer. The printer is decidedly not open source.

That’s fine? My support of, and interest in, open source hardware is not religious. I think open source hardware can be an incredibly effective tool to achieve a number of goals. But no tool is fit for all purposes. If circumstances change, and open source hardware no longer makes sense, people and companies should be allowed to change their strategies as long as they are clear that is what they are doing. Hackaday does a good job of covering the Prusa-specific developments, and Phil has covered other examples (I hesitate to call it a ‘larger trend’ because I don’t think that’s quite right) on Adafruit.

Still, I do believe a company that builds itself on open hardware owes the community an honest reckoning as it walks out the door. Call it one last blast of openness for old time’s sake.

Specifically, I think the company should explain why openness does not work for them anymore. And not just by waving their hands while chanting vaguely about unfair copying or cloning. They should seriously engage with the issue, explaining how their approach was designed, what challenges it faced, and why open strategies were not up to the task of overcoming those challenges.

This discussion and disclosure is not a punishment for walking away from open, or an opportunity for the community to get a few last licks in. Instead, it is about giving the community more information because that information might be useful to it. Open source hardware is about learning from each other, and how to run an open hardware business is just as important a lesson as how to create an open hardware PCB.

What Could This Look Like?

Last year Průša (the person) raised concerns about the state of open source hardware, framing his post as kicking off a “discussion.” Members of the community took that invitation seriously. I responded with a series of clarifying questions and comments. So did my OSHWA co-board member Thea Flowers, and Phil at Adafruit. Průša is under no obligation to respond to any one of these (me yelling “debate me!” on the internet does not create an obligation on the person to actually respond).

However, kicking off a self-styled discussion, having a bunch of people respond, and then doing . . . nothing does not feel like the most good faith approach to exploring these questions. None of the questions in the response posts were particularly aggressive or merely rhetorical - they were mostly calls for more clarity and specificity in order to inform a more thoughtful discussion.

Without that clarity, we are stuck in a vague space that does not really help anyone understand things better. As the Hackaday article astutely points out:

The company line is that releasing the source for their printers allows competitors to churn out cheap clones of their hardware — but where are they?

Let’s be honest, Bambu didn’t need to copy any of Prusa’s hardware to take their lunch money. You can only protect your edge in the market if you’re ahead of the game to begin with, and if anything, Prusa is currently playing catch-up to the rest of the industry that has moved on to faster designs. The only thing Prusa produces that their competitors are actually able to take advantage of is their slicer, but that’s another story entirely. (And of course, it is still open source, and widely forked.)

If moving from open to closed prevents cheap clones, how does that actually work? That would be useful information to the entire open source hardware community! If it does not prevent cheap clones, why use that as a pretext? Also, useful information to the community!

Feature image: Political Discussion in a Lumber Shanty from the Smithsonian Open Access collection

Keep 3D Printers Unlocked (the win! 2023)

Last summer I submitted a request that the Copyright Office renew an existing rule that allows users to break DRM that prevents them from using materials of their choice in 3D printers. As of October 28th, that rule has been renewed for another three years.

This is good news! Copyright law should not allow 3D printing manufacturers to force users to only use approved materials. Wins in copyright policy world are rare, so let’s celebrate one when it comes.

This request was part of a larger every-three-year process involving dozens of requests to allow people to do things that are not prohibited by regular copyright law, but are prohibited by a special provision of copyright law that prohibits breaking digital locks (even for otherwise legal purposes!).

This time around, that larger process was a bit of a mixed bag. The good news is that many of the existing exemptions were renewed. The less good news is that some of the new exemptions were approved on highly restricted terms, making them much less useful.

What happens now? 3D print with whatever material you want, free of fear of a copyright lawsuit (at least over using unapproved material in your printer - what you print can still get you into trouble). Three years from now, we’ll do this dance yet again.

Licensing Deals Between AI Companies and Large Publishers are Probably Bad

Licensing deals between AI companies and large publishers may be bad for pretty much everyone, especially everyone who does not directly receive a check from them.

Although the initial copyright lawsuits from large content companies like Getty Images and music labels are still very much ongoing (with new ones being filed regularly), recently we’ve also seen a series of licensing deals between large content owners and AI companies.

Setting aside the wisdom of the deal for any individual content company, I worry that these licensing deals represent a bad outcome for just about everyone else. Most of the companies entering into these agreements represent a relatively large amount of cultural power (that can be leveraged into public pressure), and a relatively small corpus of works (relative to the amount of works required to train a model), backed up with enough legal power to qualify as a plausible threat to an AI company. That puts them in a position to demand compensation that is out of proportion to their actual contribution to any given model.

The deals that flow from this dynamic allow a small number of companies to claim a disproportionate amount of compensation for their relatively modest contributions to a training dataset. In doing so, the licenses establish a precedent that may undermine the fair use defense for unlicensed training of models, making it harder for smaller competitors to enter the AI market.

This might be a positive development if these deals also increased the likelihood that everyone who created data used to train models would receive significant compensation.* However, these deals likely marginally decrease the likelihood of that outcome by allowing the media companies signing these deals to soak up most of the available licensing dollars before the vast majority of people and companies who created data in the training datasets are involved. The most likely outcome could be one similar to Spotify, where large record labels and a handful of high-profile artists receive significant compensation, while everyone else receives fractions of pennies (or no pennies).

Licensing Dollar Roll Up

It is easy for anyone who wants to be paid a licensing fee by AI model trainers to see these deals as a positive development. They may set a precedent that data must be licensed, and a market rate for data that applies to everyone else.

However, at this stage there does not appear to be any reason to see these deals as setting a standard for anything other than large (or large-ish) media companies and rightsholders. These deals do not set benchmarks for independent artists, or for anyone without the existing cultural and legal clout to demand them. After all, the terms of these deals aren’t even public.

It may be better to understand these deals as the large media companies and rightsholders jumping to the front of the line in order to soak up as much available licensing money as possible. Their incentive is to maximize the percentage of the licensing pool that they receive - not to set a standard on behalf of everyone else, or to grow the pie for others. In fact, every dollar of value that someone outside of the deal can claim is a dollar the large media companies cannot include in their own deal with the AI companies.

The result is that the large media companies leverage “creators should be paid” rhetoric to roll up all of the available licensing dollars, while making it marginally harder for anyone else to be paid for being part of the training data.

Which seems bad! As a bonus, these deals may undermine the fair use defense that allows the models to be created in the first place.

Blocking Competition

The copyright lawsuits over data used to train models all turn on whether or not the training is covered by fair use. If the act of training models on data is fair use, the trainers do not need permission from the data rightsholders (I think this is both the better reading of the law and the better policy outcome). If the act of training is not fair use, the trainers will need permission from every rightsholder of every bit of data they use to train their models.

Determining fair use involves applying a four-factor test, one of which is the effect of the use on the potential market for the data. I’m confident that the AI company lawyers are crafting these agreements with an eye towards avoiding establishing a market for AI training data (available public information on the deals suggests that they are framed in terms of making it easier to access the data through APIs or other bulk data transfers, not a license to the data itself). Nonetheless, the existence of these deals does probably marginally increase the likelihood that courts would decide that there is a functioning market for licensing training data.

If that were the case, and courts found that the majority of the other fair use factors pushed against a finding of fair use, that would mean that only companies with enough money to license training data at scale could train new AI models. I think this would probably be a bad policy outcome because it could effectively block new market entrants in AI. And working out the licensing process would be somewhere between complicated and impossible.

All of which makes these deals bad for pretty much everyone. They are bad for any creators who are not being directly paid by them, bad for anyone who would welcome new competition in AI, and bad for anyone who generally thinks that non-consumptive uses of information on the internet should be protected by fair use.

*I currently believe that compensating everyone who created data used to train models is a bad idea, but I understand why it is an attractive option to many people.

Hero Image: A nun frightened by a ghost playing a guitar; page 65 from the “Images of Spain” Album (F)