Multimedia Archiving

Wouldn’t it be awesome if there were an easy way to catalog and control the ever-rising deluge of photos and videos we generate with our devices, a system of organizing that could be transferred to future family members for safe-keeping? What if this system had the following traits?

  1. locally-controlled (by you)
  2. decentralized (resilient)
  3. platform-independent
  4. with a standardized file structure
  5. and a standardized file naming scheme
  6. that is both effortless
  7. and flexible

Well, that would be awesome, indeed.

It so happens I have such a system. It achieves the first five of the above traits, but it’s not yet effortless (if there is such a thing) or flexible. Without me the system falls into disarray. This post isn’t exactly a life hack – not yet, anyway. It’s the first of what will be several posts on this project as I get closer to making it more flexible and easier for others to use. With this post I wanted to explore the reasons for such a system, and to illustrate the general idea.

Ten years ago my dad gave me a box of photos and slides, which I scanned and integrated into my family’s multimedia mess. This effort began a home-grown archiving system that would come to be known as the Multimedia Archive Project (MAP). For me it’s a workable solution, still evolving today.

My ultimate long-term goal is to establish a system (more a protocol than an actual set of tech) that is easily transferable to my kids and subsequent generations. A decade later I’m in a holding pattern, still looking for technology that could suit my needs. Here are some more details about what I mean by the above traits and why they’re important to me.

What is “locally-controlled”? I want the primary location of my family’s multimedia files to remain in my hands, so to speak. I’m not contributing to the oceans of photographic knowledge that an artificial intelligence uses to shape the world in ways I don’t see fit. This might seem paranoid now, but the world is starting to understand that “free” services for individuals have big costs to society as a whole. It’s very important that I maintain control.

“Decentralized” just means the data lives in enough places that no single failure can wipe it out. This part isn’t exactly effortless. It requires discipline and planning that most people aren’t willing to take on. The Multimedia Archive Project is backed up by the same system I use to back up all the data in my household, which includes consolidation and copies made to mirrored USB drives, a NAS, and a cloud service based in Switzerland that is a stickler for General Data Protection Regulation (GDPR) rules.

“Platform-independent” is another cornerstone of this project. I don’t want to be locked into any single app, or depend on one company’s services. When I started this effort ten years ago the big cloud services were making it difficult to switch platforms. They’re a little nicer now, but to some degree this is still true.

Apple offers a great all-in-one photo archiving solution, and they’ll no doubt be around for decades to come. I’d be the first to recommend Apple to anyone who doesn’t have the discipline or technical chops to handle a DIY solution, based on their track record of quality software (iTunes for Windows notwithstanding) and data privacy. Still, I prefer platform-independence. My family and I have some Apple devices, but we don’t have a Mac. What if I put my trust in a company that changes the rules twenty years down the road in a way that is unethical, inconvenient, and/or too expensive for me?

At the opposite end of the spectrum, I’d rather delete everything than trust a mind-control advertising platform like Facebook or Google with my family memories. Last year we went skiing with another family in beautiful Niseko, Japan. The other dad and I were on the mountain together one afternoon and he took a video of me as we skied down the slopes. I thought it would be an awesome video, with the sun in the right position and the spectacular scenery. And it was! The only problem was it had been live-streamed to Facebook and I didn’t have a Facebook account. Never mind me. What if the photographer wanted to preserve this video – or any media – and pass it down to his kids? They’d need Facebook accounts, too. This illustrates the importance of platform-independence. It’s the freedom to never be locked down to a proprietary system that defines how you can use your own stuff.

There are some very positive trends in digital identity that could work in my favor as the decades unfold (see previous post). Bottom line, I want the flexibility of moving my family memories securely and safely, with the maximum privacy levels, whenever I want.

The “standardized file structure” and “file naming scheme” are the coolest features of the system. They’re inspired by ISO standards. This gets into how this system works.

How does this thing work?

First, there are rules, because every system has rules. Fortunately most of the rules are enforced by code, but the first one must be observed by humans: NEVER MODIFY ORIGINALS.

The second rule is that there is one and only one destination path for any given source device, a file folder in the ORIGINALS directory. For example, there is a folder for all the originals backed up from my wife’s iPhone, a folder for my camera photos, a folder for our GoPro, and so on. These devices and paths are configured in an XML file. This system runs on Windows, so I use PowerShell. In the future I might go with Linux and Python.
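To make rule two concrete, here is a minimal sketch of what such a device configuration could look like and how a script might read it. The real scripts are PowerShell; this sketch uses Python, the possible future direction mentioned above. The element names, attributes, drive letters, and the `Device` and `load_devices` names are all hypothetical, since the actual XML schema isn’t shown in this post.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from pathlib import Path

# Hypothetical config layout: the real file's schema is not shown in the post.
SAMPLE_CONFIG = """
<archive originals="D:/MAP/ORIGINALS">
  <device name="wife-iphone" source="E:/DCIM" folder="wife-iphone"/>
  <device name="gopro" source="F:/DCIM" folder="gopro"/>
</archive>
"""


@dataclass(frozen=True)
class Device:
    name: str
    source: Path       # where the device shows up when plugged in (assumed)
    destination: Path  # its one and only folder under ORIGINALS


def load_devices(xml_text: str) -> list[Device]:
    """Parse the config and check that every device has its own folder."""
    root = ET.fromstring(xml_text)
    originals = Path(root.attrib["originals"])
    devices = [
        Device(
            name=node.attrib["name"],
            source=Path(node.attrib["source"]),
            destination=originals / node.attrib["folder"],
        )
        for node in root.findall("device")
    ]
    # Rule two: no two devices may share a destination folder.
    destinations = [d.destination for d in devices]
    if len(set(destinations)) != len(destinations):
        raise ValueError("each device needs its own ORIGINALS folder")
    return devices


def connected(devices: list[Device]) -> list[Device]:
    """Keep only devices whose source path currently exists (plugged in)."""
    return [d for d in devices if d.source.exists()]
```

Calling `connected(load_devices(SAMPLE_CONFIG))` would give the list of devices worth scanning on a given run. Checking for duplicate destinations at load time is one way to let code, rather than discipline, enforce the rule.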

The process begins by running a script to “add new files to archive.” It reads the XML file for source and destination paths, checks whether the devices in question are plugged into the system, and if so compares the latest photo and video files on the device with what’s already in the archive. If there’s new stuff, it copies it to the destination folder. I run a separate script to rename the files in a standard format so that anyone can take one look at the file name and know the date it was created, by whom, and where (all this data is available in the metadata of standard media files). A five-digit sequence number is tacked onto the end of the file name. Ten years ago I never thought I’d have more than 99,999 files per device, but who knows? My wife is approaching 10,000 photos and videos now after five years with one iPhone (and these are the files remaining after she deletes stuff from her phone).
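The two steps above can be sketched roughly as follows, again in Python rather than the PowerShell the system actually runs on. Several things here are assumptions, not the real implementation: “new” is decided by comparing modification times against the newest file already archived, the list of media extensions is illustrative, and the file name pattern (ISO 8601 style timestamp, owner, place, five-digit sequence) is a hypothetical stand-in for the actual naming scheme, which this post doesn’t spell out. Reading the date and location out of the media metadata is left out entirely.

```python
import shutil
from datetime import datetime
from pathlib import Path

# Illustrative list only; extend to whatever the devices actually produce.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".heic", ".png", ".mov", ".mp4"}


def newest_mtime(folder: Path) -> float:
    """Modification time of the newest file in the folder, or 0.0 if empty."""
    times = [p.stat().st_mtime for p in folder.rglob("*") if p.is_file()]
    return max(times, default=0.0)


def find_new_files(source: Path, destination: Path) -> list[Path]:
    """Media files on the device that are newer than anything archived."""
    cutoff = newest_mtime(destination) if destination.exists() else 0.0
    candidates = [
        p for p in source.rglob("*")
        if p.is_file()
        and p.suffix.lower() in MEDIA_EXTENSIONS
        and p.stat().st_mtime > cutoff
    ]
    return sorted(candidates, key=lambda p: (p.stat().st_mtime, p.name))


def add_new_files(source: Path, destination: Path) -> list[Path]:
    """Copy new files into the device's ORIGINALS folder, never overwriting."""
    new_files = find_new_files(source, destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in new_files:
        target = destination / item.name
        if target.exists():
            raise FileExistsError(f"refusing to overwrite original: {target}")
        shutil.copy2(item, target)  # copy2 also carries over the timestamps
        copied.append(target)
    return copied


def archive_name(taken: datetime, owner: str, place: str,
                 sequence: int, suffix: str) -> str:
    """Build a standardized name (hypothetical pattern, see lead-in)."""
    if not 0 <= sequence <= 99999:
        raise ValueError("sequence number must fit in five digits")
    return f"{taken:%Y%m%dT%H%M%S}_{owner}_{place}_{sequence:05d}{suffix.lower()}"
```

For example, `archive_name(datetime(2019, 2, 14, 10, 15, 30), "dad", "niseko", 42, ".JPG")` returns `20190214T101530_dad_niseko_00042.jpg`. Comparing by timestamp instead of by name means the check keeps working after the archived copies have been renamed, though a file that shares its exact timestamp with the newest archived file would be skipped.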

Since rule number one is NEVER MODIFY ORIGINALS (renaming doesn’t count as a modification, as it does not change the “last modified” timestamp), I maintain a separate directory for “COLLECTIONS,” which are basically photo albums of certain events or seasons. This is a manual effort and probably always will be. I don’t have the AI at my disposal to magically identify people, places, and events to assemble a photo album on the fly.

When the files are copied, updated, and renamed, I then kick off the backup script (basically a fancy Robocopy) to replicate the changes to the various backup locations, including the folder to sync with the cloud service.
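As a rough idea of what that replication step does, here is a small Python sketch of a one-way copy to several backup locations. It is not the actual backup script, which wraps Robocopy. The function names and the two-second timestamp tolerance are assumptions, and unlike a true mirror this sketch never deletes anything from a backup.

```python
import shutil
from pathlib import Path


def needs_copy(src: Path, dst: Path) -> bool:
    """True if the backup copy is missing or differs in size or timestamp."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    # Allow two seconds of slack for file systems with coarse timestamps.
    return s.st_size != d.st_size or abs(s.st_mtime - d.st_mtime) > 2


def replicate(archive: Path, backup: Path) -> int:
    """Copy new or changed files from the archive to one backup location."""
    copied = 0
    for src in archive.rglob("*"):
        if not src.is_file():
            continue
        dst = backup / src.relative_to(archive)
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # keeps the original modification time
            copied += 1
    return copied


def replicate_to_all(archive: Path, backups: list[Path]) -> dict[str, int]:
    """Run the replication once per backup location; report files copied."""
    return {str(backup): replicate(archive, backup) for backup in backups}
```

The backup list would hold the USB drives, the NAS share, and the local folder that syncs with the cloud service. Because unchanged files are skipped, running it again right after a successful pass copies nothing.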

In the future I might keep this basic system intact but expose a portion of it to a paid AI service to assist with categorizing, facial-recognition, tagging, and the like.

In days of old, family memories might be preserved in the form of hard-copy photographs in a shoe box. Back then, the problem was keeping this single point of failure safe from fires and floods. Now, the problem is we have too much stuff. Some intervention is necessary, and this system works for me. As for “effortless,” I’m not sure I’ll ever completely reach this goal. Maybe the point of an archiving system is that it should require some effort, otherwise how do we decide how we’re represented by future generations?