The Centralized Web
I don’t think I’d be making news by pointing out that the internet, today, is dominated by large, centralized services. While this centralization of the internet is a far cry from the original vision of peer-to-peer interactions and democratization, those services have, in many ways, enriched our lives by connecting friends and family, individuals and businesses, citizens and government.
But I also wouldn’t be making news by pointing out that those same services have a darker side, particularly those that would bill themselves as “free”. While ostensibly costing us nothing, these free services make billions collecting and monetizing our personal data while optimizing our use of those systems to enhance engagement[1]. Worse, the data they collect, with or without our consent, is locked away outside of our control.
I know this. And yet I still find myself making use of many of these services, including:
- Email (Gmail)
- Storage (Photos, Drive)
- Calendar (uh… Calendar)
- Notes (Keep)
And I’m sure many others besides.
Each of these services provides immense value! Instead of having to host email, or create my own offsite storage system, or manage my own git server, I can save time and effort by having someone else do the work for me.
However, in exchange, each of these services holds a piece of who I am. And I don’t control any of it.
The Dark Side of Centralization
In some cases the piece of me these services hold is small. In others it’s so large as to be difficult to grasp. Some services hold data that I also store elsewhere, while others retain the only copy of that information. Some of this data is not terribly interesting, while other data is so sensitive that only a few people should ever see it.
This is frightening when you consider just how dangerous these services can be.
First off, each of these services may be collecting data about me in ways I may not even be aware of. Google, for example, makes it surprisingly difficult to disable location tracking in Android. Facebook is known to create shadow profiles for people who’ve never used the service. And don’t get me started on Android app data collection.
This data collection makes these centralized services extremely high-value targets for attackers. After all, it wasn’t so long ago that an Equifax data breach left millions of people vulnerable to identity theft. The natural counterargument is that a small number of centralized services leaves fewer locations that need to be secured. However, those same services provide a troubling lack of transparency regarding data collection, security, and handling practices, not to mention notification of security breaches (for example, the Equifax breach began in May, was noticed at the end of July, and was announced to the public in September).
More fundamentally, there is a basic misalignment of incentives at work, particularly for these “free” services. The old adage goes that if a product is free, you are the product, and that couldn’t be more true in the current ad-supported environment of the free internet. As a result, these organizations are highly incentivized to learn as much as possible about all of us, while encouraging us to use their services, even to our own detriment.
Of course, all this is pretty philosophical. What if I don’t care about all my data being collected and monetized?
Well, consider all of the content you have locked away in these services; all those photos and videos in Facebook, all those emails in Gmail, all those posts on Medium.
What happens if one of those services goes down? I know that sounds crazy, but I suspect Myspace thought the same once!
What happens if one of those services changes its terms of service in a way that makes you want to switch?
What happens if something happens to you, and your loved ones want access to all those photos or videos you once took?
What if you simply stop liking the service and want to go somewhere else?
These closed systems put our data out of our hands and out of our control, and that’s simply dangerous.
So what’s the alternative?
Breaking Out of the Silos
I’ll be the first to admit that getting away from this model of centralization is not something just anyone can do. Not yet, anyway. In that way, sadly, privacy and data autonomy are a new form of inequality, and it’s something I’m increasingly interested in exploring.
But, in the meantime, I can certainly improve my own circumstances.
I’ve already begun writing a bit about this topic. My posts on my switch to tt-rss, and my decision to transition to Jekyll, are connected and part of a theme: to move further toward self-hosting and support of IndieWeb technologies.
And then there’s my gradual shift away from Gmail (though I haven’t made the leap quite yet).
Of course, I don’t believe for a second that I can completely wean myself off of centralized internet services. However, I can mitigate the risks and control the data that’s most important to me.
[1] Where engagement is defined as a compulsive need to continue to interact with the service. Whether that compulsion is driven by anger, fear, or joy is of course immaterial.
I can’t say I’m optimistic that the #indieweb is gonna really take off, but a man can dream…
Yeah yeah, I’m posting to Twitter now. But it’s from my own Jekyll blog using the IndieWeb stack. So it’s hipster enough to be cool.
Well… I’m going to attempt something pretty major, here, and switch over my blog from my trusty Oddmuse instance to Jekyll… for better or worse.
There are numerous upsides to this. First, I’ve already built a lot of habits around taking notes using Vimwiki, and having recently made the switch to Markdown for that wiki[1], having a consistent set of tools for personal and work note taking, as well as blog management, sounds pretty attractive! Doubly so since I really enjoy the writing experience I’ve set up with Vim.
Second, this rebuild moves me to a well-supported set of tools that’s currently being very actively maintained. I’ve been a huge fan of Oddmuse for a long time, if only for its lightweight simplicity, but it’s lost momentum over the years. Further, the dependency on a semi-custom markup, and the storage being in an oddball custom format, means I’m a little more tied down to its infrastructure than I’d like. Moving to pure Markdown means I get the simplicity of wiki-style markup without being tied to a specific technology platform.
Third, security. Static site generators are faster and simpler to operate, and with no server-side code running on each request they offer a smaller footprint for abuse.
That’s not to say there aren’t downsides! I’ve written a lot of content using custom plugins and markup, and I don’t know how I’m going to replace all that.
And, of course, there’s simply the act of transferring all that content.
But. I strongly feel this will be worth the transition.
And it gives me a project!
Update: And obviously I’ve moved! Of course, there’s lots of work left to do as I move into this new infrastructure. The site layout needs more work. I’d like an archive navigator. I need to enable some sort of commenting mechanism. But, so far so good!
And yeah, the tale of this entire transition and a rundown of my new toolset is probably worth a series of blog posts. Stay tuned!
[1] This deserves a post of its own. This move has enabled me to do things like use Markor on my phone to share the same set of notes on both my laptop and my phone, which has had the ancillary benefit of basically killing Google Keep in my workflows. It’s not without its issues, and it’s not something I’d recommend to a casual user, but it’s pretty slick…
With the knockout success of my ttrss service rollout, I thought it might be fun to look into other self-hosted services that I might find useful. Now, let’s be very clear, this was, on its face, entirely a make-work project to give me something fun to do with my spare time. But the outcome has proven surprisingly useful!
It all began when I came across Huginn. Huginn is an open source implementation of the kind of service offered by IFTTT, Zapier, and I’m sure others (Microsoft Flow popped up while I was finding the links to those services). The general idea is that these services allow you to plumb or connect various other services together to effect an automated workflow. For example, you might receive tweets on one end and shoot them off to, say, a Slack channel on the other.
Okay, so what would I do with this?
Well, as a bit of background, I’m an avid reader of Matt Levine. Mr. Levine offers a newsletter that one can subscribe to that is delivered daily to one’s email inbox. Notably, if you want to read this content on the Bloomberg website, it’s hidden behind a decidedly effective paywall that happens to defeat web scrapers. That means getting this content into my RSS feed isn’t directly possible.
But wouldn’t it be nice if I could take those emails, scrape out the content, and republish them to a private RSS feed that I could incorporate into ttrss?
Well, with Huginn I can do just that!
First, I set up a rule in Gmail to apply a label to, and then archive, the newsletter emails.
Then, I set up a Huginn pipeline that does the following:
- Use the Imap Folder Agent to connect to my Gmail account and retrieve any new emails with the label applied (making sure to use the text/html MIME enclosure so the full message body is available).
- Use the Website Agent to parse the email body and pull out the link to the article on Bloomberg.
- Use the Data Output Agent to republish the content as an RSS feed.
Finally, in ttrss I subscribe to the feed and… voila!
Now that is pretty darn useful!
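For anyone who wants to see the shape of that pipeline outside of Huginn, here is a rough Python sketch of the same three steps: pull the text/html part out of a raw email, find the first link pointing at the article's domain, and republish it as a minimal RSS feed. This is not how Huginn implements its agents; the function names, the `bloomberg.com` domain filter, and the feed title and URL are all assumptions made for illustration, and fetching the labeled messages over IMAP is left out entirely.

```python
from email import message_from_bytes
from html.parser import HTMLParser
from typing import Iterable, List, Optional, Tuple
from urllib.parse import urlparse
from xml.etree import ElementTree as ET


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def html_body(raw: bytes) -> str:
    """Return the first text/html part of a raw email, or '' if none exists."""
    msg = message_from_bytes(raw)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def first_article_link(html: str, domain: str = "bloomberg.com") -> Optional[str]:
    """Return the first link whose host is `domain` or a subdomain of it."""
    collector = LinkCollector()
    collector.feed(html)
    for link in collector.links:
        host = urlparse(link).hostname or ""
        if host == domain or host.endswith("." + domain):
            return link
    return None


def build_rss(title: str, feed_link: str, items: Iterable[Tuple[str, str]]) -> str:
    """Build a minimal RSS 2.0 document from (item title, item link) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = title
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")
```

In practice the raw messages would come from the labeled mail folder and the resulting XML would be served from a private URL that the feed reader can reach; both of those pieces are deliberately omitted here.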
Since then I’ve also set up Gotify and integrated it with Huginn and other services in my home to notify me when, for example, my offsite backup process is completed (and yeah, I could do that with just Gotify, but piping the events through Huginn gives me more flexibility later to do other things with them… like… publish them to an RSS feed? I dunno…).
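As a rough idea of what a direct Gotify notification looks like (the "just Gotify" route, without Huginn in the middle), here is a minimal Python sketch that posts a message to a Gotify server's `/message` endpoint using an application token. The server URL, token, and message text are placeholders, and the helper names are made up for this example; this is an assumption about how a backup script might be wired up, not a description of my actual setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_gotify_request(base_url: str, app_token: str, title: str,
                         message: str, priority: int = 5) -> Request:
    """Build (but do not send) a POST request that creates a Gotify message."""
    # The application token is passed as the `token` query parameter.
    url = f"{base_url.rstrip('/')}/message?{urlencode({'token': app_token})}"
    body = json.dumps(
        {"title": title, "message": message, "priority": priority}
    ).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify(base_url: str, app_token: str, title: str, message: str) -> int:
    """Send the notification and return the HTTP status code."""
    request = build_gotify_request(base_url, app_token, title, message)
    with urlopen(request, timeout=10) as response:
        return response.status
```

A backup script could call something like `notify("https://gotify.example.invalid", "APP_TOKEN", "Offsite backup", "Backup completed")` as its last step, with the URL and token swapped for real values.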
This is some very nice infrastructure! I’m now very curious how else I might leverage this stuff, or what other services I could deploy (some of which are listed here)…