Did you know Calibre can turn an RSS feed into an eBook? I didn’t! It turns out Calibre, tt-rss, and Wallabag make it possible to roll your own news that you can read right on your eReader! #selfhosting #indieweb #technology
I’ve mentioned this before, but I’ll mention it again: I’m a big fan of RSS. For the uninitiated, RSS is a way to subscribe to a feed of content from a website and consume it in a reader or other tool of your choice. And despite claims that it’s dying out, I still manage to have more content in my feed reader than I possibly have time to consume.
For a long time I used Feedly as my RSS reader of choice. But back in October I decided to switch to tt-rss, a self-hosted RSS feed reading service that works on both browsers and through a mobile app. Then, in a fit of boredom, I used some self-hosted home automation tools to incorporate email newsletters into my feed. Meanwhile, I also decided to stand up an instance of Wallabag, a self-hosted website bookmarking service.
But I ran across a problem: with all this content at my fingertips, I started to fall behind, particularly on all those long-form articles and newsletters I want to read.
And then I discovered Calibre’s news scraping features and a solution presented itself!
I’ll be the first to admit that I’m a frequent user of tools like Google Keep, Google Docs, etc. But I’ve never been terribly comfortable with my dependency on those services. Yeah, obviously there’s the privacy concerns, but more fundamentally, I just want control over my data! It’s a heck of a lot harder to run “grep” over a set of notes in Google Keep…
If you’ve been paying attention to this blog, you’ll notice this is part of a theme. Ultimately, I’m doing what I can to make sure I can manage and control my own information outside the walls of the common internet monopolies.
Now, quite a while ago I adopted vimwiki as my note taking method of choice. Before you get scared off, Vim is just a tool to enable a more fundamental idea: that personal information management should be built on the simplest possible tools and file formats, with the data under my own control.
In my case, I chose to focus on taking notes using plain text files, with a basic markup language that would allow me to write richer text and link those notes together.
When I first started doing this a few years ago I chose to stick with Vimwiki’s native markup, as it supported a few things out-of-the-box that Markdown, at the time, didn’t neatly support without using poorly supported extensions (I’m looking at you, checkboxes!). However, right around that same time, GitHub released a spec for their extensions to Markdown that plugged a lot of the holes that had concerned me, and since then support for these extensions has expanded considerably.
This caused me to revisit the issue and I concluded that a migration to Markdown made a lot of sense.
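To give a feel for what such a migration involves, here is a minimal sketch in Python. It is not the script used for the actual migration; it only handles two constructs of Vimwiki's default syntax (`= Header =` lines and `[[target|description]]` links) and the function names are made up for illustration.

```python
import re

# Vimwiki default-syntax header: "== Title ==" with matching runs of "=".
HEADER = re.compile(r"^(=+)\s+(.*?)\s+\1\s*$")
# Vimwiki link: [[target]] or [[target|description]].
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def convert_line(line):
    """Convert one line of Vimwiki markup to Markdown (headers and links only)."""
    match = HEADER.match(line)
    if match:
        # The number of "=" characters becomes the number of "#" characters.
        return "#" * len(match.group(1)) + " " + match.group(2)

    def to_markdown_link(m):
        target = m.group(1)
        description = m.group(2) or target  # bare [[target]] reuses the target
        return f"[{description}]({target})"

    return LINK.sub(to_markdown_link, line)


def convert(text):
    """Convert a whole Vimwiki document, line by line."""
    return "\n".join(convert_line(line) for line in text.splitlines())
```

Checkbox items such as `* [ ] task` pass through untouched, since GitHub-flavoured Markdown uses the same bracket notation. A real migration would also need to handle emphasis, code blocks, and tables.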
An intro post about my attempts to slowly pull myself out of internet silos so I can better control my data. #indieweb #selfhosting
The Centralized Web
I don’t think I’d be making news by pointing out that the internet, today, is dominated by large, centralized services. While this centralization of the internet is a far cry from the original vision of peer-to-peer interactions and democratization, those services have, in many ways, enriched our lives by connecting friends and family, individuals and businesses, citizens and government.
But I also wouldn’t be making news by pointing out that those same services have a darker side, particularly those that would bill themselves as “free”. While ostensibly costing us nothing, these free services make billions collecting and monetizing our personal data while optimizing our use of those systems to enhance engagement. Worse, the data they collect, with or without our consent, is locked away outside of our control.
I know this. And yet I still find myself making use of many of these services, including:
- Email (Gmail)
- Storage (Photos, Drive)
- Calendar (uh… Calendar)
- Notes (Keep)
And I’m sure many others besides.
Each of these services provides immense value! Instead of having to host email, or create my own offsite storage system, or manage my own git server, I can save time and effort by having someone else do the work for me.
However, in exchange, each of these services holds a piece of who I am. And I don’t control any of it.
Well… I’m going to attempt something pretty major, here, and switch over my blog from my trusty Oddmuse instance to Jekyll… for better or worse.
There are numerous upsides to this. First, I’ve already built a lot of habits around taking notes using Vimwiki, and having recently made the switch to Markdown for that wiki[1], having a consistent set of tools for personal and work note taking, as well as blog management sounds pretty attractive! Doubly so since I really enjoy the writing experience I’ve set up with Vim.
Second, this rebuild moves me to a well-supported set of tools that’s currently being very actively maintained. I’ve been a huge fan of Oddmuse for a long time, if only for its lightweight simplicity, but it’s lost momentum over the years. Further, the dependency on a semi-custom markup, and the storage being in an oddball custom format, means I’m a little more tied down to its infrastructure than I’d like. Moving to pure Markdown means I get the simplicity of wiki-style markup without being tied to a specific technology platform.
Third, security. Static site generators are simpler and faster to operate, and they present a smaller attack surface.
That’s not to say there aren’t downsides! I’ve written a lot of content using custom plugins and markup, and I don’t know how I’m going to replace all that.
And, of course, there’s simply the act of transferring all that content.
But. I strongly feel this will be worth the transition.
And it gives me a project!
Update: And obviously I’ve moved! Of course, there’s lots of work left to do as I move into this new infrastructure. The site layout needs more work. I’d like an archive navigator. I need to enable some sort of commenting mechanism. But, so far so good!
And yeah, the tale of this entire transition and a rundown of my new toolset is probably worth a series of blog posts. Stay tuned!
[1] This deserves a post of its own. This move has enabled me to do things like use Markor on my phone to share the same set of notes on both my laptop and my phone, which has had the ancillary benefit of basically killing Google Keep in my workflows. It’s not without its issues, and it’s not something I’d recommend to a casual user, but it’s pretty slick…
With the knockout success of my ttrss service rollout, I thought it might be fun to look into other self-hosted services that I might find useful. Now, let’s be very clear, this was, on its face, entirely a make-work project to give me something fun to do with my spare time. But the outcome has proven surprisingly useful!
It all began when I came across Huginn. Huginn is an open source implementation of the kind of service offered by IFTTT, Zapier, and I’m sure others (Microsoft Flow popped up while I was finding the links to those services). The general idea is that these services allow you to plumb or connect various other services together to effect an automated workflow. For example, you might receive tweets on one end and shoot them off to, say, a Slack channel on the other.
Okay, so what would I do with this?
Well, as a bit of background, I’m an avid reader of Matt Levine. Mr. Levine offers a newsletter that is delivered daily to one’s email inbox. Notably, if you want to read this content on the Bloomberg website, it’s hidden behind a decidedly effective paywall that happens to defeat web scrapers. That means getting this content into my RSS feed isn’t directly possible.
But wouldn’t it be nice if I could take those emails, scrape out the content, and republish them to a private RSS feed that I could incorporate into ttrss?
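To make the idea concrete, here is a rough sketch of the email-to-RSS step using only the Python standard library. To be clear, this is not how Huginn does it; it is a standalone illustration, and the function names, feed title, and link are all placeholders.

```python
import email
import email.policy
from email.utils import format_datetime, parsedate_to_datetime
import xml.etree.ElementTree as ET


def email_to_item(raw_email):
    """Turn one raw RFC 822 message into an RSS <item> element."""
    msg = email.message_from_string(raw_email, policy=email.policy.default)
    # Prefer the HTML part of a newsletter, falling back to plain text.
    body = msg.get_body(preferencelist=("html", "plain"))

    item = ET.Element("item")
    ET.SubElement(item, "title").text = str(msg.get("Subject", ""))
    # The Message-ID is unique per email, so it works as a non-URL guid.
    guid = ET.SubElement(item, "guid", isPermaLink="false")
    guid.text = str(msg.get("Message-ID", ""))
    published = parsedate_to_datetime(str(msg["Date"]))
    ET.SubElement(item, "pubDate").text = format_datetime(published)
    ET.SubElement(item, "description").text = body.get_content() if body else ""
    return item


def build_feed(title, link, raw_emails):
    """Wrap the converted emails in a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Newsletters republished from email"
    for raw in raw_emails:
        channel.append(email_to_item(raw))
    return ET.tostring(rss, encoding="unicode")
```

Fetching the messages (for example over IMAP) and serving the resulting XML somewhere the feed reader can reach are left out; the sketch only covers the conversion itself.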
I’ve been a huge fan of RSS for a very long time now. For those not aware, RSS is a protocol that allows websites (news organizations, blogs, aggregators, etc) to push out a feed of content as they publish it. As an example, the CBC publishes a list of RSS feeds that any reader can subscribe to.
The reader then uses an RSS feed reader to subscribe to the feed and consume it.
Now, that by itself sounds just okay, but the real magic happens when you subscribe to a large number of feeds. What most folks don’t realize–even those familiar with RSS–is that RSS feeds are extremely common and widely available across many web properties. In my case, I subscribe to a number of news sources (CBC, BBC, NYT, etc), some technology aggregators (Hacker News, Reddit Programming), plus a number of random blogs and other outlets.
The RSS feed reader can then combine these streams of content in various ways. Personally, my preference is to just see a single list of all the most recently published articles that I can then scroll through. The best services allow me to consume that stream of content on multiple devices–in particular, on a desktop or on a phone–so that no matter where I am, my RSS feeds are at my fingertips, showing me a stream of all the content I’ve chosen to subscribe to.
Ultimately, what this amounts to is something like the Facebook news feed, except I’m personally selecting my sources rather than having content selected for me by some proprietary algorithm on a social network.
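For the curious, the "single list of the most recent articles" behaviour is simple to sketch. The following Python is only an illustration of the idea, not how any particular reader is implemented; it assumes plain RSS 2.0 input (no Atom), and the function names are invented for the example.

```python
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET


def parse_items(feed_xml, source):
    """Extract title, link, and publication date from an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue  # without a date the entry cannot be placed in the stream
        entries.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS uses RFC 822 dates, the same format as email headers.
            "published": parsedate_to_datetime(pub_date),
        })
    return entries


def merge_feeds(feeds):
    """Combine several feeds (name -> XML text) into one newest-first list."""
    merged = []
    for source, feed_xml in feeds.items():
        merged.extend(parse_items(feed_xml, source))
    # Assumes every date carries a timezone; mixing naive and aware
    # datetimes would make this comparison fail.
    merged.sort(key=lambda entry: entry["published"], reverse=True)
    return merged
```

A real reader also has to fetch the feeds on a schedule, remember what has already been read, and cope with Atom and malformed XML, which is where most of the actual work lies.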
Now up until 2013 folks widely agreed that Google Reader was one of the best feed readers out there.
Unfortunately, Google, in their infinite wisdom, decided to shut Google Reader down.
Fortunately, there are plenty of fine alternatives out there, and for a very long time Feedly was my tool of choice. The web interface is clean and functional, the Android app is excellent, and it has a lot of interesting features if you’re willing to pay for their subscription. If you’re interested in dipping a toe into the RSS waters, I highly recommend it!
However, there are a couple of things about RSS that can be a bit of a nuisance.
First, news sources frequently only publish their article titles, perhaps a brief excerpt, and a link, so that you have to leave the feed reader and visit their website to consume the content. I can understand why that is (i.e. ad revenue), but it’s a real pain: the context switch to the website is always a bit jarring (and on a phone, a bit slow); each site has a different layout, which means the reading experience isn’t consistent; and if I want to read the content offline, I’m out of luck.
Second, some types of feeds, notably Reddit and Hacker News, publish links to their aggregation service rather than to the article content itself, often without any excerpt at all. The result is a rather bland, difficult-to-use feed.
Third, call me paranoid, but I’m not thrilled about having a third party tracking what I’m reading.
And then I discovered tt-rss.
Many years ago I experimented with running IPv6 in my home network (dual-stacked, not IPv6-only… I’m not that crazy!). At the time this was mainly an intellectual exercise. While a lot of major services already offered IPv6 (including Google, Facebook, and Netflix), the big draw of v6 is the ability to completely do away with NAT and simplify access to services and P2P applications running out of my home. But without broad v6 support, even if my home network was available via v6, the rest of the world wouldn’t be able to access it, which pretty severely curtailed the utility of the whole thing.
But, it was still an interesting exercise!
Until, that is, Netflix started cracking down on VPNs.
The way v6 was deployed in my network was via a tunnel supplied by Hurricane Electric. That tunnel terminated in California, and, while not intentional, it allowed me to watch US Netflix in Canada.
That is until Netflix realized people were abusing those tunnels and started blocking inbound traffic via HE.
I considered potential workarounds, but I could never figure out a satisfying solution (in large part thanks to closed devices like Chromecasts).
And so I shut down v6 in my network. While, previously, v6 didn’t provide a lot of value, it also didn’t cause me any problems. Once this issue surfaced, it was no longer worth the effort.
Recently I decided to take another look at the situation to see if anything had changed.
Well, unfortunately Netflix still blocks traffic coming through Hurricane Electric tunnels that terminate in the US.
However, it turns out, back in 2013, HE added new Points of Presence (POPs) in both Calgary and Manitoba. That meant I could set up a tunnel with an exit point inside the country.
Would Netflix block that?
It turns out, the answer is: No!
So I now have IPv6 back up in my home network.
But has the connectivity story changed? Yes!
Much to my astonishment, I discovered that in the last couple of years, AT&T, Rogers, and Telus have all deployed native IPv6 inside their networks. That means that, when I’m out and about in both Canada and the US, I have direct v6 connectivity back to my home network! Even my mother-in-law’s house has access thanks to her Telus internet package.
That’s a huge expansion in coverage!
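If you want to check this sort of thing yourself, a quick way is to try opening a TCP connection over IPv6 only. The Python below is a small sketch of that test; the function name is made up, and the host and port you pass in are whatever service you happen to expose at home.

```python
import socket


def can_reach_over_ipv6(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds over IPv6."""
    try:
        # Restricting the family to AF_INET6 ignores any IPv4 (A) records.
        candidates = socket.getaddrinfo(
            host, port, family=socket.AF_INET6, type=socket.SOCK_STREAM
        )
    except OSError:
        return False  # no IPv6 address for this name, or no resolver
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # unreachable, refused, or timed out; try the next address
    return False
```

Run from a phone hotspot or a relative's network, something like `can_reach_over_ipv6("home.example.net", 22)` (a placeholder hostname) tells you whether that network really offers end-to-end v6 connectivity to your home.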
In fact, ironically enough, of the places I frequent, the only location that lacks v6 connectivity is my workplace. Go figure. But, in that case, I can always just tunnel through my Linode VPS, which has had v6 connectivity for many, many years.
IPv6 adoption may be taking a while, but it is happening!
Over the last couple of years I’ve written extensively about backup solutions. The whole thing started as I tried to find a use for my NUC, which I initially turned into a Hackintosh, a solution that was, frankly, in search of a problem.
macOS ran fairly nicely on the thing, but eventually I ran into issues which ultimately led me to just convert the thing over to an Ubuntu 18.04 installation. In the end, Linux is just, at least in my experience, a much better home server OS for mixed-OS environments (taking the SMB issues on the Mac as a perfect example).
Anyway, I still needed a backup solution, and I originally settled on a combination of a few things:
- For Windows machines
  - A Samba file share on the server
  - Windows 10 built-in file copy backup capabilities
- For Linux machines
  - Syncthing for real-time storage redundancy
  - rclone for transferring backups to Google Drive for off-site replication
The whole thing stalled out when I screwed up the rclone mechanism and inadvertently deleted a bunch of items in my broader Google Drive account.
And so I became gun shy and paused the whole thing.
The other big change is I switched over to Ubuntu on my X1 Carbon, which meant that I now needed to sort out the backup solution for a Linux client as well. Syncthing is great for redundancy, but it’s not itself a backup solution.
So a couple of things changed, recently, that allowed me to close those gaps and resolve those issues.
First off, when it comes to rclone and Google Drive, I enabled two features:
- Set the authentication scope to “drive.file”
- Set the root_folder_id to the location on Drive where I want the backups stored
The first setting restricts rclone’s authorization so that it can only manipulate files it creates. So Google Drive should prevent rclone from accidentally touching anything other than the backups it’s transferring.
The second setting is belt-and-suspenders. By setting the root_folder_id, even if Google Drive somehow screwed up, rclone would never look outside of the target folder I selected.
So, the accidental deletion problem should be well behind me.
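For reference, those two settings end up as plain keys in the remote's section of rclone's config file. The Python sketch below just renders what such a section might look like; the remote name and folder ID are placeholders, and a real config written by `rclone config` will also contain the OAuth token.

```python
import configparser
import io


def drive_remote_config(name, root_folder_id):
    """Render an rclone remote section limited to one Drive folder."""
    config = configparser.ConfigParser()
    config[name] = {
        "type": "drive",
        # Only files created by rclone itself are visible to this remote.
        "scope": "drive.file",
        # Treat this folder as the root, so nothing outside it is ever listed.
        "root_folder_id": root_folder_id,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder values, purely for illustration.
print(drive_remote_config("gdrive-backup", "FOLDER_ID_FROM_DRIVE_URL"))
```

In practice I'd let `rclone config` generate the section interactively and then confirm that the two keys are present, rather than writing the file by hand.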
The solution to the issue of backups on Linux was to expand my use of Syncthing to include additional folders on my laptop that I want stored on my backup server. This ensures that my laptop is always maintaining a real-time replica of critical data in another location.
Finally, I adopted Restic for producing snapshot backups of content that I replicate to my backup server.
Basically, I create a local replica of data on the server (with Syncthing, rclone, lftp, or other mechanisms) and then use Restic to produce a backup repository from those local copies. Restic then takes care of de-duplication, snapshotting, restoration, and other mechanisms. The Restic repositories then get pushed out to Google Drive via rclone.
I’ve also extended this backup strategy to the contents of my Linode instance (where this blog is hosted), and to Lenore’s blog. Specifically, I use rclone (or lftp) to create/update a local copy of the data on those respective servers, and then use Restic to produce a backup repository from those copies. And, again, those repositories are then pushed out to Drive.
Overall, I think this stack should work nicely! And I like that it neatly separates the various stages of the process (data transfer, backup, off-siting) into a set of discrete stages that I can independently monitor and control.
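The three stages can be sketched as a short Python wrapper around the two CLIs. This is an illustration rather than my actual scripts: the paths and remote names are placeholders, it assumes the Restic repository has already been created with `restic init`, and it assumes the repository password is supplied through the `RESTIC_PASSWORD` environment variable.

```python
import subprocess


def backup_commands(source, local_copy, restic_repo, offsite_remote):
    """Build the command line for each stage of the backup pipeline."""
    return [
        # Stage 1: data transfer. Mirror the source into a local replica.
        ["rclone", "sync", source, local_copy],
        # Stage 2: backup. Snapshot the replica into the Restic repository.
        ["restic", "-r", restic_repo, "backup", local_copy],
        # Stage 3: off-siting. Mirror the repository itself to cloud storage.
        ["rclone", "sync", restic_repo, offsite_remote],
    ]


def run_backup(source, local_copy, restic_repo, offsite_remote, dry_run=True):
    """Run the stages in order, stopping at the first failure."""
    commands = backup_commands(source, local_copy, restic_repo, offsite_remote)
    for command in commands:
        if dry_run:
            print(" ".join(command))  # show what would run without touching data
        else:
            subprocess.run(command, check=True)
    return commands
```

Because each stage is a separate command, any one of them can be scheduled, logged, or retried on its own, which is the main appeal of splitting things up this way.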
Just a quick handy tidbit: When using rclone for backup purposes like this, it’s a good idea to create a custom OAuth API key for use with Google Drive. By default rclone uses a default API key shared by all other rclone users, which means you’re sharing the API quota as well. As a result, you get much better performance with your own key (though, unless you’re willing to jump through a lot of hoops, you’re stuck with “drive.file” scope… which, again, for this purpose isn’t just fine, it’s desirable).
So in my previous post I mentioned some challenges I encountered using macOS on my Hackintosh as a NAS, and my ultimate success in getting it working with Windows as a backup server… after moving the actual NAS’ing to a Linux VM.
What I didn’t realize then, but I know now, is that at least on my NUC, for some reason, the IntelMausiEthernet is not actually stable! I don’t know if it’s tied to high/sustained load, but for whatever reason, over time the NIC would lose connectivity with the network. Re-plugging the network cable resolved the issue, but it would quickly recur.
This rapidly became a dealbreaker, as not only did it render the machine useless for backups, it also made it useless as a Transmission server.
Now, before you ask, no, I haven’t spent any time debugging the issues and don’t plan to. So I haven’t a clue what was actually wrong.
My solution was a lot simpler: I just bought a USB Ethernet dongle and moved on with my life. That, fortunately, has worked like an absolute charm and solved all of my network stability issues!
So, as I mentioned previously, one of my ideas for my hackintosh server was to turn it into a backup server/NAS for my home. As a server, the NUC is an excellent option, being low power, quiet, and incredibly compact. And while I can do some amount of backing up to cloud storage (i.e. Drive), for regular day-to-day backups a proper local solution is preferable.
Now, Lenore and I both have Windows 10 equipped laptops, which means we can take advantage of the File History feature to actually perform backups to a designated network drive. So, it would seem that simply setting up a drive share on the Mac, and pointing our laptops at it, would do the job nicely!
Unfortunately, it wasn’t that simple. A few releases back macOS moved away from Samba to their own implementation of SMB (the Windows file sharing protocol). Well, apparently that implementation of SMB does not work with File History. And I have no idea why. The errors you get make no sense, and there are basically no solutions out there on the internets.
You’d be amazed how long I spent pulling my hair out over this one.
Ironically, the solution I arrived at was as silly as it was obvious: I deployed an Ubuntu Server VM running headless on the Mac via VirtualBox. The VM mounts the macOS filesystem and shares it using Samba.
But it works! We now have backups!
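For anyone attempting the same trick, the Samba side boils down to a single share definition in `smb.conf` on the VM. The Python below is just a sketch that renders such a stanza; the share name, mount path, and user names are invented, and it is not a copy of my actual configuration.

```python
import configparser
import io


def samba_share(name, path, users):
    """Render a writable smb.conf share restricted to the given users."""
    config = configparser.ConfigParser()
    config[name] = {
        # Where the VM has mounted the host's filesystem.
        "path": path,
        # File History needs to write to the share.
        "read only": "no",
        "valid users": " ".join(users),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder values, purely for illustration.
print(samba_share("backups", "/mnt/mac/backups", ["alice", "bob"]))
```

Each listed user also needs a Samba password on the VM (set with `smbpasswd -a`) before Windows will be able to connect.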
And while I was at it, I also finally set up Transmission and Flexget so I could move my bittorrent activity to the Mac as well. The downloaded content is shared using the built-in macOS drive sharing features… for basic reads it seems to work just fine. For now, anyway.