content-curation – Morgan

A castle made of bazaar

Posted on July 31, 2025 (updated August 2, 2025) by Morgan

I've accumulated quite a lot of nerdy automations in my tech stack. Here is an overview of what I've built up to this day.

For the Cloudron instance I run

A cron job that monitors disk usage, using bash and the Cloudron API; it alerts me via email and ntfy when any folder's usage exceeds 75%.
A cron job, also in bash, that checks whether some apps time out and restarts them via the Cloudron API.
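
The threshold check at the heart of the first job is easy to sketch. The real job is written in bash against the Cloudron API; the Python below is only an illustration, and the `over_threshold` helper, the folder names, and the sample figures are all hypothetical.

```python
from typing import Dict, List, Tuple

THRESHOLD = 0.75  # alert when a folder uses more than 75% of its capacity


def over_threshold(usage: Dict[str, Tuple[int, int]],
                   threshold: float = THRESHOLD) -> List[str]:
    """Return the folders whose used/total ratio exceeds the threshold.

    `usage` maps a folder name to (used_bytes, total_bytes).
    """
    flagged = []
    for folder, (used, total) in sorted(usage.items()):
        # Skip folders with no known capacity to avoid dividing by zero
        if total > 0 and used / total > threshold:
            flagged.append(folder)
    return flagged


# Hypothetical sample data; a real job would query the system instead
sample = {
    "/backups": (80, 100),
    "/data": (40, 100),
    "/media": (75, 100),  # exactly 75% is not "more than 75%"
}
print(over_threshold(sample))  # ['/backups']
```

In practice the figures could come from `shutil.disk_usage(path)`, which reports total, used, and free bytes for a path; sending the email and ntfy alerts is left out here.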

DNS Monitoring

I take a snapshot of my Hetzner DNS configs every 5 minutes and watch frequently for diffs using Changedetection.
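
As a rough illustration of what "watching for diffs" between two snapshots means, here is a small Python sketch using the standard `difflib` module. It is not how Changedetection works internally, and the `zone_changes` helper and the sample records are made up for the example.

```python
import difflib
from itertools import islice


def zone_changes(previous: str, current: str) -> list:
    """Return only the added ("+") and removed ("-") lines between two snapshots."""
    diff = difflib.unified_diff(
        previous.splitlines(), current.splitlines(),
        fromfile="previous", tofile="current",
        lineterm="", n=0,  # n=0: no unchanged context lines
    )
    # The first two lines are the "---"/"+++" file headers; "@@" lines mark hunks
    return [line for line in islice(diff, 2, None) if not line.startswith("@@")]


# Hypothetical snapshots taken five minutes apart
old = "example.org. 300 IN A 192.0.2.1\nwww 300 IN CNAME example.org."
new = "example.org. 300 IN A 192.0.2.2\nwww 300 IN CNAME example.org."

for change in zone_changes(old, new):
    print(change)
```

An empty result means the two snapshots are identical, so nothing needs to be reported.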

Uptime monitoring

I'm using Uptime Kuma to monitor the status pages of several services/APIs I rely on (Dropbox, OpenAI, Mistral AI, ...) as well as my own self-hosted apps. I get ntfy alerts if any of them fails.

Feed generators

A weekly summary of https://khrys.eu.org/revue-hebdo/
A daily top 10 for new entries from https://indieblog.page/
A weekly reminder of kid-friendly events happening nearby, based on https://out.be/.

Files Syncing

A cron job syncing my .torrent files from Dropbox to QBittorrent, using rclone
A cron job syncing my downloaded audiobooks to AudioBookShelf, using rsync.
A cron job syncing my downloaded ebooks to Calibre, by uploading the files to Calibre API, using bash.

Music management

A cron job syncing my downloaded music (torrent) to my main Music Library, using rsync.
A cron job verifying the quality of my Music Library content using mp3val and reporting corrupted files via ntfy.
A cron job verifying the quality of my Soulseek download folder using mp3val and only moving the verified files to my Music Library.
A user script integrating with the ListenBrainz/LastFM scrobbler for when I listen to live radio stations from RTBF (they use radioplayer technology).
A user script to automatically filter the search results within Soulseek (web version running on Cloudron).

Photos management

A cron job that syncs my photos library between Dropbox and Immich, using rclone, but only for pics and videos under a certain size.
A cron job that generates Immich albums made only of pictures of specific people.
Scripts that I run ad hoc, using ffmpeg, to compress my pictures and videos and fix their EXIF dates when needed.
Scripts that I run ad hoc, using Syncthing, to remove all pics/videos from my phone (WhatsApp and Camera folders) and move them to Dropbox, before I compress and triage them. Anything on Dropbox is then synced to Immich, so that's how I keep my phone clean.

Emails management

A script which checks for invoices (with attachments or downloadable links) in my emails and sends them to my Dropbox forwarding email, which in turn backs up those attachments to a specific folder where they can be processed.

Freelancer paperwork management

I have several PHP scripts to rename my receipts and invoices with the right date, invoice number, and provider, and to organize them per year/quarter/month.
A user script to fill my timesheet automatically based on my declared days off.
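
To make the year/quarter/month layout concrete, here is a small Python sketch (the real scripts are in PHP). The file naming scheme, the `archive_path` helper, and the sample invoice are assumptions for illustration only.

```python
from datetime import date
from pathlib import PurePosixPath


def archive_path(invoice_date: date, invoice_number: str, provider: str,
                 extension: str = "pdf") -> PurePosixPath:
    """Build <year>/Q<quarter>/<month>/<date>_<provider>_<number>.<ext>."""
    # Months 1-3 map to quarter 1, 4-6 to quarter 2, and so on
    quarter = (invoice_date.month - 1) // 3 + 1
    filename = f"{invoice_date.isoformat()}_{provider}_{invoice_number}.{extension}"
    return PurePosixPath(
        str(invoice_date.year),
        f"Q{quarter}",
        f"{invoice_date.month:02d}",
        filename,
    )


# Hypothetical invoice
print(archive_path(date(2025, 8, 2), "INV-042", "hetzner"))
# 2025/Q3/08/2025-08-02_hetzner_INV-042.pdf
```

Putting the ISO date first in the file name keeps files sorted chronologically inside each month folder.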

Web curation and bookmark management

A cron job that will browse my recent Shaarli shares and, when needed, add tags, HN Thread links, Web Archive link, and a summary. It's done in Python.
A cron job that will browse my Miniflux unread entries and mark as read the ones that I will probably not care about, using Mistral AI.
A cron job that will browse my Miniflux unread entries and send me an email with the unread entries summarized and grouped by feed, a bit like the late Subworthy.com (by Phil Stephens) was doing around 2022.
A user script that will add TL;DR buttons at the bottom of my Miniflux entries, so I can get a quick summary generated by Mistral AI when needed.
A user script to warn me on any website if there is a Hacker News thread for the page I visit.
A user script to highlight and extract all top links from the current Hacker News thread.
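
The "grouped by feed" part of the email digest can be sketched in a few lines of Python. This assumes entries shaped like those returned by the Miniflux API (a `title` plus a nested `feed` object with its own `title`); the helper names and sample entries are hypothetical, and the summarization and email steps are left out.

```python
from collections import defaultdict


def group_by_feed(entries):
    """Group entry titles under the title of the feed they belong to."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["feed"]["title"]].append(entry["title"])
    return dict(grouped)


def render_digest(grouped):
    """Render one plain-text section per feed, feeds sorted alphabetically."""
    lines = []
    for feed in sorted(grouped):
        lines.append(f"{feed} ({len(grouped[feed])})")
        lines.extend(f"  - {title}" for title in grouped[feed])
    return "\n".join(lines)


# Hypothetical unread entries
unread = [
    {"title": "Post A", "feed": {"title": "Blog One"}},
    {"title": "Post B", "feed": {"title": "Blog Two"}},
    {"title": "Post C", "feed": {"title": "Blog One"}},
]
print(render_digest(group_by_feed(unread)))
```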

LinkedIn management

A user script that adds a reply generator in LinkedIn conversations, using Mistral AI.

Obsidian Backups

I'm using aicommit2 called from Obsidian Git plugin to generate meaningful commit messages about what is being backed up.

This looks like quite a lot, and that's not all.

A daily RSS summary for indieblog.page

Posted on January 26, 2025 by Morgan

As a gift to the community of content curators and RSS addicts, here is a simple script to generate a daily summary of all indie blogs visible in indieblog.page, because their RSS feeds only expose a few random posts while my FOMO obligates me to try to get them all 😅.

The script that resulted from my obsession is half-baked, with help from some LLM, and is adapted to my needs: I'm excluding blog posts based on language or keywords in the title. I encourage you to adapt the script to your own needs. And if not, just subscribe to the feed URL below.

Demo

And here is a preview of what it looks like:

Hope it helps

Default settings for watches in Changedetection

Posted on October 4, 2024 (updated November 3, 2024) by Morgan

I'm addicted to Changedetection for spying on website changes and internet search results for specific keywords, and occasionally also for monitoring price changes. It's quite handy for discovering new links added to web directories, or for staying updated with websites that do not provide any RSS feed.

Context

I'm watching hundreds of URLs.
I often spy on webrings and blogrolls to discover new interesting links, and also on search engine results for specific keywords.
I'm self-hosting Changedetection through Cloudron.
I mostly follow those watches via my RSS reader, Miniflux.
For some specific changes, like bad weather conditions, I subscribe via ntfy.

Anyway, I've developed a few habits that fit my workflow well and that I apply to every new watch:

Settings > General

This is where we set defaults for all future watches, so it's the obvious place to start. Here are my current settings:

Time between check: By forcing a convenient interval between checks, you try to find a balance between information overload and staying current. Pick your poison, but don't hesitate to override this setting at per-watch level.
Extract from document and use as watch title: it's convenient to let Changedetection take care of naming your watches based on the webpage titles rather than leaving the sometimes very long and non human-friendly URL as a default description.
Random jitter: this is handy to avoid stressing your I/O too much.

General > Group tag

This one is mostly about keeping things organized. As I mentioned, I follow those changes through RSS, and I noticed it was hard to distinguish important from less important stuff when following the default RSS feed. Changedetection provides distinct RSS feeds per group/tab of watches, and that's my preferred workflow now.

I try to always set a label. I have around 15 in total, some for specific interests (privacy, discovery aka list of links, devops, music, ...) or specific people, locations and business updates. The rest is generally less important and is labelled with things like FOMO, misc, ...

Those group tags appear as labels next to the URLs you are watching.

If you want to watch a whole group through RSS, the link is at the bottom right of the page on the group tab.

Filters & Triggers > Remove elements

It's common on ~~bloated~~ rich web pages to want to focus on specific parts, like everything between the <header> and <footer> sections, so I sometimes add footer and header to the elements to remove. It's mostly needed for sites like eBay or 2ememain, where we can buy and sell things.

Filters & Triggers > default filter and triggers

my Text filtering defaults in Changedetection.

This is purely for spam reduction as I mostly want to know when something new is made.

Sometimes I also enable Sort text alphabetically, depending on how the page is updated by its author.

🆕 Those new settings have been added recently and I'm also enabling them on new watches:

Extension

Try the web browser extension for Chromium-based browsers; it puts new watches one click away.

I've opened a discussion in Changedetection's repository about how repetitive this setup feels to me, in the hope that something like template settings will be proposed in the future, at least for the filters & triggers, which I think would not be too hard to start with.

Miniflux scraper rules

Posted on August 27, 2024 (updated May 6, 2025) by Morgan

I'm following Joy of Tech comic via RSS in Miniflux but the image was never loading.

I found half a solution in this blog post by Jan-Lukas Else. Unfortunately, the proposed solution fails, probably because the format of the Joy of Tech pages has changed.

The fix is actually quite simple. For the feed at URL https://www.geekculture.com/joyoftech/jotblog/atom.xml, edit the feed settings and set the scraper rules to the following:

p.Maintext > img[src$=".png"]

And of course enable "Fetch original content" in the feed options.

And voilà, simple and beautiful.

Reading RSS in peace with a few Miniflux Hacks

Posted on August 20, 2024 (updated May 24, 2025) by Morgan

This page is regularly updated. Feel free to use the scripts and take ownership of them. You can support me through Support of course :-)

I'm an avid content curator using RSS feeds. Let me share some of my tips and some code here. This is a living document, so please come back for new tips 🙂 and explore my other articles on this topic.

Some of these tips rely on userscripts, which are snippets of code executed automatically on web pages and are very handy for customizing your browsing. I mostly use the Custom JavaScript block in Miniflux Settings. Some scripts won't work there because they load external resources (think CSP & co), and for those special cases I use Tampermonkey.

Translate entries (EN->FR).

As a Belgian product, I speak French and English and can get by in Dutch as well. Even though I read mostly in English, from time to time I like to relax my brain and read in French, my native language.

My user script calls SimplyTranslate, so clicking this button at the bottom of English articles...

Will trigger the translation...

Adding the result as a blockquote, e.g. below.

Source

https://gitea.zoemp.be/sansguidon/snippets/raw/branch/main/miniflux_scripts/translate_entries.js

Filter categories (remove empty ones) using Custom JavaScript block

There is by default no distinction between categories with or without content, and it can be annoying. I made a user script to remove categories with no content to read.

Source

https://gitea.zoemp.be/sansguidon/snippets/raw/branch/main/miniflux_scripts/filter_categories.js

Demo

Before applying the script, we have some categories, including one with (0) unread entries.

After

The category with (0) unread entries is hidden.

Feed organizer - using Tampermonkey

This one groups all feed entries by feed/author on the main unread, read, and starred pages. I needed it because, by default, the entries in the unread tab are all mixed together, and I often wanna consume content per feed/author rather than in chronological order.

Source

https://gitea.zoemp.be/sansguidon/snippets/raw/branch/main/miniflux_scripts/feed_organizer.js

Demo

Distinguish boring from interesting feeds thanks to objective ranking - with Custom JavaScript in global settings

When opening the "Show all entries" view of a feed, this trick shows whether you should keep the feed or not. The classification is based on the ratio of starred entries to total entries. In this case, clearly, my assistant tells me it's quite 🥱 boring. Other values are: Thinking 💭 (in case we lack data), Interesting 😍 (we star a lot of items), Thinking 🤔 (in case we starred at least some entries). Feel free to make it yours and customize the behavior!

Source

https://gitea.zoemp.be/sansguidon/snippets/raw/branch/main/miniflux_scripts/feed_classifier.js
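
The linked script is the reference. Purely to illustrate the ratio-based idea, here is a Python sketch; the thresholds (`min_entries`, `interesting_ratio`) and the `classify_feed` helper are assumptions and are not taken from the actual script.

```python
def classify_feed(starred: int, total: int,
                  min_entries: int = 10,
                  interesting_ratio: float = 0.2) -> str:
    """Label a feed from its share of starred entries (thresholds are hypothetical)."""
    # Too few entries to judge: avoid drawing conclusions (and dividing by zero)
    if total == 0 or total < min_entries:
        return "Thinking 💭"
    ratio = starred / total
    if ratio >= interesting_ratio:
        return "Interesting 😍"
    if starred > 0:
        return "Thinking 🤔"
    return "Boring 🥱"


# Hypothetical feeds: (starred entries, total entries)
for starred, total in [(0, 50), (1, 50), (15, 50), (2, 5)]:
    print(starred, total, classify_feed(starred, total))
```

Raising `min_entries` makes the verdict slower but less jumpy for feeds that publish rarely.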

Demo

Fetch original content - Per feed settings

This is a trick that works well with the majority of feeds so you can fetch the whole article content in your reader instead of just the excerpt.

Filter feed entries by title / content

I've customized the feed settings to exclude specific keywords, and on top of this I also have global rules that apply to all feeds, excluding entries when keywords are found in their content or title. This makes it easy to exclude clickbait, uninteresting, or depressing content 🙂

My current setting is here as an example https://gitea.zoemp.be/sansguidon/snippets/raw/branch/main/miniflux/block.rules (RSS)

In this case I follow news about heavy metal album releases and I'm excluding specific genres like Death Metal. I'm also abusing the feature to avoid being spammed with recurring news like the Olympic Games (Paris 2024). Finally, there are already many reasons for me to be anxious, and I do not need more: the last rule saves me from useless negative news. I keep fine-tuning the list and I could improve this by including terms from public blacklists, like this.
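
The effect of such keyword rules can be illustrated with a small Python sketch. This is not Miniflux's own rule syntax or implementation, just the same idea expressed with a case-insensitive regular expression; the `build_blocker` helper and the keyword list are examples.

```python
import re


def build_blocker(keywords):
    """Return a function telling whether an entry mentions any blocked keyword."""
    if not keywords:
        # An empty pattern would match everything, so block nothing instead
        return lambda title, content="": False
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)

    def is_blocked(title, content=""):
        return bool(pattern.search(title) or pattern.search(content))

    return is_blocked


# Example keywords inspired by the cases described above
is_blocked = build_blocker(["death metal", "olympic"])
print(is_blocked("New Death Metal album out"))          # True
print(is_blocked("Doom metal release"))                 # False
print(is_blocked("Sports", "The Olympic Games start"))  # True
```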

Entry sorting - Application settings

This setting helps reduce FOMO-scrolling by ensuring the top of your unread list stays the same. It is very simple: sort by Older entries first! Easy.

Filter short or long entries using Custom JavaScript rules

Sometimes I only have so much time and it is impossible to read long articles, so here is my life saver. It hides all entries not matching the filter. I've added those filters on every page so I can focus on (e.g.) short reads at the cost of one click. Very practical when in a rush. Short entries are highlighted in green, long entries in red.

Source

https://gitea.zoemp.be/sansguidon/snippets/src/branch/main/miniflux_scripts/entries_duration_filter.js
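
The linked userscript works on the page itself. As a language-neutral illustration of the short/long split, here is a Python sketch that assumes each entry carries a `reading_time` in minutes; the cut-off values and helper names are hypothetical and not taken from the script.

```python
def bucket(reading_time_minutes: int, short_max: int = 3, long_min: int = 10) -> str:
    """Classify an entry as short, medium, or long by estimated reading time."""
    if reading_time_minutes <= short_max:
        return "short"
    if reading_time_minutes >= long_min:
        return "long"
    return "medium"


def filter_entries(entries, wanted: str):
    """Keep only the entries that fall in the wanted bucket."""
    return [e for e in entries if bucket(e["reading_time"]) == wanted]


# Hypothetical entries
entries = [
    {"title": "Quick note", "reading_time": 2},
    {"title": "Weekly links", "reading_time": 6},
    {"title": "Deep dive", "reading_time": 25},
]
print([e["title"] for e in filter_entries(entries, "short")])  # ['Quick note']
print([e["title"] for e in filter_entries(entries, "long")])   # ['Deep dive']
```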

Demo

Startup options

You can override Miniflux behavior with some environment variables. See https://miniflux.app/docs/configuration.html for more configuration options.

These are mine; they make Miniflux more tolerant of unstable RSS feeds.

export HTTP_CLIENT_TIMEOUT=60
export POLLING_PARSING_ERROR_LIMIT=6
export POLLING_FREQUENCY=60
export BATCH_SIZE=100
export FORCE_REFRESH_INTERVAL=1
export CLEANUP_ARCHIVE_UNREAD_DAYS=30
export CLEANUP_ARCHIVE_READ_DAYS=15
export FETCH_YOUTUBE_WATCH_TIME=1

Tag: content-curation
