- The filesystem is a presentation layer
- What actually happens when you open a file
- fanotify: the interception point
- Stub files and the illusion of presence
- Why you'd want to move data without telling applications
- Searching what you can't see: the Husk Catalog
- Why user-space is the right place for this
The Filesystem Is a Presentation Layer
Let's start with something that feels obvious but has a profound implication: when you look at a folder on your computer, you are not looking at your data. You are looking at a map of your data — a structured set of names, paths, and pointers that your operating system maintains so you don't have to think about where bytes actually live on physical hardware.
This map is the filesystem. And like any map, it can point to different territories without the reader knowing or caring. When a highway is rerouted, the atlas keeps the same route name; the map is updated and you drive the new road. Your applications work the same way.
The file is at that path
When you see /archive/video/film.mov, it feels like the data physically lives at that location — like a physical file in a physical folder in a physical cabinet.
The path is a pointer
The path is a name. The name points to a metadata record. The metadata record points to physical blocks. Those blocks could be on an SSD, a hard drive, a remote server, or a tape cartridge. The name doesn't change when the blocks move.
This is not a quirk or an edge case. It is the fundamental design of every major operating system. UNIX-style systems have worked this way since the 1970s. Windows NTFS has worked this way since the 1990s. The abstraction is so good that most people who work with computers professionally have never had to think about it. HuskHoard is built on top of that abstraction — and exploits it deliberately.
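The name-to-record-to-blocks indirection described above can be modelled in a few lines. This is a toy sketch, not how any real filesystem or HuskHoard stores its metadata: the `Namespace`, `Location`, and `relocate` names are invented for illustration. It only shows that moving the data touches the second table and never the first.

```rust
use std::collections::HashMap;

/// Where a file's bytes physically live. Variants are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum Location {
    Ssd { first_block: u64 },
    Tape { volume: String, block: u64 },
}

/// A toy namespace: path -> metadata record id -> physical location.
#[derive(Default)]
struct Namespace {
    names: HashMap<String, u64>,     // path -> metadata record id
    records: HashMap<u64, Location>, // record id -> where the bytes are
}

impl Namespace {
    /// Register a new file under `path` with record `id` at `loc`.
    fn create(&mut self, path: &str, id: u64, loc: Location) {
        self.names.insert(path.to_string(), id);
        self.records.insert(id, loc);
    }

    /// Move the data. The name table is deliberately left untouched.
    fn relocate(&mut self, id: u64, new_loc: Location) {
        self.records.insert(id, new_loc);
    }

    /// Follow the two hops an open() conceptually takes.
    fn resolve(&self, path: &str) -> Option<&Location> {
        let id = self.names.get(path)?;
        self.records.get(id)
    }
}
```

Calling `relocate` for a record and then `resolve` with the original path returns the new location, while the path string the caller uses is unchanged.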
What Actually Happens When You Open a File
When your video editor calls open("/archive/video/film.mov"), a surprising number of things happen before a single byte of video data is read. Understanding this chain is essential to understanding where HuskHoard sits.
Layer 1 is the application: it calls open(path) and has no idea what happens next.
The critical insight is layer 2: the Virtual File System. The VFS is the part of the Linux kernel that lets you mount an NFS share, an SMB share, a FUSE filesystem, and a local ext4 disk, and address files on all of them with the same path syntax. The VFS is why /mnt/nas/file.mov and /local/file.mov look identical to your application. What's underneath the mount point is irrelevant to anything above it.
HuskHoard hooks into this layer with fanotify, which raises access events inside the kernel before the underlying filesystem fulfills them. At that point HuskHoard has a choice: allow the access (because the data is local), or pause the access and go get the data from wherever it actually lives.
fanotify: The Interception Point
Linux has two main APIs for watching filesystem events. inotify is the older and more familiar one: it tells you that something happened (a file was opened, modified, deleted), but by the time you hear about it, the event is done. You're an observer after the fact.
fanotify is different. With fanotify in permission mode, your process doesn't just observe events — it participates in them. The kernel holds the calling process in a suspended state and waits for your daemon to issue a verdict before proceeding. This is the mechanism that makes transparent demand-loading possible.
The calling application — the video editor, the backup job, the Python script — never receives an error. It issued an open() call, and eventually the call succeeded. Whether that took 2 milliseconds (local SSD) or 45 seconds (tape load + seek) is the only observable difference. The path didn't change. The filename didn't change. The application's code didn't change.
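The decision a permission-mode daemon makes for each event can be sketched as a small pure function. The `FAN_ALLOW` and `FAN_DENY` values and the layout of `FanotifyResponse` follow `<linux/fanotify.h>`; everything else (`Tier`, `verdict`, the `recall` callback, and the choice to deny when a recall fails) is an assumption made for illustration, not HuskHoard's actual code. A real daemon would obtain events by reading from the descriptor returned by `fanotify_init` and would write the response struct back to that same descriptor.

```rust
// Response codes from <linux/fanotify.h>.
const FAN_ALLOW: u32 = 0x01;
const FAN_DENY: u32 = 0x02;

/// Mirrors the kernel's `struct fanotify_response`, which the daemon writes
/// back to the fanotify descriptor to release the blocked caller.
#[repr(C)]
#[derive(Debug, PartialEq)]
struct FanotifyResponse {
    fd: i32,
    response: u32,
}

/// Hypothetical storage tiers; not HuskHoard's actual types.
enum Tier {
    Local,
    Offline,
}

/// Decide the verdict for one permission event.
/// `recall` stands in for "bring the data back to disk" and returns true
/// on success. It is only invoked when the data is not already local.
fn verdict(event_fd: i32, tier: Tier, recall: impl FnOnce() -> bool) -> FanotifyResponse {
    let response = match tier {
        // Fast path: data is on disk, let the open() proceed immediately.
        Tier::Local => FAN_ALLOW,
        // Slow path: the caller stays blocked while the recall runs.
        Tier::Offline => {
            if recall() {
                FAN_ALLOW
            } else {
                FAN_DENY // assumption: a failed recall denies the open
            }
        }
    };
    FanotifyResponse { fd: event_fd, response }
}
```

Because the calling process is suspended until the response is written, the time spent inside `recall` is exactly the extra latency the application observes.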
Custom kernel modules can also intercept filesystem operations, but a bug in kernel space can panic the kernel and take the entire machine down. A bug in a user-space daemon is just a crashed process. With HuskHoard written in Rust, even that is less likely: safe Rust rules out use-after-free, out-of-bounds memory access, and data races, which account for a large share of daemon crashes. It does not prevent logic errors or higher-level race conditions, but those stay contained in one process. User-space is both safer and faster to develop against.
Stub Files and the Illusion of Presence
When HuskHoard moves a file's data to tape, it replaces the file's contents with a stub. A stub is a placeholder that occupies almost no disk space, lives at the original path, and carries the file's metadata in its extended attributes (xattrs): the real file size, its checksum, and the UUID of the tape volume that holds the actual data.
From the directory listing's perspective, the file is still there. ls -lh reports the correct file size. stat returns the correct timestamps. A backup job scanning the directory sees all the filenames. Nothing looks different — until something tries to actually read the bytes, at which point fanotify catches the open request and HuskHoard goes to work.
This design means your directory tree is always a complete and accurate representation of your archive. You never have to wonder what's "on tape" versus what's "on disk" — everything is present in the namespace. The catalog tells you the storage tier; the stub ensures the path is always valid.
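As a rough sketch of the metadata a stub carries, the three values named above can be round-tripped through name/value pairs of the kind extended attributes hold. The attribute names (`user.husk.size` and so on) and the text encoding are hypothetical, chosen only for this example. Rust's standard library has no xattr API, so the actual `setxattr`/`getxattr` calls are left out and only the encoding is shown.

```rust
/// Metadata a stub carries, following the description above.
#[derive(Debug, Clone, PartialEq)]
struct StubMeta {
    real_size: u64,
    sha256_hex: String,
    volume_uuid: String,
}

// Hypothetical attribute names in the `user.` xattr namespace.
const XATTR_SIZE: &str = "user.husk.size";
const XATTR_SHA256: &str = "user.husk.sha256";
const XATTR_VOLUME: &str = "user.husk.volume";

impl StubMeta {
    /// Render as (name, value) pairs, ready to hand to a setxattr call.
    fn to_xattrs(&self) -> Vec<(&'static str, Vec<u8>)> {
        vec![
            (XATTR_SIZE, self.real_size.to_string().into_bytes()),
            (XATTR_SHA256, self.sha256_hex.clone().into_bytes()),
            (XATTR_VOLUME, self.volume_uuid.clone().into_bytes()),
        ]
    }

    /// Rebuild from pairs read back from a stub.
    /// Returns None if any attribute is missing or malformed.
    fn from_xattrs(attrs: &[(&str, Vec<u8>)]) -> Option<StubMeta> {
        let get = |name: &str| {
            attrs
                .iter()
                .find(|(n, _)| *n == name)
                .and_then(|(_, v)| String::from_utf8(v.clone()).ok())
        };
        Some(StubMeta {
            real_size: get(XATTR_SIZE)?.parse().ok()?,
            sha256_hex: get(XATTR_SHA256)?,
            volume_uuid: get(XATTR_VOLUME)?,
        })
    }
}
```

Keeping this metadata on the stub itself means the path stays self-describing even without consulting any other index.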
Why You'd Want to Move Data Without Telling Applications
Once you accept that the filesystem path is just a pointer, and that HuskHoard can transparently intercept access to data that isn't local, the interesting question becomes: when would you want to move data behind the scenes? The answer turns out to be surprisingly broad.
In all four cases, the mechanism is the same: HuskHoard relocates the physical data, leaves a stub at the original path, and uses fanotify to make retrieval seamless when something needs the data again. The applications above never see the machinery.
Tiering Policies in Practice
HuskHoard lets you define migration policies that run automatically. A simple policy might look like this:
Once a policy runs, the files are on tape and the stubs are in place. Nothing else in your workflow changes. A user who opens the legal folder the next morning sees exactly the same files they always saw. If they open one, HuskHoard handles the retrieval. If they never touch it, the data costs $0.005/GB to keep rather than $0.030/GB.
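The selection step of an age-based policy could be sketched as follows. This is not HuskHoard's policy format: the `Policy` and `Candidate` types, their fields, and the 90-day figure used in the usage note are all hypothetical, and the sketch covers only deciding which files qualify, not moving them.

```rust
/// A hypothetical age-based rule; field names are illustrative.
struct Policy {
    path_prefix: String, // only files under this prefix are considered
    min_idle_days: u64,  // minimum days since last access
}

/// What the selector needs to know about each file it examines.
struct Candidate {
    path: String,
    idle_days: u64,     // days since the file was last accessed
    already_stub: bool, // true if the data has already been migrated
}

impl Policy {
    /// A file qualifies if it is under the prefix, has been idle long
    /// enough, and has not already been replaced by a stub.
    fn selects(&self, f: &Candidate) -> bool {
        !f.already_stub
            && f.path.starts_with(self.path_prefix.as_str())
            && f.idle_days >= self.min_idle_days
    }
}

/// Return the paths a policy run would migrate, in input order.
fn files_to_migrate<'a>(policy: &Policy, files: &'a [Candidate]) -> Vec<&'a str> {
    files
        .iter()
        .filter(|f| policy.selects(f))
        .map(|f| f.path.as_str())
        .collect()
}
```

With a prefix of `/archive/legal/` and `min_idle_days` of 90, a contract untouched for 200 days is selected, while a file opened three days ago, a file outside the prefix, and a file that is already a stub are all skipped.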
Searching What You Can't See: The Husk Catalog
The stub-file model keeps your directory tree intact. But what happens when the file is on a cartridge that's in a box in another room? The stub is there, but if you try to open it, HuskHoard needs to tell you which cartridge to insert. And what if you don't remember which folder the file was in, and only know it exists somewhere in your 200 TB archive across 15 tapes?
This is what the Husk Catalog is for. The catalog is a persistent, queryable index that HuskHoard maintains on your host machine, entirely separate from the physical media. Every file across every volume — whether that volume is currently loaded, sitting on a shelf, or in an offsite location — is indexed and searchable.
The search above ran across three volumes — two of them offline, one loaded — and returned results in milliseconds. No cartridges were touched. The catalog already knew what was on each volume because it indexed the files when they were written.
This is fundamentally different from how LTFS works. LTFS stores its index on the tape itself. If the tape is ejected, the index is inaccessible. You cannot search an LTFS volume without mounting it. With the Husk Catalog, the index is always available regardless of where the physical media is. You can search your entire archive from a laptop while all your tapes are in a safe across town.
What the Catalog Stores
For every file on every volume, the catalog tracks:
- Full path at the time of write
- File size and last-modified timestamp
- SHA-256 checksum (used for integrity verification)
- The UUID of the volume the file lives on
- The tape block address (for fast seeks during retrieval)
- Current file state: local, cached, offline, or WORM
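The fields listed above map naturally onto a record type, and a search over such records needs no access to the media. The sketch below is an in-memory stand-in under stated assumptions: the type and field names are invented, a real catalog would presumably sit in a persistent database, and treating only the `Offline` state as "needs a cartridge loaded" is a guess made for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum FileState {
    Local,
    Cached,
    Offline,
    Worm,
}

/// One catalog record, with one field per item in the list above.
#[derive(Debug, Clone)]
struct CatalogEntry {
    path: String,
    size: u64,
    mtime_unix: i64,
    sha256_hex: String,
    volume_uuid: String,
    tape_block: u64,
    state: FileState,
}

struct Catalog {
    entries: Vec<CatalogEntry>,
}

impl Catalog {
    /// Case-insensitive substring match on the path. No volume is consulted,
    /// so this works whether the tapes are loaded, shelved, or offsite.
    fn search(&self, needle: &str) -> Vec<&CatalogEntry> {
        let needle = needle.to_lowercase();
        self.entries
            .iter()
            .filter(|e| e.path.to_lowercase().contains(needle.as_str()))
            .collect()
    }

    /// Sorted, de-duplicated volume UUIDs that would have to be loaded
    /// to retrieve the offline hits.
    fn volumes_needed(hits: &[&CatalogEntry]) -> Vec<String> {
        let mut volumes: Vec<String> = hits
            .iter()
            .filter(|e| e.state == FileState::Offline)
            .map(|e| e.volume_uuid.clone())
            .collect();
        volumes.sort();
        volumes.dedup();
        volumes
    }
}
```

A search returns matching records immediately, and `volumes_needed` turns those hits into the list of cartridges someone would have to fetch.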
The checksum is particularly important. At any time you can ask HuskHoard to verify your archive — not by mounting every tape, but by comparing the stored checksums against the catalog's records. If a cartridge has degraded, you know before you need the data.
Why User-Space Is the Right Place for This
The question we get most often from technically minded readers is: why fanotify and not a custom filesystem or kernel module? Wouldn't a custom FUSE filesystem give you more control?
FUSE is a legitimate option and HuskHoard considered it. The problem is that FUSE filesystems need to be the mount point — you'd have to mount the Husk filesystem somewhere and then route your data through it. This means your existing directory structure either has to move (disruptive) or you have to use bind mounts everywhere (complex and fragile). It also means FUSE overhead on every single filesystem call, not just the ones that need intervention.
fanotify doesn't require any remounting. It watches a directory tree that already exists, on whatever filesystem you're already using. It only intervenes when intervention is needed. When a file is local, the fanotify handler responds in microseconds with FAN_ALLOW and gets out of the way. For routine access to local files, the only added cost is that brief round trip to the daemon when the file is opened.
The result is an archive system that sits invisibly inside your existing directory structure, imposes negligible overhead on data that's already local, and handles offline retrieval transparently for anything that isn't. No custom kernel code, no separate mount points, no changes to any application that reads your files.
The filesystem is a presentation layer. HuskHoard manages what sits behind it.