Refactor rootfs cache system
Description

Currently we generate the cache files within the build script by using a parameter, which calls this script.
Signed caches are stored, and torrent files are created. The torrents have webseeds pointing at our (4) mirrors, which hold all of the caches (cache.armbian.com).
The current solution is more a proof of concept than a real service.
What do we need to do?
- store up to 3 full cache releases under the releases page, so we don't have a black hole while cache creation is taking place (it can take half a day)
- move GitHub cache storage under a new repository, namely rootfscache, cache or similar
- change compression technology to (multicore) zstd, which compresses better. This is needed as some caches already go over GitHub's 2 GB file size limit (see the sketch below)
What else?
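
As a rough illustration of the zstd step, here is a minimal sketch; the Python wrapper, paths, and compression level are illustrative assumptions, not the project's actual tooling, but `zstd -T0` is the real flag that enables multicore compression:

```python
import subprocess

def compress_rootfs(rootfs_dir: str, out_file: str, level: int = 19) -> None:
    """Pack a rootfs into a zstd-compressed tarball using all CPU cores.

    GNU tar is piped into zstd; -T0 tells zstd to use every core.
    Level 19 is an illustrative speed/ratio trade-off, not a project setting.
    """
    tar = subprocess.Popen(["tar", "-C", rootfs_dir, "-cf", "-", "."],
                           stdout=subprocess.PIPE)
    with open(out_file, "wb") as out:
        subprocess.run(["zstd", f"-{level}", "-T0"],
                       stdin=tar.stdout, stdout=out, check=True)
    tar.stdout.close()
    if tar.wait() != 0:
        raise RuntimeError("tar failed")

compress_rootfs("./rootfs", "sid-budgie-amd64.tar.zst")
```
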
Igor Pecovnik August 1, 2022 at 8:37 PM (edited)
Thank you for your feedback!
> That sounds sane, to better isolate things. I wonder if putting the caches outside of github might also make sense, though github is of course nice and easy and free :-)
Once a cache is generated, it is uploaded to GitHub releases, but it is also (currently) distributed to 4 dedicated servers under our control. Two of them are slow, two are fast. We can scale this up when / if needed.
> If using github releases, would it make sense to use a single release for each cache version [...] just remove the release?
The problem is caching during the window until all mirrors are in sync and we have a new version, which is defined once the GitHub Actions scripts are done with building. We need to find a bulletproof concept.
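One possible building block for such a concept, sketched under heavy assumptions (the mirror URLs are placeholders, not the real hosts): only publish a new cache version once every mirror actually serves the new file.

```python
import urllib.request

MIRRORS = [  # placeholder hosts; cache.armbian.com fronts the real mirrors
    "https://mirror-a.example.com",
    "https://mirror-b.example.com",
]

def all_mirrors_in_sync(filename: str) -> bool:
    """Return True only when every mirror answers a HEAD request for the
    new cache file, i.e. it is safe to flip the published version."""
    for mirror in MIRRORS:
        req = urllib.request.Request(f"{mirror}/{filename}", method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                if resp.status != 200:
                    return False
        except OSError:  # connection error or HTTP error -> not in sync yet
            return False
    return True
```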
> I'm not sure how much better that compression is
5143582720 bytes = sid-budgie-amd64.cb324f1e13f59906dce95bd53caf56f8.tar (uncompressed)
2095987869 bytes = the same archive, lz4-compressed
1664039699 bytes = the same archive, zstd-compressed
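For scale, a quick back-of-the-envelope check of the ratios these numbers imply (note that the lz4 file only barely fits under the 2 GB GitHub limit mentioned in the description above, while zstd leaves real headroom):

```python
sizes = {  # bytes, from the measurement above
    "tar (uncompressed)": 5143582720,
    "lz4": 2095987869,
    "zstd": 1664039699,
}
raw = sizes["tar (uncompressed)"]
for name, size in sizes.items():
    print(f"{name:>18}: {size / 2**30:5.2f} GiB  ({raw / size:4.2f}x)")
print(f"zstd is {1 - sizes['zstd'] / sizes['lz4']:.0%} smaller than lz4")
```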
> as the rootfs will likely increase in size over time anyway.
IMO we can keep patching this way for several years. We can code it so that GitHub release upload is optional, as we have our own capacity, but why not use free beer as much as possible? Right now it is already good enough to just ignore those files that go over the limit (currently an error - changing it to a warning), since the other mirrors will cover for them … download is done from multiple sources at once or via the torrent network.
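The warning-instead-of-error behaviour could look roughly like this; `gh release upload` is the real GitHub CLI command, but the limit handling and the release tag are assumptions of this sketch:

```python
import logging
import os
import subprocess

GITHUB_LIMIT = 2 * 1024**3  # 2 GiB per release asset

def upload_cache(path: str, tag: str) -> None:
    """Upload a cache file to a GitHub release, but only warn (not fail)
    when it exceeds the size limit - the mirrors will cover for it."""
    if os.path.getsize(path) > GITHUB_LIMIT:
        logging.warning("%s exceeds the GitHub size limit, skipping upload; "
                        "mirrors / torrent will serve it instead", path)
        return
    subprocess.run(["gh", "release", "upload", tag, path], check=True)
```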
> At some point, you might need to split caches into multiple parts
I would like to avoid this complexity at this stage, if possible.
> Are caches still intended to be rotated monthly
Refreshing the package base once per month seems reasonable, but every two months would also do.
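A purely illustrative sketch of how a calendar-derived cache tag could implement either rotation window (none of the naming here is the project's actual scheme):

```python
from datetime import date

def cache_epoch(today: date, months: int = 1) -> str:
    """Tag that changes once per rotation window (1 or 2 months)."""
    start = (today.month - 1) // months * months + 1
    return f"{today.year}.{start:02d}"

print(cache_epoch(date(2022, 8, 1)))     # -> 2022.08
print(cache_epoch(date(2022, 8, 1), 2))  # -> 2022.07 (two-month window)
```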
> How about manual version bumps
That is already integrated - the version is bumped if changes were made to the packages section in general, i.e. if packages are changed. But the cache name = hash of the packages inside + version. This means caches are not binary compatible over time, as the packages inside can have different versions …
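Spelled out as a sketch (MD5 is an assumption, suggested by the 32-hex-digit hash in the example filename above; exactly how the version is mixed into the name is also assumed):

```python
import hashlib

def rootfs_cache_name(release: str, desktop: str, arch: str,
                      packages: list, version: int) -> str:
    """Cache name = hash of the packages inside + version, per the comment
    above. The MD5 digest and the field layout are illustrative guesses."""
    digest = hashlib.md5(" ".join(sorted(packages)).encode()).hexdigest()
    return f"{release}-{desktop}-{arch}.{digest}.{version}.tar.zst"

print(rootfs_cache_name("sid", "budgie", "amd64", ["mc", "vim"], 1))
```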

Matthijs Kooijman August 1, 2022 at 8:04 PM
I'm not super familiar with the caching system (I've mostly been looking at the build scripts a while back), but since you asked for feedback, I had a quick look.
> move GitHub cache storage under a new repository, namely rootfscache, cache or similar
That sounds sane, to better isolate things. I wonder if putting the caches outside of github might also make sense, though github is of course nice and easy and free :-)
If using github releases, would it make sense to use a single release for each cache version (i.e. one new release each month)? That might also make it easier to clean up outdated caches: just remove the release?
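Cleanup would then be a single API call per outdated version. A sketch against the standard GitHub REST release endpoints; the repository name follows the "rootfscache" suggestion above and is hypothetical:

```python
import requests  # pip install requests

API = "https://api.github.com"
REPO = "armbian/rootfscache"  # hypothetical name from this proposal

def delete_cache_release(tag: str, token: str) -> None:
    """Delete the whole release for an outdated monthly cache version.

    Note: on GitHub this removes the release and its assets, but the
    underlying git tag remains and would need separate cleanup."""
    headers = {"Authorization": f"token {token}"}
    r = requests.get(f"{API}/repos/{REPO}/releases/tags/{tag}", headers=headers)
    r.raise_for_status()
    release_id = r.json()["id"]
    requests.delete(f"{API}/repos/{REPO}/releases/{release_id}",
                    headers=headers).raise_for_status()
```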
> change compression technology to (multicore) zstd, which compresses better. This is needed as some caches already go over GitHub's 2 GB file size limit
I'm not sure how much better that compression is, but I can imagine that this only delays size problems, as the rootfs will likely increase in size over time anyway. At some point, you might need to split caches into multiple parts (and maybe it would make sense to do this right away)?
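If splitting ever becomes necessary, the mechanics themselves are simple; this sketch assumes an illustrative 1.9 GB part size and naming, and the parts reassemble with a plain byte-wise concatenation (`cat file.tar.zst.* > file.tar.zst`):

```python
def split_file(path: str, part_size: int = 1_900_000_000) -> list:
    """Split a cache archive into parts that fit under the 2 GB limit.

    Buffers one whole part in memory for brevity; a production version
    would copy in smaller blocks."""
    parts = []
    index = 0
    with open(path, "rb") as src:
        while chunk := src.read(part_size):
            part = f"{path}.{index:03d}"
            with open(part, "wb") as dst:
                dst.write(chunk)
            parts.append(part)
            index += 1
    return parts
```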
Are caches still intended to be rotated monthly (i.e. you would have up to a month old rootfs when downloading from the cache)? How about manual version bumps (which I think currently happen when some significant change is made in the build process)?