Bastrama is a command-line tool for managing backup files that are stored on random-access storage (e.g. hard drives). It implements an infinite grandfather-father-son strategy by deleting a defined subset of the backup files. This saves storage space while still keeping some of the older backups, in case something went wrong a long time ago and an old state needs to be restored.
The idea is that Bastrama lets you run full daily backups to hard drive without ever running out of space. Of course, you still have to make the backups yourself, using the backup tool of your choice. Bastrama only manages the resulting backup files.
From your backup files, numbered linearly starting from 0, Bastrama builds a "tree" in which every node has n children. Then, from every level of the tree, Bastrama keeps the latest k files. The rest are deleted.
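The description above leaves some room for interpretation. One reading that reproduces the 18-file example further down in this post is that level j of the tree consists of the files whose number is divisible by n^j, and that the latest k of those are kept at each level. Here is a minimal Python sketch under that assumption. The function name `kept_backups` is made up for illustration and is not part of Bastrama.

```python
def kept_backups(latest, n, k):
    """Return the numbers of the kept files among backups 0..latest.

    Assumption (not taken from Bastrama's source): level j holds the
    files whose number is divisible by n**j, and the latest k files of
    every level are kept.
    """
    if latest < 0 or n < 2 or k < 1:
        raise ValueError("need latest >= 0, n >= 2 and k >= 1")

    kept = set()
    step = 1  # n**level, the spacing between files on the current level
    while True:
        # Newest file on this level: largest multiple of step <= latest.
        top = latest - latest % step
        for i in range(k):
            number = top - i * step
            if number < 0:
                break
            kept.add(number)
        if step > latest:
            # From here on every level contains only file 0.
            break
        step *= n
    return sorted(kept, reverse=True)


if __name__ == "__main__":
    print(kept_backups(10000, 3, 3))
```

Under this reading, every file number in `range(latest + 1)` that is missing from the returned list is a candidate for deletion.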
Here are some charts to make it clear. A sequence of backups is drawn from left to right, numbered from 0 to 68, where 0 denotes the oldest backup and 68 the latest (e.g. "today's"). Green files are kept, gray ones are deleted.
n=2, k=2
n=2, k=4
n=3, k=3
n=3, k=6
It can be seen that the oldest and the latest backup are always kept. Between them, the kept backups thin out with age: the gaps between neighbouring kept files grow roughly exponentially the further back you go.
As a consequence, the required storage space increases only logarithmically over time, which is very slow. (This assumes that the later backup files aren't larger than the older ones.)
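To make the logarithmic growth concrete, here is a rough upper bound. It assumes that level $j$ of the tree consists of the files whose number is divisible by $n^j$, a reading that is consistent with the example in this post but is not taken from the tool's documentation. With $N$ the number of the latest backup, only levels $0$ to $\lfloor \log_n N \rfloor$ contain anything besides file 0, and each of them contributes at most $k$ files:

$$
\#\text{kept} \;\le\; k\left(\lfloor \log_n N \rfloor + 1\right) + 1
$$

For $n=3$, $k=3$, $N=10000$ this gives $3 \cdot 9 + 1 = 28$. The actual count of 18 is lower because the same file is often kept by several levels at once.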
An example: after 10000 backup cycles with an n=3, k=3 strategy, 18 files are kept:
file kept | age |
---|---|
# 10000 | 0 cycles |
# 9999 | 1 cycle |
# 9998 | 2 cycles |
# 9996 | 4 cycles |
# 9993 | 7 cycles |
# 9990 | 10 cycles |
# 9981 | 19 cycles |
# 9963 | 37 cycles |
# 9936 | 64 cycles |
# 9882 | 118 cycles |
# 9801 | 199 cycles |
# 9720 | 280 cycles |
# 9477 | 523 cycles |
# 8748 | 1252 cycles |
# 8019 | 1981 cycles |
# 6561 | 3439 cycles |
# 4374 | 5626 cycles |
# 0 | 10000 cycles |