Filezcache - cache S3 and other files on disk
=============================================
This is a file caching system optimized for use with cowmachine,
but it can also be used with other services.
It has some unique properties:
* Lookups return file references (if possible) which can be used with sendfile
* Cached files can be streamed to requestors while they are being filled
* On startup the cache is repopulated with existing files
* Lookups return data formats directly useable for cowmachine serving
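
As an illustration of the first property, here is a minimal sketch of passing a lookup result to `file:sendfile/2`. The module and function names are hypothetical, and the only return shapes assumed are the `{ok, {file, Size, Filename}}` and `{error, Reason}` tuples shown in the Example section; any other shape the cache may return is handed back as an error rather than guessed at.

```erlang
-module(filezcache_sendfile_sketch).
-export([serve/2, send_result/2]).

%% Look up Key in the cache and push the cached file to Socket.
serve(Key, Socket) ->
    send_result(filezcache:lookup(Key), Socket).

%% Dispatch on the lookup result.
%% Assumption: {file, Size, Filename} is the file-reference shape,
%% as printed in the Example section of this README.
send_result({ok, {file, _Size, Filename}}, Socket) ->
    %% file:sendfile/2 sends the whole file and returns {ok, BytesSent}.
    file:sendfile(Filename, Socket);
send_result({ok, Other}, _Socket) ->
    %% Not a file reference (for example an entry that is still filling).
    {error, {not_a_file_reference, Other}};
send_result({error, _Reason} = Error, _Socket) ->
    Error.
```

A caller that already owns a connected TCP socket could use `filezcache_sendfile_sketch:serve(mykey, Socket)`.
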

The system uses the file system to store files and a disk log. The files are stored
in `priv/data/` and the disk log in `priv/journal`.

The disk log is used to rebuild the cache on startup. Every file is checked against
its checksum in the disk log; files that do not match are deleted from the cache.
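
The startup check described above could look roughly like the following sketch. It is not the actual implementation: the module and function names are hypothetical, and it assumes the checksum is a SHA-256 digest over the whole file contents, which this README does not specify.

```erlang
-module(filezcache_checksum_sketch).
-export([checksum/1, verify/2]).

%% Assumption: the journal stores a SHA-256 digest of the file contents.
checksum(Data) ->
    crypto:hash(sha256, Data).

%% Compare the file at Path with the Expected checksum from the journal.
%% A non-matching file is deleted, mirroring the behaviour described above.
verify(Path, Expected) ->
    case file:read_file(Path) of
        {ok, Data} ->
            case checksum(Data) of
                Expected ->
                    ok;
                _Mismatch ->
                    _ = file:delete(Path),
                    {deleted, mismatch}
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```

Reading the whole file into memory keeps the sketch short; for large cached files an incremental hash (`crypto:hash_init/1`, `crypto:hash_update/2`, `crypto:hash_final/1`) would be more appropriate.
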

Example
-------
    $ erl -pa ebin
    Erlang R15B03 (erts-5.9.3.1) [source] [64-bit] [smp:4:4] [async-threads:0] [kernel-poll:false]
    Eshell V5.9.3.1 (abort with ^G)
    1> application:start(crypto).
    ok
    2> application:start(filezcache).
    ok
    3> filezcache:lookup(mykey).
    {error,enoent}
    4> filezcache:insert(mykey, <<"foobar">>).
    {ok,<0.173.0>}
    5> filezcache:lookup(mykey).
    {ok,{file,6,"priv/data/4J0I2F06043V5P0V603D4O4I6L1J5M1B4I5Y2B2C606W28131H164Z421M4X6221"}}
    6> filezcache:insert(mykey, <<>>).
    {error,{already_started,<0.173.0>}}
    7> filezcache:delete(mykey).
    ok

Configuration keys
------------------
* `max_bytes` Maximum size of the cache in bytes, defaults to 10GiB (10737418240 bytes)
* `journal_dir` Directory for the disk log, defaults to `priv/journal`
* `data_dir` Directory for the cached files, defaults to `priv/data`
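
A minimal configuration sketch, assuming these keys are read from the `filezcache` application environment (the application name is taken from the example above; the README does not state how the keys are loaded). The values shown are the documented defaults, and representing the directories as strings is also an assumption.

```erlang
%% sys.config (hypothetical file name), loaded with: erl -pa ebin -config sys
[
 {filezcache, [
   %% Maximum cache size in bytes (10 GiB)
   {max_bytes, 10737418240},
   %% Directory for the disk log
   {journal_dir, "priv/journal"},
   %% Directory for the cached files
   {data_dir, "priv/data"}
 ]}
].
```
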

TODO
----
There are some known issues that need to be resolved:
* On startup, delete files that are unknown to the disk log
* Add timeouts to the `filezcache_entry` states `wait_for_data` and `streaming`
* Add logic to `filezcache_entry` to prevent eviction of active entries during garbage collection