Right after the 1.1 release comes 1.2, which includes a few stability fixes and, above all, performance improvements, especially in background sync, where the commit latency dropped considerably. Let's go through the changes.
The work on background sync performance (or, better said, latency reduction) started even before the 1.1 release. In fact, 1.1 was published to ship all the work done so far without delaying it any further for latency improvements whose remaining effort was not yet clear to me.
Shortly after the 1.1 release, though, I was able to complete the main changes required to reduce the latency. This involved making sure that flushing the "root pages" (the pages that hold basic metadata such as the journal location, existing segments, and free pages) is safe even when the fsync is done in a background thread.
With that done, I spent quite some time running "high speed run and kill -9" tests to verify that everything was fine, and it wasn't! Thanks to those tests I discovered a small, nasty concurrency bug in the journal start, which caused issues under high concurrency and which I likely would not have been able to reproduce without the latency reduction brought by the patch.
Anyway, the fix for this bug was also backported to previous releases as patch releases 1.0.1 and 1.1.1. After fixing it I could not reproduce any other issue, so we are good to go.
In terms of numbers, I ran some benchmarks with a test project against 1.1 and 1.2. The benchmark is a simple case of inserting one key per transaction in a single thread: version 1.1 does ~300 transactions per second; version 1.2, same code, does ~8000 transactions per second.
Running the same benchmark with multiple threads (20 threads), 1.1 suffers a performance reduction down to 150 transactions per second, while 1.2 does 14000 transactions per second.
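To put these throughput figures in terms of latency, here is a small back-of-the-envelope sketch. It assumes every thread commits back-to-back, so the average time a thread spends on one transaction is the number of threads divided by the throughput. The function names are purely illustrative, and the inputs are the approximate numbers reported above, not new measurements.

```rust
/// Average time one thread spends per transaction, in milliseconds,
/// assuming `threads` workers each commit back-to-back and together
/// reach `tx_per_sec` transactions per second.
fn mean_commit_ms(threads: u32, tx_per_sec: f64) -> f64 {
    f64::from(threads) * 1000.0 / tx_per_sec
}

/// How many times higher the new throughput is compared to the old one.
fn speedup(old_tx_per_sec: f64, new_tx_per_sec: f64) -> f64 {
    new_tx_per_sec / old_tx_per_sec
}
```

With the figures above this gives roughly 3.3 ms per transaction for 1.1 and 0.125 ms for 1.2 in the single-thread case (about 27 times the throughput), and roughly 133 ms versus 1.4 ms per transaction with 20 threads (about 93 times the throughput).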
The durability of a transaction with background sync depends on the kind of crash you have: if the crash is at the process level, the OS should still complete all the writes, so no transaction is lost. In case of a full OS crash (for example a sudden power loss that takes the machine down hard), some of the latest confirmed transactions may be lost. This behavior is actually common to many systems, for example SQLite in WAL mode with synchronous=NORMAL, or PostgreSQL with synchronous_commit set to off. In no case would (should) a crash corrupt the database in a way that makes "long ago committed data" inaccessible, or makes the whole database inaccessible.
In Persy you can also choose, at the start of each transaction, whether to wait for a guaranteed confirmation or to accept a confirmation with a background fsync: this gives you granular control over what is critical and what is not.
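To make the trade-off concrete, here is a minimal sketch of the general technique using only the Rust standard library. It is not Persy's implementation and does not use Persy's API: `ToyLog` and `Durability` are hypothetical names invented for this illustration. A "commit" appends a record to a file and then either waits for the fsync or hands the fsync to a background thread.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;
use std::sync::mpsc::{self, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// How long `commit` waits before confirming (hypothetical, not Persy's API).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Durability {
    /// Confirm only after fsync has returned: survives a power loss.
    Sync,
    /// Confirm once the data is handed to the OS: survives a process
    /// crash, but the latest records may be lost on a full OS crash.
    Background,
}

/// Toy append-only log with a background flusher thread.
pub struct ToyLog {
    file: Arc<File>,
    flush_requests: Option<Sender<()>>,
    flusher: Option<JoinHandle<()>>,
}

impl ToyLog {
    pub fn open(path: &Path) -> io::Result<Self> {
        let file = Arc::new(OpenOptions::new().create(true).append(true).open(path)?);
        let (sender, receiver) = mpsc::channel::<()>();
        let background_file = Arc::clone(&file);
        let flusher = thread::spawn(move || {
            // One fsync per request; the loop ends when the sender is dropped.
            for () in receiver {
                let _ = background_file.sync_all();
            }
        });
        Ok(ToyLog { file, flush_requests: Some(sender), flusher: Some(flusher) })
    }

    pub fn commit(&self, record: &[u8], durability: Durability) -> io::Result<()> {
        // `&File` implements `Write`, so the shared handle can append.
        (&*self.file).write_all(record)?;
        match durability {
            // Block the caller until the data is on stable storage.
            Durability::Sync => self.file.sync_all(),
            // Return immediately; the flusher thread performs the fsync.
            Durability::Background => self
                .flush_requests
                .as_ref()
                .expect("sender is present until drop")
                .send(())
                .map_err(|_| io::Error::new(io::ErrorKind::Other, "flusher stopped")),
        }
    }
}

impl Drop for ToyLog {
    fn drop(&mut self) {
        // Closing the channel stops the flusher after it drains pending requests.
        self.flush_requests.take();
        if let Some(handle) = self.flusher.take() {
            let _ = handle.join();
        }
    }
}
```

In this toy version, a caller would use `Durability::Sync` for records it cannot afford to lose and `Durability::Background` for everything else. A real storage engine also has to keep its own metadata consistent while the background fsync is still pending, which is the part the root page work described earlier takes care of in Persy.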
After the speed improvements I was testing the project by integrating it with a test-bed project I have, and I received some feedback about a memory issue somewhere, so I went deep into memory profiling and found that a memory issue did exist in the management of snapshots. After fixing it, I backported the fix and released patch versions 1.0.2 and 1.1.2.
Further memory profiling after the fix didn't show anything concerning.
One small regression in 1.1.0 in the ByteVec management was also discovered by a third party; that led to the release of 1.1.3, and the fix is also included in 1.2.
Some more minor improvements and spelling fixes came from contributors. Thanks everyone for the great feedback!
After this release I'm fairly happy with the speed level reached; more improvements may come in the future from additional profiling or from user feedback on specific use cases.
The next step will be writing some blog posts explaining the details of Persy, with more examples of use cases, and maybe starting to plan some new features, so I invite anyone interested to come and open a feature request on GitLab Issues. Stay tuned!