Append-only files
I need to implement an append-only persistent data structure for a project using CQRS (I will use it for the event store). The records will be variable in length (a string serialization of the events using either XML, JSON or YAML - I haven't decided yet).
Now, appending to a text file is easy in Windows; just open the file for append. However, I will also need to read the events, starting with a given Id; that's harder to do with a text file. Given that I'm in a "designing algorithms" mood lately (see my first Coursera certificate) and that I haven't been able to find a satisfactory library written by someone else, here's what I came up with.
The data file
Given that the records in the data file will have different lengths, I could use an end-of-record character as a delimiter; however, finding the next record would require a full record scan in that case. (Plus, I am not 100% sure that I want to use text serialization - maybe I'll switch to binary?) That means using the same "trick" that Delphi uses for its strings: prefix them with the string length.
This means that the data file will have the following structure:
(record #) | Size (long) | Data          |
0          | 20          | … 20 bytes …  |
1          | 74          | … 74 bytes …  |
…          | …           | … and so on … |
Each record starts with a long size field, followed by size bytes of data. (The record number is not part of the file; it is shown only for completeness.)
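As an illustration (a minimal Python sketch, not the author's actual code), the length-prefixed layout can be produced and parsed with the struct module, assuming a little-endian 8-byte size field:

```python
import struct

# Hypothetical helpers illustrating the length-prefixed record layout:
# each record is an 8-byte size field followed by that many bytes of data.
SIZE_FMT = "<q"                        # little-endian signed 64-bit "long"
SIZE_LEN = struct.calcsize(SIZE_FMT)   # 8 bytes

def pack_record(payload: bytes) -> bytes:
    """Prefix the payload with its length, as in the data file."""
    return struct.pack(SIZE_FMT, len(payload)) + payload

def unpack_record(data: bytes, position: int) -> bytes:
    """Read the size field at `position`, then return that many bytes."""
    (size,) = struct.unpack_from(SIZE_FMT, data, position)
    start = position + SIZE_LEN
    return data[start:start + size]

blob = pack_record(b"event-0") + pack_record(b"event-1-longer")
print(unpack_record(blob, 0))   # b'event-0'
```

The endianness and exact width of the size field are assumptions here; the post only specifies "a long".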
The index file
The index file is very simple: it is a list of long values, each one pointing to the start of the matching record in the data file (to the size field). The first entry is always zero, since the first record starts at the beginning of the data file.
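To make the offsets concrete, here is a small sketch (not part of the original code) that computes the index entries for the sample data file above, with its 20- and 74-byte records and 8-byte size fields:

```python
def index_positions(record_sizes, size_field_len=8):
    """Yield the offset of each record's size field, given the payload sizes."""
    position = 0
    for size in record_sizes:
        yield position
        position += size_field_len + size

print(list(index_positions([20, 74])))  # [0, 28]
```

Record 0 sits at offset 0, so record 1 starts at 8 + 20 = 28, and a third record would start at 28 + 8 + 74 = 110.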
Operations
- Rebuilding the index file is extremely simple; the following pseudo-code assumes the data file is accessible as a byte[] array called data (for example, as a memory-mapped file):

      IEnumerable<long> BuildIndex(byte[] data):
          position = 0
          while (position < data.Length)
              yield return position
              size = read_long(data, position)
              position += size + sizeof(long) // the size doesn't include the field itself
- Appending a new record:

      void Append(byte[] data, byte[] record):
          position = data.Length
          index.add(position)
          write_long(data, position, record.Length)
          write_bytes(data, position + sizeof(long), record)
- Reading a specific record:

      byte[] Read(byte[] data, long i):
          position = index[i]
          size = read_long(data, position)
          return read_bytes(data, position + sizeof(long), size)
- Reading all records starting from a given position:

      IEnumerable<byte[]> ReadFrom(byte[] data, long i):
          position = index[i]
          while (position < data.Length)
              size = read_long(data, position)
              yield return read_bytes(data, position + sizeof(long), size)
              position += size + sizeof(long)
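The four operations translate almost line for line into Python. The sketch below works over an in-memory bytearray rather than a real (memory-mapped) file, purely for illustration; the names mirror the pseudo-code, not the actual repository:

```python
import struct

SIZE_LEN = 8  # width of the size field, a 64-bit "long" (assumed little-endian)

def read_long(data, position):
    return struct.unpack_from("<q", data, position)[0]

def build_index(data):
    """Rebuild the index by walking the size fields."""
    position = 0
    while position < len(data):
        yield position
        position += SIZE_LEN + read_long(data, position)

def append(data, index, record):
    """Append a size-prefixed record and remember its offset."""
    index.append(len(data))
    data.extend(struct.pack("<q", len(record)))
    data.extend(record)

def read(data, index, i):
    position = index[i]
    size = read_long(data, position)
    return bytes(data[position + SIZE_LEN:position + SIZE_LEN + size])

def read_from(data, index, i):
    """Yield every record from index entry i to the end of the data."""
    position = index[i]
    while position < len(data):
        size = read_long(data, position)
        yield bytes(data[position + SIZE_LEN:position + SIZE_LEN + size])
        position += SIZE_LEN + size

data, index = bytearray(), []
for event in (b"created", b"renamed", b"archived"):
    append(data, index, event)
print(read(data, index, 1))             # b'renamed'
print(list(read_from(data, index, 1)))  # [b'renamed', b'archived']
```

Note that rebuilding the index from scratch (build_index) and the index maintained incrementally by append must agree, which makes for an easy consistency check.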
The above code fragments use the following primitives:
long read_long(byte[] data, long position)
    Reads a long from the data array, starting at the given position.

byte[] read_bytes(byte[] data, long position, long size)
    Reads a record from the data array, starting at the given position, size bytes long.

void write_long(byte[] data, long position, long value)
    Writes a long to the data array, starting at the given position.

void write_bytes(byte[] data, long position, byte[] record)
    Writes a byte array, starting at the given position.
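In Python, these primitives map directly onto struct.pack_into/unpack_from over a mutable buffer. A hedged sketch, again assuming a little-endian 64-bit "long" (the post doesn't fix an endianness):

```python
import struct

# Sketch of the four primitives over a mutable buffer.
def write_long(data, position, value):
    struct.pack_into("<q", data, position, value)

def read_long(data, position):
    return struct.unpack_from("<q", data, position)[0]

def write_bytes(data, position, record):
    data[position:position + len(record)] = record

def read_bytes(data, position, size):
    return bytes(data[position:position + size])

buf = bytearray(16)
write_long(buf, 0, 5)        # size field: 5
write_bytes(buf, 8, b"hello")  # payload right after the 8-byte size field
print(read_long(buf, 0))     # 5
print(read_bytes(buf, 8, 5)) # b'hello'
```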
Of course, even the operations I described are pretty low-level (using a byte array instead of an object); I will have to implement a higher-level interface for real use.
I have uploaded the code to my GitHub repository. It was not written using TDD; I have only added a smoke test to verify that it works correctly. I think the design is pretty clean so far.
Edit: the timings are much better than I expected:
Using ...\tmpC056.tmp - writing and reading 100000 records
append - time elapsed: 3176 msec
read all - time elapsed: 1252 msec
rebuild index - time elapsed: 273 msec
Press any key to continue . . .
That means that, if an app has to read and replay a million events when it starts, it won't take more than half a minute - which is acceptable.