D_store

overview | tutorial | download | API | misc | faq | license

D_store is an in memory database. A good mental model for d_store is a programmatic spreadsheet; there are rows and rows contain fields with values.

D_store is mostly row oriented, all stores represent either all the rows in the spreadsheet or a subset thereof (or the empty set). The query powers of d_store come from the set operations you can do with stores, like and, or, sub, etc.

Why D_store ?

D_store is good for storing highly related data and performing queries on that data. It can however only scale to the amount of memory available. But, this fits very well for GUI applications that let users interact with a lot of interrelated data; d_store is ment as an enabler of rich applications like iTunes, or addressbooks, or bookmark managers etc.

D_store supports tagging, full text searching, hierarchies, and almost any relation you can define on your data. The API provides very direct access to the data, unlike, for instance SQL engines, but it still has the query power as SQL.

In d_store, all data is stored as values, but every value is stored only once per store, which is good for memory usage. And this is where the second part of the querying power of d_store comes from: every value knows in which rows it appears.

A last powerful concept that d_store has is that everything operates on stores, and the results sets are always stores. So when you get the results of a query, you can query that result again and again, refining the results further and further.

D_store is written in plain C and released under the GPL.

Skinny dipping the API

#include <d_store/d_store.h>
...
// (1)
d_store * store = d_store_create(NULL);

// (2)
d_insert(store, D_INT(1), "name", "test person one", NULL);
d_insert(store, D_INT(1), "tel", "+555323323", NULL);
d_insert(store, D_INT(1), "tel", "+031299288", NULL);
d_insert(store, D_INT(2), "name", test person two", NULL);

// (3)
d_store * all_tel = d_select(store, 2, D_STRING("tel"));
d_store * tel = d_nodup(all_tel, 1);
d_store_free(all_tel);

// (4)
d_row * row;
d_row * name;
for (row = d_first(tel); row; row = d_next(tel)) {
    name = d_select_first(store, 1, d_row_get(row, 1));
    printf("person: %s\n",
        D_TOSTRING(d_row_get(name, 3)));
}
d_store_free(tel);

// (5)
d_store_free(store);
...
  1. We create a d_store with a new database. Passing an existing d_store to d_store_create() creates a d_store for the same database, passing in NULL creates a new database.
  2. Then we insert some data in the database. All variable length functions will automatically convert strings to d_values, except for the empty string (""). However, it is good practice to use the D_STRING marco anyhow, to create d_values that contain strings.
  3. We create a sub store which contains all telephone numbers, using d_select. As one person can have more the one telephone number, we remove all instances where we have the same person twice, using the nodup operation. As we no longer need the all_tel store, so we free it using d_store_free().
  4. We iterate the tel store using d_first and d_next as iterators. Inside the loop we fetch the d_row representing the name to which the telephone number belongs, and we print it. Notice we don't free any values or rows. We do free the tel store when we are done with it.
  5. We free the last last d_store belonging to a database, will release all memory associated with the database.

We never freed any value we gave to the database, this is because the macros D_INT and D_STRING create new d_values using d_value_from_*_oneshot() which tells d_store that this value is under his control, and can be used as it sees fit.

Also we never freed any d_row or d_value we get from the database, because these keep being under the d_store's control.

As you can see, d_store tries to be a really friendly API, but also a very direct API. See API for more.

Status and Roadmap

D_store is currently not finished, so you should regard it as beta. The 0.2 release is however a stable release, and finished enough to be useful.

The plan is that by publishing d_store now, I can get some feedback on the API. If needed, any next 0.3 release will change and enhance the API. In the mean time, the 0.2 release will be kept stable and bug-fixed.

Because d_store is far from finished, during 0.2 and 0.3 releases, I'll be working on the unstable 0.4 and further releases, which should bring the following, in the order they will likely be implemented.

Only after a file based backend and error management can the API be considered to approach a 1.0 release. But as I'm doing this in my free time, it might take some time before any of this is implemented.

The 0.2 release is however stable and has all the features needed to be used.

Some numbers

A sample application (maemo-todo) with a database of 5000 todo elements of which around 1/3 are tagged with multiple tags, takes just above 2Mb of memory for d_store. And it works just fine on a Nokia 770. (Except for a bug in gtk treeview, that always draws all rows always, even if they are invisible, it works fine with gtk 2.10.)

Scaling up to 250.000 (quarter million) rows will consume just below 100Mb of RAM in the todo application example. However I've only tried it with a 130.000 rows, eating just below 50Mb but the app stays fully functional, including the searching or the tagging. (On a desktop PC.)

So depending on what your application will store per row, and how interrelated your data is (the more related, the better) and what kind of environment you target. Storing over 10.000 rows could still work on a Nokia 770 (The device has 64 Mb or ram) and storing until 200.000 rows should work just fine on a normal desktop PC.

In the current design, I would guess, its performance will degrade with over half a million rows, however, I'm confident that a 1.0 release should support a million rows still.

But this is just me doing some wet finger estimations, I can be way off.

Last modified: 2007-11-19 20:15 GMT