Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

Boltzmann
Hello everyone,

I'm Bharath Chandra, and I'll be developing a persistent cache feature for
OpenSCAD as my GSoC project this year. I'm currently in the community
bonding period of the GSoC program. Selecting the right key-value data store
for the serialized Geometry objects is the first and most critical part of
my project, and it should be completed before the coding period begins
(June 1).

Link to my abstract:
https://summerofcode.withgoogle.com/projects/#5522327720165376

I have listed below a few options that I considered during my proposal;
please have a look at them and suggest the best one. If there is any other
datastore that I haven't considered and that would be a good option for this
task, please post it. The parameters I considered in selecting the
key-value datastore are: level of persistence, server type (server-based or
serverless), eviction policies, serialization, availability of a C++ client
library, community support, license, ease of use, and whether it is listed
as an MXE and Homebrew package.

1) Memcached - in-memory key-value store; server-based.
https://memcached.org/
      - libmemcached is its C++ client library, but it is not listed in the
MXE packages list.

2) Extstore - https://github.com/memcached/memcached/wiki/Extstore

3) Redis - different levels of on-disk persistence; server-based.
https://redis.io/
      - Hiredis is its C client library, but again it is not listed in the
MXE packages list.

4) UnQLite - disk-based and serverless. https://unqlite.org/

5) Protocol-buffer-style serialization libraries like Cap'n Proto, protobuf,
and FlatBuffers - in-memory, and serialization is handled by the library
itself, but that is not possible for the complex data structures that we use
for storing our geometry.

Thank you; I look forward to your responses.



--
Sent from: http://forum.openscad.org/

_______________________________________________
OpenSCAD mailing list
[hidden email]
http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org
Re: Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

doug.moen
It seems to me that OpenSCAD needs a lightweight, serverless key-value store.

Of the options you mention, only UnQLite is serverless. But UnQLite does not appear to be lightweight. It contains a Turing-complete query language, Jx9, with 312 functions, including an HTTP request parser. This seems excessive. I don't think we need a query language.

What about LevelDB? It's a persistent key-value store used by the Chrome web browser. https://github.com/google/leveldb
From the web site, it has the following limitations:
 * This is not a SQL database. It does not have a relational data model, it does not support SQL queries, and it has no support for indexes.
 * Only a single process (possibly multi-threaded) can access a particular database at a time.
 * There is no client-server support builtin to the library. An application that needs such support will have to wrap their own server around the library.

These limitations seem to align with OpenSCAD's requirements, and suggest that LevelDB might be lightweight enough to consider.

Doug Moen.

Re: Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

tp3
On 11.05.20 11:48, Doug Moen wrote:
> It seems to me that OpenSCAD needs a lightweight, serverless
> key-value store.

Why? Where is that requirement coming from?

I agree it makes sense to have that option, but so far most
cases where people have actually asked for persistent caches
came from running OpenSCAD on a web server.
Specifically, memcached and/or Redis are usually available
in that setup, e.g. AWS provides them under the
Amazon ElastiCache name.

> These limitations seem to align with OpenSCAD's requirements,
> and suggest that LevelDB might be lightweight enough to
> consider.

It can't hurt to have a look at LevelDB, but its being a Google
project brings a big pile of minus points to the table too.
Their development strategy and handling of their own open
source projects seem a bit too volatile for our needs.

That said, the main topic is not so much the backend store
but the work inside OpenSCAD to make this possible. So once
the internal structure is ready, adding a second backend is
probably no longer a big effort or risk.

ciao,
  Torsten.

Re: Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

doug.moen
Okay, I wasn't aware of the requirements. I have no idea how to run OpenSCAD on a web server, and for my own personal uses of OpenSCAD, I would not want to have to deal with administering a memcached or Redis server on my desktop machine just to get the performance benefits of better object caching. I mentioned LevelDB only as an example, because I wanted to have a discussion about the requirements and, obviously, to figure out how this change affects me as a user.

Re: Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

jon_bondy
I agree that a discussion of the requirements is important.  I, too,
would not want OpenSCAD's light weight to be altered, but maybe I
understand neither the goals of this project nor its impact very well.

Jon

Re: Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

nophead
I too would not want to install a database as a separate component; I would just want OpenSCAD to save its cache to disk. I'm not sure it even needs a database for that, just an object serialiser that is independent of memory addresses and of the OpenSCAD version, or that at least clears itself if the objects change from one version to another. That is, losing the cache each time I update OpenSCAD is not a big problem.
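
A minimal sketch of the "clears itself on a version change" idea, in Python for brevity. The stamp string, the function names and the use of pickle are assumptions made for illustration only; nothing here is taken from OpenSCAD's code. The cache file carries a format stamp next to the entries, and a reader that finds a different stamp simply starts with an empty cache.

```python
import pickle

# Hypothetical stamp: application version plus serialised-object layout.
CACHE_FORMAT = "2020.05-geom1"


def load_cache(path, expected=CACHE_FORMAT):
    """Return the stored entries, or {} if the file is missing, unreadable,
    or was written with a different format stamp."""
    try:
        with open(path, "rb") as f:
            stamp, entries = pickle.load(f)  # only load files we wrote ourselves
    except (OSError, EOFError, pickle.UnpicklingError, ValueError, TypeError):
        return {}
    return entries if stamp == expected else {}


def save_cache(path, entries, stamp=CACHE_FORMAT):
    """Write the stamp and the entries together in one file."""
    with open(path, "wb") as f:
        pickle.dump((stamp, entries), f)
```

With this layout, updating the application only requires changing the stamp; stale files are then ignored on the next load and overwritten on the next save.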

Re: Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

tp3
In reply to this post by jon_bondy
To be clear, the persistent caching is always going to be an
option, not a mandatory requirement to run external servers.
Independent of possible back-ends, the simple internal caching
will remain at least feature-wise (probably even just the
current code).

I see there are useful cases for local file-based caching too,
e.g. for people running OpenSCAD in parallel via make or so.

So in the end, we may want all 3 possibilities:

1) trivial run-time cache as it is now
2) local file based cache
3) server based cache for hosted installations

For the GSoC project, we'll focus on only one of 2) or 3)
and we need to decide soon, so we can provide the framework
for the core topics that need to be covered.
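
One way to picture such a framework is a small back-end interface that the
three variants share. The sketch below is in Python for brevity; all class
and function names are hypothetical, and it only covers variants 1) and 2).
A server-based back-end for 3) would implement the same two methods on top
of whatever client library is chosen.

```python
import hashlib
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Callable, Optional


class CacheBackend(ABC):
    """Hypothetical interface shared by all cache back-ends."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the stored bytes, or None on a miss."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        """Store value under key, replacing any previous entry."""


class MemoryBackend(CacheBackend):
    """1) Run-time cache: a plain dict that dies with the process."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class FileBackend(CacheBackend):
    """2) Local file-based cache: one file per entry, named by key hash."""

    def __init__(self, directory) -> None:
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        return self._dir / hashlib.sha256(key.encode("utf-8")).hexdigest()

    def get(self, key):
        try:
            return self._path(key).read_bytes()
        except FileNotFoundError:
            return None

    def put(self, key, value):
        self._path(key).write_bytes(value)


def cached_render(backend: CacheBackend, key: str,
                  render: Callable[[], bytes]) -> bytes:
    """Look up key; on a miss, call render() and store its result."""
    value = backend.get(key)
    if value is None:
        value = render()
        backend.put(key, value)
    return value
```

The calling code only sees get and put, so switching between back-ends
would be a configuration choice rather than a code change. This FileBackend
does no locking and no atomic writes, so it is not yet safe for parallel
runs.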

We have to start somewhere.

ciao,
  Torsten.

Re: Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

bedders
SQLite is self-contained and lightweight.

Regards

Richard

Re: Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

tp3
In reply to this post by nophead
On 11.05.20 15:06, nop head wrote:
> Just an object serialiser that is memory address and
> OpenSCAD version independent, or at least clears itself
> if the objects change from one version to another.

It needs to store things in a transactional way, so it's
not destroying the storage if multiple processes access
at the same time, or the accessing application is killed.
Also we need to find the stored data again quickly.

That has all the core properties of a database (ACID).
We don't need a complicated query language, as Doug
already pointed out, but that's an additional feature.
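
As an illustration of what "transactional" buys here, the following is a
small sketch using Python's built-in sqlite3 module. The table layout and
function names are assumptions for this example, not a proposed design. A
batch of writes is either stored completely or not at all, and lookups go
through the primary-key index.

```python
import sqlite3


def open_store(path):
    """Open (or create) a one-table key-value store."""
    # timeout: how long to wait if another process holds the write lock
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value BLOB)"
    )
    conn.commit()
    return conn


def put_many(conn, items):
    """Store all (key, value) pairs, or none of them if anything fails."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)", items
        )


def get(conn, key):
    """Primary-key lookup; returns None on a miss."""
    row = conn.execute(
        "SELECT value FROM cache WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else row[0]
```

If put_many fails part-way through, the rows written so far are rolled
back, so readers never see a half-written batch. A process that is killed
in the middle of a transaction is covered in a similar way: SQLite discards
the incomplete transaction the next time the file is opened.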

Even the DBM library, released in 1979, was called a database
(https://en.wikipedia.org/wiki/DBM_(computing)), so having
a server is not required to have a database. That's just
the convention for big centralized storage in enterprise
environments. Nowadays lots of programs like Firefox and
Quassel (an IRC client) use embedded databases like SQLite,
which even supports SQL as a query language.

Looking through information about LevelDB, I found mention of
LMDB (Lightning Memory-Mapped Database), which is part
of OpenLDAP. This seems like an interesting option too.

ciao,
  Torsten.

Re: Help me in choosing the perfect key-value datastore for persistent cache (GSOC 2020)

Boltzmann
After revisiting the suggested options, I have formed some opinions on
LevelDB, SQLite, and LMDB.

Let's start with SQLite.
pros: 1) Serverless, 2) Completely disk-based (persistence)
cons: 1) No built-in support for eviction policies (we would need to write
them), 2) Definitely lower performance than the existing cache.
In conclusion, we get persistence in a serverless mode in exchange for
performance.

The case is the same with LevelDB, but it has built-in support for eviction
and better performance than SQLite.

LMDB has no built-in support for eviction, but it shows better performance
than LevelDB and SQLite (in its own benchmark tests):
http://www.lmdb.tech/bench/microbench/
The performance of LMDB or SQLite may decrease, because we would need to
write a new add function that also takes care of eviction.
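
To make the cost of such an add function concrete, here is a rough sketch
of least-recently-used eviction on top of SQLite, written with Python's
built-in sqlite3 module. The class name, the table layout and the
entry-count limit are assumptions for illustration; a real cache would more
likely limit total size in bytes.

```python
import sqlite3


class EvictingCache:
    """Key-value cache on SQLite that drops least-recently-used entries."""

    def __init__(self, path, max_entries):
        self.max_entries = max_entries
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            " key TEXT PRIMARY KEY, value BLOB, used INTEGER)"
        )
        # Index so that ordering by recency does not scan the whole table.
        self.conn.execute(
            "CREATE INDEX IF NOT EXISTS cache_used ON cache (used)"
        )
        self.conn.commit()

    def _tick(self):
        """Next value of a logical clock: one more than the newest entry."""
        (newest,) = self.conn.execute(
            "SELECT COALESCE(MAX(used), 0) FROM cache"
        ).fetchone()
        return newest + 1

    def add(self, key, value):
        """Insert or overwrite an entry, then evict down to max_entries."""
        stamp = self._tick()
        with self.conn:  # insert and eviction commit together
            self.conn.execute(
                "INSERT OR REPLACE INTO cache (key, value, used)"
                " VALUES (?, ?, ?)",
                (key, value, stamp),
            )
            self.conn.execute(
                "DELETE FROM cache WHERE key NOT IN ("
                " SELECT key FROM cache ORDER BY used DESC LIMIT ?)",
                (self.max_entries,),
            )

    def get(self, key):
        """Return the value (or None) and mark the entry as recently used."""
        row = self.conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        stamp = self._tick()
        with self.conn:
            self.conn.execute(
                "UPDATE cache SET used = ? WHERE key = ?", (stamp, key)
            )
        return row[0]
```

Every add runs three statements instead of one, and every get also writes,
which is where the expected slowdown would come from. How large it is in
practice would have to be measured.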

As this project is useful for web server use cases, and maintaining a
server-based caching system is quite common for web applications, the
real parameters we need to consider are performance and persistence.