Scylla is an open-source distributed NoSQL data store. It was designed to be compatible with Apache Cassandra while achieving significantly higher throughputs and lower latencies. It supports the same protocols as Cassandra (CQL and Thrift) and the same file formats (SSTable), but is a completely rewritten implementation, using the C++14 language replacing Cassandra's Java, and the Seastar asynchronous programming library replacing threads, shared memory, mapped files, and other classic Linux programming techniques.
Scylla uses a sharded design on each node, meaning that each CPU core handles a different subset of data. Cores do not share data, but rather communicate explicitly when they need to. The Scylla authors claim that this design allows Scylla to achieve much better performance on modern NUMA SMP machines, and to scale very well with the number of cores. They have measured as much as 2 million requests per second on a single machine, and also claim that a Scylla cluster can serve as many requests as a Cassandra cluster 10 times its size, and do so with lower latencies.
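The per-core ownership described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Scylla's actual partitioning or messaging code: the names `Shard`, `Node` and `owner_of` are hypothetical, a SHA-256 hash stands in for whatever hash function the real system uses, and plain Python objects stand in for CPU cores.

```python
import hashlib
from collections import deque


def owner_of(key, num_shards):
    """Map a key to exactly one shard (hypothetical hash choice, for illustration)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Shard:
    """Stands in for one core: private data plus an inbox of explicit messages."""

    def __init__(self, shard_id):
        self.shard_id = shard_id
        self.data = {}        # touched only by this shard, never shared
        self.inbox = deque()  # other parts of the node communicate via messages

    def drain(self):
        """Process every queued message and return the list of replies."""
        replies = []
        while self.inbox:
            op, key, value = self.inbox.popleft()
            if op == "put":
                self.data[key] = value
                replies.append(("ok", key))
            elif op == "get":
                replies.append(("value", self.data.get(key)))
        return replies


class Node:
    """A node with a fixed number of shards; requests are routed to the owning shard."""

    def __init__(self, num_shards):
        self.shards = [Shard(i) for i in range(num_shards)]

    def send(self, op, key, value=None):
        shard = self.shards[owner_of(key, len(self.shards))]
        shard.inbox.append((op, key, value))
        return shard.drain()[-1]
```

Because every key has exactly one owner, no locking is needed around `Shard.data` in this model; the cost is that a request arriving at the wrong place has to be forwarded as a message.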
ArangoDB is a multi-model database.
Multi-Model: Documents, graphs and key-value pairs; model your data as you see fit for your application.
Joins: Conveniently join what belongs together for flexible ad-hoc querying and less data redundancy.
Transactions: Easy application development that keeps your data consistent and safe, with no hassle in your client.
One way to import data into ArangoDB is to use the arangoimp command-line tool. arangoimp allows you to import data records from a file into an existing database collection.
ArangoDB provides scalable, highly efficient queries when working with graph data. The database uses JSON as a default storage format, but internally it uses ArangoDB's VelocyPack, a fast and compact binary format for serialization and storage. ArangoDB can natively store a nested JSON object as a data entry inside a collection. Therefore, there is no need to disassemble the resulting JSON objects.
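As a rough illustration of the two points above (file-based import and nested JSON entries), the following sketch writes nested documents to a file with one JSON object per line. The helper name `write_json_lines`, the file name, and the sample records are made up for this example. A file like this could then be handed to arangoimp, typically with something like `arangoimp --file users.json --type json --collection users`; verify the flags against the version you have installed.

```python
import json


def write_json_lines(path, records):
    """Write one JSON document per line; nested objects are kept as they are."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")
    return len(records)


# Hypothetical sample data: each document nests an address object and a list.
users = [
    {"_key": "alice", "name": "Alice",
     "address": {"city": "Berlin", "tags": ["home", "billing"]}},
    {"_key": "bob", "name": "Bob",
     "address": {"city": "Cologne", "tags": ["office"]}},
]
```

Note that the nested `address` object is written as-is; nothing is flattened into separate columns or tables before import.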
HyperDex is a next-generation key-value and document store with a wide array of features. HyperDex's key features (namely its rich API, strong consistency, fault tolerance, support for the MongoDB API, and ease of use) provide strong guarantees to applications that are not matched by other NoSQL systems. HyperDex is an open-source distributed data store. In the NoSQL data store space, HyperDex distinguishes itself by offering high performance, a rich API, ACID transactions that span multiple objects, and strong consistency and fault-tolerance guarantees.
Features:
Distributed: Data is distributed across the cluster, without a single point of failure.
Flexible Data Model: HyperDex can act both as a key-value datastore and a document store, supporting unstructured (schema-free), semi-structured, and structured (schema-based) data.
High-Performance: Next-generation replication and query protocols enable HyperDex to process operations with minimal overhead.
Scalability: Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications.
Fault-tolerant: Data is automatically replicated across multiple servers to tolerate a user-specified number of concurrent failures. Failed nodes can be replaced with no downtime.
Strong consistency: HyperDex guarantees that every GET returns the result of the latest PUT. There are no complicated consistency models to learn or programming quirks, such as conflict resolution, to master.
Multikey transactions: HyperDex supports ACID transactions that span any number of objects.
RethinkDB is the first open-source, scalable JSON database built from the ground up for the realtime web. It inverts the traditional database architecture by exposing an exciting new access model – instead of polling for changes, the developer can tell RethinkDB to continuously push updated query results to applications in realtime. RethinkDB’s realtime push architecture dramatically reduces the time and effort necessary to build scalable realtime apps.
Open-source database for building realtime web applications
NoSQL database that stores schemaless JSON documents
Distributed database that is easy to scale
High availability database with automatic failover and robust fault tolerance
In addition to being designed from the ground up for realtime apps, RethinkDB offers a flexible query language, intuitive operations and monitoring APIs, and is easy to set up and learn.
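The difference between polling and the push model described above can be sketched with a toy in-memory table. This is not the RethinkDB driver API: the `Table` class and its methods are hypothetical stand-ins, and the shape of the change record (an old value and a new value) is used here only as an illustration of what a pushed update might carry.

```python
class Table:
    """Toy table that pushes each change to subscribers instead of being polled."""

    def __init__(self):
        self.rows = {}
        self.subscribers = []

    def changes(self, callback):
        """Register a callback that is invoked for every subsequent write."""
        self.subscribers.append(callback)

    def insert(self, key, doc):
        """Store a document and push the change to every subscriber."""
        old = self.rows.get(key)
        self.rows[key] = doc
        change = {"old_val": old, "new_val": doc}
        for callback in self.subscribers:
            callback(change)


# Usage: the application registers interest once and never polls.
received = []
scores = Table()
scores.changes(received.append)
scores.insert("player1", {"score": 1})
scores.insert("player1", {"score": 2})
```

After the two inserts, `received` holds two change records, delivered in write order, without the application ever re-running a query.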
The query-response database access model works well on the web because it maps directly to HTTP’s request-response. However, modern applications require sending data directly to the client in realtime. Use cases where companies benefited from RethinkDB’s realtime push architecture include:
What is LMDB? Lightning Memory-Mapped Database (LMDB) is a software library that provides a high-performance embedded transactional database in the form of a key-value store. LMDB is written in C with API bindings for several programming languages.
LMDB is a Btree-based database management library modeled loosely on the BerkeleyDB API, but much simplified. The entire database is exposed in a memory map, and all data fetches return data directly from the mapped memory, so no malloc's or memcpy's occur during data fetches. As such, the library is extremely simple because it requires no page caching layer of its own, and it is extremely high performance and memory-efficient. It is also fully transactional with full ACID semantics, and when the memory map is read-only, the database integrity cannot be corrupted by stray pointer writes from application code.
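To make the memory-map idea concrete, here is a minimal sketch using Python's standard `mmap` module rather than LMDB itself. The record format (`\nkey=value`), the file layout, and the function names are assumptions invented for this example. One caveat: Python slices of a mapping copy the bytes, whereas the text above describes LMDB returning data directly from the mapped memory.

```python
import mmap


def write_records(path, records):
    """Write (key, value) byte pairs as '\\nkey=value' records (toy format)."""
    with open(path, "wb") as fh:
        for key, value in records:
            fh.write(b"\n" + key + b"=" + value)


def lookup(path, key):
    """Search the mapped file directly; no read() into an application buffer.

    The file must be non-empty, because an empty file cannot be mapped.
    """
    needle = b"\n" + key + b"="
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            pos = mapped.find(needle)
            if pos == -1:
                return None
            start = pos + len(needle)
            end = mapped.find(b"\n", start)
            if end == -1:
                end = len(mapped)
            return mapped[start:end]  # this slice is a copy in Python


def write_is_rejected(path):
    """Show that a read-only mapping refuses stray writes."""
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            try:
                mapped[0:1] = b"X"
            except TypeError:
                return True
            return False
```

The operating system's page cache does the caching here, which is the same reason the text gives for LMDB needing no page caching layer of its own.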
The library is fully thread-aware and supports concurrent read/write access from multiple processes and threads. Data pages use a copy-on-write strategy so no active data pages are ever overwritten, which also provides resistance to corruption and eliminates the need for any special recovery procedures after a system crash. Writes are fully serialized; only one write transaction may be active at a time, which guarantees that writers can never deadlock. The database structure is multi-versioned so readers run with no locks; writers cannot block readers, and readers don't block writers.
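The combination of copy-on-write, a single serialized writer, and lock-free readers can be sketched in a few lines. This is a toy model, not LMDB's implementation: the `CowStore` class is hypothetical, and it copies the whole dictionary on each write, whereas the text above describes copying at the level of data pages.

```python
import threading
from types import MappingProxyType


class CowStore:
    """Toy multi-version store: writers publish new versions, readers keep old ones."""

    def __init__(self):
        self._root = {}                      # committed version, never mutated in place
        self._write_lock = threading.Lock()  # only one write transaction at a time

    def snapshot(self):
        """Readers take no lock; they get a read-only view of the current version."""
        return MappingProxyType(self._root)

    def write(self, updates):
        """Copy the current version, apply the updates, then publish the new root."""
        with self._write_lock:
            new_root = dict(self._root)  # copy; the old version stays intact
            new_root.update(updates)
            self._root = new_root        # a single reference swap publishes it


# Usage: a reader's snapshot is unaffected by later writes.
store = CowStore()
store.write({"a": 1})
old_view = store.snapshot()
store.write({"a": 2, "b": 3})
new_view = store.snapshot()
```

Here `old_view` still shows the state from before the second write, while `new_view` shows the latest committed version. Since there is only one lock and writers never wait on readers, a writer cannot deadlock in this model.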
Unlike other well-known database mechanisms which use either write-ahead transaction logs or append-only data writes, LMDB requires no maintenance during operation. Both write-ahead loggers and append-only databases require periodic checkpointing and/or compaction of their log or database files; otherwise they grow without bound. LMDB tracks free pages within the database and re-uses them for new write operations, so the database size does not grow without bound in normal use.
LMDB uses memory-mapped files, giving much better I/O performance.
LMDB works well with really large datasets. HDF5 files, by contrast, are always read entirely into memory, so no single HDF5 file can exceed your memory capacity. You can easily split your data into several HDF5 files, though (just put several paths to h5 files in your text file).
Aerospike is a key-value, in-memory, operational NoSQL database with ACID properties that supports complex objects and is easy to scale. But I have already used something which does absolutely the same.
Aerospike is designed to be the premier high speed, scalable, and reliable NoSQL database. Every line of Aerospike code, every architectural decision focuses on high performance and easy scaling and operations:
Indexes in RAM
Threaded transaction models
Cache-line optimization transaction and data replication
Seamless auto-rebalance scaling
Use of Linux
Storage and failover reliability.
Aerospike’s distributed Shared-Nothing NoSQL database architecture is designed and built to reliably store data with high availability.