newsql

Get the best of both worlds

newsql

SQL or NoSQL? That's a question having been asked by many developers. With SQL, we get the ACID (atomicity, consistency, isolation, durability) benefits but the data model can be rather rigid. As to NoSQL, its data model is very flexible at the cost of losing the ACID characteristics.

When we look at the real world applications, we may find their data models are usually a mix of both. That is a data model is usually composed of a set of "core" properties which are shared by all data instances and there are "variable" properties which are owned by just some of the data instances. Those variable properties could spread out like a long tail. With that observation, it's easy to see why we'll never find the "right" answer by going either way.

The good news is you may now have a solution. newsql allows you to store data with properties not defined as table columns. You can even query on those "undefined" properties. That will give you the benefits of NoSQL. On the other hand, newsql still exhibits the ACID characteristics and transactions are supported which are not available for NoSQL. Better yet, you can index any "undefined" properties whenever necessary. There are no limitations on how many indexes you can put on a table (or collection) as most NoSQL databases have difficulties in offering them.

What's New

Detailed info of each release is described in release notes. Below are some highlights:

  • If table columns are specified in the SQL exprssion to the insert() or update() function call, the input data will be filtered using the table columns specified before being written to the database (0.1.2).

  • Full support of query objects (0.1.1).

  • Some APIs have been revised to make them even easier to use. As a result, v 0.1.0 will not be compatible with earlier releases.

  • newsql can now correctly access your existing mySQL tables. If you try to update your existing tables with newsql, newsql will automatically convert your table to be newsql enbaled. You can turn this feature off (0.0.5).

  • newsql can do something most (if not all) NoSQL databases can not do: indexing properties of documents whenever you need to. newsql provides a indexProperty() function to index a property when a performance boost is needed (0.0.4).

Install

newsql uses mySQL as the underlying database engine. Throughout this document, we'll assume you have installed the mySQL database. With mySQL up and running, you can add the newsql features:

npm install newsql

Documentation

There is a developer guide for SQL developers.

Contents

For NoSQL Developers

If you're a NoSQL developer and weary of maintaining data consistency in applications by yourself, newsql could be a perfect solution for you.

The way to program newsql is quite similar to program NoSQL (such as MongoDB). You can create a new collection by calling newsql.createCollection(). Below is the sample code:

var  newsql = require('newsql');

newsql.createCollection('colName', function(err) {
    if (err)
        // something went wrong
    else
        // a data collection has been created
});

where 'colName' is the name of your data collection.

To remove a collection is just as simple:

newsql.dropCollection('colName', function(err) {
    if (err)
        // something went wrong
    else
        // the 'colName' collection has been removed
});

To insert a document to a collection, you can

newsql.insert('colName', document, function(err, docKey) {
});

where 'document' is a JSON object containing data to be inserted.

For all query operations (find, insert, update, and delete), please refer to the APIs section for details.

Underneath newsql is a relational database (mySQL), so you'll have all the benefits of relational databases. On the other hand, newsql adds a layer on top of mySQL so you can use if as if it's a NoSQL database. You can even use SQL statements directly if you want. Also, a plethora of DBMS clients or management tools are available to help you manage or inspect your databases.

For SQL Developers

If you're SQL developers, you could use newsql as a wrapper around databases so you don't have to hand-code SQL statements. What's more, it allows you to store and retrieve properties which have not been defined as table columns. Later on, if those "undefined" properties are popular enough, you can "raise" those properties to be table columns. newsql can automatically perform the necessary data migration for you.

You can use newsql to access your existing tables. Assuming you have a table named 'Person', you can find all persons with age equal to 25 by:

newsql.find('Person', {age: 25}, function(err, list) {
    // list is an array containing people of age 25
});

Or you want to find out all persons with age greater than 25. Then you can:

newsql.find('Person', {age: {op: '>', value: 25}}, function(err, list) {
	// list is an array containing people older than 25
});

Inserting entries to a table is just as simple:

var  data = {
               name: 'Andy',
               age: 32
            };
            
newsql.insert('Person', data, function(err, pk) {
    // pk is an object with primary keys and values of the
    // newly inserted entry.
});

Up to this point, newsql is like a handy tool to save you from hand-coding sql statements, but it's not what newsql is about. The power of newsql is to accommodate data properties not defined as columns.

Assuming our examplary 'Person' table is defined as below:

CREATE TABLE Person (
    id  bigint not null auto_increment,
    name  varchar(32),
    age   smallint,
    primary key (id)
) engine = InnoDB;

You're happy with its schema until someday you find out there is one out of a few thousands data instances that has a property named 'salary'. To save person's salary you may consider to add a new column named 'salary':

ALTER TABLE ADD salary int;

Ok. That seems to solve the problem, but as your application grow more and more pupular you find out a few persons have a 'hobby' property. What would you do this time? Add another table column? You may but you also realize keep adding table columns is not a good solution and one day your table will be full of columns which are rarely used and you may even forget what they are about.

This is the time when newsql will be helpful. newsql allows you to save properties which are not defined as table columns. Assuming you keep the 'Person' table schema intact, you can do the following:

newsql.update('Person', {hobby: 'jogging'}, {id: 1}, function(err) {
});

to add a "jogging" hobby for person #1. You can even query on this "undefined" hobby proeprty:

newsql.find('Person', {hobby: 'jogging'}, function(err, psnList) {
    // psnList contains persons whose hobby is 'jogging'
});

If it turns out the "hobby" property becomes popular and are queried frequently, you can also index it for better performance:

var  colSpec = {
         type: 'string',
         maxLength: 64
     };
     
newsql.indexProperty('Person', 'hobby', colSpec, function(err) {
});

The above example actually defines a new column 'hobby' to the 'Person' table. What is more, newsql will move data to this new column and index them.

Setup and configure

Before newsql do the magical things for you, you have to configure it to talk to the database. Beneath newsql is the mySQL DBMS, so you have to setup a mySQL database and configure newsql to work with that database.

There are two ways to specify the database configuration: using a config file or doing it programmatically.

<a name="configFile"">

The config.json file

In the newsql package root directory, there is a config.json file to specify database connection parameters. It looks like the following:

{
	"dbConfig": {
		"host"     : "127.0.0.1",
		"database" : "mySample",
		"user"     : "your_acc_name",
		"password" : "your_passwd",
		"supportBigNumbers" : true,
		"connectionLimit"   : 32
	},
    "autoConvert": true
}

where host is the database host and database is the database name. user and password are the database user name and password respectively. newsql will automatically turn on the connection pool for better performance.

autoConvert is a flag to instruct newsql if it should automatically convert a plain SQL table to be newsql enabled. If you don't want newsql to do that, you can set autoConvert to false. However, by setting autoConvert to false you'll not be able to save undefined properties to your existing tables (you can still save undefined properties to tables created by newsql). autoConvert is default true.

<a name="configPro"">

Configure programmatically

You can configure the database connection settings right inside your node program. Here is how:

var  newsql = require('newsql');
var  options = {
        dbConfig: {
            "host"     : "127.0.0.1",
            "database" : "mySample",
            "user"     : "your_acc_name",
            "password" : "your_passwd",
            "supportBigNumbers" : true,
            "connectionLimit"   : 32
        },
        "autoConvert": true
     };

newsql.config( options );

SQL Expressions

You can use SQL expressions to instruct newsql how to talk with databases. With SQL expressions, you can compose and reuse SQL queries in a clean and managable way. In essence, SQL expressions are nothing more than SQL statements encoded as a JSON object. An example should help to understand what is a SQL expression:

var  expr = newsql.sql('Person')
                  .column(['id', 'addr AS address', 'age'])
                  .filter( {name: 'age', op: '>='} )
                  .extra( 'ORDER BY id' );

The above sample code just constructed a SQL expression. You can use it to do a database query:

var  cmd = {
	    op: 'list',
	    expr: expr
     },
     query = {age: 18};

newsql.execute(cmd, query, function(err, list) {
	// 'list' is the query result
});

That's equivalent to:

SELECT id, addr AS address, age
FROM Person
WHERE age >= 18;

"Well, that's nice but what's the befenit?" you may ask. The magic is you can use the same SQL expression in update:

var  cmd = {
        op: 'update',
        expr: expr
     };
     
newsql.execute(cmd, {canDrive: true}, {age: 18}, callback);

Actually, the same SQL expressions can be used in all CRUD operations. newsql is smart enough to retrieve the needed information from a SQL expression and compose the SQL statement you want.

Assuming you're satisfied, below is how to construct a SQL expression: newsql.sql(tableName) takes a table name as its input and returns a SQL Expression object. With that object, you can add columns, set query conditions and specify addtional options. Most SQL expression functions will return the expression object itself, so you can chain funcion calls such that SQL expressions can be composed succintly.

APIs

Below explains the newsql APIs.

SQL expression APIs

newsql.sql(tableName)

This function returns a SQL expression. tableName is the name of a table.

Example:

var  expr = newsql.sql('myTable');

expr.join(joinExpr)

With the SQL expression obtained from the soar.sql() funciton call, you can use its join() function to specify table joins.

Example:

var  expr = newsql.sql('myTable AS myT')
                  .join({
                      table: 'Location AS loc', 
                      onWhat: 'myT.locID=loc.locID'
                   });

If you want to make multiple joins, just call join() as many times as you need. The parameter to the join() function call is a plain JSON object with the following properties:

  • table: name of the joined table.
  • type: if you want to make a left join, you can set this property to 'LEFT'.
  • onWhat: the join clause. If the use property described below is specified, this property will be ignored.
  • use: the common column name to join two tables.

expr.column(column)

This function can be used to add table columns to a SQL expression. To add a single column, the parameter is the name of the column. If you want to add multiple columns, the parameter should be an array of column names.

Example:

var  expr = newsql.sql('Person')
                  .column(['name', 'age', 'weight']);

expr.filter(filter)

This function is used to set query conditions (filter) of a SQL expression. The parameter to the function call is a plain JSON object with the following properties:

  • name: name of the filter. It's also used as the key to retrieve the query value from a query object. This property is required.
  • field: the real column name in a table. If this property is missing, the name property will be used instead.
  • op: what comparator to be used. It can be '>', '=' or 'IS NULL', etc.
  • noArg: when a query operation does not require argument (e.g. IS NULL), this property should be set to true.

Note that this function should be called just once for a SQL expression. When called multiple times, the new setting will replace the old one.

Example:

var  expr = newsql.sql('Person')
                  .filter({name: 'age', op: '>='});

newsql.chainFilters(op, filters)

If you want to make a compound filter (ANDed or ORed filters), this is the function you need. op should be 'AND' or 'OR', and filters is an array of filters.

Example:

var  orFilters = newsql.chainFilters('OR', [
        {name: 'region', op: '='},
        {name: 'age', op: '>'}
     ]);
     
var  expr = newsql.sql('myTable')
                  .filter( orFilters );

The resulting filter (orFilters) is a compound filter ORing two filters (region and age).

expr.extra(extra)

This function can add extra options to a SQL statement. extra is a string with possible values like 'GROUP BY col_name' or 'ORDER BY col_name'.

Example:

var  expr = newsql.sql('myTable')
                  .extra('ORDER BY region');

Data manipulation APIs

find(expr, query, cb)

expr can be the table name (collection name) or a SQL expression which can be built by newsql.sql(). query is the actual value to be applied to the query condition. cb(err, list) is a callback function which receives an error (if any) and an array of returned data.

Example:

var  expr = newsql.sql('myTable')
                  .column('name')
                  .filter({name: 'age', op: '>'});
                  
newsql.find(expr, {age: 18}, function(err, list) {
	// list will contain people whose age is greater than 18
});

In the above example, we use a SQL expression to compose a query which is almost the same as the following SQL statement:

SELECT name FROM myTable WHERE age > 18;

The above example can be programmed in a more concise way:

var  query = {age: {op: '>', value: 18}};

newsql.find('myTable', query, function(err, list) {
	// list will contain people whose age is greater than 18
});

This time, every table columns will be returned instead of just the 'name' column. Also, we use a query object to specify the query condition. To fully explore the features of query objects, please refer to this short article.

findOne(expr, query, cb)

If you exepct your query should return just one data instance, you can use findOne() instead of find() . expr can be the table name (collection name) or a SQL expression which can be built by newsql.sql(). query is the actual value to be applied to the query condition. cb(err, list) is a callback function which receives an error (if any) and an array of returned data.

Example:

var  expr = newsql.sql('myTable')
                  .column('name');
                  
newsql.findOne(expr, {age: 18}, function(err, data) {
	// data is an object of people whose age is equal to 18
});

In the above example, we use a SQL expression to compose a query which is almost the same as the following SQL statement:

SELECT name FROM myTable WHERE age = 18;

insert(expr, data, cb)

The insert() function can insert an entity to a table or add a document to a collection (depending on you view it as a SQL or NoSQL operation). expr can be the table name (collection name) which a new entry will be inserted into or a SQL expression which can be built by newsql.sql(). data is a plain JSON object containing data to be inserted. cb(err, entityKey) is a callback function which recevies an error (if any) and an entityKey object. entityKey is the object of table's primary keys and their values from the newly inserted entity. For NoSQL collections, entityKey should look like {id: docID} where docID is a serial number for the inserted document.

Example:

var  data = {
             name: 'David',
             dob: '1988-12-05',
             skill: ['node.js', 'Java']
             };
             
newsql.insert('Person', data, function(err, pk) {
    if (err)
        console.log( err.stack );
    else {
        // a person added, see what's the primary key value
        // of this newly added entry:
        console.log( JSON.stringify(pk, null, 4) );
    }
});

update(expr, data, query, cb)

The update() function can update a table entity or a doument in a collection. expr can be either the table name (collection name) of the entry to be updated or a SQL expression which is built by newsql.sql(). data is a plain object containing update data. query is the actual value to be applied to the query condition. cb(err) is a callback function which receives an error object (if errors occurred).

Example:

var  data = {status: 'health check'},
     query = {weight: {op: '>', value: 300}};
     
newsql.update('Person', data, query, function(err) {
    if (err)
        console.log( err.stack );
    else
        // update successfully
});

The above example is similar to the following SQL statement:

UPDATE Person SET status='health check'
WHERE weight > 300;

del(tbName, query, cb)

The del() function can delete table entities or documents. expr can be the table name (collection name) whose data will be deleted or a SQL expression which can be built by newsql.sql(). query is the actual value to be applied to the query condition. cb(err) is a callback function which receives an error object (if errors occurred).

Example:

var  query = {status: 'closed'};
     
newsql.del('PurchaseOrder', query, function(err) {
    if (err)
        console.log( err.stack );
    else
        // delete successfully
});

The above example is equivalent to the following SQL statement:

DELETE FROM PurchaseOrder
WHERE status = 'closed';

execute(cmd, data, query cb)

Besides the find(), insert(), update(), and del() functions, you can also use execute() to perform any of the above CRUD operations. Actually, find(), insert(), update(), and delete() are just wrappers which call the execute() function.

The data parameter is a JSON object which contains data to be inserted or updated to a table (or documents). The query parameter is a JSON object which specifies actual query values. cmd is a command object to the execute() function. It has the following properties:

  • op: specifies which CRUD operations will be performed. It should be one of the following: 'query', 'list', 'insert', 'update' and 'delete'.
  • expr: a SQL expression which can be built by the newsql.sql() function.
  • conn: a database connection object. You usually don't have to specify this property unless you want to do transactions.

If the data parameter is not needed (for example, query, list and delete), the function can be simplified to execute(cmd, query, cb).

cb is the callback function which receives an error (if any) and sometimes a result object (when it's a query, list or insert operation).

Schema management APIs

createTable(schema, cb)

You can use newsql to crerate a table right inside your program as:

newsql.createTable(schema, function(err) {
    if (err)
        // something went wrong
    else
        // a data collection has been created
});

where schema is a JSON object to describe the table schema. For details about how to define a table schema, you can refer to the schema notation.

alterTable(schema, cb)

You can also alter your table schema like the following:

newsql.alterTable(schema, function(err) {
    if (err)
        // something went wrong
    else
        // table schema modified
});

Again, the schema parameter is a JSON object to describe the table schema. The schema notation article will show you how to create them.

dropTable(tableName, cb)

Dropping a table is just as easy:

newsql.dropTable(tableName, function(err) {
    if (err)
        // something went wrong
    else
        // table has been dropped
});

describeTable(tableName, cb)

It's also possible to find out the structure of a table. You can use newsql.describeTable() to do a scan, and the result will be returned as a JSON object:

newsql.describeTable(schema, function(err, schema) {
    if (err)
        // something went wrong
    else
        // table structure will be manifested in the schema object
});

createCollection(colName, cb)

This function can be use to create a new colletion (the NoSQL equivalent to SQL table). The difference between this function and the createTable() function is that createCollection() does not require you to specify table schema.

dropCollection(colName, cb)

This function is to remove a collection. This function is actually the same as dropTable(), but is provided to make it easy for NoSQL developers.

indexProperty(colName, propName, propType, cb)

This is a great tool for NoSQL developers. When working with NoSQL databases, sometimes you'd want to index a document property to improve query performance when your data grow big. Unfortunately, that's something NoSQL databases would fall short.

With newsql, you can index (almost) any property you like by calling the indexProperty() method on the property which you would like to index. The function takes four parameters. colName is the name of the collection and propName is the name of the property to be indexed. propType is a JSON object specifying the data type of a property so it can be properly indexed. The propType parameter has three properties of its own:

  • type: data type of a property. Possible values are 'boolean', 'integer', 'number', and 'string'. This is required.
  • format: provides additional information about the data type of a property. If the data type is 'integer', format can be 'int8', 'int16', or 'int64' and those will be mapped to 'tinyint', 'smallint' and 'bigint' respectively. If the data type is 'number', format can be 'double', 'float' or 'decimal(n,s)'. For 'string' data type, format can be 'text'.
  • maxLength: if data type is 'string', this property can be used to specify the maximum length of a string property. For example, {type: 'string', maxLength: 32} means it's a string property with length no longer than 32. That's actually what we call varchar(32) in SQL.

Let's have some sample code below:

newsql.indexProperty('PersonDoc', 'weight', 
                     {type: 'integer'}, function(err) {
    if (err)
        console.log( err.stack );
    else
        // we've truned the weight property into indexed
});

removeIndex(colName, propName, cb)

Contrary to indexProperty(), this function remove an index (the property data will NOT be lost). This function should be rarely used.

How to do transactions

Doing transaction is faily simple. All you need to do is to obtain a database connection and pass it to newsql.execute(). Below is the sample code:

var  expr = newsql.sqlTemplate('Perons').value();

newsql.getConnection( function(err, conn) {
    // remember to specify database connection in 'option'
    var  option = {
        op: 'insert',
        expr: expr,
        data: {name: 'Scott Cooper'},
        conn: conn
    };
        
    conn.beginTransaction(function(err) {
        newsql.execute(option, function(err, data) {
            if (err)
                conn.rollback();
            else
                conn.commit();
        });
    };
});

How to do table join

Just like relational databases, newsql supports table join. Even if you use newsql in NoSQL style, you can join collections if you store docyment keys of one collection in another.

The following sample code shows how to do table join with newsql:

var  expr = newsql.sql('Person AS psn')
                  .join( {table: 'Company AS cpy', 
                          onWhat: 'psn.workFor=cpy.Company_id'})
                  .column(['name', 'salary', 'cpy.name AS companyName']).
                  .filter({name: 'cpy.size', op: '>'});

The above SQL template is the same as:

SELECT name, salary, cpy.name AS companyName
FROM Person AS psn
JOIN Company AS cpy on psn.workFor=cpy.Company_id
WHERE cpy.size > ?;

So it's almost the same as SQL table join as you might have already observed. However, there is one big difference to pay attention to: when doing SQL join, column names do not have to be prefixed with table name if there are no ambiguities. In newsql, you always have to prefix a column name with its table name if the column is not of the base table. That's because newsql allows you to read/write properties not defined as table columns. As a result, when a column name is not prefixed with its table name, newsql will treat it as the "undefined" properties of the base table instead of trying to interpret the column as belonging to the joined table(s).

In short, when doing join in newsql always prefix columns of non-base tables with their table name.

Test cases

You can find more examples in test cases. To run those test cases, you have to setup a sample DB for those tests to run. The test directory contains a schema.sql file to build the database schema and a data.sql file to load sample data. Remeber to put your own database user name and password in the config.json file at the package root.

Implementation

The implementation is based on JSON-FP and mySQL. JSON-FP is used to solve object query problems. If necessary, the implementation can be extended to support other RDBMS.