Sunday, July 15, 2012

CouchDB: running multiple instances and data replication

http://couchdb.apache.org/

Running multiple instances of CouchDB:

The following settings have to be changed in each instance's .ini file (instance-x.ini, where x is the instance number):

[couchdb]
; data location for this CouchDB instance
database_dir = some_location/couchdb-x
view_index_dir = some_location/couchdb-x
.....
[httpd]
; assume instance1 runs on 5984 and instance2 runs on 5985
port = new-port-no
....

[log]
file = some_location/log/couch-x.log
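
Put together, the per-instance files could be generated with something like the sketch below. The helper name write_instance_ini and the /tmp/couch base directory are made up for illustration and are not part of CouchDB; the file names follow the instance1.ini / instance2.ini pattern used in the start commands.

```shell
#!/bin/sh
# Sketch: write a minimal override file for CouchDB instance number N.
# write_instance_ini is a hypothetical helper, not a CouchDB tool.
write_instance_ini() {
  n="$1"      # instance number, e.g. 1 or 2
  port="$2"   # HTTP port for this instance, e.g. 5984 or 5985
  base="$3"   # base directory (the "some_location" above)

  # Make sure the data and log directories exist.
  mkdir -p "$base/couchdb-$n" "$base/log"

  # Only the settings that must differ per instance are written.
  cat > "$base/instance$n.ini" <<EOF
[couchdb]
database_dir = $base/couchdb-$n
view_index_dir = $base/couchdb-$n

[httpd]
port = $port

[log]
file = $base/log/couch-$n.log
EOF
}

# Usage (paths are examples only):
#   write_instance_ini 1 5984 /tmp/couch
#   write_instance_ini 2 5985 /tmp/couch
```

Keeping separate data, view index and log locations per instance is what stops the two servers from stepping on each other's files.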


Both instances are assumed to run on the same host. Running the first instance of CouchDB:

couchdb -a some_location/instance1.ini  -p instance1.pid

Running the second instance of CouchDB:
couchdb -a some_location/instance2.ini  -p instance2.pid

Make sure that both instances are running, using the 'ps -ef | grep couch' command.
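
The process list only shows that the processes exist. To also confirm that each instance answers on its own port, a small check over HTTP can help. This is a rough sketch: check_instances is a hypothetical helper, and the ports are the 5984/5985 assumed earlier.

```shell
#!/bin/sh
# Sketch: confirm each instance answers HTTP on its own port.
# check_instances is a hypothetical helper, not a CouchDB tool.
check_instances() {
  status=0
  for port in "$@"; do
    # GET / on a running CouchDB returns a small JSON welcome document;
    # -f makes curl fail on HTTP errors, -s keeps it quiet.
    if curl -s -f "http://localhost:$port/" >/dev/null; then
      echo "port $port: up"
    else
      echo "port $port: DOWN"
      status=1
    fi
  done
  return $status
}

# Usage:
#   check_instances 5984 5985
```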

Data Replication:

Run the following command to replicate data between the instances (in our case from instance1 to instance2, and the other way around).

Replication from instance1 to instance2:

curl -X POST -H "Content-Type: application/json" -d '{"source":"http://localhost:5984/test-db", "target":"test-db", "continuous":true, "doc_ids":["doc1", "doc2"]}' http://localhost:5985/_replicate

Swap the port numbers in the above command (change 5984 to 5985 and vice versa) for replication from instance2 to instance1.

The optional parameter "continuous":true tells CouchDB to keep replicating whenever there is a change in the database. Omitting this parameter runs the replication once, when the command is executed.

CouchDB is well known for filtered replication, and there are multiple ways to do it. In the above command, the optional parameter "doc_ids":["doc1", "doc2"] tells CouchDB to replicate only the documents doc1 and doc2. Any other documents (e.g. doc3, doc4) are not replicated. Filtered replication works nicely on mobile devices, which may go online and offline frequently due to coverage issues.
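
To avoid editing the port numbers by hand, both directions can be wrapped in a small script. This is only a sketch: replication_body and replicate are hypothetical helper names, the doc_ids filter is left out so the whole database is replicated, and the target database is assumed to exist already on the receiving instance.

```shell
#!/bin/sh
# Sketch: start continuous pull replication between two local instances.
# replication_body and replicate are hypothetical helpers.

# Print the JSON body for pulling database $2 from the instance on port $1.
replication_body() {
  src_port="$1"
  db="$2"
  printf '{"source":"http://localhost:%s/%s","target":"%s","continuous":true}\n' \
    "$src_port" "$db" "$db"
}

# Pull database $3 from the instance on port $1 into the instance on port $2.
# The request goes to the receiving instance, as in the curl command above.
replicate() {
  src_port="$1"
  dst_port="$2"
  db="$3"
  curl -s -X POST -H "Content-Type: application/json" \
    -d "$(replication_body "$src_port" "$db")" \
    "http://localhost:$dst_port/_replicate"
}

# Usage: replicate both ways between the two instances.
#   replicate 5984 5985 test-db   # instance1 -> instance2
#   replicate 5985 5984 test-db   # instance2 -> instance1
```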

Some useful considerations for CouchDB:

1 - Lightweight (around 23 MB resident size)
2 - MVCC (Multi-Version Concurrency Control) - a natural fit for products that need to track multiple updates to the same document
3 - Filtered data replication
4 - NoSQL, document-based, schema-free design - no relational data, and the structure of a document can change from one update to the next
5 - For an embeddable solution, take a look at TouchDB
6 - Offline web applications - PouchDB
7 - Clustering - BigCouch

The above are some of the main features, apart from the plethora of other benefits CouchDB provides.

For a comparison with other NoSQL databases, see: Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Membase vs Neo4j

CouchDB is supported by the Cloudant team - http://blog.cloudant.com/the-future-of-couchdb/

CouchDB doesn't stop here ... there is a lot more.
