Querying BigchainDB
===================

A node operator can use the full power of MongoDB's query engine to search and query all stored data, including all transactions, assets and metadata.
The node operator can decide for themselves how much of that query power they expose to external users.

How to Query
------------

A BigchainDB node operator has full access to their local MongoDB instance, so they can use any of MongoDB's APIs for running queries, including:

- `the mongo Shell <https://docs.mongodb.com/manual/mongo/>`_,
- one of `the MongoDB drivers <https://docs.mongodb.com/ecosystem/drivers/>`_, such as `PyMongo <https://api.mongodb.com/python/current/>`_, or
- a third-party tool or driver for doing MongoDB queries, such as RazorSQL.

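
As a minimal sketch of the driver option, the PyMongo snippet below lists a few stored transactions. It assumes that MongoDB listens on ``localhost:27017``, that the database is named ``bigchain``, and that each transaction document has ``id`` and ``operation`` fields; check all three against your own node. The helper names are illustrative only.

.. code-block:: python

   def build_operation_filter(operation):
       """Return a MongoDB filter matching transactions with the given operation."""
       return {"operation": operation}


   def find_transactions(database, operation, limit=5):
       """Return up to `limit` transactions of one operation type, without MongoDB's _id."""
       cursor = database["transactions"].find(
           build_operation_filter(operation),
           projection={"_id": False},
           limit=limit,
       )
       return list(cursor)


   def print_create_transaction_ids():
       """Connect to the local node and print the ids of a few CREATE transactions."""
       # Imported here so the helpers above can be reused without PyMongo installed.
       from pymongo import MongoClient

       client = MongoClient("mongodb://localhost:27017")  # assumed host and port
       database = client["bigchain"]  # assumed database name
       for transaction in find_transactions(database, "CREATE"):
           print(transaction["id"])

Calling ``print_create_transaction_ids()`` on a machine that can reach the node's MongoDB instance prints the ids of up to five ``CREATE`` transactions.
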

What Can Be Queried?
--------------------

BigchainDB Server creates several `MongoDB collections <https://docs.mongodb.com/manual/core/databases-and-collections/>`_ in the node's local MongoDB database.
You can see the list of collections by looking at the ``create_tables`` method in the BigchainDB Server file ``bigchaindb/backend/localmongodb/schema.py``. The most interesting collections are:

- transactions
- assets
- metadata
- blocks

We don't detail what's in each collection here, but the collection names are fairly self-explanatory. You can explore their contents using MongoDB queries. A few things worth noting:

1. The transactions collection doesn't include any ``asset.data`` or ``metadata`` values (JSON documents). Those are all removed and stored separately in the assets and metadata collections, respectively.
2. The JSON documents stored in the blocks collection are *not* `Tendermint blocks <https://github.com/tendermint/tendermint/blob/master/types/block.go>`_; they are `BigchainDB blocks <https://docs.bigchaindb.com/projects/server/en/latest/data-models/block-model.html>`_.
3. Currently, votes aren't stored in any MongoDB collection. They are all handled and stored by Tendermint in its own (LevelDB) database.

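
Because asset data and metadata live in their own collections, a query that needs the complete picture has to look in three places. The sketch below shows one way to do that. It assumes that documents in the assets collection look like ``{"id": ..., "data": ...}``, that documents in the metadata collection look like ``{"id": ..., "metadata": ...}``, and that only ``CREATE`` and ``TRANSFER`` transactions are involved. Verify those shapes by inspecting your own database; the function names are hypothetical.

.. code-block:: python

   def asset_id_of(transaction):
       """Return the id of the asset a transaction refers to.

       Assumes a CREATE transaction's asset id is the transaction's own id,
       and that a TRANSFER transaction carries the id in its ``asset`` field.
       """
       if transaction["operation"] == "CREATE":
           return transaction["id"]
       return transaction["asset"]["id"]


   def load_full_transaction(database, transaction_id):
       """Re-attach asset data and metadata to a stored transaction."""
       no_mongo_id = {"_id": False}  # hide MongoDB's internal _id field
       transaction = database["transactions"].find_one({"id": transaction_id}, no_mongo_id)
       if transaction is None:
           return None
       asset = database["assets"].find_one({"id": asset_id_of(transaction)}, no_mongo_id)
       metadata = database["metadata"].find_one({"id": transaction_id}, no_mongo_id)
       return {
           "transaction": transaction,
           "asset_data": asset["data"] if asset else None,
           "metadata": metadata["metadata"] if metadata else None,
       }

Here ``database`` is a PyMongo ``Database`` object for the node's local database. The function returns ``None`` when no transaction with the given id exists.
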

What a Node Operator Can Expose to External Users
-------------------------------------------------

Each node operator can decide how they let external users get information from their local MongoDB database. They could expose:

- their local MongoDB database itself to queries from external users, maybe as a MongoDB user with a role that has limited privileges, e.g. read-only.
- a limited HTTP API, allowing a restricted set of predefined queries, such as `the HTTP API provided by BigchainDB Server <http://bigchaindb.com/http-api>`_, or a custom HTTP API implemented using Django, Express, Ruby on Rails, or ASP.NET.
- some other API, such as a GraphQL API. They could do that using custom code or code from a third party.

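
For the first option, one possible approach is to create a MongoDB user that holds only the built-in ``read`` role on the node's database. The sketch below builds and sends MongoDB's ``createUser`` command through PyMongo. It assumes that access control is enabled on the MongoDB instance and that the connection used has permission to manage users; the function names and credentials are placeholders.

.. code-block:: python

   def read_only_user_command(username, password, database_name):
       """Build a ``createUser`` command granting only the built-in ``read`` role."""
       # The command name must be the first key; Python dicts keep insertion order.
       return {
           "createUser": username,
           "pwd": password,
           "roles": [{"role": "read", "db": database_name}],
       }


   def create_read_only_user(database, username, password):
       """Create a read-only user on `database`, a PyMongo ``Database`` object."""
       command = read_only_user_command(username, password, database.name)
       return database.command(command)

External users who connect with those credentials can run queries but cannot insert, update or delete documents, or drop collections.
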

Each node operator can expose a different level or type of access to their local MongoDB database.
For example, one node operator might decide to specialize in offering optimized `geospatial queries <https://docs.mongodb.com/manual/reference/operator/query-geospatial/>`_.

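
To illustrate the geospatial case, the sketch below builds a MongoDB ``$near`` filter and runs it against the assets collection. It assumes, purely for illustration, that asset documents store a GeoJSON point under a field such as ``data.location``; BigchainDB does not require any such field, so the field name is hypothetical.

.. code-block:: python

   def near_filter(field, longitude, latitude, max_distance_m):
       """Build a ``$near`` filter for GeoJSON points within a distance in metres."""
       return {
           field: {
               "$near": {
                   "$geometry": {"type": "Point", "coordinates": [longitude, latitude]},
                   "$maxDistance": max_distance_m,
               }
           }
       }


   def find_assets_near(assets, field, longitude, latitude, max_distance_m):
       """Return assets whose `field` lies near the given point, nearest first."""
       assets.create_index([(field, "2dsphere")])  # $near needs a geospatial index
       return list(assets.find(near_filter(field, longitude, latitude, max_distance_m)))

Here ``assets`` is the PyMongo ``Collection`` for the assets collection. Note that GeoJSON coordinates list longitude before latitude.
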

Security Considerations
-----------------------

In BigchainDB version 1.3.0 and earlier, all nodes shared one logical MongoDB database, so exposing that database to external users was very risky and was not recommended.
For example, a single "drop database" command would delete that one shared MongoDB database.

In BigchainDB version 2.0.0 and later, each node has its own isolated local MongoDB database.
Inter-node communications are done using Tendermint protocols, not MongoDB protocols, as illustrated in Figure 1 below.
If a node's local MongoDB database gets compromised, none of the other MongoDB databases (in the other nodes) will be affected.

.. figure:: _static/schemaDB.png
   :alt: Diagram of a four-node BigchainDB 2.0 network
   :align: center

   Figure 1: A Four-Node BigchainDB 2.0 Network

.. raw:: html

   <br>
   <br>
   <br>


Performance and Cost Considerations
-----------------------------------

Query processing can be quite resource-intensive, so it's a good idea to run MongoDB on a separate machine from those running BigchainDB Server and Tendermint Core.

A node operator might want to measure the resources used by a query, so they can charge whoever requested the query accordingly.

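
One way to measure a query is MongoDB's ``explain`` command with ``executionStats`` verbosity, which runs the query and reports how much work it took. The sketch below wraps that command and extracts a few counters. The function names are hypothetical, and how those counters translate into a price is left to the node operator.

.. code-block:: python

   def explain_find(database, collection_name, query_filter):
       """Run a find query under ``explain`` and return MongoDB's full report."""
       return database.command(
           "explain",
           {"find": collection_name, "filter": query_filter},
           verbosity="executionStats",
       )


   def usage_summary(explain_output):
       """Pick the counters most relevant to billing out of an explain report."""
       stats = explain_output["executionStats"]
       return {
           "returned": stats["nReturned"],
           "docs_examined": stats["totalDocsExamined"],
           "keys_examined": stats["totalKeysExamined"],
           "millis": stats["executionTimeMillis"],
       }

Here ``database`` is a PyMongo ``Database`` object. A large gap between ``docs_examined`` and ``returned`` usually means the query scanned many documents it did not need, which an index might avoid.
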

Some queries can take too long or use too many resources. A node operator should put upper bounds on the resources that a query can use, and halt (or prevent) any query that goes over.

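
As a minimal sketch of such bounds, PyMongo's ``find`` accepts a ``limit`` on the number of documents returned and a ``max_time_ms`` budget for server-side execution. The default values and the function name below are arbitrary examples, not recommendations.

.. code-block:: python

   def bounded_find(collection, query_filter, max_results=100, max_millis=2000):
       """Run a query that returns at most `max_results` documents
       and is aborted by the server after `max_millis` milliseconds."""
       cursor = collection.find(
           query_filter,
           projection={"_id": False},
           limit=max_results,
           max_time_ms=max_millis,
       )
       return list(cursor)

When the time budget is exceeded, PyMongo raises ``pymongo.errors.ExecutionTimeout``, which the calling code can catch and report to the user.
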

To make MongoDB queries more efficient, one can create `indexes <https://docs.mongodb.com/manual/indexes/>`_. Those indexes might be created by the node operator or by some external users (if the node operator allows that). It's worth noting that indexes aren't free: whenever new data is appended to a collection, the corresponding indexes must be updated. The node operator might want to pass those costs on to whoever created the index. Moreover, in MongoDB, `a single collection can have no more than 64 indexes <https://docs.mongodb.com/manual/reference/limits/#Number-of-Indexes-per-Collection>`_.

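
The sketch below creates a single-field index with PyMongo, but first checks how many indexes the collection already has against the 64-index limit mentioned above. The function name is hypothetical, and any field passed to it (for example ``data.serial_number`` on the assets collection) depends entirely on what your own documents contain.

.. code-block:: python

   MAX_INDEXES_PER_COLLECTION = 64  # MongoDB's per-collection limit


   def create_index_if_room(collection, field):
       """Create an ascending index on `field` unless the collection is at the limit."""
       existing = collection.index_information()  # maps index name to its definition
       if len(existing) >= MAX_INDEXES_PER_COLLECTION:
           raise RuntimeError("collection already has the maximum number of indexes")
       return collection.create_index([(field, 1)])  # 1 means ascending order

The count returned by ``index_information()`` includes the default index on ``_id``, which also counts toward the limit. ``create_index`` returns the name of the new index.
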

One can create a follower node: a node with Tendermint voting power 0. It would still have a copy of all the data, so it could be used as a read-only node. A follower node could offer specialized queries as a service without affecting the workload on the voting validators (which can also write). There could even be followers of followers.