From 5ab0bcdbf5e3a3816f3fdf1bc70fe471a0495f5c Mon Sep 17 00:00:00 2001 From: Hayden Young Date: Fri, 5 May 2023 00:33:36 +0800 Subject: [PATCH] Docs (#66) * docs: Access controllers. * test: Re-open an existing db using its address. * docs: Simple db interaction. * docs: Basic Identities. * docs: Storage. * docs: Implementing a custom database. * docs: Example OrbitDB AC. * docs: Use identity id when customizing access. * docs: canAppend. * docs: Graphically describe log joining. * docs: Update db types. * docs: Sync-ing. * docs: Reverse flow arrows. * docs: Logical clock. * docs: DB address and manifest. * docs: Move ops description to db. * docs: CRDT. * docs: Peer discovery, connecting ipfs nodes, orbitdb replication. * docs: Change file name case to match other documentation solutions (e.g. IPFS/libp2p). * docs: Links to CRDT papers. * docs: A getting started to get up and running quickly. * docs: Move replication to own readme. * docs: Links to various js-libp2p connection config. * docs: Examples for connecting two node servers. * docs: Server to browser connection. * docs: Replication how-to. * docs: Remove SYNC. * docs: Simplify oplog discussion. * docs: Connecting to IPFS in the browser. * docs: Topics moved to separate docs. 
--- FAQ.md | 97 ------- GUIDE.md | 548 ------------------------------------- docs/ACCESS_CONTROLLERS.md | 126 +++++++++ docs/CONNECTING_PEERS.md | 119 ++++++++ docs/DATABASES.md | 232 ++++++++++++++++ docs/GETTING_STARTED.md | 57 ++++ docs/IDENTITIES.md | 28 ++ docs/OPLOG.md | 122 +++++++++ docs/REPLICATION.md | 38 +++ docs/STORAGE.md | 57 ++++ test/orbitdb-open.test.js | 8 + 11 files changed, 787 insertions(+), 645 deletions(-) delete mode 100644 FAQ.md delete mode 100644 GUIDE.md create mode 100644 docs/ACCESS_CONTROLLERS.md create mode 100644 docs/CONNECTING_PEERS.md create mode 100644 docs/DATABASES.md create mode 100644 docs/GETTING_STARTED.md create mode 100644 docs/IDENTITIES.md create mode 100644 docs/OPLOG.md create mode 100644 docs/REPLICATION.md create mode 100644 docs/STORAGE.md diff --git a/FAQ.md b/FAQ.md deleted file mode 100644 index fea691f..0000000 --- a/FAQ.md +++ /dev/null @@ -1,97 +0,0 @@ -# Frequently Asked Questions - -OrbitDB, like all code, is in a state of constant development. Doubtless, you're going to have some questions. The purpose of this FAQ is to answer the most common questions regarding how to get OrbitDB up and running, how to address common issues, and how to deal with pitfalls and common errors implementing it. - -This is a living document. If you see an answer that could be improved, please [open an issue](https://github.com/orbitdb/orbit-db/issues/new) or submit a PR directly. If you think than a question is missing, please [open an issue](https://github.com/orbitdb/orbit-db/issues/new). If you think that there is a better way to resolve a question - perhaps by improving the `orbitdb --help` docs or by adding a feature - please [open an issue](https://github.com/orbitdb/orbit-db/issues/new). Sense a theme yet? :) - -This document is seeded by questions from people opening issues in this repository. If enough people ask the same question, we'll add one here and point newcomers to it. 
Please don't be offended if the maintainers say "read the FAQ" - it's our way of making sure we don't spend all of our time answering the same questions. - -**Questions** - - - -- [Frequently Asked Questions](#frequently-asked-questions) - - [Database replication is not working. Why?](#database-replication-is-not-working-why) - - [Can I recreate the entire database on another machine based on the address?](#can-i-recreate-the-entire-database-on-another-machine-based-on-the-address) - - [Is every `put` to OrbitDB immediately sent to the network and persisted?](#is-every-put-to-orbitdb-immediately-sent-to-the-network-and-persisted) - - [Does OrbitDB already support pinning when using js-ipfs ?](#does-orbitdb-already-support-pinning-when-using-js-ipfs-) - - [Does orbit have a shared feed between peers where multiple peers can append to the same feed?](#does-orbit-have-a-shared-feed-between-peers-where-multiple-peers-can-append-to-the-same-feed) - - [I'm getting a lot of 429 (Too Many Requests) errors when I run OrbitDB](#im-getting-a-lot-of-429-too-many-requests-errors-when-i-run-orbitdb) - - [Where can I learn more about security, encryption, and account recovery?](#where-can-i-learn-more-about-security-encryption-and-account-recovery) - - [Does OrbitDB natively allow for a multi-writer capability permission model?](#does-orbitdb-natively-allow-for-a-multi-writer-capability-permission-model) - - [How can I contribute to this FAQ?](#how-can-i-contribute-to-this-faq) - - - ---- - -### Database replication is not working. Why? - -_The answer to this question is a work in progress. See [orbit-db#505](https://github.com/orbitdb/orbit-db/issues/505)._ - -### Can I recreate the entire database on another machine based on the address? - -A database can't be "recreated" without downloading the database from other peers. 
Knowing an address will allow a user to open the database, which automatically connects to other peers who have the database open, and downloads the database which then "recreates" the database state locally, ie. replicates the database. - -### Is every `put` to OrbitDB immediately sent to the network and persisted? - -When calling `put` or any other update operation on a database, the data is 1) saved locally and persisted to IPFS and 2) send to the network, through IPFS Pubsub, to peers who have the database open (ie. peers). - -Upon calling `put` (or other updates), OrbitDB saves the data locally and returns. That is, the operation and its data is saved to the local node only after which `put` returns and *asynchronously* sends a message to pubsub peers. OrbitDB doesn't have a notion of confirming replication status from other peers (although this can be added on user-level) and considers operation a success upon persisting it locally. OrbitDB doesn't use consensus nor does it wait for the network to confirm operations making it an *eventually consistent* system. - -In short: it can't be assumed that data has been replicated to the network after an update-operation call finishes (eg. `put`, `add`). - -### Does OrbitDB already support pinning when using js-ipfs ? - -Currently [js-ipfs](https://github.com/ipfs/js-ipfs) supports `ipfs.repo.gc()` but it's yet not run on any sort of schedule, so nothing gets removed from a `js-ipfs` node and therefore an OrbitDB database. - -However, this will change in the future as js-ipfs schedules GC and we want to make sure that OrbitDB is actually persisting everything (by default), so [some work on pinning needs to happen](https://github.com/ipfs/js-ipfs/issues/2650). If you're using OrbitDB with go-ipfs (through js-ipfs-api), and GC happens and data may not be persisted anymore. Once the pinning performance is fixed we will implement pinning-by-default in [`orbit-db-io`](https://github.com/orbitdb/orbit-db-io). 
- -### Does orbit have a shared feed between peers where multiple peers can append to the same feed? - -> "...or, is it done more like scuttlebutt, where each peer has their own feed" - -All databases (feeds) are shared between peers, so nobody "owns them" like users do in ssb (afaik). Multiple peers can append to the same db. @tyleryasaka is right in that each peer has their own copy of the db (the log) and they may have different versions between them but, as @tyleryasaka (correctly) describes, through the syncing/replication process the peers exchange "their knowledge of the db" (heads) with each other, the dbs/logs get merged. This is what the "CRDT" in ipfs-log enables. But from address/authority/ownership perspective, they all share the same feed. - -### I'm getting a lot of 429 (Too Many Requests) errors when I run OrbitDB - -This happens when there's only one node with data available, and the system isn't able to effectively get all of the data it needs from it. In order to get around this, IPFS instantiates nodes with preload enabled, so that one node isn't effectively DDoSed. However, sometimes these nodes go down, as well, causing 429 errors. To get around this in example cases (certainly not in production), disable preload: - -``` -this.ipfs = new Ipfs({ - preload: { enabled: false }, - // ... -} -``` - -### Where can I learn more about security, encryption, and account recovery? - -The very short answer is that OrbitDB is agnostic in terms of encryption and account recovery with the aim of providing maximum flexibility with your apps. We don't do any encryption on our side; however, nothing is stopping you from encrypting data before storing it in the OrbitDB network. OrbitDB (just like IPFS) will treat encrypted the data exactly the same. Any node can replicate the data, but only nodes which have access to the encryption key from some other means will be able to decrypt it. 
- - -### Does OrbitDB natively allow for a multi-writer capability permission model? - -BY default, if you allow `*` access to the access controller, like so: - -`orbitdb.feed('name', { accessController: { write ['*'] }})` - -To allow specific keys to write to the database, pass the keys as strings like so: - -`orbitdb.feed('name', { accessController: { write ['key1', 'key2'] }}) // keys cannot be revoked` - -Allows anyone to write to the db. If you specify keys, the process involves granting and revoking keys. Granting is doable, but revocation is a harder and is being worked on by multiple parties, without a solution. - -If you want to encrypt the keys or content, it's easier with a single user. If you want to use encryption with multiwriters, that's another bag which also hasn't been solved. - -The concept of identity in OrbitDB currently centers on a single user associated with a public key. To do more than this, you may need a different access controller. [@3box](https://github.com/3box) has a modified access controller plugin, [3box-orbitdb-plugins](https://github.com/3box/3box-orbitdb-plugins), which is worth looking at for how to do this. - -| | Non-Encrypted | Encrypted | -| ----- | ----- | ---- | -| Single Writer | Default | Requires encryption key management | -| Multi Writer | Difficulty w/ granting + revocation | Difficulty w/ granting + revocation AND sharing encryption keys | - -We'd love to add multi-writer support to OrbitDB! The maintainers at Haja are currently not working on anything related to it though but would be happy to help. Your best bet is to jump on [Gitter](https://gitter.im/orbitdb/Lobby) and ask us where the current efforts are. - -### How can I contribute to this FAQ? - -See the introduction at the top! Please open any issues and pull requests you can to improve this FAQ.md. It is here for you. If you're confused, ask another question publicly; it's possible that other people are, too. 
If you don't want to open an issue, feel free to jump onto [the Gitter](https://gitter.im/orbitdb/Lobby) and ask us there, too. diff --git a/GUIDE.md b/GUIDE.md deleted file mode 100644 index 69aaf96..0000000 --- a/GUIDE.md +++ /dev/null @@ -1,548 +0,0 @@ -# Getting Started with OrbitDB - -This guide will get you familiar with using OrbitDB in your JavaScript application. OrbitDB and IPFS both work in Node.js applications as well as in browser applications. (Windows is not supported yet though). - -This guide is still being worked on and we would love to get [feedback and suggestions](https://github.com/orbitdb/orbit-db/issues) on how to improve it! - -## Table of Contents - - - -- [Background](#background) -- [Install](#install) -- [API](#api) -- [Setup](#setup) -- [Create a database](#create-a-database) - * [Address](#address) - + [Manifest](#manifest) - * [Identity](#identity) - + [Creating an identity](#creating-an-identity) - * [Access Control](#access-control) - + [Public databases](#public-databases) - + [Granting access after database creation](#granting-access-after-database-creation) - + [Custom Access Controller](#custom-access-controller) -- [Add an entry](#add-an-entry) -- [Get an entry](#get-an-entry) -- [Entry sorting and conflict resolution](#entry-sorting-and-conflict-resolution) -- [Persistency](#persistency) -- [Replicating a database](#replicating-a-database) -- [Custom Stores](#custom-stores) -- [More information](#more-information) - - - -## Background - -OrbitDB is a peer-to-peer database meaning that each peer has its own instance of a specific database. A database is replicated between the peers automatically resulting in an up-to-date view of the database upon updates from any peer. That is to say, the database gets pulled to the clients. - -This means that each application contains the full database that they're using. 
This in turn changes the data modeling as compared to client-server model where there's usually one big database for all entries: in OrbitDB, the data should be stored, "partitioned" or "sharded" based on the access rights for that data. For example, in a twitter-like application, tweets would not be saved in a global "tweets" database to which millions of users write concurrently, but rather, ***each user would have their own database*** for their tweets. To follow a user, a peer would subscribe to a user's feed, ie. replicate their feed database. - -OrbitDB supports multiple data models (see more details below) and as such the developer has a variety of ways to structure data. Combined with the peer-to-peer paradigm, the data modeling is important factor to build scalable decentralized applications. - -This may not be intuitive or you might not be sure what the best approach would be and we'd be happy to help you decide on your data modeling and application needs, [feel free to reach out](https://github.com/orbitdb/orbit-db/issues)! - -## Install - -Install [orbit-db](https://github.com/orbitdb/orbit-db) and [ipfs](https://www.npmjs.com/package/ipfs) from npm: - -```sh -npm install orbit-db ipfs -``` - -## API - -See [API.md](https://github.com/orbitdb/orbit-db/blob/master/API.md) for the full documentation. - -## Setup - -Require OrbitDB and IPFS in your program and create the instances: - -```javascript -import IPFS from 'ipfs' -import OrbitDB from 'orbit-db' - -async function main () { - // Create IPFS instance - const ipfsOptions = { repo : './ipfs', } - const ipfs = await IPFS.create(ipfsOptions) - - // Create OrbitDB instance - const orbitdb = await OrbitDB.createInstance(ipfs) - } - -main() -``` - -`orbitdb` is now the OrbitDB instance we can use to interact with the databases. - -## Create a database - -First, choose the data model you want to use. 
The available data models are: -- [Key-Value](https://github.com/orbitdb/orbit-db/blob/master/API.md#orbitdbkeyvaluenameaddress) -- [Log](https://github.com/orbitdb/orbit-db/blob/master/API.md#orbitdblognameaddress) (append-only log) -- [Feed](https://github.com/orbitdb/orbit-db/blob/master/API.md#orbitdbfeednameaddress) (same as log database but entries can be removed) -- [Documents](https://github.com/orbitdb/orbit-db/blob/master/API.md#orbitdbdocsnameaddress-options) (store indexed JSON documents) -- [Counters](https://github.com/orbitdb/orbit-db/blob/master/API.md#orbitdbcounternameaddress) - -Then, create a database instance (we'll use Key-Value database in this example): - -```javascript -import IPFS from 'ipfs' -import OrbitDB from 'orbit-db' - -async function main () { - // Create IPFS instance - const ipfsOptions = { repo : './ipfs', } - const ipfs = await IPFS.create(ipfsOptions) - - // Create OrbitDB instance - const orbitdb = await OrbitDB.createInstance(ipfs) - - // Create database instance - const db = await orbitdb.keyvalue('first-database') -} -main() -``` - -### Address - -When a database is created, it will be assigned an address by OrbitDB. The address consists of three parts: -``` -/orbitdb/Qmd8TmZrWASypEp4Er9tgWP4kCNQnW4ncSnvjvyHQ3EVSU/first-database -``` - -The first part, `/orbitdb`, specifies the protocol in use. The second part, an IPFS multihash `Qmd8TmZrWASypEp4Er9tgWP4kCNQnW4ncSnvjvyHQ3EVSU`, is the database manifest which contains the database info such as the name and type, and a pointer to the access controller. The last part, `first-database`, is the name of the database. - -In order to replicate the database with peers, the address is what you need to give to other peers in order for them to start replicating the database. 
- -The database address can be accessed as `db.address` from the database instance: -```javascript -const address = db.address -// address == '/orbitdb/Qmdgwt7w4uBsw8LXduzCd18zfGXeTmBsiR8edQ1hSfzcJC/first-database' -``` - -For example: -```javascript -import IPFS from 'ipfs' -import OrbitDB from 'orbit-db' - -async function main () { - const ipfsOptions = { repo: './ipfs',} - const ipfs = await IPFS.create(ipfsOptions) - const orbitdb = await OrbitDB.createInstance(ipfs) - const db = await orbitdb.keyvalue('first-database') - console.log(db.address.toString()) - // /orbitdb/Qmd8TmZrWASypEp4Er9tgWP4kCNQnW4ncSnvjvyHQ3EVSU/first-database - -} -main() -``` - -#### Manifest - -The second part of the address, the IPFS multihash `Qmdgwt7w4uBsw8LXduzCd18zfGXeTmBsiR8edQ1hSfzcJC`, is the manifest of a database. It's an IPFS object that contains information about the database. - -The database manifest can be fetched from IPFS with `ipfs dag get ` command and it looks like this: - -```json -{ - "Data": "{\"name\":\"a\",\"type\":\"feed\",\"accessController\":\"/ipfs/QmdjrCN7SqGxRapsm6LuoS4HrWmLeQHVM6f1Zk5A3UveqA\"}", - "Hash": "Qmdgwt7w4uBsw8LXduzCd18zfGXeTmBsiR8edQ1hSfzcJC", - "Size": 102, - "Links": [] -} -``` - -### Identity - -Each entry in a database is signed by who created that entry. 
The identity, which includes the public key used to sign entries, can be accessed via the identity member variable of the database instance: - -```javascript -const identity = db.identity -console.log(identity.toJSON()) -// prints -{ - id: '0443729cbd756ad8e598acdf1986c8d586214a1ca9fa8c7932af1d59f7334d41aa2ec2342ea402e4f3c0195308a4815bea326750de0a63470e711c534932b3131c', - publicKey: '0446829cbd926ad8e858acdf1988b8d586214a1ca9fa8c7932af1d59f7334d41aa2ec2342ea402e4f3c0195308a4815bea326750de0a63470e711c534932b3131c', - signatures: { - id: '3045022058bbb2aa415623085124b32b254b8668d95370261ade8718765a8086644fc8ae022100c736b45c6b2ef60c921848027f51020a70ee50afa20bc9853877e994e6121c15', - publicKey: '3046022100d138ccc0fbd48bd41e74e40ddf05c1fa6ff903a83b2577ef7d6387a33992ea4b022100ca39e8d8aef43ac0c6ec05c1b95b41fce07630b5dc61587a32d90dc8e4cf9766' - }, - type: 'orbitdb' -} -``` - - -#### Creating an identity -```javascript -import Identities from 'orbit-db-identity-provider' -const options = { id: 'local-id' } -const identity = await Identities.createIdentity(options) -``` -This identity can be used in OrbitDB by passing it in as an argument in the `options` object: -```javascript -const orbitdb = await OrbitDB.createInstance(ipfs, { identity: identity }) -``` -The identity also contains signatures proving possession of the id and OrbitDB public key. This is included to allow proof of ownership of an external public key within OrbitDB. You can read more [here](https://github.com/orbitdb/orbit-db-identity-provider) - -The OrbitDB public key can be retrieved with: -```javascript -console.log(db.identity.publicKey) -// 04d009bd530f2fa0cda29202e1b15e97247893cb1e88601968abfe787f7ea03828fdb7624a618fd67c4c437ad7f48e670cc5a6ea2340b896e42b0c8a3e4d54aebe -``` - -If you want to give access to other peers to write to a database, you need to get their public key in hex and add it to the access controller upon creating the database. 
If you want others to give you the access to write, you'll need to give them your public key (output of `orbitdb.identity.publicKey`). For more information, see: [Access Control](https://github.com/orbitdb/orbit-db/blob/master/GUIDE.md#access-control). - -### Access Control - -You can specify the peers that have write-access to a database. You can define a set of peers that can write to a database or allow anyone write to a database. **By default and if not specified otherwise, only the creator of the database will be given write-access**. - -***Note!*** *OrbitDB currently supports only dynamically adding write-access. That is, write-access cannot be revoked once added. In the future OrbitDB will support access revocation and read access control. At the moment, if access rights need to be removed, the address of the database will change.* - -Access rights are setup by passing an `accessController` object that specifies the access-controller type and access rights of the database when created. OrbitDB currently supports write-access. The access rights are specified as an array of public keys of the peers who can write to the database. The public keys to which access is given can be retrieved from the identity.publicKey property of each peer. - -```javascript -import IPFS from 'ipfs' -import OrbitDB from 'orbit-db' - -async function main () { - const ipfsOptions = { repo: './ipfs',} - const ipfs = await IPFS.create(ipfsOptions) - const orbitdb = await OrbitDB.createInstance(ipfs) - const options = { - // Give write access to ourselves - accessController: { - write: [orbitdb.identity.id] - } - } - - const db = await orbitdb.keyvalue('first-database', options) - console.log(db.address.toString()) - // /orbitdb/Qmd8TmZrWASypEp4Er9tgWP4kCNQnW4ncSnvjvyHQ3EVSU/first-database -} -main() -``` - -To give write access to another peer, you'll need to get their public key with some means. They'll need to give you the output of their OrbitDB instance's id: `orbitdb.identity.id`. 
- -The keys look like this: -`042c07044e7ea51a489c02854db5e09f0191690dc59db0afd95328c9db614a2976e088cab7c86d7e48183191258fc59dc699653508ce25bf0369d67f33d5d77839` - -Give access to another peer to write to the database: -```javascript -import IPFS from 'ipfs' -import OrbitDB from 'orbit-db' - -async function main () { - const ipfsOptions = { repo: './ipfs', } - const ipfs = await IPFS.create(ipfsOptions) - const orbitdb = await OrbitDB.createInstance(ipfs) - - const options = { - // Setup write access - accessController: { - write: [ - // Give access to ourselves - orbitdb.identity.id, - // Give access to the second peer - '042c07044e7ea51a489c02854db5e09f0191690dc59db0afd95328c9db614a2976e088cab7c86d7e48183191258fc59dc699653508ce25bf0369d67f33d5d77839', - ] - } - } - - const db1 = await orbitdb.keyvalue('first-database', options) - console.log(db1.address.toString()) - // /orbitdb/Qmdgwt7w4uBsw8LXduzCd18zfGXeTmBsiR8edQ1hSfzcJC/first-database - - // Second peer opens the database from the address - const db2 = await orbitdb.keyvalue(db1.address.toString()) -} - -main() -``` - -#### Public databases - -The access control mechanism also support "public" databases to which anyone can write to. - -This can be done by adding a `*` to the write access array: -```javascript -import IPFS from 'ipfs' -import OrbitDB from 'orbit-db' - -async function main () { - const ipfsOptions = { repo: './ipfs', } - const ipfs = await IPFS.create(ipfsOptions) - const orbitdb = await OrbitDB.createInstance(ipfs) - - const options = { - // Give write access to everyone - accessController: { - write: ['*'] - } - } - - const db = await orbitdb.keyvalue('first-database', options) - console.log(db.address.toString()) - // /orbitdb/QmRrauSxaAvNjpZcm2Cq6y9DcrH8wQQWGjtokF4tgCUxGP/first-database -} - -main() -``` - -Note how the access controller hash is different compared to the previous example! 
- -#### Granting access after database creation - -To give access to another peer after the database has been created, you must set the access-controller `type` to an `AccessController` which supports dynamically adding write-access such as `OrbitDBAccessController`. - -```javaScript -db = await orbitdb1.feed('AABB', { - accessController: { - type: 'orbitdb', //OrbitDBAccessController - write: [identity1.publicKey] - } -}) - -await db.access.grant('write', identity2.publicKey) // grant access to identity2 -``` - -#### Custom Access Controller - -You can create a custom access controller by implementing the `AccessController` [interface](https://github.com/orbitdb/orbit-db-access-controllers/blob/master/src/access-controller-interface.js) and adding it to the AccessControllers object before passing it to OrbitDB. - -```javascript -import AccessControllers from 'orbit-db-access-controllers' -import AccessController from 'orbit-db-access-controllers/interface' - -class OtherAccessController extends AccessController { - - static get type () { return 'othertype' } // Return the type for this controller - - async canAppend(entry, identityProvider) { - // logic to determine if entry can be added, for example: - if (entry.payload === "hello world" && entry.identity.id === identity.id && identityProvider.verifyIdentity(entry.identity)) - return true - - return false - } - - async grant (access, identity) {} // Logic for granting access to identity - - async save () { - // return parameters needed for loading - return { parameter: 'some-parameter-needed-for-loading' } - } - - static async create (orbitdb, options) { - return new OtherAccessController() - } -} - -AccessControllers.addAccessController({ AccessController: OtherAccessController }) - -const orbitdb = await OrbitDB.createInstance(ipfs, { - AccessControllers: AccessControllers -}) - -const db = await orbitdb.keyvalue('first-database', { - accessController: { - type: 'othertype', - write: [id1.id] - } -}) -``` - -## 
Add an entry - -To add an entry to the database, we simply call `db.put(key, value)`. - -```javascript -import IPFS from 'ipfs' -import OrbitDB from 'orbit-db' - -async function main () { - const ipfsOptions = { repo: './ipfs'} - const ipfs = await IPFS.create(ipfsOptions) - const orbitdb = await OrbitDB.createInstance(ipfs) - const db = await orbitdb.keyvalue('first-database') - await db.put('name', 'hello') -} - -main() -``` - -**NOTE ON PERSISTENCY** - -OrbitDB does not automatically pin content added to IPFS. This means that if garbage collection is triggered, any unpinned content will be erased. To pin the entry, pass the optional `{ pin: true }` in the arguments: - -```js -await db.put('name', 'hello', { pin: true }) -``` - -For adding entries to other databases, see: -- [log.add()](https://github.com/orbitdb/orbit-db/blob/master/API.md#addevent) -- [feed.add()](https://github.com/orbitdb/orbit-db/blob/master/API.md#adddata) -- [docs.put()](https://github.com/orbitdb/orbit-db/blob/master/API.md#putdoc) -- [counter.inc()](https://github.com/orbitdb/orbit-db/blob/master/API.md#incvalue) - -**Parallelism** - -We currently don't support parallel updates. Updates to a database need to be executed in a sequential manner. The write throughput is several hundreds or thousands of writes per second (depending on your platform and hardware, YMMV), so this shouldn't slow down your app too much. If it does, [lets us know](https://github.com/orbitdb/orbit-db/issues)! - -Update the database one after another: -```javascript -await db.put('key1', 'hello1') -await db.put('key2', 'hello2') -await db.put('key3', 'hello3') -``` - -Not: -```javascript -// This is not supported atm! -Promise.all([ - db.put('key1', 'hello1'), - db.put('key2', 'hello2'), - db.put('key3', 'hello3') -]) -``` - -## Get an entry - -To get a value or entry from the database, we call the appropriate query function which is different per database type. 
- -Key-Value: -```javascript -async function main () { - const ipfsOptions = { repo: './ipfs'} - const ipfs = await IPFS.create(ipfsOptions) - const orbitdb = await OrbitDB.createInstance(ipfs) - const db = await orbitdb.keyvalue('first-database') - await db.put('name', 'hello') - const value = db.get('name') -} - -main() -``` - -Other databases, see: -- [log.iterator()](https://github.com/orbitdb/orbit-db/blob/master/API.md#iteratoroptions) -- [feed.iterator()](https://github.com/orbitdb/orbit-db/blob/master/API.md#iteratoroptions-1) -- [docs.get()](https://github.com/orbitdb/orbit-db/blob/master/API.md#getkey-1) -- [docs.query()](https://github.com/orbitdb/orbit-db/blob/master/API.md#querymapper) -- [counter.value](https://github.com/orbitdb/orbit-db/blob/master/API.md#value) - -## Entry sorting and conflict resolution - -OrbitDB relies on [ipfs-log](https://github.com/orbitdb/ipfs-log) which sorts the entries based on a `sortFn` which determines the order. By default, the `sortFn` is set to [Last Writer Wins](https://github.com/orbitdb/ipfs-log/blob/1d609385f7c5db9926a0388cfcdf7fd2a796c522/src/log-sorting.js#L15) where the entry with the greater clock wins and conflicts are resolved by clock id. - -You can pass a custom sorting function to handle conflicts differently as follows: - -```javaScript -const db = await orbitdb.log('sortDifferently', { - sortFn: SomeOtherSortFn -}) -``` - -`SomeOtherSortFn` takes two entries and should return either `-1` or `1` indicating which of the arguments is greater. The function must not return `0` when comparing entries. See [Log Sorting](https://github.com/orbitdb/ipfs-log/blob/master/src/log-sorting.js#L15) - -## Persistency - -OrbitDB saves the state of the database automatically on disk. This means that upon opening a database, the developer can choose to load locally the persisted before using the database. 
**Loading the database locally before using it is highly recommended!** - -```javascript -async function main () { - const ipfsOptions = { repo: './ipfs'} - const ipfs = await IPFS.create(ipfsOptions) - const orbitdb = await OrbitDB.createInstance(ipfs) - - const db1 = await orbitdb.keyvalue('first-database') - await db1.put('name', 'hello') - await db1.close() - - const db2 = await orbitdb.keyvalue('first-database') - await db2.load() - const value = db2.get('name') - // 'hello' -} - -main() -``` - -If the developer doesn't call `load()`, the database will be operational but will not have the persisted data available immediately. Instead, OrbitDB will load the data on the background as new updates come in from peers. - -## Replicating a database - -In order to have the same data, ie. a query returns the same result for all peers, an OrbitDB database must be replicated between the peers. This happens automatically in OrbitDB in a way that a peer only needs to open an OrbitDB from an address and it'll start replicating the database. - -To know when database was updated, we can listen for the `replicated` event of a database: `db2.events.on('replicated', () => ...)`. When the `replicated` event is fired, it means we received updates for the database from a peer. This is a good time to query the database for new results. - -Replicate a database between two nodes: - -```javascript -async function main() { - // Create the first peer - const ipfs1_config = { repo: './ipfs1', } - const ipfs1 = await IPFS.create(ipfs1_config) - - // Create the database - const orbitdb1 = await OrbitDB.createInstance(ipfs1, { directory: './orbitdb1' }) - const db1 = await orbitdb1.log('events') - - // Create the second peer - const ipfs2_config = { repo: './ipfs2', } - const ipfs2 = await IPFS.create(ipfs2_config) - - // Open the first database for the second peer, - // ie. 
replicate the database - const orbitdb2 = await OrbitDB.createInstance(ipfs2, { directory: './orbitdb2' }) - const db2 = await orbitdb2.log(db1.address.toString()) - - console.log('Making db2 check replica') - - // When the second database replicated new heads, query the database - db2.events.on('replicated', () => { - const result = db2.iterator({ limit: -1 }).collect().map(e => e.payload.value) - console.log(result.join('\n')) - }) - - // Start adding entries to the first database - setInterval(async () => { - await db1.add({ time: new Date().getTime() }) - }, 1000) - -} - -main() -``` - -## Custom Stores - -Use a custom store to implement case specific functionality that is not supported by the default OrbitDB database stores. Then, you can easily add and use your custom store with OrbitDB: - -```javascript -// define custom store type -class CustomStore extends DocumentStore { - constructor (ipfs, id, dbname, options) { - super(ipfs, id, dbname, options) - this._type = CustomStore.type - } - - static get type () { - return 'custom' - } -} - -// add custom type to orbitdb -OrbitDB.addDatabaseType(CustomStore.type, CustomStore) - -// instantiate custom store -let orbitdb = await OrbitDB.createInstance(ipfs, { directory: dbPath }) -let store = orbitdb.create(name, CustomStore.type) -``` - -## More information - -Is this guide missing something you'd like to understand or found an error? Please [open an issue](https://github.com/orbitdb/orbit-db/issues) and let us know what's missing! - -Also, if you want a much more in-depth tutorial and exploration of OrbitDB's architecture, please check out the [OrbitDB Field Manual](https://github.com/orbitdb/field-manual). diff --git a/docs/ACCESS_CONTROLLERS.md b/docs/ACCESS_CONTROLLERS.md new file mode 100644 index 0000000..6194ca4 --- /dev/null +++ b/docs/ACCESS_CONTROLLERS.md @@ -0,0 +1,126 @@ +# Access Controllers + +Access controllers define the write access a user has to a database. 
By default, write access is limited to the user who created the database. Access controllers provide a way in which write access can be expanded to users other than the database creator. + +An access controller is passed when a database is opened for the first time. Once created, the database's write access will be limited to only those users who are listed. By default, only the user creating the database will have write access. + +Different access controllers can be assigned to the database by passing the `AccessController` parameter to OrbitDB's `open` function. + +``` +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db', { AccessController: SomeAccessController() }) +``` + +OrbitDB is bundled with two AccessControllers: IPFSAccessController, an immutable access controller which uses IPFS to store the access settings, and OrbitDBAccessController, a mutable access controller which uses OrbitDB's keyvalue database to store one or more permissions. + +## IPFS Access Controller + +By default, the database `db` will use the IPFSAccessController and allow only the creator to write to the database. + +``` +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db') +``` + +To change write access, pass the IPFSAccessController with the `write` parameter and an array of one or more Identity ids: + +``` +const identities = await Identities() +const identity1 = await identities.createIdentity({ id: 'userA' }) +const identity2 = await identities.createIdentity({ id: 'userB' }) + +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db', { AccessController: IPFSAccessController({ write: [identity1.id, identity2.id] }) }) +``` + +To allow anyone to write to the database, specify the wildcard '*': + +``` +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db', { AccessController: IPFSAccessController({ write: ['*'] }) }) +``` + +## OrbitDB Access Controller + +The OrbitDB access controller provides configurable write access using grant and revoke.
+ +``` +const identities = await Identities() +const identity1 = await identities.createIdentity({ id: 'userA' }) +const identity2 = await identities.createIdentity({ id: 'userB' }) + +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db', { AccessController: OrbitDBAccessController({ write: [identity1.id] }) }) + +await db.access.grant('write', identity2.id) +await db.access.revoke('write', identity2.id) +``` + +When granting or revoking access, a capability and the identity's id must be defined. + +Grant and revoke are not limited to 'write' access only. A custom access capability can be specified, for example, `db.access.grant('custom-access', identity1.id)`. + +## Custom Access Controller + +Access can be customized by implementing a custom access controller. To implement a custom access controller, specify: + +- A curried function with the function signature `async ({ orbitdb, identities, address })`, +- A `type` constant, +- A `canAppend` function with the param `entry`. + +``` +const type = 'custom' + +const CustomAccessController = () => async ({ orbitdb, identities, address }) => { + address = '/custom/access-controller' + + const canAppend = (entry) => { + // return true to allow the entry, false to reject it. + } +} + +CustomAccessController.type = type +``` + +Additional configuration can be passed to the access controller by adding one or more parameters to the `CustomAccessController` function. For example, passing a configurable object parameter with the variable `write`: + +``` +const CustomAccessController = ({ write }) => async ({ orbitdb, identities, address }) => { +} +``` + +### The canAppend function + +The main driver of the access controller is the `canAppend` function. This specifies whether an identity can or cannot add an item to the operations log (or any other mechanism requiring access authorization).
+ +How the custom access controller evaluates access will be determined by its use case, but in most instances, the canAppend function will want to check whether the entry being created can be written to the database (and underlying operations log). Therefore, the entry's identity will need to be used to retrieve the identity's id: + +``` +const write = [identity.id] + +const canAppend = async (entry) => { + const writerIdentity = await identities.getIdentity(entry.identity) + if (!writerIdentity) { + return false + } + + const { id } = writerIdentity + + if (write.includes(id) || write.includes('*')) { + return identities.verifyIdentity(writerIdentity) + } + return false +} +``` + +In the above example, `entry.identity` is the hash of the identity. Using this hash, the entire identity can be retrieved and the identity's id is used to verify write access. `write.includes('*')` checks for the wildcard, which allows any identity to write to the operations log. + +### Using a custom access controller with OrbitDB + +Before passing the custom access controller to the `open` function, it must be added to OrbitDB's AccessControllers: + +``` +AccessControllers.add(CustomAccessController) +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db', { AccessController: CustomAccessController(params) }) +``` \ No newline at end of file diff --git a/docs/CONNECTING_PEERS.md b/docs/CONNECTING_PEERS.md new file mode 100644 index 0000000..cbdab05 --- /dev/null +++ b/docs/CONNECTING_PEERS.md @@ -0,0 +1,119 @@ +# Connecting Peers + +OrbitDB peers connect to one another using js-libp2p. Connection settings will vary depending on what environment the peer is running in and what system the peer is attempting to connect to. + +## Node Daemon to Node Daemon + +Node.js allows libp2p to open connections with other Node.js daemons.
+ +```javascript +const ipfs1 = await IPFS.create({ repo: './ipfs1' }) +const ipfs2 = await IPFS.create({ repo: './ipfs2' }) + +const cid = await ipfs1.block.put('here is some data') +const block = await ipfs2.block.get(cid) +``` + +On localhost or a local network, both ipfs nodes should discover each other quickly enough that ipfs2 will retrieve the block added to ipfs1. + +In remote networks, retrieval of content across peers may take significantly longer. To speed up communication between the two peers, connect one peer to another directly using the swarm API and a peer's publicly accessible address. For example, assuming ipfs1 is listening on the address `/ip4/1.2.3.4/tcp/12345/p2p/ipfs1-peer-hash`: + +```javascript +const ipfs1 = await IPFS.create({ repo: './ipfs1' }) +const ipfs2 = await IPFS.create({ repo: './ipfs2' }) + +await ipfs2.swarm.connect('/ip4/1.2.3.4/tcp/12345/p2p/ipfs1-peer-hash') + +const cid = await ipfs1.block.put('here is some data') +const block = await ipfs2.block.get(cid) +``` + +## Browser to Node Daemon + +For various security reasons, a browser cannot dial another peer over a raw TCP or QUIC connection from within a web page. One option is to use a websocket exposed on the server and dial in via the browser. + +On the server, listen for incoming websocket connections: + +```javascript +import { WebSockets } from '@libp2p/websockets' +import { create } from 'ipfs-core' + +const ipfs1 = await create({ + libp2p: { + addresses: { + listen: [ + '/ip4/0.0.0.0/tcp/0/ws' + ] + }, + transports: [new WebSockets()] + }, + repo: './ipfs1' +}) +``` + +Within the browser, dial into the server using the server's exposed websocket: + +```javascript +// import the following libraries if using a build environment such as vite. +import { WebSockets } from '@libp2p/websockets' +import { create } from 'ipfs-core' +import { all } from '@libp2p/websockets/filters' + +// uncomment { filter: all } if no tls certificate is deployed. Only do this in development environments.
+const ws = new WebSockets(/* { filter: all } */) + +const ipfs1 = await create({ + libp2p: { + transports: [ws] + }, + repo: './ipfs1' +}) +``` + +## Browser to Browser and Node Daemon to Browser + +A connection cannot be made directly to a browser node. Browsers do not listen for incoming connections; they operate in a server/client environment where the server address is known and the browser connects to the server using the known address. Therefore, for a browser to respond to an incoming connection, a relay is required to "listen" on the browser's behalf. The relay assigns and advertises multiaddresses on behalf of the browser nodes, allowing the nodes to create a direct connection between each other. + +Peer to peer connections where another peer connects to a browser node can use WebRTC as the transport protocol. A signalling server is required to facilitate the discovery of a node and establish a direct connection to another node. + +Details on how to [deploy a WebRTC signalling server](https://github.com/libp2p/js-libp2p-webrtc-star/tree/master/packages/webrtc-star-signalling-server) are provided by the libp2p project. + +To connect two nodes via a relay, the IPFS swarm address should match the address of the signalling server.
+ +In the first browser peer, configure the swarm address: + +```javascript +import { create } from 'ipfs-core' +import { multiaddr } from 'multiaddr' + +const ipfs = await create({ + config: { + Addresses: { + Swarm: ['/ip4/0.0.0.0/tcp/12345/ws/p2p-webrtc-star'] + } + } +}) +``` + +Configure the second browser node in the same way as the first, then dial in to the first browser peer using its multiaddress: + +```javascript +import { create } from 'ipfs-core' +import { multiaddr } from 'multiaddr' + +const ipfs = await create({ + config: { + Addresses: { + Swarm: ['/ip4/0.0.0.0/tcp/12345/ws/p2p-webrtc-star'] + } + } +}) + +await ipfs.swarm.connect('/multiaddr/of/first-peer') +``` + +## Further Reading + +The js-libp2p library provides a variety of [configuration options](https://github.com/libp2p/js-libp2p/blob/master/doc/CONFIGURATION.md) for finding peers and connecting them to one another. + +The different methods of connecting various systems are outlined in [libp2p's connectivity](https://connectivity.libp2p.io) section. \ No newline at end of file diff --git a/docs/DATABASES.md b/docs/DATABASES.md new file mode 100644 index 0000000..1d431da --- /dev/null +++ b/docs/DATABASES.md @@ -0,0 +1,232 @@ +# DB + +DB provides a variety of different data stores with a common interface. + +## Types + +OrbitDB provides four types of data stores: + +- Events +- Documents +- Key/Value +- Indexed Key/Value + +The type of database can be specified when calling OrbitDB's `open` function by using the `type` parameter: + +``` +const type = 'documents' +orbitdb.open('my-db', { type }) +``` + +If no type is specified, Events will be the default database type. + +### Address + +When a database is created, it is assigned an address by OrbitDB. The address consists of two parts: + +``` +/orbitdb/zdpuAmrcSRUhkQcnRQ6p4bphs7DJWGBkqczSGFYynX6moTcDL +``` + +The first part, `/orbitdb`, specifies the protocol in use.
The second part, an IPFS multihash `zdpuAmrcSRUhkQcnRQ6p4bphs7DJWGBkqczSGFYynX6moTcDL`, is the database manifest which contains the database info such as the name and type, and a pointer to the access controller. + +In order to replicate the database with peers, the address is what you need to give to other peers in order for them to start replicating the database. + +```javascript +import IPFS from 'ipfs-core' +import OrbitDB from 'orbit-db' + +const ipfs = await IPFS.create() +const orbitdb = await OrbitDB({ ipfs }) +const db = await orbitdb.open('my-db') +console.log(db.address) +// /orbitdb/zdpuAmrcSRUhkQcnRQ6p4bphs7DJWGBkqczSGFYynX6moTcDL +``` + +### Manifest + +The second part of the address, the IPFS multihash `zdpuAmrcSRUhkQcnRQ6p4bphs7DJWGBkqczSGFYynX6moTcDL`, is also the hash of the database's manifest. The manifest contains information about the database such as name, type and other metadata. It also contains a reference to the access controller, which is made up of the type and the hash of the access controller object. + +An example of a manifest is given below: + +```json +{ + hash: 'zdpuAzzxCWEzRffxFrxNNVkcVFbkmA1EQdpZJJPc3wpjojkAT', + manifest: { + name: 'my-db', + type: 'events', + accessController: '/ipfs/zdpuB1TUuF5E81MFChDbRsZZ1A3Kz2piLJwKQ2ddnfZLEBx64' + } +} +``` + +## Operations + +Operations are of either type "PUT" or "DEL". + +A PUT operation describes a record which has been created or edited. If operations share the same key or id, they are assumed to be related and the operation which was created after all other operations with the same key will be the latest version of the record. + +A DEL operation describes a record which has been removed. It will share the same key as a previous PUT operation and will indicate that the record that was PUT is now deleted. 
+ +A PUT record might look like: + +``` +{ + id: 'log-1', + payload: { op: 'PUT', key: 4, value: 'Some data' }, + next: [ '3' ], + refs: [ + '2', + '1' + ], + clock: Clock { + id: '038cc50a92f10c39f74394a1779dffb2c79ddc6b7d1bbef8c484bd4bbf8330c426', + time: 4 + }, + v: 2 +} +``` + +In the above example, `payload` holds the information about the record. `op` is the operation carried out, in this case PUT (the other option is DEL). `key` holds a unique identifier for the record and `value` contains some data. In the above example, the value is a string but it could be a number, XML or even the JSON representation of an object. + +## Opening a new database + +Opening a default event store: + +``` +const orbitdb = await OrbitDB() +await orbitdb.open('my-db') +``` + +Opening a documents database: + +``` +const orbitdb = await OrbitDB() +await orbitdb.open('my-db', { type: 'documents' }) +``` + +Opening a keyvalue database: + +``` +const orbitdb = await OrbitDB() +await orbitdb.open('my-db', { type: 'keyvalue' }) +``` + +Opening a database and adding metadata: + +``` +const meta = { description: 'A database with metadata.' } +const orbitdb = await OrbitDB() +await orbitdb.open('my-db', { meta }) +``` + +## Loading an existing database + +``` +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db') +await db.close() +const dbReopened = await orbitdb.open(db.address) +``` + +## Interacting with a database + +### Adding/Putting items in a database + +All databases expose a common `put` function which is used to add items to the database.
+ +``` +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db', { type: 'keyvalue' }) +const hash = await db.put('key', 'value') +``` + +For databases such as Events, which is an append-only data store, a `null` key will need to be used: + +``` +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db') +const hash = await db.put(null, 'event') +``` + +Alternatively, append-only databases can implement the convenience function `add`: + +``` +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db') +const hash = await db.add('event') +``` + +### Removing/Deleting items from a database + +To delete an item from a database, pass its key to the `del` function: + +``` +const orbitdb = await OrbitDB() +const db = await orbitdb.open('my-db', { type: 'keyvalue' }) +const hash = await db.put('key', 'value') +await db.del('key') +``` + +## Replicating a database across peers + +``` +import * as IPFS from 'ipfs-core' + +const ipfs1 = await IPFS.create({ config1, repo: './ipfs1' }) +const ipfs2 = await IPFS.create({ config2, repo: './ipfs2' }) + +const orbitdb1 = await OrbitDB({ ipfs: ipfs1, id: 'user1', directory: './orbitdb1' }) +const orbitdb2 = await OrbitDB({ ipfs: ipfs2, id: 'user2', directory: './orbitdb2' }) + +const db1 = await orbitdb1.open('my-db') +const db2 = await orbitdb2.open(db1.address) +``` + +## Building a custom database + +OrbitDB can be extended to use custom or third party data stores. To implement a custom database, ensure the Database object is extended and that the OrbitDB database interface is implemented. The database will also require a unique type. + +``` +const CustomStore = async ({ OpLog, Database, ipfs, identity, address, name, access, directory, storage, meta, syncAutomatically, indexBy = '_id' }) => { + const database = await Database({ OpLog, ipfs, identity, address, name, access, directory, storage, meta, syncAutomatically }) + + const { addOperation, log } = database + + /** + * Puts an item to the underlying database.
You will probably want to call + * Database's addOperation here with an op code 'PUT'. + */ + const put = async (doc) => { + } + + /** + * Deletes an item from the underlying database. You will probably want to + * call Database's addOperation here with an op code 'DEL'. + */ + const del = async (key) => { + } + + /** + * Gets an item from the underlying database. Use a hash or key to retrieve + * the value. + */ + const get = async (key) => { + } + + /** + * Iterates over the data set. + */ + const iterator = async function * ({ amount } = {}) { + } + + return { + ...database, + type: 'customstore', + put, + del, + get, + iterator + } +} +``` \ No newline at end of file diff --git a/docs/GETTING_STARTED.md b/docs/GETTING_STARTED.md new file mode 100644 index 0000000..aaa869b --- /dev/null +++ b/docs/GETTING_STARTED.md @@ -0,0 +1,57 @@ +# Getting Started + +This guide will help you get up and running with a simple OrbitDB database that you can replicate across multiple peers. + +## Install + +Install OrbitDB: + +``` +npm i orbit-db +``` + +You will also need IPFS for replication: + +``` +npm i ipfs-core +``` + +## Creating a simple database + +To create a database, launch an instance of OrbitDB and call the `open` function with a unique database name: + +``` +const ipfs = await IPFS.create() +const orbitdb = await OrbitDB({ ipfs }) +const db = await orbitdb.open('my-db') +``` + +Once opened, your new database will reside on the system it was created on. + +Without a type, OrbitDB defaults to a database type of 'events'.
To change the database type, pass a `type` with a valid database type: + +``` +const type = 'documents' +const ipfs = await IPFS.create() +const orbitdb = await OrbitDB({ ipfs }) +const db = await orbitdb.open('my-db', { type }) +``` + +## Replicating a database + +A database created on one peer can be replicated on another by opening the database by its address rather than by its name: + +``` +const address = '/orbitdb/zdpuAzzxCWEzRffxFrxNNVkcVFbkmA1EQdpZJJPc3wpjojkAT' +const ipfs = await IPFS.create() +const orbitdb = await OrbitDB({ ipfs }) +const db = await orbitdb.open(address) +``` + +IPFS is required for carrying out the underlying replication. + +More information about replication is available in the [Replication](./REPLICATION.md) documentation. + +## Further Reading + +The [Databases](./DATABASES.md) documentation covers replication and data entry in more detail. \ No newline at end of file diff --git a/docs/IDENTITIES.md b/docs/IDENTITIES.md new file mode 100644 index 0000000..e42be2c --- /dev/null +++ b/docs/IDENTITIES.md @@ -0,0 +1,28 @@ +# Identities + +An identity is a cryptographically signed identifier or "id" and can be used to sign and verify various data. Within OrbitDB, the main objective of an identity is to verify write access to a database's log and, if allowed, to sign each entry as it is added to the log. + +Identities provides methods to manage one or more identities and includes functionality for creating, retrieving, signing and verifying an identity as well as signing and verifying messages using an existing identity.
+ +## Creating an identity + +``` +const id = 'userA' +const identities = await Identities() +const identity = await identities.createIdentity({ id }) +``` + +Once created, the identity can be passed to OrbitDB: + +``` +const orbitdb = await OrbitDB({ identity }) +``` + +## Specifying a keystore + +``` +const keystore = await KeyStore() +const id = 'userA' +const identities = await Identities({ keystore }) +const identity = await identities.createIdentity({ id }) +``` \ No newline at end of file diff --git a/docs/OPLOG.md b/docs/OPLOG.md new file mode 100644 index 0000000..cb8869f --- /dev/null +++ b/docs/OPLOG.md @@ -0,0 +1,122 @@ +# Operations Log + +The operations log, or oplog, contains an immutable list of operations which have been carried out on the database. + +Each operation is known as an entry and each entry includes the id of the log the entry is stored in, some metadata describing the entry, references to other entries which come before it and a payload which includes the data being stored. + +## Conflict-free Replicated Data Types (CRDT) + +In a distributed system such as OrbitDB, a log is replicated across multiple systems. These replicas are updated independently of one another so that the state of one replica may differ greatly from the state of another. Differing replicas leave the state of the log inconsistent. Concurrent updates to multiple versions of the same log require a mechanism for resolving inconsistencies and returning the log to a consistent state across all replicas. + +A CRDT is a data structure that is replicated across multiple systems, can be updated concurrently without any coordination with other replicas, and can resolve inconsistencies between replicas when replicas are merged.
+ +To learn more about CRDTs, check out this research: + +- ["A comprehensive study of Convergent and Commutative Replicated Data Types"](http://hal.upmc.fr/inria-00555588/document) paper +- [CRDTs on Wikipedia](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#Known_CRDTs) +- [IPFS's CRDT research group](https://github.com/ipfs/research-CRDT) + + +## Entry retrieval + +Entries can be retrieved by iterating over the oplog. + +``` +for await (const entry of log.iterator()) { + console.log(entry) +} +``` + +A subset of entries can also be retrieved by passing filters to the iterator. Available options include: + +**amount:** only return a certain number of entries. By default, all entries are returned. + +**gt:** return all entries after entry with specified hash (exclusive), + +**gte:** return all entries after entry with specified hash (inclusive), + +**lt:** return all entries before entry with specified hash (exclusive), + +**lte:** return all entries before entry with specified hash (inclusive). + +For example, the last 5 entries can be retrieved by passing the amount parameter: + +``` +log.iterator({ amount: 5 }) +``` + +If the log contains fewer than 5 entries, all entries will be returned. + +Additionally, multiple parameters can be used. To retrieve 2 entries prior to an entry with hash '123', use amount and lt: + +``` +log.iterator({ amount: 2, lt: '123' }) +``` + +"Before" and "after" are determined by the order in which the entries are sorted. By default, entries are sorted newest to oldest. + +## Entry sorting and conflict resolution + +OpLog relies on a sort function to determine the order in which entries are returned. By default, OpLog uses the sort function Last Write Wins, which uses a logical clock to determine whether one entry is "newer" than another one. + +Sorting can be customized by passing an alternative function: + +```javascript +const CustomSortFn = (a, b) => { + // alternative sorting mechanism.
+} + +const identity = Identities.createIdentity('userA') +const db = await Log(identity, { sortFn: CustomSortFn }) +``` + +See Conflict Resolution for more information about creating a custom sort function. + +When a log contains a single history, the order of entries can be easily determined by a simple incremental counter. However, when two or more logs are joined, conflicts can occur between entries. When conflicts arise, a clock is used to resolve them. + +### Ordering decentralized logs + +In a centralized database, entries are stored in a single table. This allows for the order of entries to be easily determined, either by assigning a sequentially incremental number (e.g. 1, 2, 3, etc) or timestamp (e.g. 1681199558, 2023-01-01 23:11:56, etc). + +In a decentralized database, multiple versions of a database may be running across various locations, and may not be connected all of the time. While some kind of sequential identifier can be used for entries within a standalone database, problems arise if entries from distributed copies of the database are joined together. + +Because of the ad hoc nature of a connection between databases, a number of issues can arise: + +- Databases are not always connected, and may be offline for long periods of time, +- The various systems running the database may not share the exact same time, +- It is possible for two entries in two different databases to be written at exactly the same time. + +This means the traditional sequential or temporal id cannot guarantee a single order of entries across multiple copies of the database. Hence, this problem is solved through the use of a logical clock. + +### Logical Clock + +A logical clock provides a method to timestamp entries without needing to know the current state of a clock on another system. + +A logical clock contains two properties: a unique hash to distinguish the clock from other clocks and a logical "time". As each new entry is added, the time is incremented.
Together, both properties allow entries to be sorted: if there is a "time" clash (i.e. both items have the same "time"), the ordering can fall back to the hash as a final attempt at collision resolution. + +### Last Write Wins + +Imagine there are two logs, A and B, which share entries (i.e. they represent the operations of the same database). + +Log A has a logical clock initialized with the hash "1". Log B has a logical clock initialized with the hash "2". Both clocks are initialized with time equal to "0". + +Entries are added to A: + +``` +A.append('A1') // Time: 1 +A.append('A2') // Time: 2 +A.append('A3') // Time: 3 +``` + +At the same time, entries are added to B: + +``` +B.append('B1') // Time: 1 +B.append('B2') // Time: 2 +``` + +Log A is joined to log B. + +Iterating over the entries in log B will yield A1, B1, A2, B2, A3. The order is determined by the sort function, which, by default, is Last Write Wins (LWW). The LWW function will determine that an entry with a greater time will come after an entry with a lesser time. Therefore, B2 follows B1 and A3 follows both A2 and A1. The clock's hash determines the order for entries with the same logical time. Therefore, B1 follows A1 and B2 follows A2. + +Joining log B to log A should yield the same results because the sort function is the same for both logs. This ensures the ordering of the log entries is deterministic and, thus, the same across databases. \ No newline at end of file diff --git a/docs/REPLICATION.md b/docs/REPLICATION.md new file mode 100644 index 0000000..2f4f06a --- /dev/null +++ b/docs/REPLICATION.md @@ -0,0 +1,38 @@ +# Replication + +Below is a simple replication example. Both peers run within the same Node daemon.
+ +``` +const waitFor = async (valueA, toBeValueB, pollInterval = 100) => { + return new Promise((resolve) => { + const interval = setInterval(async () => { + if (await valueA() === await toBeValueB()) { + clearInterval(interval) + resolve() + } + }, pollInterval) + }) +} + +let connected1 = false +let connected2 = false + +const onConnected1 = async (peerId, heads) => { + connected1 = true +} + +const onConnected2 = async (peerId, heads) => { + connected2 = true +} + +db1.events.on('join', onConnected1) +db2.events.on('join', onConnected2) + +await db1.put({ _id: 1, msg: 'record 1 on db 1' }) +await db2.put({ _id: 2, msg: 'record 2 on db 2' }) +await db1.put({ _id: 3, msg: 'record 3 on db 1' }) +await db2.put({ _id: 4, msg: 'record 4 on db 2' }) + +await waitFor(() => connected1, () => true) +await waitFor(() => connected2, () => true) +``` diff --git a/docs/STORAGE.md b/docs/STORAGE.md new file mode 100644 index 0000000..4d6c494 --- /dev/null +++ b/docs/STORAGE.md @@ -0,0 +1,57 @@ +# Storage + +OrbitDB is all about storage, and storage can be configured to best meet the needs of the implementation. Storage is also designed to be hierarchical, allowing for a variety of storage mechanisms to be used together. + +## Storage types + +OrbitDB is bundled with the following storage: + +- IPFSBlockStorage: IPFS block level storage, +- LevelStorage: LevelDB-based storage, +- LRUStorage: A Least Recently Used cache, +- MemoryStorage: A memory only array, +- ComposedStorage: A storage mechanism combining two other storage objects. + +All storage objects expose two common functions, `put` for adding a record and `get` for retrieving a record. This allows for storage to be easily swapped in and out based on the needs of the database solution. + +### Composed storage + +ComposedStorage combines two of the above storage objects. 
This reduces the need for performance and capability trade-offs because a combination of storage mechanisms can be used for a balance of speed vs memory usage. For example, MemoryStorage plus LevelStorage can be used for fast retrieval plus semi-permanent data storage, or LRUStorage for efficient caching of frequently accessed items plus IPFSBlockStorage for replication. + +To use composed storage, create two storage objects and then pass them to an instance of `ComposedStorage`: + +``` +const memoryStorage = await MemoryStorage() +const levelStorage = await LevelStorage() + +const composedStorage = await ComposedStorage(memoryStorage, levelStorage) +``` + +The order in which the storage objects are passed to ComposedStorage is important. When accessed, ComposedStorage will attempt to retrieve the data from the first storage mechanism, so this should be the performance-based storage. If not found, ComposedStorage will attempt to retrieve the data from the second storage; this will likely be some kind of permanent storage mechanism. + +## Customizing Storage + +By default, OrbitDB uses `ComposedStorage`, but storage can be customized across most functionality. For example, the storage OpLog uses for its entries, heads and index can be set explicitly when the log is created. The example below passes `MemoryStorage` for all three; to store database operations permanently, pass `LevelStorage` instead: + +``` +const identities = await Identities() +const identity = await identities.createIdentity({ id: 'userA' }) +const entryStorage = await MemoryStorage() +const headsStorage = await MemoryStorage() +const indexStorage = await MemoryStorage() +const log = await Log(identity, { entryStorage, headsStorage, indexStorage }) +await log.append('An operation') +``` + +## Implementing a third party storage solution + +Any storage mechanism can be used with OrbitDB provided it implements the OrbitDB storage interface.
Once created, simply pass the storage instance to OrbitDB: + +``` +const identities = await Identities() +const identity = await identities.createIdentity({ id: 'userA' }) +const entryStorage = await CustomStorage() +const headsStorage = await CustomStorage() +const indexStorage = await CustomStorage() +const log = await Log(identity, { entryStorage, headsStorage, indexStorage }) +``` diff --git a/test/orbitdb-open.test.js b/test/orbitdb-open.test.js index 55cca06..fbc5dd8 100644 --- a/test/orbitdb-open.test.js +++ b/test/orbitdb-open.test.js @@ -209,6 +209,14 @@ describe('Open databases', function () { deepStrictEqual(all.map(e => e.value), expected) }) + + it('re-opens a database by address', async () => { + const dbReopened = await orbitdb1.open(db.address) + + strictEqual(dbReopened.address, db.address) + + dbReopened.close() + }) }) describe('opening a database as a different user', () => {