docs: Add architecture diagrams and documentation

Joachim Van Herwegen 2022-08-03 10:34:58 +02:00
parent dd9781b5f2
commit 528823725a
13 changed files with 641 additions and 99 deletions

View File

@ -38,7 +38,7 @@ the [changelog](https://github.com/CommunitySolidServer/CommunitySolidServer/blo
## What the internals look like
* [How the server uses dependency injection](architecture/dependency-injection.md)
* [What the architecture looks like](architecture/architecture.md)
* [What the architecture looks like](architecture/overview.md)
## Making changes

View File

@ -1,15 +1,7 @@
# Architecture overview
# Core building blocks
The initial architecture document the project was started from can be found [here](https://rubenverborgh.github.io/solid-server-architecture/solid-architecture-v1-3-0.pdf).
Many things have been added since the original inception of the project,
but the core ideas within that document are still valid.
As can be seen from the architecture, an important idea is the modularity of all components.
No actual implementations are defined there, only their interfaces.
Making all the components independent of each other in such a way provides us with an enormous flexibility:
they can all be replaced by a different implementation, without impacting anything else.
This is how we can provide many different configurations for the server,
and why it is impossible to provide ready solutions for all possible combinations.
There are several core building blocks used in many places of the server.
These are described here.
## Handlers
A very important building block that gets reused in many places is the `AsyncHandler`.
@ -48,26 +40,3 @@ Internally this means we are mostly handling data as `Readable` objects.
We actually use `Guarded<Readable>` which is an internal format we created to help us with error handling.
Such streams can be created using utility functions such as `guardStream` and `guardedStreamFrom`.
Similarly, we have a `pipeSafely` to pipe streams in such a way that also helps with errors.
## Example request
In this section we will give a high level overview of all the components
a request passes through when it enters the server.
This is specifically an LDP request, e.g. a POST request to create a new resource.
1. The correct `HttpHandler` gets found, responsible for LDP requests.
2. The HTTP request gets parsed into a manageable format, both body and metadata such as headers.
3. The identification credentials of the request, if any, are extracted and parsed to authenticate the calling agent.
4. The request gets authorized or rejected, based on the credentials from step 3
and the authorization rules of the target resource.
5. Based on the HTTP method, the corresponding method from the `ResourceStore` gets called,
which in the case of a POST request will return the location of the newly created resource.
6. The returned data and metadata get converted to an HTTP response and sent back in the `ResponseWriter`.
In case any of the steps above error, an error will be thrown.
The `ErrorHandler` will convert the error to an HTTP response to be returned.
Below are sections that go deeper into the specific steps.
Not all steps are covered yet and will be added in the future.
* [How authentication and authorization work](features/authorization.md)
* [What the `ResourceStore` looks like](features/resource-store.md)

View File

@ -1,58 +0,0 @@
# Authorization
Authorization is usually handled by the `AuthorizingHttpHandler`,
and goes in the following steps:
1. Identify the credentials of the agent making the call.
2. Extract which access modes are needed for which resources.
3. Read the permissions the agent has.
4. Compare the above results to see if the request is allowed.
## Authentication
There are multiple `CredentialsExtractor`s that each determine identity in a different way.
Potentially multiple extractors can apply,
making a requesting agent have multiple credentials.
The `DPoPWebIdExtractor` is most relevant for the [Solid-OIDC specification](https://solid.github.io/solid-oidc/),
as it parses the access token generated by a Solid Identity Provider.
Besides that there are always the public credentials, which everyone has.
There are also some debug extractors that can be used to simulate credentials,
which can be enabled as different options through the `config/ldp/authentication` imports.
If successful, a `CredentialsExtractor` will return a key/value map
linking the type of credentials to their specific values.
## Modes extraction
Access modes are a predefined list of `read`, `write`, `append`, `create` and `delete`.
The `ModesExtractor`s determine which modes will be necessary and for which resources,
based on the request contents.
The `MethodModesExtractor` determines modes based on the HTTP method.
A GET request will always need the `read` mode for example.
Specifically for PATCH requests there are extractors for each supported PATCH type,
such as the `N3PatchModesExtractor`,
which parses the N3 Patch body to know if it will add new data or only delete data.
## Permission reading
`PermissionReader`s take the input of the above to determine which permissions are available for which credentials.
The modes from the previous step are not yet needed,
but can be used as optimization as we only need to know if we have permission on those modes.
Each reader returns all the information it can find based on the resources and modes it receives.
Those results then get combined in the `UnionPermissionReader`.
In the default configuration the following readers are combined.
* `PathBasedReader` rejects all permissions for certain paths, to prevent access to internal data.
* `OwnerPermissionReader` grants control permissions to agents that are trying to access data in a pod that they own.
* `AuxiliaryReader` handles all permissions for auxiliary resources by requesting those of the subject resource if necessary.
* `ParentContainerReader` checks the necessary permissions on a parent container when creating or deleting a resource.
* `WebAclAuxiliaryReader` determines permissions on ACL resources by requesting if the subject resource has control permissions.
* `WebAclReader` reads out the relevant ACL resource to read out the defined permissions.
All of the above applies if you have WebACL enabled.
It is also possible to always grant all permissions for debugging reasons
by changing the authorization import to `config/ldp/authorization/allow-all.json`.
## Authorization
All the results of the previous steps then get combined to either allow or reject a request.
If no permissions are found for a requested mode,
or they are explicitly forbidden,
a 401/403 will be returned,
depending on if the agent was logged in or not.

View File

@ -0,0 +1,52 @@
# Parsing command line arguments
When starting the server, the application actually uses Components.js twice to instantiate components.
The first instantiation is used to parse the command line arguments.
These then get converted into Components.js variables and are used to instantiate the actual server.
## Architecture
```mermaid
flowchart TD
CliResolver("<strong>CliResolver</strong><br>CliResolver")
CliResolver --> CliResolverArgs
subgraph CliResolverArgs[" "]
CliExtractor("<strong>CliExtractor</strong><br>YargsCliExtractor")
ShorthandResolver("<strong>ShorthandResolver</strong><br>CombinedShorthandResolver")
end
ShorthandResolver --> ShorthandResolverArgs
subgraph ShorthandResolverArgs[" "]
BaseUrlExtractor("<br>BaseUrlExtractor")
KeyExtractor("<br>KeyExtractor")
AssetPathExtractor("<br>AssetPathExtractor")
end
```
The `CliResolver` (`urn:solid-server-app-setup:default:CliResolver`) is simply a way
to combine both the `CliExtractor` (`urn:solid-server-app-setup:default:CliExtractor`)
and `ShorthandResolver` (`urn:solid-server-app-setup:default:ShorthandResolver`)
into a single object and has no other function.
Which arguments are supported and which Components.js variables are generated
can depend on the configuration that is being used.
For example, for an HTTPS server additional arguments will be needed to specify the necessary key/cert files.
## CliExtractor
The `CliExtractor` converts the incoming string of arguments into a key/value object.
By default, a `YargsCliExtractor` is used, which makes use of the `yargs` library and is configured similarly.
## ShorthandResolver
The `ShorthandResolver` uses the key/value object that was generated above to generate Components.js variable bindings.
A `CombinedShorthandResolver` combines the results of multiple `ShorthandExtractor`s
by mapping their values to specific variables.
For example, a `BaseUrlExtractor` will be used to extract the value for `baseUrl`,
or `port` if no `baseUrl` value is provided,
and use it to generate the value for the variable `urn:solid-server:default:variable:baseUrl`.
These extractors are also where the default values for the server are defined.
For example, `BaseUrlExtractor` will be instantiated with a default port of `3000`
which will be used if no port is provided.
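The resolution of `baseUrl` described above can be sketched roughly as follows.
This is a simplified illustration and not the actual server code:
the `CliArgs` type, the `*Sketch` names, the `resolveShorthand` helper,
the `http://localhost` fallback and the trailing-slash handling are all assumptions made for this example.

```typescript
// Hypothetical shape of the key/value object produced from the command line arguments.
type CliArgs = Record<string, string | number | undefined>;

// Simplified, synchronous stand-in for a shorthand extractor (an assumption of this sketch).
interface ShorthandExtractorSketch {
  extract: (args: CliArgs) => string;
}

// An explicit `baseUrl` wins; otherwise the port is used,
// falling back to the default port the extractor was created with.
class BaseUrlExtractorSketch implements ShorthandExtractorSketch {
  private readonly defaultPort: number;

  public constructor(defaultPort = 3000) {
    this.defaultPort = defaultPort;
  }

  public extract(args: CliArgs): string {
    if (typeof args.baseUrl === 'string') {
      // Assumed behaviour: make sure the base URL ends with a slash.
      return args.baseUrl.endsWith('/') ? args.baseUrl : `${args.baseUrl}/`;
    }
    const port = args.port ?? this.defaultPort;
    return `http://localhost:${port}/`;
  }
}

// Maps every variable name to the value produced by its extractor.
function resolveShorthand(
  args: CliArgs,
  extractors: Record<string, ShorthandExtractorSketch>,
): Record<string, string> {
  const variables: Record<string, string> = {};
  for (const variable of Object.keys(extractors)) {
    variables[variable] = extractors[variable].extract(args);
  }
  return variables;
}

const resolvedVariables = resolveShorthand(
  { port: 8080 },
  { 'urn:solid-server:default:variable:baseUrl': new BaseUrlExtractorSketch() },
);
```

Keeping the default inside the extractor means a configuration can change it without touching the argument parsing.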
The variables generated here will be used to [initialize the server](initialization.md).

View File

@ -0,0 +1,86 @@
# Handling HTTP requests
The direction of the arrows was changed slightly here to make the graph readable.
```mermaid
flowchart LR
HttpHandler("<strong>HttpHandler</strong><br>SequenceHandler")
HttpHandler --> HttpHandlerArgs
subgraph HttpHandlerArgs[" "]
direction LR
Middleware("<strong>Middleware</strong><br><i>HttpHandler</i>")
WaterfallHandler("<br>WaterfallHandler")
end
Middleware --> WaterfallHandler
WaterfallHandler --> WaterfallHandlerArgs
subgraph WaterfallHandlerArgs[" "]
direction TB
StaticAssetHandler("<strong>StaticAssetHandler</strong><br>StaticAssetHandler")
SetupHandler("<strong>SetupHandler</strong><br><i>HttpHandler</i>")
OidcHandler("<strong>OidcHandler</strong><br><i>HttpHandler</i>")
AuthResourceHttpHandler("<strong>AuthResourceHttpHandler</strong><br><i>HttpHandler</i>")
IdentityProviderHttpHandler("<strong>IdentityProviderHttpHandler</strong><br><i>HttpHandler</i>")
LdpHandler("<strong>LdpHandler</strong><br><i>HttpHandler</i>")
end
StaticAssetHandler --> SetupHandler
SetupHandler --> OidcHandler
OidcHandler --> AuthResourceHttpHandler
AuthResourceHttpHandler --> IdentityProviderHttpHandler
IdentityProviderHttpHandler --> LdpHandler
```
The `HttpHandler` is responsible for handling an incoming HTTP request.
The request will always first go through the `Middleware`,
where certain required headers will be added such as CORS headers.
After that it will go through the list in the `WaterfallHandler`
to find the first handler that understands the request,
with the `LdpHandler` at the bottom being the catch-all default.
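The waterfall behaviour described above can be illustrated with a small sketch.
It assumes a simplified, synchronous handler interface with `canHandle` and `handle` functions;
the `*Sketch` names, the `prefixHandler` helper and the string results are hypothetical and only serve the example.

```typescript
// Hypothetical, reduced request and handler shapes for this sketch.
interface RequestSketch {
  url: string;
}

interface HandlerSketch {
  canHandle: (request: RequestSketch) => boolean;
  handle: (request: RequestSketch) => string;
}

// Tries every handler in order and uses the first one that accepts the request.
class WaterfallSketch implements HandlerSketch {
  private readonly handlers: HandlerSketch[];

  public constructor(handlers: HandlerSketch[]) {
    this.handlers = handlers;
  }

  public canHandle(request: RequestSketch): boolean {
    return this.handlers.some((handler): boolean => handler.canHandle(request));
  }

  public handle(request: RequestSketch): string {
    const match = this.handlers.find((handler): boolean => handler.canHandle(request));
    if (!match) {
      throw new Error(`No handler supports ${request.url}`);
    }
    return match.handle(request);
  }
}

// Creates a handler that only accepts URLs starting with the given prefix.
function prefixHandler(prefix: string, label: string): HandlerSketch {
  return {
    canHandle: (request): boolean => request.url.startsWith(prefix),
    handle: (): string => label,
  };
}

const waterfallHandler = new WaterfallSketch([
  prefixHandler('/.oidc/', 'oidc'),
  prefixHandler('/idp/', 'idp'),
  // Catch-all at the bottom of the list, in the same position as the `LdpHandler`.
  prefixHandler('/', 'ldp'),
]);

const oidcResult = waterfallHandler.handle({ url: '/.oidc/token' });
const ldpResult = waterfallHandler.handle({ url: '/profile/card' });
```

Because order matters, more specific handlers have to come before the catch-all.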
## StaticAssetHandler
The `urn:solid-server:default:StaticAssetHandler` matches exact URLs to static assets which require no further logic.
An example of this is the favicon, where the `/favicon.ico` URL
is directed to the favicon file at `/templates/images/favicon.ico`.
It can also map entire folders to a specific path, such as `/.well-known/css/styles/` which contains all stylesheets.
## SetupHandler
The `urn:solid-server:default:SetupHandler` is responsible
for redirecting all requests to `/setup` until setup is finished,
thereby ensuring that setup needs to be finished before anything else can be done on the server,
and handling the actual setup request that is sent to `/setup`.
Once setup is finished, this handler will reject all requests and thus no longer be relevant.
If the server is configured to not have setup enabled,
the corresponding identifier will point to a handler that always rejects all requests.
## OidcHandler
The `urn:solid-server:default:OidcHandler` handles all requests related
to the Solid-OIDC [specification](https://solid.github.io/solid-oidc/).
The OIDC component is configured to work on the `/.oidc/` subpath,
so this handler catches all those requests and sends them to the internal OIDC library that is used.
## AuthResourceHttpHandler
The `urn:solid-server:default:AuthResourceHttpHandler` is identical
to the `urn:solid-server:default:LdpHandler` which will be discussed below,
but only handles resources relevant for authorization.
In practice this means that if your server is configured
to use [Web Access Control](https://solidproject.org/TR/wac) for authorization,
this handler will catch all requests targeting `.acl` resources.
The reason these already need to be handled here is so these can also be used
to allow authorization on the following handler(s).
More on this can be found in the [identity provider](../../../usage/identity-provider/#access) documentation.
## IdentityProviderHttpHandler
The `urn:solid-server:default:IdentityProviderHttpHandler` handles everything
related to our custom identity provider API, such as registering, logging in, returning the relevant HTML pages, etc.
All these requests are identified by being on the `/idp/` subpath.
More information on the API can be found in the [identity provider](../../../usage/identity-provider) documentation.
## LdpHandler
Once a request reaches the `urn:solid-server:default:LdpHandler`,
the server assumes this is a standard Solid request according to the Solid protocol.
A detailed description of what happens then can be found [here](protocol/overview.md).

View File

@ -0,0 +1,124 @@
# Server initialization
When starting the server, multiple Initializers trigger to set up everything correctly,
the last one of which starts listening to the specified port.
Similarly, when stopping the server several Finalizers trigger to clean up where necessary,
although the latter only happens when starting the server through code.
## App
```mermaid
flowchart TD
App("<strong>App</strong><br>App")
App --> AppArgs
subgraph AppArgs[" "]
Initializer("<strong>Initializer</strong><br><i>Initializer</i>")
AppFinalizer("<strong>Finalizer</strong><br><i>Finalizer</i>")
end
```
`App` (`urn:solid-server:default:App`) is the main component that gets instantiated by Components.js.
Every other component should be able to trace an instantiation path back to it if it also wants to be instantiated.
Its only function is to contain an `Initializer` and `Finalizer`
which get called by calling `start`/`stop` respectively.
## Initializer
```mermaid
flowchart TD
Initializer("<strong>Initializer</strong><br>SequenceHandler")
Initializer --> InitializerArgs
subgraph InitializerArgs[" "]
direction LR
LoggerInitializer("<strong>LoggerInitializer</strong><br>LoggerInitializer")
PrimaryInitializer("<strong>PrimaryInitializer</strong><br>ProcessHandler")
WorkerInitializer("<strong>WorkerInitializer</strong><br>ProcessHandler")
end
LoggerInitializer --> PrimaryInitializer
PrimaryInitializer --> WorkerInitializer
```
The very first thing that needs to happen is initializing the logger.
Before this, other classes will be unable to use logging.
The `PrimaryInitializer` will only trigger once, in the primary worker thread,
while the `WorkerInitializer` will trigger for every worker thread.
If your server setup is single-threaded, which is the default,
there is no relevant difference between the two.
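The ordering guarantee of the sequence above can be sketched as follows.
This is a minimal, synchronous illustration: the `InitializerSketch` interface,
the `SequenceSketch` class and the `recordingStep` helper are assumptions for this example,
and the steps only record their name instead of doing real work.

```typescript
// Simplified, synchronous initializer interface (an assumption of this sketch).
interface InitializerSketch {
  initialize: () => void;
}

// Runs all its initializers one after the other, in the order they were given.
class SequenceSketch implements InitializerSketch {
  private readonly steps: InitializerSketch[];

  public constructor(steps: InitializerSketch[]) {
    this.steps = steps;
  }

  public initialize(): void {
    for (const step of this.steps) {
      step.initialize();
    }
  }
}

// Records the order in which the steps were triggered.
const startupOrder: string[] = [];

function recordingStep(label: string): InitializerSketch {
  return {
    initialize: (): void => {
      startupOrder.push(label);
    },
  };
}

const initializerSequence = new SequenceSketch([
  recordingStep('LoggerInitializer'),
  recordingStep('PrimaryInitializer'),
  recordingStep('WorkerInitializer'),
]);
initializerSequence.initialize();
```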
### PrimaryInitializer
```mermaid
flowchart TD
PrimaryInitializer("<strong>PrimaryInitializer</strong><br>ProcessHandler")
PrimaryInitializer --> PrimarySequenceInitializer("<strong>PrimarySequenceInitializer</strong><br>SequenceHandler")
PrimarySequenceInitializer --> PrimarySequenceInitializerArgs
subgraph PrimarySequenceInitializerArgs[" "]
direction LR
CleanupInitializer("<strong>CleanupInitializer</strong><br>SequenceHandler")
PrimaryParallelInitializer("<strong>PrimaryParallelInitializer</strong><br>ParallelHandler")
WorkerManager("<strong>WorkerManager</strong><br>WorkerManager")
end
CleanupInitializer --> PrimaryParallelInitializer
PrimaryParallelInitializer --> WorkerManager
```
The above is a simplification of all the initializers that are present in the `PrimaryInitializer`
as there are several smaller initializers that also trigger but are less relevant here.
The `CleanupInitializer` is an initializer that cleans up anything
that might have remained from a previous server start
and could impact behaviour.
Relevant components in other parts of the configuration are responsible for adding themselves to this array if needed.
An example of this is file-based locking components which might need to remove any dangling locking files.
The `PrimaryParallelInitializer` can be used to add any initializers that have to happen in the primary process.
This makes it easier for users to add initializers by being able to append to its handlers.
The `WorkerManager` is responsible for setting up the worker threads, if any.
### WorkerInitializer
```mermaid
flowchart TD
WorkerInitializer("<strong>WorkerInitializer</strong><br>ProcessHandler")
WorkerInitializer --> WorkerSequenceInitializer("<strong>WorkerSequenceInitializer</strong><br>SequenceHandler")
WorkerSequenceInitializer --> WorkerSequenceInitializerArgs
subgraph WorkerSequenceInitializerArgs[" "]
direction LR
WorkerParallelInitializer("<strong>WorkerParallelInitializer</strong><br>ParallelHandler")
ServerInitializer("<strong>ServerInitializer</strong><br>ServerInitializer")
end
WorkerParallelInitializer --> ServerInitializer
```
The `WorkerInitializer` is quite similar to the `PrimaryInitializer` but triggers once per worker thread.
Like the `PrimaryParallelInitializer`, the `WorkerParallelInitializer` can be used
to add any custom initializers that need to run.
### ServerInitializer
The `ServerInitializer` is the initializer that finally starts up the server by listening to the relevant port,
once all the initialization described above is finished.
This is an example of a component that differs based on some of the choices made during configuration.
```mermaid
flowchart TD
ServerInitializer("<strong>ServerInitializer</strong><br>ServerInitializer")
ServerInitializer --> WebSocketServerFactory("<strong>ServerFactory</strong><br>WebSocketServerFactory")
WebSocketServerFactory --> BaseHttpServerFactory("<br>BaseHttpServerFactory")
BaseHttpServerFactory --> HttpHandler("<strong>HttpHandler</strong><br><i>HttpHandler</i>")
ServerInitializer2("<strong>ServerInitializer</strong><br>ServerInitializer")
ServerInitializer2 ---> BaseHttpServerFactory2("<strong>ServerFactory</strong><br>BaseHttpServerFactory")
BaseHttpServerFactory2 --> HttpHandler2("<strong>HttpHandler</strong><br><i>HttpHandler</i>")
```
Depending on if the configurations necessary for websockets are imported or not,
the `urn:solid-server:default:ServerFactory` identifier will point to a different resource.
There will always be a `BaseHttpServerFactory` that starts the HTTP(S) server,
but there might also be a `WebSocketServerFactory` wrapped around it to handle websocket support.
Although not indicated here, the parameters for initializing the `BaseHttpServerFactory`
might also differ in case an HTTPS configuration is imported.
The `HttpHandler` it takes as input is responsible for how [HTTP requests get resolved](http-handler.md).

View File

@ -0,0 +1,163 @@
# Authorization
```mermaid
flowchart TD
AuthorizingHttpHandler("<br>AuthorizingHttpHandler")
AuthorizingHttpHandler --> AuthorizingHttpHandlerArgs
subgraph AuthorizingHttpHandlerArgs[" "]
CredentialsExtractor("<strong>CredentialsExtractor</strong><br><i>CredentialsExtractor</i>")
ModesExtractor("<strong>ModesExtractor</strong><br><i>ModesExtractor</i>")
PermissionReader("<strong>PermissionReader</strong><br><i>PermissionReader</i>")
Authorizer("<strong>Authorizer</strong><br>PermissionBasedAuthorizer")
OperationHttpHandler("<br><i>OperationHttpHandler</i>")
end
```
Authorization is usually handled by the `AuthorizingHttpHandler`,
which receives a parsed HTTP request in the form of an `Operation`.
It goes through the following steps:
1. A `CredentialsExtractor` identifies the credentials of the agent making the call.
2. A `ModesExtractor` finds which access modes are needed for which resources.
3. A `PermissionReader` determines the permissions the agent has on the targeted resources.
4. The above results are compared in an `Authorizer`.
5. If the request is allowed, call the `OperationHttpHandler`, otherwise throw an error.
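The five steps above can be sketched as a single function.
This is a heavily simplified, synchronous illustration: the `*Sketch` types, the `AuthorizationParts` interface,
the single-target permission check and the toy configuration at the bottom are all assumptions made for this example.

```typescript
type AccessModeSketch = 'read' | 'write' | 'append' | 'create' | 'delete';

// Hypothetical, heavily reduced versions of a parsed request and its credentials.
interface OperationSketch {
  method: string;
  target: string;
  webId?: string;
}

interface CredentialsSketch {
  webId: string | undefined;
}

// Synchronous stand-ins for the components in the diagram above.
interface AuthorizationParts {
  extractCredentials: (operation: OperationSketch) => CredentialsSketch;
  extractModes: (operation: OperationSketch) => AccessModeSketch[];
  readPermissions: (credentials: CredentialsSketch, target: string) => Set<AccessModeSketch>;
  handleOperation: (operation: OperationSketch) => string;
}

function authorizeAndHandle(operation: OperationSketch, parts: AuthorizationParts): string {
  // Step 1: who is making the request?
  const credentials = parts.extractCredentials(operation);
  // Step 2: which access modes does the request need?
  const modes = parts.extractModes(operation);
  // Step 3: which modes may these credentials use on the target?
  const permissions = parts.readPermissions(credentials, operation.target);
  // Step 4: compare what is needed with what is allowed.
  const missing = modes.filter((mode): boolean => !permissions.has(mode));
  if (missing.length > 0) {
    throw new Error(`Missing access modes: ${missing.join(', ')}`);
  }
  // Step 5: the request is allowed, so hand it to the wrapped handler.
  return parts.handleOperation(operation);
}

// Toy configuration: everyone can read, only one hypothetical WebID can write.
const ownerWebId = 'https://alice.example/profile/card#me';
const toyAuthorization: AuthorizationParts = {
  extractCredentials: (operation): CredentialsSketch => ({ webId: operation.webId }),
  extractModes: (operation): AccessModeSketch[] => (operation.method === 'GET' ? [ 'read' ] : [ 'write' ]),
  readPermissions: (credentials): Set<AccessModeSketch> =>
    new Set<AccessModeSketch>(credentials.webId === ownerWebId ? [ 'read', 'write' ] : [ 'read' ]),
  handleOperation: (operation): string => `${operation.method} ${operation.target} handled`,
};

const publicRead = authorizeAndHandle({ method: 'GET', target: '/notes' }, toyAuthorization);
```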
## Authentication
There are multiple `CredentialsExtractor`s that each determine identity in a different way.
Potentially multiple extractors can apply,
making a requesting agent have multiple credentials.
The diagram below shows the default configuration if authentication is enabled.
```mermaid
flowchart TD
CredentialsExtractor("<strong>CredentialsExtractor</strong><br>UnionCredentialsExtractor")
CredentialsExtractor --> CredentialsExtractorArgs
subgraph CredentialsExtractorArgs[" "]
WaterfallHandler("<br>WaterfallHandler")
PublicCredentialsExtractor("<br>PublicCredentialsExtractor")
end
WaterfallHandler --> WaterfallHandlerArgs
subgraph WaterfallHandlerArgs[" "]
direction LR
DPoPWebIdExtractor("<br>DPoPWebIdExtractor") --> BearerWebIdExtractor("<br>BearerWebIdExtractor")
end
```
Both of the WebID extractors make use of
the [`access-token-verifier`](https://github.com/CommunitySolidServer/access-token-verifier) library
to parse incoming tokens based on the [Solid-OIDC specification](https://solid.github.io/solid-oidc/).
Besides those there are always the public credentials, which everyone has.
All these credentials then get combined into a single union object.
If successful, a `CredentialsExtractor` will return a key/value map
linking the type of credentials to their specific values.
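How the waterfall and union from the diagram could interact is sketched below.
The extractors here are stand-ins: they only look at the scheme of a hypothetical `Authorization` header value
and treat the rest as the WebID, whereas real extractors verify a token.
The `agent` and `public` keys and all function names are assumptions for this example.

```typescript
// Hypothetical credential map: keys are credential types, values hold their details.
type CredentialMapSketch = Record<string, { webId?: string }>;
type ExtractorSketch = (authorization: string | undefined) => CredentialMapSketch | undefined;

// Stand-in for a token-based extractor that only checks the scheme.
function schemeExtractor(scheme: string): ExtractorSketch {
  return (authorization): CredentialMapSketch | undefined => {
    const prefix = `${scheme} `;
    if (authorization === undefined || !authorization.startsWith(prefix)) {
      return undefined;
    }
    return { agent: { webId: authorization.slice(prefix.length) } };
  };
}

// Everyone has the public credentials.
const publicExtractor: ExtractorSketch = (): CredentialMapSketch => ({ public: {} });

// Uses the first extractor that returns a result.
function waterfallExtractor(extractors: ExtractorSketch[]): ExtractorSketch {
  return (authorization): CredentialMapSketch | undefined => {
    for (const extractor of extractors) {
      const result = extractor(authorization);
      if (result) {
        return result;
      }
    }
    return undefined;
  };
}

// Merges the results of every extractor that succeeded into a single object.
function unionExtractor(extractors: ExtractorSketch[]): ExtractorSketch {
  return (authorization): CredentialMapSketch => {
    const combined: CredentialMapSketch = {};
    for (const extractor of extractors) {
      Object.assign(combined, extractor(authorization) ?? {});
    }
    return combined;
  };
}

const extractCredentials = unionExtractor([
  waterfallExtractor([ schemeExtractor('DPoP'), schemeExtractor('Bearer') ]),
  publicExtractor,
]);

const loggedInCredentials = extractCredentials('Bearer https://alice.example/profile/card#me');
const anonymousCredentials = extractCredentials(undefined);
```

With this setup a logged-in agent ends up with both its own and the public credentials, while an anonymous request only has the public ones.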
There are also debug configuration options available that can be used to simulate credentials.
These can be enabled as different options through the `config/ldp/authentication` imports.
## Modes extraction
Access modes are a predefined list of `read`, `write`, `append`, `create` and `delete`.
The `ModesExtractor` determines which modes will be necessary and for which resources,
based on the request contents.
```mermaid
flowchart TD
ModesExtractor("<strong>ModesExtractor</strong><br>IntermediateCreateExtractor")
ModesExtractor --> HttpModesExtractor("<strong>HttpModesExtractor</strong><br>WaterfallHandler")
HttpModesExtractor --> HttpModesExtractorArgs
subgraph HttpModesExtractorArgs[" "]
direction LR
PatchModesExtractor("<strong>PatchModesExtractor</strong><br><i>ModesExtractor</i>") --> MethodModesExtractor("<br>MethodModesExtractor")
end
```
The `IntermediateCreateExtractor` handles requests that try to create intermediate containers in a single request.
E.g., a PUT request to `/foo/bar/baz` should create both the `/foo/` and `/foo/bar/` containers in case they do not exist yet.
This extractor makes sure that `create` permissions are also checked on those containers.
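As an illustration, the containers involved in the `/foo/bar/baz` example could be computed as in the sketch below.
The function names are hypothetical, and the sketch assumes absolute paths
in which containers are identified by a trailing slash.

```typescript
// Lists the intermediate containers between the root and the targeted resource.
function intermediateContainers(path: string): string[] {
  const segments = path.split('/').filter((segment): boolean => segment.length > 0);
  // Drop the last segment: that one is the resource targeted by the request itself.
  segments.pop();
  const containers: string[] = [];
  let current = '/';
  for (const segment of segments) {
    current += `${segment}/`;
    containers.push(current);
  }
  return containers;
}

// Only containers that do not exist yet need the `create` mode to be checked.
function containersToCreate(path: string, existing: Set<string>): string[] {
  return intermediateContainers(path).filter((container): boolean => !existing.has(container));
}

const allIntermediate = intermediateContainers('/foo/bar/baz');
const stillMissing = containersToCreate('/foo/bar/baz', new Set([ '/foo/' ]));
```

Here `allIntermediate` contains `/foo/` and `/foo/bar/`, and `stillMissing` only contains `/foo/bar/` since `/foo/` already exists.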
Modes can usually be determined based on just the HTTP methods,
which is what the `MethodModesExtractor` does.
A GET request will always need the `read` mode for example.
The only exceptions are PATCH requests,
where the necessary modes depend on the body and the PATCH type.
```mermaid
flowchart TD
PatchModesExtractor("<strong>PatchModesExtractor</strong><br>WaterfallHandler") --> PatchModesExtractorArgs
subgraph PatchModesExtractorArgs[" "]
N3PatchModesExtractor("<br>N3PatchModesExtractor")
SparqlUpdateModesExtractor("<br>SparqlUpdateModesExtractor")
end
```
The server supports both N3 Patch and SPARQL Update PATCH requests.
In both cases it will parse the bodies to determine what the impact would be of the request and what modes it requires.
## Permission reading
`PermissionReader`s take the input of the above to determine which permissions are available for which credentials.
The modes from the previous step are not yet needed,
but can be used as optimization as we only need to know if we have permission on those modes.
Each reader returns all the information it can find based on the resources and modes it receives.
In the default configuration the following readers are combined when WebACL is enabled as authorization method.
In case authorization is disabled by changing the authorization import to `config/ldp/authorization/allow-all.json`,
this diagram is just a class that always returns all permissions.
```mermaid
flowchart TD
PermissionReader("<strong>PermissionReader</strong><br>AuxiliaryReader")
PermissionReader --> UnionPermissionReader("<br>UnionPermissionReader")
UnionPermissionReader --> UnionPermissionReaderArgs
subgraph UnionPermissionReaderArgs[" "]
PathBasedReader("<strong>PathBasedReader</strong><br>PathBasedReader")
OwnerPermissionReader("<strong>OwnerPermissionReader</strong><br>OwnerPermissionReader")
WrappedWebAclReader("<strong>WrappedWebAclReader</strong><br>ParentContainerReader")
end
WrappedWebAclReader --> WebAclAuxiliaryReader("<strong>WebAclAuxiliaryReader</strong><br>WebAclAuxiliaryReader")
WebAclAuxiliaryReader --> WebAclReader("<strong>WebAclReader</strong><br>WebAclReader")
```
The first thing that happens is that if the target is an auxiliary resource that uses the authorization of its subject resource,
the `AuxiliaryReader` inserts that identifier instead.
An example of this is if the request targets the metadata of a resource.
The `UnionPermissionReader` then combines the results of its readers into a single permission object.
If one reader rejects a specific mode and another allows it, the rejection takes priority.
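The "rejection takes priority" rule can be sketched as a small merge function.
The `PermissionSketch` shape, with `true` for allowed, `false` for rejected and a missing key for "no opinion",
is an assumption made for this example and not necessarily how the server represents permissions.

```typescript
type ModeSketch = 'read' | 'write' | 'append' | 'create' | 'delete';

// `true` allows a mode, `false` explicitly rejects it, a missing key means "no opinion".
type PermissionSketch = Partial<Record<ModeSketch, boolean>>;

function combinePermissions(results: PermissionSketch[]): PermissionSketch {
  const combined: PermissionSketch = {};
  for (const result of results) {
    for (const mode of Object.keys(result) as ModeSketch[]) {
      const value = result[mode];
      if (value === undefined) {
        continue;
      }
      // Once a mode is rejected it stays rejected, no matter what other readers say.
      combined[mode] = combined[mode] === false ? false : value;
    }
  }
  return combined;
}

const mergedPermissions = combinePermissions([
  { read: true, write: true },
  { write: false },
]);
```

In this example `mergedPermissions.read` is `true` and `mergedPermissions.write` is `false`, independent of the order of the two inputs.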
The `PathBasedReader` rejects all permissions for certain paths.
This is used to prevent access to the internal data of the server.
The `OwnerPermissionReader` makes sure owners always have control access
to the [pods they created on the server](../../../../usage/identity-provider/#pod).
Users will always be able to modify the ACL resources in their pod,
even if they accidentally removed their own access.
The final readers are specifically relevant for the WebACL algorithm.
The `ParentContainerReader` checks the permissions on a parent resource if required:
creating a resource requires `append` permissions on the parent container,
while deleting a resource requires `write` permissions there.
In case the target is an ACL resource, `control` permissions need to be checked,
no matter what mode was generated by the `ModesExtractor`.
The `WebAclAuxiliaryReader` makes sure this conversion happens.
Finally, the `WebAclReader` implements
the [effective ACL resource algorithm](https://solidproject.org/TR/2021/wac-20210711#effective-acl-resource)
and returns the permissions it finds in that resource.
In case no ACL resource is found this indicates a configuration error and no permissions will be granted.
## Authorization
All the results of the previous steps then get combined in the `PermissionBasedAuthorizer` to either allow or reject a request.
If no permissions are found for a requested mode,
or they are explicitly forbidden,
a 401/403 will be returned,
depending on if the agent was logged in or not.
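The choice between the two status codes can be sketched as follows.
The function name and the plain key/value permission object are hypothetical;
the sketch only illustrates the decision described above.

```typescript
// Returns `undefined` when the request is allowed,
// or the HTTP status code the request should be rejected with.
function rejectionStatus(
  requestedModes: string[],
  permissions: Record<string, boolean | undefined>,
  isLoggedIn: boolean,
): 401 | 403 | undefined {
  // A mode can only be used when it is explicitly granted:
  // both a missing entry and an explicit `false` lead to a rejection.
  const allowed = requestedModes.every((mode): boolean => permissions[mode] === true);
  if (allowed) {
    return undefined;
  }
  // Agents that are not logged in get a 401, logged-in agents a 403.
  return isLoggedIn ? 403 : 401;
}

const anonymousWrite = rejectionStatus([ 'write' ], { read: true }, false);
const loggedInWrite = rejectionStatus([ 'write' ], { read: true, write: false }, true);
```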

View File

@ -0,0 +1,30 @@
# Solid protocol
The `LdpHandler`, named as a reference to the Linked Data Platform specification,
chains several handlers together, each with their own specific purpose, to fully resolve the HTTP request.
It specifically handles Solid requests as described
in the protocol [specification](https://solidproject.org/TR/protocol),
e.g. a POST request to create a new resource.
Below is a simplified view of how these handlers are linked.
```mermaid
flowchart LR
LdpHandler("<strong>LdpHandler</strong><br>ParsingHttpHandler")
LdpHandler --> AuthorizingHttpHandler("<br>AuthorizingHttpHandler")
AuthorizingHttpHandler --> OperationHandler("<strong>OperationHandler</strong><br><i>OperationHandler</i>")
OperationHandler --> ResourceStore("<strong>ResourceStore</strong><br><i>ResourceStore</i>")
```
A standard request would go through the following steps:
1. The `ParsingHttpHandler` parses the HTTP request into a manageable format, both body and metadata such as headers.
2. The `AuthorizingHttpHandler` verifies if the request is authorized to access the targeted resource.
3. The `OperationHandler` determines which action is required based on the HTTP method.
4. The `ResourceStore` does all the relevant data work.
5. The `ParsingHttpHandler` eventually receives the response data, or an error, and handles the output.
Below are sections that go deeper into the specific steps.
* [How input gets parsed and output gets returned](parsing.md)
* [How authentication and authorization work](authorization.md)
* [What the `ResourceStore` looks like](resource-store.md)

View File

@ -0,0 +1,102 @@
# Parsing and responding to HTTP requests
```mermaid
flowchart TD
ParsingHttpHandler("<br>ParsingHttpHandler")
ParsingHttpHandler --> ParsingHttpHandlerArgs
subgraph ParsingHttpHandlerArgs[" "]
RequestParser("<strong>RequestParser</strong><br>BasicRequestParser")
AuthorizingHttpHandler("<strong></strong><br>AuthorizingHttpHandler")
ErrorHandler("<strong>ErrorHandler</strong><br><i>ErrorHandler</i>")
ResponseWriter("<strong>ResponseWriter</strong><br>BasicResponseWriter")
end
```
A `ParsingHttpHandler` handles both the parsing of the input data, and the serializing of the output data.
It follows these 3 steps:
1. Use the `RequestParser` to convert the incoming data into an `Operation`.
2. Send the `Operation` to the `AuthorizingHttpHandler` to receive either a `ResponseDescription` if the operation was a success,
or an `Error` in case something went wrong.
* In case of an error the `ErrorHandler` will convert the `Error` into a `ResponseDescription`.
3. Use the `ResponseWriter` to output the `ResponseDescription` as an HTTP response.
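The three steps can be summarized in a short sketch.
Everything here is simplified and synchronous: the `ParsingParts` interface, the string-based input and output,
and the status codes used by the toy components are assumptions for this example only.

```typescript
// Hypothetical, reduced shapes of the intermediate objects.
interface ParsedOperation {
  method: string;
  target: string;
}

interface ResponseSketch {
  statusCode: number;
  data?: string;
}

// Synchronous stand-ins for the components in the diagram above.
interface ParsingParts {
  parse: (rawRequest: string) => ParsedOperation;
  handle: (operation: ParsedOperation) => ResponseSketch;
  handleError: (error: unknown) => ResponseSketch;
  write: (response: ResponseSketch) => string;
}

function parseHandleRespond(rawRequest: string, parts: ParsingParts): string {
  let response: ResponseSketch;
  try {
    // Step 1: convert the raw input into an operation.
    const operation = parts.parse(rawRequest);
    // Step 2: let the next handler produce a response description.
    response = parts.handle(operation);
  } catch (error: unknown) {
    // Step 2, error case: convert the error into a response description instead.
    response = parts.handleError(error);
  }
  // Step 3: serialize the response description.
  return parts.write(response);
}

const toyParts: ParsingParts = {
  parse(rawRequest): ParsedOperation {
    const [ method, target ] = rawRequest.split(' ');
    if (!method || !target) {
      throw new Error('Malformed request line');
    }
    return { method, target };
  },
  handle: (operation): ResponseSketch => ({ statusCode: 200, data: `contents of ${operation.target}` }),
  handleError: (error): ResponseSketch =>
    ({ statusCode: 500, data: error instanceof Error ? error.message : 'Unknown error' }),
  write: (response): string => `${response.statusCode} ${response.data ?? ''}`.trim(),
};

const okOutput = parseHandleRespond('GET /notes', toyParts);
const errorOutput = parseHandleRespond('broken', toyParts);
```

Both the success and the error path end in the same writing step, which is why the error first has to be converted into a response description.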
## Parsing the request
```mermaid
flowchart TD
RequestParser("<strong>RequestParser</strong><br>BasicRequestParser") --> RequestParserArgs
subgraph RequestParserArgs[" "]
TargetExtractor("<strong>TargetExtractor</strong><br>OriginalUrlExtractor")
PreferenceParser("<strong>PreferenceParser</strong><br>AcceptPreferenceParser")
MetadataParser("<strong>MetadataParser</strong><br><i>MetadataParser</i>")
BodyParser("<br><i>BodyParser</i>")
Conditions("<br>BasicConditionsParser")
end
TargetExtractor --> IdentifierStrategy("<strong>IdentifierStrategy</strong><br><i>IdentifierStrategy</i>")
```
The `BasicRequestParser` is mostly an aggregator of multiple smaller parsers that each handle a very specific part.
### URL
This is a single class, the `OriginalUrlExtractor`, but fulfills the very important role
of making sure input URLs are handled consistently.
The query parameters will always be completely removed from the URL.
There is also an algorithm to make sure all URLs have a "canonical" version as for example both `&` and `%26`
can be interpreted in the same way.
Specifically all special characters will be encoded into their percent encoding.
The `IdentifierStrategy` it gets as input is used to determine if the resulting URL is within the scope of the server.
This can differ depending on if the server uses subdomains or not.
The resulting identifier will be stored in the `target` field of an `Operation` object.
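The idea of a canonical URL can be illustrated with the sketch below, which only looks at the path.
It is not the algorithm used by the `OriginalUrlExtractor`:
the function name is hypothetical, and using `encodeURIComponent` to decide which characters count as special
is an assumption of this example.

```typescript
// Returns a canonical form of a request path.
// Note that `decodeURIComponent` throws on malformed percent-encoding,
// so this sketch assumes well-formed input.
function canonicalPath(rawPath: string): string {
  // Query parameters are dropped entirely.
  const queryStart = rawPath.indexOf('?');
  const path = queryStart >= 0 ? rawPath.slice(0, queryStart) : rawPath;
  // Decode and re-encode every segment so that `&` and `%26` end up identical.
  return path
    .split('/')
    .map((segment): string => encodeURIComponent(decodeURIComponent(segment)))
    .join('/');
}

const fromRaw = canonicalPath('/foo/a&b?x=1');
const fromEncoded = canonicalPath('/foo/a%26b');
```

Both calls result in `/foo/a%26b`, so the two spellings identify the same resource.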
### Preferences
The `AcceptPreferenceParser` parses the `Accept` header and all the relevant `Accept-*` headers.
These will all be put into the `preferences` field of an `Operation` object.
These will later be used to handle the content negotiation.
For example, when sending an `Accept: text/turtle; q=0.9` header,
this will result in the preferences object `{ type: { 'text/turtle': 0.9 } }`.
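A minimal sketch of how such an `Accept` header could be turned into that object is shown below.
The `parseAccept` function is hypothetical and ignores details such as quoted parameter values
and the other `Accept-*` headers; a missing `q` parameter is treated as a weight of 1.

```typescript
// Maps each media range to its weight.
type TypePreferences = Record<string, number>;

function parseAccept(header: string): { type: TypePreferences } {
  const type: TypePreferences = {};
  for (const part of header.split(',')) {
    const [ range, ...parameters ] = part.split(';').map((piece): string => piece.trim());
    if (!range) {
      continue;
    }
    // The weight defaults to 1 when there is no valid `q` parameter.
    let weight = 1;
    for (const parameter of parameters) {
      const [ key, value ] = parameter.split('=').map((piece): string => piece.trim());
      if (key?.toLowerCase() === 'q' && value !== undefined) {
        const parsed = Number.parseFloat(value);
        if (!Number.isNaN(parsed)) {
          weight = parsed;
        }
      }
    }
    type[range] = weight;
  }
  return { type };
}

const turtlePreferences = parseAccept('text/turtle; q=0.9');
const mixedPreferences = parseAccept('text/turtle, application/ld+json;q=0.5');
```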
### Headers
Several other headers can have relevant metadata,
such as the `Content-Type` header,
or the `Link: <http://www.w3.org/ns/ldp#Container>; rel="type"` header
which is used to indicate to the server that a request intends to create a container.
Such headers are converted to RDF triples and stored in the `RepresentationMetadata` object,
which will be part of the `body` field in the `Operation`.
The default `MetadataParser` is a `ParallelHandler` that contains several smaller parsers,
each looking at a specific header.
### Body
In case of most requests, the input data stream is used directly in the `body` field of the `Operation`,
with a few minor checks to make sure the HTTP specification is being followed.
In the case of PATCH requests though,
there are several specific body parsers that will convert the request
into a JavaScript object containing all the necessary information to execute such a PATCH.
Several validation checks will already take place there as well.
### Conditions
The `BasicConditionsParser` parses everything related to conditions headers,
such as `if-none-match` or `if-modified-since`,
and stores the relevant information in the `conditions` field of the `Operation`.
These will later be used to determine whether the request should be aborted.
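As a rough sketch of what this could look like, assuming a simplified `Conditions` shape defined only for this example:
the function names are hypothetical, only a single ETag and modification date are compared,
and special values such as `*` are not handled.

```typescript
// Hypothetical shape of the parsed conditions, chosen for this sketch.
interface Conditions {
  matchesETag?: string[];
  notMatchesETag?: string[];
  modifiedSince?: Date;
  unmodifiedSince?: Date;
}
type RequestHeaders = Record<string, string | undefined>;

function parseTags(value?: string): string[] | undefined {
  return value ? value.split(',').map((tag): string => tag.trim()) : undefined;
}

function parseDate(value?: string): Date | undefined {
  const date = new Date(value ?? '');
  // Missing or invalid dates are ignored.
  return isNaN(date.getTime()) ? undefined : date;
}

// Collects the condition headers into a single object.
function parseConditions(headers: RequestHeaders): Conditions {
  return {
    matchesETag: parseTags(headers['if-match']),
    notMatchesETag: parseTags(headers['if-none-match']),
    modifiedSince: parseDate(headers['if-modified-since']),
    unmodifiedSince: parseDate(headers['if-unmodified-since']),
  };
}

// Returns false if the request should be aborted for a resource with the given state.
function conditionsMet(conditions: Conditions, eTag: string, modified: Date): boolean {
  if (conditions.matchesETag && conditions.matchesETag.indexOf(eTag) < 0) {
    return false;
  }
  if (conditions.notMatchesETag && conditions.notMatchesETag.indexOf(eTag) >= 0) {
    return false;
  }
  if (conditions.modifiedSince && modified.getTime() <= conditions.modifiedSince.getTime()) {
    return false;
  }
  if (conditions.unmodifiedSince && modified.getTime() > conditions.unmodifiedSince.getTime()) {
    return false;
  }
  return true;
}
```

Parsing happens once at the start of the request,
while a check such as `conditionsMet` can only happen later, once the current state of the resource is known.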
## Sending the response
If a request is successful, the `AuthorizingHttpHandler` will return a `ResponseDescription`,
and if not it will throw an error.
Such an error will be caught by the `ErrorHandler` and converted into a `ResponseDescription`,
using the request preferences to pick a serialization the client prefers.
Either way we will have a `ResponseDescription`,
which will be sent to the `BasicResponseWriter` to convert into output headers, data and a status code.
To convert the metadata into headers, it uses a `MetadataWriter`,
which functions as the reverse of the `MetadataParser` mentioned above:
it has multiple writers which each convert certain metadata into a specific header.
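The flow described in this section can be sketched as follows.
This is a synchronous simplification with hypothetical names and types:
the real classes are asynchronous, work with streams and RDF metadata,
and map errors to more specific status codes than the fixed 500 used here.

```typescript
// Assumption: predicate chosen for this sketch to describe the content type.
const FORMAT = 'http://www.w3.org/ns/ma-ont#format';

// Hypothetical, simplified stand-in for a response description.
interface ResponseDescription {
  statusCode: number;
  metadata: Record<string, string>;
  data?: string;
}
type OutputHeaders = Record<string, string>;
type MetadataWriter = (metadata: Record<string, string>, headers: OutputHeaders) => void;

// One small writer per header, the reverse of the header parsers.
const contentTypeWriter: MetadataWriter = (metadata, headers): void => {
  const contentType = metadata[FORMAT];
  if (contentType) {
    headers['content-type'] = contentType;
  }
};

// Either the handler returns a description, or the thrown error is converted into one.
function describeResponse(handle: () => ResponseDescription, preferredType: string): ResponseDescription {
  try {
    return handle();
  } catch (error) {
    const message = error instanceof Error ? error.message : 'Unknown error';
    return { statusCode: 500, metadata: { [FORMAT]: preferredType }, data: message };
  }
}

// Converts the description into a status code, headers and data.
function writeResponse(response: ResponseDescription, writers: MetadataWriter[]):
{ statusCode: number; headers: OutputHeaders; body?: string } {
  const headers: OutputHeaders = {};
  for (const writer of writers) {
    writer(response.metadata, headers);
  }
  return { statusCode: response.statusCode, headers, body: response.data };
}
```

Because both the success and the error path end in the same kind of object,
the writing step does not need to know whether the request failed.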

@ -1,6 +1,4 @@
# Resource store
Once an LDP request passes authorization, it will be passed to the `ResourceStore`.
The interface of a `ResourceStore` is mostly a 1-to-1 mapping of the HTTP methods:
* GET: `getRepresentation`

@ -0,0 +1,62 @@
# Architecture overview
The initial architecture document the project was started from can be found
[here](https://rubenverborgh.github.io/solid-server-architecture/solid-architecture-v1-3-0.pdf).
Many things have been added since the original inception of the project,
but the core ideas within that document are still valid.
As can be seen from the architecture, an important idea is the modularity of all components.
No actual implementations are defined there, only their interfaces.
Making all the components independent of each other in such a way provides us with enormous flexibility:
they can all be replaced by a different implementation, without impacting anything else.
This is how we can provide many different configurations for the server,
and why it is impossible to provide ready solutions for all possible combinations.
## Architecture diagrams
Having a modular architecture makes it more difficult to give a complete architecture overview.
We will limit ourselves to the more commonly used default configurations we provide,
and in certain cases we might give examples of what differences there are
based on what configurations are being imported.
To do this we will make use of architecture diagrams.
We will use an example below to explain the formatting used throughout the architecture documentation:
```mermaid
flowchart TD
LdpHandler("<strong>LdpHandler</strong><br>ParsingHttpHandler")
LdpHandler --> LdpHandlerArgs
subgraph LdpHandlerArgs[" "]
RequestParser("<strong>RequestParser</strong><br>BasicRequestParser")
Auth("<br>AuthorizingHttpHandler")
ErrorHandler("<strong>ErrorHandler</strong><br><i>ErrorHandler</i>")
ResponseWriter("<strong>ResponseWriter</strong><br>BasicResponseWriter")
end
```
Below is a summary of how to interpret such diagrams:
* Rounded red box: component instantiated in the Components.js [configuration](dependency-injection.md).
    * First line:
        * **Bold text**: shorthand of the instance identifier. In case the full URI is not specified,
          it can usually be found by prepending `urn:solid-server:default:` to the shorthand identifier.
        * (empty): this instance has no identifier and is defined in the same place as its parent.
    * Second line:
        * Regular text: The class of this instance.
        * _Italic text_: The interface of this instance.
          Will be used if the actual class is not relevant for the explanation or can differ.
* Square grey box: the parameters of the linked instance.
* Arrow: links an instance to its parameters. Can also be used to indicate the order of parameters if relevant.

For example, in the above, **LdpHandler** is a shorthand for the actual identifier `urn:solid-server:default:LdpHandler`
and is an instance of `ParsingHttpHandler`. It has 4 parameters,
one of which has no identifier but is an instance of `AuthorizingHttpHandler`.
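To make the mapping between diagram and instances more concrete,
the example diagram can be written down as a plain data structure.
The `Instance` type and the `toIdentifier` helper are hypothetical and only serve this illustration;
they are not part of the server or of Components.js.

```typescript
const DEFAULT_PREFIX = 'urn:solid-server:default:';

// Hypothetical helper: expands a shorthand from a diagram into the full identifier.
function toIdentifier(shorthand: string): string {
  return DEFAULT_PREFIX + shorthand;
}

// Hypothetical representation of one rounded box in a diagram.
interface Instance {
  // First line of the box, absent if the instance has no identifier.
  id?: string;
  // Second line of the box: the class or the interface.
  type: string;
  // Contents of the grey box the arrow points to.
  parameters?: Instance[];
}

const ldpHandler: Instance = {
  id: toIdentifier('LdpHandler'),
  type: 'ParsingHttpHandler',
  parameters: [
    { id: toIdentifier('RequestParser'), type: 'BasicRequestParser' },
    // No identifier: defined in the same place as its parent.
    { type: 'AuthorizingHttpHandler' },
    // Italic in the diagram: only the interface is given.
    { id: toIdentifier('ErrorHandler'), type: 'ErrorHandler' },
    { id: toIdentifier('ResponseWriter'), type: 'BasicResponseWriter' },
  ],
};
```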
# Features
Below are the sections that go deeper into the features of the server and how those work.
* [How Command Line Arguments are parsed and used](features/cli.md)
* [How the server is initialized and started](features/initialization.md)
* [How HTTP requests are handled](features/http-handler.md)
* [How the server handles a standard Solid request](features/protocol/overview.md)

@ -14,6 +14,7 @@ The links here assume the server is hosted at `http://localhost:3000/`.
To register an account, you can go to `http://localhost:3000/idp/register/` if this feature is enabled,
which it is on all configurations we provide.
Currently our registration page ties 3 features together on the same page:
* Creating an account on the server.
* Creating or linking a WebID to your account.
* Creating a pod on the server.

@ -54,7 +54,12 @@ markdown_extensions:
- pymdownx.highlight
- pymdownx.superfences
- pymdownx.smartsymbols
- pymdownx.superfences:
custom_fences:
# need to fork the theme to make changes https://github.com/squidfunk/mkdocs-material/issues/3665#issuecomment-1060019924
- name: mermaid
class: mermaid
format: !!python/name:pymdownx.superfences.fence_code_format
extra:
version:
@ -79,11 +84,18 @@ nav:
- Client credentials: usage/client-credentials.md
- Seeding pods: usage/seeding-pods.md
- Architecture:
- Architecture: architecture/architecture.md
- Overview: architecture/overview.md
- Dependency injection: architecture/dependency-injection.md
- Core: architecture/core.md
- Features:
- Authorization: architecture/features/authorization.md
- Resource Store: architecture/features/resource-store.md
- Command line arguments: architecture/features/cli.md
- Server initialization: architecture/features/initialization.md
- HTTP requests: architecture/features/http-handler.md
- Solid protocol:
- Overview: architecture/features/protocol/overview.md
- Parsing: architecture/features/protocol/parsing.md
- Authorization: architecture/features/protocol/authorization.md
- Resource Store: architecture/features/protocol/resource-store.md
- Contributing:
- Pull requests: contributing/making-changes.md
- Releases: contributing/release.md
@ -91,3 +103,4 @@ nav:
# To write documentation locally, execute the next line and browse to http://localhost:8000
# docker run --rm -it -p 8000:8000 -v ${PWD}/documentation:/docs squidfunk/mkdocs-material
# Alternatively, install `mkdocs` and `mkdocs-material` using `pip`, browse to the documentation folder and run `mkdocs serve`