Commit 2295fa76 authored by Christian Boettcher

Merge branch 'fastAPI' into 'master'

apiserver based on fastAPI

Closes #1

See merge request rybicki1/datacatalog!1
**__pycache__/
# contains data for local tests
app/
This is the Data Catalog for the eFlows4HPC project.
Find architecture in [arch](arch/arch.adoc) folder.
## API-Server for the Data Catalog
[This](apiserver/) part is the API server for the Data Catalog, which provides the backend functionality.
It is implemented with [FastAPI](https://fastapi.tiangolo.com/) and provides API documentation via OpenAPI.
For deployment via [docker](https://www.docker.com/), a docker image is included.
### API Documentation
While the API server is running, you can see the documentation at `<server-url>/docs` or `<server-url>/redoc`.
### Running without docker
First ensure that your python version is 3.6 or newer.
Then, if they are not yet installed on your machine, install the requirements via pip:
```bash
pip install -r requirements.txt
```
To start the server, run
```bash
uvicorn apiserver:app --reload --reload-dir apiserver
```
while in the project root directory.
Without any other options, this starts your server on `localhost:8000`.
The `--reload --reload-dir apiserver` options ensure that any change to files in the `apiserver` directory causes an immediate reload of the server, which is especially useful during development. If this is not required, just omit the options.
More information about uvicorn settings (including information about how to bind to other network interfaces or ports) can be found [here](https://www.uvicorn.org/settings/).
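For example, to bind to all network interfaces on a different port (hypothetical values), uvicorn's standard `--host` and `--port` flags can be used instead of the reload options:

```bash
uvicorn apiserver:app --host 0.0.0.0 --port 8080
```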
### Testing
First ensure that the `pytest` package is installed (it is included in `requirements.txt`).
Tests are located in the `apiserver_tests` directory. They can be executed by simply running `pytest` while in the project folder.
If more test files are added, they should be named with a `test_` prefix and put into a similarly named folder, so that they can be auto-detected.
The `context.py` file helps with importing the apiserver packages, so that the tests work independently of the local python path setup.
### Using the docker image
#### Building the docker image
To build the docker image of the current version, simply run
```bash
docker build -t datacatalog-apiserver ./apiserver
```
while in the project root directory.
`datacatalog-apiserver` is a local tag to identify the built docker image. You can change it if you want.
#### Running the docker image
To run the docker image in a local container, run
```bash
docker run -d --name <container name> -p 127.0.0.1:<local_port>:80 datacatalog-apiserver
```
`<container name>` is the name of your container; it can be used to refer to the container in other docker commands.
`<local_port>` is the port of your local machine, which will be forwarded to the docker container. For example, if it is set to `8080`, you will be able to reach the api-server at http://localhost:8080.
#### Stopping the docker image
To stop the running container, run
```bash
docker stop <container name>
```
Note that this only stops the container; it does not delete it fully. To do that, run
```bash
docker rm <container name>
```
For more information about docker, please see the [docker docs](https://docs.docker.com).
**/__pycache__
**/.pytest_cache
README.md
Dockerfile
.dockerignore
\ No newline at end of file
FROM tiangolo/uvicorn-gunicorn:python3.8
LABEL maintainer="Christian Böttcher <c.boettcher@fz-juelich.de>"
RUN mkdir -p /app/data
RUN mkdir -p /app/apiserver
RUN apt update && apt upgrade -y
RUN pip install --no-cache-dir fastapi
ADD . /app/apiserver
COPY ./config.env /app/apiserver/config.env
# This RUN creates a python file that contains the fastapi app, in a location that is expected by the base docker image
# This avoids naming conflicts with the packages in the project folder
RUN echo "from apiserver.main import app" > /app/main.py
# set data directory properly for the docker container
RUN sed -i 's_./app/data_/app/data_g' /app/apiserver/config.env
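To illustrate what the `sed` substitution in the last Dockerfile line does, here is a standalone sketch (not part of the image build) that rewrites the relative storage path to the absolute container path:

```bash
echo 'DATACATALOG_APISERVER_JSON_STORAGE_PATH="./app/data"' | sed 's_./app/data_/app/data_g'
# → DATACATALOG_APISERVER_JSON_STORAGE_PATH="/app/data"
```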
# TODO list
- add testing stuff to readme
- merge and add CI Testing
- expand tests
- add authentication stuff
- find out how authentication is to be handled
from .main import app
# contains Settings that can be loaded via pydantic settings .env support
# DATACATALOG_APISERVER_HOST="0.0.0.0",
# DATACATALOG_APISERVER_PORT=80
DATACATALOG_APISERVER_JSON_STORAGE_PATH="./app/data"
from .settings import ApiserverSettings
from pydantic import BaseSettings

DEFAULT_JSON_FILEPATH: str = "./app/data"


## Additional settings can be made available by adding them as properties to this class.
# At launch they will be read from environment variables (case-insensitive).
class ApiserverSettings(BaseSettings):
    json_storage_path: str = DEFAULT_JSON_FILEPATH

    class Config:
        env_prefix: str = "datacatalog_apiserver_"
        env_file: str = "apiserver/config.env"
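Because of the env prefix, a variable like `DATACATALOG_APISERVER_JSON_STORAGE_PATH` overrides `json_storage_path`. The prefix mapping can be sketched without pydantic; `load_setting` below is a hypothetical stand-in for illustration, not part of the codebase:

```python
import os

# Hypothetical helper mimicking the case-insensitive, prefixed env lookup (simplified sketch).
def load_setting(name: str, default: str, prefix: str = "datacatalog_apiserver_") -> str:
    target = (prefix + name).lower()
    for key, value in os.environ.items():
        if key.lower() == target:
            return value
    return default

os.environ["DATACATALOG_APISERVER_JSON_STORAGE_PATH"] = "/tmp/data"
print(load_setting("json_storage_path", "./app/data"))  # → /tmp/data
```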
from typing import Optional
from typing import Dict
from enum import Enum

from fastapi import FastAPI
from fastapi import HTTPException
from pydantic import BaseModel

from .storage import JsonFileStorageAdapter
from .storage import AbstractLocationDataStorageAdapter
from .storage import LocationData
from .storage import LocationDataType
from .config import ApiserverSettings

settings = ApiserverSettings()

app = FastAPI(
    title="API-Server for the Data Catalog"
)

adapter: AbstractLocationDataStorageAdapter = JsonFileStorageAdapter(settings)

#### A NOTE ON IDS
# The id of a dataset is not yet defined; it could simply be generated, be based on some hash
# of the metadata, or simply be the name, which would then need to be enforced to be unique.
# This might change some outputs of the GET functions that list registered elements, but will
# very likely not change any part of the actual API.


# list types of data locations, currently datasets (will be provided by the pillars)
# and targets (possible storage locations for workflow results or similar)
@app.get("/")
def get_types():
    return [{element.value: "/" + element.value} for element in LocationDataType]


# list id and name of every registered dataset for the specified type
@app.get("/{location_data_type}")
def list_datasets(location_data_type: LocationDataType):
    return adapter.getList(location_data_type)


# register a new dataset; the response will contain the new dataset and its id
@app.put("/{location_data_type}")
def add_dataset(location_data_type: LocationDataType, dataset: LocationData):
    usr: str = "testuser"
    return adapter.addNew(location_data_type, dataset, usr)


# return all information about a specific dataset, identified by id
@app.get("/{location_data_type}/{dataset_id}")
def get_specific_dataset(location_data_type: LocationDataType, dataset_id: str):
    try:
        return adapter.getDetails(location_data_type, dataset_id)
    except FileNotFoundError:
        raise HTTPException(status_code=404, detail='The provided id does not exist for this datatype.')


# update the information about a specific dataset, identified by id
@app.put("/{location_data_type}/{dataset_id}")
def update_specific_dataset(location_data_type: LocationDataType, dataset_id: str, dataset: LocationData):
    usr: str = "testuser"
    try:
        return adapter.updateDetails(location_data_type, dataset_id, dataset, usr)
    except FileNotFoundError:
        raise HTTPException(status_code=404, detail='The provided id does not exist for this datatype.')


# delete a specific dataset
@app.delete("/{location_data_type}/{dataset_id}")
def delete_specific_dataset(location_data_type: LocationDataType, dataset_id: str):
    usr: str = "testuser"
    try:
        return adapter.delete(location_data_type, dataset_id, usr)
    except FileNotFoundError:
        raise HTTPException(status_code=404, detail='The provided id does not exist for this datatype.')
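As a standalone sketch, the response of the root endpoint can be reproduced with the same comprehension; `LocationDataType` is re-declared locally here purely for illustration:

```python
from enum import Enum

# local re-declaration of the enum, only for this sketch
class LocationDataType(Enum):
    DATASET = 'dataset'
    STORAGETARGET = 'storage_target'

# same comprehension as in get_types()
result = [{element.value: "/" + element.value} for element in LocationDataType]
print(result)  # → [{'dataset': '/dataset'}, {'storage_target': '/storage_target'}]
```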
import os
import json
import uuid

from typing import List

from .LocationStorage import AbstractLocationDataStorageAdapter, LocationData, LocationDataType
from apiserver.config import ApiserverSettings


class StoredData:
    actualData: LocationData
    users: List[str]

    def toDict(self):
        return {'actualData': self.actualData.__dict__, 'users': self.users}


# This stores LocationData via the StoredData object as json files.
# These json files then contain the actualData, as well as the users with permissions for this LocationData.
# All users have full permission to do anything with this data object, including removing their own access
# (this might trigger a confirmation via the frontend, but this is not enforced via the api).
# IMPORTANT: The adapter does not check for authentication or authorization; it should only be invoked
# after the permissions have been checked.
class JsonFileStorageAdapter(AbstractLocationDataStorageAdapter):
    data_dir: str

    def __init__(self, settings: ApiserverSettings):
        AbstractLocationDataStorageAdapter.__init__(self)
        self.data_dir = settings.json_storage_path
        if not (os.path.exists(self.data_dir) and os.path.isdir(self.data_dir)):
            raise Exception('Data Directory "' + self.data_dir + '" does not exist.')

    def getList(self, type: LocationDataType):
        localpath = os.path.join(self.data_dir, type.value)
        if not os.path.isdir(localpath):
            # This type has apparently not been used yet; create its directory and return an empty result
            os.mkdir(localpath)
            return {}
        allFiles = [f for f in os.listdir(localpath) if os.path.isfile(os.path.join(localpath, f))]
        # each file is checked for its filename (= id) and the LocationData name (which is inside the json)
        retList = []
        for f in allFiles:
            with open(os.path.join(localpath, f)) as file:
                data = json.load(file)
                retList.append({data['actualData']['name']: f})
        return retList

    def addNew(self, type: LocationDataType, data: LocationData, usr: str):
        localpath = os.path.join(self.data_dir, type.value)
        if not os.path.isdir(localpath):
            # This type has apparently not been used yet; create its directory
            os.mkdir(localpath)
        # create a unique id by randomly generating one, re-choosing if it is already taken
        id = str(uuid.uuid4())
        while os.path.exists(os.path.join(localpath, id)):
            id = str(uuid.uuid4())
        toStore = StoredData()
        toStore.users = [usr]
        toStore.actualData = data
        with open(os.path.join(localpath, id), 'w') as json_file:
            json.dump(toStore.toDict(), json_file)
        return {id: data}

    def getDetails(self, type: LocationDataType, id: str):
        localpath = os.path.join(self.data_dir, type.value)
        fullpath = os.path.join(localpath, id)
        if not os.path.isfile(fullpath):
            raise FileNotFoundError('The requested object does not exist.')
        with open(fullpath) as file:
            data = json.load(file)
        return data['actualData']

    def updateDetails(self, type: LocationDataType, id: str, data: LocationData, usr: str):
        localpath = os.path.join(self.data_dir, type.value)
        fullpath = os.path.join(localpath, id)
        if not os.path.isfile(fullpath):
            raise FileNotFoundError('The requested object does not exist.')
        toStore = StoredData()
        toStore.actualData = data
        # keep the permissions from the old file
        with open(fullpath) as file:
            old_data = json.load(file)
            toStore.users = old_data['users']
        with open(fullpath, 'w') as file:
            json.dump(toStore.toDict(), file)
        return {id: data}

    def delete(self, type: LocationDataType, id: str, usr: str):
        localpath = os.path.join(self.data_dir, type.value)
        fullpath = os.path.join(localpath, id)
        if not os.path.isfile(fullpath):
            raise FileNotFoundError('The requested object does not exist.')
        os.remove(fullpath)

    def getOwner(self, type: LocationDataType, id: str):
        raise NotImplementedError()

    def checkPerm(self, type: LocationDataType, id: str, usr: str):
        raise NotImplementedError()

    def addPerm(self, type: LocationDataType, id: str, authUsr: str, newUser: str):
        raise NotImplementedError()

    def rmPerm(self, type: LocationDataType, id: str, usr: str, rmUser: str):
        raise NotImplementedError()
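The on-disk layout the adapter produces can be sketched standalone, using plain dicts instead of the pydantic models; all names and values here are illustrative:

```python
import json
import os
import tempfile
import uuid

# emulate the adapter's directory-per-type layout in a throwaway directory
data_dir = tempfile.mkdtemp()
localpath = os.path.join(data_dir, 'dataset')
os.mkdir(localpath)

# the filename is the generated id; the file holds actualData plus the permitted users
record_id = str(uuid.uuid4())
stored = {'actualData': {'name': 'example', 'url': 'https://example.org', 'metadata': {}},
          'users': ['testuser']}
with open(os.path.join(localpath, record_id), 'w') as f:
    json.dump(stored, f)

# reading the file back mirrors getDetails()
with open(os.path.join(localpath, record_id)) as f:
    loaded = json.load(f)
print(loaded['actualData']['name'])  # → example
```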
from pydantic import BaseModel
from typing import Optional
from typing import Dict
from enum import Enum


class LocationDataType(Enum):
    DATASET: str = 'dataset'
    STORAGETARGET: str = 'storage_target'


class LocationData(BaseModel):
    name: str
    url: str
    metadata: Optional[Dict[str, str]]


'''
This is an abstract storage adapter for storing information about datasets, storage targets and similar things.
It can easily be expanded to also store other data (that has roughly similar metadata), just by expanding the LocationDataType Enum.

In general, all data is public. This means that the adapter does not do any permission checking, except when explicitly asked via the checkPerm function.
The caller therefore has to decide manually when to check for permissions, and not call any function unless it is already authorized (or does not need any authorization).

The usr: str (the user id) that is required for several functions is a unique and immutable string that identifies the user. This can be a verified email or a user name.
The management of authentication etc. is done by the caller; this adapter assumes that the user id fulfills the criteria.
Permissions are stored as a list of user ids, and every id is authorized for full access.
'''


class AbstractLocationDataStorageAdapter:
    # get a list of all LocationData elements with the provided type, as pairs of {name : id}
    def getList(self, type: LocationDataType):
        raise NotImplementedError()

    # add a new element of the provided type, assign and return the id and the new data as {id : LocationData}
    def addNew(self, type: LocationDataType, data: LocationData, usr: str):
        raise NotImplementedError()

    # return the LocationData of the requested object (identified by id and type)
    def getDetails(self, type: LocationDataType, id: str):
        raise NotImplementedError()

    # change the details of the requested object, return {id : newData}
    def updateDetails(self, type: LocationDataType, id: str, data: LocationData, usr: str):
        raise NotImplementedError()

    def delete(self, type: LocationDataType, id: str, usr: str):
        raise NotImplementedError()

    # return the owner of the requested object; if multiple owners are set, return them as a list
    def getOwner(self, type: LocationDataType, id: str):
        raise NotImplementedError()

    # check if the given user has permission to change the given object
    def checkPerm(self, type: LocationDataType, id: str, usr: str):
        raise NotImplementedError()

    # add a user to the object's permissions
    def addPerm(self, type: LocationDataType, id: str, authUsr: str, newUser: str):
        raise NotImplementedError()

    # remove a user from the object's permissions
    def rmPerm(self, type: LocationDataType, id: str, usr: str, rmUser: str):
        raise NotImplementedError()
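A minimal in-memory sketch of the same interface may clarify the contract; everything here is hypothetical and simplified (plain dicts instead of LocationData, sequential ids instead of uuids, the enum re-declared locally):

```python
from enum import Enum

# local re-declaration for a standalone sketch
class LocationDataType(Enum):
    DATASET = 'dataset'
    STORAGETARGET = 'storage_target'

# hypothetical in-memory adapter using the same method names as the abstract interface
class InMemoryStorageAdapter:
    def __init__(self):
        self.store = {t: {} for t in LocationDataType}

    def addNew(self, type: LocationDataType, data: dict, usr: str):
        id = str(len(self.store[type]))  # sequential ids, unlike the uuid-based json adapter
        self.store[type][id] = {'actualData': data, 'users': [usr]}
        return {id: data}

    def getList(self, type: LocationDataType):
        return [{v['actualData']['name']: k} for k, v in self.store[type].items()]

    def getDetails(self, type: LocationDataType, id: str):
        if id not in self.store[type]:
            raise FileNotFoundError('The requested object does not exist.')
        return self.store[type][id]['actualData']

adapter = InMemoryStorageAdapter()
adapter.addNew(LocationDataType.DATASET, {'name': 'demo', 'url': 'https://example.org'}, 'testuser')
print(adapter.getList(LocationDataType.DATASET))  # → [{'demo': '0'}]
```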
from .JsonFileStorageAdapter import JsonFileStorageAdapter
from .LocationStorage import LocationDataType, LocationData, AbstractLocationDataStorageAdapter
import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
import apiserver as apiserver
import apiserver.storage as storage
# These tests only check if every api path that should work responds to requests;
# the functionality itself is not yet checked.
# Therefore this only detects grievous errors in the request handling.
from fastapi.testclient import TestClient

from context import apiserver
from context import storage

client = TestClient(apiserver.app)


# get root
def test_root():
    rsp = client.get('/')
    assert 200 <= rsp.status_code < 300  # any 2xx response is fine; a get to the root should not return an error


# get every type in the type enum
def test_types():
    for location_type in storage.LocationDataType:
        rsp = client.get('/' + location_type.value)
        assert 200 <= rsp.status_code < 300  # any 2xx response is fine; a get to the datatypes should not return an error
# This file only exists as a placeholder, so that local testing works immediately after cloning.
fastapi==0.63.0
pytest==6.2.4
requests==2.25.1
uvicorn==0.13.4
python-dotenv==0.17.1