MediaGraphOrg/MeGraS

MeGraS

MeGraS, short for MediaGraph Store, is the data storage and query processing engine of the MediaGraph project, which aims to elevate multimodal data to first-class citizens in knowledge graphs. MeGraS stores and processes multimodal knowledge graphs, including their media components, as RDF graphs.

Installation

MeGraS is written in Kotlin and requires a Java runtime environment of version 8 or higher. For some operations on audio and video data, ffmpeg needs to be installed and added to the system path.
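
To check these prerequisites before building or running, a small script can look for the two executables on the system path. The following is a minimal Python sketch; the helper name missing_tools is hypothetical, and it only checks that java and ffmpeg can be found, not that the Java version is 8 or higher.

```python
import shutil


def missing_tools(tools=("java", "ffmpeg")):
    """Return the names of the given executables that are not on the system path."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on the system path:", ", ".join(missing))
    else:
        print("java and ffmpeg were both found on the system path.")
```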

Building MeGraS from Source

MeGraS uses Gradle as a build system. To build the application, simply run ./gradlew distZip and unpack the generated archive in build/distributions.

Using a Docker Image

MeGraS is also available as a Docker image. You can pull and run the latest version from Docker Hub using the following command:

docker run --name megras -p 8080:8080 -v ./assets:/assets -it floruosch/megras:latest

The -v ./assets:/assets option also mounts the assets directory from your local machine into the container, so that media files stored there are accessible from within the container.

TODO: move to a more specific Docker image repository.

Configuring MeGraS

MeGraS uses an optional configuration file in JSON format. The file to be used can be passed as a parameter when starting the application. If no such parameter is provided, MeGraS will look for a config.json file in its root directory. If no such file is found, the default options are used. The configuration options look as follows:

{
  "objectStoreBase": "store", //directory to be used as base for the media object store
  "httpPort": 8080,           //port to listen to for HTTP connections
  "backend": "FILE",          //persistent backend to use for graph information
  "fileStore": {              //options to be used for 'FILE' backend
    "filename": "quads.tsv",  //filename to be used to store graph information in
    "compression": false      //store graph information in compressed form
  },
  "cottontailConnection": {   //options to be used for 'COTTONTAIL' backend
    "host": "localhost",
    "port": 1865
  },
  "postgresConnection": {     //options to be used for the 'POSTGRES' backend
    "host": "localhost",
    "port": 5432,
    "database": "megras",
    "user": "megras",
    "password": "megras"
  }
}
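
The // annotations in the listing above are not part of standard JSON, and this README does not say whether MeGraS accepts them in an actual config file. As a cautious sketch, the following Python snippet writes a comment-free config.json using the values from the listing. The function name write_config and its keyword overrides are hypothetical helpers, not part of MeGraS.

```python
import json
from pathlib import Path

# Values copied from the configuration listing in this README.
LISTED_CONFIG = {
    "objectStoreBase": "store",
    "httpPort": 8080,
    "backend": "FILE",
    "fileStore": {"filename": "quads.tsv", "compression": False},
    "cottontailConnection": {"host": "localhost", "port": 1865},
    "postgresConnection": {
        "host": "localhost",
        "port": 5432,
        "database": "megras",
        "user": "megras",
        "password": "megras",
    },
}


def write_config(path="config.json", **overrides):
    """Write a comment-free config file; top-level overrides replace listed values."""
    config = {**LISTED_CONFIG, **overrides}
    Path(path).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Example: select the PostgreSQL backend and keep everything else as listed.
    write_config("config.json", backend="POSTGRES")
```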

MeGraS supports several different backend implementations for storing the graph data. The backend to be used can be selected using the backend field in the configuration. The following graph storage backends are supported:

  • FILE: keeps all graph triples in memory and periodically dumps everything to a single file. Suitable for smaller graphs and for testing purposes.
  • COTTONTAIL: uses the Cottontail DB vector database. Supports all graph data types, including vector types. Suitable for medium-sized graphs of several 100k triples up to a few million triples.
  • POSTGRES: uses PostgreSQL to store the graph. It also supports vector types and related operations. Suitable for larger graphs up to several tens of millions of triples.
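
Each backend name corresponds to one option block in the configuration. A quick sanity check of a config file before starting the application could look like the following Python sketch. The helper backend_options is hypothetical, and it is stricter than MeGraS may be: it rejects a missing backend field, even though the application might fall back to a default.

```python
import json

# Backend names and the option block each one reads, as listed in this README.
BACKEND_OPTIONS = {
    "FILE": "fileStore",
    "COTTONTAIL": "cottontailConnection",
    "POSTGRES": "postgresConnection",
}


def backend_options(config):
    """Return the option block for the selected backend (empty dict if absent)."""
    backend = config.get("backend")
    if backend not in BACKEND_OPTIONS:
        expected = ", ".join(sorted(BACKEND_OPTIONS))
        raise ValueError(f"unknown backend {backend!r}; expected one of: {expected}")
    return config.get(BACKEND_OPTIONS[backend], {})


if __name__ == "__main__":
    # Assumes config.json is plain JSON without // comments.
    with open("config.json", encoding="utf-8") as f:
        print(backend_options(json.load(f)))
```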

FILE Backend

The FILE backend is the simplest backend available in MeGraS and requires no additional setup.

COTTONTAIL Backend

The COTTONTAIL backend requires a running instance of Cottontail DB. TODO: add more details about the Cottontail DB setup and configuration.

POSTGRES Backend

The POSTGRES backend requires a running instance of PostgreSQL. To set up the database, we recommend the following Docker image, which contains a preconfigured PostgreSQL instance with the required extensions:

docker run -d --name timescaledb -p 5432:5432 -e POSTGRES_PASSWORD=password timescale/timescaledb-ha:pg17

Then, you can connect to PostgreSQL using a client of your choice (e.g., psql) and create the database and user. If you do not have a PostgreSQL client installed, you can use the following command to connect to the database within the Docker container:

docker exec -it timescaledb psql -U postgres

Now create the database and user with the following commands, replacing the username and password with the values set in the postgresConnection section of your configuration:

CREATE USER megras WITH PASSWORD 'megras';
CREATE DATABASE megras WITH OWNER megras;
GRANT ALL PRIVILEGES ON DATABASE megras TO megras;

To connect the MeGraS container to the PostgreSQL database, create a dedicated Docker network, attach both containers to it, and look up the hostname of the database container:

docker network create megras
docker network connect megras timescaledb
docker network connect megras megras
docker exec timescaledb hostname

Set the host field of postgresConnection in config.json to the hostname printed by the last command, then copy the file to the MeGraS container:

docker cp config.json megras:/
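
The edit to config.json described above can also be scripted before the file is copied. The following Python sketch assumes that config.json is plain JSON without // comments and that the hostname belongs in postgresConnection.host. The function name set_postgres_host is hypothetical and not part of MeGraS.

```python
import json
from pathlib import Path


def set_postgres_host(config_path, hostname):
    """Store the database container's hostname in postgresConnection.host."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Strip the trailing newline that command output usually carries.
    config.setdefault("postgresConnection", {})["host"] = hostname.strip()
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    import sys

    # Usage: python set_host.py <hostname printed by the database container>
    set_postgres_host("config.json", sys.argv[1])
```

All other fields in the file are left untouched, so the same script can be rerun if the database container is recreated and its hostname changes.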

Getting Started

Once MeGraS is up and running, it can be accessed via HTTP on the configured port. Further documentation is also available here.
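
A quick way to confirm that something is listening on the configured port is a plain TCP connection attempt. The Python sketch below assumes the host localhost and port 8080 from the example configuration. The helper is_listening is hypothetical, and it only tests that the port accepts connections, not that the service behind it is MeGraS or that any particular HTTP route exists.

```python
import socket


def is_listening(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "accepting" if is_listening() else "not accepting"
    print(f"localhost:8080 is {state} connections")
```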
