diff --git a/pages/memgraph-zero/memgql/changelog.mdx b/pages/memgraph-zero/memgql/changelog.mdx index 1ed34640d..13bf212ac 100644 --- a/pages/memgraph-zero/memgql/changelog.mdx +++ b/pages/memgraph-zero/memgql/changelog.mdx @@ -5,6 +5,31 @@ description: MemGQL release notes # MemGQL Changelog +## MemGQL v0.6.3 - June 21st, 2026 + +### ✨ New features & Improvements + +- **Schema-based routing (USE-free queries).** In `multi` mode, queries no + longer need a `USE ` clause — the engine infers the backend from the + query's schema signals (labels, rel-types, properties) against a unified schema + index, built from mappings for SQL backends and live introspection for Cypher + backends. Routing is strict: exactly one candidate routes, zero or multiple + hard-error with the fix; explicit `USE` always wins. USE-free federated + `JOIN` and `UNION` work too, and writes route on a unique candidate. Two new + statements expose the index: **`SHOW SCHEMA [FOR ]`** and **`REFRESH + SCHEMA`**. Identifier matching is now **exact** engine-wide. **Still gated + to explicit `USE`:** non-default remote databases (Memgraph multi-tenancy), + a single `MATCH` pattern spanning two backends, and a label-less `MATCH (n)` + against a SQL backend. +- **Native Apache Iceberg connector (`iceberg-direct`).** A Trino-free Iceberg + connector that reads tables directly via the REST catalog and Apache Arrow + scans from object storage (S3/MinIO) — no SQL engine in the query path, with + projection and predicate pushdown into the scan. It reuses the **same mapping + format** as the Trino-backed `iceberg` connector and joins other backends in + multi-connector federation. Register with `ADD CONNECTOR TYPE + iceberg-direct URI '' MAPPING `. **Read-only** — + writes return an "unsupported" error. 
+ ## MemGQL v0.6.2 - June 7th, 2026 ### ✨ New features & Improvements diff --git a/pages/memgraph-zero/memgql/complete.mdx b/pages/memgraph-zero/memgql/complete.mdx index 02047ef0d..6440af4c6 100644 --- a/pages/memgraph-zero/memgql/complete.mdx +++ b/pages/memgraph-zero/memgql/complete.mdx @@ -102,7 +102,7 @@ Save the following as `docker-compose.yml`: ```yaml services: memgql: - image: ${MEMGQL_IMAGE:-memgraph/memgql:0.6.2} + image: ${MEMGQL_IMAGE:-memgraph/memgql:0.6.3} container_name: memgql ports: - "7688:7688" @@ -129,7 +129,7 @@ services: - memgql-net memgql-init: - image: memgraph/mgconsole:1.5.1 + image: memgraph/mgconsole:1.6.0 container_name: memgql-init entrypoint: - sh @@ -151,7 +151,7 @@ services: restart: "no" memgraph: - image: memgraph/memgraph-mage:3.10.1 + image: memgraph/memgraph-mage:3.11.0 container_name: memgraph ports: - "7687:7687" @@ -169,7 +169,7 @@ services: - memgql-net lab: - image: memgraph/lab:3.10.1 + image: memgraph/lab:3.11.0 container_name: lab ports: - "3000:3000" diff --git a/pages/memgraph-zero/memgql/connect/iceberg.mdx b/pages/memgraph-zero/memgql/connect/iceberg.mdx index 70b837924..e57cbb1e7 100644 --- a/pages/memgraph-zero/memgql/connect/iceberg.mdx +++ b/pages/memgraph-zero/memgql/connect/iceberg.mdx @@ -1,18 +1,196 @@ --- title: Iceberg -description: Connect MemGQL to Apache Iceberg via Trino. +description: Connect MemGQL to Apache Iceberg either natively (direct) or via Trino. --- # Iceberg -The Iceberg connector (`CONNECTOR_TYPE=iceberg`) translates GQL queries into -Trino-dialect SQL with fully-qualified table references -(`catalog.schema.table`) and executes them against Iceberg tables via Trino's -REST API. It requires a +MemGQL can read [Apache Iceberg](https://iceberg.apache.org/) tables in two +different ways. Both reuse the same [mapping file](../quick-start.mdx#mapping-file) that maps graph patterns to -Iceberg tables. +Iceberg tables, and both fully qualify table references as +`catalog.schema.table`. 
+
+| Connector        | `CONNECTOR_TYPE` | Execution                                                                 | Writes |
+|------------------|------------------|---------------------------------------------------------------------------|--------|
+| Direct (native)  | `iceberg-direct` | Native, in-process — reads Iceberg as Arrow directly from object storage  | ❌ read-only |
+| Trino            | `iceberg`        | GQL → Trino-dialect SQL, executed by a Trino engine                       | ✓ |
+
+**Which one should you use?**
+
+- Use the **direct** connector when your workload is **scan-and-filter heavy**.
+  It reads Iceberg natively over the REST catalog and object storage (S3/MinIO),
+  pushing column projection and predicate filters into the scan
+  (manifest/partition/row-group pruning). There is no SQL engine in the path, so
+  base latency is lower. Joins, sort, limit, distinct, and aggregation run
+  **in-process** over Arrow batches. It is **read-only**.
+- Use the **Trino** connector when your workload is **join/aggregation heavy and
+  fully contained in one Iceberg catalog**, or when you need writes. Trino
+  executes joins, pruning, and aggregation server-side.
+
+Both connectors accept the same mapping file unchanged — a mapping written for
+one works against the other.
+
+---
+
+## Direct execution (`iceberg-direct`)
+
+The direct connector (`CONNECTOR_TYPE=iceberg-direct`) performs **native,
+in-process** graph execution on Iceberg. It uses the
+[`iceberg`](https://crates.io/crates/iceberg) and
+[`iceberg-catalog-rest`](https://crates.io/crates/iceberg-catalog-rest) Rust
+crates to talk to the Iceberg REST Catalog and read table data as Apache Arrow
+record batches directly from object storage. There is **no Trino** and no SQL
+string — MemGQL owns the entire execution loop.
+
+It is **read-only**: `INSERT`, `DELETE`, `SET`, and `REMOVE` return an explicit
+"unsupported" error.
+
+### 1. Start the Iceberg stack (MinIO + REST Catalog, no Trino)
+
+```bash
+docker network create memgql-net
+
+# S3-compatible object storage for Iceberg data/metadata files.
+docker run -d --rm \
+  --name minio-dev \
+  --network memgql-net \
+  -p 9000:9000 \
+  -p 9001:9001 \
+  --env MINIO_ROOT_USER=admin \
+  --env MINIO_ROOT_PASSWORD=password \
+  minio/minio server /data --console-address ":9001"
+
+# Create the warehouse bucket.
+docker exec minio-dev sh -c \
+  "mc alias set local http://localhost:9000 admin password && \
+   mc mb --ignore-existing local/warehouse"
+
+# Iceberg REST Catalog backed by MinIO.
+docker run -d --rm \
+  --name iceberg-rest-dev \
+  --network memgql-net \
+  -p 8181:8181 \
+  --env CATALOG_WAREHOUSE=s3://warehouse/ \
+  --env CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO \
+  --env CATALOG_S3_ENDPOINT=http://minio-dev:9000 \
+  --env CATALOG_S3_PATH__STYLE__ACCESS=true \
+  --env AWS_ACCESS_KEY_ID=admin \
+  --env AWS_SECRET_ACCESS_KEY=password \
+  --env AWS_REGION=us-east-1 \
+  tabulario/iceberg-rest
+```
+
+### 2. Seed data
+
+The direct connector is read-only, and the REST catalog only manages metadata,
+so seeding is done with an external Iceberg writer. The example below uses
+[PyIceberg](https://py.iceberg.apache.org/), which talks to the same REST
+catalog + MinIO that the connector reads from.
+
+```bash
+pip install "pyiceberg[s3fs]" pyarrow
+```
+
+```python
+import pyarrow as pa
+from pyiceberg.catalog.rest import RestCatalog
+
+catalog = RestCatalog(
+    "default",
+    **{
+        "uri": "http://localhost:8181",
+        "s3.endpoint": "http://localhost:9000",
+        "s3.access-key-id": "admin",
+        "s3.secret-access-key": "password",
+        "s3.region": "us-east-1",
+        "s3.path-style-access": "true",
+    },
+)
+
+catalog.create_namespace_if_not_exists("default")
+
+# Each entry pairs the table schema with the rows to seed.
+tables = {
+    "persons": (
+        pa.schema([("id", pa.int32()), ("name", pa.string()), ("age", pa.int32())]),
+        {"id": [1, 2], "name": ["Alice", "Bob"], "age": [30, 25]},
+    ),
+    "companies": (
+        pa.schema([("id", pa.int32()), ("name", pa.string())]),
+        {"id": [1], "name": ["Acme Corp"]},
+    ),
+    "knows": (
+        pa.schema([("from_id", pa.int32()), ("to_id", pa.int32())]),
+        {"from_id": [1], "to_id": [2]},
+    ),
+    "works_at": (
+        pa.schema([("person_id", pa.int32()), ("company_id", pa.int32())]),
+        {"person_id": [1], "company_id": [1]},
+    ),
+}
+
+for name, (schema, rows) in tables.items():
+    table = catalog.create_table_if_not_exists(f"default.{name}", schema=schema)
+    # Build the Arrow table with the declared schema: without it, Python ints
+    # are inferred as int64 and no longer match the int32 table columns.
+    table.append(pa.table(rows, schema=schema))
+```
+
+### 3. Start MemGQL
+
+The direct connector resolves the REST catalog and object storage from the
+`ICEBERG_*` environment variables.
+
+```bash
+docker run --rm \
+  --name memgql \
+  --network memgql-net \
+  --stop-timeout 2 \
+  -p 7688:7688 \
+  --env CONNECTOR_TYPE=iceberg-direct \
+  --env ICEBERG_REST_URI=http://iceberg-rest-dev:8181 \
+  --env ICEBERG_WAREHOUSE=iceberg \
+  --env ICEBERG_SCHEMA=default \
+  --env ICEBERG_DIRECT_S3_ENDPOINT=http://minio-dev:9000 \
+  --env ICEBERG_DIRECT_S3_REGION=us-east-1 \
+  --env ICEBERG_DIRECT_S3_ACCESS_KEY_ID=admin \
+  --env ICEBERG_DIRECT_S3_SECRET_ACCESS_KEY=password \
+  --env MAPPING_FILE=/data/mapping.json \
+  --env BOLT_LISTEN_ADDR=0.0.0.0:7688 \
+  -v ./mapping.json:/data/mapping.json \
+  memgraph/memgql:latest
+```
-## 1. Start Trino with Iceberg
+### 4. 
Connect + +```bash +mgconsole --port 7688 +``` + +### 5. Query + +```gql +MATCH (p:Person) RETURN p.name, p.age; +``` + +```gql +MATCH (p:Person)-[:WORKS_AT]->(c:Company) RETURN p.name, c.name; +``` + +```gql +MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name; +``` + +For environment variables, see [Reference](../reference.mdx#iceberg-direct-iceberg-direct). + +--- + +## Trino (`iceberg`) + +The Trino-backed connector (`CONNECTOR_TYPE=iceberg`) translates GQL queries +into Trino-dialect SQL with fully-qualified table references +(`catalog.schema.table`) and executes them against Iceberg tables via Trino's +REST API. Trino performs joins, pruning, aggregation, and writes server-side. + +### 1. Start Trino with Iceberg ```bash docker network create memgql-net @@ -24,7 +202,7 @@ docker run -d --rm \ trinodb/trino:latest ``` -## 2. Seed data +### 2. Seed data ```bash docker exec -i trino-dev trino << 'SQL' @@ -55,7 +233,7 @@ INSERT INTO iceberg.default.works_at VALUES (1, 1); SQL ``` -## 3. Start MemGQL +### 3. Start MemGQL ```bash docker run --rm \ @@ -73,13 +251,13 @@ docker run --rm \ memgraph/memgql:latest ``` -## 4. Connect +### 4. Connect ```bash mgconsole --port 7688 ``` -## 5. Query +### 5. Query ```gql MATCH (p:Person) RETURN p.name, p.age; @@ -95,15 +273,24 @@ MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name; For environment variables, see [Reference](../reference.mdx#iceberg-iceberg). 
+--- + ## Supported GQL features -| Feature | Iceberg / Trino | -|-----------------------------------------------|-----------------| -| `MATCH (n:Label) RETURN n.prop` | ✓ | -| Whole-node `RETURN n` / whole-rel `RETURN r` | ✓ | -| Pattern-level `WHERE` (`MATCH (n WHERE …)`) | ✓ | -| Typed edge `(a)-[r:R]->(b)` | ✓ | -| `ORDER BY` / `LIMIT` | ✓ | -| `DISTINCT` | ✓ | -| `INSERT (a {…})` | ✓ | -| `DELETE` (no alias — Trino requirement) | ✓ | +For the `iceberg-direct` connector, ✓ means the feature is supported; filters +and scans run natively against the Iceberg scan (pushed into +manifest/row-group pruning), while joins, sort, limit, distinct, and +aggregation run **in-process** over Arrow batches. + +| Feature | Iceberg (Direct) | Iceberg (Trino) | +|-----------------------------------------------|------------------|-----------------| +| `MATCH (n:Label) RETURN n.prop` | ✓ | ✓ | +| Whole-node `RETURN n` / whole-rel `RETURN r` | ✓ | ✓ | +| Pattern-level `WHERE` (`MATCH (n WHERE …)`) | ✓ | ✓ | +| Typed edge `(a)-[r:R]->(b)` | ✓ (local join) | ✓ | +| `ORDER BY` / `LIMIT` | ✓ (local) | ✓ | +| `DISTINCT` | ✓ (local) | ✓ | +| Aggregation | ✓ (local) | ✓ | +| Time-travel (snapshot) | ✓ | ✓ | +| `INSERT (a {…})` | ❌ (read-only) | ✓ | +| `DELETE` (no alias — Trino requirement) | ❌ (read-only) | ✓ | diff --git a/pages/memgraph-zero/memgql/connect/memgraph.mdx b/pages/memgraph-zero/memgql/connect/memgraph.mdx index a5fbb5acf..413d1d6ad 100644 --- a/pages/memgraph-zero/memgql/connect/memgraph.mdx +++ b/pages/memgraph-zero/memgql/connect/memgraph.mdx @@ -19,7 +19,7 @@ docker run -d --rm \ --name memgraph-dev \ --network memgql-net \ -p 7687:7687 \ - memgraph/memgraph-mage:3.10.1 \ + memgraph/memgraph-mage:3.11.0 \ --log-level=TRACE --also-log-to-stderr ``` diff --git a/pages/memgraph-zero/memgql/features.mdx b/pages/memgraph-zero/memgql/features.mdx index b2455a47e..357ebb5f5 100644 --- a/pages/memgraph-zero/memgql/features.mdx +++ b/pages/memgraph-zero/memgql/features.mdx @@ 
-5,24 +5,27 @@ description: MemGQL Community and Enterprise feature comparison. # Features -| Feature | Community | [Enterprise](https://memgraph.com/contact-us) | -|--------------------------------------------------------|-----------|-----------------------------------------------| -| GQL to Cypher translation | Yes | Yes | -| GQL to SQL translation | Yes | Yes | -| Bolt protocol | Yes | Yes | -| **Connectors** | | | -| [Memgraph](/memgraph-zero/memgql/connect/memgraph) | Yes | Yes | -| [Neo4j](/memgraph-zero/memgql/connect/neo4j) | Yes | Yes | -| [PostgreSQL](/memgraph-zero/memgql/connect/postgres) | Yes | Yes | -| [DuckDB](/memgraph-zero/memgql/connect/duckdb) | Yes | Yes | -| [Iceberg](/memgraph-zero/memgql/connect/iceberg) | Yes | Yes | -| [ClickHouse](/memgraph-zero/memgql/connect/clickhouse) | Yes | Yes | -| [MySQL](/memgraph-zero/memgql/connect/mysql) | Yes | Yes | -| [Pinot](/memgraph-zero/memgql/connect/pinot) | Yes | Yes | -| [Oracle](/memgraph-zero/memgql/connect/oracle) | Yes | Yes | -| **Multi-Connection Mode** | Yes | Yes | -| Max connectors | 2 | Unlimited | -| Max simultaneous connections | 2 | Unlimited | -| **Agentic Capabilities** | | | -| MCP Server | Yes | Yes | -| Structured2Graph Agent to create mappings | Yes | Yes | +| Feature | Community | [Enterprise](https://memgraph.com/contact-us) | +|---------------------------------------------------------------------------------------------------|-----------|-----------------------------------------------| +| GQL to Cypher translation | Yes | Yes | +| GQL to SQL translation | Yes | Yes | +| Bolt protocol | Yes | Yes | +| **Connectors** | | | +| [Memgraph](/memgraph-zero/memgql/connect/memgraph) | Yes | Yes | +| [Neo4j](/memgraph-zero/memgql/connect/neo4j) | Yes | Yes | +| [PostgreSQL](/memgraph-zero/memgql/connect/postgres) | Yes | Yes | +| [DuckDB](/memgraph-zero/memgql/connect/duckdb) | Yes | Yes | +| [Iceberg (Trino)](/memgraph-zero/memgql/connect/iceberg#trino-iceberg) | Yes | Yes | +| [Iceberg 
(direct/native)](/memgraph-zero/memgql/connect/iceberg#direct-execution-iceberg-direct) | Yes | Yes | +| [ClickHouse](/memgraph-zero/memgql/connect/clickhouse) | Yes | Yes | +| [MySQL](/memgraph-zero/memgql/connect/mysql) | Yes | Yes | +| [Pinot](/memgraph-zero/memgql/connect/pinot) | Yes | Yes | +| [Oracle](/memgraph-zero/memgql/connect/oracle) | Yes | Yes | +| **Multi-Connection Mode** | Yes | Yes | +| Max connectors | 2 | Unlimited | +| Max simultaneous connections | 2 | Unlimited | +| [Cross-backend joins & composite queries](/memgraph-zero/memgql/multiple-graphs) | Yes | Yes | +| [Schema-based routing (USE-free queries)](/memgraph-zero/memgql/multiple-graphs#use-free-routing) | Yes | Yes | +| **Agentic Capabilities** | | | +| MCP Server | Yes | Yes | +| Structured2Graph Agent to create mappings | Yes | Yes | diff --git a/pages/memgraph-zero/memgql/multiple-graphs.mdx b/pages/memgraph-zero/memgql/multiple-graphs.mdx index af8114c7d..9ecbb559d 100644 --- a/pages/memgraph-zero/memgql/multiple-graphs.mdx +++ b/pages/memgraph-zero/memgql/multiple-graphs.mdx @@ -7,6 +7,8 @@ description: Query across multiple graphs and backends using the graph catalog a MemGQL introduces a **graph catalog** that makes graphs first-class entities. Rather than specifying connectors and connections in every query, you register graphs once and reference them by name. This enables seamless multi-graph queries across heterogeneous backends using standard ISO GQL composite clauses. +If you want a running stack to try these queries against, the [Docker Compose - Complete Setup](/memgraph-zero/memgql/complete) guide spins up MemGQL with Memgraph and PostgreSQL backends already wired up, so you can follow along here against real data. + ## Where the catalog DSL works The catalog statements (`ADD CONNECTOR`, `ADD GRAPH`, `SHOW GRAPHS`, `SHOW CONNECTORS`, `DROP GRAPH`, `USE `, …) are available in **`CONNECTOR_TYPE=multi`** mode. 
Single-backend modes (`memgraph-gql`, `neo4j-gql`, `postgres`, `mysql`, `oracle`, `duckdb`, `clickhouse`, `iceberg`, `pinot`) connect to one backend configured via env vars and don't expose the catalog.
@@ -14,6 +16,7 @@ The catalog statements (`ADD CONNECTOR`, `ADD GRAPH`, `SHOW GRAPHS`, `SHOW CONNE
 | Statement | `multi` | Cypher single-backend | SQL single-backend |
 |---|---|---|---|
 | `SHOW GRAPHS` / `SHOW CONNECTORS` / `SHOW MAPPINGS` | ✓ | ✗ | ✗ |
+| `SHOW SCHEMA` / `REFRESH SCHEMA` | ✓ | ✗ | ✗ |
 | `ADD CONNECTOR` / `ADD GRAPH` / `CONNECT` | ✓ | ✗ | ✗ |
 | `CREATE GRAPH ` | ✓ (catalog entry) | ✓ (forwarded as `CREATE DATABASE `) | ✗ |
 | `DROP GRAPH ` | ✓ | ✗ | ✗ |
@@ -58,6 +61,147 @@ USE warehouse MATCH (c:Company) WHERE c.revenue > 1000000 RETURN c;
 
 The engine resolves the graph name to its bound connector and executes the query on the appropriate backend.
 
+## USE-free routing
+
+You don't always have to name the graph. MemGQL maintains a **unified schema
+index** — the labels, relationship types, and properties every registered
+source defines — and uses it to infer which backend a query belongs to from the
+query itself. A query with no `USE` clause routes automatically when its schema
+signals point to exactly one source.
+
+```gql
+-- `name` is defined only on the warehouse's Person table → routes to warehouse
+MATCH (p:Person) RETURN p.name;
+
+-- FRIEND_OF exists only in the social graph → routes to social
+MATCH (me:Person {id: 1})-[:FRIEND_OF]->(f:Person) RETURN f.id;
+```
+
+The schema index is built from two sources:
+
+- **SQL-family connectors** (PostgreSQL, MySQL, Oracle, DuckDB, ClickHouse,
+  Iceberg, Pinot) — taken from the registered **mapping** (labels, rel-types,
+  and properties are known exactly).
+- **Cypher-family connectors** (Memgraph, Neo4j) — **introspected at `CONNECT`
+  time** and cached. Memgraph uses `SHOW SCHEMA INFO` (the server must run with
+  `--schema-info-enabled`); Neo4j uses `db.schema.*`, falling back to a
+  labels-and-rel-types-only view. When property names can't be introspected, a
+  source can still be *selected* by its labels/rel-types but is never *excluded*
+  by a property it might have.
+
+### Routing policy
+
+Routing is strict and deterministic:
+
+- **Exactly one candidate** → the query routes there.
+- **Zero candidates** → a hard error naming the sources that were checked and
+  pointing you at `SHOW SCHEMA`.
+- **Two or more candidates** → an *ambiguous* hard error listing the candidates.
+  Disambiguate by referencing a property or relationship type that exists in
+  only one of them, or add an explicit `USE`.
+- **Explicit `USE` always wins** and bypasses inference entirely.
+- **Session defaults never tie-break.** A sticky `default_connection` (from
+  `SET DEFAULT CONNECTION`) does not resolve an otherwise-ambiguous query.
+- **No-signal queries** like `MATCH (n) RETURN n` (no label, rel-type, or
+  property to route by) fall back to today's default-connection behavior
+  unchanged.
+
+Identifier matching is **exact**: `:person` does not match a mapping (or
+introspected schema) that declares `Person`. Routing, translation, and the
+backends all agree on the same casing.
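
The routing policy above can be modeled in a few lines of Python. This is an illustrative sketch only, not MemGQL's implementation: the `Source` class, the `route` function, and `RoutingError` are hypothetical names, properties are flattened per source rather than tracked per label, and a source is assumed to be a candidate when it defines every signal the query references.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Source:
    """Hypothetical stand-in for one source in the unified schema index."""
    name: str
    labels: set[str] = field(default_factory=set)
    rel_types: set[str] = field(default_factory=set)
    # None models "(not introspected)": the property names are unknown.
    properties: Optional[set[str]] = None


class RoutingError(Exception):
    """Raised for zero-candidate and ambiguous queries."""


def route(sources, labels=(), rel_types=(), properties=(), use=None, default=None):
    """Return the name of the single source a query part routes to."""
    if use is not None:
        return use  # an explicit USE always wins and bypasses inference
    if not (labels or rel_types or properties):
        return default  # no-signal query: default-connection behavior
    candidates = []
    for src in sources:
        # Set membership is case-sensitive, which models exact identifier matching.
        if not set(labels) <= src.labels:
            continue
        if not set(rel_types) <= src.rel_types:
            continue
        # Unknown property names never exclude a source; known ones must match.
        if src.properties is not None and not set(properties) <= src.properties:
            continue
        candidates.append(src.name)
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        checked = ", ".join(s.name for s in sources)
        raise RoutingError(f"no source matches (checked: {checked}); see SHOW SCHEMA")
    raise RoutingError(f"ambiguous: {', '.join(candidates)}; add USE or a unique signal")


sources = [
    Source("transactional", {"Person", "Product"}, {"ORDERED"},
           {"id", "name", "price", "quantity"}),
    Source("social", {"Person"}, {"FRIEND_OF"}, None),
]
print(route(sources, labels={"Person"}, rel_types={"FRIEND_OF"}))  # social
```

In this model, `route(sources, labels={"Person"}, properties={"name"})` raises the ambiguous error, because `social` exposes no property names and therefore cannot be ruled out by `name`. Note also that the session default is consulted only for no-signal queries and never as a tie-breaker.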
+
+### USE-free federated queries
+
+Routing happens per query part, so multi-backend joins and composites work
+without any `USE` clauses — each part (or branch) routes independently and the
+existing [cross-backend hash join](#focused-multi-graph-queries) takes over at
+the boundary:
+
+```gql
+-- Federated JOIN with zero USE clauses: the FRIEND_OF part routes to social,
+-- the ORDERED part routes to the transactional backend, joined on f.id = fp.id
+MATCH (me:Person {id: 1})-[:FRIEND_OF]->(f:Person)
+MATCH (fp:Person)-[:ORDERED]->(prod:Product) WHERE fp.id = f.id
+RETURN fp.name AS friend, prod.name AS item;
+
+-- Federated UNION: each branch routes on its own signals
+MATCH (p:Person)-[:ORDERED]->(prod:Product) WHERE p.id = 1 RETURN prod.id
+UNION
+MATCH (:Product {id: 1})-[:SIMILAR_TO]->(s:Product) RETURN s.id;
+```
+
+Writes (`INSERT`) route the same way and require a **unique** candidate. A
+successful `INSERT` also teaches the schema cache the labels it wrote, so a
+follow-up read routes without needing a `REFRESH SCHEMA`.
+
+### What still needs an explicit `USE`
+
+- A graph bound to a **non-default remote database** (Memgraph multi-tenancy) is
+  fenced to explicit `USE` — per-tenant introspection isn't wired up yet.
+- A **single `MATCH` pattern that spans two backends** errors with guidance to
+  split it into one `MATCH` clause per graph; auto-splitting one pattern is out
+  of scope.
+- A label-less `MATCH (n)` against a SQL backend still surfaces a raw backend
+  error (there's nothing to route or translate by).
+
+## Schema discovery
+
+`SHOW SCHEMA` is the discoverability surface for USE-free routing — it's the
+unified index the router consults, and what routing errors point you at. It's
+especially useful for agents that need to learn what's queryable without
+hardcoded knowledge.
+
+```gql
+SHOW SCHEMA;
+```
+
+```
++----------------+---------+-----------+--------------------------+--------+---------+
+| source         | element | name      | properties               | from   | to      |
++----------------+---------+-----------+--------------------------+--------+---------+
+| transactional  | node    | Person    | id, name                 | —      | —       |
+| transactional  | node    | Product   | id, name, price          | —      | —       |
+| transactional  | edge    | ORDERED   | quantity                 | Person | Product |
+| social         | node    | Person    | (not introspected)       | —      | —       |
+| social         | edge    | FRIEND_OF | (not introspected)       | Person | Person  |
++----------------+---------+-----------+--------------------------+--------+---------+
+```
+
+Each row is one label (`element = node`) or relationship type
+(`element = edge`) defined by a source. `properties` lists the known property
+names, or `(not introspected)` when a Cypher backend exposed only labels and
+rel-types. `from` / `to` are the endpoint labels for relationships. Sources with
+no schema information at all are listed with a `(no schema info: …)` reason so
+you know to query them with an explicit `USE`.
+
+Filter to a single source:
+
+```gql
+SHOW SCHEMA FOR social;
+```
+
+`REFRESH SCHEMA` re-introspects every live Cypher connection (Memgraph / Neo4j)
+and updates the cache. Introspection already runs eagerly at `CONNECT` — use
+this after the underlying schema changes:
+
+```gql
+REFRESH SCHEMA;
+```
+
+```
++-------------+-------------+--------------------------------------------+
+| connection  | connector   | status                                     |
++-------------+-------------+--------------------------------------------+
+| social_conn | mg_social   | refreshed (3 labels, 2 relationship types) |
+| tx_conn     | pg_tx       | mapping-backed (nothing to introspect)     |
++-------------+-------------+--------------------------------------------+
+```
+
+SQL-family connections report `mapping-backed (nothing to introspect)` — their
+schema comes from the mapping, not from a live query.
If introspection fails +(for example, a Memgraph started without `--schema-info-enabled`), the status +explains why; the connection stays reachable via explicit `USE`. + ## Focused Multi-Graph Queries A single linear query can chain multiple `USE` clauses. Variables bind across parts, enabling joins across backends: @@ -197,6 +341,10 @@ View details for a specific graph: SHOW GRAPH social; ``` +`SHOW GRAPHS` lists how graphs are *registered* (connector, mapping, access +mode). To see what each one actually *defines* — the labels, relationship types, +and properties used for routing — use [`SHOW SCHEMA`](#schema-discovery). + ## Graph Lifecycle Management ### Creating Graphs @@ -254,6 +402,8 @@ DROP GRAPH IF EXISTS analytics; 4. **Compatible column types**: All branches of a composite query must produce result sets with the same number of columns, in the same order, with compatible types. +5. **Routing is per part**: With [USE-free routing](#use-free-routing) each query part (and each composite branch) routes on its own schema signals. A part that matches zero or multiple sources is a hard error — add an explicit `USE` or reference a label/property unique to one source. 
+ ## Complete Example Set up multiple backends and run cross-graph queries: diff --git a/pages/memgraph-zero/memgql/quick-start.mdx b/pages/memgraph-zero/memgql/quick-start.mdx index f816167c9..7e03bb834 100644 --- a/pages/memgraph-zero/memgql/quick-start.mdx +++ b/pages/memgraph-zero/memgql/quick-start.mdx @@ -29,7 +29,7 @@ docker run -d --rm \ --name memgraph-dev \ --network memgql-net \ -p 7687:7687 \ - memgraph/memgraph-mage:3.10.1 \ + memgraph/memgraph-mage:3.11.0 \ --log-level=TRACE --also-log-to-stderr ``` diff --git a/pages/memgraph-zero/memgql/reference.mdx b/pages/memgraph-zero/memgql/reference.mdx index 8dbd553a4..3d7aa1468 100644 --- a/pages/memgraph-zero/memgql/reference.mdx +++ b/pages/memgraph-zero/memgql/reference.mdx @@ -87,6 +87,11 @@ ALTER GRAPH SET MAPPING ; ALTER GRAPH REMOVE MAPPING; ``` +``` +SHOW SCHEMA [FOR ]; -- unified routing index: labels, rel-types, properties +REFRESH SCHEMA; -- re-introspect live Cypher connections (Memgraph/Neo4j) +``` + ``` -- Single graph USE ; @@ -95,8 +100,19 @@ USE ; USE UNION | UNION ALL | INTERSECT | INTERSECT ALL | EXCEPT | EXCEPT ALL USE ; + +-- USE-free: routes automatically when the query's labels / rel-types / +-- properties match exactly one registered source (see Multiple Graphs). +; ``` +In `multi` mode, a query with no `USE` clause routes automatically when its +schema signals (labels, relationship types, properties) match exactly one +source; zero or multiple matches hard-error. Identifier matching is **exact** +(`:person` ≠ `Person`). Memgraph sources must run with `--schema-info-enabled` +for property-level introspection. See +[Multiple Graphs → USE-free routing](/memgraph-zero/memgql/multiple-graphs#use-free-routing). + ## Configuration Reference ### General @@ -160,7 +176,8 @@ connections. | `oracle` | GQL -> SQL | Oracle 19c+ (incl. 
Free 23ai) | | `duckdb` | GQL -> SQL | DuckDB (embedded) | | `clickhouse` | GQL -> SQL | ClickHouse | -| `iceberg` | GQL -> SQL | Iceberg via Trino | +| `iceberg` | GQL -> SQL | Iceberg via Trino | +| `iceberg-direct` | None (native in-process) | Iceberg (REST catalog + Arrow) | | `pinot` | GQL -> SQL | Apache Pinot | | `multi` | Per-connector | Multiple backends simultaneously | @@ -246,6 +263,24 @@ macOS, Linux, and Windows without any extra system packages. | `TRINO_SCHEMA` | `default` | Trino schema | | `MAPPING_FILE` | _(none, uses built-in default)_ | Path to JSON mapping file | +#### Iceberg Direct (`iceberg-direct`) + +Native, in-process execution over Iceberg — reads the REST catalog and object +storage (S3/MinIO) directly, no Trino. Read-only. + +| Variable | Default | Description | +|---------------------------------------|-------------------------|--------------------------------------| +| `ICEBERG_REST_URI` | `http://localhost:8181` | Iceberg REST Catalog URI | +| `ICEBERG_WAREHOUSE` | `iceberg` | Warehouse / catalog name | +| `ICEBERG_SCHEMA` | `default` | Default namespace (schema) | +| `ICEBERG_DIRECT_S3_ENDPOINT` | `http://localhost:9000` | S3/MinIO endpoint | +| `ICEBERG_DIRECT_S3_REGION` | `us-east-1` | S3 region | +| `ICEBERG_DIRECT_S3_ACCESS_KEY_ID` | `admin` | S3/MinIO access key | +| `ICEBERG_DIRECT_S3_SECRET_ACCESS_KEY` | `password` | S3/MinIO secret key | +| `MAPPING_FILE` | _(none, uses built-in default)_ | Path to JSON mapping file | + +S3 path-style access is always enabled (`s3.path-style-access=true`). + ## Mapping Schema The graph mapping file is a JSON document with two top-level arrays: diff --git a/pages/memgraph-zero/memgql/use-cases/agentic.mdx b/pages/memgraph-zero/memgql/use-cases/agentic.mdx index 4fcaf9eab..9b7cbde7b 100644 --- a/pages/memgraph-zero/memgql/use-cases/agentic.mdx +++ b/pages/memgraph-zero/memgql/use-cases/agentic.mdx @@ -36,3 +36,25 @@ questions to ask and where to look. 
 MemGQL provides a federated graph layer that abstracts all backend systems behind a single GQL endpoint. Agents connect to one URL and query everything.
+
+### Autonomous discovery
+
+Agents don't need hardcoded schemas or hand-written `USE` clauses. A single
+[`SHOW SCHEMA`](/memgraph-zero/memgql/multiple-graphs#schema-discovery) returns
+the unified index of every label, relationship type, and property across all
+connected backends — the agent's map of what's queryable and where the
+entities connect.
+
+From there, [USE-free routing](/memgraph-zero/memgql/multiple-graphs#use-free-routing)
+lets the agent write a plain GQL query and let the engine pick the backend from
+the labels and properties it referenced:
+
+```gql
+-- The agent doesn't need to know `name` lives in PostgreSQL or that
+-- FRIEND_OF lives in Memgraph — both route automatically.
+MATCH (p:Person {id: 1})-[:FRIEND_OF]->(f:Person) RETURN f.name;
+```
+
+Routing is strict: if a query is ambiguous or matches nothing, MemGQL returns
+an actionable error that names the candidate sources and points back at
+`SHOW SCHEMA` — so an agent can self-correct instead of failing silently.
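
As a rough illustration of how an agent might use the `SHOW SCHEMA` result, the sketch below indexes rows by element and name and keeps only the signals defined by exactly one source. The row shape (a list of dicts keyed by the `SHOW SCHEMA` column names) and the helper names `index_by_element` and `unique_signals` are assumptions made for this example; the rows are hardcoded rather than fetched over Bolt.

```python
from collections import defaultdict

# Rows shaped like the SHOW SCHEMA columns. Hardcoded here for illustration;
# a real agent would read them from the query result instead.
rows = [
    {"source": "transactional", "element": "node", "name": "Person"},
    {"source": "transactional", "element": "node", "name": "Product"},
    {"source": "transactional", "element": "edge", "name": "ORDERED"},
    {"source": "social", "element": "node", "name": "Person"},
    {"source": "social", "element": "edge", "name": "FRIEND_OF"},
]


def index_by_element(rows):
    """Map (element, name) to the list of sources that define it."""
    index = defaultdict(list)
    for row in rows:
        index[(row["element"], row["name"])].append(row["source"])
    return dict(index)


def unique_signals(index):
    """Keep only labels and rel-types that exactly one source defines."""
    return {key: srcs[0] for key, srcs in index.items() if len(srcs) == 1}


routable = unique_signals(index_by_element(rows))
```

With these rows, `Person` is defined by both sources and is dropped, so an agent following this approach would include `Product`, `ORDERED`, or `FRIEND_OF` in a query to give the router an unambiguous signal.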
diff --git a/pages/memgraph-zero/memgql/use-cases/public-private.mdx b/pages/memgraph-zero/memgql/use-cases/public-private.mdx index 2e3c34976..9b37b188e 100644 --- a/pages/memgraph-zero/memgql/use-cases/public-private.mdx +++ b/pages/memgraph-zero/memgql/use-cases/public-private.mdx @@ -62,7 +62,7 @@ MemGQL, and a one-shot init container: cat > docker-compose.yml << 'EOF' services: memgql: - image: ${MEMGQL_IMAGE:-memgraph/memgql:0.6.2} + image: ${MEMGQL_IMAGE:-memgraph/memgql:0.6.3} ports: - "7688:7688" environment: @@ -82,7 +82,7 @@ services: start_period: 3s memgql-init: - image: memgraph/mgconsole:1.5.1 + image: memgraph/mgconsole:1.6.0 entrypoint: - sh - -c @@ -101,7 +101,7 @@ services: restart: "no" memgraph: - image: memgraph/memgraph-mage:3.10.1 + image: memgraph/memgraph-mage:3.11.0 ports: - "7687:7687" command: --log-level=TRACE --also-log-to-stderr