CPython4J

An experimental project exploring how to call Python from Java, in-process, with full native extension support. Inspired by Java Project Detroit and the growing need for Java/Python interop in AI workloads.

CPython4J runs real CPython via the Foreign Function & Memory API (Panama), with a focus on developer experience: type-safe contract interfaces, annotation-processed proxies, automatic Python environment management via uv, and bidirectional Java/Python calls.

The north-star use case is Docling: running an AI document processing pipeline from Java, with Java classification callbacks firing per element inside Python's loop. PyTorch, NumPy, and OpenCV work unmodified because the interpreter is real CPython.

This is not a Wasm-based approach. Projects like trino-wasm-python and boomslang compile CPython to WebAssembly for sandboxed execution. The tradeoff is that every native Python library (NumPy, PyTorch, OpenCV, etc.) must also be compiled to Wasm and statically linked into the build. This makes it impractical to support the broader Python ecosystem, especially AI/ML libraries with complex native dependencies. CPython4J uses the system's native CPython instead, so any pip-installable package works out of the box.

Prerequisites

Java 25+
uv (for Python environment management)
A shared libpython (uv installs one automatically with --python-preference managed)
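
To check whether a given interpreter can satisfy the shared-libpython prerequisite, a small script along these lines may help. It is a minimal sketch that only uses the standard `sysconfig` module; the function name `libpython_info` is made up for illustration, and the `Py_ENABLE_SHARED` flag is only meaningful on Unix-like builds (on Windows it is typically unset even though Python ships as a DLL).

```python
import sys
import sysconfig


def libpython_info():
    """Report whether this interpreter was built with a shared libpython."""
    return {
        "version": f"{sys.version_info.major}.{sys.version_info.minor}",
        # Truthy on Unix builds configured with --enable-shared; unset or 0 otherwise.
        "shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
    }


if __name__ == "__main__":
    for key, value in libpython_info().items():
        print(f"{key}: {value}")
```

Running it with the interpreter that uv selected (for example via `uv run python` on a file containing the script) shows what that particular environment provides.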

Quickstart

Option A: Let CPython4J manage everything

Create a pyproject.toml with your Python dependencies:

[project]
name = "my-project"
version = "0.1.0"
requires-python = ">=3.13"
dependencies = ["cowsay"]

Write a Python module:

# greeting.py
import cowsay

def greet(name):
    return cowsay.get_output_string("cow", "Hello " + name)

def wordCount(text):
    return len(text.split())

Use it from Java - CPython4J runs uv sync, installs Python and dependencies, and configures everything:

var env = PythonEnv.uvProject(Path.of("./my-project")).sync(true).build();
try (var engine = PythonEngine.create(env)) {
    String msg = engine.invokeFunction("greeting", "greet", List.of("World"), String.class);
    int count = engine.invokeFunction("greeting", "wordCount", List.of("one two three"), int.class);
}

Option B: Pre-existing Python environment

If you already have a venv (from CI, a container, or manual uv sync):

cd my-project
uv sync  # one command - creates .venv, installs Python + deps

Then in Java, skip the sync:

var env = PythonEnv.venv(Path.of("./my-project/.venv")).build();
try (var engine = PythonEngine.create(env)) {
    // ready to use - sys.path configured automatically
}

Option C: Production container

Prepare the Python environment at Docker build time - zero first-run latency:

FROM eclipse-temurin:25-jre
WORKDIR /app

# Python dependency spec - changing these invalidates the layer cache
COPY pyproject.toml uv.lock ./
COPY target/lib/cpython4j.jar ./

# Install Python + dependencies into .venv (cached across rebuilds)
RUN java -jar cpython4j.jar

# Application code - changes here don't re-install Python deps
COPY target/my-app.jar ./
CMD ["java", "-cp", "my-app.jar:lib/*", "com.example.Main"]

Then in Java, point at the venv that was built into the image:

var env = PythonEnv.venv(Path.of(".venv")).build();
try (var engine = PythonEngine.create(env)) {
    // ready - venv was pre-built in Docker image
}

For uber JARs, add one line to your main() instead:

public static void main(String[] args) {
    PythonEnv.prepareIfRequested(args);  // enables: java -jar app.jar --cpython4j-prepare
    // ... rest of your app
}

For full manual control (air-gapped, no uv), use PythonEnv.explicit():

var env = PythonEnv.explicit()
    .library(Path.of("/usr/lib/libpython3.13.so"))
    .sitePackages(Path.of("/app/.venv/lib/python3.13/site-packages"))
    .build();
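
The two paths passed to `library(...)` and `sitePackages(...)` depend on the installation. As a rough sketch, the target interpreter itself can report likely candidates through the standard `sysconfig` module. The helper name `explicit_paths` is hypothetical, and the library path is only a candidate: `LIBDIR` and `LDLIBRARY` may be unset on some platforms (notably Windows), in which case the sketch returns `None` for it.

```python
import sysconfig
from pathlib import Path


def explicit_paths():
    """Collect candidate values for .library(...) and .sitePackages(...)."""
    libdir = sysconfig.get_config_var("LIBDIR")        # may be None on Windows
    ldlibrary = sysconfig.get_config_var("LDLIBRARY")  # e.g. libpython3.13.so
    library = Path(libdir) / ldlibrary if libdir and ldlibrary else None
    # "purelib" is the directory where pure-Python packages are installed
    # for the interpreter running this script.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    return {"library": library, "sitePackages": site_packages}


if __name__ == "__main__":
    for name, path in explicit_paths().items():
        print(f"{name}: {path}")
```

Run it with the interpreter inside the venv so that the reported site-packages directory is the venv's own, and confirm that the library file actually exists before wiring it into Java.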

Contract-first API with annotation processor

Define a Java interface, annotate it, and the annotation processor generates a concrete implementation at compile time (no reflection):

@ScriptInterface(module = "analyzer")
public interface Analyzer {
    Analysis analyze(Document document);
    List<String> keywords(String text);
    int wordCount(String text);
}

public record Document(String text, String source) {}
public record Analysis(String language, int wordCount) {}

Python implements the contract:

# analyzer.py
def analyze(document):
    return {"language": "en", "wordCount": len(document["text"].split())}

def keywords(text):
    return text.lower().split()[:5]

def wordCount(text):
    return len(text.split())
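
The Python side of a contract can be exercised on its own before involving the JVM. The sketch below assumes that a Java record reaches Python as a dict keyed by component name (which is what `document["text"]` above implies) and that the returned dict is mapped back onto the `Analysis` record by key; the exact conversion rules are described in the design documentation, not here.

```python
# Same functions as analyzer.py, repeated so the sketch is self-contained.
def analyze(document):
    return {"language": "en", "wordCount": len(document["text"].split())}


def keywords(text):
    return text.lower().split()[:5]


def wordCount(text):
    return len(text.split())


# Assumed Python-side shape of new Document("Hello from CPython4J", "demo").
document = {"text": "Hello from CPython4J", "source": "demo"}

result = analyze(document)
# Keys should match the components of the Analysis record.
assert set(result) == {"language", "wordCount"}

print(result)                       # {'language': 'en', 'wordCount': 3}
print(keywords(document["text"]))   # ['hello', 'from', 'cpython4j']
print(wordCount(document["text"]))  # 3
```

Keeping the function names identical to the Java method names (including `wordCount`) is what lets the interface and the module line up.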

Java usage with the generated proxy:

var env = PythonEnv.uvProject(Path.of(".")).sync(true).build();
try (var engine = PythonEngine.create(env)) {
    Analyzer analyzer = new Analyzer_Proxy(engine);
    Analysis result = analyzer.analyze(new Document("Hello from CPython4J", "demo"));
}

Host callbacks (Python calls Java)

Use context on @ScriptInterface to define both directions in one place. Mark Java methods with @HostFunction and they become importable from Python:

class HostApi {
    @HostFunction
    public void log(String message) { System.out.println(message); }

    @HostFunction
    public String reverse(String s) { return new StringBuilder(s).reverse().toString(); }
}

@ScriptInterface(module = "my_module", context = HostApi.class)
public interface MyModule {
    String process(String text);
}

The annotation processor generates both MyModule_Proxy (Java calls Python) and MyModule_Builtins (Python calls Java):

var env = PythonEnv.uvProject(Path.of(".")).sync(true).build();
try (var engine = PythonEngine.builder()
        .withEnv(env)
        .expose(MyModule_Builtins.toHostModule(new HostApi()))
        .build()) {

    // Python can now: from hostapi import log, reverse
    MyModule mod = new MyModule_Proxy(engine);
    mod.process("hello");
}
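
The Python module behind `MyModule` is not shown above. A possible `my_module.py` might look like the following sketch. The body of `process` is invented for illustration, and the `ImportError` fallback is an assumption added so the file can also be run outside the JVM, where the `hostapi` module does not exist.

```python
# my_module.py (hypothetical body; only the Java side appears in this README)
try:
    # Expected to resolve when the engine exposes MyModule_Builtins.
    from hostapi import log, reverse
except ImportError:
    # Stand-ins mirroring HostApi, used only when running without the JVM.
    def log(message):
        print(message)

    def reverse(s):
        return s[::-1]


def process(text):
    """Implements MyModule.process: logs the input and returns it reversed."""
    log("processing: " + text)
    return reverse(text)


if __name__ == "__main__":
    print(process("hello"))
```

With the stand-ins in place, the module can be unit-tested in plain Python, while the same file calls into Java when loaded by the engine.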

Docling end-to-end demo

The Docling integration test shows the north-star use case: converting a PDF with Docling while Java classification callbacks fire per-element inside Python's processing loop. See the Python bridge and Java handler for the full example.

Run it with mvn -B install -Pdocling (installs PyTorch + Docling, ~5GB on first run).

See also the LangChain4j agent demo, which shows a Java AI agent using Python spaCy for named entity extraction with bidirectional callbacks.

Building

mvn -B install

CI

The CI workflow has two jobs:

build - runs core + integration tests on Linux, macOS, and Windows with Java 25 + Python 3.13 + uv
docling - runs the Docling end-to-end test (Linux only, heavier)

Design documentation

Constitution - 12 design principles
Product thesis - positioning and north-star use case (Docling)
Architecture - runtime shape, distribution modes, module layout
Contract-first API design
Conversion model - primitives + JSON structural conversion
Callbacks - pure FFM, no native helper
Python environments - uvProject, venv, explicit, bundled
Implementation plan - phases with estimates
Risks and Open questions
ADRs: 0001 | 0002 | 0003 | 0004 | 0005 | 0006
