Skip to content

Implement SLF4Jm JUL, and Log4J2 providers for Java SDK#68725

Open
uranusjr wants to merge 13 commits into
apache:mainfrom
astronomer:java-sdk-logging-providers
Open

Implement SLF4Jm JUL, and Log4J2 providers for Java SDK#68725
uranusjr wants to merge 13 commits into
apache:mainfrom
astronomer:java-sdk-logging-providers

Conversation

@uranusjr

Copy link
Copy Markdown
Member

Additional mechanism is added Logger in sdk.execution to support sending arbitrary log messages to LogSender. Once that's in place, we can implement various adapters to support common Java logging providers with a bit boilerplate code for each.

Those providers are implemented in separate artifacts since nobody needs all of them at once. The user can choose what they want. Documentation is added to describe how to do it.

I changed examples to prefer the Java Platform Logging interface instead of SLF4J too. This is built-in since Java 9. All the logging tools are more or less the same, so users should adapt pretty easily.

@uranusjr uranusjr mentioned this pull request Jun 18, 2026
1 task
@uranusjr uranusjr added the AIP-108: java-sdk Change this to an 'area:' label after AIP acceptance. label Jun 18, 2026
@uranusjr uranusjr added this to the Java SDK 1.0 milestone Jun 18, 2026
@uranusjr uranusjr force-pushed the java-sdk-logging-providers branch from d73ec53 to 7c0f104 Compare June 18, 2026 18:01
uranusjr added 11 commits June 19, 2026 04:58
A new airflow-sdk-slf4j artifact is added to allow SLF4J to be
seamlessly forwarded to Airflow's task log infrastructure.

Use 'org.apache.airflow:airflow-sdk-slf4j' to enable this.
This allows log4j users to write logs into Airflow directly. Note that
this uses Java since the appender is registered with an annotation, and
the annotation processor can only handle Java. (There's a Kotlin bridge,
but the class is small enough it doesn't make much sense to pull it in.)
This is too fancy for the spell checker. Use Dumb English instead.
@uranusjr uranusjr force-pushed the java-sdk-logging-providers branch from f35217b to 3ac5294 Compare June 18, 2026 21:00
uranusjr added 2 commits June 19, 2026 05:05
Systen.log annoyingly pass a null as an array to a vararg method
implemented in Kotlin. This breaks Kotlin since it expects an empty
array instead.

Fortunately, there seems to be a way to work around this, according to
Claude? Let's hope this works.
@uranusjr uranusjr force-pushed the java-sdk-logging-providers branch from 3ac5294 to d48fe00 Compare June 18, 2026 21:05
@uranusjr

Copy link
Copy Markdown
Member Author

Don’t know if it makes sense… since Airflow uses structlog, and the log forwarder actually works best with structured arguments, trying out intentionally NOT rendering the message but put the arguments in the dict instead

image

Is this too weird?

Comment thread airflow-core/docs/authoring-and-scheduling/language-sdks/java.rst
Comment on lines +57 to +60
s.split(Regex("""[\s,]+""")).forEach {
val parts = it.split(Regex("""\s*=\s*"""), 2)
val level = parse(parts[1])
if (level != null) put(parts[0], level)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
s.split(Regex("""[\s,]+""")).forEach {
val parts = it.split(Regex("""\s*=\s*"""), 2)
val level = parse(parts[1])
if (level != null) put(parts[0], level)
s.split(Regex("""[\s,]+""")).forEach {
if (it.isEmpty()) return@forEach
val parts = it.split(Regex("""\s*=\s*"""), 2)
if (parts.size != 2) return@forEach // <- guards parts[1]
val level = parse(parts[1])
if (level != null) put(parts[0], level)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The happy path works fine, however,

Say a deployment sets:

AIRFLOW__LOGGING__NAMESPACE_LEVELS="botocore=debug,"

A trailing comma — extremely common when people build up a comma list. Now trace it:

Step 1 — split "botocore=debug," on [\s,]+:

["botocore=debug", ""]

The trailing comma produces an empty-string token at the end.

Step 2 — the loop reaches the "" token and splits it on =:

"".split(Regex("""\s*=\s*"""), 2) -> [""] // size 1, no "=" to split on

Step 3 — parse(parts[1]) reads index 1 of a 1-element list:

java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1

The same thing happens for "botocore" (a bare name, no =), " botocore=debug" (leading space), ",botocore=debug" (leading comma), or the env var set to "".

The example value in config.yml for this exact option is:

example: "sqlalchemy=INFO sqlalchemy.engine=DEBUG, botocor"

That last token, botocor, has no =level — so a user who copies the documented example verbatim hits the crash

The Python parser (structlog.py) has the same parsing gap, but it fails loudly with a ValueError at config time naming the bad value

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That example is broken and got truncated in my PR that I added. It should fail with an error, but a nice one.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ll add some nicer errors.

The Python implementation currently also just use regex without error handling. Should we improve it too?

if isinstance(namespace_log_levels, str):
log_from_level = partial(re.compile(r"\s*=\s*").split, maxsplit=2)
namespace_log_levels = {
log: level for log, level in map(log_from_level, re.split(r"[\s,]+", namespace_log_levels))
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we should, but may be in a separate PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AIP-108: java-sdk Change this to an 'area:' label after AIP acceptance. kind:documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants