Fix Java-SDK logging level#68696

Closed
jason810496 wants to merge 2 commits into
apache:mainfrom
jason810496:fix/java-sdk-log-level

@jason810496

Member

Why

A Java task's SLF4J logs (e.g. logger.info(...)) fell through to stderr, which the supervisor tags as ERROR, so every application log showed up as ERROR in the UI regardless of its real level.

Java task code:

@Task(id = "extract")
public long extractValue(Client client) throws InterruptedException {
    logger.info("Hello from task");
    Object pythonXcom = client.getXCom("python_task_1");
    logger.info("Got XCom from Python Task 'python_task_1' {}", pythonXcom);
    Connection connection = client.getConnection("test_http");
    logger.info("Got con {}", connection);

    for (int i = 0; i < 3; ++i) {
        logger.info("Beep {}, next time will be {}", i, new Date());
        Thread.sleep(2000L);
    }

    logger.info("Goodbye from task");
    return new Date().getTime();
}

What the Airflow UI showed:
Screenshot 2026-06-18 at 1 31 12 PM

What

  • Ship an SLF4J binding inside the SDK (AirflowSlf4jServiceProvider) registered via META-INF/services, so LoggerFactory.getLogger(...) routes through the Airflow logs socket carrying each message's real level instead of writing to stderr.
  • Map SLF4J levels to the strings the supervisor (structlog NAME_TO_LEVEL) understands; TRACE maps to debug since there is no TRACE level on the Python side.
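The level mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Slf4jLevel` is a stand-in for `org.slf4j.event.Level` so the example compiles without SLF4J on the classpath, and `WireLevels` and `wireName` are hypothetical names. The lowercase strings are assumed to be the keys the supervisor looks up in structlog's NAME_TO_LEVEL.

```java
// Stand-in for org.slf4j.event.Level (assumption: same five level names).
enum Slf4jLevel { TRACE, DEBUG, INFO, WARN, ERROR }

final class WireLevels {
    private WireLevels() {}

    // Returns the lowercase level string assumed to be sent over the logs socket.
    static String wireName(Slf4jLevel level) {
        switch (level) {
            case TRACE: // no TRACE level on the Python side, so fold it into debug
            case DEBUG:
                return "debug";
            case INFO:
                return "info";
            case WARN:
                return "warning";
            case ERROR:
                return "error";
            default:
                throw new IllegalArgumentException("unknown level: " + level);
        }
    }

    public static void main(String[] args) {
        for (Slf4jLevel level : Slf4jLevel.values()) {
            System.out.println(level + " -> " + wireName(level));
        }
    }
}
```

With a mapping like this, a `logger.info(...)` call would reach the supervisor tagged as info rather than being read from stderr and tagged as ERROR.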

Was generative AI tooling used to co-author this PR?

Comment on lines +64 to +65
const val LEVEL_PROPERTY = "airflow.logging.level"
const val LEVEL_ENV = "AIRFLOW__LOGGING__LOGGING_LEVEL"

@jason810496 jason810496 Jun 18, 2026

Member Author

Actually, I think we should provide the log_level in StartupDetails.

I prefer to respect the Airflow-side logging level, and a user might set up a custom secrets backend for the conf. So we need to send the [logging/log_level] info from the supervisor.

Member

I agree, let’s do this. The Python implementation should probably also be changed to respect this log_level value (and fall back to various configurations).

Since this requires Python side changes, I would say for 3.3 let’s just default to INFO in SDKs and add support for Airflow configuration later.

Member Author

Sure, then I will go with this direction. This should solve Phani's review comment as well.

@jason810496 jason810496 requested a review from uranusjr June 18, 2026 04:47
The logs socket was wrapped in use{}, which closed it as soon as
LogSender.configure() returned. Logs produced later while the task ran
then hit a closed channel, were buffered, and were lost at process
exit, so the UI showed no application logs for Java tasks. Keep the
socket open until the comm job completes so they are flushed.
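
The lifetime bug described in this commit message can be illustrated with a small Java analogue, where try-with-resources plays the role of Kotlin's use{}. This is only a sketch under stated assumptions: `FakeLogChannel` and `SocketLifetimeSketch` are hypothetical stand-ins, not the SDK's classes, and the real code buffers failed writes rather than returning false.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical in-memory stand-in for the logs socket.
final class FakeLogChannel implements Closeable {
    private final StringBuilder received = new StringBuilder();
    private boolean open = true;

    void write(String line) throws IOException {
        if (!open) {
            throw new IOException("channel closed");
        }
        received.append(line).append('\n');
    }

    @Override
    public void close() {
        open = false;
    }

    String received() {
        return received.toString();
    }
}

final class SocketLifetimeSketch {
    private SocketLifetimeSketch() {}

    static boolean tryWrite(FakeLogChannel channel, String line) {
        try {
            channel.write(line);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Mirrors the bug: the channel is closed as soon as setup returns,
    // so a log written later by the task hits a closed channel.
    static boolean logAfterScopedSetup() {
        FakeLogChannel channel = new FakeLogChannel();
        try (FakeLogChannel scoped = channel) {
            // The log sender would be configured with `scoped` here.
        }
        return tryWrite(channel, "Hello from task");
    }

    // Mirrors the fix: the channel stays open while the task runs
    // and is closed only once the work is done.
    static boolean logBeforeClose() {
        FakeLogChannel channel = new FakeLogChannel();
        try {
            return tryWrite(channel, "Hello from task");
        } finally {
            channel.close();
        }
    }

    public static void main(String[] args) {
        System.out.println("scoped setup delivered: " + logAfterScopedSetup());
        System.out.println("kept open delivered: " + logBeforeClose());
    }
}
```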

enum class Level { ERROR, DEBUG, }
// wireName is the level string the Airflow supervisor understands (structlog's
// NAME_TO_LEVEL). It has no TRACE, so TRACE maps to "debug"; a level the

@uranusjr uranusjr Jun 18, 2026

Member

I think TRACE should be mapped to NOTSET (0).

@uranusjr uranusjr Jun 18, 2026

Member

Actually, why do we need to reinvent level? This enum can simply map 1:1 to Python levels, and we just resolve SLF4J levels to Python levels directly.
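
One way to read this suggestion, combined with the TRACE to NOTSET idea above, is an enum that mirrors Python's `logging` levels and their numeric values, with SLF4J level names resolved straight onto it. The sketch below is an assumption about what that could look like; `PyLevel`, `fromSlf4j`, and `isEnabledAt` are hypothetical names and not part of the PR.

```java
// Mirrors the standard Python logging levels and their numeric values.
enum PyLevel {
    NOTSET(0), DEBUG(10), INFO(20), WARNING(30), ERROR(40), CRITICAL(50);

    final int value;

    PyLevel(int value) {
        this.value = value;
    }

    // Resolves an SLF4J level name (for example "WARN") to a Python level.
    static PyLevel fromSlf4j(String slf4jName) {
        switch (slf4jName) {
            case "TRACE":
                return NOTSET; // per the review suggestion above
            case "DEBUG":
                return DEBUG;
            case "INFO":
                return INFO;
            case "WARN":
                return WARNING;
            case "ERROR":
                return ERROR;
            default:
                throw new IllegalArgumentException("unknown SLF4J level: " + slf4jName);
        }
    }

    // True when a record at this level passes the given threshold.
    boolean isEnabledAt(PyLevel threshold) {
        return value >= threshold.value;
    }
}

final class PyLevelDemo {
    public static void main(String[] args) {
        PyLevel warn = PyLevel.fromSlf4j("WARN");
        System.out.println(warn + " = " + warn.value);
        System.out.println("passes INFO threshold: " + warn.isEnabledAt(PyLevel.INFO));
    }
}
```

Under this mapping a TRACE record has value 0, so it would only pass a NOTSET threshold.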

const val LEVEL_PROPERTY = "airflow.logging.level"
const val LEVEL_ENV = "AIRFLOW__LOGGING__LOGGING_LEVEL"

fun threshold(): Level = parse(configuredLevel()) ?: Level.INFO

Contributor

Can you test this scenario with this PR? I think it breaks.

In airflow.cfg
[logging]
logging_level = WARNING

Since it won't be in os.environ (no AIRFLOW__LOGGING__LOGGING_LEVEL is set), Java resolves to INFO.

Overall Result: Java WARNING logs will be silently dropped. A Python task in the same deployment with the same cfg would show WARNING logs.

I think this is identical config but divergent behavior by task language.
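
The fallback in this scenario can be sketched as below. The two constant values are the ones shown in the diff; everything else is an assumption for illustration, including the lookup order (system property, then environment variable, then INFO) and the `ThresholdSketch` and `threshold` names. Plain maps stand in for `System.getProperties()` and `System.getenv()` so the behavior can be exercised directly.

```java
import java.util.Locale;
import java.util.Map;

final class ThresholdSketch {
    static final String LEVEL_PROPERTY = "airflow.logging.level";
    static final String LEVEL_ENV = "AIRFLOW__LOGGING__LOGGING_LEVEL";

    private ThresholdSketch() {}

    // Assumed lookup order: JVM system property, then environment variable, then INFO.
    static String threshold(Map<String, String> properties, Map<String, String> env) {
        String configured = properties.get(LEVEL_PROPERTY);
        if (configured == null) {
            configured = env.get(LEVEL_ENV);
        }
        if (configured == null || configured.trim().isEmpty()) {
            return "INFO";
        }
        return configured.trim().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // logging_level = WARNING lives only in airflow.cfg, so neither map carries it.
        System.out.println(threshold(Map.of(), Map.of()));
        // The same value supplied through the environment is picked up.
        System.out.println(threshold(Map.of(), Map.of(LEVEL_ENV, "warning")));
    }
}
```

A value set only in airflow.cfg never reaches either lookup, which is why the thread above proposes sending the level from the supervisor instead.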

@uranusjr

Member

Moved to #68725

Labels

AIP-108: java-sdk Change this to an 'area:' label after AIP acceptance.
