Skip to content

fix(pkg/pull): Route git/GitHub URLs to manifest provider#2681

Open
robaa12 wants to merge 13 commits intounikraft:stagingfrom
robaa12:staging
Open

fix(pkg/pull): Route git/GitHub URLs to manifest provider#2681
robaa12 wants to merge 13 commits intounikraft:stagingfrom
robaa12:staging

Conversation

@robaa12
Copy link
Copy Markdown
Contributor

@robaa12 robaa12 commented Mar 5, 2026

Prerequisite checklist

Description of changes

Closes #1114 kraft pkg pull failed to handle git/GitHub URLs such as:

  • github.com/unikraft/lib-nginx.git
  • https://github.com/unikraft/lib-nginx
  • git@github.com:unikraft/lib-nginx.git

Instead of routing them to the manifest provider, they were either claimed by the OCI manager or crashed with a nil pointer dereference.

# Root Cause Fix
1 name.ParseReference treats github.com as a valid registry host, so OCIManager.IsCompatible claimed git/GitHub URLs before the manifest manager could Added isGitSource() to reject these URLs early in IsCompatible and Catalog()
2 git/GitHub args in pull.go were passed as WithName, but the manifest manager's Catalog() only resolves a provider when WithSource is set Changed to use WithSource for all git/GitHub args
3 ManifestManager.IsCompatible had no recognition of git/GitHub URL patterns, so it never short-circuited to claim them Added an early-return guard matching .git, git@, github.com/, and ://github.com/
4 When both NewGitProvider and NewGitHubProvider failed, a nil provider was returned without error, causing a nil pointer dereference in Manifests() Added a nil guard before the fallback return in NewProvider
5 NewGitProvider converted git@HOST:path to ssh://git@HOST:path (colon kept), which URL parsers reject Fixed to produce ssh://git@HOST/path (colon replaced with slash)
6 ghrepo.NewFromURL used url.Parse directly, which cannot handle SCP-style SSH or bare github.com/ URLs, resulting in an empty host Normalize both forms to https:// before parsing

@github-project-automation github-project-automation Bot moved this to 🧊 Icebox in KraftKit Roadmap Mar 5, 2026
@robaa12 robaa12 changed the title fix(pkg/pull): route git/GitHub URLs to manifest provider fix(pkg/pull): Route git/GitHub URLs to manifest provider Mar 9, 2026
@craciunoiuc
Copy link
Copy Markdown
Member

Can you create commit for each fix in the table please? thanks 🙏

@robaa12
Copy link
Copy Markdown
Contributor Author

robaa12 commented Mar 10, 2026

Hi @craciunoiuc , I implemented all the requested changes. Please take another look when you get a chance. Happy to make any further changes if needed.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes kraft pkg pull handling for git/GitHub-style inputs by ensuring they are routed to the manifest provider (instead of being treated as OCI registry/image refs), and prevents a nil-provider dereference during provider selection.

Changes:

  • Add git-source heuristics to prevent the OCI manager from claiming git/GitHub inputs and to allow the manifest manager to short-circuit compatibility checks.
  • Update kraft pkg pull arg handling to pass git/GitHub inputs via WithSource (so the manifest provider resolution path is used).
  • Improve normalization of SCP-style SSH and bare github.com/... URLs in git/GitHub parsing and add unit tests for the new behavior.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
oci/manager.go Adds isGitSource and uses it to avoid OCI lookups for git/GitHub sources in Catalog/IsCompatible.
oci/manager_test.go Adds unit tests for isGitSource.
manifest/provider.go Adds nil-guard so provider selection never returns a nil provider without error.
manifest/provider_git.go Normalizes bare github.com/... and fixes SCP-style SSH conversion to a parsable ssh:// form.
manifest/manager.go Adds early-compatibility claim for git/GitHub patterns so manifest manager can handle them.
manifest/manager_test.go Adds unit tests for git-source compatibility behavior.
internal/ghrepo/repo.go Normalizes SCP-style SSH and bare github.com/... before url.Parse to correctly extract owner/repo.
internal/ghrepo/repo_test.go Adds unit tests for NewFromURL normalization and parsing.
internal/cli/kraft/pkg/pull/pull.go Routes detected git/GitHub args via WithSource instead of WithName.
Comments suppressed due to low confidence (1)

oci/manager.go:1001

  • In OCIManager.IsCompatible, the new isGitSource guard only exists inside isRegistry/isRemoteImage (which are only evaluated when query.Remote() is true). When Remote is false (a common/default), isFullyQualifiedNameReference can still return true for GitHub-like git inputs (since name.ParseReference accepts them), so OCI may still claim git sources. Consider applying the git-source guard before running any of the checks (or inside isFullyQualifiedNameReference) so compatibility is consistently false for git sources regardless of Remote.
	// Check if the provided source an OCI Distrubtion Spec capable registry
	isRegistry := func(source string) bool {
		log.G(ctx).
			WithField("source", source).
			Tracef("checking if source is registry")

		if isGitSource(source) {
			return false
		}

		regName, err := name.NewRegistry(source)
		if err != nil {
			return false
		}

		if _, err := transport.Ping(ctx, regName, http.DefaultTransport.(*http.Transport).Clone()); err == nil {
			return true
		}

		return false
	}

	// Check if the provided source is OCI registry
	isRemoteImage := func(source string) bool {
		log.G(ctx).
			WithField("source", source).
			Tracef("checking if source is remote image")

		if isGitSource(source) {
			return false
		}

		ref, err := name.ParseReference(source,
			name.WithDefaultRegistry(DefaultRegistry),
			name.WithDefaultTag(DefaultTag),
		)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread oci/manager.go
Comment on lines +916 to +921
func isGitSource(source string) bool {
return strings.HasSuffix(source, ".git") ||
strings.HasPrefix(source, "git@") ||
strings.HasPrefix(source, "github.com/") ||
strings.Contains(source, "://github.com/")
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isGitSource doesn’t recognize SSH URL forms like ssh://git@github.com/owner/repo (no .git suffix) because it only checks git@ at the start and ://github.com/ (which won’t match ://git@github.com/). This means some valid git sources will still be treated as OCI refs and won’t be routed to the manifest provider. Consider parsing as a URL and matching hostname/user, or extending the checks to cover ssh:// (and ssh+git://, git+ssh://) variants explicitly.

Copilot uses AI. Check for mistakes.
Comment thread oci/manager.go
Comment on lines +916 to +921
func isGitSource(source string) bool {
return strings.HasSuffix(source, ".git") ||
strings.HasPrefix(source, "git@") ||
strings.HasPrefix(source, "github.com/") ||
strings.Contains(source, "://github.com/")
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .git suffix check in isGitSource will also return true for valid OCI image references whose repository name ends with .git (e.g. ghcr.io/org/foo.git without an explicit tag). That would prevent the OCI manager from handling those images and may cause a regression. Consider narrowing the heuristic (e.g. require a URL scheme/SSH form, or specifically target known git host patterns) so non-git OCI refs aren’t excluded.

Copilot uses AI. Check for mistakes.
Comment thread internal/cli/kraft/pkg/pull/pull.go Outdated
Comment on lines +346 to +353
if strings.HasSuffix(arg, ".git") ||
strings.HasPrefix(arg, "git@") ||
strings.HasPrefix(arg, "github.com/") ||
strings.Contains(arg, "://github.com/") {
qopts = append(qopts, packmanager.WithSource(arg))
} else {
qopts = append(qopts, packmanager.WithName(arg))
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The git/GitHub detection here doesn’t match ssh://git@github.com/owner/repo (no .git suffix), so those sources will still be treated as package names and won’t set WithSource. Consider expanding detection to include SSH URL forms (and/or reusing a shared helper used by the manifest/git provider) so all supported git URL formats are routed consistently.

Copilot uses AI. Check for mistakes.
Comment thread manifest/manager.go Outdated
Comment on lines +521 to +526
if strings.HasSuffix(source, ".git") ||
strings.HasPrefix(source, "git@") ||
strings.HasPrefix(source, "github.com/") ||
strings.Contains(source, "://github.com/") {
return m, true, nil
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This early-claim heuristic misses GitHub SSH URL forms like ssh://git@github.com/owner/repo (no .git suffix) for the same reason as isGitSource in the OCI manager. If users pass those forms, the manifest manager may not be selected as compatible. Consider parsing as a URL and matching hostname, or adding explicit handling for ssh://git@github.com/ (and similar SSH variants).

Copilot uses AI. Check for mistakes.
@robaa12
Copy link
Copy Markdown
Contributor Author

robaa12 commented Apr 3, 2026

I am working on the copilot changes now. Will push an update soon.

@robaa12 robaa12 force-pushed the staging branch 2 times, most recently from 6192ae3 to 3b7c460 Compare April 5, 2026 07:23
@robaa12
Copy link
Copy Markdown
Contributor Author

robaa12 commented Apr 5, 2026

Hey @craciunoiuc!

I addressed the Copilot feedback by creating a shared gitutil package instead of patching the inline code. Here's what I did:

What changed

  1. New internal/gitutil package : Created a centralized IsGitSource() function that properly detects git URLs vs OCI images. Now handles all SSH URL variants (ssh://, git@, git+ssh://, etc.) and correctly distinguishes OCI images like ghcr.io/org/repo.git:latest from actual git sources.

  2. Updated the 3 files : oci/manager.go, manifest/manager.go, and pull.go now all use this shared function instead of having the same detection logic copy-pasted in 3 places.

Why did I choose this approach?

  • Copilot pointed out that the old code didn't handle SSH URLs without .git suffix and could have false positives with OCI images. Having one centralized function makes it easier to fix and test.
  • Removed duplicate test code — the core logic is tested in one place now.
  • All tests cases pass locally (make fmt also passes).

Copy link
Copy Markdown
Member

@craciunoiuc craciunoiuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, can you squash this commit? afterwards it will be good to merge

Image

robaa12 added 10 commits April 6, 2026 11:47
NewFromURL used url.Parse directly, which cannot handle SCP-style SSH
URLs (git@github.com:owner/repo.git) or bare github.com/ paths without
a scheme. Both cases result in an empty host, causing the "host is not
github.com" error.

Normalize both forms to https:// before parsing so the function works
for all common GitHub URL formats.

Signed-off-by: robaa12 <arobaa23@gmail.com>
NewGitProvider converted SCP-style git@HOST:path URLs to
ssh://git@HOST:path (colon kept), which URL parsers (giturl, go-git)
reject. The colon must become a slash: ssh://git@HOST/path.

Also add explicit https:// normalization for bare github.com/ paths
so they resolve without requiring a scheme prefix.

Signed-off-by: robaa12 <arobaa23@gmail.com>
When both NewGitProvider and NewGitHubProvider fail, NewProvider
returned a nil provider without error, causing a nil pointer
dereference in Manifests(). Add a nil guard before the fallback
return so the error path is taken instead.

Signed-off-by: robaa12 <arobaa23@gmail.com>
ManifestManager.IsCompatible had no recognition of git/GitHub URL
patterns, so it never short-circuited to claim them. Sources with a
.git suffix, git@ prefix, github.com/ prefix, or ://github.com/
substring were not identified as compatible, causing the manifest
manager to be skipped entirely.

Add an early-return guard so these sources are claimed before the
authentication/registry lookup path runs.

Signed-off-by: robaa12 <arobaa23@gmail.com>
google/go-containerregistry's name.ParseReference accepts github.com/
paths as valid OCI references (treating github.com as a registry
host), so OCIManager.IsCompatible incorrectly claimed git/GitHub URLs
before the manifest manager could handle them.

Add isGitSource() to detect .git suffixes, git@ prefixes,
github.com/ prefixes, and ://github.com/ substrings, and return early
in both the isRegistry and isRemoteImage closures. Also return early
from Catalog() for git sources to prevent the entire local OCI image
list from leaking into results.

Signed-off-by: robaa12 <arobaa23@gmail.com>
git/GitHub args in pull.go were passed as WithName, but the manifest
manager's Catalog() only resolves a provider when WithSource is set.
As a result, Catalog() tried to glob-match the full URL against
manifest names instead of calling NewProvider(ctx, source) to
resolve the correct provider.

Detect git/GitHub URL shapes (.git suffix, git@ prefix, github.com/
prefix, ://github.com/ substring) and pass them as WithSource so
the manifest manager can route them to the correct provider.

Signed-off-by: robaa12 <arobaa23@gmail.com>
Add centralized utility to detect git repository URLs vs OCI image
references. Handles SSH URLs (ssh://, git@, git+ssh://), HTTPS URLs
and bare host/path patterns. Also distinguishes OCI images with
.git in the name from actual git sources.

Fixes unikraft#1114

Signed-off-by: robaa12 <arobaa23@gmail.com>
Replace inline string matching with centralized IsGitSource to
prevent OCI manager from claiming git/GitHub URLs. Also reduce test
duplication as core logic is tested in gitutil package.

Signed-off-by: robaa12 <arobaa23@gmail.com>
Replace inline git URL detection with centralized IsCompatible
check. Add test cases for SSH URL variants.

Signed-off-by: robaa12 <arobaa23@gmail.com>
Replace inline detection to properly route git/GitHub arguments via
WithSource instead of WithName so manifest provider is used.

Signed-off-by: robaa12 <arobaa23@gmail.com>
@robaa12
Copy link
Copy Markdown
Contributor Author

robaa12 commented Apr 6, 2026

Done, please have another look when u get a chance.

@craciunoiuc
Copy link
Copy Markdown
Member

Hey, this did not work for me with these cases, am I doing something wrong?

➜  test-pr ~/kraftkit/main pkg pull "git@github.com:unikraft/app-nginx.git"
 i  finding *
 E  could not find local reference for *
 i  finding *
 E  could not find *
 E  could not find *
➜  test-pr ~/kraftkit/main pkg pull "ssh+git@github.com:unikraft/app-nginx.git"
 i  finding *
 E  could not find local reference for *
 i  finding *
 E  could not find *
 E  could not find *

also note that it's showing a wildcard everywhere instead of the URL

@robaa12
Copy link
Copy Markdown
Contributor Author

robaa12 commented Apr 6, 2026

reproduced it on my end too.

I'm working on a follow-up fix for those exact git@... / ssh+git@... cases (and a couple of related SSH URL edge cases). I've tracked down where it breaks in the URL normalization/provider path and I'm testing the fix now.

Also, the * you saw in finding * is a query rendering thing , source-only lookups were printing the name field, and when name is empty it just falls back to *. I am working on it.

I'll push an update soon and drop another comment with exactly what changed.

robaa12 added 3 commits April 7, 2026 06:06
When pkg pull uses WithSource without a package name, Query.String
fell back to '*', which made progress and error output misleading.

Prefer the source field when name is empty so logs show the exact
URL being queried.

Signed-off-by: robaa12 <arobaa23@gmail.com>
Handle ssh+git@ and ssh://git@...:owner/repo forms consistently in
source detection and URL parsing helpers.

This prevents valid GitHub SSH-like inputs from being rejected and
adds tests covering the additional normalization paths.

Signed-off-by: robaa12 <arobaa23@gmail.com>
When probing GitHub sources, using the raw SSH-like input could depend
on SSH agent availability and fail for public repositories.

Probe via normalized HTTPS and preserve manifest origin as the original
input so source-based filtering still matches the requested URL.

Signed-off-by: robaa12 <arobaa23@gmail.com>
@robaa12
Copy link
Copy Markdown
Contributor Author

robaa12 commented Apr 7, 2026

What was going wrong

  1. git@... / ssh+git@... still had gaps in normalization/parsing across the provider flow.
  2. finding * happened because source-only queries were rendered from name; when name was empty it fell back to *.
  3. GitHub probing could still depend on raw SSH-like input in some paths, which could fail even for public repos.

What I changed

  • packmanager/query.go
    • show source in Query.String() when name is empty (so no more wildcard * in logs/errors for URL pulls)
  • SSH URL handling improvements
    • normalize additional SSH-style variants (ssh+git@..., ssh://git@...:owner/repo...) in:
      • internal/gitutil/detector.go
      • internal/ghrepo/repo.go
      • manifest/provider_git.go
    • added test coverage for these URL forms in:
      • internal/gitutil/detector_test.go
      • internal/ghrepo/repo_test.go
  • manifest/provider_github.go
    • probe via normalized HTTPS for GitHub sources
    • preserve manifest.Origin as the original input source so source filtering still matches correctly

Verification

I rebuilt and tested ./bin/kraft manually with multiple formats, including the two you reported.
Also validated additional forms (https://, bare github.com/..., ssh://git@... slash/colon, git+ssh://, ssh+git://) and all passed in my local.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: 🧊 Icebox

Development

Successfully merging this pull request may close these issues.

kraft pkg pull no longer works with GitHub sources

3 participants