
Change Llama2 from the Turbine implementation to the Sharktank one#2170

Draft
gpetters-amd wants to merge 1 commit into nod-ai:main from gpetters-amd:sharktank

Conversation

@gpetters-amd
Contributor

There are still two outstanding issues I'd like some comments on, but otherwise this should be basically done.

huggingface_hub.snapshot_download(
repo_id=self.hf_model_name, cache_dir=cache_dir
)
# TODO: Convert to gguf, delete cache
Contributor Author


The way sharktank recommends generating the .gguf file is to use a CLI tool from llama.cpp. Is that still the best way to produce it, or do we have a way to do it using sharktank?
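For reference, the llama.cpp route mentioned above would look roughly like the sketch below. The script name and flags are from recent llama.cpp; the snapshot path and output filename are placeholders, not part of this PR.

```shell
# Hypothetical sketch: convert the HF snapshot (from snapshot_download)
# to GGUF using llama.cpp's converter. Paths are placeholders.
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt
python llama.cpp/convert_hf_to_gguf.py \
    /path/to/hf/snapshot \
    --outfile llama2.gguf \
    --outtype f16
```

The `--outtype` flag controls the tensor precision of the emitted GGUF (e.g. `f32`, `f16`, `q8_0`).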

model = PagedLlamaModelV1(dataset.root_theta, llama_config)

fxb = FxProgramsBuilder(model)
self.torch_ir = export(fxb)
Contributor Author


Not sure why, but this is producing an empty module. Any idea what I'm missing?
