Commit 97bd210

[EZ] Replace pytorch-labs with meta-pytorch (#2914)
1 parent dc0441a commit 97bd210

6 files changed: +11 -11 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -324,7 +324,7 @@ torchtune focuses on integrating with popular tools and libraries from the ecosy
 - [EleutherAI's LM Eval Harness](https://github.com/EleutherAI/lm-evaluation-harness) for [evaluating](recipes/eleuther_eval.py) trained models
 - [Hugging Face Datasets](https://huggingface.co/docs/datasets/en/index) for [access](torchtune/datasets/_instruct.py) to training and evaluation datasets
 - [PyTorch FSDP2](https://github.com/pytorch/torchtitan/blob/main/docs/fsdp.md) for distributed training
-- [torchao](https://github.com/pytorch-labs/ao) for lower precision dtypes and [post-training quantization](recipes/quantize.py) techniques
+- [torchao](https://github.com/pytorch/ao) for lower precision dtypes and [post-training quantization](recipes/quantize.py) techniques
 - [Weights & Biases](https://wandb.ai/site) for [logging](https://pytorch.org/torchtune/main/deep_dives/wandb_logging.html) metrics and checkpoints, and tracking training progress
 - [Comet](https://www.comet.com/site/) as another option for [logging](https://pytorch.org/torchtune/main/deep_dives/comet_logging.html)
 - [ExecuTorch](https://pytorch.org/executorch-overview) for [on-device inference](https://github.com/pytorch/executorch/tree/main/examples/models/llama2#optional-finetuning) using finetuned models
@@ -351,7 +351,7 @@ We really value our community and the contributions made by our wonderful users.
 The transformer code in this repository is inspired by the original [Llama2 code](https://github.com/meta-llama/llama/blob/main/llama/model.py). We also want to give a huge shout-out to EleutherAI, Hugging Face and
 Weights & Biases for being wonderful collaborators and for working with us on some of these integrations within torchtune. In addition, we want to acknowledge some other awesome libraries and tools from the ecosystem:

-- [gpt-fast](https://github.com/pytorch-labs/gpt-fast) for performant LLM inference techniques which we've adopted out-of-the-box
+- [gpt-fast](https://github.com/meta-pytorch/gpt-fast) for performant LLM inference techniques which we've adopted out-of-the-box
 - [llama recipes](https://github.com/meta-llama/llama-recipes) for spring-boarding the llama2 community
 - [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) for bringing several memory and performance based techniques to the PyTorch ecosystem
 - [@winglian](https://github.com/winglian/) and [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for early feedback and brainstorming on torchtune's design and feature set.

docs/source/tutorials/e2e_flow.rst

Lines changed: 1 addition & 1 deletion
@@ -341,7 +341,7 @@ these parameters.
 Introduce some quantization
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

-We rely on `torchao <https://github.com/pytorch-labs/ao>`_ for `post-training quantization <https://github.com/pytorch/ao/tree/main/torchao/quantization#quantization>`_.
+We rely on `torchao <https://github.com/pytorch/ao>`_ for `post-training quantization <https://github.com/pytorch/ao/tree/main/torchao/quantization#quantization>`_.
 To quantize the fine-tuned model after installing torchao we can run the following command::

     # we also support `int8_weight_only()` and `int8_dynamic_activation_int8_weight()`, see
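
For context on what this quantization step does, below is a minimal sketch of the torchao post-training quantization API that the tutorial leans on. It is an illustration written for this note, not the torchtune recipe command the tutorial refers to; it assumes `quantize_` and `int8_weight_only` from `torchao.quantization` and uses a stand-in `model`.

```python
# Minimal sketch (not the torchtune recipe): int8 weight-only post-training
# quantization with torchao, as mentioned in the context line above.
import torch
from torch import nn
from torchao.quantization import int8_weight_only, quantize_

# Stand-in for the fine-tuned model loaded from a checkpoint.
model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 11008))

# Replaces the weights of eligible linear layers with int8 quantized tensors in place.
quantize_(model, int8_weight_only())

# The quantized state dict can then be saved and used for generation.
torch.save(model.state_dict(), "model_int8wo.pt")
```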

docs/source/tutorials/llama3.rst

Lines changed: 1 addition & 1 deletion
@@ -241,7 +241,7 @@ Running generation with our LoRA-finetuned model, we see the following output:
 Faster generation via quantization
 ----------------------------------

-We rely on `torchao <https://github.com/pytorch-labs/ao>`_ for `post-training quantization <https://github.com/pytorch/ao/tree/main/torchao/quantization#quantization>`_.
+We rely on `torchao <https://github.com/pytorch/ao>`_ for `post-training quantization <https://github.com/pytorch/ao/tree/main/torchao/quantization#quantization>`_.
 To quantize the fine-tuned model after installing torchao we can run the following command::

     # we also support `int8_weight_only()` and `int8_dynamic_activation_int8_weight()`, see
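
The other quantizer named in this context line, `int8_dynamic_activation_int8_weight()`, goes through the same torchao `quantize_` call. A tiny sketch follows (an illustration only, not the tutorial's recipe command), complementing the int8 weight-only sketch after the e2e_flow hunk above.

```python
# Sketch only: the dynamic-activation variant mentioned above quantizes
# activations to int8 at runtime in addition to quantizing weights to int8.
from torch import nn
from torchao.quantization import int8_dynamic_activation_int8_weight, quantize_

model = nn.Sequential(nn.Linear(4096, 4096))  # stand-in for the fine-tuned model
quantize_(model, int8_dynamic_activation_int8_weight())
```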

docs/source/tutorials/qlora_finetune.rst

Lines changed: 3 additions & 3 deletions
@@ -42,7 +42,7 @@ accuracy.

 The QLoRA authors introduce two key abstractions to decrease memory usage and avoid accuracy degradation: the bespoke 4-bit NormalFloat
 type, and a double quantization method that quantizes the quantization parameters themselves to save even more memory. torchtune uses
-the `NF4Tensor <https://github.com/pytorch-labs/ao/blob/b9beaf351e27133d189b57d6fa725b1a7824a457/torchao/dtypes/nf4tensor.py#L153>`_ abstraction from the `torchao library <https://github.com/pytorch-labs/ao>`_ to build QLoRA components as specified in the paper.
+the `NF4Tensor <https://github.com/pytorch/ao/blob/b9beaf351e27133d189b57d6fa725b1a7824a457/torchao/dtypes/nf4tensor.py#L153>`_ abstraction from the `torchao library <https://github.com/pytorch/ao>`_ to build QLoRA components as specified in the paper.
 torchao is a PyTorch-native library that allows you to quantize and prune your models.


@@ -275,7 +275,7 @@ As mentioned above, torchtune takes a dependency on torchao for some of the core

 The key changes on top of the LoRA layer are the usage of the ``to_nf4`` and ``linear_nf4`` APIs.

-``to_nf4`` accepts an unquantized (bf16 or fp32) tensor and produces an ``NF4`` representation of the weight. See the `implementation <https://github.com/pytorch-labs/ao/blob/c40358072f99b50cd7e58ec11e0e8d90440e3e25/torchao/dtypes/nf4tensor.py#L587>`_ of ``to_nf4`` for more details.
+``to_nf4`` accepts an unquantized (bf16 or fp32) tensor and produces an ``NF4`` representation of the weight. See the `implementation <https://github.com/pytorch/ao/blob/c40358072f99b50cd7e58ec11e0e8d90440e3e25/torchao/dtypes/nf4tensor.py#L587>`_ of ``to_nf4`` for more details.
 ``linear_nf4`` handles the forward pass and autograd when running with quantized base model weights. It computes the forward pass as a regular
 ``F.linear`` with the incoming activation and unquantized weight. The quantized weight is saved for backward, as opposed to the unquantized version of the weight, to avoid extra
-memory usage due to storing higher precision variables to compute gradients in the backward pass. See `linear_nf4 <https://github.com/pytorch-labs/ao/blob/main/torchao/dtypes/nf4tensor.py#L577>`_ for more details.
+memory usage due to storing higher precision variables to compute gradients in the backward pass. See `linear_nf4 <https://github.com/pytorch/ao/blob/main/torchao/dtypes/nf4tensor.py#L577>`_ for more details.
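
To ground the `to_nf4` / `linear_nf4` description in the hunk above, here is a small sketch of a frozen NF4 linear layer built on those two torchao APIs. `ToyNF4Linear` is written for this note as an illustration of the pattern, not torchtune's actual QLoRA component.

```python
# Illustrative sketch: a frozen base linear whose weight is stored as NF4 via
# to_nf4, and whose forward pass runs through linear_nf4 as described above.
import torch
from torch import nn
from torchao.dtypes.nf4tensor import linear_nf4, to_nf4


class ToyNF4Linear(nn.Module):
    def __init__(self, weight: torch.Tensor):
        super().__init__()
        # to_nf4 quantizes the bf16/fp32 weight into the 4-bit NormalFloat layout.
        self.weight = nn.Parameter(to_nf4(weight), requires_grad=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # linear_nf4 upcasts the weight for the matmul and saves the NF4 copy for backward.
        return linear_nf4(input=x, weight=self.weight)


layer = ToyNF4Linear(torch.randn(512, 512, dtype=torch.bfloat16))
out = layer(torch.randn(2, 512, dtype=torch.bfloat16))
```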

recipes/eleuther_eval.py

Lines changed: 3 additions & 3 deletions
@@ -42,7 +42,7 @@

 class _VLMEvalWrapper(HFMultimodalLM):
     """An EvalWrapper for EleutherAI's eval harness based on gpt-fast's
-    EvalWrapper: https://github.com/pytorch-labs/gpt-fast/blob/main/eval.py.
+    EvalWrapper: https://github.com/meta-pytorch/gpt-fast/blob/main/eval.py.

     Note:
         This is ONLY for vision-language models.
@@ -283,7 +283,7 @@ def _model_multimodal_generate(

 class _LLMEvalWrapper(HFLM):
     """An EvalWrapper for EleutherAI's eval harness based on gpt-fast's
-    EvalWrapper: https://github.com/pytorch-labs/gpt-fast/blob/main/eval.py.
+    EvalWrapper: https://github.com/meta-pytorch/gpt-fast/blob/main/eval.py.

     Note:
         This is for text-only decoder models.
@@ -355,7 +355,7 @@ def tok_encode(self, text: str, **kwargs) -> list[int]:
         # +1% on truthfulqa_mc2 with a LoRA finetune. lit-gpt also sets this to False,
         # see https://github.com/Lightning-AI/lit-gpt/blob/main/eval/lm_eval_harness.py#L66,
         # though notably gpt-fast does the opposite
-        # https://github.com/pytorch-labs/gpt-fast/blob/main/eval.py#L123.
+        # https://github.com/meta-pytorch/gpt-fast/blob/main/eval.py#L123.
         if isinstance(self._tokenizer, HuggingFaceModelTokenizer):
             return self._tokenizer.base_tokenizer.encode(
                 text=text, add_bos=False, add_eos=False

torchtune/generation/_generation.py

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ def generate_next_token(
         x (torch.Tensor): tensor with the token IDs associated with the given prompt,
             with shape [bsz x seq_length].
         q (Optional[torch.Tensor]): randomly sampled tensor for softmax sampling trick.
-            See https://github.com/pytorch-labs/gpt-fast/blob/32971d3129541c5bfb4f715abc33d1c5f408d204/generate.py#L40
+            See https://github.com/meta-pytorch/gpt-fast/blob/32971d3129541c5bfb4f715abc33d1c5f408d204/generate.py#L40
         mask (Optional[torch.Tensor]): attention mask with shape [bsz x seq_length x seq_length],
             default None.
         temperature (float): value to scale the predicted logits by, default 1.0.
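
For readers unfamiliar with the "softmax sampling trick" that the `q` argument refers to, a short sketch of the idea as used in gpt-fast follows: dividing the softmax probabilities by an exponentially distributed tensor and taking the argmax is equivalent to sampling from the categorical distribution, without the device sync that `torch.multinomial` incurs. The helper name below is illustrative, not torchtune's API.

```python
# Illustrative sketch of the exponential ("softmax") sampling trick:
# argmax(probs / q) with q ~ Exponential(1) is a draw from Categorical(probs).
import torch


def sample_next_token(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    probs = torch.softmax(logits / temperature, dim=-1)
    # q plays the role of the pre-sampled tensor passed into generate_next_token.
    q = torch.empty_like(probs).exponential_(1)
    return torch.argmax(probs / q, dim=-1, keepdim=True).to(dtype=torch.int)


logits = torch.randn(2, 32_000)  # [bsz x vocab_size]
next_token = sample_next_token(logits)
```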
