Releases: TransformerLensOrg/TransformerLens
v3.0.0a4
Big update that brings us much closer to beta! This release adds a compatibility layer for many legacy properties of the old hooked root modules.
What's Changed
- Unified aliases by @bryce13950 in #991
- fixed hook alias positions by @bryce13950 in #992
- Create bridge for every module in Mixtral by @degenfabian in #984
- removed numpy ceiling by @bryce13950 in #994
- Ensure hook and property backwards compatibility with HookedTransformer by @degenfabian in #990
- Create bridge for every module in neox by @degenfabian in #995
- Create bridges for every module in neo by @degenfabian in #987
Full Changelog: v3.0.0a3...v3.0.0a4
v3.0.0a3
New alpha release! This version imports more HookedTransformer functionality and improves a number of architectures to give more options in our new module. Together, these changes noticeably improve compatibility with existing HookedTransformer-based code.
What's Changed
- Setup deprecated hook aliases and got the majority of the main demo running properly by @bryce13950 in #976
- Linear test coverage by @bryce13950 in #977
- Create Bridge for every Gemma 3 module by @degenfabian in #966
- Add Bridges for every module in GPT2 by @degenfabian in #967
- Cache hook aliases & stop at layer by @bryce13950 in #978
- Create Bridges for every module in Bloom models by @degenfabian in #970
- Create Bridges for every module in Gemma 2 by @degenfabian in #971
- Create bridges for every module in Gemma 1 by @degenfabian in #972
- Create bridges for every module in Mistral by @degenfabian in #979
- Remove that output_attention flag defaults to true in boot function by @degenfabian in #982
- Create bridge for every module in GPT-J by @degenfabian in #974
- Create bridge for every module in Llama by @degenfabian in #975
Full Changelog: v3.0.0a2...v3.0.0a3
v3.0.0a2
This release is inconsequential on its own. The first alpha release revealed that the CI could not publish to pip with PEP 440-style alpha tags; this release makes that possible. Please consult the release notes for v3.0.0a1 for full information on the 3.x alpha.
What's Changed
- Pre release version publishing by @bryce13950 in #973
Full Changelog: v3.0.0a1...v3.0.0a2
v3.0.0a1
Big release coming up! This release adds a new module named TransformerBridge, which greatly increases the flexibility and extensibility of TransformerLens. It is a very experimental module right now, but we are looking for people who are ready to test it. This version already supports more models than any of the existing HookedModules, and we are in the middle of working through a number of scripts to ensure full compatibility with any existing code utilizing those HookedModules.
If you are interested in helping us test some of this, let us know on the Slack channel! If you want to use models not currently supported by the HookedModules, please feel free to submit any scripts currently running with an existing HookedModule to https://github.com/TransformerLensOrg/BridgeComaptibilityScripts. All scripts in that repo will be confirmed to run, and to match the current HookedTransformer output, before the final 3.0.0 release is published.
What's Changed
- Refactor the utilities file into utilities folder by @starship006 in #628
- Raise exception when BERT is loaded with HookedTransformer instead of… by @degenfabian in #795
- Circular dependency resolution by @bryce13950 in #803
- fixed corner param by @bryce13950 in #817
- bumped python min version by @bryce13950 in #802
- Updates torch to use the most recent version by @bryce13950 in #822
- updated python requirements by @bryce13950 in #821
- Recent releases by @bryce13950 in #841
- updated mypy limit by @bryce13950 in #880
- Activation utils cleanup by @bryce13950 in #879
- Restore consistency of hook_normalized between LayerNorm and RMSNorm by @degenfabian in #770
- Fix that padding_side always defaults to "right" when no value is explicitly passed by @degenfabian in #814
- Unified conversions by @bryce13950 in #881
- Flatten state dictionary for proper weight loading by @degenfabian in #860
- enabled actions on action pr by @bryce13950 in #882
- Add weight conversion for Phi model by @degenfabian in #863
- Add weight conversion for T5 models by @degenfabian in #859
- Visualize weight conversions by @degenfabian in #852
- Fixed test for ensuring weight conversions are provided by @bryce13950 in #883
- Drop python 3.9 by @bryce13950 in #885
- Conversion improved test coverage by @bryce13950 in #886
- Component test coverage by @bryce13950 in #890
- Bug new loading by @bryce13950 in #891
- Weight conversion llama by @bryce13950 in #892
- Refactor supported models module by @bryce13950 in #893
- Bug neox by @bryce13950 in #895
- added conditional check for hugging face by @bryce13950 in #919
- created a separate list of models to test for public PRs by @bryce13950 in #920
- added alternative when hf token is not included by @bryce13950 in #921
- shrunk loss test by @bryce13950 in #922
- Fix broken test, per issue #913 by @JasonBenn in #914
- Fix loading on specific device by @mntss in #906
- Feature model adapter by @bryce13950 in #928
- added test for making sure formatting works well by @bryce13950 in #932
- Refactor final issues by @bryce13950 in #933
- restored tokenizer content by @bryce13950 in #935
- Refactor weight conversion by @bryce13950 in #931
- Add qwen3 by @mntss in #937
- Improve ActivationCache docs by @BorisTheBrave in #901
- Feature: Get the value for rotary base from the hugging face config, only for Qwen for now. by @Gusanidas in #887
- added python 3.13 to CI by @bryce13950 in #843
- updated mypy by @bryce13950 in #940
- updated numpy dependency by @bryce13950 in #943
- updated torch by @bryce13950 in #942
- updated transformers by @bryce13950 in #939
- Fixed Qwen 3 docs issues by @bryce13950 in #946
- upstream fixes from dev by @bryce13950 in #941
- Flexible component mapping by @bryce13950 in #938
- updated sphinx by @bryce13950 in #948
- removed dependency by @bryce13950 in #951
- Move flatten dictionary to architecture_conversion by @degenfabian in #936
- made new transformer bridge extend nn module properly by @bryce13950 in #955
- brought in remaining hooked transformer functions by @bryce13950 in #954
- Setup tokenizer in boot function by @degenfabian in #959
- Bridged Robust Model Structure by @bryce13950 in #960
- Remove transformers dependency from bridge tokenization by @degenfabian in #963
- Dynamically add boot function to bridge by @degenfabian in #964
New Contributors
- @JasonBenn made their first contribution in #914
- @BorisTheBrave made their first contribution in #901
- @Gusanidas made their first contribution in #887
Full Changelog: v2.15.4...v3.0.0a1
v2.16.1
Minor release with security dependency bumps.
What's Changed
- updated sphinx by @bryce13950 in #948
- removed dependency by @bryce13950 in #951
Full Changelog: v2.16.0...v2.16.1
v2.16.0
What's Changed
- added conditional check for hugging face by @bryce13950 in #919
- created a separate list of models to test for public PRs by @bryce13950 in #920
- added alternative when hf token is not included by @bryce13950 in #921
- shrunk loss test by @bryce13950 in #922
- Fix broken test, per issue #913 by @JasonBenn in #914
- Fix loading on specific device by @mntss in #906
- Add qwen3 by @mntss in #937
- Improve ActivationCache docs by @BorisTheBrave in #901
- Feature: Get the value for rotary base from the hugging face config, only for Qwen for now. by @Gusanidas in #887
- updated mypy by @bryce13950 in #940
- updated numpy dependency by @bryce13950 in #943
- updated torch by @bryce13950 in #942
- updated transformers by @bryce13950 in #939
- Fixed Qwen 3 docs issues by @bryce13950 in #946
New Contributors
- @JasonBenn made their first contribution in #914
- @BorisTheBrave made their first contribution in #901
- @Gusanidas made their first contribution in #887
Full Changelog: v2.15.4...v2.16.0
v2.15.4
v2.15.3
What's Changed
- updated triggers to enforce secrets by @bryce13950 in #926
Full Changelog: v2.15.2...v2.15.3
v2.15.2
v2.15.1
Minor fix to LLaMA RoPE compatibility and a fix for the CI.
What's Changed
- added full hf token authentication by @bryce13950 in #916
- Fix LLama RoPE by @mntss in #910
Full Changelog: v2.15.0...v2.15.1