Merge with internal master - 2024-08-05 #1026
Closed
Fix Docker URL security: use the Microsoft Container Registry instead of public Docker Hub
…able specific jobs
This PR adds a first working version of ALIBI with algorithmic shifts for encoder-decoder models. It also adds trainable ALIBI slopes and biases, and ALIBI support in general to the **new** layer framework. This is still experimental.
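As a minimal pure-Python sketch of the ALiBi idea (illustrative only, not Marian's implementation): each attention head gets a slope that scales a linear distance penalty added to the attention logits, and both the slope and an additive bias term can be made trainable parameters.

```python
def alibi_slopes(n_heads):
    # Standard ALiBi slopes: the geometric sequence 2^(-8/n), 2^(-16/n), ...
    # These become learnable parameters when slopes are trainable.
    return [2 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)]

def alibi_bias(slope, q_len, k_len, bias=0.0):
    # Bias matrix added to the attention logits: slope * -(distance),
    # plus an optional per-head additive bias, penalizing distant keys.
    return [[slope * -abs(j - i) + bias for j in range(k_len)]
            for i in range(q_len)]
```

With 8 heads the first slope is 2^-1 = 0.5, so a key one position away receives a -0.5 logit penalty.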
This adds nucleus and epsilon sampling to the output-sampling options.
* This required the implementation of a sorting algorithm; tested thrust and CUB.
* Implementation of cumsum and logcumsumexp (no gradient for now) operators.
* Various minor improvements.
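To illustrate why a sort and a cumulative sum are needed (function names here are hypothetical, not Marian's API): nucleus (top-p) sampling keeps the smallest high-probability set whose cumulative mass reaches p, while epsilon sampling keeps every token above a fixed probability threshold.

```python
def nucleus_filter(probs, p):
    # Keep the smallest set of highest-probability tokens whose
    # cumulative probability reaches p (hence sort + cumsum).
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    return kept

def epsilon_filter(probs, eps):
    # Keep only tokens with probability >= eps.
    return [i for i, q in enumerate(probs) if q >= eps]
```

Sampling then proceeds from the kept set after renormalizing its probabilities.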
This adjusts the logmask computation to match the implementation in the COMET-QE model after the ALIBI refactoring.
This adds a sparsemax function and support for the COMET-22 ref-based metric. Worth adding a regression test for the Unbabel/wmt22-comet-da model later. Scores seem to be pretty much identical to the PyTorch implementation when running as float32.
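For reference, a minimal pure-Python sparsemax (Martins & Astudillo, 2016), the function this change adds: it projects the logits onto the probability simplex by subtracting a threshold tau, which, unlike softmax, produces exact zeros for low-scoring entries.

```python
def sparsemax(z):
    # Sort logits descending and find the support size k: the largest j
    # with 1 + j * z_(j) > sum of the top-j logits.
    zs = sorted(z, reverse=True)
    cumsum, tau = 0.0, 0.0
    for j, zj in enumerate(zs, start=1):
        cumsum += zj
        if 1 + j * zj > cumsum:
            tau = (cumsum - 1.0) / j  # threshold from the support set
    # Shift by tau and clip at zero; the result sums to 1.
    return [max(zi - tau, 0.0) for zi in z]
```

For well-separated logits the output is fully sparse, e.g. `sparsemax([2.0, 1.0, 0.5])` puts all mass on the first entry.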
This is a rewrite of the graph loading and memory-mapping functionality. We now mmap and share opportunistically, i.e. whenever possible:
* with CPU decoding and *.bin files, everything is automatically mmapped;
* with *.npz files, the model is read only once;
* on the GPU, *.bin files are mmapped but still copied to the GPU, ideally bypassing CPU memory.

This quite drastically reduces unnecessary CPU memory overhead and loading time for things like COMET scoring.
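The CPU-side idea can be sketched with Python's stdlib `mmap` (illustrative only; Marian's loader is C++, and the file layout here is made up): mapping a weight file read-only lets the OS share its page-cache pages across processes, instead of each load copying the file into private heap memory.

```python
import mmap
import os
import struct
import tempfile

# Write a tiny fake "model.bin" containing four float32 weights.
path = os.path.join(tempfile.mkdtemp(), "model.bin")
with open(path, "wb") as f:
    f.write(struct.pack("<4f", 1.0, 2.0, 3.0, 4.0))

# Map it read-only: parameter tensors can be viewed directly in the
# mapping, so the file's pages are shared rather than privately copied.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    weights = struct.unpack_from("<4f", mm, 0)
    mm.close()
```

For GPU decoding the mapped pages still have to be copied to device memory, which is why the .bin case above only saves the intermediate CPU-side copy.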
This PR implements:
* Comet-Kiwi: fully functional.
* xComet-XL and xComet-XXL: scores for the regressor part fully matching; MQM partial scores not implemented yet.
Fixes a small bug for mt-detect models
…marian-evaluate

This PR adds minor fixes to pybindings and pymarian-evaluate:
* The comet2marian.py script correctly handles the wmt23-cometkiwi-da-xl/xxl models.
* pymarian-evaluate now correctly computes scores.
* The evaluator now exposes an interface function to read the model config.
…or Ampere and Turing

Ubuntu CI: ON for Maxwell, Pascal and Volta; OFF for Ampere and Turing
* to fix the disk-space issue on CI VMs
…dels

This PR implements a bunch of missing functionality in the new layer framework. Among others:
* Autoregressive self-attention
* Guided alignment training
* Decode-time alignment

Minor refactoring of previous code to accommodate the above changes. When setting `export TRANSFORMER_FLAVOR=experimental`, all legacy transformer models are internally mapped to the new layer framework. With that enabled, production regression tests all pass. Passes all public regression tests with the exception of:
- tests/factors/test_factors_concat.sh
- tests/factors/test_factors_decoder_concat.sh
- tests/models/wnmt18/test_student_small_aan.sh
- tests/models/wnmt18/test_student_small_aan_intgemm16.sh
- tests/models/wnmt18/test_student_small_aan_intgemm8.sh

and
- tests/interface/input-tsv/test_tsv_train_with_align_and_weights.sh
- tests/interface/input-tsv/test_tsv_train_with_align_and_weights_inputtypes.sh

I could get the first group to work, but it doesn't seem to be worth it; I plan to remove both code paths in the future. The last two are, I think, just divergences due to mild model differences and probably don't need fixing, rather future adaptation.
This PR adds `--input-reorder`, which allows swapping the indices of batch subfields. Currently, this is used for comet-kiwi-style models to accommodate that the mt output comes first and not the source.
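The effect of the option can be sketched as a simple permutation of per-sentence fields (the helper name below is hypothetical, purely for illustration):

```python
def reorder_fields(batch_fields, order):
    # Permute the fields of each batch entry, e.g. order=[1, 0] swaps
    # src and mt so comet-kiwi-style models see the mt output first.
    return [[fields[i] for i in order] for fields in batch_fields]
```

For example, `reorder_fields([["src", "mt"]], [1, 0])` yields `[["mt", "src"]]`.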
It seems there was a shape mismatch for force-decoding with beams larger than 1. This PR fixes the problem.
List of changes/updates/fixes to pymarian:
* Rename model IDs to match Hugging Face (e.g., comet22-da -> wmt22-comet-da).
* Rename the CLI to make it shorter: pymarian-evaluate -> pymarian-eval.
* Rename pymarian.evaluate.py -> pymarian.eval.py to reflect the CLI.
* The functional code from pymarian.eval.py is moved to an Evaluator class (goal: allow reuse of an Evaluator object for scoring many small files, as in the WMT metrics task).
* Use mmapped *.bin files instead of *.npz.
* Download *.bin and *.spm individually instead of a .tgz. Future plan: support quantized/gemm models. Downloading a .tgz is okay, but it would get too expensive since we don't need all variants of a model (.npz, .bin, fp32, fp16, avx512, ...).
* Use a file-locking mechanism (based on `portalocker`) to avoid race conditions between parallel download processes.
* Added an optional `-v/--vocab` argument to pymarian-eval.
* Added a `--fields|-f` argument: supports `src mt ref` or a subsequence of it. Raises an error when missing fields are detected; ignores extra fields.
* pymarian build improvements: strict on Python version match between the package and the native extension. Also removes the custom logic for extension detection and instead uses EXT_SUFFIX from sysconfig.
* Added a `--like` argument for local models.
* Ran black and isort to fix code-formatting issues.
* pypdl: parallel download.
* Regression tests for pymarian.

Other scripts:
* Added `convert-all-models.sh`: converts PyTorch to Marian .npz, converts .npz to .bin, and creates a directory structure compatible with pymarian-eval.
* Added `compare.sh` to compare metrics between the original implementation and pymarian.
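The download-lock idea can be sketched with the standard library alone (the real code uses `portalocker`; this stand-in uses an atomic lock file instead, and the function is hypothetical): `O_CREAT|O_EXCL` creation is atomic, so only one of several parallel downloader processes acquires the lock and performs the download, while the others wait for or reuse the cached file.

```python
import os
import tempfile

def try_lock(lock_path):
    # Atomically create the lock file; exactly one caller succeeds even
    # when several processes race to download the same model.
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except FileExistsError:
        return False

lock = os.path.join(tempfile.mkdtemp(), "model.lock")
first = try_lock(lock)   # winner: downloads the model
second = try_lock(lock)  # loser: waits / reuses the cached file
```

A library like `portalocker` additionally handles releasing the lock and cross-platform advisory locking, which this minimal sketch does not.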
This mostly adds @<Varun Mathur>'s changes from public master to internal. I did an automatic merge and need to go through those changes myself. I think there is an issue in translator.h which I will fix. @<Varun Mathur> can you check if things work for you here?
Support force-decoding in the pymarian Translator API
CUDA seems to have deprecated a whole bunch of its interface, and it seems to interact weirdly with some GCC versions. Disabling warnings for this header via a dummy include.
This PR adds a simple `--no-optimizer-reload` flag that allows skipping the restoration of optimizer state during continued training or divergence fallback.
This PR includes various fixes to the force decoding code to make the LSH and beam search work.
Abort or throw an exception if we try force-decoding with a factored Vocab.
…ed decoding

* Fixes regressions in the new layer framework for ALIBI-based decoding
* Do not mmap files for conversion
…e tcmalloc; huggingface backend for gated COMETs

pymarian upgrades:
* Support building for multiple Python versions at once; borrowed a CMake script from AMD.
* Use "build" instead of "pip wheel"; build is more stable and leaves less junk on the file system.
* Disable tcmalloc for pymarian.
* Added support for the [huggingface backend](https://huggingface.co/collections/Unbabel/marian-comet-metrics-and-qe-664e28c82743db6709d022fc). Currently enabled for gated COMET models only.
* Added a `--cache` argument to the pymarian-eval CLI; useful for accessing a cache from a blob-storage mount path for gated models.
snukky approved these changes on Aug 7, 2024
The workflow fails for the current master as well, so this is unlikely to be related to the sync. I think we can merge. Let's make sure it's not a squash merge, to preserve the commit history.
Merges with the MS-internal master branch. Full sync.