GGUF: ggml backend support for writing tensor data #1033
base: master
Conversation
It should be ok to store the tensor in gguf_tensor_info.
I did a refactor to store a ggml_tensor in gguf_tensor_info.
/* if (info->n_dims > GGML_MAX_DIMS) { */
/*     fprintf(stderr, "%s: invalid number of dimensions (%" PRIu32 ")\n", __func__, info->n_dims); */
/*     return false; */
/* } */

/* if (info->type < 0 || info->type >= GGML_TYPE_COUNT) { */
/*     fprintf(stderr, "%s: invalid type (%d)\n", __func__, info->type); */
/*     return false; */
/* } */

/* if (strlen(info->name.data) >= GGML_MAX_NAME) { */
/*     fprintf(stderr, "%s: tensor '%s' name is too long\n", __func__, info->name.data); */
/*     return false; */
/* } */

/* for (uint32_t i = 0; i < info->n_dims; ++i) { */
/*     if (info->ne[i] <= 0) { */
/*         fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[i]); */
/*         return false; */
/*     } */
/* } */

/* // prevent overflow for total number of elements */
/* if (INT64_MAX/info->ne[1] <= info->ne[0]) { */
/*     fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[1]); */
/*     return false; */
/* } */

/* if (INT64_MAX/info->ne[2] <= info->ne[0]*info->ne[1]) { */
/*     fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[2]); */
/*     return false; */
/* } */

/* if (INT64_MAX/info->ne[3] <= info->ne[0]*info->ne[1]*info->ne[2]) { */
/*     fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[3]); */
/*     return false; */
/* } */
Why are these checks commented?
This was just something I did for a WIP version. I have a version with more changes and the checks re-enabled on my local machine. I'll make a PR to llama.cpp either today or tomorrow.
This PR adds ggml backend support for writing tensor data to a GGUF file. Currently a workaround is needed where the data is first copied to new tensors with data in RAM, which the GGUF code can then access via memcpy. This PR makes it so that instead a fake tensor is reconstructed from gguf_tensor_info, which can then be passed to ggml_backend_tensor_get. I'm not sure whether this is the best solution; a lot of the fields in gguf_tensor_info are the same as in ggml_tensor, so is there a reason why you couldn't just directly store a ggml_tensor as one of the fields in gguf_tensor_info?
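For context, a minimal sketch (not the PR's actual code) of what the backend read path looks like: a ggml_tensor describing the data is handed to ggml_backend_tensor_get, which copies the bytes from wherever the backend keeps them (VRAM, host memory, ...) into a host buffer that the GGUF writer can then serialize. The helper name read_tensor_to_host and the malloc-backed buffer are illustrative assumptions.

```c
#include <stdlib.h>

#include "ggml.h"
#include "ggml-backend.h"

// Sketch only: copy a tensor's data from its backend buffer into host memory
// so it can be written out to a GGUF file. The helper name is hypothetical.
static void * read_tensor_to_host(const struct ggml_tensor * tensor) {
    const size_t nbytes = ggml_nbytes(tensor);

    void * buf = malloc(nbytes);
    if (buf == NULL) {
        return NULL;
    }

    // ggml_backend_tensor_get performs the (possibly device -> host) copy,
    // so this works regardless of which backend buffer holds the data.
    ggml_backend_tensor_get(tensor, buf, 0, nbytes);

    return buf;
}
```

The fake-tensor reconstruction described above would presumably build such a ggml_tensor from the fields of gguf_tensor_info before making this call.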