
GGUF: ggml backend support for writing tensor data #1033

Open — JohannesGaessler wants to merge 2 commits into master

Conversation

JohannesGaessler
Collaborator

This PR adds ggml backend support for writing tensor data to a GGUF file. Currently a workaround is needed where the data is first copied to new tensors with data in RAM, which the GGUF code can then access via memcpy. This PR makes it so that instead a fake tensor is reconstructed from gguf_tensor_info, which can then be passed to ggml_backend_tensor_get. I'm not sure whether this is the best solution; a lot of the fields in gguf_tensor_info are the same as in ggml_tensor. Is there a reason why you couldn't just directly store a ggml_tensor as one of the fields in gguf_tensor_info?

@slaren
Collaborator

slaren commented Dec 1, 2024

It should be ok to store the tensor in gguf_tensor_info, but I think it would require a refactor to avoid duplicating the data since the gguf loader also uses this struct to load the tensor info.

@JohannesGaessler
Copy link
Collaborator Author

I did a refactor to store a ggml_tensor instead of effectively mirrored fields. It seems to work correctly for MNIST, but I think I'll open a PR in the llama.cpp repository to ensure that it works there as well (there are also some slight API changes that I would suggest). While I'm at it I'll also tackle #1038.

Comment on lines +6397 to +6433
/* if (info->n_dims > GGML_MAX_DIMS) { */
/* fprintf(stderr, "%s: invalid number of dimensions (%" PRIu32 ")\n", __func__, info->n_dims); */
/* return false; */
/* } */

/* if (info->type < 0 || info->type >= GGML_TYPE_COUNT) { */
/* fprintf(stderr, "%s: invalid type (%d)\n", __func__, info->type); */
/* return false; */
/* } */

/* if (strlen(info->name.data) >= GGML_MAX_NAME) { */
/* fprintf(stderr, "%s: tensor '%s' name is too long\n", __func__, info->name.data); */
/* return false; */
/* } */

/* for (uint32_t i = 0; i < info->n_dims; ++i) { */
/* if (info->ne[i] <= 0) { */
/* fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[i]); */
/* return false; */
/* } */
/* } */

/* // prevent overflow for total number of elements */
/* if (INT64_MAX/info->ne[1] <= info->ne[0]) { */
/* fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[1]); */
/* return false; */
/* } */

/* if (INT64_MAX/info->ne[2] <= info->ne[0]*info->ne[1]) { */
/* fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[2]); */
/* return false; */
/* } */

/* if (INT64_MAX/info->ne[3] <= info->ne[0]*info->ne[1]*info->ne[2]) { */
/* fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[3]); */
/* return false; */
/* } */
Owner


Why are these checks commented?

Collaborator Author


This was just something I did for a WIP version. I have a version with more changes and the checks re-enabled on my local machine. I'll make a PR to llama.cpp either today or tomorrow.

3 participants