Add blog post demonstrating the usage of `GT.fmt_image()` and `vals.fmt_image()` #558

jrycw · 2024-12-12T11:12:12Z

Hello team,

I’d like to propose a blog post to help users better understand how to render images using the recent improvements to GT.fmt_image() and vals.fmt_image(). Feedback is always welcome!

In addition, I’ve noticed that the table alignment behaves differently depending on whether the file_pattern= argument is used. Below is a minimal example to demonstrate this behavior:

import pandas as pd
from great_tables import GT

data = {
    "name": [
        "Argentine",
        "Bastille",
        "Bérault",
        "Champs-Élysées—Clemenceau",
        "Charles de Gaulle—Étoile",
    ],
    "lines": ["1", "1, 5, 8", "1", "1, 13", "1, 2, 6"],
}

img_url_paths = "https://raw.githubusercontent.com/posit-dev/great-tables/refs/heads/main/great_tables/data/metro_images"

metro_mini = pd.DataFrame(
    {
        **data,
        "filenames": [
            ", ".join(f"metro_{item}.svg" for item in row.split(", ")) for row in data["lines"]
        ],
    }
)

print(metro_mini)

(
    GT(metro_mini)
    .fmt_image("filenames", path=img_url_paths)
    .fmt_image("lines", path=img_url_paths, file_pattern="metro_{}.svg")
    .cols_label(
        {
            "lines": "right-aligned(w file_pattern)",
            "filenames": "left-aligned(w/o file_pattern)",
        }
    )
    .show()
)

                        name    lines                              filenames
0                  Argentine        1                            metro_1.svg
1                   Bastille  1, 5, 8  metro_1.svg, metro_5.svg, metro_8.svg
2                    Bérault        1                            metro_1.svg
3  Champs-Élysées—Clemenceau    1, 13              metro_1.svg, metro_13.svg
4   Charles de Gaulle—Étoile  1, 2, 6  metro_1.svg, metro_2.svg, metro_6.svg

I’ve attempted to investigate the code but couldn’t pinpoint the root cause. Could the team clarify if this is an intended feature or a potential bug?

…mt_image()`

codecov · 2024-12-12T11:13:39Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.67%. Comparing base (ceef5c5) to head (08acafa).
Report is 4 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #558   +/-   ##
=======================================
  Coverage   90.66%   90.67%           
=======================================
  Files          46       46           
  Lines        5367     5381   +14     
=======================================
+ Hits         4866     4879   +13     
- Misses        501      502    +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

rich-iannone · 2024-12-12T14:55:38Z

Thanks for submitting this post! We'll read it over shortly. To address the question:

I’ve attempted to investigate the code but couldn’t pinpoint the root cause. Could the team clarify if this is an intended feature or a potential bug?

It's a feature (the auto_align= argument in GT()) that has some unintended consequences. Auto-align will align based on dtypes and also whether strings look like numbers/datetimes (contain the character set [0-9 -/:\\.]). This is all in the align_from_data() function:

great-tables/great_tables/_gt_data.py

Line 357 in b32b55a

number_like_matches.append(re.match("^[0-9 -/:\\.]*$", val))

Most of the time, this works well. In the case of your example, those metro line numbers in lines column just happen to have all cells matching the character set. If you just execute GT(metro_mini) you'll see the table aligns like this:

And using a letter in place of any of the numbers will result in left alignment of the lines column. That alignment is kept when replacing the text with images via fmt_image().

I think that there should probably be an align= argument for fmt_image() (defaulting to left) as a corrective and useful option. I can see inline images being aligned as left, center, and right as common use cases and having to use cols_align() separately seems a little inconvenient here.

jrycw · 2024-12-12T15:07:57Z

@rich-iannone , thank you for the prompt response and for pointing me to the relevant source code.
This is an interesting behavior that I wasn’t aware of—definitely learned something new about Great Tables today!

machow

Thanks for this PR! This post is incredible, and reads really nicely! The 4 body case studies seem great in terms of prep and example data.

There are two pieces in those case studies that I think could be tweaked a bit to motivate the "why" of the cases. This is important IMO because it tells them what they'll get out of each case study (especially 3 and 4, which are convenience arguments).

Here are the two pieces:

The summary bullets at the beginning of the section (e.g. add text after to explain "why"; see comment)
The explanations just before the DataFrame code in each case study.
- (I put an example in my case 1 comment)
- Since the code preparing tables is folded (a nice move!), you could remove the paragraphs above any folded code explaining what it does.
- Instead of the code explanations, a sentence prep'ing people on the table output might help orient them. What should they notice in each case study table?

I glanced through the other sections and they look so great!

machow · 2024-12-12T15:11:23Z

docs/blog/rendering-images/index.qmd

+* **Case 1**: Local file paths.
+* **Case 2**: Full HTTP/HTTPS URLs.
+* **Case 3**: File names with the `path=` argument.
+* **Case 4**: File names using both the `path=` and `file_pattern=` arguments.


This summary is really helpful up top! It says the "what" of each case (e.g. case 3 is using the path argument). The piece that took me a bit of scanning to find is the "why" of each case. I wonder if we could give a very short statement either in the bullets or below to help readers understand why they might choose a given case.

It seems like case 1 and 2 are fairly obvious. But afaik for others it looks like the whys might be...

case 3: specify image names relative to a base directory or url (e.g. "/path/to/images")

case 4: specify image names inside a path (e.g. "metro_{}.svg")

I think you could get away with adding a sentence or two beneath the list explaining the whys of case 3 and 4 (or there might be some other nice way of including?)

Perhaps a picture is worth a thousand words? Do you think the mermaid flowchart is clear enough for readers?

I believe we still need to add a few sentences to explain the "why," with the flowchart serving as an additional aid. I'll continue working on this tomorrow, as it's quite late here.

How about using a callout to provide a quick tip for readers?

This little call out box seems great!

machow · 2024-12-12T15:15:05Z

docs/blog/rendering-images/index.qmd

+Our goal is to construct a Python dictionary to serve as input for creating a `Pandas` DataFrame.
+This dictionary will include the previously defined `data` and a new key, `"case1"`. The value for
+`"case1"` is generated by processing the `data["lines"]` column, which contains a list of strings.
+
+For each string in `data["lines"]`, we split it by commas to extract individual numbers. These
+numbers are then processed by adding the necessary prefix and suffix to generate valid file paths.
+Once processed, the paths are joined back into a single string using `", ".join()`. The resulting
+strings from all iterations are collected into a list, which becomes the value for `"case1"`:


These paragraphs seem to explain the code below them. Since that code is folded, I think you could get away with removing these, and instead provide a sentence preparing people for the table output.

(If it seems useful, you could put some of this explanation in the folded code block)

An example for explaining the table might be

Below is a DataFrame, whose case1 column has full file paths that we want to turn into images.

machow · 2024-12-12T15:26:40Z

docs/blog/rendering-images/index.qmd

+
+
+### Case 3: File Names with the `path=` Argument
+**Case 3** demonstrates how to simulate a column containing file names. The preparation for creating


A sentence on why might help prime readers for what they'll get out of this case. Maybe something like...

Case 3 demonstrates using the path= argument to specify images relative to a base directory. This lets you cut out a lot of repetition in file names. <remove DataFrame code prep explanation text>

machow · 2024-12-12T15:32:11Z

docs/blog/rendering-images/index.qmd

+this works in **Case 4** below.
+
+### Case 4: File Names Using Both the `path=` and `file_pattern=` Arguments
+**Case 4** demonstrates how to simulate a column containing only the variable part of a string pattern.


Similar to case 3, the why of case 4 up top might help people see what they'll get out of this section. something like..?:

Case 4 demonstrates how to use path= and file_pattern= to specify images whose names are inside a common name. For example, you might use file_pattern="metro_{}.svg", to specify "metro_1.svg", "metro_2.svg", etc..

jrycw · 2024-12-12T17:09:01Z

@machow , thank you so much for the excellent suggestions! I’ll work on incorporating them into the post.

@machow and @rich-iannone , I’d appreciate your thoughts on the following questions:

Is it appropriate to set pd.set_option('display.max_colwidth', 150)? I aim to display at least one full path in the column, and this setting works well for shorter paths. However, I’m concerned the paths might be too long. Do you have any suggestions for a better solution?
Should we ensure consistent alignment across all tables by explicitly using GT.cols_align()?

machow · 2024-12-12T17:46:38Z

Is it appropriate to set pd.set_option('display.max_colwidth', 150)?

It seems okay to use, especially if it lets people see the file paths. It's tough the file names are so long, but also--since that's addressed by case study 3--maybe it's a chance to lean into the problem of the long file names. You could flag in case study 1, "these are filepaths to images inside the great_tables.data module, and notice they're super long. (In case study 3 we'll show you how avoid repeating the directory name every time" (or similar?)

rich-iannone · 2024-12-12T18:13:27Z

Should we ensure consistent alignment across all tables by explicitly using GT.cols_align()?

I think we ought to for these examples.

jrycw · 2024-12-13T10:37:11Z

@machow , thank you again for your excellent suggestions!

I decided to place the "why" section in a callout to provide a quick tip for readers. I also refined the wording for each case, thanks to the helpful "something like" examples you provided.

Additionally, I made the following updates:

Added a section to remind readers that file_pattern can be used independently.
Highlighted that GT.fmt_image() supports rendering multiple images within a single row.
Aligned the image column to the right, which I think looks more aesthetically pleasing. While we technically don’t need to apply GT.cols_align() to every table, relying on the auto-align feature could lead to inconsistencies if we modify the behavior of GT.fmt_image() or the GT object in the future.
Reworded and polished some sentences.

@machow and @rich-iannone , feel free to make any direct updates to this PR if needed.

machow

This blog post is incredible, @rich-iannone do you want to review? (or feel free to merge!)

rich-iannone

LGTM!!

Minor touchups for #558

jrycw added 2 commits December 12, 2024 18:44

Add blog post demonstrating the usage of GT.fmt_image() and `vals.f…

06c9250

…mt_image()`

Update docstring for path parameter in GT.fmt_image()

7892a2c

github-actions bot temporarily deployed to docs-preview-blog December 12, 2024 14:42 Destroyed

machow requested changes Dec 12, 2024

View reviewed changes

Update the blog based on team feedback

08acafa

github-actions bot deployed to docs-preview-blog December 13, 2024 20:18 View deployment

machow self-requested a review December 13, 2024 20:58

machow approved these changes Dec 13, 2024

View reviewed changes

rich-iannone approved these changes Dec 13, 2024

View reviewed changes

rich-iannone merged commit 4043f14 into posit-dev:main Dec 13, 2024
14 of 15 checks passed

jrycw deleted the blog-render-image branch December 14, 2024 01:08

jrycw mentioned this pull request Dec 15, 2024

Minor touchups for #558 #565

Merged

machow added a commit that referenced this pull request Dec 16, 2024

Merge pull request #565 from jrycw/blog-image-followup

eac1fb4

Minor touchups for #558

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add blog post demonstrating the usage of `GT.fmt_image()` and `vals.fmt_image()` #558

Add blog post demonstrating the usage of `GT.fmt_image()` and `vals.fmt_image()` #558

jrycw commented Dec 12, 2024

codecov bot commented Dec 12, 2024 •

edited

Loading

rich-iannone commented Dec 12, 2024

jrycw commented Dec 12, 2024

machow left a comment •

edited

Loading

machow Dec 12, 2024

jrycw Dec 12, 2024

jrycw Dec 12, 2024

jrycw Dec 13, 2024

machow Dec 13, 2024

machow Dec 12, 2024

machow Dec 12, 2024

machow Dec 12, 2024

jrycw commented Dec 12, 2024

machow commented Dec 12, 2024 •

edited

Loading

rich-iannone commented Dec 12, 2024

jrycw commented Dec 13, 2024

machow left a comment

rich-iannone left a comment



		### Case 3: File Names with the `path=` Argument
		Case 3 demonstrates how to simulate a column containing file names. The preparation for creating

Add blog post demonstrating the usage of GT.fmt_image() and vals.fmt_image() #558

Add blog post demonstrating the usage of GT.fmt_image() and vals.fmt_image() #558

Conversation

jrycw commented Dec 12, 2024

codecov bot commented Dec 12, 2024 • edited Loading

Codecov Report

rich-iannone commented Dec 12, 2024

jrycw commented Dec 12, 2024

machow left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jrycw commented Dec 12, 2024

machow commented Dec 12, 2024 • edited Loading

rich-iannone commented Dec 12, 2024

jrycw commented Dec 13, 2024

machow left a comment

Choose a reason for hiding this comment

rich-iannone left a comment

Choose a reason for hiding this comment

Add blog post demonstrating the usage of `GT.fmt_image()` and `vals.fmt_image()` #558

Add blog post demonstrating the usage of `GT.fmt_image()` and `vals.fmt_image()` #558

codecov bot commented Dec 12, 2024 •

edited

Loading

machow left a comment •

edited

Loading

machow commented Dec 12, 2024 •

edited

Loading