Add ?_extra= mechanism for requesting extra properties in JSON #262

simonw · 2018-05-16T14:55:42Z

Datasette views currently work by creating a set of data that should be returned as JSON, then defining an additional, optional template_data() function which is called if the view is being rendered as HTML.

This template_data() function calculates extra template context variables which are necessary for the HTML view but should not be included in the JSON.

Example of how that is used today:

datasette/datasette/views/table.py

Lines 672 to 704 in 2b79f2b

    
           async def template_data(): 
        
               display_columns, display_rows = await self.display_columns_and_rows( 
        
                   name, 
        
                   table, 
        
                   description, 
        
                   rows, 
        
                   link_column=False, 
        
                   expand_foreign_keys=True, 
        
               ) 
        
               for column in display_columns: 
        
                   column["sortable"] = False 
        
               return { 
        
                   "database_hash": hash, 
        
                   "foreign_key_tables": await self.foreign_key_tables( 
        
                       name, table, pk_values 
        
                   ), 
        
                   "display_columns": display_columns, 
        
                   "display_rows": display_rows, 
        
                   "custom_rows_and_columns_templates": [ 
        
                       "_rows_and_columns-{}-{}.html".format( 
        
                           to_css_class(name), to_css_class(table) 
        
                       ), 
        
                       "_rows_and_columns-row-{}-{}.html".format( 
        
                           to_css_class(name), to_css_class(table) 
        
                       ), 
        
                       "_rows_and_columns.html", 
        
                   ], 
        
                   "metadata": self.ds.metadata.get("databases", {}).get(name, {}).get( 
        
                       "tables", {} 
        
                   ).get( 
        
                       table, {} 
        
                   ), 
        
               }

With features like Facets in #255 I'm beginning to want to move more items into the template_data() - in the case of facets it's the suggested_facets array. This saves that feature from being calculated (involving several SQL queries) for the JSON case where it is unlikely to be used.

But... as an API user, I want to still optionally be able to access that information.

Solution: Add a ?_extra=suggested_facets&_extra=table_metadata argument which can be used to optionally request additional blocks to be added to the JSON API.

Then redefine as many of the current template_data() features as extra arguments instead, and teach Datasette to return certain extras by default when rendering templates.

This could allow the JSON representation to be slimmed down further (removing e.g. the table_definition and view_definition keys) while still making that information available to API users who need it.

The text was updated successfully, but these errors were encountered:

simonw · 2018-05-17T00:00:39Z

Idea: ?_extra=sqllog could output a lot of every individual SQL statement that was executed in order to generate the page - useful for seeing how foreign key expansion and faceting actually works.

simonw · 2020-09-12T18:19:50Z

Idea: ?_extra=sqllog could output a lot of every individual SQL statement that was executed in order to generate the page - useful for seeing how foreign key expansion and faceting actually works.

I built a version of that a while ago as the ?_trace=1 argument.

simonw · 2020-09-12T18:22:44Z

Are there any interesting use-cases for a plugin hook that allows plugins to define their own ?_extra= blocks?

simonw · 2020-10-20T16:32:02Z

Just realized I added an undocumented ?_extras= option to the row view years ago and forgot about it - it's not even documented. Added in a30c5b2 - https://latest.datasette.io/fixtures/attraction_characteristic/2.json?_extras=foreign_key_tables

That will need to be made consistent with the new mechanism. I think ?_extra=a&_extra=b is more consistent with other Datasette features (like ?_facet=col1&_facet=col2) but potentially quite verbose.

So I could support ?_extra=a,b,c as an alternative allowed syntax, or I could allow ?_extra=single and ?_extras=comma,separated.

I think I prefer allowing commas in ?_extra=.

simonw · 2020-10-20T22:13:01Z

In the documentation for ?_extra= I think I'll emphasize the comma-separated version of it. Also: there will be ?_extra= values which act as aliases for collection combinations - e.g. ?_extra=full will toggle everything.

simonw · 2020-10-20T22:14:37Z

I think it's worth having a plugin hook for this - it can be same hook that is used internally. Maybe register_extra - it lets you return one or more extra implementations, each with a name and an async function that gets called.

Things like suggested facets will become register_extra hooks. Maybe actual facets too?

simonw · 2020-10-20T23:41:30Z

This is now blocking simonw/datasette-graphql#61 because that issue needs a way to turn off suggested facets when retrieving the results of a table query.

simonw · 2020-10-21T00:03:18Z

I think I should prioritize the facets component of this, since that could have significant performance wins while also supporting datasette-graphql.

simonw · 2021-12-15T18:03:46Z

This is relevant to the big refactor in:

Complete refactor of TableView and table.html template #1518

simonw · 2023-01-20T05:33:00Z

I'm going to write code which parses ?_extra= in the comma separated or multiple parameter format and then looks up functions in a dictionary.

It will return an error if you ask for an extra that does not exist.

simonw · 2023-01-21T01:56:52Z

Got first prototype working using asyncinject and it's pretty nice:

diff --git a/datasette/views/table.py b/datasette/views/table.py
index ad45ecd3..c8690b22 100644
--- a/datasette/views/table.py
+++ b/datasette/views/table.py
@@ -2,6 +2,7 @@ import asyncio
 import itertools
 import json
 
+from asyncinject import Registry
 import markupsafe
 
 from datasette.plugins import pm
@@ -538,57 +539,60 @@ class TableView(DataView):
         # Execute the main query!
         results = await db.execute(sql, params, truncate=True, **extra_args)
 
-        # Calculate the total count for this query
-        count = None
-        if (
-            not db.is_mutable
-            and self.ds.inspect_data
-            and count_sql == f"select count(*) from {table_name} "
-        ):
-            # We can use a previously cached table row count
-            try:
-                count = self.ds.inspect_data[database_name]["tables"][table_name][
-                    "count"
-                ]
-            except KeyError:
-                pass
-
-        # Otherwise run a select count(*) ...
-        if count_sql and count is None and not nocount:
-            try:
-                count_rows = list(await db.execute(count_sql, from_sql_params))
-                count = count_rows[0][0]
-            except QueryInterrupted:
-                pass
-
-        # Faceting
-        if not self.ds.setting("allow_facet") and any(
-            arg.startswith("_facet") for arg in request.args
-        ):
-            raise BadRequest("_facet= is not allowed")
+        # Resolve extras
+        extras = _get_extras(request)
+        if request.args.getlist("_facet"):
+            extras.add("facet_results")
 
-        # pylint: disable=no-member
-        facet_classes = list(
-            itertools.chain.from_iterable(pm.hook.register_facet_classes())
-        )
-        facet_results = {}
-        facets_timed_out = []
-        facet_instances = []
-        for klass in facet_classes:
-            facet_instances.append(
-                klass(
-                    self.ds,
-                    request,
-                    database_name,
-                    sql=sql_no_order_no_limit,
-                    params=params,
-                    table=table_name,
-                    metadata=table_metadata,
-                    row_count=count,
-                )
+        async def extra_count():
+            # Calculate the total count for this query
+            count = None
+            if (
+                not db.is_mutable
+                and self.ds.inspect_data
+                and count_sql == f"select count(*) from {table_name} "
+            ):
+                # We can use a previously cached table row count
+                try:
+                    count = self.ds.inspect_data[database_name]["tables"][table_name][
+                        "count"
+                    ]
+                except KeyError:
+                    pass
+
+            # Otherwise run a select count(*) ...
+            if count_sql and count is None and not nocount:
+                try:
+                    count_rows = list(await db.execute(count_sql, from_sql_params))
+                    count = count_rows[0][0]
+                except QueryInterrupted:
+                    pass
+            return count
+
+        async def facet_instances(extra_count):
+            facet_instances = []
+            facet_classes = list(
+                itertools.chain.from_iterable(pm.hook.register_facet_classes())
             )
+            for facet_class in facet_classes:
+                facet_instances.append(
+                    facet_class(
+                        self.ds,
+                        request,
+                        database_name,
+                        sql=sql_no_order_no_limit,
+                        params=params,
+                        table=table_name,
+                        metadata=table_metadata,
+                        row_count=extra_count,
+                    )
+                )
+            return facet_instances
+
+        async def extra_facet_results(facet_instances):
+            facet_results = {}
+            facets_timed_out = []
 
-        async def execute_facets():
             if not nofacet:
                 # Run them in parallel
                 facet_awaitables = [facet.facet_results() for facet in facet_instances]
@@ -607,9 +611,13 @@ class TableView(DataView):
                         facet_results[key] = facet_info
                     facets_timed_out.extend(instance_facets_timed_out)
 
-        suggested_facets = []
+            return {
+                "results": facet_results,
+                "timed_out": facets_timed_out,
+            }
 
-        async def execute_suggested_facets():
+        async def extra_suggested_facets(facet_instances):
+            suggested_facets = []
             # Calculate suggested facets
             if (
                 self.ds.setting("suggest_facets")
@@ -624,8 +632,15 @@ class TableView(DataView):
                 ]
                 for suggest_result in await gather(*facet_suggest_awaitables):
                     suggested_facets.extend(suggest_result)
+            return suggested_facets
+
+        # Faceting
+        if not self.ds.setting("allow_facet") and any(
+            arg.startswith("_facet") for arg in request.args
+        ):
+            raise BadRequest("_facet= is not allowed")
 
-        await gather(execute_facets(), execute_suggested_facets())
+        # pylint: disable=no-member
 
         # Figure out columns and rows for the query
         columns = [r[0] for r in results.description]
@@ -732,17 +747,56 @@ class TableView(DataView):
             rows = rows[:page_size]
 
         # human_description_en combines filters AND search, if provided
-        human_description_en = filters.human_description_en(
-            extra=extra_human_descriptions
-        )
+        async def extra_human_description_en():
+            human_description_en = filters.human_description_en(
+                extra=extra_human_descriptions
+            )
+            if sort or sort_desc:
+                human_description_en = " ".join(
+                    [b for b in [human_description_en, sorted_by] if b]
+                )
+            return human_description_en
 
         if sort or sort_desc:
             sorted_by = "sorted by {}{}".format(
                 (sort or sort_desc), " descending" if sort_desc else ""
             )
-            human_description_en = " ".join(
-                [b for b in [human_description_en, sorted_by] if b]
-            )
+
+        async def extra_next_url():
+            return next_url
+
+        async def extra_columns():
+            return columns
+
+        async def extra_primary_keys():
+            return pks
+
+        registry = Registry(
+            extra_count,
+            extra_facet_results,
+            extra_suggested_facets,
+            facet_instances,
+            extra_human_description_en,
+            extra_next_url,
+            extra_columns,
+            extra_primary_keys,
+        )
+
+        results = await registry.resolve_multi(
+            ["extra_{}".format(extra) for extra in extras]
+        )
+        data = {
+            "ok": True,
+            "rows": rows[:page_size],
+            "next": next_value and str(next_value) or None,
+        }
+        data.update({
+            key.replace("extra_", ""): value
+            for key, value in results.items()
+            if key.startswith("extra_")
+            and key.replace("extra_", "") in extras
+        })
+        return Response.json(data, default=repr)
 
         async def extra_template():
             nonlocal sort
@@ -1334,3 +1388,11 @@ class TableDropView(BaseView):
 
         await db.execute_write_fn(drop_table)
         return Response.json({"ok": True}, status=200)
+
+
+def _get_extras(request):
+    extra_bits = request.args.getlist("_extra")
+    extras = set()
+    for bit in extra_bits:
+        extras.update(bit.split(","))
+    return extras

With that in place, http://127.0.0.1:8001/content/releases?author=25778&_size=1&_extra=count,primary_keys,columns&_facet=author returns:

{
  "ok": true,
  "rows": [
    {
      "html_url": "https://github.com/eyeseast/geocode-sqlite/releases/tag/0.1.2",
      "id": 30926270,
      "author": {
        "value": 25778,
        "label": "eyeseast"
      },
      "node_id": "MDc6UmVsZWFzZTMwOTI2Mjcw",
      "tag_name": "0.1.2",
      "target_commitish": "master",
      "name": "v0.1.2",
      "draft": 0,
      "prerelease": 1,
      "created_at": "2020-09-08T17:48:24Z",
      "published_at": "2020-09-08T17:50:15Z",
      "body": "Basic API is in place, with CLI support for Google, Bing, MapQuest and Nominatum (OSM) geocoders.",
      "repo": {
        "value": 293361514,
        "label": "geocode-sqlite"
      },
      "reactions": null,
      "mentions_count": null
    }
  ],
  "next": "30926270",
  "primary_keys": [
    "id"
  ],
  "columns": [
    "html_url",
    "id",
    "author",
    "node_id",
    "tag_name",
    "target_commitish",
    "name",
    "draft",
    "prerelease",
    "created_at",
    "published_at",
    "body",
    "repo",
    "reactions",
    "mentions_count"
  ],
  "count": 25,
  "facet_results": {
    "results": {
      "author": {
        "name": "author",
        "type": "column",
        "hideable": true,
        "toggle_url": "/content/releases?author=25778&_size=1&_extra=count%2Cprimary_keys%2Ccolumns",
        "results": [
          {
            "value": 25778,
            "label": "eyeseast",
            "count": 25,
            "toggle_url": "http://127.0.0.1:8001/content/releases?_size=1&_extra=count%2Cprimary_keys%2Ccolumns&_facet=author",
            "selected": true
          }
        ],
        "truncated": false
      }
    },
    "timed_out": []
  }
}

simonw · 2023-01-21T04:53:15Z

Implementing this to work with the .json extension is going to be a lot harder.

The challenge here is that we're working with the whole BaseView() v.s. TableView() abstraction, which I've been wanting to get rid of for a long time.

BaseView() calls .data() and expects to get back a (data, extra_template_data, templates) tuple - then if a format is in play (.json or .geojson or similar from a plugin) it hands off data to that. If .csv is involved it does something special, in order to support streaming responses. And if it's regular HTML it calls await extra_template_data() and combines that with data and passes it to the template.

I want this to work completely differently: I want the formats (including HTML) to have the option of adding some extra ?_extra= extras, then I want HTML to be able to render the page entirely from the JSON if necessary.

simonw · 2023-01-21T04:54:49Z

I pushed my prototype so far, going to start a draft PR for it.

simonw · 2023-01-21T05:35:32Z

It's annoying that the https://docs.datasette.io/en/0.64.1/plugin_hooks.html#register-output-renderer-datasette plugin hook passes rows as "list of sqlite3.Row objects" - I'd prefer it if that plugin hook worked with JSON data, not sqlite3.Row.

https://docs.datasette.io/en/0.64.1/plugin_hooks.html#render-cell-row-value-column-table-database-datasette is documented as accepting Row but actually gets CustomRow, see:

render_cell plugin hook's row object is not a sqlite.Row #1973

simonw · 2023-01-21T05:36:22Z

Maybe "rows" should be a default ?_extra=... but it should be possible to request "arrays" instead which would be a list of arrays, more suitable perhaps for custom renderers such as the CSV one.

This could be quite neat, in that EVERY key in the JSON representation would be defined as an extra - just some would be on by default. There could even be a mechanism for turning them back off again, maybe using ?_extra=-rows.

In which case maybe ?_extra= isn't actually the right name for this feature. It could be ?_key= perhaps, or ?_field=.

Being able to pass ?_field=count,-rows to get back just the count (and skip executing the count entirely) would be pretty neat.

Although ?_only=count would be tidier. So maybe the pair of ?_only= and ?_extra= would make sense.

Would ?_only=rows still return the "ok" field so you can always look at that to confirm an error didn't occur?

simonw · 2023-01-25T21:35:32Z

This issue here would benefit from some kid of mechanism for returning just the HTML of the table itself, without any of the surrounding material. I'm not sure if that would make sense as an extra or not:

Reuse existing table cell rendering logic, so render_cell() hook works datasette-search-all#17

simonw · 2023-02-05T22:57:58Z

I think that does make sense: ?_extra=table perhaps, which would add {"table": "..."}.

simonw · 2023-02-08T18:33:32Z

Just realized that it's useful to be able to tell what parameters were used to generate a page... but reflecting things like _next back in the JSON is confusing in the presence of next.

So I'm going to add an extra for that information too.

Not sure what to call it though:

params - confusing because in the code that's usually used for params passed to SQL queries
query_string - wouldn't that be a string, not params as a dictionary?

I'm going to experiment with a request extra that returns some bits of information about the request.

* Implemented ?_extra= option for JSON views, refs #262 * New dependency: asyncinject * Remove now-obsolete TableView class

simonw · 2023-03-22T22:50:30Z

I just landed this PR so this feature is now in main:

?_extra= support (draft) #1999

Still needs documentation and maybe some extra tests too.

simonw · 2023-03-29T06:22:21Z

I need to get the arbitrary query page to return the same format. It likely won't have nearly as many extras.

Closes #2135 Refs #262, #782, #1153, #1970, #2007, #2079, #2106, #2127, #2130

Also refs #262 - started a test suite for extras.

simonw mentioned this issue May 16, 2018

Facets should not execute for ?shape=array|object #263

Closed

simonw added the feature label Jun 21, 2018

simonw added this to the Next release milestone Jul 24, 2018

simonw removed this from the Next release milestone May 13, 2019

simonw changed the title ~~Add ?_extras= mechanism for requesting extra properties in JSON~~ Add ?_extra= mechanism for requesting extra properties in JSON Sep 12, 2020

simonw mentioned this issue Sep 12, 2020

Redesign default .json format #782

Closed

simonw added this to the Datasette 1.0 milestone Oct 20, 2020

simonw added plugins large refactor labels Oct 20, 2020

simonw mentioned this issue Oct 20, 2020

Switch to using datasette.client simonw/datasette-graphql#61

Closed

simonw modified the milestones: Datasette 1.0, Datasette 0.52 Nov 1, 2020

simonw mentioned this issue Nov 12, 2020

Idea: ?_extra=urls for getting back URLs to useful things #1087

Open

simonw modified the milestones: Datasette 0.52, Datasette Next Nov 28, 2020

simonw added the json-extras label Dec 15, 2021

simonw removed this from the Datasette Next milestone Jan 13, 2022

simonw added this to the Datasette 1.0 milestone Mar 15, 2022

simonw added the json-api label Mar 15, 2022

simonw mentioned this issue Apr 11, 2022

Redesigned JSON API with ?_extra= parameters #1709

Open

simonw pinned this issue Jan 19, 2023

simonw mentioned this issue Jan 20, 2023

Handle non-async functions too simonw/asyncinject#15

Closed

simonw added a commit that referenced this issue Jan 21, 2023

Initial ?_extra= prototype, refs #262

853edc8

simonw mentioned this issue Jan 21, 2023

?_extra= support (draft) #1999

Merged

simonw mentioned this issue Jan 25, 2023

Reuse existing table cell rendering logic, so render_cell() hook works simonw/datasette-search-all#17

Open

simonw added a commit that referenced this issue Feb 8, 2023

?_extra=request and ?_extra=extras, refs #262

732285c

simonw added a commit that referenced this issue Feb 8, 2023

Improved ?_extra=extras to show selected and available, refs #262

21568f9

simonw added a commit that referenced this issue Mar 22, 2023

?_extra= support and TableView refactor to table_view

d97e82d

* Implemented ?_extra= option for JSON views, refs #262 * New dependency: asyncinject * Remove now-obsolete TableView class

simonw mentioned this issue Mar 22, 2023

Gather feedback on new ?_extra= design #2042

Open

simonw added a commit that referenced this issue Mar 29, 2023

Redesign ?_extra=extras a bit, refs #262

651b78d

simonw added a commit that referenced this issue Mar 29, 2023

Updated JSON API shape documentation, refs #262

4c1e277

simonw mentioned this issue Mar 30, 2023

Custom SQL queries should use new JSON ?_extra= format #2049

Open

simonw mentioned this issue Apr 6, 2023

Make detailed notes on how table, query and row views work right now #2054

Open

simonw added a commit that referenced this issue Aug 9, 2023

Release 1.0a3

19ab455

Closes #2135 Refs #262, #782, #1153, #1970, #2007, #2079, #2106, #2127, #2130

simonw added a commit that referenced this issue Jan 8, 2024

Fix for ?_extra=columns bug, closes #2230

0b2c6a7

Also refs #262 - started a test suite for extras.

simonw added a commit that referenced this issue Jan 8, 2024

Test for ?_extra=count, refs #262

2ff4d4a

simonw mentioned this issue Apr 25, 2024

Datasette 1.0 rc tracking issue #2333

Open

8 tasks

simonw modified the milestones: Datasette 1.0, Datasette 1.0rc Apr 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ?_extra= mechanism for requesting extra properties in JSON #262

Add ?_extra= mechanism for requesting extra properties in JSON #262

simonw commented May 16, 2018 •

edited

Loading

simonw commented May 17, 2018 •

edited

Loading

simonw commented Sep 12, 2020

simonw commented Sep 12, 2020

simonw commented Oct 20, 2020 •

edited

Loading

simonw commented Oct 20, 2020

simonw commented Oct 20, 2020

simonw commented Oct 20, 2020

simonw commented Oct 21, 2020

simonw commented Dec 15, 2021 •

edited

Loading

simonw commented Jan 20, 2023

simonw commented Jan 21, 2023

simonw commented Jan 21, 2023

simonw commented Jan 21, 2023

simonw commented Jan 21, 2023

simonw commented Jan 21, 2023 •

edited

Loading

simonw commented Jan 25, 2023

simonw commented Feb 5, 2023 •

edited

Loading

simonw commented Feb 8, 2023 •

edited

Loading

simonw commented Mar 22, 2023

simonw commented Mar 29, 2023

Add ?_extra= mechanism for requesting extra properties in JSON #262

Add ?_extra= mechanism for requesting extra properties in JSON #262

Comments

simonw commented May 16, 2018 • edited Loading

simonw commented May 17, 2018 • edited Loading

simonw commented Sep 12, 2020

simonw commented Sep 12, 2020

simonw commented Oct 20, 2020 • edited Loading

simonw commented Oct 20, 2020

simonw commented Oct 20, 2020

simonw commented Oct 20, 2020

simonw commented Oct 21, 2020

simonw commented Dec 15, 2021 • edited Loading

simonw commented Jan 20, 2023

simonw commented Jan 21, 2023

simonw commented Jan 21, 2023

simonw commented Jan 21, 2023

simonw commented Jan 21, 2023

simonw commented Jan 21, 2023 • edited Loading

simonw commented Jan 25, 2023

simonw commented Feb 5, 2023 • edited Loading

simonw commented Feb 8, 2023 • edited Loading

simonw commented Mar 22, 2023

simonw commented Mar 29, 2023

simonw commented May 16, 2018 •

edited

Loading

simonw commented May 17, 2018 •

edited

Loading

simonw commented Oct 20, 2020 •

edited

Loading

simonw commented Dec 15, 2021 •

edited

Loading

simonw commented Jan 21, 2023 •

edited

Loading

simonw commented Feb 5, 2023 •

edited

Loading

simonw commented Feb 8, 2023 •

edited

Loading