(Historical) Discussion of theme refactoring #4
Comments
I have created the project Theme Refactor to track issues related to this. |
I do not have the bandwidth to dive deep into the logistics of fixing this, but I agree that you're pointing out one of the reasons the site build is both slow and brittle. Fixing this is worth doing; I support this effort. |
I've done a little bit of work that I want to capture; I suspect that using Bundles might not work out the way we are hoping (at least from the perspective of Page Resources, just because of how they work). What does this mean? In my head, I was thinking that what would happen is that, for example, images related to speakers, organizers, etc, would then live in the content directory for the event, rather than static (they can be accessed programmatically which is powerful). But due to how Bundles work, this will end up making the file structure of the content files really complicated, and to be honest, I suspect that at our scale, having to parse those images as Resources during the build would suck anyway. People are used to the
I'm working on this in devopsdays/devopsdays-web#9794; I'll focus more on the page content/frontmatter approach for that spike, rather than trying to get the images as part of it. |
Putting this here so I don't forget - I think it will actually work to put almost everything that is currently in Something like this, in pseudocode:
This presumes we are not using Page Bundles, or at least not Branch Bundles...if we do Leaf Bundles I think this will work. |
Thinking about
I think that we can add an optional frontmatter to Pages ( For sort order, it's harder. We either have to add something like For the latter, it would look something like this:
While I like it happening automatically, this actually solves both issues in one place. If there's a way to make the array more nested, so to speak, that would handle the "navigation needs to be off-site links" use case. For this to work, the frontmatter would be a little more complex, but not terribly hard:
In pseudocode, the navigation would be built like this:
|
Having thought about this some more, I am really wanting to focus on the idea of getting EVERYTHING (event-wise; not sure about sponsor files yet but maybe?) out of data files and into frontmatter. This will make automation a LOT easier ( I think that sponsors, even, should be able to be moved to Pages (possibly a headless bundle?) |
to update the nav conversation (in case anyone is reading this)... Navbar code:
How it looks in frontmatter:
|
Some interesting stuff in here that might be relevant: https://forestry.io/blog/data-relationships-in-hugo/
|
Most of the event-level templates are all working. There's one kind of weird thing to figure out... We have a lot of pages that are spun off to One thing we can do, possibly, is keep data files around for the "archived" pages. There are two places where we need to query information for the older stuff:
For number 1, I managed to handle this in the new event level page; it basically first loads any past events that exist in the data directory, and then it lists pages it finds in the Content. This is working not too terribly bad.
For number 2, we could do something similar. For each year, it could first load any cities it finds from the data directory in that year, and then follow it with the events in Content. The trick would be to make sure that we remove data files when we migrate events. This won't be too bad, as the migration tool can take care of it (it has to import stuff from the data file, so the last part of the import could be to delete the associated data file). The only events that would migrate would be 2019 and after (previous events have been archived).
The thing is, this change will fundamentally remove our ability to archive to static; I don't know if I mind this terribly, as I kind of think that the efficiencies of the new code will make having lots of stuff in Content not as impactful as in the past.
We could also handle this even a little more elegantly; we could add a new parameter to the frontmatter that is If we did this, it could be part of the migration as well - the only thing we need from the Page is the year/city. I just did a test - whatever is in SOOOO what we do is move all the stuff in
|
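A rough sketch of the "data directory first, then Content" listing described above, written in Python rather than Hugo template code. The `(year, city)` tuples and the function name are made up for illustration; the real theme would read these from data files and Pages. It also shows why deleting the data file on migration matters: without the de-duplication step, a migrated event would be listed twice.

```python
def list_events(data_events, content_events):
    """Return legacy data-file events first, then Content events.

    Both arguments are lists of (year, city) tuples (a hypothetical shape).
    An event present in both places is listed once, from Content, which
    models a data file that was not deleted after migration.
    """
    migrated = set(content_events)
    legacy = [event for event in data_events if event not in migrated]
    return sorted(legacy) + sorted(content_events)


events = list_events(
    data_events=[(2016, "chicago"), (2019, "chicago")],
    content_events=[(2019, "chicago"), (2020, "ponyville")],
)
print(events)
```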
For experimenting purposes, I went back through old commits and got copies of all the |
Note to self - before migrating the "static" events, the speaker pages need to be converted (2016 and some 2017 events used the speaker data files instead of Pages for speakers, so speakers/program need some changes). I will probably have to do this manually, but it might not be too onerous (a migrator might be tough, but maybe not terrible? I'll see if I can write something quickly...it really is just going to be reading some YAML and converting it to markdown via a template) |
I just thought about the migration for data file speakers to Pages...it will be pretty simple, but there will still need to be a manual step to go through each of the I could write something that is a later migration that goes through those files and adds the frontmatter based on the program file name, although if I do this, it has to be done on an event-by-event basis (although thinking some more; we only will run the "convert speaker data files if there is a data file", and that same function could then also call the "update the talk page associated" thing; it would be slightly tricky but doable? I think I'll try it) |
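For what it's worth, the "read some YAML and convert it to markdown via a template" step could be as small as the sketch below. It is Python, not the actual migration tool, and it assumes the YAML has already been parsed into a dict; the keys `name`, `twitter`, and `bio` are hypothetical, not the real speaker data file schema.

```python
def toml_string(value):
    """Quote a value as a TOML basic string (escapes backslash and quote only)."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def speaker_page(record):
    """Render one speaker record as markdown with TOML frontmatter."""
    frontmatter = {"Title": record["name"], "Type": "speaker"}
    if record.get("twitter"):
        frontmatter["twitter"] = record["twitter"]
    lines = ["+++"]
    lines += [f"{key} = {toml_string(value)}" for key, value in frontmatter.items()]
    # The bio becomes the page body, below the closing frontmatter fence.
    lines += ["+++", record.get("bio", "").strip()]
    return "\n".join(lines) + "\n"


print(speaker_page({"name": "Example Speaker", "bio": "Talks about builds."}))
```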
In case anyone is following this (I would be surprised if they were), I did add the function to I'm very curious to see what builds look like with the new theme when we have things moved "back" out of There is still one small manual thing that has to happen...in the "old" style, the |
Here are the things I still need to update in the theme:
|
For the Program page template, I think that the easiest thing to do (and the least impactful) is to move the YAML for program out of the data file and into the frontmatter on the If we want to make enhancements to how the program template functions (later!!), the move will be to create a new template for type I did go back to the survey I ran years ago to see what people prefer, and data fields in a YAML file (for this purpose, TOML in frontmatter is just as fine) was the overwhelming preference. If we migrate the YAML "as-is" to TOML, I think the frontmatter for a program page would look something like this:
Title = "Program"
Type = "program"
Description = "Program for devopsdays Chicago 2019"
icons = "TRUE"
program_elements = [
{ title = "Registration, Breakfast, Sponsors", type = "custom", start = 2019-02-15T08:00:00-06:00, end = 2019-02-15T09:00:00-06:00 },
{ title = "Opening Welcome", type = "custom", start = 2019-02-15T09:00:00-06:00, end = 2019-02-15T09:15:00-06:00 },
{ title = "jeff-smith", type = "talk", start = 2019-02-15T09:15:00-06:00, end = 2019-02-15T09:45:00-06:00 },
{ title = "Ignites", type = "ignite", start = 2019-02-15T09:45:00-06:00, end = 2019-02-15T10:15:00-06:00 },
{ title = "Registration, Breakfast, Sponsors", type = "custom", start = 2019-02-16T08:00:00-06:00, end = 2019-02-16T09:00:00-06:00 },
]
and so on...the upshot of this is that we get timezone info. But we don't necessarily have that. We might have to do something where "if we don't know the timezone from another field in the main data file, we just set it to US central" because this is only for historical programs and if the time isn't right, it's probably Not the End of the World. We could also have the following be supported:
program_elements = [
{ title = "Registration, Breakfast, Sponsors", type = "custom", date = "2019-02-15", start_time = "08:00", end_time = "09:00" },
]
so the template code for the program page would have to be pretty weird; it would have to check for an element that has We also need to handle devopsdays/devopsdays-web#6543 as long as we are here; I think it works the same way. If it detects any elements with |
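To make the "two supported shapes" idea concrete, here is a minimal Python sketch (not template code) that normalizes either shape to timezone-aware start/end values. The fallback is an assumption: a fixed UTC-06:00 offset standing in for "US central", ignoring daylight saving time, which matches the "probably Not the End of the World" tolerance for historical programs.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-06:00 fallback, no daylight saving handling.
DEFAULT_TZ = timezone(timedelta(hours=-6))


def _aware(value, default_tz):
    """Parse an ISO 8601 string and attach default_tz if it has no offset."""
    moment = datetime.fromisoformat(value)
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=default_tz)
    return moment


def normalize(element, default_tz=DEFAULT_TZ):
    """Return (start, end) for either the start/end or the date/time shape."""
    if "start" in element and "end" in element:
        start, end = element["start"], element["end"]
    else:
        start = f"{element['date']}T{element['start_time']}"
        end = f"{element['date']}T{element['end_time']}"
    return _aware(start, default_tz), _aware(end, default_tz)


full = normalize({"start": "2019-02-15T08:00:00-06:00",
                  "end": "2019-02-15T09:00:00-06:00"})
short = normalize({"date": "2019-02-15", "start_time": "08:00", "end_time": "09:00"})
print(full == short)
```

Both shapes end up as the same pair of values, so whatever sorts or renders the program only needs to handle one representation.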
Program page template discussion moved to devopsdays/devopsdays-web#9839 |
I've migrated almost everything properly (there are some really old events that I didn't properly get the old markdown content files for, but it's only a handful). Here is the build time analysis! New site:
old site
|
Some thoughts... the new site with (most of) the old events takes So it takes about 10 seconds longer to build now, but that is with EVERYTHING vs just 2019-2021 events. There is a HUGE improvement in the program page. Old code takes avg of 818 ms vs 8 ms for the new one. The speaker pages take about twice as long in the new code (71 ms vs 31 ms) and I'll have to look at why. That's the new heaviest page, I think, but that's in terms of there being so many of them. One thing that happened when I ran this in watch mode is that even setting max files to 65K was not enough, but this also still has all the archived static HTML files that will get deleted later, so I'm not terribly worried. I am slightly concerned about total build time; remember that the durations listed (except for the total build time) are in terms of CPU time; I am running on 8 cores. I'm tempted to push this to netlify to see how long it takes to build as a PR, but I'm fairly sure it will totally time out on upload (there are way too many files) but it should get far enough to build hugo so I can see what it does. |
Testing with a push to netlify... for comparison, the current site in Netlify comes up as taking this amount of time for the hugo build (from hugo): the newly pushed code takes this long: That's not insubstantial; for comparison: old site: 1m 36s I am considering that Netlify is probably a somewhat reasonable estimate for the lowest-end computer that an organizer would be using; 4 minutes seems waaaaaay too long. I need to consider this some more. |
I suspect that if the speaker page could be improved that might make it feasible. If we abandon the overall refactor, I do want to see if I can replicate what I did with the program page in the “old” code :) |
This is the part of the speaker page that might be the heavy part:
So it’s spinning through all the talks and then ranging over the element; I could take another swing at writing the query so that the check for speaker slug is in the first |
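As an illustration of why the per-speaker scan is heavy, the Python sketch below builds a lookup from speaker slug to talks in one pass, instead of ranging over every talk for every speaker page. This is not Hugo template code, and the `speakers` and `title` keys are hypothetical; whether the same shape is achievable in the template is a separate question.

```python
from collections import defaultdict


def talks_by_speaker(talks):
    """Group talk titles by speaker slug in a single pass over all talks."""
    index = defaultdict(list)
    for talk in talks:
        for slug in talk["speakers"]:
            index[slug].append(talk["title"])
    return index


talks = [
    {"title": "Talk A", "speakers": ["jeff-smith"]},
    {"title": "Talk B", "speakers": ["jeff-smith", "another-speaker"]},
]
index = talks_by_speaker(talks)
print(index["jeff-smith"])
```

With the index built once, each speaker page does a single lookup rather than a full scan of all talks.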
Another thing that might help is working with caching of partials We do this a little bit in the old site (and I am pretty sure I didn't carry this over to the new code). It also looks like you can somehow be specific about where the caching happens, i.e., should it cache across the whole site, or just in sections? For example, it might be possible to cache the |
A bit more detail here - https://regisphilibert.com/blog/2019/12/hugo-partial-series-part-1-caching-with-partialcached/ What might be helpful is if it's possible to cache based on the path/fildir; that would let us do a lot of caching per-event.
|
I don't know if this will help. I just set the sponsors partial to be cached (globally, which we wouldn't do, but it's the most aggressive) on the speaker page and this is the difference: before:
after:
That did cut off about 10ms per, which times 3000 executions isn't minimal. But again, it wouldn't be that good in reality (as the sponsor partial wouldn't cache across all of them) I do wonder if optimizing across ALL pages, even small amounts, would end up with a cumulative improvement. |
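Putting rough numbers on that, using only figures from this thread (about 10 ms saved per execution, about 3000 executions, 8 cores). The wall-clock figure assumes the work parallelizes perfectly across cores, which is a best-case simplification.

```python
saved_per_call_ms = 10   # approximate saving per execution
executions = 3000        # approximate number of executions
cores = 8                # cores on the machine used for the local timings

cpu_seconds_saved = saved_per_call_ms * executions / 1000
wall_seconds_saved = cpu_seconds_saved / cores  # assumes perfect parallelism

print(cpu_seconds_saved, wall_seconds_saved)  # 30.0 3.75
```

So the same change that saves about 30 seconds of CPU time may only show up as a few seconds on an 8-core machine, and proportionally more on a build host with fewer cores.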
Moving the sponsors partial to
The key difference is that |
If the netlify build looks better (not perfect) I think going through all the partials (head/footer ones and their component partials) and adding caching in the same way will make a non-trivial improvement as well. |
(I also discovered a fun bug in the current site - the |
Caching the sponsors partial per-event resulted in a netlify hugo build time of: compared to before it was 4m 23s So that just cut over a minute off of the netlify build. This is promising. |
If we want to start doing more aggressive caching of the
However, it would end up with all pages of type "Talk" with the same header, which while the stuff that The downside of that is that |
This definitely is making this heavier; when I remove this part, it goes from 57 ms to 1 ms, which across all the speakers, is a big deal. Hmm. |
OK, this generally just sucks to have to go pull all the talks (even just the ones for the speaker). I modified the range statement so it only pulls talks from that event if the filename is
removing the "list the talks for this speaker" section results in this:
Hmm. |
the previous speaker page took old code:
new code:
I don't see how the new code is really any different? What am I missing? |
the only thing I can imagine is that the first I could test it by taking an event and making the "type" something other than NOTE: it does not help. I narrowed it down and it comes back this way:
|
I did some caching on the here's what it looks like now:
vs previous:
it definitely cut the time that the particular partial was executed, which is good. it cuts it in half, but that partial doesn't look like it was terribly heavy before? Although having it execute much less frequently probably helps. I'll try a netlify deploy. |
with the new caching, the netlify build is now:
so...uh...it got ... worse? |
Closing the issue, but keeping it around for historical reasons. discussion moved to #6 |
NOTE: This issue was originally opened on devopsdays/devopsdays-web and is shared here for historical/reference purposes. None of these are decisions or even what we should be doing. Just reference!
I might change the title of this issue at some point. What I am looking to do here is get some work in place for a major refactor of the `devopsdays-theme` theme. The reason that I think this is potentially a reasonable time to start considering this is that 2020 will have a drastically smaller amount of events than usual, so it's relatively "safe" to make breaking changes at this time.
A few things that come to mind:
Treat everything like a Page
Much of the logic/structure in this theme is heavily dependent upon data files, and a lot of inefficient queries. Modern Hugo treats every element as a Page object, which drives more of the information into the frontmatter elements instead of where we do it with a lot of looping queries based upon file locations. I suspect we could get more efficiency by pushing things down to the Page level.
Page Bundles
If we refactor how content is organized, we can leverage Page Bundles, which would allow us to group resources associated with an event in one place (and which gives us the ability to do some clever things with processing the files in the Hugo part itself, rather than trying to do things in build pipelines, etc). Here is a helpful post with some pros/cons to bundles vs static folder.
File structure
I have made the argument before that the filesystem for a lot of this is unwieldy. Of course changes to the paths for events would need to be handled with redirects or alias so that old URLs don't break. But for example, I see something like this:
So what you have there is a generated path of `devopsdays.org/events/2020/ponyville` which would want to make sure to have redirects created for backward-compatibility so that `devopsdays.org/events/2020-ponyville` still works.
I would suggest a similar refactor of the `data` directory as well, which makes it easier to navigate. I am not sure what we would have to do to refactor this.
Migrate from data files to Page frontmatter
This is a big one, and it might not work out super well. Much of what is in the `data/events/YYYY-CITY.yml` files could be moved to frontmatter for the Pages; the challenge I see is there is data that is used on multiple pages (although TBH there isn't that much). Data fields that are used by more than one page are:
- `city` - this is the "friendly" name of the city. It's used in a lot of places, I think.
- `year` and `name` - these are used in a lot of places, but to construct queries. They probably aren't needed anymore, and if they are needed, can be obtained via the file path.
- `ga_tracking_id` - the Google Analytics UA ID. This is used (when set) on every page for the event.
- `cancel` - while not really used by pages themselves, used for queries. TBH this probably is handled in frontmatter of the `index.md` page for the event and queries will hit that, the same way they use the dates, etc.
- `startdate`, etc - there are multiple date fields which are used on event pages, but also in "top level" page queries (open CFP, etc). The start/end date fields could be in `index.md` and the cfp related ones in `propose.md`, etc. I think we could probably figure out how to query against that (pseudocode - "select all pages of type propose where cfp end date is greater than now", etc)
Navigation is currently handled in the data file; instead it could simply be built by iterating through the pages in the event's directory. If someone wanted a link that wasn't to a Page, we could provide docs on how to do that (you would actually create a page for it, with no content, and maybe of the type "redirect" with frontmatter of the actual URL, and it would do two things: one, if someone hit that path somehow, it would redirect there, but the menu building logic could be like "range through all the pages; if it's type 'redirect' then make a different URL, but otherwise pull the name or something to make the menu")
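The menu-building logic described above ("range through all the pages; if it's type 'redirect' then make a different URL") can be sketched like this. It is Python for illustration, not template code, and the keys `title`, `path`, `type`, and `redirect_url` are hypothetical names, not existing frontmatter fields.

```python
def build_menu(pages):
    """Return (label, url) pairs for an event's navigation.

    Redirect-type pages link to their external URL; every other page
    links to its own path.
    """
    menu = []
    for page in pages:
        if page.get("type") == "redirect":
            menu.append((page["title"], page["redirect_url"]))
        else:
            menu.append((page["title"], page["path"]))
    return menu


pages = [
    {"title": "Program", "path": "/events/2020/ponyville/program/"},
    {"title": "Tickets", "path": "/events/2020/ponyville/tickets/",
     "type": "redirect", "redirect_url": "https://tickets.example.com/"},
]
print(build_menu(pages))
```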
Sponsor stuff is currently all in the data file; this one would be trickier. I don't have any ideas for this yet.
The other data elements are all very page-specific and would be able to be moved to frontmatter.
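The "select all pages of type propose where cfp end date is greater than now" query mentioned earlier is simple once the dates live in frontmatter. A Python sketch, assuming a hypothetical `cfp_date_end` field on each propose page:

```python
from datetime import date


def open_cfps(pages, today):
    """Select pages of type "propose" whose CFP end date is after today."""
    return [
        page for page in pages
        if page.get("type") == "propose" and page["cfp_date_end"] > today
    ]


pages = [
    {"type": "propose", "city": "Ponyville", "cfp_date_end": date(2020, 6, 1)},
    {"type": "propose", "city": "Chicago", "cfp_date_end": date(2019, 1, 1)},
    {"type": "welcome", "city": "Ponyville"},
]
print([page["city"] for page in open_cfps(pages, today=date(2020, 1, 15))])
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the query easy to check against known dates.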
Change Program template to use Pages
I believe that the data file concept for the Program makes it a lot more unwieldy. The program could be dynamically generated based on frontmatter elements (which the current script actually creates, but we don't use), i.e., the program element pages all have a date, start, and end time, and they get collected into one place, sorted, displayed, etc. This makes it a lot easier to create some custom program elements (you just create a Page for them).
Downside of that is things that have to go on the program that you repeat a lot, but don't want to create a whole Page element just for them (like breaks).
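The "collected into one place, sorted, displayed" step might look roughly like the following. This is a Python sketch under the assumption that each program page exposes `title`, `date`, and `start` values as strings (hypothetical names); ISO dates and zero-padded 24-hour times sort correctly as plain strings.

```python
from itertools import groupby


def build_program(pages):
    """Group program pages by date, with each day sorted by start time."""
    ordered = sorted(pages, key=lambda p: (p["date"], p["start"]))
    return {
        day: [p["title"] for p in items]
        for day, items in groupby(ordered, key=lambda p: p["date"])
    }


pages = [
    {"title": "Ignites", "date": "2019-02-15", "start": "09:45"},
    {"title": "Opening Welcome", "date": "2019-02-15", "start": "09:00"},
    {"title": "Registration, Breakfast, Sponsors", "date": "2019-02-16", "start": "08:00"},
]
print(build_program(pages))
```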
Maybe the Program gets actually done via shortcodes in the markdown; you have a list of the things for the day, and the shortcode takes arguments to turn it into what needs to be done. So, the `program.md` would look like this:
Upon reflection, I can't see how that actually works, because of HTML. But I'm keeping the reference so that we don't think it's a good idea in the future.
Another possibility with the "get all the program elements from frontmatter of the pages in the `program` directory" is that the pages could support more than one start/end time (for the custom stuff like breaks).
It's not a trivial problem, that's for sure.
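One way to picture the "more than one start/end time" idea: a single Page (say, a break) carries a list of slots, and each slot becomes its own program row. A small Python sketch, with a hypothetical `slots` list in the page data:

```python
def expand_slots(page):
    """Yield one (date, start, end, title) row per slot on a page."""
    for slot in page["slots"]:
        yield (slot["date"], slot["start"], slot["end"], page["title"])


break_page = {
    "title": "Break",
    "slots": [
        {"date": "2019-02-15", "start": "10:15", "end": "10:45"},
        {"date": "2019-02-16", "start": "10:15", "end": "10:45"},
    ],
}
print(list(expand_slots(break_page)))
```

This keeps repeated elements like breaks down to one Page each, at the cost of a slightly more complex frontmatter shape.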
A simpler mechanism might be to create a script that will generate the program for you, based upon prompts, and it outputs it as HTML to the `program.md` page. The good part of this idea is that it's VERY easy to customize it later by hacking the HTML (if you wanted to do fancy things/stuff not supported by the default template, like having multiple blocks of ignites, etc).
Downside is that the script has to be where we make changes if we want to improve the styling, etc, of the program.
The other thing we could do is keep all the program layout in the markdown file, and just have the generated program be a "sample" and people have to do it manually. That's not great either. I think that the happy medium is to have a templated markdown file that people can use as inspiration, but also provide a script to make one based upon prompts.
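As a sense of scale for the "script that generates the program" option, the HTML-emitting part could be very small. The sketch below is Python and leaves out the prompting entirely; the table markup is a placeholder, not the theme's actual program styling.

```python
from html import escape


def program_html(entries):
    """Render (time, title) pairs as a simple HTML table."""
    rows = "\n".join(
        f"  <tr><td>{escape(time)}</td><td>{escape(title)}</td></tr>"
        for time, title in entries
    )
    return f"<table>\n{rows}\n</table>\n"


print(program_html([("08:00", "Registration, Breakfast, Sponsors"),
                    ("09:00", "Opening Welcome")]))
```

The output is plain HTML that an organizer could paste into a page and edit by hand, which is the customization benefit described above; the trade-off is that any styling change has to be made in the script.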