[Investigation] See if there are optimizations we can make in our use of Core Data to improve app performance #1443

setch-l · 2024-08-30T17:00:31Z

We've identified several critical areas of the app that we cannot continue to break. The goal of this ticket is to spend some time investigating our use of Core Data to see if there are optimizations we can make to improve performance in critical areas or in the app overall.

Some options for Deliverables:

List of improvements in how we use core data (feel free to create tickets for the work)
Your recommendation for how the improvements should be prioritized
List of specific questions we can ask Apple Developer support as this is a service we pay for

bryanmontz · 2024-09-08T13:04:17Z

I have noticed and fixed the following issues as a result of this investigation (#1471):

This code: Event().createIfNecessary(jsonEvent... is creating Events and throwing them away immediately and causing the console error message "Failed to call designated initializer on NSManagedObject". The resolution is to make the createIfNecessary... function static.
Adding the launch argument -com.apple.CoreData.ConcurrencyDebug 1 reveals that Event.loadFirstQuotedNote() is creating an event on the wrong thread. The resolution is to move the Event.findOrCreateStubBy... call to inside the context.perform call below. Reference https://useyourloaf.com/blog/debugging-core-data/.
In RelayService.retryFailedPublishes(), a JSONEvent was being creating inside a nested loop. This caused needless extra work because it could be created only in the outer loop.

bryanmontz · 2024-09-08T13:08:48Z

Suggestions from this investigation that I wanted to get feedback on from other devs before creating tickets or implementing:

All occurrences of Event.findOrCreateStubBy must be called on the thread associated with the context. For viewContext this can be by making sure we are on the main thread with @MainActor, but better would be to always make sure we are inside a context.perform when creating managed objects. The same applies to every Author.findOrCreate(by:context:) and Relay.findOrCreate(by:context:) call. The app needs an audit to confirm the correctness of these calls.
DatabaseCleaner could be run at the end of app sessions rather than at the beginning of sessions. This would need some testing, but it could make app launch feel faster to the user.
I recommend that we leave -com.apple.CoreData.ConcurrencyDebug 1 on all the time so that we get assertions during development. With this applied, you can see this message in the console on every run, and the app will stop if database changes are made on the wrong thread:
"CoreData: annotation: Core Data multi-threading assertions enabled."
Shouldn't Event.markSeen(on:) save after making changes?
The documentation comment for Event.stub() indicates that it should reset all properties other than the identifier. However, there are several missing ones. Shouldn't they be reset as well?
expirationDate, kind, replaceableIdentifier, referencingEvents, isRead
Move all the functions in Event+CoreDataClass that create predicates and fetch requests to an extension in a separate file. Move all the functions that have to do with hydrating events to another file.
Rename Event.stub() to Event.resetToStub() because the word "stub" is both a noun and a verb and "reset to stub" more clearly indicates what the function does.
There are a few cases of SwiftUI views publishing events to relays. In my opinion, it would improve separation of concerns, separation of view and business logic, testability, and reusability to encapsulate publishing specific types of events within some namespace, such as a protocol extension (EventPublishing) or a caseless enum (EventPublisher).

mplorentz · 2024-09-09T19:06:17Z

Thanks Bryan! It's great to have fresh eyes on this stuff. Responding to each of these individually:

1. All occurrences of `Event.findOrCreateStubBy` must be called on the thread associated with the context. For `viewContext` this can be by making sure we are on the main thread with `@MainActor`, but better would be to always make sure we are inside a `context.perform` when creating managed objects. The same applies to every `Author.findOrCreate(by:context:)` and `Relay.findOrCreate(by:context:)` call. The app needs an audit to confirm the correctness of these calls.

This is one of my least favorite parts of Core Data, as we have no way for the compiler to enforce the thread isolation requirements of NSManagedObjectContexts. Except now we do have actor isolation in Swift 5. @joshuatbrown and I were noodling on whether it would be possible to isolate each of our background contexts to its own @globalActor actor somehow. I'm also curious if SwiftData makes any improvements here (although that's probably out of scope, we have #971 for that).

Can you do some research to see if there is prior art here? Surely someone has tried actor-isolating contexts or maybe people have done crazy things with property wrappers or macros that can help us here.

We have done audits for this in the past but a pre-launch one sounds like a good idea. Can you write up a ticket for that?

2. `DatabaseCleaner` could be run at the end of app sessions rather than at the beginning of sessions. This would need some testing, but it could make app launch feel faster to the user.

I am actually exploring this now as part of #1426. The main problem I am running into is that the app can terminate before the cleanup has finished, but it might be a good part of the overall solution.

3. I recommend that we leave `-com.apple.CoreData.ConcurrencyDebug 1` on all the time so that we get assertions during development. With this applied, you can see this message in the console on every run, and the app will stop if database changes are made on the wrong thread:
   "CoreData: annotation: Core Data multi-threading assertions enabled."

I actually did turn this on in the past, but it really hurt the scroll performance in the simulator, and sometimes caused long freezes. Aside from being annoying it also makes it harder to notice regressions in scroll performance during development. However, now that we are prioritizing #1459 we won't need to rely on subjective feelings of scroll performance so much. I'm game to try this again and if it's too annoying we'll just turn it off again.

We have a Nos Diagnostics scheme that builds the app with this flag as well as -com.apple.CoreData.SQLDebug 1. The SQL debug flag is an even worse performance killer though so I don't run that scheme very often.

4. Shouldn't `Event.markSeen(on:)` save after making changes?

No, I don't think so. In general I think it's a best practice not to save inside NSManagedObject functions. This gives callers the freedom to use NSManagedObjectContexts as a scratch pad and they get to decide when to commit the changes. Also this function is only called inside the parsing loop, which has been carefully calibrated to only save periodically, because saving is somewhat expensive for the main thread as it tries to figure out what objects changed and what views need to be updated.

5. The documentation comment for `Event.stub()` indicates that it should reset all properties other than the `identifier`. However, there are several missing ones. Shouldn't they be reset as well?
   `expirationDate`, `kind`, `replaceableIdentifier`, `referencingEvents`, `isRead`

Good question. I think I left some of those there on purpose, and others were added later. I could go either way on this. I guess I would rather just update the doc comment for now. Could you go ahead and do that? Some of those properties are nice to keep around, and as long as we delete enough stuff to make isStub return true then I think we are good.

Move all the functions in Event+CoreDataClass that create predicates and fetch requests to an extension in a separate file. Move all the functions that have to do with hydrating events to another file.

This sounds good to me. Feel free to make that change if it won't take long or file another ticket.

7. Rename `Event.stub()` to `Event.resetToStub()` because the word "stub" is both a noun and a verb and "reset to stub" more clearly indicates what the function does.

Great point. Go ahead!

8. There are a few cases of SwiftUI views publishing events to relays. In my opinion, it would improve separation of concerns, separation of view and business logic, testability, and reusability to encapsulate publishing specific types of events within some namespace, such as a protocol extension (`EventPublishing`) or a caseless enum (`EventPublisher`).

Agreed! I think @martindsq actually just created type like "EventPublisher" in a recent PR. Maybe you could file a separate ticket for this one as its some heavier refactoring and it would be a good opportunity to write some tests for these.

bryanmontz · 2024-09-10T12:49:49Z

Thanks, @mplorentz!

Ticket for Core Data context audit:
Core Data context audit #1481

I didn't find too much about actor-isolation used with Core Data. Here's a Swift forums post where someone was struggling with it a bit: https://forums.swift.org/t/using-nsmanagedobjectcontext-background-context-with-actors/72546
Probably still worth trying. SwiftData makes me nervous because it's so new, and I haven't personally tried it yet. It's probably the best choice long term though.

Ah! I hadn't noticed the Nos Diagnostics scheme.
Yeah, good point.

5/6/7. I'll put these in a follow-up PR to avoid adding to an already-approved PR.

Ticket for refactoring event publishing:
Refactor publishing events out of views #1482

fixes and improvements related to Core Data usage #1443

minor refactor of Event+CoreDataClass #1443

mplorentz · 2024-09-17T18:59:26Z

I filed #1515 to look into actor-isolating Core Data contexts in the future.

setch-l · 2024-12-17T18:59:10Z

@bryanmontz @mplorentz - Is this ticket done and a new one has been created for the remaining work?

bryanmontz · 2024-12-19T13:13:05Z

@setch-l Yes, I believe this one could be closed.

setch-l added this to Nos Product Board Aug 30, 2024

setch-l converted this from a draft issue Aug 30, 2024

setch-l assigned bryanmontz Aug 30, 2024

bryanmontz added a commit that referenced this issue Sep 8, 2024

fixes and improvements related to Core Data usage #1443

83ad516

bryanmontz mentioned this issue Sep 8, 2024

fixes and improvements related to Core Data usage #1443 #1471

Merged

github-merge-queue bot pushed a commit that referenced this issue Sep 11, 2024

Merge pull request #1471 from planetary-social/bdm/1443

045b186

fixes and improvements related to Core Data usage #1443

bryanmontz added a commit that referenced this issue Sep 11, 2024

minor refactor of Event+CoreDataClass #1443

336a4c3

bryanmontz mentioned this issue Sep 11, 2024

minor refactor of Event+CoreDataClass #1443 #1494

Merged

github-merge-queue bot pushed a commit that referenced this issue Sep 11, 2024

Merge pull request #1494 from planetary-social/bdm/1443-cd-follow-up

343d5ca

minor refactor of Event+CoreDataClass #1443

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Investigation] See if there are optimizations we can make in our use of Core Data to improve app performance #1443

[Investigation] See if there are optimizations we can make in our use of Core Data to improve app performance #1443

setch-l commented Aug 30, 2024

bryanmontz commented Sep 8, 2024

bryanmontz commented Sep 8, 2024

mplorentz commented Sep 9, 2024

bryanmontz commented Sep 10, 2024

mplorentz commented Sep 17, 2024

setch-l commented Dec 17, 2024

bryanmontz commented Dec 19, 2024

[Investigation] See if there are optimizations we can make in our use of Core Data to improve app performance #1443

[Investigation] See if there are optimizations we can make in our use of Core Data to improve app performance #1443

Comments

setch-l commented Aug 30, 2024

bryanmontz commented Sep 8, 2024

bryanmontz commented Sep 8, 2024

mplorentz commented Sep 9, 2024

bryanmontz commented Sep 10, 2024

mplorentz commented Sep 17, 2024

setch-l commented Dec 17, 2024

bryanmontz commented Dec 19, 2024