Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Investigation] See if there are optimizations we can make in our use of Core Data to improve app performance #1443

Open
setch-l opened this issue Aug 30, 2024 · 7 comments
Assignees

Comments

@setch-l
Copy link

setch-l commented Aug 30, 2024

We've identified several critical areas of the app that we cannot continue to break. The goal of this ticket is to spend some time investigating our use of Core Data to see if there are optimizations we can make to improve performance in critical areas or in the app overall.

Some options for Deliverables:

  • List of improvements in how we use core data (feel free to create tickets for the work)
  • Your recommendation for how the improvements should be prioritized
  • List of specific questions we can ask Apple Developer support as this is a service we pay for
@bryanmontz
Copy link
Contributor

I have noticed and fixed the following issues as a result of this investigation (#1471):

  1. This code: Event().createIfNecessary(jsonEvent... is creating Events and throwing them away immediately and causing the console error message "Failed to call designated initializer on NSManagedObject". The resolution is to make the createIfNecessary... function static.

  2. Adding the launch argument -com.apple.CoreData.ConcurrencyDebug 1 reveals that Event.loadFirstQuotedNote() is creating an event on the wrong thread. The resolution is to move the Event.findOrCreateStubBy... call to inside the context.perform call below. Reference https://useyourloaf.com/blog/debugging-core-data/.

  3. In RelayService.retryFailedPublishes(), a JSONEvent was being creating inside a nested loop. This caused needless extra work because it could be created only in the outer loop.

@bryanmontz
Copy link
Contributor

Suggestions from this investigation that I wanted to get feedback on from other devs before creating tickets or implementing:

  1. All occurrences of Event.findOrCreateStubBy must be called on the thread associated with the context. For viewContext this can be by making sure we are on the main thread with @MainActor, but better would be to always make sure we are inside a context.perform when creating managed objects. The same applies to every Author.findOrCreate(by:context:) and Relay.findOrCreate(by:context:) call. The app needs an audit to confirm the correctness of these calls.

  2. DatabaseCleaner could be run at the end of app sessions rather than at the beginning of sessions. This would need some testing, but it could make app launch feel faster to the user.

  3. I recommend that we leave -com.apple.CoreData.ConcurrencyDebug 1 on all the time so that we get assertions during development. With this applied, you can see this message in the console on every run, and the app will stop if database changes are made on the wrong thread:
    "CoreData: annotation: Core Data multi-threading assertions enabled."

  4. Shouldn't Event.markSeen(on:) save after making changes?

  5. The documentation comment for Event.stub() indicates that it should reset all properties other than the identifier. However, there are several missing ones. Shouldn't they be reset as well?
    expirationDate, kind, replaceableIdentifier, referencingEvents, isRead

  6. Move all the functions in Event+CoreDataClass that create predicates and fetch requests to an extension in a separate file. Move all the functions that have to do with hydrating events to another file.

  7. Rename Event.stub() to Event.resetToStub() because the word "stub" is both a noun and a verb and "reset to stub" more clearly indicates what the function does.

  8. There are a few cases of SwiftUI views publishing events to relays. In my opinion, it would improve separation of concerns, separation of view and business logic, testability, and reusability to encapsulate publishing specific types of events within some namespace, such as a protocol extension (EventPublishing) or a caseless enum (EventPublisher).

@mplorentz
Copy link
Member

Thanks Bryan! It's great to have fresh eyes on this stuff. Responding to each of these individually:

1. All occurrences of `Event.findOrCreateStubBy` must be called on the thread associated with the context. For `viewContext` this can be by making sure we are on the main thread with `@MainActor`, but better would be to always make sure we are inside a `context.perform` when creating managed objects. The same applies to every `Author.findOrCreate(by:context:)` and `Relay.findOrCreate(by:context:)` call. The app needs an audit to confirm the correctness of these calls.

This is one of my least favorite parts of Core Data, as we have no way for the compiler to enforce the thread isolation requirements of NSManagedObjectContexts. Except now we do have actor isolation in Swift 5. @joshuatbrown and I were noodling on whether it would be possible to isolate each of our background contexts to its own @globalActor actor somehow. I'm also curious if SwiftData makes any improvements here (although that's probably out of scope, we have #971 for that).

Can you do some research to see if there is prior art here? Surely someone has tried actor-isolating contexts or maybe people have done crazy things with property wrappers or macros that can help us here.

We have done audits for this in the past but a pre-launch one sounds like a good idea. Can you write up a ticket for that?

2. `DatabaseCleaner` could be run at the end of app sessions rather than at the beginning of sessions. This would need some testing, but it could make app launch feel faster to the user.

I am actually exploring this now as part of #1426. The main problem I am running into is that the app can terminate before the cleanup has finished, but it might be a good part of the overall solution.

3. I recommend that we leave `-com.apple.CoreData.ConcurrencyDebug 1` on all the time so that we get assertions during development. With this applied, you can see this message in the console on every run, and the app will stop if database changes are made on the wrong thread:
   "CoreData: annotation: Core Data multi-threading assertions enabled."

I actually did turn this on in the past, but it really hurt the scroll performance in the simulator, and sometimes caused long freezes. Aside from being annoying it also makes it harder to notice regressions in scroll performance during development. However, now that we are prioritizing #1459 we won't need to rely on subjective feelings of scroll performance so much. I'm game to try this again and if it's too annoying we'll just turn it off again.

We have a Nos Diagnostics scheme that builds the app with this flag as well as -com.apple.CoreData.SQLDebug 1. The SQL debug flag is an even worse performance killer though so I don't run that scheme very often.

4. Shouldn't `Event.markSeen(on:)` save after making changes?

No, I don't think so. In general I think it's a best practice not to save inside NSManagedObject functions. This gives callers the freedom to use NSManagedObjectContexts as a scratch pad and they get to decide when to commit the changes. Also this function is only called inside the parsing loop, which has been carefully calibrated to only save periodically, because saving is somewhat expensive for the main thread as it tries to figure out what objects changed and what views need to be updated.

5. The documentation comment for `Event.stub()` indicates that it should reset all properties other than the `identifier`. However, there are several missing ones. Shouldn't they be reset as well?
   `expirationDate`, `kind`, `replaceableIdentifier`, `referencingEvents`, `isRead`

Good question. I think I left some of those there on purpose, and others were added later. I could go either way on this. I guess I would rather just update the doc comment for now. Could you go ahead and do that? Some of those properties are nice to keep around, and as long as we delete enough stuff to make isStub return true then I think we are good.

  1. Move all the functions in Event+CoreDataClass that create predicates and fetch requests to an extension in a separate file. Move all the functions that have to do with hydrating events to another file.

This sounds good to me. Feel free to make that change if it won't take long or file another ticket.

7. Rename `Event.stub()` to `Event.resetToStub()` because the word "stub" is both a noun and a verb and "reset to stub" more clearly indicates what the function does.

Great point. Go ahead!

8. There are a few cases of SwiftUI views publishing events to relays. In my opinion, it would improve separation of concerns, separation of view and business logic, testability, and reusability to encapsulate publishing specific types of events within some namespace, such as a protocol extension (`EventPublishing`) or a caseless enum (`EventPublisher`).

Agreed! I think @martindsq actually just created type like "EventPublisher" in a recent PR. Maybe you could file a separate ticket for this one as its some heavier refactoring and it would be a good opportunity to write some tests for these.

@bryanmontz
Copy link
Contributor

Thanks, @mplorentz!

  1. Ticket for Core Data context audit:
    Core Data context audit #1481

I didn't find too much about actor-isolation used with Core Data. Here's a Swift forums post where someone was struggling with it a bit: https://forums.swift.org/t/using-nsmanagedobjectcontext-background-context-with-actors/72546
Probably still worth trying. SwiftData makes me nervous because it's so new, and I haven't personally tried it yet. It's probably the best choice long term though.

  1. Ah! I hadn't noticed the Nos Diagnostics scheme.

  2. Yeah, good point.

5/6/7. I'll put these in a follow-up PR to avoid adding to an already-approved PR.

  1. Ticket for refactoring event publishing:
    Refactor publishing events out of views #1482

github-merge-queue bot pushed a commit that referenced this issue Sep 11, 2024
fixes and improvements related to Core Data usage #1443
github-merge-queue bot pushed a commit that referenced this issue Sep 11, 2024
@mplorentz
Copy link
Member

I filed #1515 to look into actor-isolating Core Data contexts in the future.

@setch-l
Copy link
Author

setch-l commented Dec 17, 2024

@bryanmontz @mplorentz - Is this ticket done and a new one has been created for the remaining work?

@bryanmontz
Copy link
Contributor

@setch-l Yes, I believe this one could be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Sprint
Development

No branches or pull requests

3 participants