Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use MemoryOnlyProvider as fallback to allow the app to work without storage #439

Closed

Conversation

chrispader
Copy link
Contributor

@chrispader chrispader commented Dec 21, 2023

@marcaaron @tgolen

Details

This PR adds a MemoryOnlyProvider fallback solution and changes the interface of Storage (lib/storage/index.js) to allow multiple simultaneous storage providers.

This PR also improves the structure of everything in lib/storage and establishes a consistent structure for providers and order for all the functions within them.

Related Issues

Expensify/App#29403

Automated Tests

This PR changes the way we use the storage but the functionality of the library to the outer world is the same. Therefore no new tests were added.

Manual Tests

Verify that no flows and functionality were broken by the changes. Check for console errors regarding Onyx.

Author Checklist

  • I linked the correct issue in the ### Related Issues section above
  • I wrote clear testing steps that cover the changes made in this PR
    • I added steps for local testing in the Tests section
    • I tested this PR with a High Traffic account against the staging or production API to ensure there are no regressions (e.g. long loading states that impact usability).
  • I included screenshots or videos for tests on all platforms
  • I ran the tests on all platforms & verified they passed on:
    • Android / native
    • Android / Chrome
    • iOS / native
    • iOS / Safari
    • MacOS / Chrome / Safari
    • MacOS / Desktop
  • I verified there are no console errors (if there's a console error not related to the PR, report it or open an issue for it to be fixed)
  • I followed proper code patterns (see Reviewing the code)
    • I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick)
    • I verified that the left part of a conditional rendering a React component is a boolean and NOT a string, e.g. myBool && <MyComponent />.
    • I verified that comments were added to code that is not self explanatory
    • I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    • I verified proper file naming conventions were followed for any new files or renamed files. All non-platform specific files are named after what they export and are not named "index.js". All platform-specific files are named for the platform the code supports as outlined in the README.
    • I verified the JSDocs style guidelines (in STYLE.md) were followed
  • If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
  • I followed the guidelines as stated in the Review Guidelines
  • I tested other components that can be impacted by my changes (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar are working as expected)
  • I verified all code is DRY (the PR doesn't include any logic written more than once, with the exception of tests)
  • I verified any variables that can be defined as constants (ie. in CONST.js or at the top of the file that uses the constant) are defined as such
  • I verified that if a function's arguments changed that all usages have also been updated correctly
  • If a new component is created I verified that:
    • A similar component doesn't exist in the codebase
    • All props are defined accurately and each prop has a /** comment above it */
    • The file is named correctly
    • The component has a clear name that is non-ambiguous and the purpose of the component can be inferred from the name alone
    • The only data being stored in the state is data necessary for rendering and nothing else
    • If we are not using the full Onyx data that we loaded, I've added the proper selector in order to ensure the component only re-renders when the data it is using changes
    • For Class Components, any internal methods passed to components event handlers are bound to this properly so there are no scoping issues (i.e. for onClick={this.submit} the method this.submit should be bound to this in the constructor)
    • Any internal methods bound to this are necessary to be bound (i.e. avoid this.submit = this.submit.bind(this); if this.submit is never passed to a component event handler like onClick)
    • All JSX used for rendering exists in the render method
    • The component has the minimum amount of code necessary for its purpose, and it is broken down into smaller components in order to separate concerns and functions
  • If any new file was added I verified that:
    • The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
  • If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
  • If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.
  • I have checked off every checkbox in the PR author checklist, including those that don't apply to this PR.

Screenshots/Videos

Android: Native
Android: mWeb Chrome
iOS: Native
iOS: mWeb Safari
MacOS: Chrome / Safari
Screen.Recording.2023-12-27.at.14.12.34.mov
MacOS: Desktop

@chrispader chrispader requested review from marcaaron and tgolen January 8, 2024 15:56
@tgolen
Copy link
Collaborator

tgolen commented Jan 16, 2024

I'm going to simplify this PR and remove and unnecessary overhead for now and just use the NoopProvider instead of actually using a working fallback provider.

@chrispader Is this ready to review again? I wasn't sure and I don't want to be the hold-up here.

@chrispader
Copy link
Contributor Author

I'm going to simplify this PR and remove and unnecessary overhead for now and just use the NoopProvider instead of actually using a working fallback provider.

@chrispader Is this ready to review again? I wasn't sure and I don't want to be the hold-up here.

ah no, not yet. Going to finish this this week!

@chrispader
Copy link
Contributor Author

@tgolen @marcaaron i just removed the additional "multi-storage" logic and just fallback to the NoopProvider in case the main platform storage provider couldn't be initialised.

This whole re-structuring and cleanup as well as this extra "storage" layer comes in handy though, since we can now handle the initialisation phase first, and only then start handling operations.

I'm going to test this in the app now, but make sure to review this PR, as it should be ready!

cache.set(key, value);
keyChanged(key, value);
});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: Why was this moved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to keep the code that basically initiates the Storage layer together. Technically, it makes no difference

* @return {Promise<*>}
*/
getItem(key) {
return runAfterInit(() => provider.getItem(key));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confusion: Why do we need runAfterInit()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since i moved the initialization of the provider to provider.init(), we need this, to prevent any operations from running before the storage provider actually got intialized/created

init() {
const newIdbKeyValStore = createStore('OnyxDB', 'keyvaluepairs');

if (newIdbKeyValStore == null) throw Error('IDBKeyVal store could not be created');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What problem specifically does this solve? Do we have any documented cases where createStore() returned null? What was the problem we saw? What about when an irrecoverable error happens when trying to save after initializing the store successfully?

(that second situation is why this issue was created in the first place - and I'm not convinced we have solved it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Currently i assumed that this or any other error would only occur during initialization.

I updated the PR to handle errors during runtime as well and gracefully degrade the performance of the app. Lmk what you think of the general approach, before actually reviewing it :)

It's still very useful to have this extra storage layer and a cleaned up provider structure.

@chrispader
Copy link
Contributor Author

One more time, i'm asking for you're opinions on this new approach. @marcaaron @tgolen

(no need to completely review yet, i'm just curious what you think :))

The current approach handles errors during initialization and runtime, while also utilizing the extra storage layer and the cleaned up provider structure.

Still can't really test this in a real-life scenario, since we don't know how to trigger this Internal error opening backing store for indexedDB.open error, but it basically works with any error thrown for any operation.

As In the tryOrDegradePerformance function we could additionally check for specific errors. As long as we don't know which errors exactly to expect, we can start adding all the problematic errors to a list of error types and check for them in the functions catch block...

@chrispader chrispader requested a review from marcaaron January 23, 2024 17:44
@tgolen
Copy link
Collaborator

tgolen commented Jan 23, 2024

it basically works with any error thrown for any operation.

My only concern with this is the chance of false positives. It seems like we are struggling between the extremes of not knowing which errors get thrown or only knowing one error that gets thrown with no way to reproduce it.

Maybe this hints that we need better error reporting first so that we can make a better assumption about which errors are actually occurring.

@chrispader
Copy link
Contributor Author

it basically works with any error thrown for any operation.

My only concern with this is the chance of false positives. It seems like we are struggling between the extremes of not knowing which errors get thrown or only knowing one error that gets thrown with no way to reproduce it.

Yes you're right, it's either this or that extreme case. I don't see any other way right now to check for these errors, unless we're able to confidently reproduce it.

After doing some internet research, i'm like 80% sure the Internal error opening backing store for indexedDB.open error described in Expensify/App#29403 comes from the storage being full.

Maybe this hints that we need better error reporting first so that we can make a better assumption about which errors are actually occurring.

My suggestion here would be to basically have this error check for initialization and every operation (tryOrDegradePerformance) and only degrade performance if we detect a known error that will prevent the storage from keep working. This way, we can easily add more errors on the go.

There really is no official way from idb-keyval or IndexedDB to detect these kind of errors and handle them. I think the current approach is the most generic and will fix this issue at least for the known error.

@marcaaron
Copy link
Contributor

On the subject of error reporting... I worked on improving it a bit in the past and an internal log search will show us when storage is failing and why it failed (some explanations more cryptic than others).

This will give us the ability to catch something like a "critical" vs. "benign" loss of client data (if there is such a thing).

With reliable updates, it seems extremely important to ensure storage failures don't lead to data loss at all e.g. if you are able to store one recent update, but for some reason could not store some previous update - the previous update is now lost forever. Maybe I am wrong, but the current system we have depends on a near perfect reliability. Anytime we fail to store something the update chain will essentially be broken.

So, I might actually argue that we should always stop storing things immediately when any error happens and address these errors with urgency.

@chrispader
Copy link
Contributor Author

On the subject of error reporting... I worked on improving it a bit in the past and an internal log search will show us when storage is failing and why it failed (some explanations more cryptic than others).

This will give us the ability to catch something like a "critical" vs. "benign" loss of client data (if there is such a thing).

unfortunately i don't have access to that link.. are you referring to some sort of analytics where we collect data about (the type of) errors occurring (in production)?

With reliable updates, it seems extremely important to ensure storage failures don't lead to data loss at all e.g. if you are able to store one recent update, but for some reason could not store some previous update - the previous update is now lost forever. Maybe I am wrong, but the current system we have depends on a near perfect reliability. Anytime we fail to store something the update chain will essentially be broken.

So, I might actually argue that we should always stop storing things immediately when any error happens and address these errors with urgency.

If i understand you correctly, this means we would need to implement some sort of "transaction" system, to basically rollback changes if there an error occurs later in a chain of operations?

We can definitely check for initialization errors - probably like the original one described in the issue - and degrade performance before any operations are even triggered.

But if a fatal error occurs at some other point during Onyx's lifetime, we can only degrade performance from there... meaning after the failing operation. The data will still be stored in cache (in-memory), though that data will obviously be lost after a refresh. Without some other persisted fallback storage solution, i'm not sure if we can prevent this sort of data loss.

Maybe i got something wrong, because i'm not totally sure what you are referring to here and if it isn't extending the scope of this PR/issue massively.

@chrispader
Copy link
Contributor Author

JFYI @marcaaron @tgolen

I'm going to be OOO from 31/01/2024 - 17/03/2024, which means i can either complete a (limited) version of this issue fix/PR by tomorrow/tuesday, or i'll have to hand it over to someone else to takeover.

From my perspective, i don't think the current state of this PR affects the behaviour of Onyx negatively at all. The current implementation of tryOrDegradePerformance checks for only this specific error (Internal error opening backing store for indexedDB.open) and only then degrades performance.

This PR also includes lots of re-structuring and code quality improvements and paves the way for future TypeScript typings and clean-up.

Do you guys think we can merge this (as-is) and work on advanced error reporting and preventing in a follow-up PR?

@tgolen
Copy link
Collaborator

tgolen commented Jan 29, 2024

I think with such large changes in this PR, and you being gone for so long, I'd actually prefer holding this PR until you are back. Unless someone else is willing and able to take over maintaining it. We've had a lot of issues where Onyx PRs need to be reverted and then it causes as cascade effect on other PRs. I think the risk of causing a regression here is pretty high. The best way to combat that is to split this PR up into multiple smaller pieces.

@chrispader
Copy link
Contributor Author

I think with such large changes in this PR, and you being gone for so long, I'd actually prefer holding this PR until you are back. Unless someone else is willing and able to take over maintaining it. We've had a lot of issues where Onyx PRs need to be reverted and then it causes as cascade effect on other PRs. I think the risk of causing a regression here is pretty high. The best way to combat that is to split this PR up into multiple smaller pieces.

I understand! I can hand this PR over to another Margelo engineer and ask them to split it up into more easily reviewable PRs. Otherwise (if this not a priority) i can also continue working on it in March.

@marcaaron
Copy link
Contributor

If i understand you correctly, this means we would need to implement some sort of "transaction" system, to basically rollback changes if there an error occurs later in a chain of operations

Hmm no I wasn't trying to imply any kind of transaction system (unsure what that would entail). I am just stating a fact - which is that if you are unable to store an update, but allow storing a later update then you will have irrecoverable "gaps" in the client side data.

@hannojg
Copy link
Contributor

hannojg commented Feb 1, 2024

Hey, as Chris is OOO pretty long we are internally assigning someone new to carry on this PR.
The idea is as Tim pointed out to split up the PR in smaller PRs to reduce the risk of hard to fix regressions.

@tgolen
Copy link
Collaborator

tgolen commented Feb 2, 2024

Thanks @hannojg. I'm going to close this PR for now then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants