[Windows] EXCEPTION_IN_PAGE_ERROR Crash in sqlite3 WAL since 2.5.0 #6881

Closed
guruz opened this issue Nov 14, 2018 · 13 comments
Labels
ReadyToTest QA, please validate the fix/enhancement type:bug Windows

@guruz
Contributor

guruz commented Nov 14, 2018

See https://sentry.io/owncloud/desktop-win-and-mac/issues/705087323/events/

The backtrace shown in Sentry is a bit confusing, but from the raw one you can see:

Thread 0 (crashed)
 0  owncloud_csync.dll!static int walIndexTryHdr(struct Wal *, int *) [sqlite3.c : 59283 + 0x1]
    eip = 0x6d68d122   esp = 0x00afc27c   ebp = 0x00afc378   ebx = 0x08090178
    esi = 0x007b0000   edi = 0x00000000   eax = 0x007b0000   ecx = 0x007b0000
    edx = 0x00000000   efl = 0x00010286
    Found by: given as instruction pointer in context
 1  owncloud_csync.dll!static int walIndexReadHdr(struct Wal *, int *) [sqlite3.c : 59367 + 0xf]
    eip = 0x6d68cb3c   esp = 0x00afc300   ebp = 0x00afc378
    Found by: call frame info
 2  owncloud_csync.dll!static int walTryBeginRead(struct Wal *, int *, int, int) [sqlite3.c : 59685 + 0x7]
    eip = 0x6d68dd88   esp = 0x00afc320   ebp = 0x00afc378
    Found by: call frame info
 3  owncloud_csync.dll!static int sqlite3WalBeginReadTransaction(struct Wal *, int *) [sqlite3.c : 59956 + 0xb]
    eip = 0x6d680abb   esp = 0x00afc344   ebp = 0x00afc378
    Found by: call frame info
 4  owncloud_csync.dll!static int pagerBeginReadTransaction(struct Pager *) [sqlite3.c : 52804 + 0x10]
    eip = 0x6d644f8a   esp = 0x00afc364   ebp = 0x00afc378
    Found by: call frame info
 5  owncloud_csync.dll!static int sqlite3PagerSharedLock(struct Pager *) [sqlite3.c : 54898 + 0x6]
    eip = 0x6d667de1   esp = 0x00afc380   ebp = 0x00afc378
    Found by: call frame info
 6  owncloud_csync.dll!static int lockBtree(struct BtShared *) [sqlite3.c : 65032 + 0x13]
    eip = 0x6d641587   esp = 0x00afc3ac   ebp = 0x00afc378
    Found by: call frame info
 7  owncloud_csync.dll!static int sqlite3BtreeBeginTrans(struct Btree *, int) [sqlite3.c : 65400 + 0xc]
    eip = 0x6d65158c   esp = 0x00afc3c8   ebp = 0x00afc378
    Found by: call frame info
 8  owncloud_csync.dll!static int sqlite3InitOne(struct sqlite3 *, int, char * *) [sqlite3.c : 120578 + 0x7]
    eip = 0x6d6624a4   esp = 0x00afc3e0   ebp = 0x00afc378
    Found by: call frame info
 9  owncloud_csync.dll!static int sqlite3Init(struct sqlite3 *, char * *) [sqlite3.c : 120763 + 0x9]
    eip = 0x6d662144   esp = 0x00afc440   ebp = 0x00afc378
    Found by: call frame info
10  owncloud_csync.dll!static int sqlite3ReadSchema(struct Parse *) [sqlite3.c : 120789 + 0xa]
    eip = 0x6d66be9b   esp = 0x00afc458   ebp = 0x00afc378
    Found by: call frame info
11  owncloud_csync.dll!static void sqlite3Pragma(struct Parse *, struct Token *, struct Token *, struct Token *, int) [sqlite3.c : 118319 + 0x9]
    eip = 0x6d6691e6   esp = 0x00afc46c   ebp = 0x00afc378
    Found by: call frame info
12  owncloud_csync.dll!static unsigned short yy_reduce(struct yyParser *, unsigned int, int, struct Token, struct Parse *) [sqlite3.c : 144915 + 0x15]
    eip = 0x6d696d61   esp = 0x00afc4fc   ebp = 0x00afc378
    Found by: call frame info
13  owncloud_csync.dll!static void sqlite3Parser(void *, int, struct Token) [sqlite3.c : 145333 + 0x13]
    eip = 0x6d66878a   esp = 0x00afc544   ebp = 0x00afc378
    Found by: call frame info
14  owncloud_csync.dll!static int sqlite3RunParser(struct Parse *, const char *, char * *) [sqlite3.c : 146324 + 0x19]
    eip = 0x6d66d410   esp = 0x00afc578   ebp = 0x00afc378
    Found by: call frame info
15  owncloud_csync.dll!static int sqlite3Prepare(struct sqlite3 *, const char *, int, unsigned int, struct Vdbe *, struct sqlite3_stmt * *, const char * *) [sqlite3.c : 120986 + 0x10]
    eip = 0x6d66b9cd   esp = 0x00afca70   ebp = 0x00afc378
    Found by: call frame info
16  owncloud_csync.dll!static int sqlite3LockAndPrepare(struct sqlite3 *, const char *, int, unsigned int, struct Vdbe *, struct sqlite3_stmt * *, const char * *) [sqlite3.c : 121079 + 0x1b]
    eip = 0x6d664beb   esp = 0x00afcc80   ebp = 0x00afc378
    Found by: call frame info
17  owncloud_csync.dll!sqlite3_prepare_v2 [sqlite3.c : 121162 + 0x20]
    eip = 0x6d6247a0   esp = 0x00afccb0   ebp = 0x00afc378
    Found by: call frame info
18  owncloud_csync.dll!OCC::SqlQuery::prepare(QByteArray const &,bool) [ownsql.cpp : 258 + 0x14]
    eip = 0x6d5e2954   esp = 0x00afccd0   ebp = 0x00afc378
    Found by: call frame info

Annoyingly, the backtrace is cut off at frame 18; we need to fix this.

@guruz guruz added type:bug p2-high Escalation, on top of current planning, release blocker labels Nov 14, 2018
@guruz guruz added this to the 2.5.x milestone Nov 14, 2018
@ogoffart
Contributor

As far as I can remember, we've always had these crashes.
For example
https://sentry.io/owncloud/desktop-win-and-mac/issues/130205177/?query=is:unresolved [since 2.2] is among the most frequent crashes.
This also happens on OSX: https://sentry.io/owncloud/desktop-win-and-mac/issues/751145518/

My suspicion is that this is one of those crashes that happen on a corrupted database.

@guruz
Contributor Author

guruz commented Nov 14, 2018

Hmm, not sure I can agree.

> For example
> https://sentry.io/owncloud/desktop-win-and-mac/issues/130205177/?query=is:unresolved [since 2.2] is among the most frequent crashes.

That one has a different (much shorter) backtrace and only occurred between 2.0.1 and 2.2.4!

> This also happens on OSX: https://sentry.io/owncloud/desktop-win-and-mac/issues/751145518/

But that one was also first seen in 2.5.0 (according to Sentry).

@ogoffart ogoffart removed the p2-high Escalation, on top of current planning, release blocker label Nov 29, 2018
@ckamm
Contributor

ckamm commented Dec 18, 2018

We don't have enough information to know what the cause of these crashes is. What are our options here?

We could switch away from WAL and hope that that does something? @ogoffart @guruz Do you know why exactly we use WAL mode? (as far as I remember we already switch to DELETE mode on some incompatible filesystems)

@ckamm ckamm added the Windows label Jan 9, 2019
@ckamm
Contributor

ckamm commented Jan 9, 2019

How about we switch to EXCLUSIVE locking mode as suggested in https://bugzilla.mozilla.org/show_bug.cgi?id=993556 ?

@ckamm
Contributor

ckamm commented Jan 9, 2019

I've done a short test with EXCLUSIVE locking mode on Windows and it seems to work fine. The -wal and -shm files are no longer created when it's enabled. Basic sync worked fine.
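
For anyone who wants to repeat this kind of check outside the client, here is a minimal sketch using Python's standard `sqlite3` module. It is not the ownCloud code: the file name `sync.db`, the `metadata` table and the helper names are made up for illustration. It sets `PRAGMA locking_mode=EXCLUSIVE` before the first WAL access and then lists which `-wal`/`-shm` companion files exist next to the database.

```python
import os
import sqlite3
import tempfile


def open_exclusive_wal(path):
    """Open `path` and request EXCLUSIVE locking before WAL is first used."""
    # isolation_level=None keeps the connection in autocommit mode, so the
    # PRAGMAs below never run inside an implicit transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    locking = conn.execute("PRAGMA locking_mode=EXCLUSIVE").fetchone()[0]
    journal = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, locking, journal


def sidecar_files(path):
    """Return which of the -wal / -shm companion files currently exist."""
    return [suffix for suffix in ("-wal", "-shm") if os.path.exists(path + suffix)]


# Hypothetical database location, used only for this experiment.
db_path = os.path.join(tempfile.mkdtemp(), "sync.db")
conn, locking, journal = open_exclusive_wal(db_path)
conn.execute("CREATE TABLE metadata(path TEXT)")
conn.execute("INSERT INTO metadata VALUES ('a.txt')")

# Report the effective modes and the companion files seen while the
# connection is still open.
print(locking, journal, sidecar_files(db_path))
conn.close()
```

Which companion files show up may depend on the SQLite version and on whether any statement touched the database before the pragma was applied, so the printed list is something to observe rather than a guaranteed result.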

@guruz
Contributor Author

guruz commented Jan 10, 2019

> The -wal and -shm files are no longer created when it's enabled.

So the write ahead log is in normal heap memory then? And this is properly guarded against multi-threaded access that we might be doing?

See also my comments on your PR #6960

@guruz
Contributor Author

guruz commented Jan 10, 2019

Hmm, the Bugzilla commenter that you linked asks:

> Is that database ever used by more than a single process. (Use by multiple threads using separate connections does not count - I mean really used by multiple processes with their own address space.)
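
The question matters because a connection in EXCLUSIVE locking mode keeps its file lock for as long as it stays open, so any other connection to the same file is shut out. A small sketch of that effect with Python's standard `sqlite3` module follows; the file, table and function names are illustrative assumptions, and a second connection in the same process stands in for a separate tool such as the `sqlite3` shell.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file used only for this experiment.
db_path = os.path.join(tempfile.mkdtemp(), "sync.db")

# The "owner" connection plays the role of the sync client.
owner = sqlite3.connect(db_path, isolation_level=None)
owner.execute("PRAGMA locking_mode=EXCLUSIVE")
owner.execute("PRAGMA journal_mode=WAL")
owner.execute("CREATE TABLE metadata(path TEXT)")  # first write takes the lock


def try_read(path):
    """Try to read through a second connection and report what happened."""
    # timeout=0 makes the attempt fail immediately instead of waiting.
    other = sqlite3.connect(path, timeout=0, isolation_level=None)
    try:
        other.execute("SELECT count(*) FROM metadata").fetchone()
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        other.close()


result_while_open = try_read(db_path)   # owner still holds the lock
owner.close()
result_after_close = try_read(db_path)  # lock was released on close
print(result_while_open, "/", result_after_close)
```

While the owner is open, the second connection should be refused with a "database is locked" style error, and it should succeed once the owner has closed.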

ckamm added a commit that referenced this issue Jan 11, 2019
Can be overridden with OWNCLOUD_SQLITE_LOCKING_MODE
guruz pushed a commit that referenced this issue Jan 11, 2019
Can be overridden with OWNCLOUD_SQLITE_LOCKING_MODE
@ckamm ckamm added the ReadyToTest QA, please validate the fix/enhancement label Jan 11, 2019
@jnweiger
Contributor

jnweiger commented Jan 21, 2019

How can we test this properly?

Tested 2.5.2rc2 on Linux Mint Tara.

My sync folder still contains ._sync_ffe9784ffc22.db-shm and ._sync_ffe9784ffc22.db-wal files.

  • Is this an indicator that the patch is not present in 2.5.2rc2? -> BAD
  • Exclusive locking is nevertheless active: the sqlite3 client reports "database is locked". -> GOOD
  • env OWNCLOUD_SQLITE_LOCKING_MODE="NORMAL; create table little(bobby int);" owncloud --logwindow reports that the locking mode is normal, and the sqlite3 client works. -> GOOD
    (extra bonus for not creating an additional db table)
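
The last test above is an injection attempt through the environment variable. One possible way to make such an override robust is to accept only known mode names before building the pragma, sketched below in Python. This is not the client's actual implementation: the function names, the allowed set, and the assumption that EXCLUSIVE is the default are all illustrative. Note also that the tested client ended up in NORMAL mode, whereas this sketch falls back to the default when the value is not an exact match.

```python
import os

# Assumption: only these two mode names are accepted as overrides.
ALLOWED_LOCKING_MODES = {"NORMAL", "EXCLUSIVE"}


def locking_mode_from_env(environ=os.environ, default="EXCLUSIVE"):
    """Return a validated locking mode, falling back to `default`."""
    raw = environ.get("OWNCLOUD_SQLITE_LOCKING_MODE", default)
    mode = raw.strip().upper()
    # Anything that is not an exact, known mode name is rejected, so text
    # such as "NORMAL; create table ..." never reaches the SQL statement.
    if mode not in ALLOWED_LOCKING_MODES:
        return default
    return mode


def locking_pragma(environ=os.environ):
    """Build the PRAGMA statement from the validated mode only."""
    return "PRAGMA locking_mode=" + locking_mode_from_env(environ)


if __name__ == "__main__":
    hostile = {"OWNCLOUD_SQLITE_LOCKING_MODE":
               "NORMAL; create table little(bobby int);"}
    print(locking_pragma({"OWNCLOUD_SQLITE_LOCKING_MODE": "normal"}))
    print(locking_pragma(hostile))
```

Because the statement is assembled from a fixed set of strings, nothing from the environment is ever interpolated verbatim.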

@ckamm
Contributor

ckamm commented Jan 22, 2019

@jnweiger The patch is in the rc and it worked in my tests. Maybe these files linger if they were present before. Can you try whether they're recreated on client start if you delete them while the client isn't running?

@jnweiger
Contributor

The -wal and -shm files get created even in a fresh sync folder. Seen on macOS and Linux.

@jnweiger jnweiger removed the ReadyToTest QA, please validate the fix/enhancement label Jan 22, 2019
@ckamm
Contributor

ckamm commented Jan 23, 2019

@jnweiger Thanks for retesting, I'll take a second look. Possibly the cause is the index deletion, which for some reason happens before the pragmas are set.

@ckamm
Contributor

ckamm commented Jan 23, 2019

Oddly enough, I couldn't reproduce the issue on Linux. Could you test whether my #6999 improves behavior for you?

@guruz guruz modified the milestones: 2.5.2, 2.5.x-next Jan 25, 2019
@guruz guruz added the ReadyToTest QA, please validate the fix/enhancement label Feb 1, 2019
@guruz
Contributor Author

guruz commented Feb 1, 2019

Ready to test now

@guruz guruz closed this as completed Mar 21, 2019