Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AuthPool (C1) on first block: 2 new items in the auth pool for each core, 1 expected #90

Closed
sourabhniyogi opened this issue Feb 22, 2025 · 13 comments
Assignees

Comments

@sourabhniyogi
Copy link
Contributor

From @davxy in #74:

@sourabhniyogi @jaymansfield

Alright, from the disassembled blob I can see the HF calls, but we never reach them during execution because of 2 independent problems:

  1. The WR reports 101 as gas. But this is exhausted way before reaching the first host call
  2. Instruction executed at PC 636 triggers a trap!
633: s1 = s1 + s0                               
      // s1 = 0x0
636: a1 = u32 [s1 + 4294967292]  
    // Load of 4 bytes from 0xfffffffc failed!  
    // TRAP TRIGGERED => bail out
639: a2 = 0x64
642: a3 = 0x64
645: a0 = s0
647: ecalli 9        // NEW Host Call is never reached

From what I can can tell instruction at 636 tries to load in a1 4 bytes from address 0xfffffffc .
This is most certainly due to this code here:

let code_length_address: u64 = work_result_address + work_result_length - 4;
let code_length: u64 = unsafe { ( *(code_length_address as *const u32)).into() };

Where apparently the work_report_address is 0? So maybe is not correctly assigned in the loop above?

@jaymansfield
Copy link

jaymansfield commented Feb 22, 2025

Moving my comment this to this issue.

The WR reports 101 as gas. But this is exhausted way before reaching the first host call

I think this is why I hardcoded the gas limit previously when testing this. With my implementation I need the gas to be >= 155 to not end up with OOG.

From what I can can tell instruction at 636 tries to load in a1 4 bytes from address 0xfffffffc .

My instruction 636 loads from memory location 4278124587 which exists. Reg 6=4278124591 and immediate=-4 and I end up with a matching code length same as the vector. I do not receive any traps.

@sourabhniyogi
Copy link
Contributor Author

We will take ~48 hours to resolve this trap/gas issue (also #89), storage key issue #86, and new a_t computation #91

@sourabhniyogi sourabhniyogi changed the title WR reports 101 as gas, Instruction executed at PC 636 triggers a trap assurances data: WR reports 101 as gas, Instruction executed at PC 636 triggers a trap Feb 22, 2025
@clw8998
Copy link
Collaborator

clw8998 commented Feb 23, 2025

@davxy, @sourabhniyogi
Regarding Instruction executed at PC 636 triggers a trap!

Program:

633: c8 56 06                 s1 = s1 + s0
636: 80 68 fc                 a1 = u32 [s1 + 0xfffffffffffffffc]
639: 33 09 64                 a2 = 0x64
642: 33 0a 64                 a3 = 0x64
645: 64 57                    a0 = s0
647: 0a 09                    ecalli 9 // 'new'

Actual Operation Results (All Values in Decimal):

633: s1 = s1 + s0                               
    // s1 = 36 + 4278124555 = 4278124591
636: a1 = u32 [s1 + 18446744073709551612]  
    // (4278124591 + 18446744073709551612) % 2^32 = 4278124587 (mod 2^32 since memory address belongs to N_32)
    // u32[4278124587..4278124591] = u32[125, 4, 0, 0] = 1149
    // a1 = 1149
639: a2 = 0x64
642: a3 = 0x64
645: a0 = s0
647: ecalli 9

The bootstrap service blob contains the following opcodes. You can check them to see if they contain any bugs.

[0, 1, 10, 40, 50, 51, 72, 73, 80, 81, 82, 83, 85, 87, 90, 100, 108, 123, 124, 125, 128, 130, 131, 132, 133, 136, 138, 147, 149, 151, 170, 172, 174, 200, 201, 207, 212]

Because the information you provided is limited, we can only provide the actual results of our operation. We hope this is useful.

@sourabhniyogi
Copy link
Contributor Author

Ok @davxy we arrive at

code_len = 1149

in our 64-bit interpreter which matches up with the desired preimage length so why do we get this from the disassembler

   636: 80 68 fc                 a1 = u32 [s1 + 0xfffffffffffffffc]

whereas your trap report from your "direct" @koute PolkaVM integration (?) has

636: a1 = u32 [s1 + 4294967292]  
    // Load of 4 bytes from 0xfffffffc failed!  
    // TRAP TRIGGERED => bail out

It is as if you are running 32-bit PVM and we are running our 64-bit interpreter?

@davxy
Copy link

davxy commented Feb 24, 2025

Looks like in our trace s0 and s1 are both 0 before s1 = s1 + s0.
@koute can you have a look please? If you need my support just ping me.
You can easily reproduce by executing my polkavm branch davxy/process-community-vectors

@koute
Copy link

koute commented Feb 25, 2025

From what I can see the trapping is most likely a symptom of the test program being buggy when it gets inputs it doesn't expect, and has nothing to do with PVM.

@sourabhniyogi Can you please paste the contents of the memory starting at 0xfeff0000 for this test vector?

From the JavaJAM traces pasted in the previous issues they have this blob over there:

[16, 0, 0, 0, 0, 0, 0, 0, 1, 0, 36, 65, 121, 179, 10, 95, 210, 176, 131, 49, 5, 92, 207, 22, 140, 173, 183, 105, 33, 170, 0, 150, 68, 26, 213, 52, 236, 20, 242, 72, 27, 36, 155, 125, 4, 0, 0, 104, 246, 198, 248, 169, 8, 114, 139, 86, 109, 90, 93, 170, 254, 188, 188, 116, 72, 228, 223, 200, 64, 111, 101, 71, 157, 144, 35, 234, 146, 144, 150, 75, 240, 155, 219, 222, 37, 102, 252, 164, 205, 13, 7, 231, 163, 202, 174, 75, 61, 225, 21, 105, 184, 15, 135, 173, 41, 205, 190, 139, 13, 224, 141, 0]

which if I try to decode with our code I get:

AccumulateParams {
    slot: 16,
    id: 0,
    results: [
        AccumulateItem {
            package: 0x00244179b30a5fd2..14f2481b,
            auth_output: 0x9b7d04000068f6c6..23ea9290,
            payload: 0x964bf09bdbde2566..be8b0de0,
            result: Err(
                Error141,
            ),
        },
    ],
}

...which looks like nonsense as the result error is 141 for some reason here? So either this blob is incorrect, or our code is incorrect.

While we are passing this blob in our failed run:

[16, 0, 0, 0, 0, 0, 0, 0, 1, 153, 247, 54, 213, 78, 105, 147, 88, 47, 187, 213, 2, 56, 68, 108, 76, 20, 58, 125, 5, 69, 230, 216, 221, 160, 242, 233, 27, 237, 71, 11, 209, 0, 104, 246, 198, 248, 169, 8, 114, 139, 86, 109, 90, 93, 170, 254, 188, 188, 116, 72, 228, 223, 200, 64, 111, 101, 71, 157, 144, 35, 234, 146, 144, 150, 0, 36, 65, 121, 179, 10, 95, 210, 176, 131, 49, 5, 92, 207, 22, 140, 173, 183, 105, 33, 170, 0, 150, 68, 26, 213, 52, 236, 20, 242, 72, 27, 36, 155, 125, 4, 0, 0]

...which decodes like this:

AccumulateParams {
    slot: 16,
    id: 0,
    results: [
        AccumulateItem {
            package: 0x99f736d54e699358..ed470bd1,
            auth_output: "",
            payload: 0x68f6c6f8a908728b..ea929096,
            result: Ok(
                0x4179b30a5fd2b083..249b7d04+2*0,
            ),
        },
    ],
}

and the first divergence between our run and JavaJAM run happens when the first byte of the package hash is compared at offset 339 (jump 363 if a1 == 0), where JavaJAM goes to 363 and we go to 342.

@clw8998
Copy link
Collaborator

clw8998 commented Feb 25, 2025

Contents of the memory starting at 0xfeff0000:

[16, 0, 0, 0, 0, 0, 0, 0, 1, 0, 36, 65, 121, 179, 10, 95, 210, 176, 131, 49, 5, 92, 207, 22, 140, 173, 183, 105, 33, 170, 0, 150, 68, 26, 213, 52, 236, 20, 242, 72, 27, 36, 155, 125, 4, 0, 0, 104, 246, 198, 248, 169, 8, 114, 139, 86, 109, 90, 93, 170, 254, 188, 188, 116, 72, 228, 223, 200, 64, 111, 101, 71, 157, 144, 35, 234, 146, 144, 150, 34, 179, 168, 152, 171, 39, 141, 11, 131, 35, 56, 231, 59, 68, 207, 92, 53, 250, 38, 138, 159, 9, 20, 92, 181, 153, 40, 100, 39, 167, 110, 219, 0]

The way we decode AccumulateParams:
(the package might be slightly different)

AccumulateParams {
    slot: 16,
    id: 0,
    results: [
        AccumulateItem {
            result: Ok(
                    0x4179b30a5fd2b08331055ccf168cadb76921aa0096441ad534ec14f2481b249b7d040000,
                ),
            payload: 0x68f6c6f8a908728b566d5a5daafebcbc7448e4dfc8406f65479d9023ea929096,
            package: 0x0b47d9a108fa3eeabf0a0c1ed4112f6061e79c66f862bbaae1f9cc1c4c8da9d7,
            auth_output: "",
        },
    ],
}

The reason why our results are different is due to the way we encode the Wrangled Operand Tuples (12.18), as shown below:

Image

Your encoding of (12.18) is:

$$ {\cal E}({x} \in \mathbb{O}) = {\cal E}(x_k) \mathbin{\smallfrown} {\cal E}(\updownarrow x_a) \mathbin{\smallfrown} {\cal E}(x_l) \mathbin{\smallfrown} {\cal E}(O(x_o)) $$

Our encoding of (12.18) is:

$$ {\cal E}({x} \in \mathbb{O}) = {\cal E}(O(x_o)) \mathbin{\smallfrown} {\cal E}(x_l, x_k) \mathbin{\smallfrown} {\cal E}(\updownarrow x_a) $$

and use same way to decode it as shown in bootstrap.pvm

Since GP doesn't define a codec for Wrangled Operand Tuples, as @sourabhniyogi mention before, I followed the original order to encode it.

Hope this makes sense!

@davxy
Copy link

davxy commented Feb 25, 2025

@clw8998 looks like using the same order as 12.18 we're able to proceed.

Unfortunately the last vectors revision introduced a regression for what concerns the AuthPool (C1).
After exec of the first block (1_000) we get two new items in the auth pool for each core, we expect only one:

  • Checking: 0100000000000000000000000000000000000000000000000000000000000000

What we Expect

[
    ["", "", "", "", "", "", "", 0x8c30f2c101674af1..1d5f3c9f],
    ["", "", "", "", "", "", "", 0x8c30f2c101674af1..1d5f3c9f]
]

What we get:

[
    ["", "", "", "", "", "", 0x8c30f2c101674af1..1d5f3c9f, 0x8c30f2c101674af1..1d5f3c9f]
    ["", "", "", "", "", "", 0x8c30f2c101674af1..1d5f3c9f, 0x8c30f2c101674af1..1d5f3c9f]
]

@sourabhniyogi sourabhniyogi changed the title assurances data: WR reports 101 as gas, Instruction executed at PC 636 triggers a trap regression for what concerns the AuthPool (C1) Feb 25, 2025
@sourabhniyogi sourabhniyogi changed the title regression for what concerns the AuthPool (C1) regression on AuthPool (C1) on first block Feb 25, 2025
@sourabhniyogi sourabhniyogi changed the title regression on AuthPool (C1) on first block AuthPool (C1) on first block: 2 new items in the auth pool for each core, on expected Feb 25, 2025
@sourabhniyogi sourabhniyogi changed the title AuthPool (C1) on first block: 2 new items in the auth pool for each core, on expected AuthPool (C1) on first block: 2 new items in the auth pool for each core, 1 expected Feb 25, 2025
sourabhniyogi added a commit that referenced this issue Feb 25, 2025
* genesis has 2 preimages, must have 2x81

* authorization c1 single apply #90
sourabhniyogi added a commit that referenced this issue Feb 25, 2025
* genesis has 2 preimages, must have 2x81

* authorization c1 single apply #90
@yoyo2325
Copy link
Collaborator

yoyo2325 commented Feb 25, 2025

@davxy
Hi, Thank you for letting us know about this issue. We have already fixed it. Could you please check again?

@davxy
Copy link

davxy commented Feb 25, 2025

Alright we're almost there :-)))
The code is executed and the host is called for NEW and WRITE. It seems that I have a couple of items in the posterior state that doesn't match. I'll inspect these soon and I'll report here

@sourabhniyogi
Copy link
Contributor Author

sourabhniyogi commented Feb 25, 2025

Cool @davxy -- it appears with #101 we have come full circle back to the Option<Hash> "yield" question, we'd appreciate your comment on #101 as well [should we have the bootstrap service have a yield call at the end with the new service id or something else? If we have no "yield" for this first bootstrap accumulate, what is the E_4(s) E(h) for this first accumulate if h is null?]

@sourabhniyogi
Copy link
Contributor Author

looks like using the same order as 12.18 we're able to proceed.

@davxy your review at this would be super appreciated https://github.com/gavofyork/graypaper/pull/251/files

@sourabhniyogi
Copy link
Contributor Author

Closing this as we appear to have enough people not having an auth pool/queue issue, but look forward to your next issue @davxy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants