Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PMP NA4-R test may stuck in an infinite loop on a compliant implementation #577

Open
marcfedorow opened this issue Dec 10, 2024 · 10 comments

Comments

@marcfedorow
Copy link
Contributor

marcfedorow commented Dec 10, 2024

Edit: changed granularity to simplify an example

Priv spec says the following:

All PMP CSR fields are WARL

Let's assume that DUT has PMP granularity of, e.g., 14 bits of address (region size of 0x4000).
Let's assume that if we write 0xYYYYYXXX (e.g. 0x20000212, correcponding to the address 0x80000828) to PMPADDRx, 0xYYYYY000 will be written (e.g. 0x20000000, corresponding to the address 0x80000000).
Let's assume that if we write 0x91 to the PMPCFGx (NA4 R L), actual value of 0x99 will be written (NAPOT R L) and 0x7ff will be OR'ed to the PMPADDRx (e.g. now PMP#x now protects the region [0x80000000..0x80003fff]).
Under the following assumptions I believe the above code will lead execution to the following:

hart   0: 0x0000000080000578 (0x3a022073) csrs    pmpcfg0, tp
hart   0: exception trap_instruction_access_fault, epc 0x000000008000057c
hart   0:           tval 0x000000008000057c
hart   0: exception trap_instruction_access_fault, epc 0x0000000080000940
hart   0:           tval 0x0000000080000940
hart   0: exception trap_instruction_access_fault, epc 0x0000000080000940
hart   0:           tval 0x0000000080000940
hart   0: exception trap_instruction_access_fault, epc 0x0000000080000940

In other words, the following will occure:

  1. csrs protects the whole region instead of just 4 bytes.
  2. While this is the same region that is being execuded right now, the very next instruction causes instruction access fault.
  3. The same is correct for the very first instruction of the trap handler, so go to step 2.

While I suppose the suggested architecture to be compliant, the test will never finish being stuck in the infinite loop.

@marcfedorow marcfedorow changed the title Test may stuck in an infinite loop on a compliant implementation PMP NA4-R test may stuck in an infinite loop on a compliant implementation Dec 10, 2024
@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 11, 2024 via email

@marcfedorow
Copy link
Contributor Author

Running an NA4 test on a model that doesn't support NA4 may be broken.

Unfortunately at least some tests that don't contain "NA4" in their name are broken too, so it is not possible to filter-out NA4 test knowing that they require NA4 support.

But it won't get into an infinite loop. The trap handler will exit the test

That is not true and I've shown that in my example: while the trap handler itself is in the protected area, not a single instruction of it shall be execuded. Executing the first instruction of the trap handler causes exception itself, leading the execution to be infinitely stuck at that point.
Trap handler will not be able to do anything with the signature so the test will never stop (at least untill being interrupted due to external cause such as ^C).
This leads to not comparing all the following signatures (even if they are present) but it's a design flaw of the framework, not a problem of this test.

So to be clear: this is fatal. The trap handler will not exit this test, because no instructions of trap handling are actually executed, because the trap handler itself is in the protected memory region.
SAIL results will not match DUT results because either DUT will stuck in the infinite loop preventing results for this and the following test on DUT and any results on SAIL to appear, or (if the timeouts are implemented) riscof_XXX.py will kill these tests and riscof itself will fail because of not having signature file(s) for this test, and that itself will abort the task so no following signatures are compared.

Because of framework being ugly, such an issue is fatal not just for the test, but for the whole testing. It is still fatal for the test itself, so this is still an issue of the test, not the framework (at least untill framework picks this test for execution).

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 11, 2024 via email

@marcfedorow
Copy link
Contributor Author

The PMP trap handler is in Mmode, and (except for locked entries) immune to PMP effects.

Unlucky enough, there is a locked region, so trap handler is not immune to PMP effect even being execuded in M-mode

#define PMPREGION3 ((((PMP_R|PMP_L|PMP_NA4)&0xFF) << PMP1_CFG_SHIFT))

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 11, 2024 via email

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 11, 2024 via email

@marcfedorow
Copy link
Contributor Author

Then we need to fix that one so it isn't executed if NA4 isn't supported.

I hope this will not be fixed via allowing trap handler to overflow the signature.

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 11, 2024 via email

@marcfedorow
Copy link
Contributor Author

I think this triggers a timeout (if I recall correctly), and no signature is saved, but that does need fixing.

Does it cause timeout or not purely relies on riscof_XXX.py implementation (i.e. is outside of ATG/riscof/etc control). Adequate implementation of riscof_XXX.py may or may not provide per-test time-outs. If it does not, the whole testing might be killed via external time-out with providing no results at all.

If it does trigger a time-out though, riscof will abort execution on the first test that doesn't have signature file(s) to compare. This will fail the whole task, not just one test's signatures' comparison. This behavior should be fixed too, but that's a major issue of the framework, not of the test.

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 11, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants