Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EverParse+Steel: Verified parsing and serialization with separation logic #54

Closed
wants to merge 383 commits into from

Conversation

tahina-pro
Copy link
Member

Motivation

Coming up with a domain-specific language for formally verified parsing and serialization at the right level of abstraction with runtime efficiency in mind is much larger a research project than one may think, and we at Project Everest have been working on that for a very long time. Several challenges:

  • minimize memory copies for speed ==> things must be serialized in the right order
  • ease of user proof: how much layout details do we want to expose to the user
  • inplace mutations and their impact on dependently parsed formats (e.g. tagged unions)
  • must be extracted to (ideally auditable) C code

Our USENIX Security 2019 paper, particularly Sections 4.4 and 4.5, presents our first attempt in Low*, as done in miTLS, based on LowParse.Low.* validators and accessors. As described there, EverParse already produces such validators, accessors and serializer primitives for the user to write a Low* program reading and writing valid packets, and we extensively use them in miTLS and EverQuic, but layout details (e.g. offsets to the input/output buffers) are still very much exposed to the user.

This PR: Steel+EverParse

In this PR, I propose to use Steel, a separation logic framework for F*, to model resources for byte arrays containing byte representations valid with respect to a given parser specification: LowParse.Steel.*

If a is a byte array and p is a parser specification, then LowParse.Steel.VParse.vparse a p says that a exactly contains bytes valid with respect to p: p consumes all bytes of a and succeeds. Such a model seems to work with most parsers currently supported by LowParse, which either consume all their input or have the strong prefix property. The selector (ghostly) returns the high-level value resulting from parsing.

Then, validating an input buffer x against a parser p should return either a null pointer (if parsing fails), or a pointer to the remaining unparsed part of the input buffer, truncating x to the part actually consumed by the parser/combinator, thus establishing vparse p x as a post-condition. See LowParse.Steel.Validate.

By using Steel, the main goal is to hide most, if not all, offset reasoning away from the user.

Utilities

Byte arrays

This PR is based on FStarLang/FStar#2319, which models C pointers and provides an abstraction for byte arrays.

Based on Steel.Array, which models mutable C arrays, I am introducing LowParse.Steel.ArrayPtr, adopting the interface I first proposed on FStarLang/FStar#2309, but with an implementation now based on a more faithful model of C pointers:

  • Spatial split and join: if x is an array viewed as a "pointer", then, instead of returning a pair of arrays, split x i will return one "pointer" y to represent the elements from index i and at the same time "cut" x to have it represent only the elements up to index i-1. Similarly, merge x y (if adjacent) will "restore" x by extending it to cover all elements in y. These are made possible by storing the actual array in a ghost reference while retaining its base pointer as the concrete value.

  • Similarly, the temporal fractional permission also becomes a ghost reference, in such a way that share x returns a new "pointer" with half permission, with the permission on x also being cut by half at the same time.

Right-to-left output buffer

Based on Tao et al.'s USENIX 2021 paper, which introduced right-to-left low-level serialization with Low* and LowParse for a subset of the X.509 certificate format for IoT, I introduce a Steel data structure, LowParse.Steel.R2LOutput, to represent right-to-left output buffer as a pair of a byte array and its length, with operations to "allocate" chunks of bytes starting from the right end of that array.

@tahina-pro tahina-pro added enhancement New feature or request lowparse The LowParse verified parsing and serialization combinator library labels Jun 16, 2021
@tahina-pro
Copy link
Member Author

This PR is superseded by #155, based on Pulse instead of Steel.

@tahina-pro tahina-pro closed this Oct 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request lowparse The LowParse verified parsing and serialization combinator library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant