Merge branch 'master' into galop
athas committed Jan 21, 2024
2 parents 265e122 + f8756c5 commit e94176c
Showing 153 changed files with 3,698 additions and 3,082 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/benchmark.yml
@@ -78,12 +78,12 @@ jobs:
with:
backend: opencl
system: A100
slurm-options: -p gpu --gres=gpu:a100:1 --job-name=fut-opencl-A100
slurm-options: -p gpu --gres=gpu:a100:1 --job-name=fut-opencl-A100 --exclude=hendrixgpu21fl
- uses: ./.github/actions/benchmark
with:
backend: cuda
system: A100
slurm-options: -p gpu --gres=gpu:a100:1 --job-name=fut-cuda-A100
slurm-options: -p gpu --gres=gpu:a100:1 --job-name=fut-cuda-A100 --exclude=hendrixgpu21fl

# benchmark-titanx-cuda:
# runs-on: hendrix
66 changes: 66 additions & 0 deletions CHANGELOG.md
@@ -9,12 +9,78 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

### Added

* Incremental flattening of `map`-`scan` compositions with nested
parallelism (similar to the logic for `map`-`reduce` compositions
that we have had for years).

### Removed

### Changed

### Fixed

* Flattening of `scatter` with multi-dimensional elements (#2089).

* Some instances of not-actually-irregular allocations were mistakenly
interpreted as irregular. Fixing this was a dividend of the memory
representation simplifications of 0.25.12.

* Obscure issue related to expansion of shared memory allocations (#2092).

## [0.25.12]

### Added

* `f16.copysign`, `f32.copysign`, `f64.copysign`.

* Trailing commas are now allowed for all syntactical elements that
involve comma-separation. (#2068)

* The C API now allows destruction and construction of sum types (with
some caveats). (#2074)

* An overall reduction in memory copies, through simplifying the
internal representation.

### Fixed

* C API would define distinct entry point types for Futhark types that
differed only in naming of sizes (#2080).

* `==` and `!=` on sum types with array payloads. Constructing them is
now a bit slower, though. (#2081)

* Somewhat obscure simplification error caused by neglecting to update
metadata when removing dead scatter outputs.

* Compiler crash due to the type checker forgetting to respect the
explicitly ascribed non-consuming diet of loop parameters (#2067).

* Size inference did incomplete level/scope checking, which could
result in circular sizes, which usually manifested as the type
checker going into an infinite loop (#2073).

* The OpenCL backend now more gracefully handles lack of platform.

## [0.25.11]

### Added

* New prelude function: `manifest`. For doing subtle things to memory.

* The GPU backends now handle up to 20 operators in a single fused
reduction.

* CUDA/HIP terminology for GPU concepts (e.g. "thread block") is now
used in all public interfaces. The OpenCL names are still supported
for backwards compatibility.

* More fusion across array slicing.

### Fixed

* Compatibility with CUDA versions prior to 12.

## [0.25.10]

### Added
4 changes: 1 addition & 3 deletions cabal.project
@@ -1,7 +1,5 @@
packages: futhark.cabal
index-state: 2023-11-22T20:04:53Z
index-state: 2024-01-14T18:48:23Z

package futhark
ghc-options: -j -fwrite-ide-info -hiedir=.hie

allow-newer: versions:base, versions:text, generic-lens:text, generic-lens-core:text
91 changes: 81 additions & 10 deletions docs/c-api.rst
@@ -304,9 +304,10 @@ arrays contain brackets, which are not valid in identifiers. Defining
a type abbreviation is the best way around this.
The API for opaque values is similar to that of arrays, and the same
rules for memory management apply. You cannot construct them from
scratch, but must obtain them via entry points (or deserialisation,
see :c:func:`futhark_restore_opaque_foo`).
rules for memory management apply. You cannot construct them from
scratch (unless they correspond to records or tuples, see
:ref:`records`), but must obtain them via entry points (or
deserialisation, see :c:func:`futhark_restore_opaque_foo`).
.. c:struct:: futhark_opaque_foo
@@ -355,13 +356,16 @@ see :c:func:`futhark_restore_opaque_foo`).
Futhark program, and compiled with the same version of the Futhark
compiler.
.. _records:
Records
~~~~~~~
A record is an opaque type (see above) that supports additional
functions to *project* individual fields (read their values) and to
construct a value given values for the fields. An opaque type is a
record if its definition is a record at the Futhark level.
construct a value given values for the fields. An opaque type is a
record if its definition is a record at the Futhark level. Note that a
tuple is simply a record with numeric fields.
The projection and construction functions are equivalent in
functionality to writing entry points by hand, and so serve only to
@@ -404,6 +408,61 @@ types.
The resulting value *aliases* the record, but has its own lifetime,
and must eventually be freed.
.. _sums:
Sums
~~~~
A sum type is an opaque type (see above) that supports construction
and destruction functions. An opaque type is a sum type if its
definition is a sum type at the Futhark level.
Similarly to records (see :ref:`Records`), this functionality is
equivalent to writing entry points by hand, and has the same
properties regarding lifetimes.
A sum type consists of one or more variants. A value of this type is
always an instance of one of these variants. In the C API, these
variants are numbered from zero. The numbering is given by the order
in which they are represented in the manifest (see :ref:`manifest`),
which is also the order in which their associated functions are
defined in the header file.
For an opaque sum type ``t``, the following function is always
generated.
.. c:function:: int futhark_variant_opaque_t(struct futhark_context *ctx, const struct futhark_opaque_t *v);
Return the identifying number of the variant of which the value ``v``
is an instance (see above). Cannot fail.
For each variant ``foo``, construction and destruction functions are
defined. The following assume ``t`` is defined as ``type t = #foo
([]i32) bool``.
.. c:function:: int futhark_new_opaque_t_foo(struct futhark_context *ctx, struct futhark_opaque_t **out, const struct futhark_i32_1d *v0, const bool v1);
Construct a value of type ``t`` that is an instance of the variant
``foo``. Arguments are provided in the same order as in the
Futhark-level ``foo`` constructor.
**Beware:** if ``t`` has size parameters that are only used for
*other* variants than the one that is being instantiated, those
size parameters will be set to 0. If this is a problem for your
application, define your own entry point for constructing a value
with the proper sizes.
.. c:function:: int futhark_destruct_opaque_t_foo(struct futhark_context *ctx, struct futhark_i32_1d **v0, bool *v1, const struct futhark_opaque_t *obj);
Extract the payload of variant ``foo`` from the sum value. Despite
the name, "destruction" does not free the sum type value. The
extracted values alias the sum value, but have their own lifetimes,
and must eventually be freed.
**Precondition:** ``obj`` must be an instance of the ``foo`` variant,
which can be determined with :c:func:`futhark_variant_opaque_t`.
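
As a rough illustration of the check-then-destruct pattern described above, the following C sketch tests the variant number before extracting the payload. It does not use a real generated header: ``struct ctx``, ``struct opaque_t``, ``variant_opaque_t``, ``new_opaque_t_foo``, ``destruct_opaque_t_foo`` and ``read_foo_payload`` are hypothetical stand-ins that only mimic the shape of the generated functions, and a plain ``int`` stands in for the ``[]i32`` payload. The numbering (``foo`` as 0, plus an invented second variant ``bar``) is also an assumption; in a real program it is given by the manifest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the generated context and opaque sum type. */
struct ctx { int unused; };
struct opaque_t { int variant; int v0; bool v1; };

/* Assumed variant numbering; the real one comes from the manifest. */
enum { VARIANT_FOO = 0, VARIANT_BAR = 1 };

/* Mimics futhark_variant_opaque_t: report which variant a value is. */
int variant_opaque_t(struct ctx *ctx, const struct opaque_t *v) {
  (void)ctx;
  return v->variant;
}

/* Mimics futhark_new_opaque_t_foo: returns 0 on success. */
int new_opaque_t_foo(struct ctx *ctx, struct opaque_t **out, int v0, bool v1) {
  (void)ctx;
  struct opaque_t *obj = malloc(sizeof *obj);
  if (obj == NULL) return 1;
  obj->variant = VARIANT_FOO;
  obj->v0 = v0;
  obj->v1 = v1;
  *out = obj;
  return 0;
}

/* Mimics futhark_destruct_opaque_t_foo: copies out the payload and
   does not free obj. */
int destruct_opaque_t_foo(struct ctx *ctx, int *v0, bool *v1,
                          const struct opaque_t *obj) {
  (void)ctx;
  assert(obj->variant == VARIANT_FOO); /* the documented precondition */
  *v0 = obj->v0;
  *v1 = obj->v1;
  return 0;
}

/* Caller-side pattern: check the variant first, then destruct.
   Returns 0 on success and -1 if obj is not an instance of foo. */
int read_foo_payload(struct ctx *ctx, const struct opaque_t *obj,
                     int *v0, bool *v1) {
  if (variant_opaque_t(ctx, obj) != VARIANT_FOO) return -1;
  return destruct_opaque_t_foo(ctx, v0, v1, obj);
}
```

With the real API, the extracted array would additionally have to be freed by the caller, since it has its own lifetime.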
Entry points
------------
@@ -461,13 +520,25 @@ Exotic
The following functions are not interesting to most users.
.. c:function:: void futhark_context_config_set_default_thread_block_size(struct futhark_context_config *cfg, int size)
Set the default number of threads in a thread block.
.. c:function:: void futhark_context_config_set_default_group_size(struct futhark_context_config *cfg, int size)
Set the default number of work-items in a work-group.
Identical to
:c:func:`futhark_context_config_set_default_thread_block_size`;
provided for backwards compatibility.
.. c:function:: void futhark_context_config_set_default_grid_size(struct futhark_context_config *cfg, int num)
Set the default number of thread blocks used for kernels.
.. c:function:: void futhark_context_config_set_default_num_groups(struct futhark_context_config *cfg, int num)
Set the default number of work-groups used for kernels.
Identical to
:c:func:`futhark_context_config_set_default_grid_size`;
provided for backwards compatibility.
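
One plausible way to keep an old name "identical" to a new one is a thin forwarding wrapper. The C sketch below shows that idea only; ``struct config`` and both ``config_set_...`` functions are hypothetical stand-ins, not the generated Futhark API, and the actual implementation may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real configuration struct. */
struct config { int default_thread_block_size; };

/* Preferred name, using CUDA/HIP terminology. */
void config_set_default_thread_block_size(struct config *cfg, int size) {
  assert(cfg != NULL);
  cfg->default_thread_block_size = size;
}

/* Older OpenCL-style name, kept as a wrapper that forwards to the
   preferred function so both always behave the same. */
void config_set_default_group_size(struct config *cfg, int size) {
  config_set_default_thread_block_size(cfg, size);
}
```

Because the old name only forwards, callers written against either name end up setting the same field.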
.. c:function:: void futhark_context_config_set_default_tile_size(struct futhark_context_config *cfg, int num)
@@ -610,9 +681,9 @@ whose entry points do not have unique parameter types
Manifest
--------
The C backends generate a machine-readable *manifest* in JSON format
that describes the API of the compiled Futhark program. Specifically,
the manifest contains:
When compiling with ``--library``, the C backends generate a
machine-readable *manifest* in JSON format that describes the API of
the compiled Futhark program. Specifically, the manifest contains:
* A mapping from the name of each entry point to:
58 changes: 33 additions & 25 deletions docs/language-reference.rst
@@ -26,6 +26,12 @@ a *documentation comment* and has special meaning to documentation
tools. Documentation comments are only allowed immediately before
declarations.

Trailing commas
---------------

All syntactical elements that involve comma-separated sequencing
permit an optional trailing comma.
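
To make the rule concrete, here is a toy C recognizer for the pattern ``"[" int ("," int)* [","] "]"``, which has the same shape as the productions that gain ``[","]`` in this change. It is not the Futhark parser: the function names are invented for illustration, and the elements are restricted to decimal integers separated by optional spaces.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Skip spaces and return the new position. */
static const char *skip_ws(const char *p) {
  while (*p == ' ') p++;
  return p;
}

/* Consume one or more digits; return NULL if none are present. */
static const char *parse_int(const char *p) {
  if (!isdigit((unsigned char)*p)) return NULL;
  while (isdigit((unsigned char)*p)) p++;
  return p;
}

/* Toy recognizer for: "[" int ("," int)* [","] "]" */
bool accepts_list(const char *s) {
  assert(s != NULL);
  const char *p = skip_ws(s);
  if (*p++ != '[') return false;
  p = parse_int(skip_ws(p));   /* the first element is mandatory */
  if (p == NULL) return false;
  p = skip_ws(p);
  while (*p == ',') {
    p = skip_ws(p + 1);
    if (*p == ']') break;      /* the optional trailing comma */
    p = parse_int(p);          /* otherwise another element must follow */
    if (p == NULL) return false;
    p = skip_ws(p);
  }
  if (*p++ != ']') return false;
  return *skip_ws(p) == '\0';
}
```

Under this grammar a trailing comma is accepted only after at least one element, so ``[1, 2,]`` is accepted while ``[,]`` and ``[1,,2]`` are rejected.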

Identifiers and Keywords
------------------------

@@ -142,17 +148,17 @@ definition has been hidden via the module system (see
:ref:`module-system`).

.. productionlist::
tuple_type: "(" ")" | "(" `type` ("," `type`)+ ")"
tuple_type: "(" ")" | "(" `type` ("," `type`)+ [","] ")"

A tuple value or type is written as a sequence of comma-separated
values or types enclosed in parentheses. For example, ``(0, 1)`` is a
tuple value of type ``(i32,i32)``. The elements of a tuple need not
values or types enclosed in parentheses. For example, ``(0, 1)`` is a
tuple value of type ``(i32,i32)``. The elements of a tuple need not
have the same type -- the value ``(false, 1, 2.0)`` is of type
``(bool, i32, f64)``. A tuple element can also be another tuple, as
in ``((1,2),(3,4))``, which is of type ``((i32,i32),(i32,i32))``. A
tuple cannot have just one element, but empty tuples are permitted,
although they are not very useful. Empty tuples are written ``()``
and are of type ``()``.
``(bool, i32, f64)``. A tuple element can also be another tuple, as in
``((1,2),(3,4))``, which is of type ``((i32,i32),(i32,i32))``. A tuple
cannot have just one element, but empty tuples are permitted, although
they are not very useful. Empty tuples are written ``()`` and are of
type ``()``.

.. productionlist::
array_type: "[" [`exp`] "]" `type`
@@ -199,12 +205,13 @@ value will be resident in memory. Avoid using sum types where
multiple constructors have large payloads.

.. productionlist::
record_type: "{" "}" | "{" `fieldid` ":" `type` ("," `fieldid` ":" `type`)* "}"
record_type: "{" "}" | "{" `fieldid` ":" `type` ("," `fieldid` ":" `type`)* [","] "}"

Records are mappings from field names to values, with the field names
known statically. A tuple behaves in all respects like a record with
numeric field names starting from zero, and vice versa. It is an
error for a record type to name the same field twice.
known statically. A tuple behaves in all respects like a record with
numeric field names starting from zero, and vice versa. It is an error
for a record type to name the same field twice. A trailing comma is
permitted.

.. productionlist::
type_application: `type` `type_arg` | "*" `type`
@@ -458,18 +465,18 @@ literals and variables, but also more complicated forms.
: | `charlit`
: | "(" ")"
: | "(" `exp` ")" ("." `fieldid`)*
: | "(" `exp` ("," `exp`)* ")"
: | "(" `exp` ("," `exp`)+ [","] ")"
: | "{" "}"
: | "{" `field` ("," `field`)* "}"
: | `qualname` "[" `index` ("," `index`)* "]"
: | "(" `exp` ")" "[" `index` ("," `index`)* "]"
: | "{" `field` ("," `field`)* [","] "}"
: | `qualname` `slice`
: | "(" `exp` ")" `slice`
: | `quals` "." "(" `exp` ")"
: | "[" `exp` ("," `exp`)* "]"
: | "[" `exp` ("," `exp`)* [","] "]"
: | "(" `qualsymbol` ")"
: | "(" `exp` `qualsymbol` ")"
: | "(" `qualsymbol` `exp` ")"
: | "(" ( "." `field` )+ ")"
: | "(" "." "[" `index` ("," `index`)* "]" ")"
: | "(" "." `slice` ")"
: | "???"
exp: `atom`
: | `exp` `qualsymbol` `exp`
@@ -484,16 +491,17 @@ literals and variables, but also more complicated forms.
: | `exp` [ ".." `exp` ] "..>" `exp`
: | "if" `exp` "then" `exp` "else" `exp`
: | "let" `size`* `pat` "=" `exp` "in" `exp`
: | "let" `name` "[" `index` ("," `index`)* "]" "=" `exp` "in" `exp`
: | "let" `name` `slice` "=" `exp` "in" `exp`
: | "let" `name` `type_param`* `pat`+ [":" `type`] "=" `exp` "in" `exp`
: | "(" "\" `pat`+ [":" `type`] "->" `exp` ")"
: | "loop" `pat` ["=" `exp`] `loopform` "do" `exp`
: | "#[" `attr` "]" `exp`
: | "unsafe" `exp`
: | "assert" `atom` `atom`
: | `exp` "with" "[" `index` ("," `index`)* "]" "=" `exp`
: | `exp` "with" `slice` "=" `exp`
: | `exp` "with" `fieldid` ("." `fieldid`)* "=" `exp`
: | "match" `exp` ("case" `pat` "->" `exp`)+
slice: "[" `index` ("," `index`)* [","] "]"
field: `fieldid` "=" `exp`
: | `name`
size : "[" `name` "]"
@@ -502,9 +510,9 @@ literals and variables, but also more complicated forms.
: | "_"
: | "(" ")"
: | "(" `pat` ")"
: | "(" `pat` ("," `pat`)+ ")"
: | "(" `pat` ("," `pat`)+ [","] ")"
: | "{" "}"
: | "{" `fieldid` ["=" `pat`] ("," `fieldid` ["=" `pat`])* "}"
: | "{" `fieldid` ["=" `pat`] ("," `fieldid` ["=" `pat`])* [","] "}"
: | `constructor` `pat`*
: | `pat` ":" `type`
: | "#[" `attr` "]" `pat`
@@ -1686,7 +1694,7 @@ Attributes
.. productionlist::
attr: `name`
: | `decimal`
: | `name` "(" [`attr` ("," `attr`)*] ")"
: | `name` "(" [`attr` ("," `attr`)* [","]] ")"

An expression, declaration, pattern, or module type spec can be
prefixed with an attribute, written as ``#[attr]``. This may affect
@@ -1745,13 +1753,13 @@ parallelism" version for the attributed SOACs.
``incremental_flattening(no_intra)``
....................................

When using incremental flattening, do not generate the "intra-group
When using incremental flattening, do not generate the "intra-block
parallelism" version for the attributed SOACs.

``incremental_flattening(only_intra)``
......................................

When using incremental flattening, *only* generate the "intra-group
When using incremental flattening, *only* generate the "intra-block
parallelism" version of the attributed SOACs. **Beware**: the
resulting program will fail to run if the inner parallelism does not
fit on the device.