
Commit b00f4e0

Jefffrey, tustvold, and alamb authored
Update docs for datatypes (#5260)
* Update docs for datatypes
* Update docs
* Fix reinterpret_cast doc
* Update arrow-schema/src/datatype.rs

Co-authored-by: Andrew Lamb <[email protected]>

---------

Co-authored-by: Raphael Taylor-Davies <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
1 parent c578570 commit b00f4e0

2 files changed, +22 -17 lines changed


arrow-array/src/array/primitive_array.rs

+2 -3
@@ -713,9 +713,8 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
 /// the semantic values of the array, e.g. 100 milliseconds in a [`TimestampNanosecondArray`]
 /// will become 100 seconds in a [`TimestampSecondArray`].
 ///
-/// For casts that preserve the semantic value, check out the [compute kernels]
-///
-/// [compute kernels](https://docs.rs/arrow/latest/arrow/compute/kernels/cast/index.html)
+/// For casts that preserve the semantic value, check out the
+/// [compute kernels](https://docs.rs/arrow/latest/arrow/compute/kernels/cast/index.html).
 ///
 /// ```
 /// # use arrow_array::{Int64Array, TimestampNanosecondArray};
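
The doc comment in this hunk contrasts reinterpreting an array with a cast that preserves the semantic value. The sketch below illustrates that difference on a single raw `i64`, using only the standard library. The function names are hypothetical and are not part of the arrow crates, and the floor division is an assumption made for illustration, not a statement about how the cast kernels round.

```rust
/// Nanoseconds per second, used for the value-preserving conversion.
const NANOS_PER_SECOND: i64 = 1_000_000_000;

/// Reinterpreting keeps the raw integer and only changes the unit label,
/// so a raw 100 read as nanoseconds is read as 100 seconds afterwards.
/// (Hypothetical helper, not an arrow API.)
fn reinterpret_as_seconds(raw_nanos: i64) -> i64 {
    raw_nanos
}

/// A value-preserving cast rescales the raw integer to the new unit.
/// Floor division is an assumption chosen for this sketch.
/// (Hypothetical helper, not an arrow API.)
fn cast_nanos_to_seconds(raw_nanos: i64) -> i64 {
    raw_nanos.div_euclid(NANOS_PER_SECOND)
}
```

With these helpers, `reinterpret_as_seconds(100)` returns the unchanged raw value 100, while `cast_nanos_to_seconds(2_500_000_000)` rescales to 2.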

arrow-schema/src/datatype.rs

+20 -14
@@ -23,14 +23,18 @@ use crate::{Field, FieldRef, Fields, UnionFields};
 /// The set of datatypes that are supported by this implementation of Apache Arrow.
 ///
 /// The Arrow specification on data types includes some more types.
-/// See also [`Schema.fbs`](https://github.com/apache/arrow/blob/master/format/Schema.fbs)
+/// See also [`Schema.fbs`](https://github.com/apache/arrow/blob/main/format/Schema.fbs)
 /// for Arrow's specification.
 ///
 /// The variants of this enum include primitive fixed size types as well as parametric or
 /// nested types.
-/// Currently the Rust implementation supports the following nested types:
+/// Currently the Rust implementation supports the following nested types:
 /// - `List<T>`
+/// - `LargeList<T>`
+/// - `FixedSizeList<T>`
 /// - `Struct<T, U, V, ...>`
+/// - `Union<T, U, V, ...>`
+/// - `Map<K, V>`
 ///
 /// Nested types can themselves be nested within other arrays.
 /// For more information on these types please see
@@ -68,7 +72,7 @@ pub enum DataType {
 ///
 /// Time is measured as a Unix epoch, counting the seconds from
 /// 00:00:00.000 on 1 January 1970, excluding leap seconds,
-/// as a 64-bit integer.
+/// as a signed 64-bit integer.
 ///
 /// The time zone is a string indicating the name of a time zone, one of:
 ///
@@ -140,15 +144,17 @@ pub enum DataType {
 /// DataType::Timestamp(TimeUnit::Second, Some("string".to_string().into()));
 /// ```
 Timestamp(TimeUnit, Option<Arc<str>>),
-/// A 32-bit date representing the elapsed time since UNIX epoch (1970-01-01)
+/// A signed 32-bit date representing the elapsed time since UNIX epoch (1970-01-01)
 /// in days (32 bits).
 Date32,
-/// A 64-bit date representing the elapsed time since UNIX epoch (1970-01-01)
+/// A signed 64-bit date representing the elapsed time since UNIX epoch (1970-01-01)
 /// in milliseconds (64 bits). Values are evenly divisible by 86400000.
 Date64,
-/// A 32-bit time representing the elapsed time since midnight in the unit of `TimeUnit`.
+/// A signed 32-bit time representing the elapsed time since midnight in the unit of `TimeUnit`.
+/// Must be either seconds or milliseconds.
 Time32(TimeUnit),
-/// A 64-bit time representing the elapsed time since midnight in the unit of `TimeUnit`.
+/// A signed 64-bit time representing the elapsed time since midnight in the unit of `TimeUnit`.
+/// Must be either microseconds or nanoseconds.
 Time64(TimeUnit),
 /// Measure of elapsed time in either seconds, milliseconds, microseconds or nanoseconds.
 Duration(TimeUnit),
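
The doc comments in this hunk state three constraints: `Date32` counts signed days since the epoch, `Date64` values are evenly divisible by 86400000, and `Time32` / `Time64` accept only certain units. The sketch below restates those rules as plain checks using only the standard library. The `Unit` enum and every function name here are hypothetical stand-ins written for this illustration. They are not the `TimeUnit` type or any API from arrow-schema.

```rust
/// Milliseconds in one day: the divisor named in the `Date64` doc comment.
const MILLIS_PER_DAY: i64 = 86_400_000;

/// Converts a signed day count (as in `Date32`) to milliseconds (as in `Date64`).
/// Negative inputs represent dates before 1970-01-01.
fn date32_to_date64(days: i32) -> i64 {
    i64::from(days) * MILLIS_PER_DAY
}

/// Checks the "evenly divisible by 86400000" rule stated for `Date64`.
fn is_valid_date64(millis: i64) -> bool {
    millis % MILLIS_PER_DAY == 0
}

/// Hypothetical stand-in for the four time units, defined only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Unit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// `Time32` must be either seconds or milliseconds.
fn valid_for_time32(unit: Unit) -> bool {
    matches!(unit, Unit::Second | Unit::Millisecond)
}

/// `Time64` must be either microseconds or nanoseconds.
fn valid_for_time64(unit: Unit) -> bool {
    matches!(unit, Unit::Microsecond | Unit::Nanosecond)
}
```

Any value produced by `date32_to_date64` satisfies `is_valid_date64`, which is one way to read the divisibility rule: a `Date64` carries whole days expressed in milliseconds.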
@@ -159,35 +165,35 @@ pub enum DataType {
 /// Opaque binary data of variable length.
 ///
 /// A single Binary array can store up to [`i32::MAX`] bytes
-/// of binary data in total
+/// of binary data in total.
 Binary,
 /// Opaque binary data of fixed size.
 /// Enum parameter specifies the number of bytes per value.
 FixedSizeBinary(i32),
 /// Opaque binary data of variable length and 64-bit offsets.
 ///
 /// A single LargeBinary array can store up to [`i64::MAX`] bytes
-/// of binary data in total
+/// of binary data in total.
 LargeBinary,
-/// A variable-length string in Unicode with UTF-8 encoding
+/// A variable-length string in Unicode with UTF-8 encoding.
 ///
 /// A single Utf8 array can store up to [`i32::MAX`] bytes
-/// of string data in total
+/// of string data in total.
 Utf8,
 /// A variable-length string in Unicode with UTF-8 encoding and 64-bit offsets.
 ///
 /// A single LargeUtf8 array can store up to [`i64::MAX`] bytes
-/// of string data in total
+/// of string data in total.
 LargeUtf8,
 /// A list of some logical data type with variable length.
 ///
-/// A single List array can store up to [`i32::MAX`] elements in total
+/// A single List array can store up to [`i32::MAX`] elements in total.
 List(FieldRef),
 /// A list of some logical data type with fixed length.
 FixedSizeList(FieldRef, i32),
 /// A list of some logical data type with variable length and 64-bit offsets.
 ///
-/// A single LargeList array can store up to [`i64::MAX`] elements in total
+/// A single LargeList array can store up to [`i64::MAX`] elements in total.
 LargeList(FieldRef),
 /// A nested datatype that contains a number of sub-fields.
 Struct(Fields),
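
The `i32::MAX` and `i64::MAX` limits in these doc comments follow from the width of the offsets that mark where each value ends. The sketch below shows, under that reading, why a running total stored in an `i32` caps the total byte count. It uses only the standard library, and `build_i32_offsets` is a hypothetical helper written for this illustration, not a function from the arrow crates.

```rust
use std::convert::TryFrom;

/// Builds 32-bit offsets for a sequence of byte strings.
/// Returns `None` if the running total would exceed `i32::MAX`.
/// (Hypothetical helper, not an arrow API.)
fn build_i32_offsets(values: &[&[u8]]) -> Option<Vec<i32>> {
    let mut offsets = Vec::with_capacity(values.len() + 1);
    let mut end: i32 = 0;
    // The first offset is always zero; each later offset is the end of a value.
    offsets.push(end);
    for value in values {
        let len = i32::try_from(value.len()).ok()?;
        end = end.checked_add(len)?;
        offsets.push(end);
    }
    Some(offsets)
}
```

For the inputs `"ab"`, `""`, and `"cde"` the helper yields the offsets 0, 2, 2, 5. A variant using `i64` in place of `i32` would raise the ceiling in the same way the `Large*` types are described as doing.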
