- Add `--split-queries` option to `export-pg-from-queries`. When invoked, `range()` steps will be injected at the beginning of queries, and the queries will be split according to the configured concurrency.
- Add support for `neptune_ml` profile to `export-pg-from-queries`.
- Use Graph Store Protocol for complete RDF graph exports, improving performance for large exports.
- Resolves issue where certain special characters would cause RDF export jobs to fail.
- Resolves issue in which RDF outputs may contain unexpected and potentially faulty prefixes.
- Introduce `--structured-output` CLI option to `export-pg-from-queries`. This option, when used in conjunction with `--format csv`, will produce CSV output matching the structure of the Neptune bulk loader's Gremlin data format. This is the same format as produced by `export-pg --format csv`. Use of this option requires that queries produce `elementMap()` results for nodes and edges.
- Add `--filter-edges-early` option to property graph exports. This option forces `gremlinFilters` to apply before the `range()` step which breaks up concurrent traversals. This may lead to improved performance in cases where the `gremlinFilters` are efficient and filter out the majority of edges.
- Cross Account Exports: New CLI options and parameters added to specify a role to assume when uploading to Amazon S3 buckets or Amazon Kinesis Data Streams
- New --credentials-profile CLI option to fetch AWS Credentials from non-default AWS CLI profiles
- Fixed bug which could lead to corrupted output in highly concurrent CSV exports
- Resolves issue which caused the error `gremlin-groovy is not an available GremlinScriptEngine` to appear when using the `--gremlin-filter` option in the uber jar.
- Upgraded Gremlin dependency version to 3.6.2
- Fixed NullPointerException bug during RDF Exports
- Added aws-java-sdk-sts as a dependency to enable WebIdentityTokenFileCredentialsProvider.
- Fixed bug which was preventing getting the output id in FileToStreamOutputWriter
- Updated NOTICE file in shaded jar
- Added new `--disable-stream-aggregation` option for property graph exports to Kinesis streams. More details can be found here.
- Improved error messages from server side errors (such as timeout exceptions) for RDF exports.
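
The `range()`-based splitting mentioned in the `--split-queries` and `--filter-edges-early` entries above can be pictured with a small sketch. This is an illustration only, not the tool's implementation: the notes do not say how Neptune Export computes its ranges or exactly where in a query it injects the step. The function names `split_ranges` and `inject_range`, the item count of 10, and the concurrency of 4 are all hypothetical.

```python
def split_ranges(total, concurrency):
    """Divide `total` items into at most `concurrency` contiguous (start, end) ranges.

    Hypothetical helper: `end` is exclusive, matching Gremlin's range(low, high).
    """
    if total < 0 or concurrency < 1:
        raise ValueError("total must be >= 0 and concurrency must be >= 1")
    base, extra = divmod(total, concurrency)
    ranges = []
    start = 0
    for i in range(concurrency):
        # The first `extra` workers each take one additional item.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        ranges.append((start, start + size))
        start += size
    return ranges


def inject_range(query, start, end, prefix="g.V()"):
    """Insert a range() step directly after the leading step of a query string.

    Assumption: the query starts with `prefix`; the real tool may inject elsewhere.
    """
    if not query.startswith(prefix):
        raise ValueError(f"query must start with {prefix}")
    return f"{prefix}.range({start}, {end}){query[len(prefix):]}"


query = "g.V().hasLabel('person').elementMap()"
for start, end in split_ranges(10, 4):
    print(inject_range(query, start, end))
```

Each printed query covers a disjoint slice of the traversal, so the slices could be run concurrently. In this sketch the `range()` step comes before `hasLabel(...)`, so the filter is applied to each slice after splitting. Per the entry above, `--filter-edges-early` reverses that order, which helps when the filters discard most edges.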
Neptune Export is a tool for performing bulk data exports from Amazon Neptune. Neptune Export has been migrated from the AWS Labs Amazon Neptune Tools repository, and the old module is now deprecated. In this release, the release artifact `neptune-export.jar` has been renamed to `neptune-export-1.0.0-all.jar`. Going forward, Neptune Export will follow this new versioned naming scheme.
Instructions for running export jobs can be found in the docs/ directory.
A few changes are included since the migration.
- Corrected r6g instance type prefix (used to be listed as r6d).
- Added a new optional parameter to use customer managed KMS key for S3 server-side encryption.
- Added integration tests for developers requiring manual setup (see docs/dev/IntegrationTests.md).