feat: csv to document row level conversion #8916

mdrazak2001 · 2025-02-24T18:37:41Z

Related Issues

Enhance functionality to split CSV files by rows and convert each row into a separate document.

Proposed Changes:

Enhance the CSVToDocument component to support row-level conversion.
- Adds a 'split_by_row' parameter to convert each row of a CSV file into a separate Haystack Document.
- Retains the header row (field names) as the first line of the 'content' in each row-level Document.
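
As a rough illustration of the behavior described above, here is a minimal sketch using only the standard library. It is not the PR's implementation: the function name `split_csv_by_row` is hypothetical, and a plain dict stands in for a Haystack Document.

```python
import csv
import io


def split_csv_by_row(csv_text: str) -> list[dict]:
    """Return one document-like dict per data row.

    Hypothetical helper for illustration only. Each dict's 'content'
    starts with the header line, followed by that single row.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, data_rows = rows[0], rows[1:]

    documents = []
    for row in data_rows:
        # Re-serialize with csv.writer so quoting and escaping stay valid CSV.
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(header)
        writer.writerow(row)
        documents.append({"content": buffer.getvalue()})
    return documents


docs = split_csv_by_row("name,age\nAlice,30\nBob,25\n")
for doc in docs:
    print(repr(doc["content"]))
```

Repeating the header in every row-level document keeps each one interpretable on its own, at the cost of some duplicated text per document.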

How did you test it?

Added a unit test to the existing test_csv_todocument.py.

Notes for the reviewer

Checklist

I have read the contributors guidelines and the code of conduct
I have updated the related issue with new insights and changes
I added unit tests and updated the docstrings
I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
I documented my code
I ran pre-commit hooks and fixed any issue

CLAassistant · 2025-02-24T18:37:54Z

All committers have signed the CLA.

haystack/components/converters/csv.py

Amnah199

@mdrazak2001 Thanks for the contribution. I have requested some small changes, otherwise the PR looks good.

coveralls · 2025-02-28T10:17:15Z

Pull Request Test Coverage Report for Build 13658995133

Details

0 of 0 changed or added relevant lines in 0 files are covered.
3 unchanged lines in 1 file lost coverage.
Overall coverage decreased (-0.01%) to 90.206%

Files with Coverage Reduction	New Missed Lines	%
components/converters/csv.py	3	94.74%

Totals
Change from base Build 13652854933:	-0.01%
Covered Lines:	9616
Relevant Lines:	10660

💛 - Coveralls

Amnah199 · 2025-03-06T12:32:34Z

@mdrazak2001 We discussed this feature internally with the team and decided to move it to CSVDocumentSplitter, as it is a more suitable place for it.

For the linked issue, we plan to implement it in a way that provides a conversion feature:

- Map one column to the document content.
- Store all other fields as metadata, where the column name serves as the key and the column value as the corresponding value.

I'll update this PR to move the feature to CSVDocumentSplitter.
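
The conversion plan in the comment above (one column becomes the content, all other columns become metadata) could look roughly like the sketch below. This is only an illustration under stated assumptions: `rows_to_documents` and its `content_column` parameter are hypothetical names, not part of Haystack, and a plain dict stands in for a Document.

```python
import csv
import io


def rows_to_documents(csv_text: str, content_column: str) -> list[dict]:
    """Turn each CSV row into a document-like dict.

    Hypothetical helper for illustration only. The value of
    `content_column` becomes 'content'; every other column is stored
    in 'meta' with the column name as key.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or content_column not in reader.fieldnames:
        raise ValueError(f"Column {content_column!r} not found in CSV header")

    documents = []
    for row in reader:
        # Remove the content column so only the remaining fields end up in meta.
        content = row.pop(content_column)
        documents.append({"content": content, "meta": dict(row)})
    return documents


sample = "id,title,body\n1,Intro,Hello world\n2,Usage,Run the pipeline\n"
for doc in rows_to_documents(sample, content_column="body"):
    print(doc)
```

All values stay strings here because the `csv` module does no type inference; a real implementation would need to decide whether to convert metadata values.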

mdrazak2001 added 2 commits February 24, 2025 22:28

feat: Enhance CSVToDocument to support row-level conversion

0255b78

add test + release notes

edb931d

mdrazak2001 requested review from a team as code owners February 24, 2025 18:37

mdrazak2001 requested review from dfokina and anakin87 and removed request for a team February 24, 2025 18:37

github-actions bot added topic:tests type:documentation Improvements on the docs labels Feb 24, 2025

mdrazak2001 changed the title ~~Feat/csv to document row level conversion~~ Feat: csv to document row level conversion Feb 25, 2025

mdrazak2001 changed the title ~~Feat: csv to document row level conversion~~ feat: csv to document row level conversion Feb 25, 2025

julian-risch requested review from mpangrazzi and Amnah199 and removed request for anakin87 and mpangrazzi February 26, 2025 15:37

Amnah199 reviewed Feb 28, 2025

haystack/components/converters/csv.py (outdated, resolved)

Amnah199 reviewed Feb 28, 2025

haystack/components/converters/csv.py (outdated, resolved)

Amnah199 reviewed Feb 28, 2025

mdrazak2001 added 2 commits February 28, 2025 22:35

resolve review comments

88d3411

refactor

32116ba

mdrazak2001 requested a review from Amnah199 February 28, 2025 17:41

Amnah199 added 6 commits March 3, 2025 22:05

Merge branch 'main' into feat/csv-to-document-row-level-conversion

fbed234

Add fix for include_usage completion chunk

682da59

Revert "Add fix for include_usage completion chunk"

c9ff9a0

This reverts commit 682da59.

Reapply "Add fix for include_usage completion chunk"

85233bf

This reverts commit c9ff9a0.

reverting accidental commit

093efe6

Merge branch 'main' into feat/csv-to-document-row-level-conversion

47991bd
