Skip to content

Add Vector Text Splitting Options to the UI #1106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

GSrini92
Copy link

Description

Changes Made

How to Test

  1. Steps to reproduce/test the behavior
  2. Expected outcomes

Notes

@GSrini92 GSrini92 requested a review from a team as a code owner May 12, 2025 14:16
Copy link

@CodiumAI-Agent /describe

@CodiumAI-Agent
Copy link

Title

Add Vector Text Splitting Options to the UI


User description

Description

Changes Made

How to Test

  1. Steps to reproduce/test the behavior
  2. Expected outcomes

Notes


PR Type

Enhancement


Description

  • Add split option dropdown to embedding UI

  • Pass selected split option to embedding API

  • Support split options in import form

  • Add split options config in constants


Changes walkthrough 📝

Relevant files
Enhancement
FileTable.tsx
Add split options dropdown and state                                         

packages/client/src/components/settings/FileTable.tsx

  • Introduce splittingOptions array and state
  • Add Select dropdown above FileDropzone
  • Include splitOptions in embedding query
  • +62/-18 
    ImportConnectionPage.tsx
    Handle split options in import form                                           

    packages/client/src/pages/import/ImportConnectionPage.tsx

  • Define VectorDatabaseFields with SPLITTING_OPTIONS
  • Extract and filter out split options for engine creation
  • Append splitOptions in secondary embedding query
  • +20/-3   
    Configuration changes
    import.constants.ts
    Add splitting options to connection constants                       

    packages/client/src/pages/import/import.constants.ts

  • Add SPLITTING_OPTIONS select config in CONNECTION_OPTIONS
  • Provide options: Tokens, Recursive, Semantic
  • +182/-0 

    Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • Copy link

    @CodiumAI-Agent /review

    Copy link

    @CodiumAI-Agent /improve

    @CodiumAI-Agent
    Copy link

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Missing Import

    The component uses useState for selectedSplitOptions but useState is not imported, which will cause a runtime error.

    const [selectedSplitOptions, setSelectedSplitOptions] = useState('Tokens');
    DRY Violation

    The SPLITTING_OPTIONS block is duplicated multiple times. Consider extracting it to a shared constant to avoid repetition and ease maintenance.

    {
        fieldName: 'SPLITTING_OPTIONS',
        label: 'Splitting Options',
        defaultValue: 'Tokens',
        options: {
            component: 'select',
            options: [
                {
                    display: 'Tokens',
                    value: 'Tokens',
                },
                {
                    display: 'Recursive',
                    value: 'Recursive',
                },
                {
                    display: 'Semantic',
                    value: 'Semantic',
                },
            ],
        },
        disabled: false,
        rules: { required: false },
        advanced: true,
        helperText: '',
    },
    UX Text

    The select label reads "Select a Split Options" which is grammatically incorrect. It should be e.g. "Select Splitting Option".

    label="Select a Split Options"
    value={selectedSplitOptions}

    @CodiumAI-Agent
    Copy link

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Add default split option

    Provide a default fallback for SPLITTING_OPTIONS to avoid passing undefined into the
    query. This ensures the backend always receives a valid split option.

    packages/client/src/pages/import/ImportConnectionPage.tsx [130-167]

     const { SPLITTING_OPTIONS, ...filteredFields } = values.fields;
    +const splitOption = SPLITTING_OPTIONS || 'Tokens';
     const secondaryPixel = `CreateEmbeddingsFromDocuments(
         engine="${output.database_id}", 
         filePaths=["${upload[0].fileLocation}"],
    -    splitOptions="${SPLITTING_OPTIONS}"
    +    splitOptions="${splitOption}"
     );`;
    Suggestion importance[1-10]: 7

    __

    Why: Providing a fallback for SPLITTING_OPTIONS prevents passing undefined into the query, avoiding potential runtime errors in document embedding.

    Medium
    General
    Add FormControl and InputLabel

    Wrap the in a with an and use labelId for accessibility and to render the label properly. packages/client/src/components/settings/FileTable.tsx [639-654] -<Select - sx={{ mb: 2, '& .MuiInputBase-input': { padding: '15px' } }} - size="small" - fullWidth - label="Select a Split Options" - value={selectedSplitOptions} - onChange={...} -> +<FormControl fullWidth size="small" sx={{ mb: 2 }}> + <InputLabel id="split-option-label">Split Option</InputLabel> + <Select + labelId="split-option-label" + id="split-option-select" + label="Split Option" + value={selectedSplitOptions} + onChange={(e) => setSelectedSplitOptions(e.target.value as string)} + sx={{ '& .MuiInputBase-input': { padding: '15px' } }} + > + {splittingOptions.map(option => ( + <MenuItem key={option.value} value={option.value}> + {option.label} + </MenuItem> + ))} + </Select> +</FormControl> Suggestion importance[1-10]: 6 __ Why: Wrapping the Select in FormControl with an InputLabel improves accessibility and proper label rendering without altering core functionality.

    Low
    Cast event value to string

    Cast the event value to string to satisfy the state setter signature and avoid
    implicit any. This ensures TypeScript knows the correct type.

    packages/client/src/components/settings/FileTable.tsx [650-654]

     onChange={(e) =>
         setSelectedSplitOptions(
    -        e.target.value,
    +        e.target.value as string,
         )
     }
    Suggestion importance[1-10]: 5

    __

    Why: The cast ensures TypeScript knows the type for setSelectedSplitOptions, preventing an implicit any while preserving existing behavior.

    Low
    DRY up splitting options

    Extract the repeated splitting options array into a shared constant to reduce
    duplication and ease future maintenance.

    packages/client/src/pages/import/import.constants.ts [6749-6765]

    +const splittingOptionsList = [
    +  { display: 'Tokens', value: 'Tokens' },
    +  { display: 'Recursive', value: 'Recursive' },
    +  { display: 'Semantic', value: 'Semantic' },
    +];
    +// ...
     options: {
       component: 'select',
    -  options: [
    -    { display: 'Tokens', value: 'Tokens' },
    -    { display: 'Recursive', value: 'Recursive' },
    -    { display: 'Semantic', value: 'Semantic' },
    -  ],
    +  options: splittingOptionsList,
     }
    Suggestion importance[1-10]: 5

    __

    Why: Extracting the repeated splitting options into a shared constant reduces duplication and simplifies future maintenance with minimal risk.

    Low

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    [FE] Add Vector Text Splitting Options to the UI
    3 participants