File Format
Upload files to Alviss AI in CSV or Excel format, following strict rules for separators, decimals, structure, headers, encoding, and data consistency.
Files uploaded to Alviss AI must adhere to a standardized format to ensure successful ingestion, validation, and processing. Supported formats include CSV and Excel (.xlsx), with specific rules for separators, decimals, and structure. This consistency prevents parsing errors and maintains data integrity across sources, enabling seamless dataset creation, modeling, and analysis.
Proper formatting is critical during Upload Data, as inconsistencies can lead to immediate notifications or failures. Always prepare files externally (e.g., via Excel, Python's pandas) to match these specifications.
Supported Formats
- CSV (Comma-Separated Values): Plain text files with comma-separated fields.
Excel may default to semicolons (;) or tabs in some regions—explicitly select comma when saving as CSV.
- Excel: .xlsx (standard Excel format).
In Excel, dates may auto-format based on your locale (e.g., MM/DD/YYYY vs. DD/MM/YYYY). Always export as text or force ISO 8601 (YYYY-MM-DD) to avoid corruption.
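As an illustration of forcing ISO 8601, the following minimal sketch converts date strings with Python's standard library. The helper name `to_iso_date` is hypothetical, and the sketch assumes you already know which format the source file uses, since a value such as 03/04/2024 cannot be disambiguated from the text alone.

```python
from datetime import datetime


def to_iso_date(value: str, source_format: str) -> str:
    """Convert a date string in a known source format to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(value.strip(), source_format).date().isoformat()


# The same text yields different dates depending on the assumed locale format.
print(to_iso_date("03/04/2024", "%d/%m/%Y"))  # 2024-04-03 (day first)
print(to_iso_date("03/04/2024", "%m/%d/%Y"))  # 2024-03-04 (month first)
```

A value that does not match the stated format raises `ValueError`, which is usually preferable to silently uploading a misread date.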
Data Requirements
- Separator (CSV Only): Must use a comma (,) as the field delimiter.
- Decimal Separator: Use a period (.) for decimal numbers (e.g., 3.14, not 3,14).
- Row Structure: Each row represents a single record or observation.
- Headers: The first row must contain column names (headers) exactly matching the required fields for the data type (e.g., "Country", "Date").
- Field Order: Data in each row must align with the header order—no mismatches or extra/missing columns.
- Missing Values: Represent as empty fields or a single placeholder (e.g., "NaN" or "null"), and use the same representation throughout the file.
- Encoding: Use UTF-8 to handle special characters, accents, or international text correctly.
- No Formulas or Formatting: Excel files should contain raw data only—no embedded formulas, conditional formatting, or macros.
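Several of the requirements above can be checked locally before uploading. The sketch below is one possible pre-flight check using only Python's standard library; it is not the platform's own validation. The function name `check_csv` and the header list `REQUIRED_HEADERS` are hypothetical, and the real required headers depend on the data type.

```python
import csv
import io
import re

# Hypothetical header list for illustration; use the one for your data type.
REQUIRED_HEADERS = ["Country", "Date", "Sales"]

# Matches values such as 3,14 that use a comma as the decimal mark.
DECIMAL_COMMA = re.compile(r"^-?\d+,\d+$")


def check_csv(text: str, required_headers: list[str]) -> list[str]:
    """Return a list of human-readable problems found in comma-delimited text."""
    rows = list(csv.reader(io.StringIO(text)))  # comma is the default delimiter
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0]
    if header != required_headers:
        problems.append(f"header mismatch: {header}")

    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} fields, got {len(row)}"
            )
        for value in row:
            if DECIMAL_COMMA.match(value):
                problems.append(f"row {row_no}: decimal comma in {value!r}")
    return problems


sample = 'Country,Date,Sales\nSE,2024-01-01,"3,14"\nNO,2024-01-01\n'
for problem in check_csv(sample, REQUIRED_HEADERS):
    print(problem)
```

For the sample text, the check reports a decimal comma in row 2 and a missing field in row 3. An unquoted decimal comma would instead show up as an extra field, because the parser treats it as a delimiter.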
Best Practices
- Preparation Tools: Use spreadsheet software like Excel or Google Sheets for initial setup, then export to CSV or save as .xlsx. For automation, leverage libraries like pandas in Python to enforce formats.
- File Size: For large datasets, prefer .xlsx for better handling; split very large files if needed, but ensure consistent headers across parts.
- Testing: Upload a small sample file first to verify formatting via platform notifications.
- International Considerations: Set your software's locale to US/English for exports to ensure comma separators and period decimals.
- Text Handling: Enclose text fields in quotes if they contain commas (e.g., "Product, Name") to prevent splitting in CSV.
- Date and Numeric Consistency: Check temporal fields against the Date and Periodicity specification, and keep date and number formats identical across all rows and files.
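For automated preparation, the practices above can be sketched with Python's built-in `csv` module, which quotes fields containing commas by default. The column names, sample rows, and the helper `to_csv_text` are illustrative assumptions, not a schema required by the platform.

```python
import csv
import io

# Illustrative data; None stands for a missing value.
rows = [
    {"Country": "SE", "Date": "2024-01-01", "Product": "High-End, Product", "Sales": 1234.5},
    {"Country": "NO", "Date": "2024-01-01", "Product": "Basic", "Sales": None},
]
fieldnames = ["Country", "Date", "Product", "Sales"]


def to_csv_text(rows, fieldnames):
    """Serialize rows as comma-delimited text with the header in the first row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Write missing values as empty fields, consistently for every column.
        writer.writerow({k: ("" if row[k] is None else row[k]) for k in fieldnames})
    return buffer.getvalue()


text = to_csv_text(rows, fieldnames)
print(text)
```

The product name containing a comma is written as `"High-End, Product"`, so it stays in one column when read back. To write the result to disk as UTF-8, open the target with `open(path, "w", encoding="utf-8", newline="")` and pass that file object to the writer instead of the in-memory buffer.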
Most Common Issues
- Date Formatting: In Excel, dates may auto-format based on your locale (e.g., MM/DD/YYYY vs. DD/MM/YYYY). Always export as text or force ISO 8601 (YYYY-MM-DD) to avoid corruption.
- CSV Separator: Excel may default to semicolons (;) or tabs in some regions—explicitly select comma when saving as CSV.
- Thousand Separators: Avoid them entirely (e.g., use 100000, not 100,000 or 100.000), as they can cause column splits or truncation (e.g., 100,000 becoming two fields, or 100.000 parsed as 100 in some locales).
- Text Fields with Commas: Commas in values (e.g., "High-End, Product") can split columns in CSV—enclose in double quotes or replace commas with alternatives like semicolons.
- Encoding Problems: Non-UTF-8 files may garble special characters—save explicitly as UTF-8.
- Header Mismatches: Typos or order changes in headers will block uploads—double-check against data type specs (e.g., Sales).
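The separator and thousand-separator issues often appear together in files exported under a European locale. The sketch below converts such an export to the required layout. It assumes the source uses semicolons as delimiters, a period as the thousand separator, and a comma as the decimal mark; the names `european_to_plain` and `convert_export` are hypothetical. The conversion is applied only to columns you name explicitly, because a value such as 100.000 is ambiguous without knowing the source convention.

```python
import csv
import io


def european_to_plain(value: str) -> str:
    """Assumes '.' is a thousand separator and ',' is the decimal mark."""
    return value.replace(".", "").replace(",", ".")


def convert_export(text: str, numeric_columns: list[str]) -> str:
    """Rewrite semicolon-delimited text as comma-delimited with period decimals."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for column in numeric_columns:
            row[column] = european_to_plain(row[column])
        writer.writerow(row)
    return out.getvalue()


source = "Country;Date;Sales\nSE;2024-01-01;1.234,56\nNO;2024-01-01;100.000\n"
print(convert_export(source, ["Sales"]))
```

With the sample input, the Sales values come out as 1234.56 and 100000, and the fields are separated by commas. When reading from disk, open the source with its actual encoding and write the result with `encoding="utf-8"` to cover the encoding issue at the same time.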
For troubleshooting uploads, see Upload Data. If issues persist, export the file to CSV for simplicity and contact support.