Supported File Types
Artifact supports the following file types for parsing documents and building catalogs (RAG):
- .pdf: Portable Document Format
- .doc: Microsoft Word 97-2003
- .docx: Microsoft Word 2007-2019
- .txt: Plain text
- .md: Markdown
- .html: HTML document
- .ppt: Microsoft PowerPoint 97-2003
- .pptx: Microsoft PowerPoint 2007-2019
- .xls: Microsoft Excel 97-2003
- .xlsx: Microsoft Excel 2007-2019
- .csv: Comma-separated values
Max file size: 512 MB per file
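As a rough illustration, the extension list and size limit above could be checked on the client before a file is submitted. This is a minimal sketch, not part of Artifact itself: the function name `check_upload` is hypothetical, matching extensions case-insensitively is an assumption, and so is reading "512 MB" as 512 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Extensions taken from the supported file types list.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".txt", ".md", ".html",
    ".ppt", ".pptx", ".xls", ".xlsx", ".csv",
}

# Assumption: 512 MB is interpreted as 512 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 512 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Assumption: extensions are compared case-insensitively.
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file exceeds the 512 MB limit")
    return problems
```

Returning a list of problems rather than a boolean lets a caller report both an unsupported type and an oversized file in one pass.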
Parsing Behavior
- All supported formats are converted into clean, structured Markdown for use with LLMs, except .txt files, which are extracted as plain text.
- Scanned and handwritten PDFs are supported with high-quality visual understanding.
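The output rule above (Markdown for everything except .txt) can be expressed as a small lookup, which may help when downstream code needs to know what kind of text to expect. This is only a sketch: `expected_output_format` and the labels `"markdown"` and `"plain text"` are hypothetical names chosen for illustration, not values returned by Artifact.

```python
from pathlib import Path

# Extensions named in the supported file types list.
PARSEABLE_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".txt", ".md", ".html",
    ".ppt", ".pptx", ".xls", ".xlsx", ".csv",
}


def expected_output_format(filename: str) -> str:
    """Return the kind of text a parsed file is expected to produce."""
    ext = Path(filename).suffix.lower()
    if ext not in PARSEABLE_EXTENSIONS:
        raise ValueError(f"unsupported file type: {filename}")
    # .txt is the one documented exception: it is extracted as plain text.
    return "plain text" if ext == ".txt" else "markdown"
```

For example, `expected_output_format("notes.txt")` yields `"plain text"`, while `expected_output_format("slides.pptx")` yields `"markdown"`.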