Data Connectors
Data Connectors provide connections to databases, data warehouses, and data lakes for federated SQL queries and data replication.
Supported Data Connectors include:
| Name | Description | Status | Protocol/Format |
|---|---|---|---|
| databricks (mode: delta_lake) | Databricks | Stable | S3/Delta Lake |
| delta_lake | Delta Lake | Stable | Delta Lake |
| dremio | Dremio | Stable | Arrow Flight |
| duckdb | DuckDB | Stable | Embedded |
| file | File | Stable | Parquet, CSV |
| github | GitHub | Stable | GitHub API |
| postgres | PostgreSQL | Stable | |
| s3 | S3 | Stable | Parquet, CSV |
| mysql | MySQL | Stable | |
| graphql | GraphQL | Release Candidate | JSON |
| spice.ai | Spice.ai | Release Candidate | Arrow Flight |
| databricks (mode: spark_connect) | Databricks | Beta | Spark Connect |
| flightsql | FlightSQL | Beta | Arrow Flight SQL |
| mssql | Microsoft SQL Server | Beta | Tabular Data Stream (TDS) |
| odbc | ODBC | Beta | ODBC |
| snowflake | Snowflake | Beta | Arrow |
| spark | Spark | Beta | Spark Connect |
| iceberg | Apache Iceberg | Beta | Parquet |
| abfs | Azure BlobFS | Alpha | Parquet, CSV |
| clickhouse | ClickHouse | Alpha | |
| debezium | Debezium CDC | Alpha | Kafka + JSON |
| dynamodb | DynamoDB | Alpha | |
| ftp, sftp | FTP/SFTP | Alpha | Parquet, CSV |
| http, https | HTTP(s) | Alpha | Parquet, CSV |
| localpod | Local dataset replication | Alpha | |
| sharepoint | Microsoft SharePoint | Alpha | Unstructured UTF-8 documents |
| elasticsearch | Elasticsearch | Roadmap | |
| mongodb | MongoDB | Roadmap | |
Object Store File Formats
For data connectors that are compatible with object stores, the file format must be specified with params.file_format when a folder is provided.
If a file is provided, the file format will be inferred, and params.file_format is unnecessary.
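The difference between the folder and single-file cases can be sketched with a short configuration fragment. This is a minimal illustration, assuming a Spicepod-style dataset definition with `from`, `name`, and `params` keys; the bucket, paths, and dataset names are hypothetical placeholders, not values from this documentation.

```yaml
datasets:
  # Folder path: the format cannot be inferred, so file_format is set explicitly.
  # (Bucket and folder names are hypothetical.)
  - from: s3://example-bucket/taxi_trips/
    name: taxi_trips
    params:
      file_format: parquet

  # Single file: the format is inferred, so file_format is omitted.
  # (File path is hypothetical.)
  - from: s3://example-bucket/reports/summary.csv
    name: summary
```

Setting the format explicitly for folders avoids ambiguity when a folder contains files of more than one type.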
File formats currently supported are:
| Name | Parameter | Supported | Is Document Format |
|---|---|---|---|
| Apache Parquet | file_format: parquet | ✅ | ❌ |
| CSV | file_format: csv | ✅ | ❌ |