python-duckdb
DuckDBPyConnection DuckDBPyRelation
duckdb.ExplainType duckdb.Expression duckdb.Statement duckdb.StatementType
_duckdb.typing.DuckDBPyType
duckdb
duckdb.read_csv()
duckdb.sql(sql_query).fetch_arrow_table()
duckdb.sql(sql_query).arrow() .df() .fetchall() .fetchnumpy()
duckdb.sql(sql_query).write_csv("out.csv") .write_parquet("out.parquet")
num_rows = duckdb.sql(f"select count(*) as num_rows from {self.sql_query_fragment()}").fetchone()[0]
duckdb.create_function()
duckdb.ColumnExpression
duckdb.StarExpression
duckdb.ConstantExpression
conn:DuckDBPyConnection
conn: duckdb.DuckDBPyConnection = None,
conn.sql(sql_query).fetch_arrow_table()
con.close()
con.table("test").show()
con.create_function()
duckdb.DuckDBPyRelation
Transformation
Relational API
sql() query() sql_query() from_csv_auto() read_csv() table() view()
Python DB API
con.execute() con.executemany()
#### To run DuckDB in parallel, each thread must have its own connection:
def good_use():
    con = duckdb.connect()
    # uses new connection
    con.sql("SELECT 1").fetchall()
DuckDBPyConnection.cursor() method
DuckDBPyConnection.register() function.
rel.types
typing
DuckDBPyType
duckdb.sqltypes.DuckDBPyType
duckdb.typing.DuckDBPyType
from duckdb.sqltypes import VARCHAR
from duckdb.sqltypes import BIGINT
Python object types to DuckDB Logical Types:
str → VARCHAR
bool → BOOLEAN
bytearray → BLOB
memoryview → BLOB
uuid.UUID → UUID
None → NULL
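int → BIGINT, INTEGER, UBIGINT, UINTEGER or DOUBLE (depends on the size of the value)
float → DOUBLE or FLOAT
bytes → BLOB by default, BITSTRING when cast to BIT
list → LIST (for example VARCHAR[])
dict → STRUCT(...) or MAP(..., ...)
tuple → LIST by default, STRUCT when cast to STRUCT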
datetime.date → DATE
datetime.time → TIME or TIMETZ (depends on tzinfo)
datetime.datetime → TIMESTAMP or TIMESTAMPTZ (depends on tzinfo)
datetime.timedelta → INTERVAL
decimal.Decimal → DECIMAL / DOUBLE
NumPy: fetchnumpy()
Pandas: df(), fetch_df(), fetchdf(), fetch_df_chunk(vector_multiple)
Apache Arrow: arrow(), fetch_arrow_table(), fetch_record_batch(chunk_size)
Polars: pl()
Loading extensions
Imperative style
con = duckdb.connect()
con.install_extension("h3", repository="community")
con.load_extension("h3")
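### Class-based approach
# UDFContext is not defined in these notes; it is assumed to come from the surrounding project.
class DuckDbExtensionContext(UDFContext):
    def __init__(self, name: str, extension_path: str) -> None:
        self.name = name
        self.extension_path = extension_path

    def __str__(self) -> str:
        return f"{self.name}@{self.extension_path}"

    __repr__ = __str__

    def bind(self, conn: duckdb.DuckDBPyConnection):
        # Load the extension from its path into the given connection.
        conn.load_extension(self.extension_path)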
References
https://duckdb.org/docs/stable/clients/python/reference/
https://github.com/duckdb/duckdb/tree/main
https://github.com/duckdb/duckdb-python/tree/main/src/duckdb_py