To use DuckDB in Python, you can follow these steps to install the DuckDB library and perform basic operations such as creating a database, running queries, and manipulating data. Here’s a simple guide:
Step 1: Install DuckDB
You can install DuckDB using pip. Open your terminal or command prompt and run:
pip install duckdb
Example: Full Code
Here’s a complete example that incorporates all the steps:
import duckdb
# Step 1: Connect to an in-memory database
conn = duckdb.connect(database=':memory:')
# Step 2: Create a table
conn.execute("""
CREATE TABLE users (
id INTEGER,
name VARCHAR,
age INTEGER
)
""")
# Step 3: Insert data
conn.execute("""
INSERT INTO users VALUES
(1, 'Alice', 30),
(2, 'Bob', 25),
(3, 'Charlie', 35)
""")
# Step 4: Query data
result = conn.execute("SELECT * FROM users").fetchall()
# Print the results
for row in result:
print(row)
# Step 5: Close the connection
conn.close()
Additional Features
DuckDB also supports advanced features such as:
- Reading from CSV and Parquet files: You can load data directly from these formats using commands like
READ_CSV
orREAD_PARQUET
. - Integration with Pandas: You can easily convert DuckDB query results to Pandas DataFrames for further analysis.
Example of Reading from a CSV
# Load data from a CSV file into a DuckDB table
conn.execute("CREATE TABLE my_data AS SELECT * FROM read_csv_auto('path/to/your/file.csv')")