BulkCopy in mssql-python - API Spec & Discussion
#414
Replies: 3 comments 9 replies
-
|
Re
I think we dont have a usecase for |
Beta Was this translation helpful? Give feedback.
-
|
I'd like to understand the reasoning behind hiding so much behind the |
Beta Was this translation helpful? Give feedback.
-
|
My use case is for ELT for fast inserts from parquet file. I agree with @amachanic about hiding so much parameters behind **kwargs. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
BulkCopy API Specification
We’re adding a BulkCopy API (
cursor.bulkcopy()) tomssql-pythonfor fast, high-throughput inserts into SQL Server.Due to all the interest, this discussion thread is here for us to understand from you, about your usage scenarios - feel free to comment your suggestions and propose improvements!
If you’re using (or planning to use) bulk copy, feel free to comment with:
batch_size,table_locketc.)Overview
The
cursor.bulkcopy()method provides bulk data loading to SQL Server, similar to ADO.NET'sSqlBulkCopy. It follows the standard DB-API cursor pattern.API Revisions
Revision 1 (kwargs dict) — Deprecated
Limitation: Poor IDE discoverability - users couldn't see available options without reading docs.
Revision 2 (explicit parameters) ✅ Current
Based on community feedback, all options are now explicit parameters for better IDE autocomplete and discoverability.
Changes from Revision 1
**kwargscolumn_mappingsList[Tuple[Union[str, int], str]]Union[List[str], List[Tuple[int, str]]]Usage Example
Parameters Reference
Required Parameters
table_namestr'dbo.MyTable')data_sourceIterator[Tuple]Performance Settings
batch_sizeintNone(server optimal)timeoutint300= no timeout)batch_sizeguidelines:1000–10000depending on row sizeColumn Mapping
column_mappingsUnion[List[str], List[Tuple[int, str]]]NoneTwo formats supported:
Simple Format:
List[str]Just list destination column names in order — position in list = source index.
Advanced Format:
List[Tuple[int, str]]Explicit index mapping — allows skipping or reordering columns.
When omitted: Columns are mapped by ordinal position (first source column → first table column, etc.)
Bulk Copy Options
keep_identityboolFalseSET IDENTITY_INSERT ON)check_constraintsboolFalsetable_lockboolFalsekeep_nullsboolFalseNULLvalues instead of applying column defaultsfire_triggersboolFalseINSERTtriggers during bulk copyuse_internal_transactionboolFalseReturn Value
Returns a dictionary containing operation statistics:
{ 'rows_copied': 10000, # Number of rows successfully copied 'batch_count': 10, # Number of batches processed 'elapsed_time': 1.5, # Total elapsed time (seconds) 'rows_per_second': 6666.67 # Throughput }Feedback
We're actively developing this API based on community feedback!
cc: @dlevy-msft-sql @saurabh500 @sumitmsft
Beta Was this translation helpful? Give feedback.
All reactions