How does one convert a CursorResult object into a pandas DataFrame?
The following code produces a CursorResult object:
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
engine = create_engine(f"mssql+pyodbc://{db_server}/{db_name}?trusted_connection=yes&driver={db_driver}")
q1 = "SELECT * FROM my_schema.my_table"
with Session(engine) as session:
    results = session.execute(q1)
    session.commit()
type(results)
>sqlalchemy.engine.cursor.CursorResult
Since I couldn't find a way to extract the relevant information from the CursorResult, I tried the following instead:
# Extracting data as we go
with Session(engine) as session:
    results = session.execute(q1)
    description = results.cursor.description
    rows = results.all()
    session.commit()
# Extracting column names
colnames = [elem[0] for elem in description]
# Extracting types
types = [elem[1] for elem in description]
# Creating dataframe
import pandas as pd
pd.DataFrame(rows, columns=colnames)
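But what about the dtypes? If I just pass them in, it doesn't work, even though they all appear to be Python types. For my use case I MUST use a Session, so I cannot use the classic first suggestion: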
# I cannot use
pandas.read_sql(q1, engine)
The reason is that I have to run multi-batch queries within the same context, which is why I am using the Session class.
Answer
IIUC, just use the pd.DataFrame constructor. The dtypes are set correctly.
# sqlalchemy==2.0.16
# pandas==2.0.2
from sqlalchemy.sql import text
with Session(engine) as session:
    results = session.execute(text(q1))
    df = pd.DataFrame(results)
    # session.commit()  # commit is irrelevant if you don't write data
Tested on my database:
>>> df.head()
                    Scenario  Attribute      Process  Period  Region  Vintage            PV
0  WithHHP16HinsHE0CCS109LHP    VAR_Cap  EVTRANS_H-L    2014      FR     None    296.071141
1  WithHHP16HinsHE0CCS109LHP    VAR_Cap  EVTRANS_H-M    2014      FR     None     11.770909
2  WithHHP16HinsHE0CCS109LHP    VAR_Cap   IMPELCHIGA    2014      FR     None  11851.674497
3  WithHHP16HinsHE0CCS109LHP    VAR_Cap  EVTRANS_H-L    2015      FR     None    296.071141
4  WithHHP16HinsHE0CCS109LHP    VAR_Cap  EVTRANS_H-M    2015      FR     None     11.770909
>>> df.dtypes
Scenario      object
Attribute     object
Process       object
Period         int64
Region        object
Vintage       object
PV           float64
dtype: object
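To try this without access to a real server, here is a minimal sketch that runs two queries inside one Session and turns each result into a DataFrame. The in-memory SQLite engine, the `demo` table and the `query_to_df` helper are assumptions made up for illustration; they are not part of the question's setup. The column names are passed explicitly via `result.keys()` so the sketch does not rely on pandas picking them up from the rows.

```python
import pandas as pd
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session

# Assumption: an in-memory SQLite database stands in for the real server.
engine = create_engine("sqlite:///:memory:")


def query_to_df(session, sql):
    """Hypothetical helper: run one query in the given session, return a DataFrame."""
    result = session.execute(text(sql))
    columns = list(result.keys())  # column names, read before consuming the rows
    return pd.DataFrame(result.all(), columns=columns)


with Session(engine) as session:
    # Hypothetical demo table with a text, an integer and a float column.
    session.execute(text("CREATE TABLE demo (process TEXT, period INTEGER, pv REAL)"))
    session.execute(
        text("INSERT INTO demo VALUES (:process, :period, :pv)"),
        [
            {"process": "A", "period": 2014, "pv": 296.07},
            {"process": "B", "period": 2015, "pv": 11.77},
        ],
    )
    # Two separate queries, both executed inside the same session context.
    df_all = query_to_df(session, "SELECT * FROM demo")
    df_2015 = query_to_df(session, "SELECT * FROM demo WHERE period = 2015")

print(df_all.dtypes)
```

Because every query goes through the same `session`, this should fit the multi-batch requirement from the question while still letting pandas infer the numeric dtypes.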
Edit:
>>> rec = results.fetchone()
>>> rec
('WithHHP16HinsHE0CCS109LHP', 'VAR_Cap', 'EVTRANS_H-L', 2014, 'FR', None, 296.071141357762)
#                                          python int --^  python float --^
>>> type(rec)
sqlalchemy.engine.row.Row
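If the inferred dtypes are not the ones you want, one option is to override them afterwards with `DataFrame.astype`, using pandas dtype names rather than the type codes from `cursor.description` (the `dtype` argument of the `pd.DataFrame` constructor accepts only a single dtype, not one per column). The sketch below again assumes an in-memory SQLite engine and made-up values. It also shows a common case where an override helps: a column that mixes integers and NULL is typically inferred as float64.

```python
import pandas as pd
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session

# Assumption: an in-memory SQLite database stands in for the real server.
engine = create_engine("sqlite:///:memory:")

# Hypothetical two-row result; the second row has a NULL in "vintage".
sql = (
    "SELECT 'EVTRANS_H-L' AS process, 2014 AS period, 2010 AS vintage, 296.07 AS pv "
    "UNION ALL "
    "SELECT 'EVTRANS_H-M', 2015, NULL, 11.77"
)

with Session(engine) as session:
    result = session.execute(text(sql))
    columns = list(result.keys())
    rows = [tuple(row) for row in result]  # each Row converts to a plain tuple

df = pd.DataFrame(rows, columns=columns)

# Override selected dtypes; "Int64" is pandas' nullable integer dtype.
df = df.astype({"period": "int32", "vintage": "Int64"})
print(df.dtypes)
```

The nullable `Int64` dtype keeps the missing value as `<NA>` instead of forcing the whole column to floats.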