pd.concat
pd.concat(...,axis = 1)
Here the objects are passed as a list.
import pandas as pd
pd.concat?
objs : a sequence or mapping of Series, DataFrame, or Panel objects
If a dict is passed, the sorted keys will be used as the `keys`
argument, unless it is passed, in which case the values will be
selected (see below). Any None objects will be dropped silently unless
they are all None in which case a ValueError will be raised
Note the wording here: a sequence or mapping. So objs must be a list (or dict) of objects, not a single object.
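A minimal sketch of the point above, assuming two small Series named a and b (made-up data): wrapping them in a list works, while passing a single object is rejected.

```python
import pandas as pd

# Two illustrative Series (hypothetical data, not from the notes)
a = pd.Series([1, 2, 3], name="a")
b = pd.Series([4, 5, 6], name="b")

# objs is a list, so this works; axis=1 places the Series side by side
wide = pd.concat([a, b], axis=1)

# Passing a bare Series instead of a list is rejected
try:
    pd.concat(a)
    single_object_rejected = False
except TypeError:
    single_object_rejected = True
```

With axis=1 each Series becomes one column, and the column names come from the Series names.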
pickle files
Use pickle files with df = pd.read_pickle and df.to_pickle.
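A short round-trip sketch, assuming a throwaway file name df.pkl inside a temporary directory (both are illustrative choices):

```python
import os
import tempfile

import pandas as pd

# Illustrative frame (hypothetical data)
df = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "df.pkl")
    df.to_pickle(path)               # write the frame to a pickle file
    restored = pd.read_pickle(path)  # read it back, dtypes included
```

Unlike a CSV round trip, the restored frame keeps its dtypes and index as they were.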
References and copies
copy
z[~z.x.isin([2])].y = 1
indexing view
z.loc[~z.x.isin([2]), "y"] = 1
The first form is not recommended, because the two correspond to, respectively:
z.__getitem__(~z.x.isin([2])).__setattr__("y", value)
z.loc.__setitem__((~z.x.isin([2]), "y"), value)
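- The first form works on a copy of the data: with a boolean mask, the assignment lands on a temporary object, so z itself does not change and later results do not reflect it.
- The second form works on a reference: the assignment goes to z itself, so z changes and every later result reflects it.
The reasoning follows the pandas 0.23.0 documentation.
For more, see "pandas的引用与复制" on the CSDN blog, which has 7 examples to deepen understanding.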
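The difference can be checked with a small sketch. The frame z below is made-up data, and the warning filter is only there to silence the chained-assignment warning that pandas emits for the first form.

```python
import warnings

import pandas as pd

# Illustrative frame (hypothetical data)
z = pd.DataFrame({"x": [1, 2, 3], "y": [0, 0, 0]})
mask = ~z.x.isin([2])

# Chained form: z[mask] is a temporary copy, so the write never reaches z
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    z[mask].y = 1
after_chained = z.y.tolist()

# .loc form: a single setitem call on z itself
z.loc[mask, "y"] = 1
after_loc = z.y.tolist()
```

After the chained form z.y is still all zeros; after the .loc form the rows where x is not 2 hold 1.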
#.tz_localize error
pd.to_datetime('now').tz_localize('Asia/Shanghai')
raises
TypeError: ufunc subtract cannot use operands with types dtype('O') and dtype('<M8[ns]')
Workaround
pd.to_datetime('now') + pd.Timedelta(hours=8)
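The workaround above only gives Shanghai time if pd.to_datetime('now') returns UTC, which depends on the pandas version. As an alternative sketch (not the approach in these notes), pd.Timestamp.now accepts a tz argument and returns a timezone-aware timestamp directly:

```python
import pandas as pd

# Timezone-aware "now" in Shanghai, no manual offset arithmetic
now_shanghai = pd.Timestamp.now(tz="Asia/Shanghai")

# Equivalent route: take an aware UTC timestamp, then convert
now_utc = pd.Timestamp.now(tz="UTC")
converted = now_utc.tz_convert("Asia/Shanghai")
```

Both results carry the +08:00 offset, so later comparisons with other aware timestamps stay correct.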
#.isnotin-like functionality [@曹骥2017]
z[~z.x.isin([2])]
z[z.x.isin([2]) == False]
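A quick check that the two spellings select the same rows, using a made-up frame z:

```python
import pandas as pd

# Illustrative frame (hypothetical data)
z = pd.DataFrame({"x": [1, 2, 3, 2], "y": ["a", "b", "c", "d"]})

# Negate the boolean mask with ~
by_tilde = z[~z.x.isin([2])]

# Compare the mask to False element-wise
by_compare = z[z.x.isin([2]) == False]
```

The ~ form is the more common idiom; both keep only the rows whose x is not in the list.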
A neat way to pivot in pandas and avoid a MultiIndex
pivot(index='userid', columns='delta_day_group',values='lgcnt')
ods_tbloginlogby2018.columns = ['lgcnt_7_0','lgcnt_30_7','lgcnt_90_30']
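A self-contained sketch of the same idea, assuming a small long-format table with made-up values and hypothetical group labels 7_0, 30_7 and 90_30. Passing values= keeps the columns as a single level, and the new names are derived from the pivoted labels so they do not depend on column order.

```python
import pandas as pd

# Illustrative long-format login counts (hypothetical data and labels)
long_df = pd.DataFrame({
    "userid": [1, 1, 1, 2, 2, 2],
    "delta_day_group": ["7_0", "30_7", "90_30"] * 2,
    "lgcnt": [3, 5, 8, 1, 0, 2],
})

# values= selects one column, so the result has flat (single-level) columns
wide = long_df.pivot(index="userid", columns="delta_day_group", values="lgcnt")

# Rename by prefixing each label instead of relying on positional order
wide.columns = [f"lgcnt_{label}" for label in wide.columns]
```

Assigning a hard-coded list, as in the notes, also works, but only when the list order matches the sorted order of the pivoted labels.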