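Question:
Hello everyone, I have a DataFrame:

产品  数据1  数据2
A     1      2
B     4      5
C     6      3

I want to find every row where 数据1 > 数据2 in that row AND 数据1 < 数据2 in the row above it. In the example above, 6 > 3 AND 4 < 5, so the output should be product C. How should I write this?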
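Answer:

import pandas as pd

df = pd.DataFrame({'产品': ['A','B','C'], '数据1': [1, 4, 6], '数据2': [2, 5, 3]})

# previous row: 数据1 < 数据2, and current row: 数据1 > 数据2
mask = (df['数据1'].shift(1) < df['数据2'].shift(1)) & (df['数据1'].shift(0) > df['数据2'].shift(0))
df[mask]['产品']

Explanation: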
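The fastest way to select rows is not to iterate over them. Instead, build a mask (a boolean array) and select with df[mask]. That leaves one question: how do you refer to the current row and the previous row of a DataFrame dynamically? The answer is shift.
shift(0): the current row
shift(1): the previous row
shift(n): the row n rows earlier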
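To make the shift semantics concrete, here is a small sketch on the 数据1 column from the question. The names `s` and `aligned` are chosen for illustration only and are not part of the original answer.

```python
import pandas as pd

# The 数据1 column from the question, as a standalone Series
s = pd.Series([1, 4, 6], name='数据1')

# Line up each row with the values one and two rows above it
aligned = pd.DataFrame({
    'current': s.shift(0),    # same values as s
    'previous': s.shift(1),   # value from the row above; NaN in the first row
    'two_back': s.shift(2),   # value from two rows above; NaN in the first two rows
})
print(aligned)
```

Rows that have no earlier row get NaN, and any comparison against NaN evaluates to False, so those rows never pass a mask built this way.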
To satisfy several conditions at once, combine the masks:
Logical AND (&): mask = ((...) & (...))
Logical OR (|): mask = ((...) | (...))
Logical NOT (~): mask = ~(...)
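As a rough sketch of the three operators, the snippet below reuses the DataFrame from the first question. The names `above_now`, `below_before`, `crossed_up`, `either` and `not_crossed` are illustrative only. Each comparison needs its own parentheses when written inline, because `&`, `|` and `~` bind more tightly than `<` and `>` in Python.

```python
import pandas as pd

df = pd.DataFrame({'产品': ['A', 'B', 'C'],
                   '数据1': [1, 4, 6],
                   '数据2': [2, 5, 3]})

# current row: 数据1 > 数据2
above_now = df['数据1'] > df['数据2']
# previous row: 数据1 < 数据2 (the first row compares NaN with NaN, giving False)
below_before = df['数据1'].shift(1) < df['数据2'].shift(1)

crossed_up = above_now & below_before   # both conditions hold
either = above_now | below_before       # at least one condition holds
not_crossed = ~crossed_up               # negation of the combined mask

print(df[crossed_up]['产品'].tolist())
```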
For example:

In [75]: df = pd.DataFrame({'A':range(5), 'B':range(10,20,2)})

In [76]: df
Out[76]:
   A   B
0  0  10
1  1  12
2  2  14
3  3  16
4  4  18

In [77]: mask = (df['A'].shift(1) + df['B'].shift(2) > 12)

In [78]: mask
Out[78]:
0    False
1    False
2    False
3     True
4     True
dtype: bool

In [79]: df[mask]
Out[79]:
   A   B
3  3  16
4  4  18

Question:
If I have the following dataframe:
date      A   B   M     S
20150101  8   7   7.5   0
20150101  10  9   9.5   -1
20150102  9   8   8.5   1
20150103  11  11  11    0
20150104  11  10  10.5  0
20150105  12  10  11    -1
...
If I want to create another column 'cost' by the following rules:
if S < 0, cost = (M-B).shift(1)*S
if S > 0, cost = (M-A).shift(1)*S
if S == 0, cost = 0

Currently, I am using the following function:

def cost(df):
    if df[3]<0:
        return np.roll((df[2]-df[1]),1)*df[3]
    elif df[3]>0:
        return np.roll((df[2]-df[0]),1)*df[3]
    else:
        return 0
df['cost']=df.apply(cost,axis=0)
Is there any other way to do it? can I somehow use pandas shift function in user defined functions? thanks.
Answer:

import numpy as np
import pandas as pd

df = pd.DataFrame({'date': ['20150101','20150102','20150103','20150104','20150105','20150106'],
                   'A': [8,10,9,11,11,12],
                   'B': [7,9,8,11,10,10],
                   'M': [7.5,9.5,8.5,11,10.5,11],
                   'S': [0,-1,1,0,0,-1]})
df = df.reindex(columns=['date','A','B','M','S'])

# Method 1: nested np.where, with np.roll supplying the previous row's value
# Note: np.roll wraps the last value around into the first row, whereas
# shift(1) leaves NaN there. The two methods agree here only because S is 0
# in the first row.
df['cost'] = np.where(df['S'] < 0,
                      np.roll((df['M']-df['B']), 1)*df['S'],
                      np.where(df['S'] > 0,
                               np.roll((df['M']-df['A']), 1)*df['S'],
                               0))

# Method 2: np.select, with shift(1) supplying the previous row's value
M, A, B, S = [df[col] for col in 'MABS']
conditions = [S < 0, S > 0]
choices = [(M-B).shift(1)*S, (M-A).shift(1)*S]
df['cost2'] = np.select(conditions, choices, default=0)

print(df)

Reposted from: https://www.cnblogs.com/hhh5460/p/5703129.html