1. loc: index rows by row label
1.1 loc[1] selects the row whose label is 1 (the index is made of integers)
import pandas as pd

data = [[1, 2, 3], [4, 5, 6]]
index = [0, 1]
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

df.loc[1]
'''
a    4
b    5
c    6
'''

df
'''
   a  b  c
0  1  2  3
1  4  5  6
'''
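Because loc looks up labels rather than positions, the result can differ from what the row number suggests once an integer index no longer starts at 0. A minimal sketch, assuming a hypothetical index of [10, 20] and a frame named df_int, neither of which is in the original example:

```python
import pandas as pd

# Hypothetical integer labels that do not match the row positions
df_int = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=[10, 20], columns=['a', 'b', 'c'])

row_by_label = df_int.loc[10]     # label 10 -> first row (1, 2, 3)
row_by_position = df_int.iloc[1]  # position 1 -> second row (4, 5, 6)

# Label 1 does not exist in this index, so loc raises KeyError
try:
    df_int.loc[1]
    label_1_found = True
except KeyError:
    label_1_found = False

print(label_1_found)
```

With the default index [0, 1] used in 1.1, the label and the position happen to coincide, which is why loc[1] looks like "row number 1" there.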
1.2 loc['d'] selects the row labeled 'd' (the index is made of strings)
import pandas as pd

data = [[1, 2, 3], [4, 5, 6]]
index = ['d', 'e']
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

df.loc['d']
'''
a    1
b    2
c    3
'''

df
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
1.3 loc can select multiple rows
import pandas as pd

data = [[1, 2, 3], [4, 5, 6]]
index = ['d', 'e']
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

# Slice from label 'd' to the end of the index
df.loc['d':]
'''
   a  b  c
d  1  2  3
e  4  5  6
'''
1.4 Extending loc: select specific columns of a row
import pandas as pd

data = [[1, 2, 3], [4, 5, 6]]
index = ['d', 'e']
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

# Row 'd', columns 'b' and 'c'
df.loc['d', ['b', 'c']]
'''
b    2
c    3
'''
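1.5 Extending loc: select a column
import pandas as pd

data = [[1, 2, 3], [4, 5, 6]]
index = ['d', 'e']
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

# All rows, column 'c' (passed as a list, so the result is a DataFrame)
df.loc[:, ['c']]
'''
   c
d  3
e  6
'''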
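Of course, the most direct way to get a column is df[column_label]. The loc form is convenient when rows and columns are selected in the same call. When the column label is not known, the column can instead be selected by position with iloc (see 2.3).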
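The two ways of selecting a column return different types depending on whether the label is passed as a scalar or inside a list. A small sketch reusing the df from 1.5; the variable names are only illustrative:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['d', 'e'], columns=['a', 'b', 'c'])

col_series = df['c']              # scalar label -> Series
col_series_loc = df.loc[:, 'c']   # scalar label via loc -> the same Series
col_frame = df.loc[:, ['c']]      # list of labels -> DataFrame with one column

print(type(col_series).__name__)
print(type(col_frame).__name__)
```

Passing a list keeps the two-dimensional shape, which matters if later code expects a DataFrame.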
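Note that label-based slicing with loc includes both endpoints: df.loc[1:3] returns the rows labeled 1, 2, and 3, unlike ordinary Python slicing, which excludes the end.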
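The inclusive behavior of label slices is easiest to see next to a positional slice. A minimal sketch, assuming a hypothetical four-row frame df4 with integer labels 0 to 3 (not part of the original examples):

```python
import pandas as pd

# Hypothetical four-row frame with integer labels 0..3
df4 = pd.DataFrame({'a': [10, 20, 30, 40]}, index=[0, 1, 2, 3])

by_label = df4.loc[1:3]      # labels 1, 2 and 3 (end label included)
by_position = df4.iloc[1:3]  # positions 1 and 2 (end position excluded)

print(by_label.index.tolist())
print(by_position.index.tolist())
```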
2. iloc: select rows by row number
.iloc indexes by integer position instead (still rows first, then columns), with positions running from 0 to length-1.
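2.1 Select a single row
import pandas as pd

data = [[1, 2, 3], [4, 5, 6]]
index = ['d', 'e']
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

# Position 1 is the second row (label 'e'); df.loc[1] would raise KeyError here
df.iloc[1]
'''
a    4
b    5
c    6
'''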
2.2 Select multiple rows
import pandas as pd

data = [[1, 2, 3], [4, 5, 6]]
index = ['d', 'e']
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

# All rows from position 0 onward
df.iloc[0:]
"""
   a  b  c
d  1  2  3
e  4  5  6
"""
2.3 Select column data
import pandas as pd

data = [[1, 2, 3], [4, 5, 6]]
index = ['d', 'e']
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

# All rows, column at position 1 (column 'b')
df.iloc[:, [1]]
'''
   b
d  2
e  5
'''
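3. ix: mixed indexing that combines the two
.ix is roughly the sum of the two above: it accepts both labels and integer positions. Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so the examples in this section only run on older pandas versions; on current versions use loc or iloc.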
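3.1 Index by row number
import pandas as pd  # requires pandas < 1.0, where .ix still exists

data = [[1, 2, 3], [4, 5, 6]]
index = ['d', 'e']
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

# The index holds strings, so the integer 1 is treated as a position
df.ix[1]
'''
a    4
b    5
c    6
'''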
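3.2 Index by row label
import pandas as pd  # requires pandas < 1.0, where .ix still exists

data = [[1, 2, 3], [4, 5, 6]]
index = ['d', 'e']
columns = ['a', 'b', 'c']
df = pd.DataFrame(data=data, index=index, columns=columns)

# A string is looked up as a label
df.ix['e']
'''
a    4
b    5
c    6
'''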
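On pandas versions where .ix is no longer available (it was removed in pandas 1.0), a mixed lookup can be written by converting one side so that loc or iloc handles both. A minimal sketch of one possible approach, reusing the same df; the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['d', 'e'], columns=['a', 'b', 'c'])

# Row by position, column by label: translate the row position into a label
value_via_loc = df.loc[df.index[1], 'b']

# Same cell, translating the column label into a position instead
value_via_iloc = df.iloc[1, df.columns.get_loc('b')]

print(value_via_loc, value_via_iloc)
```

Either form states explicitly whether a value is a label or a position, which is the ambiguity that .ix left implicit.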