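A DataFrame represents a rectangular table of data. It contains an ordered collection of columns, each of which can hold a different value type (numeric, string, boolean, and so on). A DataFrame has both a row index and a column index. Internally, the data is stored as one or more two-dimensional blocks rather than as a list, dict, or other collection of one-dimensional arrays.
Constructor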
DataFrame([data, index, columns, dtype, copy])
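
As a minimal sketch of the constructor, the snippet below builds a small DataFrame from a dict of equal-length lists. The column names, row labels, and values are made up for illustration; `columns` is used here only to choose the column order.

```python
import pandas as pd

# Hypothetical sample data: each dict key becomes a column name.
data = {
    "city": ["Beijing", "Shanghai", "Shenzhen"],
    "year": [2019, 2020, 2021],
    "score": [1.5, 2.4, 3.1],
}

# index supplies the row labels; columns selects and orders the dict keys.
df = pd.DataFrame(data, index=["a", "b", "c"], columns=["year", "city", "score"])

print(df)
print(df.shape)  # (3, 3)
```

Passing a dict is only one option; `data` can also be a NumPy array, a list of dicts, or another DataFrame.
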
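Attributes
Attribute | Description |
---|---|
DataFrame.index | The row labels (index) |
DataFrame.columns | The column labels, returned as an Index |
DataFrame.dtypes | A Series giving the dtype of each column, indexed by column name |
DataFrame.info([verbose, buf, max_cols, …]) | |
DataFrame.select_dtypes([include, exclude]) | Select a subset of the columns based on their dtypes |
DataFrame.values | A NumPy representation of the data |
DataFrame.axes | A list containing the row axis labels and the column axis labels |
DataFrame.ndim | Number of dimensions (2 for a DataFrame) |
DataFrame.size | Number of elements |
DataFrame.shape | Tuple giving the shape as (rows, columns) |
DataFrame.memory_usage([index, deep]) | Memory usage of each column, in bytes |
DataFrame.empty | |
DataFrame.set_flags(*[, copy, …]) |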
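
The attributes above can be inspected on any DataFrame. The following is a small sketch using an invented two-column frame:

```python
import pandas as pd

# Hypothetical frame with one text column and one float column.
df = pd.DataFrame({"name": ["x", "y", "z"], "value": [1.0, 2.0, 3.0]})

print(df.index)    # row labels (a RangeIndex here)
print(df.columns)  # column labels as an Index
print(df.dtypes)   # Series mapping each column name to its dtype
print(df.shape, df.ndim, df.size)  # (3, 2) 2 6
print(df.empty)    # False

# Keep only the numeric columns.
numeric = df.select_dtypes(include="number")
```
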
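Conversion
Method | Description |
---|---|
DataFrame.astype(dtype[, copy, errors]) | Cast the data to the specified dtype |
DataFrame.convert_dtypes([infer_objects, …]) | Convert columns to the best possible dtypes that support pd.NA |
DataFrame.infer_objects() | Attempt to infer better dtypes for object columns |
DataFrame.copy([deep]) | Make a copy of the object; with deep=True (the default) the data is copied as well |
DataFrame.bool() |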
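
A short sketch of `astype` and `copy`, assuming a frame whose column `a` holds numeric strings (the data is invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({"a": ["1", "2", "3"], "b": [1.0, 2.0, 3.0]})

# Cast only column "a"; other columns keep their dtype.
converted = df.astype({"a": "int64"})

# A deep copy owns its data, so editing it leaves the original untouched.
deep = df.copy()
deep.loc[0, "b"] = 99.0

print(converted.dtypes)
print(df.loc[0, "b"], deep.loc[0, "b"])  # 1.0 99.0
```
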
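Indexing and iteration
Method | Description |
---|---|
DataFrame.head([n]) | Return the first n rows |
DataFrame.at | Fast label-based accessor for a single scalar value |
DataFrame.iat | Fast integer-position accessor for a single scalar value |
DataFrame.loc | Label-based indexing |
DataFrame.iloc | Integer-position-based indexing |
DataFrame.insert(loc, column, value[, …]) | Insert a column at the specified location |
DataFrame.__iter__() | |
DataFrame.items() | |
DataFrame.iteritems() | Iterate over (column name, Series) pairs |
DataFrame.keys() | |
DataFrame.iterrows() | Iterate over rows as (index, Series) pairs |
DataFrame.itertuples([index, name]) | |
DataFrame.lookup(row_labels, col_labels) | |
DataFrame.pop(item) | Remove the given column and return it |
DataFrame.tail([n]) | Return the last n rows |
DataFrame.xs(key[, axis, level, drop_level]) | |
DataFrame.get(key[, default]) | |
DataFrame.isin(values) | Return a boolean DataFrame showing whether each element is contained in values |
DataFrame.where(cond[, other, inplace, …]) | Keep values where the condition is True and replace the rest |
DataFrame.mask(cond[, other, inplace, axis, …]) | |
DataFrame.query(expr[, inplace]) |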
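
The sketch below exercises several of the accessors in this table on an invented price/quantity frame. It sticks to methods that exist across pandas versions.

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame(
    {"price": [10, 20, 30, 40], "qty": [1, 0, 5, 2]},
    index=["a", "b", "c", "d"],
)

first_two = df.head(2)
by_label = df.loc["b", "price"]   # label-based lookup
by_position = df.iloc[2, 1]       # row 2, column 1 by integer position
scalar = df.at["d", "qty"]        # fast scalar access by label
in_stock = df.query("qty > 0")    # filter rows with an expression string
masked = df.where(df > 1)         # values failing the condition become NaN

# insert works in place and adds a column (not a row) at position 1.
df.insert(1, "currency", "CNY")

for label, row in df.iterrows():
    print(label, row["price"])
```
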
Binary operator functions
Method | Description |
---|---|
DataFrame.add(other[, axis, level, fill_value]) | |
DataFrame.sub(other[, axis, level, fill_value]) | |
DataFrame.mul(other[, axis, level, fill_value]) | |
DataFrame.div(other[, axis, level, fill_value]) | |
DataFrame.truediv(other[, axis, level, …]) | |
DataFrame.floordiv(other[, axis, level, …]) | |
DataFrame.mod(other[, axis, level, fill_value]) | |
DataFrame.pow(other[, axis, level, fill_value]) | |
DataFrame.dot(other) | |
DataFrame.radd(other[, axis, level, fill_value]) | |
DataFrame.rsub(other[, axis, level, fill_value]) | |
DataFrame.rmul(other[, axis, level, fill_value]) | |
DataFrame.rdiv(other[, axis, level, fill_value]) | |
DataFrame.rtruediv(other[, axis, level, …]) | |
DataFrame.rfloordiv(other[, axis, level, …]) | |
DataFrame.rmod(other[, axis, level, fill_value]) | |
DataFrame.rpow(other[, axis, level, fill_value]) | |
DataFrame.lt(other[, axis, level]) | |
DataFrame.gt(other[, axis, level]) | |
DataFrame.le(other[, axis, level]) | |
DataFrame.ge(other[, axis, level]) | |
DataFrame.ne(other[, axis, level]) | |
DataFrame.eq(other[, axis, level]) | |
DataFrame.combine(other, func[, fill_value, …]) | |
DataFrame.combine_first(other) |
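
The method forms of the operators differ from `+`, `-`, and so on mainly in the extra arguments they accept. As a sketch with two invented frames whose indexes only partly overlap, `fill_value` controls what happens to labels present on one side only:

```python
import pandas as pd

left = pd.DataFrame({"x": [1.0, 2.0]}, index=["a", "b"])
right = pd.DataFrame({"x": [10.0, 20.0]}, index=["b", "c"])

# The plain operator aligns on labels; labels missing on one side give NaN.
plain = left + right

# With fill_value, the missing side is treated as 0 before adding.
filled = left.add(right, fill_value=0)

# Comparison methods return a boolean DataFrame.
greater = left.gt(1.5)

print(plain)
print(filled)
```
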
Function application, GroupBy & window
Method | Description |
---|---|
DataFrame.apply(func[, axis, raw, …]) | |
DataFrame.applymap(func[, na_action]) | |
DataFrame.pipe(func, *args, **kwargs) | |
DataFrame.agg([func, axis]) | |
DataFrame.aggregate([func, axis]) | |
DataFrame.transform(func[, axis]) | |
DataFrame.groupby([by, axis, level, …]) | |
DataFrame.rolling(window[, min_periods, …]) | |
DataFrame.expanding([min_periods, center, …]) | |
DataFrame.ewm([com, span, halflife, alpha, …]) |
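
A minimal sketch of `apply`, `groupby`, and `rolling` on an invented team/points frame:

```python
import pandas as pd

df = pd.DataFrame({"team": ["a", "a", "b", "b"], "points": [1, 3, 5, 7]})

# apply calls the function once per column by default.
col_range = df[["points"]].apply(lambda col: col.max() - col.min())

# Split by team, then aggregate each group.
totals = df.groupby("team")["points"].sum()
summary = df.groupby("team")["points"].agg(["mean", "max"])

# Two-row moving average; the first row has no full window, so it is NaN.
moving = df[["points"]].rolling(window=2).mean()

print(summary)
```
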
Computations / descriptive stats
Method | Description |
---|---|
DataFrame.abs() | Return the absolute value of each element |
DataFrame.all([axis, bool_only, skipna, level]) | |
DataFrame.any([axis, bool_only, skipna, level]) | |
DataFrame.clip([lower, upper, axis, inplace]) | |
DataFrame.corr([method, min_periods]) | |
DataFrame.corrwith(other[, axis, drop, method]) | |
DataFrame.count([axis, level, numeric_only]) | Number of non-NA values |
DataFrame.cov([min_periods, ddof]) | |
DataFrame.cummax([axis, skipna]) | Cumulative maximum |
DataFrame.cummin([axis, skipna]) | Cumulative minimum |
DataFrame.cumprod([axis, skipna]) | Cumulative product |
DataFrame.cumsum([axis, skipna]) | Cumulative sum |
DataFrame.describe([percentiles, include, …]) | Compute a set of summary statistics for each column |
DataFrame.diff([periods, axis]) | Compute the first discrete difference (useful for time series) |
DataFrame.eval(expr[, inplace]) | |
DataFrame.kurt([axis, skipna, level, …]) | Sample kurtosis (fourth moment) of the values |
DataFrame.kurtosis([axis, skipna, level, …]) | |
DataFrame.mad([axis, skipna, level]) | Mean absolute deviation from the mean |
DataFrame.mean([axis, skipna, level, …]) | Mean |
DataFrame.median([axis, skipna, level, …]) | Median (50% quantile) |
DataFrame.min([axis, skipna, level, …]) | Minimum |
DataFrame.mode([axis, numeric_only, dropna]) | |
DataFrame.pct_change([periods, fill_method, …]) | Percentage change between the current element and a prior one |
DataFrame.prod([axis, skipna, level, …]) | Product of all values |
DataFrame.product([axis, skipna, level, …]) | |
DataFrame.quantile([q, axis, numeric_only, …]) | Compute sample quantiles, with q between 0 and 1 |
DataFrame.rank([axis, method, numeric_only, …]) | |
DataFrame.round([decimals]) | |
DataFrame.sem([axis, skipna, level, ddof, …]) | |
DataFrame.skew([axis, skipna, level, …]) | Sample skewness (third moment) of the values |
DataFrame.sum([axis, skipna, level, …]) | Sum |
DataFrame.std([axis, skipna, level, ddof, …]) | Sample standard deviation |
DataFrame.var([axis, skipna, level, ddof, …]) | Sample variance |
DataFrame.nunique([axis, dropna]) | Count the number of distinct values along the given axis |
DataFrame.value_counts([subset, normalize, …]) |
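
As a sketch, a few of these reductions and cumulative methods applied to an invented numeric frame. Reductions such as `mean` return one value per column, while `cumsum` and `pct_change` return a frame of the same shape.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 4.0], "b": [10.0, 10.0, 30.0]})

print(df.describe())       # count, mean, std, min, quartiles, max per column

medians = df.median()      # one value per column
running = df.cumsum()      # running total down each column
change = df.pct_change()   # relative change from the previous row; first row is NaN
distinct = df.nunique()    # number of distinct values per column
```
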
Reindexing / selection / label manipulation
Method | Description |
---|---|
DataFrame.add_prefix(prefix) | |
DataFrame.add_suffix(suffix) | |
DataFrame.align(other[, join, axis, level, …]) | |
DataFrame.at_time(time[, asof, axis]) | |
DataFrame.between_time(start_time, end_time) | |
DataFrame.drop([labels, axis, index, …]) | Drop the specified labels from the rows or columns and return a new object |
DataFrame.drop_duplicates([subset, keep, …]) | |
DataFrame.duplicated([subset, keep]) | |
DataFrame.equals(other) | |
DataFrame.filter([items, like, regex, axis]) | |
DataFrame.first(offset) | |
DataFrame.head([n]) | |
DataFrame.idxmax([axis, skipna]) | |
DataFrame.idxmin([axis, skipna]) | |
DataFrame.last(offset) | |
DataFrame.reindex([labels, index, columns, …]) | |
DataFrame.reindex_like(other[, method, …]) | |
DataFrame.rename([mapper, index, columns, …]) | |
DataFrame.rename_axis([mapper, index, …]) | |
DataFrame.reset_index([level, drop, …]) | |
DataFrame.sample([n, frac, replace, …]) | |
DataFrame.set_axis(labels[, axis, inplace]) | |
DataFrame.set_index(keys[, drop, append, …]) | |
DataFrame.tail([n]) | |
DataFrame.take(indices[, axis, is_copy]) | |
DataFrame.truncate([before, after, axis, copy]) |
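
A small sketch of the most common label operations on an invented single-column frame. Each call returns a new object and leaves `df` unchanged.

```python
import pandas as pd

df = pd.DataFrame({"v": [1, 2, 3]}, index=["a", "b", "c"])

# Conform to a new set of labels; "z" does not exist, so its row is NaN.
reindexed = df.reindex(["c", "a", "z"])

dropped = df.drop(index=["b"])
renamed = df.rename(columns={"v": "value"})

# Move the old index into a column, then use "value" as the new index.
keyed = renamed.reset_index().set_index("value")

print(df.idxmax())  # label of the maximum in each column
```
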
Missing data handling
Method | Description |
---|---|
DataFrame.backfill([axis, inplace, limit, …]) | |
DataFrame.bfill([axis, inplace, limit, downcast]) | |
DataFrame.dropna([axis, how, thresh, …]) | |
DataFrame.ffill([axis, inplace, limit, downcast]) | |
DataFrame.fillna([value, method, axis, …]) | |
DataFrame.interpolate([method, axis, limit, …]) | |
DataFrame.isna() | |
DataFrame.isnull() | |
DataFrame.notna() | |
DataFrame.notnull() | |
DataFrame.pad([axis, inplace, limit, downcast]) | |
DataFrame.replace([to_replace, value, …]) |
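
The following sketch shows the usual choices for missing values (detect, drop, or fill) on an invented frame containing NaN:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

mask = df.isna()           # True where a value is missing
complete = df.dropna()     # keep only rows without any missing value
zeroed = df.fillna(0)      # replace missing values with a constant
forward = df.ffill()       # carry the last valid value forward
linear = df.interpolate()  # linear interpolation between valid neighbours

print(mask.sum())          # number of missing values per column
```

Note that neither `ffill` nor the default `interpolate` fills a missing value at the very start of a column, because there is no earlier value to work from.
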
Reshaping, sorting, transposing
Method | Description |
---|---|
DataFrame.droplevel(level[, axis]) | |
DataFrame.pivot([index, columns, values]) | |
DataFrame.pivot_table([values, index, …]) | |
DataFrame.reorder_levels(order[, axis]) | |
DataFrame.sort_values(by[, axis, ascending, …]) | |
DataFrame.sort_index([axis, level, …]) | |
DataFrame.nlargest(n, columns[, keep]) | |
DataFrame.nsmallest(n, columns[, keep]) | |
DataFrame.swaplevel([i, j, axis]) | |
DataFrame.stack([level, dropna]) | |
DataFrame.unstack([level, fill_value]) | |
DataFrame.swapaxes(axis1, axis2[, copy]) | |
DataFrame.melt([id_vars, value_vars, …]) | |
DataFrame.explode(column[, ignore_index]) | |
DataFrame.squeeze([axis]) | |
DataFrame.to_xarray() | |
DataFrame.T | |
DataFrame.transpose(*args[, copy]) |
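
As a sketch of reshaping, the snippet below pivots an invented long-format table to wide format, melts it back, and sorts it:

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, item) pair.
long = pd.DataFrame(
    {
        "date": ["d1", "d1", "d2", "d2"],
        "item": ["x", "y", "x", "y"],
        "value": [1, 2, 3, 4],
    }
)

# Long to wide: one row per date, one column per item.
wide = long.pivot(index="date", columns="item", values="value")

# Wide back to long.
back = wide.reset_index().melt(id_vars="date", value_name="value")

ordered = long.sort_values("value", ascending=False)
top = long.nlargest(2, "value")

print(wide)
print(wide.T)  # transpose swaps rows and columns
```
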
Combining / comparing / joining / merging
Method | Description |
---|---|
DataFrame.append(other[, ignore_index, …]) | Append the rows of other to the end of the caller and return a new object |
DataFrame.assign(**kwargs) | |
DataFrame.compare(other[, align_axis, …]) | |
DataFrame.join(other[, on, how, lsuffix, …]) | |
DataFrame.merge(right[, how, on, left_on, …]) | |
DataFrame.update(other[, join, overwrite, …]) |
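
A minimal sketch of `merge` and `assign` with two invented tables. Because `DataFrame.append` was removed in pandas 2.0, the example stacks rows with the top-level `pd.concat` function, which is not part of the table above.

```python
import pandas as pd

orders = pd.DataFrame({"user_id": [1, 2, 2, 3], "amount": [10, 20, 30, 40]})
users = pd.DataFrame({"user_id": [1, 2], "name": ["ann", "bob"]})

# Inner join keeps only orders whose user_id appears in users.
inner = orders.merge(users, on="user_id", how="inner")

# Left join keeps every order; unmatched rows get a missing name.
left = orders.merge(users, on="user_id", how="left")

# assign returns a new frame with an extra computed column.
with_tax = orders.assign(total=lambda d: d["amount"] * 1.1)

# Row-wise concatenation, used here in place of the removed append.
stacked = pd.concat([orders, orders], ignore_index=True)
```
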
Time series
Method | Description |
---|---|
DataFrame.asfreq(freq[, method, how, …]) | |
DataFrame.asof(where[, subset]) | |
DataFrame.shift([periods, freq, axis, …]) | |
DataFrame.slice_shift([periods, axis]) | |
DataFrame.tshift([periods, freq, axis]) | |
DataFrame.first_valid_index() | |
DataFrame.last_valid_index() | |
DataFrame.resample(rule[, axis, closed, …]) | |
DataFrame.to_period([freq, axis, copy]) | |
DataFrame.to_timestamp([freq, how, axis, copy]) | |
DataFrame.tz_convert(tz[, axis, level, copy]) | |
DataFrame.tz_localize(tz[, axis, level, …]) |
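
These methods generally expect a datetime-like index. The sketch below uses an invented six-day daily series to show `shift`, `resample`, and `tz_localize`:

```python
import pandas as pd

idx = pd.date_range("2021-01-01", periods=6, freq="D")
ts = pd.DataFrame({"v": [1, 2, 3, 4, 5, 6]}, index=idx)

# Move the values down one row; the index stays the same.
lagged = ts.shift(1)

# Group the days into two-day bins and sum each bin.
two_day = ts.resample("2D").sum()

# Attach a time zone to a naive index.
localized = ts.tz_localize("UTC")

print(two_day)
```
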
Flags
Method | Description |
---|---|
Flags(obj, *, allows_duplicate_labels) |
Metadata
Method | Description |
---|---|
DataFrame.attrs |
Plotting
Method | Description |
---|---|
DataFrame.plot([x, y, kind, ax, …]) | |
DataFrame.plot.area([x, y]) | |
DataFrame.plot.bar([x, y]) | |
DataFrame.plot.barh([x, y]) | |
DataFrame.plot.box([by]) | |
DataFrame.plot.density([bw_method, ind]) | |
DataFrame.plot.hexbin(x, y[, C, …]) | |
DataFrame.plot.hist([by, bins]) | |
DataFrame.plot.kde([bw_method, ind]) | |
DataFrame.plot.line([x, y]) | |
DataFrame.plot.pie(**kwargs) | |
DataFrame.plot.scatter(x, y[, s, c]) | |
DataFrame.boxplot([column, by, ax, …]) | |
DataFrame.hist([column, by, grid, …]) |
Sparse accessor
DataFrame.sparse
Method | Description |
---|---|
DataFrame.sparse.density | |
DataFrame.sparse.from_spmatrix(data[, …]) | |
DataFrame.sparse.to_coo() | |
DataFrame.sparse.to_dense() |
Serialization / IO / conversion
Method | Description |
---|---|
DataFrame.from_dict(data[, orient, dtype, …]) | |
DataFrame.from_records(data[, index, …]) | |
DataFrame.to_parquet([path, engine, …]) | |
DataFrame.to_pickle(path[, compression, …]) | |
DataFrame.to_csv([path_or_buf, sep, na_rep, …]) | |
DataFrame.to_hdf(path_or_buf, key[, mode, …]) | |
DataFrame.to_sql(name, con[, schema, …]) | |
DataFrame.to_dict([orient, into]) | |
DataFrame.to_excel(excel_writer[, …]) | |
DataFrame.to_json([path_or_buf, orient, …]) | |
DataFrame.to_html([buf, columns, col_space, …]) | |
DataFrame.to_feather(path, **kwargs) | |
DataFrame.to_latex([buf, columns, …]) | |
DataFrame.to_stata(path[, convert_dates, …]) | |
DataFrame.to_gbq(destination_table[, …]) | |
DataFrame.to_records([index, column_dtypes, …]) | |
DataFrame.to_string([buf, columns, …]) | |
DataFrame.to_clipboard([excel, sep]) | |
DataFrame.to_markdown([buf, mode, index, …]) | |
DataFrame.style |
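
As a sketch of the conversion methods, the snippet below round-trips an invented frame through CSV text held in memory, so no file is written. `pd.read_csv` is a top-level function rather than a DataFrame method and is used here only to read the text back.

```python
import io

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# With no path argument, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)
restored = pd.read_csv(io.StringIO(csv_text))

# One dict per row.
records = df.to_dict(orient="records")

# from_dict builds a frame from a dict of columns by default.
rebuilt = pd.DataFrame.from_dict({"id": [1, 2], "name": ["a", "b"]})

print(csv_text)
print(records)
```

Formats such as Parquet, Excel, HDF5, and Feather depend on optional packages, so they are left out of this sketch.
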