Polars Cookbook

1 Import Polars

import polars as pl

print('polars ver:', pl.__version__)
polars ver: 1.16.0

2 Read data

Create a dummy DataFrame.

df = pl.DataFrame({
    'nums': [1, 2, 3, 4, 5],
    'letters': ['a', 'b', 'c', 'd', 'e']
})

df.head()
shape: (5, 2)
nums | letters
i64  | str
1    | "a"
2    | "b"
3    | "c"
4    | "d"
5    | "e"

Read a CSV file.

df_csv = pl.read_csv('./data/titanic.csv')

df_csv.head()
shape: (5, 8)
Survived | Pclass | Name                              | Sex      | Age  | Siblings/Spouses Aboard | Parents/Children Aboard | Fare
i64      | i64    | str                               | str      | f64  | i64                     | i64                     | f64
0        | 3      | "Mr. Owen Harris Braund"          | "male"   | 22.0 | 1                       | 0                       | 7.25
1        | 1      | "Mrs. John Bradley (Florence Br…" | "female" | 38.0 | 1                       | 0                       | 71.2833
1        | 3      | "Miss. Laina Heikkinen"           | "female" | 26.0 | 0                       | 0                       | 7.925
1        | 1      | "Mrs. Jacques Heath (Lily May P…" | "female" | 35.0 | 1                       | 0                       | 53.1
0        | 3      | "Mr. William Henry Allen"         | "male"   | 35.0 | 0                       | 0                       | 8.05

Inspect the DataFrame: schema, column names, dtypes, shape, and per-column sort flags.

df_csv.schema
Schema([('Survived', Int64),
        ('Pclass', Int64),
        ('Name', String),
        ('Sex', String),
        ('Age', Float64),
        ('Siblings/Spouses Aboard', Int64),
        ('Parents/Children Aboard', Int64),
        ('Fare', Float64)])

df_csv.columns
['Survived',
 'Pclass',
 'Name',
 'Sex',
 'Age',
 'Siblings/Spouses Aboard',
 'Parents/Children Aboard',
 'Fare']

df_csv.dtypes
[Int64, Int64, String, String, Float64, Int64, Int64, Float64]

print('df shape:', df_csv.shape)
print('df width:', df_csv.width)
print('df height:', df_csv.height)
df shape: (887, 8)
df width: 8
df height: 887

df_csv.flags
{'Survived': {'SORTED_ASC': False, 'SORTED_DESC': False},
 'Pclass': {'SORTED_ASC': False, 'SORTED_DESC': False},
 'Name': {'SORTED_ASC': False, 'SORTED_DESC': False},
 'Sex': {'SORTED_ASC': False, 'SORTED_DESC': False},
 'Age': {'SORTED_ASC': False, 'SORTED_DESC': False},
 'Siblings/Spouses Aboard': {'SORTED_ASC': False, 'SORTED_DESC': False},
 'Parents/Children Aboard': {'SORTED_ASC': False, 'SORTED_DESC': False},
 'Fare': {'SORTED_ASC': False, 'SORTED_DESC': False}}

Create a DataFrame from a NumPy array with an explicit schema.

import numpy as np

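# With orient='col', each inner array becomes one column (2 columns, 3 rows).
numpy_arr = np.array([[1, 1, 1], [2, 2, 2]])

df = pl.from_numpy(
    numpy_arr,
    # The schema names the columns and casts them to the given dtypes.
    schema={
        'ones': pl.Float32,
        'twos': pl.Int8,
    },
    orient='col'
)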

df.head()
shape: (3, 2)
ones | twos
f32  | i8
1.0  | 2
1.0  | 2
1.0  | 2