Getting Started with Pandas for Data Analysis

November 1, 2024

Learn the fundamentals of pandas, Python's powerful data manipulation library.

Pandas is the cornerstone of data analysis in Python. This guide will help you get started with the basics.

Installing Pandas

pip install pandas numpy

A Series is a one-dimensional labeled array:

import numpy as np
import pandas as pd

s = pd.Series([1, 3, 5, np.nan, 6, 8])
print(s)
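As a minimal sketch of how the missing value in that Series behaves, the snippet below rebuilds the same data and runs a few standard checks. The choice of which methods to show is illustrative, not a complete tour.

```python
import numpy as np
import pandas as pd

# Same values as the Series above, including one missing entry
s = pd.Series([1, 3, 5, np.nan, 6, 8])

print(s.dtype)         # the values are stored as floats because NaN is a float
print(s.isna().sum())  # count of missing values
print(s.mean())        # aggregations skip NaN by default (skipna=True)
print(s.dropna())      # returns a new Series without the missing entry
```

Note that `dropna()` does not modify `s` in place; assign the result to a name if you want to keep it.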

A DataFrame is a two-dimensional labeled data structure:

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NYC', 'LA', 'Chicago']
})
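A short sketch of what you can do once that DataFrame exists: pick out a column and filter rows by a condition. The threshold of 28 is an arbitrary value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NYC', 'LA', 'Chicago']
})

# Selecting one column returns a Series
ages = df['Age']

# A boolean condition keeps only the rows where it is True
over_28 = df[df['Age'] > 28]

print(df.shape)  # (rows, columns)
print(ages)
print(over_28)
```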

Pandas can load data from common file formats:

# CSV
df = pd.read_csv('data.csv')

# Excel (reading .xlsx files requires the openpyxl package)
df = pd.read_excel('data.xlsx')

# JSON
df = pd.read_json('data.json')
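The file names above assume those files exist in your working directory. If you want to try a reader without any file on disk, one option is to parse text held in memory. The sketch below uses made-up sample data and the standard library's `io.StringIO`.

```python
import io

import pandas as pd

# Made-up sample data standing in for the contents of a CSV file
csv_text = "Name,Age,City\nAlice,25,NYC\nBob,30,LA\n"

# read_csv accepts a file path or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv)
print(df_csv.dtypes)  # Age is parsed as an integer column
```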

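Once the data is loaded, a few methods give a quick overview:

df.head()      # First 5 rows
df.tail()      # Last 5 rows
df.info()      # Column names, dtypes, and non-null counts
df.describe()  # Summary statistics for numeric columns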
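In a plain script these calls display nothing unless the result is printed (`df.info()` is the exception, since it prints directly). A minimal sketch on the small example DataFrame from earlier:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NYC', 'LA', 'Chicago']
})

# head() and tail() take an optional row count (default 5)
first_two = df.head(2)
print(first_two)

# info() prints its report itself and returns None
df.info()

# describe() covers only numeric columns by default, so only Age appears
summary = df.describe()
print(summary)
```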

Practice with real datasets from Kaggle or your own data to solidify these concepts!