Fill Missing Values in Excel with Pandas fillna

Fill missing values in an Excel file with pandas fillna: normalize blanks to NaN, then apply a scalar, a per-column dictionary, forward/back fill, or interpolation, and export.

To fill missing values in an Excel workbook with pandas, load the file with pd.read_excel(), turn Excel blanks into real NaN, apply DataFrame.fillna() (or ffill/bfill/interpolate), and write the result with df.to_excel(). The examples below run in order and share a namespace; the first block writes a sample workbook.

Figure: fillna replaces NaN cells with chosen values. A column with NaN cells (before) becomes a complete column (after), with the gaps filled by 0, a column mean, or the previous value via forward fill.

Create a sample workbook

Python
import pandas as pd

seed = pd.DataFrame({
    "Region": ["North", "South", "West", "East"],
    "Revenue": [12000, None, 9800, None],
    "Status": ["Closed", None, "Open", None],
    "Date": ["2024-01-01", None, "2024-01-03", "2024-01-04"],
})
seed.to_excel("monthly_report.xlsx", index=False)

Load and normalize blanks

A truly empty cell loads as NaN, but a cell that only looks blank (a stray space, or an empty string left by a formula or an export tool) can load as valid text, and fillna() skips it. Convert empty and whitespace-only cells to a missing value first so every gap is fillable:

Python
df = pd.read_excel("monthly_report.xlsx", engine="openpyxl")
df = df.replace(r"^\s*$", pd.NA, regex=True)
print(df.isna().sum())
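To see why the normalization step matters, here is a minimal sketch that skips the workbook and uses a small in-memory frame. The `raw` frame and its values are hypothetical stand-ins for a loaded sheet, not data from the sample file.

```python
import pandas as pd

# Hypothetical stand-in for a loaded sheet: one real gap (None),
# one empty string, and one whitespace-only cell.
raw = pd.DataFrame({
    "Region": ["North", "South", "West", "East"],
    "Status": ["Closed", "", "   ", None],
})

# Only the real gap counts as missing before normalization.
before = int(raw["Status"].isna().sum())

# Same regex as above: empty or whitespace-only text becomes a missing value.
cleaned = raw.replace(r"^\s*$", pd.NA, regex=True)
after = int(cleaned["Status"].isna().sum())

print(f"missing before: {before}, after: {after}")
```

After normalization all three gaps are visible to isna(), so a later fillna() can reach them.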

Fill with a per-column dictionary

A dictionary fills each column with a type-appropriate value in one call. This is the safest default because it never forces one column's fill onto another:

Python
fill_map = {
    "Revenue": df["Revenue"].median(),
    "Status": "Pending",
}
df = df.fillna(fill_map)
print(df)
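As a self-contained illustration of the same idea, the sketch below fills a hypothetical `sales` frame that has an extra `Notes` column. Columns that are not keys in the dictionary are left alone, and the numeric column keeps its float dtype.

```python
import pandas as pd

# Hypothetical frame; "Notes" is deliberately left out of the fill map.
sales = pd.DataFrame({
    "Revenue": [12000.0, None, 9800.0, None],
    "Status": ["Closed", None, "Open", None],
    "Notes": [None, "late", None, None],
})

fill_map = {
    "Revenue": sales["Revenue"].median(),  # median of the known values
    "Status": "Pending",
}
filled = sales.fillna(fill_map)

print(filled)
print(filled.dtypes)
```

Leaving a column out of the dictionary is a simple way to keep its gaps for a later, more specific rule.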

Fill strategies at a glance

Strategy | Syntax | Best for
Global scalar | df.fillna(0) | A single uniform default
Per-column dict | df.fillna({"A": 0, "B": "n/a"}) | Mixed dtypes, categorical defaults
Forward / back fill | df.ffill() / df.bfill() | Ordered logs, time series
Interpolation | df.interpolate() | Continuous numeric trends
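The strategies in the table can be compared side by side on one small series. This is only a sketch with made-up numbers; the series `s` and the column labels are illustrative.

```python
import pandas as pd

# Made-up series with an interior gap and a trailing gap.
s = pd.Series([10.0, None, None, 40.0, None])

comparison = pd.DataFrame({
    "original": s,
    "scalar_0": s.fillna(0),         # one uniform default
    "ffill": s.ffill(),              # repeat the last known value
    "bfill": s.bfill(),              # pull the next known value back
    "interpolate": s.interpolate(),  # linear estimate between neighbors
})
print(comparison)
```

Note that bfill() cannot fill the trailing gap because no later value exists, which is one reason the two directions are often chained.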

Forward and back fill for ordered data

For time series, carry the last known value forward, then backfill any leading gap. In pandas 3.0 the method= argument to fillna() was removed — use the dedicated ffill() and bfill() methods:

Python
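df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
# Fill in the workbook's row order first: sorting by Date beforehand
# would push missing dates (NaT) to the end and distort the forward fill.
df["Date"] = df["Date"].ffill().bfill()
df = df.sort_values("Date")
print(df[["Region", "Date"]])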
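The difference between a forward fill alone and the chained ffill().bfill() shows up when the first row is missing. The sketch below uses a hypothetical `readings` series; the optional limit argument caps how far a value is carried.

```python
import pandas as pd

# Hypothetical daily log with a leading gap and a two-row interior gap.
readings = pd.Series(
    [None, 5.0, None, None, 7.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

forward_only = readings.ffill()   # the leading gap has nothing before it, so it stays NaN
both = readings.ffill().bfill()   # the leading gap takes the first known value
capped = readings.ffill(limit=1)  # carry each value forward by at most one row

print(pd.DataFrame({"forward_only": forward_only, "both": both, "capped": capped}))
```

A limit is worth considering when a long run of repeated values would misrepresent how stale the data is.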

When a numeric column represents a smooth trend, interpolate() estimates gaps from neighboring values instead of repeating one number:

Python
series = pd.Series([10.0, None, None, 40.0])
print(series.interpolate(method="linear"))
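Linear interpolation treats rows as equally spaced. When the rows are dated and unevenly spaced, method="time" weights the estimate by the actual time gap instead. The sketch below assumes a DatetimeIndex and uses made-up values.

```python
import pandas as pd

# Made-up readings: the gap sits 1 day after the first value
# and 3 days before the last one.
ts = pd.Series(
    [10.0, None, 50.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-05"]),
)

by_position = ts.interpolate(method="linear")  # midpoint by row position
by_time = ts.interpolate(method="time")        # weighted by elapsed time

print(by_position)
print(by_time)
```

By position the gap becomes the midpoint of its neighbors; by time it lands a quarter of the way from 10 to 50, because one of the four elapsed days has passed.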

Export the filled workbook

Write without the index so the output matches the input layout:

Python
df.to_excel("monthly_report_filled.xlsx", index=False, engine="openpyxl")
print("Wrote monthly_report_filled.xlsx")

Troubleshooting

A numeric fill turns the column into text. A column that mixes numbers and text loads as object dtype, and filling it with a number leaves it object, so it never becomes a numeric column and downstream tools may treat it as text. Isolate numerics first:

Python
num_cols = df.select_dtypes(include="number").columns
df[num_cols] = df[num_cols].fillna(0)
print(df.dtypes)
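When a column should be numeric but contains stray text, select_dtypes will not pick it up. One possible approach, sketched below, is to coerce it with pd.to_numeric before filling. This assumes that text such as "n/a" is a missing-value marker that is safe to discard; the `mixed` series is hypothetical.

```python
import pandas as pd

# Hypothetical column mixing numbers, a text placeholder, and a real gap.
mixed = pd.Series([12000, "n/a", 9800, None], dtype=object)

# Filling directly keeps the object dtype and the stray text.
filled_as_is = mixed.fillna(0)

# Assumption: unparseable text means "missing". Coerce it to NaN, then fill.
numeric = pd.to_numeric(mixed, errors="coerce").fillna(0)

print(filled_as_is.dtype, numeric.dtype)
print(numeric.tolist())
```

Coercion silently drops whatever text it cannot parse, so it is worth inspecting the distinct non-numeric values first.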

fillna() leaves blanks untouched. Those cells hold empty or whitespace-only strings, not NaN. Run df.replace(r"^\s*$", pd.NA, regex=True) after loading. Also avoid keep_default_na=False on import unless you map blanks yourself.

Formula cells lose their formulas on export. to_excel() writes computed values, not formulas. If downstream consumers need live formulas, write into a pre-built template with openpyxl instead of overwriting the sheet wholesale.

Validation checklist — confirm df.isna().sum() is zero where it must be, check df.dtypes to ensure numeric columns stayed numeric, and open the exported file to confirm no #VALUE! cells.
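The first two checklist items can be automated before export. The helper below is a hypothetical sketch, not part of pandas: `validate_filled`, its parameters, and the `report` frame are names invented for illustration.

```python
import pandas as pd

def validate_filled(df, required_cols, numeric_cols):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    # Check 1: no missing values where completeness is required.
    for col in required_cols:
        missing = int(df[col].isna().sum())
        if missing:
            problems.append(f"{col}: {missing} missing value(s) remain")
    # Check 2: numeric columns stayed numeric.
    for col in numeric_cols:
        if not pd.api.types.is_numeric_dtype(df[col]):
            problems.append(f"{col}: expected a numeric dtype, found {df[col].dtype}")
    return problems

# Hypothetical frame with one gap left in Status.
report = pd.DataFrame({
    "Region": ["North", "South"],
    "Revenue": [12000.0, 10900.0],
    "Status": ["Closed", None],
})
issues = validate_filled(report, required_cols=["Revenue", "Status"], numeric_cols=["Revenue"])
print(issues)
```

Opening the exported file to look for #VALUE! cells remains a manual step that this sketch does not cover.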

Frequently asked questions

Why does fillna() leave some blank cells untouched? Those cells hold empty or whitespace-only strings, not NaN, so fillna() skips them. Run df.replace(r"^\s*$", pd.NA, regex=True) after loading, and avoid keep_default_na=False on import unless you map blanks yourself.

How do I fill each column with a different value in one call? Pass a dictionary, like df.fillna({"Revenue": df["Revenue"].median(), "Status": "Pending"}). This is the safest default because it never forces one column's fill onto another dtype.

When should I use interpolate() instead of fillna()? Use interpolate() for a continuous numeric trend, where it estimates a gap from its neighbors rather than repeating a single number. A scalar fillna or forward fill is better for categorical or non-trending data.

Why did a numeric fill turn my column into text? A column that mixes numbers and text loads as object dtype, and filling it with a number leaves it object, so it never becomes numeric and downstream tools may treat it as text. Select numerics first with df.select_dtypes(include="number").columns and fill only those.

Do my formulas survive to_excel()? No — to_excel() writes computed values, not formulas. If downstream consumers need live formulas, write into a pre-built template with openpyxl instead of overwriting the sheet.

Where to go next