[{"data":1,"prerenderedAt":1498},["ShallowReactive",2],{"doc:\u002Fadvanced-data-transformation-and-cleaning\u002Fcleaning-excel-data-with-pandas\u002Fremove-blank-rows-from-excel-with-pandas":3,"surround:\u002Fadvanced-data-transformation-and-cleaning\u002Fcleaning-excel-data-with-pandas\u002Fremove-blank-rows-from-excel-with-pandas":1490},{"id":4,"title":5,"body":6,"dateModified":1465,"datePublished":1465,"description":1466,"extension":1467,"faq":1468,"meta":1482,"navigation":228,"path":1483,"seo":1484,"slug":1486,"stem":1487,"type":1488,"__hash__":1489},"docs\u002Fadvanced-data-transformation-and-cleaning\u002Fcleaning-excel-data-with-pandas\u002Fremove-blank-rows-from-excel-with-pandas\u002Findex.md","Remove Blank Rows From Excel With Pandas",{"type":7,"value":8,"toc":1450},"minimark",[9,32,37,65,68,196,200,420,434,438,456,513,521,525,545,640,646,650,660,750,765,769,779,872,880,884,893,989,995,999,1160,1167,1171,1280,1291,1295,1315,1319,1343,1365,1375,1391,1402,1406,1419,1423,1446],[10,11,12,13,17,18,21,22,25,26,31],"p",{},"To remove blank rows from an Excel file with pandas, load the workbook with ",[14,15,16],"code",{},"pd.read_excel()",", drop the empty rows with ",[14,19,20],{},"df.dropna(how=\"all\")",", reset the index, and write the result back with ",[14,23,24],{},"to_excel()",". The tricky part is defining \"blank\": a fully empty row, a row missing only the columns you care about, and a row full of whitespace strings each need a different call. This page covers all three. 
It is part of the ",[27,28,30],"a",{"href":29},"\u002Fadvanced-data-transformation-and-cleaning\u002Fcleaning-excel-data-with-pandas\u002F","Cleaning Excel Data with Pandas"," cluster.",[33,34,36],"h2",{"id":35},"prerequisites","Prerequisites",[38,39,44],"pre",{"className":40,"code":41,"language":42,"meta":43,"style":43},"language-bash shiki shiki-themes github-light github-dark","pip install pandas openpyxl\n","bash","",[14,45,46],{"__ignoreMap":43},[47,48,51,55,59,62],"span",{"class":49,"line":50},"line",1,[47,52,54],{"class":53},"sScJk","pip",[47,56,58],{"class":57},"sZZnC"," install",[47,60,61],{"class":57}," pandas",[47,63,64],{"class":57}," openpyxl\n",[10,66,67],{},"Every block below runs in order against a sample workbook built in the first step.",[69,70,78,79,78,83,78,87,78,94,78,104,78,109,78,113,78,118,78,121,78,125,78,128,78,132,78,138,78,142,78,151,78,157,78,162,78,166,78,171,78,175,78,179,78,184,78,188,78,192],"svg",{"viewBox":71,"role":72,"ariaLabelledBy":73,"xmlns":76,"style":77},"0 0 760 280","img",[74,75],"blankrows-t","blankrows-d","http:\u002F\u002Fwww.w3.org\u002F2000\u002Fsvg","width:100%;max-width:760px;height:auto;display:block;margin:1.5rem auto;font-family:Inter,ui-sans-serif,system-ui,sans-serif","\n  ",[80,81,82],"title",{"id":74},"Three ways dropna removes blank rows",[84,85,86],"desc",{"id":75},"A four-row table with fully empty and whitespace-only rows; how all drops only fully empty rows, subset drops rows missing a key column, and thresh drops rows below a minimum count of real values.",[88,89,93],"text",{"x":90,"y":91,"style":92},"130","26","font-size:13px;font-weight:600;fill:var(--muted,#5b6780);text-anchor:middle","Source rows",[95,96],"rect",{"x":97,"y":98,"width":99,"height":100,"rx":101,"fill":102,"stroke":103},"30","40","200","32","6","var(--surface-muted,#eef2ff)","var(--brand,#5b5cf0)",[88,105,108],{"x":90,"y":106,"style":107},"61","font-size:12px;fill:var(--text,#172033);text-anchor:middle","Ana   30   
NY",[95,110],{"x":97,"y":111,"width":99,"height":100,"rx":101,"fill":102,"stroke":112},"76","var(--line,#cdd5e6)",[88,114,117],{"x":90,"y":115,"style":116},"97","font-size:12px;fill:var(--muted,#5b6780);text-anchor:middle","(fully empty)",[95,119],{"x":97,"y":120,"width":99,"height":100,"rx":101,"fill":102,"stroke":112},"112",[88,122,124],{"x":90,"y":123,"style":116},"133","“ ”   41   LA",[95,126],{"x":97,"y":127,"width":99,"height":100,"rx":101,"fill":102,"stroke":103},"148",[88,129,131],{"x":90,"y":130,"style":107},"169","Cara       SF",[49,133],{"x1":134,"y1":135,"x2":136,"y2":135,"stroke":112,"style":137},"234","110","296","stroke-width:2px",[139,140],"polygon",{"points":141,"fill":112},"296,110 286,105 286,115",[95,143],{"x":144,"y":145,"width":146,"height":147,"rx":148,"fill":149,"stroke":103,"style":150},"300","44","430","62","10","none","stroke-width:1.5px",[88,152,156],{"x":153,"y":154,"style":155},"318","68","font-size:13px;font-weight:700;fill:var(--brand-strong,#4338ca)","how=\"all\"",[88,158,161],{"x":153,"y":159,"style":160},"90","font-size:11.5px;fill:var(--muted,#5b6780)","drops only the fully empty row",[95,163],{"x":144,"y":164,"width":146,"height":147,"rx":148,"fill":149,"stroke":165,"style":150},"114","var(--teal,#0f9488)",[88,167,170],{"x":153,"y":168,"style":169},"138","font-size:13px;font-weight:700;fill:var(--teal,#0f9488)","subset=[\"name\"]",[88,172,174],{"x":153,"y":173,"style":160},"160","drops rows missing the key column",[95,176],{"x":144,"y":177,"width":146,"height":147,"rx":148,"fill":149,"stroke":178,"style":150},"184","var(--gold,#b4740a)",[88,180,183],{"x":153,"y":181,"style":182},"208","font-size:13px;font-weight:700;fill:var(--gold,#b4740a)","thresh=2",[88,185,187],{"x":153,"y":186,"style":160},"230","drops rows with fewer than 2 real values",[88,189,191],{"x":90,"y":181,"style":190},"font-size:11.5px;fill:var(--muted,#5b6780);text-anchor:middle","“blank” means",[88,193,195],{"x":90,"y":194,"style":190},"226","three different 
things",[33,197,199],{"id":198},"create-a-messy-sample-workbook","Create a messy sample workbook",[38,201,205],{"className":202,"code":203,"language":204,"meta":43,"style":43},"language-python shiki shiki-themes github-light github-dark","import pandas as pd\n\ndf = pd.DataFrame({\n    \"OrderID\": [\"A-100\", None, \"B-200\", \"  \", \"C-300\", None],\n    \"Customer\": [\"Acme\", None, \"Globex\", None, \"Initech\", None],\n    \"Amount\": [120.0, None, 80.0, None, None, None],\n})\ndf.to_excel(\"orders_input.xlsx\", index=False, engine=\"openpyxl\")\nprint(f\"Wrote {len(df)} rows (some blank)\")\n","python",[14,206,207,223,230,242,283,318,352,358,391],{"__ignoreMap":43},[47,208,209,213,217,220],{"class":49,"line":50},[47,210,212],{"class":211},"szBVR","import",[47,214,216],{"class":215},"sVt8B"," pandas ",[47,218,219],{"class":211},"as",[47,221,222],{"class":215}," pd\n",[47,224,226],{"class":49,"line":225},2,[47,227,229],{"emptyLinePlaceholder":228},true,"\n",[47,231,233,236,239],{"class":49,"line":232},3,[47,234,235],{"class":215},"df ",[47,237,238],{"class":211},"=",[47,240,241],{"class":215}," pd.DataFrame({\n",[47,243,245,248,251,254,257,261,263,266,268,271,273,276,278,280],{"class":49,"line":244},4,[47,246,247],{"class":57},"    \"OrderID\"",[47,249,250],{"class":215},": [",[47,252,253],{"class":57},"\"A-100\"",[47,255,256],{"class":215},", ",[47,258,260],{"class":259},"sj4cs","None",[47,262,256],{"class":215},[47,264,265],{"class":57},"\"B-200\"",[47,267,256],{"class":215},[47,269,270],{"class":57},"\"  \"",[47,272,256],{"class":215},[47,274,275],{"class":57},"\"C-300\"",[47,277,256],{"class":215},[47,279,260],{"class":259},[47,281,282],{"class":215},"],\n",[47,284,286,289,291,294,296,298,300,303,305,307,309,312,314,316],{"class":49,"line":285},5,[47,287,288],{"class":57},"    
\"Customer\"",[47,290,250],{"class":215},[47,292,293],{"class":57},"\"Acme\"",[47,295,256],{"class":215},[47,297,260],{"class":259},[47,299,256],{"class":215},[47,301,302],{"class":57},"\"Globex\"",[47,304,256],{"class":215},[47,306,260],{"class":259},[47,308,256],{"class":215},[47,310,311],{"class":57},"\"Initech\"",[47,313,256],{"class":215},[47,315,260],{"class":259},[47,317,282],{"class":215},[47,319,321,324,326,329,331,333,335,338,340,342,344,346,348,350],{"class":49,"line":320},6,[47,322,323],{"class":57},"    \"Amount\"",[47,325,250],{"class":215},[47,327,328],{"class":259},"120.0",[47,330,256],{"class":215},[47,332,260],{"class":259},[47,334,256],{"class":215},[47,336,337],{"class":259},"80.0",[47,339,256],{"class":215},[47,341,260],{"class":259},[47,343,256],{"class":215},[47,345,260],{"class":259},[47,347,256],{"class":215},[47,349,260],{"class":259},[47,351,282],{"class":215},[47,353,355],{"class":49,"line":354},7,[47,356,357],{"class":215},"})\n",[47,359,361,364,367,369,373,375,378,380,383,385,388],{"class":49,"line":360},8,[47,362,363],{"class":215},"df.to_excel(",[47,365,366],{"class":57},"\"orders_input.xlsx\"",[47,368,256],{"class":215},[47,370,372],{"class":371},"s4XuR","index",[47,374,238],{"class":211},[47,376,377],{"class":259},"False",[47,379,256],{"class":215},[47,381,382],{"class":371},"engine",[47,384,238],{"class":211},[47,386,387],{"class":57},"\"openpyxl\"",[47,389,390],{"class":215},")\n",[47,392,394,397,400,403,406,409,412,415,418],{"class":49,"line":393},9,[47,395,396],{"class":259},"print",[47,398,399],{"class":215},"(",[47,401,402],{"class":211},"f",[47,404,405],{"class":57},"\"Wrote ",[47,407,408],{"class":259},"{len",[47,410,411],{"class":215},"(df)",[47,413,414],{"class":259},"}",[47,416,417],{"class":57}," rows (some blank)\"",[47,419,390],{"class":215},[10,421,422,423,426,427,429,430,433],{},"Rows 1 and 5 (zero-based) are fully empty. Row 3 has whitespace in ",[14,424,425],{},"OrderID"," but is otherwise empty. 
Row 4 has an ",[14,428,425],{}," but no ",[14,431,432],{},"Amount",".",[33,435,437],{"id":436},"drop-fully-empty-rows-with-howall","Drop fully empty rows with how=\"all\"",[10,439,440,442,443,447,448,451,452,455],{},[14,441,156],{}," removes only rows where ",[444,445,446],"strong",{},"every"," cell is ",[14,449,450],{},"NaN",". This is almost always what you want for blank rows — the default ",[14,453,454],{},"how=\"any\""," would delete any row with a single missing cell, which is far too aggressive.",[38,457,459],{"className":202,"code":458,"language":204,"meta":43,"style":43},"df = pd.read_excel(\"orders_input.xlsx\", engine=\"openpyxl\")\n\ncleaned = df.dropna(how=\"all\")\nprint(cleaned)\n",[14,460,461,482,486,506],{"__ignoreMap":43},[47,462,463,465,467,470,472,474,476,478,480],{"class":49,"line":50},[47,464,235],{"class":215},[47,466,238],{"class":211},[47,468,469],{"class":215}," pd.read_excel(",[47,471,366],{"class":57},[47,473,256],{"class":215},[47,475,382],{"class":371},[47,477,238],{"class":211},[47,479,387],{"class":57},[47,481,390],{"class":215},[47,483,484],{"class":49,"line":225},[47,485,229],{"emptyLinePlaceholder":228},[47,487,488,491,493,496,499,501,504],{"class":49,"line":232},[47,489,490],{"class":215},"cleaned ",[47,492,238],{"class":211},[47,494,495],{"class":215}," df.dropna(",[47,497,498],{"class":371},"how",[47,500,238],{"class":211},[47,502,503],{"class":57},"\"all\"",[47,505,390],{"class":215},[47,507,508,510],{"class":49,"line":244},[47,509,396],{"class":259},[47,511,512],{"class":215},"(cleaned)\n",[10,514,515,516,518,519,433],{},"That drops the two fully empty rows but keeps the whitespace row and the row missing ",[14,517,432],{},", because neither is entirely ",[14,520,450],{},[33,522,524],{"id":523},"convert-whitespace-only-cells-to-nan-first","Convert whitespace-only cells to NaN first",[10,526,527,530,531,533,534,536,537,540,541,544],{},[14,528,529],{},"pd.read_excel"," reads ",[14,532,270],{}," as a literal string, not 
",[14,535,450],{},", so a visually blank row survives ",[14,538,539],{},"dropna",". Replace whitespace-only strings with ",[14,542,543],{},"pd.NA"," before dropping:",[38,546,548],{"className":202,"code":547,"language":204,"meta":43,"style":43},"df = pd.read_excel(\"orders_input.xlsx\", engine=\"openpyxl\")\n\ndf = df.replace(r\"^\\s*$\", pd.NA, regex=True)\ncleaned = df.dropna(how=\"all\")\nprint(cleaned)\n",[14,549,550,570,574,618,634],{"__ignoreMap":43},[47,551,552,554,556,558,560,562,564,566,568],{"class":49,"line":50},[47,553,235],{"class":215},[47,555,238],{"class":211},[47,557,469],{"class":215},[47,559,366],{"class":57},[47,561,256],{"class":215},[47,563,382],{"class":371},[47,565,238],{"class":211},[47,567,387],{"class":57},[47,569,390],{"class":215},[47,571,572],{"class":49,"line":225},[47,573,229],{"emptyLinePlaceholder":228},[47,575,576,578,580,583,586,589,592,595,598,600,603,606,608,611,613,616],{"class":49,"line":232},[47,577,235],{"class":215},[47,579,238],{"class":211},[47,581,582],{"class":215}," df.replace(",[47,584,585],{"class":211},"r",[47,587,588],{"class":57},"\"",[47,590,591],{"class":259},"^\\s",[47,593,594],{"class":211},"*",[47,596,597],{"class":259},"$",[47,599,588],{"class":57},[47,601,602],{"class":215},", pd.",[47,604,605],{"class":259},"NA",[47,607,256],{"class":215},[47,609,610],{"class":371},"regex",[47,612,238],{"class":211},[47,614,615],{"class":259},"True",[47,617,390],{"class":215},[47,619,620,622,624,626,628,630,632],{"class":49,"line":244},[47,621,490],{"class":215},[47,623,238],{"class":211},[47,625,495],{"class":215},[47,627,498],{"class":371},[47,629,238],{"class":211},[47,631,503],{"class":57},[47,633,390],{"class":215},[47,635,636,638],{"class":49,"line":285},[47,637,396],{"class":259},[47,639,512],{"class":215},[10,641,642,643,645],{},"Now the whitespace row collapses to all-",[14,644,450],{}," and gets removed. 
Run this normalization step first whenever data comes from manual entry or a CSV-to-Excel round trip.",[33,647,649],{"id":648},"drop-rows-missing-a-key-field-with-subset","Drop rows missing a key field with subset",[10,651,652,653,655,656,659],{},"To delete rows that lack a specific required column — say every row without an ",[14,654,425],{}," — pass ",[14,657,658],{},"subset",":",[38,661,663],{"className":202,"code":662,"language":204,"meta":43,"style":43},"df = pd.read_excel(\"orders_input.xlsx\", engine=\"openpyxl\")\ndf = df.replace(r\"^\\s*$\", pd.NA, regex=True)\n\ncleaned = df.dropna(subset=[\"OrderID\"])\nprint(cleaned)\n",[14,664,665,685,719,723,744],{"__ignoreMap":43},[47,666,667,669,671,673,675,677,679,681,683],{"class":49,"line":50},[47,668,235],{"class":215},[47,670,238],{"class":211},[47,672,469],{"class":215},[47,674,366],{"class":57},[47,676,256],{"class":215},[47,678,382],{"class":371},[47,680,238],{"class":211},[47,682,387],{"class":57},[47,684,390],{"class":215},[47,686,687,689,691,693,695,697,699,701,703,705,707,709,711,713,715,717],{"class":49,"line":225},[47,688,235],{"class":215},[47,690,238],{"class":211},[47,692,582],{"class":215},[47,694,585],{"class":211},[47,696,588],{"class":57},[47,698,591],{"class":259},[47,700,594],{"class":211},[47,702,597],{"class":259},[47,704,588],{"class":57},[47,706,602],{"class":215},[47,708,605],{"class":259},[47,710,256],{"class":215},[47,712,610],{"class":371},[47,714,238],{"class":211},[47,716,615],{"class":259},[47,718,390],{"class":215},[47,720,721],{"class":49,"line":232},[47,722,229],{"emptyLinePlaceholder":228},[47,724,725,727,729,731,733,735,738,741],{"class":49,"line":244},[47,726,490],{"class":215},[47,728,238],{"class":211},[47,730,495],{"class":215},[47,732,658],{"class":371},[47,734,238],{"class":211},[47,736,737],{"class":215},"[",[47,739,740],{"class":57},"\"OrderID\"",[47,742,743],{"class":215},"])\n",[47,745,746,748],{"class":49,"line":285},[47,747,396],{"class":259},[47,749,512],{"class":21
5},[10,751,752,753,755,756,758,759,761,762,764],{},"This keeps the row missing only ",[14,754,432],{}," (it still has an ",[14,757,425],{},") while removing every row with no identifier. Combine ",[14,760,658],{}," with ",[14,763,156],{}," by chaining calls when you need both rules.",[33,766,768],{"id":767},"keep-rows-with-at-least-n-real-values-using-thresh","Keep rows with at least N real values using thresh",[10,770,771,774,775,778],{},[14,772,773],{},"thresh=N"," keeps rows that have ",[444,776,777],{},"at least"," N non-null values. Use it when a row is only useful if most of its fields are populated:",[38,780,782],{"className":202,"code":781,"language":204,"meta":43,"style":43},"df = pd.read_excel(\"orders_input.xlsx\", engine=\"openpyxl\")\ndf = df.replace(r\"^\\s*$\", pd.NA, regex=True)\n\n# Keep rows with 2 or more populated cells\ncleaned = df.dropna(thresh=2)\nprint(cleaned)\n",[14,783,784,804,838,842,848,866],{"__ignoreMap":43},[47,785,786,788,790,792,794,796,798,800,802],{"class":49,"line":50},[47,787,235],{"class":215},[47,789,238],{"class":211},[47,791,469],{"class":215},[47,793,366],{"class":57},[47,795,256],{"class":215},[47,797,382],{"class":371},[47,799,238],{"class":211},[47,801,387],{"class":57},[47,803,390],{"class":215},[47,805,806,808,810,812,814,816,818,820,822,824,826,828,830,832,834,836],{"class":49,"line":225},[47,807,235],{"class":215},[47,809,238],{"class":211},[47,811,582],{"class":215},[47,813,585],{"class":211},[47,815,588],{"class":57},[47,817,591],{"class":259},[47,819,594],{"class":211},[47,821,597],{"class":259},[47,823,588],{"class":57},[47,825,602],{"class":215},[47,827,605],{"class":259},[47,829,256],{"class":215},[47,831,610],{"class":371},[47,833,238],{"class":211},[47,835,615],{"class":259},[47,837,390],{"class":215},[47,839,840],{"class":49,"line":232},[47,841,229],{"emptyLinePlaceholder":228},[47,843,844],{"class":49,"line":244},[47,845,847],{"class":846},"sJ8bj","# Keep rows with 2 or more populated 
cells\n",[47,849,850,852,854,856,859,861,864],{"class":49,"line":285},[47,851,490],{"class":215},[47,853,238],{"class":211},[47,855,495],{"class":215},[47,857,858],{"class":371},"thresh",[47,860,238],{"class":211},[47,862,863],{"class":259},"2",[47,865,390],{"class":215},[47,867,868,870],{"class":49,"line":320},[47,869,396],{"class":259},[47,871,512],{"class":215},[10,873,874,876,877,879],{},[14,875,858],{}," counts non-null cells and cannot be combined with ",[14,878,498],{},": pandas 2.x raises a TypeError when both are passed, and older releases silently let thresh win, so pick one.",[33,881,883],{"id":882},"reset-the-index-after-dropping","Reset the index after dropping",[10,885,886,888,889,892],{},[14,887,539],{}," preserves the original index, leaving gaps like ",[14,890,891],{},"0, 2, 4",". Those gaps break positional logic and export an odd-looking index. Reset before writing:",[38,894,896],{"className":202,"code":895,"language":204,"meta":43,"style":43},"df = pd.read_excel(\"orders_input.xlsx\", engine=\"openpyxl\")\ndf = df.replace(r\"^\\s*$\", pd.NA, regex=True)\n\ncleaned = 
df.dropna(how=\"all\").reset_index(drop=True)\nprint(cleaned.index.tolist())\n",[14,897,898,918,952,956,982],{"__ignoreMap":43},[47,899,900,902,904,906,908,910,912,914,916],{"class":49,"line":50},[47,901,235],{"class":215},[47,903,238],{"class":211},[47,905,469],{"class":215},[47,907,366],{"class":57},[47,909,256],{"class":215},[47,911,382],{"class":371},[47,913,238],{"class":211},[47,915,387],{"class":57},[47,917,390],{"class":215},[47,919,920,922,924,926,928,930,932,934,936,938,940,942,944,946,948,950],{"class":49,"line":225},[47,921,235],{"class":215},[47,923,238],{"class":211},[47,925,582],{"class":215},[47,927,585],{"class":211},[47,929,588],{"class":57},[47,931,591],{"class":259},[47,933,594],{"class":211},[47,935,597],{"class":259},[47,937,588],{"class":57},[47,939,602],{"class":215},[47,941,605],{"class":259},[47,943,256],{"class":215},[47,945,610],{"class":371},[47,947,238],{"class":211},[47,949,615],{"class":259},[47,951,390],{"class":215},[47,953,954],{"class":49,"line":232},[47,955,229],{"emptyLinePlaceholder":228},[47,957,958,960,962,964,966,968,970,973,976,978,980],{"class":49,"line":244},[47,959,490],{"class":215},[47,961,238],{"class":211},[47,963,495],{"class":215},[47,965,498],{"class":371},[47,967,238],{"class":211},[47,969,503],{"class":57},[47,971,972],{"class":215},").reset_index(",[47,974,975],{"class":371},"drop",[47,977,238],{"class":211},[47,979,615],{"class":259},[47,981,390],{"class":215},[47,983,984,986],{"class":49,"line":285},[47,985,396],{"class":259},[47,987,988],{"class":215},"(cleaned.index.tolist())\n",[10,990,991,994],{},[14,992,993],{},"drop=True"," discards the old index instead of pushing it into a new column.",[33,996,998],{"id":997},"write-the-cleaned-file-back","Write the cleaned file back",[38,1000,1002],{"className":202,"code":1001,"language":204,"meta":43,"style":43},"df = pd.read_excel(\"orders_input.xlsx\", engine=\"openpyxl\")\ndf = df.replace(r\"^\\s*$\", pd.NA, regex=True)\n\ncleaned = (df.dropna(how=\"all\")\n     
        .dropna(subset=[\"OrderID\"])\n             .reset_index(drop=True))\n\ncleaned.to_excel(\"orders_cleaned.xlsx\", index=False, engine=\"openpyxl\")\nprint(f\"Wrote {len(cleaned)} rows to orders_cleaned.xlsx\")\n",[14,1003,1004,1024,1058,1062,1079,1094,1108,1112,1138],{"__ignoreMap":43},[47,1005,1006,1008,1010,1012,1014,1016,1018,1020,1022],{"class":49,"line":50},[47,1007,235],{"class":215},[47,1009,238],{"class":211},[47,1011,469],{"class":215},[47,1013,366],{"class":57},[47,1015,256],{"class":215},[47,1017,382],{"class":371},[47,1019,238],{"class":211},[47,1021,387],{"class":57},[47,1023,390],{"class":215},[47,1025,1026,1028,1030,1032,1034,1036,1038,1040,1042,1044,1046,1048,1050,1052,1054,1056],{"class":49,"line":225},[47,1027,235],{"class":215},[47,1029,238],{"class":211},[47,1031,582],{"class":215},[47,1033,585],{"class":211},[47,1035,588],{"class":57},[47,1037,591],{"class":259},[47,1039,594],{"class":211},[47,1041,597],{"class":259},[47,1043,588],{"class":57},[47,1045,602],{"class":215},[47,1047,605],{"class":259},[47,1049,256],{"class":215},[47,1051,610],{"class":371},[47,1053,238],{"class":211},[47,1055,615],{"class":259},[47,1057,390],{"class":215},[47,1059,1060],{"class":49,"line":232},[47,1061,229],{"emptyLinePlaceholder":228},[47,1063,1064,1066,1068,1071,1073,1075,1077],{"class":49,"line":244},[47,1065,490],{"class":215},[47,1067,238],{"class":211},[47,1069,1070],{"class":215}," (df.dropna(",[47,1072,498],{"class":371},[47,1074,238],{"class":211},[47,1076,503],{"class":57},[47,1078,390],{"class":215},[47,1080,1081,1084,1086,1088,1090,1092],{"class":49,"line":285},[47,1082,1083],{"class":215},"             .dropna(",[47,1085,658],{"class":371},[47,1087,238],{"class":211},[47,1089,737],{"class":215},[47,1091,740],{"class":57},[47,1093,743],{"class":215},[47,1095,1096,1099,1101,1103,1105],{"class":49,"line":320},[47,1097,1098],{"class":215},"             
.reset_index(",[47,1100,975],{"class":371},[47,1102,238],{"class":211},[47,1104,615],{"class":259},[47,1106,1107],{"class":215},"))\n",[47,1109,1110],{"class":49,"line":354},[47,1111,229],{"emptyLinePlaceholder":228},[47,1113,1114,1117,1120,1122,1124,1126,1128,1130,1132,1134,1136],{"class":49,"line":360},[47,1115,1116],{"class":215},"cleaned.to_excel(",[47,1118,1119],{"class":57},"\"orders_cleaned.xlsx\"",[47,1121,256],{"class":215},[47,1123,372],{"class":371},[47,1125,238],{"class":211},[47,1127,377],{"class":259},[47,1129,256],{"class":215},[47,1131,382],{"class":371},[47,1133,238],{"class":211},[47,1135,387],{"class":57},[47,1137,390],{"class":215},[47,1139,1140,1142,1144,1146,1148,1150,1153,1155,1158],{"class":49,"line":393},[47,1141,396],{"class":259},[47,1143,399],{"class":215},[47,1145,402],{"class":211},[47,1147,405],{"class":57},[47,1149,408],{"class":259},[47,1151,1152],{"class":215},"(cleaned)",[47,1154,414],{"class":259},[47,1156,1157],{"class":57}," rows to orders_cleaned.xlsx\"",[47,1159,390],{"class":215},[10,1161,1162,1163,1166],{},"Pass ",[14,1164,1165],{},"index=False"," so the reset index does not become a stray first column in the output.",[33,1168,1170],{"id":1169},"common-pitfalls","Common pitfalls",[1172,1173,1174,1190],"table",{},[1175,1176,1177],"thead",{},[1178,1179,1180,1184,1187],"tr",{},[1181,1182,1183],"th",{},"Symptom",[1181,1185,1186],{},"Cause",[1181,1188,1189],{},"Fix",[1191,1192,1193,1214,1232,1251,1264],"tbody",{},[1178,1194,1195,1199,1205],{},[1196,1197,1198],"td",{},"Real data rows disappear",[1196,1200,1201,1202,1204],{},"Default ",[14,1203,454],{}," drops any row with one missing cell",[1196,1206,1207,1208,1210,1211],{},"Use ",[14,1209,156],{}," or ",[14,1212,1213],{},"subset=[...]",[1178,1215,1216,1219,1226],{},[1196,1217,1218],{},"Visually blank rows survive",[1196,1220,1221,1223,1224],{},[14,1222,270],{}," is a string, not ",[14,1225,450],{},[1196,1227,1228,1231],{},[14,1229,1230],{},"df.replace(r\"^\\s*$\", pd.NA, 
regex=True)"," first",[1178,1233,1234,1241,1246],{},[1196,1235,1236,1237,1240],{},"Index reads ",[14,1238,1239],{},"0, 3, 7"," after drop",[1196,1242,1243,1245],{},[14,1244,539],{}," keeps the original index",[1196,1247,1248],{},[14,1249,1250],{},".reset_index(drop=True)",[1178,1252,1253,1256,1259],{},[1196,1254,1255],{},"Extra unnamed column in output",[1196,1257,1258],{},"Reset index written to file",[1196,1260,1261],{},[14,1262,1263],{},"to_excel(..., index=False)",[1178,1265,1266,1269,1272],{},[1196,1267,1268],{},"First data rows are blank\u002Fgarbled",[1196,1270,1271],{},"A multi-row header was read as data",[1196,1273,1274,1210,1277],{},[14,1275,1276],{},"pd.read_excel(..., header=[0, 1])",[14,1278,1279],{},"skiprows=N",[10,1281,1282,1283,1286,1287,1290],{},"The multi-row-header case is common with exported reports: a banner or merged title row above the real header makes pandas read junk rows. Use ",[14,1284,1285],{},"skiprows"," to skip the banner, or ",[14,1288,1289],{},"header=[0, 1]"," for a genuine two-level header, rather than dropping the rows afterward.",[33,1292,1294],{"id":1293},"performance-and-scale-note","Performance and scale note",[10,1296,1297,1299,1300,1303,1304,1306,1307,1310,1311,1314],{},[14,1298,539],{}," and ",[14,1301,1302],{},"replace"," operate on whole columns at once rather than row by row in a Python loop, so files with hundreds of thousands of rows typically clean quickly. The regex ",[14,1305,1302],{}," is the slower of the two; if you only need to strip whitespace from one or two known string columns, target them directly with ",[14,1308,1309],{},"df[col] = df[col].str.strip().replace(\"\", pd.NA)"," instead of scanning the whole frame. 
For very large workbooks, read only the columns you need with ",[14,1312,1313],{},"usecols="," to cut memory before cleaning.",[33,1316,1318],{"id":1317},"frequently-asked-questions","Frequently asked questions",[10,1320,1321,1329,1330,1332,1333,1335,1336,1340,1341,433],{},[444,1322,1323,1324,1299,1326,1328],{},"What is the difference between ",[14,1325,156],{},[14,1327,454],{},"?"," ",[14,1331,156],{}," drops a row only when every cell is missing; ",[14,1334,454],{}," (the default) drops a row when ",[1337,1338,1339],"em",{},"any"," cell is missing. For removing blank rows you almost always want ",[14,1342,156],{},[10,1344,1345,1350,1351,1353,1354,1356,1357,1359,1360,1362,1363,433],{},[444,1346,1347,1348,1328],{},"Why do my whitespace-only rows survive ",[14,1349,539],{}," Because ",[14,1352,270],{}," is a non-null string. pandas only treats true ",[14,1355,450],{},"\u002F",[14,1358,260],{}," as missing. Run ",[14,1361,1230],{}," first to convert blank strings to ",[14,1364,450],{},[10,1366,1367,1370,1371,1374],{},[444,1368,1369],{},"How do I drop rows that are missing only certain columns?"," Use ",[14,1372,1373],{},"df.dropna(subset=[\"OrderID\", \"Customer\"])",". The row is dropped only if a value in one of the listed columns is missing.",[10,1376,1377,1383,1384,1387,1388,433],{},[444,1378,1379,1380,1382],{},"Does ",[14,1381,539],{}," modify the DataFrame in place?"," No, it returns a new DataFrame by default. Reassign the result (",[14,1385,1386],{},"df = df.dropna(...)",") or pass ",[14,1389,1390],{},"inplace=True",[10,1392,1393,1329,1396,1398,1399,1401],{},[444,1394,1395],{},"Why does my cleaned index have gaps?",[14,1397,539],{}," keeps original index labels. 
Call ",[14,1400,1250],{}," to renumber from zero.",[33,1403,1405],{"id":1404},"conclusion","Conclusion",[10,1407,1408,1409,1411,1412,1356,1414,1356,1416,1418],{},"Removing blank rows reliably is three steps: normalize whitespace to ",[14,1410,450],{},", drop with the right ",[14,1413,498],{},[14,1415,658],{},[14,1417,858],{}," rule, then reset the index before writing. Skipping the normalization step is the single most common reason \"empty\" rows survive.",[33,1420,1422],{"id":1421},"where-to-go-next","Where to go next",[1424,1425,1426,1432,1439],"ul",{},[1427,1428,1429,1431],"li",{},[27,1430,30],{"href":29}," — the parent cluster.",[1427,1433,1434,1438],{},[27,1435,1437],{"href":1436},"\u002Fadvanced-data-transformation-and-cleaning\u002Fcleaning-excel-data-with-pandas\u002Fpandas-drop-duplicates-from-excel-column\u002F","Pandas: Drop Duplicates From an Excel Column"," — remove repeated rows after dropping blanks.",[1427,1440,1441,1445],{},[27,1442,1444],{"href":1443},"\u002Fadvanced-data-transformation-and-cleaning\u002Fhandling-missing-data-in-excel-reports\u002F","Handling Missing Data in Excel Reports"," — fill, flag, or impute the gaps you keep.",[1447,1448,1449],"style",{},"html pre.shiki code .sScJk, html code.shiki .sScJk{--shiki-default:#6F42C1;--shiki-dark:#B392F0}html pre.shiki code .sZZnC, html code.shiki .sZZnC{--shiki-default:#032F62;--shiki-dark:#9ECBFF}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: 
var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .szBVR, html code.shiki .szBVR{--shiki-default:#D73A49;--shiki-dark:#F97583}html pre.shiki code .sVt8B, html code.shiki .sVt8B{--shiki-default:#24292E;--shiki-dark:#E1E4E8}html pre.shiki code .sj4cs, html code.shiki .sj4cs{--shiki-default:#005CC5;--shiki-dark:#79B8FF}html pre.shiki code .s4XuR, html code.shiki .s4XuR{--shiki-default:#E36209;--shiki-dark:#FFAB70}html pre.shiki code .sJ8bj, html code.shiki .sJ8bj{--shiki-default:#6A737D;--shiki-dark:#6A737D}",{"title":43,"searchDepth":225,"depth":225,"links":1451},[1452,1453,1454,1455,1456,1457,1458,1459,1460,1461,1462,1463,1464],{"id":35,"depth":225,"text":36},{"id":198,"depth":225,"text":199},{"id":436,"depth":225,"text":437},{"id":523,"depth":225,"text":524},{"id":648,"depth":225,"text":649},{"id":767,"depth":225,"text":768},{"id":882,"depth":225,"text":883},{"id":997,"depth":225,"text":998},{"id":1169,"depth":225,"text":1170},{"id":1293,"depth":225,"text":1294},{"id":1317,"depth":225,"text":1318},{"id":1404,"depth":225,"text":1405},{"id":1421,"depth":225,"text":1422},"2026-06-18","Strip blank rows from an Excel file with pandas: dropna(how='all'), subset and thresh, whitespace-only cells, index resets, and writing the cleaned file back.","md",[1469,1472,1475,1477,1480],{"q":1470,"a":1471},"What is the difference between how=\"all\" and how=\"any\"?","how=\"all\" drops a row only when every cell is missing; how=\"any\" (the default) drops a row when *any* cell is missing. For removing blank rows you almost always want how=\"all\".",{"q":1473,"a":1474},"Why do my whitespace-only rows survive dropna?","Because \" \" is a non-null string. pandas only treats true NaN\u002FNone as missing. 
Run df.replace(r\"^\\s*$\", pd.NA, regex=True) first to convert blank strings to NaN.",{"q":1369,"a":1476},"Use df.dropna(subset=[\"OrderID\", \"Customer\"]). The row is dropped only if a value in one of the listed columns is missing.",{"q":1478,"a":1479},"Does dropna modify the DataFrame in place?","No, it returns a new DataFrame by default. Reassign the result (df = df.dropna(...)) or pass inplace=True.",{"q":1395,"a":1481},"dropna keeps original index labels. Call .reset_index(drop=True) to renumber from zero.",{},"\u002Fadvanced-data-transformation-and-cleaning\u002Fcleaning-excel-data-with-pandas\u002Fremove-blank-rows-from-excel-with-pandas",{"title":5,"description":1485},"Delete empty Excel rows with pandas dropna: how='all', subset, thresh, whitespace-only cells, reset_index, and a clean write-back. Runnable examples included.","remove-blank-rows-from-excel-with-pandas","advanced-data-transformation-and-cleaning\u002Fcleaning-excel-data-with-pandas\u002Fremove-blank-rows-from-excel-with-pandas\u002Findex","long_tail","lgYuk34XPSlCpkwdf1U7P7ZdTcViGpF9QTEY2IvrxcI",[1491,1494],{"title":1437,"path":1492,"stem":1493,"children":-1},"\u002Fadvanced-data-transformation-and-cleaning\u002Fcleaning-excel-data-with-pandas\u002Fpandas-drop-duplicates-from-excel-column","advanced-data-transformation-and-cleaning\u002Fcleaning-excel-data-with-pandas\u002Fpandas-drop-duplicates-from-excel-column\u002Findex",{"title":1495,"path":1496,"stem":1497,"children":-1},"Creating Pivot Tables from Excel Data with Pandas","\u002Fadvanced-data-transformation-and-cleaning\u002Fcreating-pivot-tables-from-excel-data","advanced-data-transformation-and-cleaning\u002Fcreating-pivot-tables-from-excel-data\u002Findex",1781795518764]