You import a CSV. Names are in ALL CAPS. Emails have trailing spaces. Dates are in three different formats. Phone numbers have dashes, parentheses, and spaces mixed randomly. Half the rows are duplicates.

Before you can VLOOKUP, QUERY, or dashboard anything, you need clean data. This guide covers every data cleanup technique in Google Sheets.

Remove Duplicates

Built-in tool

Select your data range, then: Data > Data cleanup > Remove duplicates. Choose which columns to check for duplicates.

Formula: Find duplicates first

Flag duplicates in column A =IF(COUNTIF(A$2:A2, A2) > 1, "DUPLICATE", "")

This marks the second (and subsequent) occurrences as "DUPLICATE". Put it in a helper column, then filter and delete the flagged rows.

Formula: Get unique values

Extract unique values from a column =UNIQUE(A2:A)
Unique rows based on multiple columns =UNIQUE(A2:D)

Trim Whitespace

Extra spaces are the #1 cause of VLOOKUP failures. A cell that looks like "John Smith" might actually be " John Smith " with invisible leading/trailing spaces.

Remove leading, trailing, and extra spaces =TRIM(A2)
Remove ALL whitespace (including non-breaking spaces) =TRIM(CLEAN(SUBSTITUTE(A2, CHAR(160), " ")))

CHAR(160) is the non-breaking space character that often comes from web copy-paste. TRIM doesn't catch it by default.

Apply to entire column with ARRAYFORMULA =ARRAYFORMULA(IF(A2:A="", "", TRIM(A2:A)))

Standardize Text Case

Three case functions =UPPER(A2) → "JOHN SMITH" =LOWER(A2) → "john smith" =PROPER(A2) → "John Smith"
Clean + standardize in one formula =PROPER(TRIM(LOWER(A2)))
Standardize email addresses (always lowercase) =ARRAYFORMULA(IF(C2:C="", "", LOWER(TRIM(C2:C))))

Split Text into Columns

Built-in: Split text to columns

Select a column, then: Data > Split text to columns. Choose comma, space, or custom delimiter.

Formula: SPLIT

Split "John Smith" into first and last name =SPLIT(A2, " ")
Split email to get name and domain =SPLIT(C2, "@")

Extract specific parts with REGEXEXTRACT

Extract domain from email =REGEXEXTRACT(C2, "@(.+)")
Extract numbers from mixed text =REGEXEXTRACT(A2, "\d+")
Extract area code from phone number =REGEXEXTRACT(D2, "\(?(\d{3})")

Fix Dates

Dates stored as text are the second most common data problem. If your date column is left-aligned, it's text, not a real date.

Convert text dates to real dates // "03/15/2026" (text) → real date =DATEVALUE(A2) // "March 15, 2026" (text) → real date =DATEVALUE(A2) // "2026-03-15" (ISO format text) → real date =DATEVALUE(A2)
Standardize date format =TEXT(DATEVALUE(A2), "YYYY-MM-DD")
Handle mixed date formats with REGEXMATCH =IF(REGEXMATCH(A2, "^\d{4}-"), DATEVALUE(A2), IF(REGEXMATCH(A2, "^\d{1,2}/"), DATEVALUE(A2), "UNKNOWN FORMAT" ) )

Clean Phone Numbers

Strip all non-digit characters =REGEXREPLACE(D2, "\D", "")
Format as (123) 456-7890 =IF(LEN(REGEXREPLACE(D2,"\D",""))=10, "("&MID(REGEXREPLACE(D2,"\D",""),1,3)&") " &MID(REGEXREPLACE(D2,"\D",""),4,3)&"-" &MID(REGEXREPLACE(D2,"\D",""),7,4), D2 )

Find and Replace with SUBSTITUTE

Replace text in a column // Remove commas from numbers =SUBSTITUTE(A2, ",", "") // Replace "N/A" with empty string =IF(A2="N/A", "", A2) // Replace multiple values =SUBSTITUTE(SUBSTITUTE(A2, "Inc.", ""), "LLC", "")

Validate Data

Check if email is valid format =REGEXMATCH(C2, "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$" )
Check if a value is a number =ISNUMBER(A2)
Flag rows with missing required fields =IF(OR(A2="", B2="", C2=""), "INCOMPLETE", "OK")

The Data Cleanup Workflow

  1. Duplicate your data. Work on a copy. Never clean the original.
  2. Remove duplicates. Data > Data cleanup > Remove duplicates.
  3. Trim whitespace. ARRAYFORMULA + TRIM on all text columns.
  4. Standardize case. LOWER for emails, PROPER for names.
  5. Fix dates. DATEVALUE + format as YYYY-MM-DD.
  6. Standardize formats. Phone numbers, addresses, codes.
  7. Validate. Flag incomplete rows, invalid emails, out-of-range numbers.
  8. Paste values. When done, Ctrl+Shift+V to paste formulas as values.
Clean data isn't glamorous, but it's the foundation of every formula, dashboard, and report that works. Garbage in, garbage out. Clean data in, insights out.