CSV Column Analyser — Free Online CSV Statistics & Data Profiler
Upload or paste any CSV and instantly see statistics for every column — mean, median, std dev, null %, unique values, top-5 frequency bars, outlier count, and a data quality score. Export the full analysis report as CSV. Free, browser-based, and no data is sent to any server.
Upload a CSV file or paste CSV text — instant per-column statistics
Auto-detects delimiter (comma, semicolon, tab, pipe). All processing happens in your browser — your data never leaves your device.
The most complete free CSV column analyser online
How to analyse a CSV file online in 6 steps
LazyTools vs other free CSV analysers
We compared the leading free online CSV statistics tools. LazyTools is the only tool offering paste mode, data quality scores, outlier detection, column search, CSV report export, and markdown copy — all without an account.
| Feature | LazyTools | StatSim Profile | Nextooly | androiddevhub | quicktable.io | csv-stats lovable |
|---|---|---|---|---|---|---|
| Paste CSV text | ✅ Yes | ❌ File only | ✅ Yes | ❌ File only | ❌ File only | ❌ File only |
| Data quality score | ✅ Per column | ❌ No | ❌ No | ❌ No | ❌ No | ❌ No |
| Outlier detection | ✅ 2-SD rule | ❌ No | ❌ No | ❌ No | ❌ No | ❌ No |
| Min / Max / Mean / Median | ✅ All four | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| Mode + IQR | ✅ Both | ❌ No | ❌ No | ❌ No | ❌ No | ❌ No |
| Top-5 values with bars | ✅ Visual bars | ✅ No bars | ✅ Bar chart | ❌ No | ❌ No | ❌ No |
| Inline histogram | ✅ 10-bin | ✅ Yes | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Export stats as CSV | ✅ Yes | ❌ No | ❌ No | ❌ No | ✅ Yes | ❌ No |
| Copy as markdown table | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ No | ❌ No |
| Column search/filter | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ No | ❌ No |
| 100% browser-based | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ❌ Server upload | ✅ Yes |
| No signup required | ✅ Never | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
All computed statistics — formulas and interpretation
| Statistic | Applies to | Formula / Method | Interpretation |
|---|---|---|---|
| Count (non-null) | Both | Rows with a non-empty value | How many rows actually have a value in this column |
| Null count | Both | Rows with an empty or blank value | Number of missing values — high counts indicate incomplete data |
| Null % | Both | (nulls / total rows) × 100 | Percentage of missing values — drives the quality score |
| Quality score | Both | 100 − null % | 0–100 completeness score: 100 = no nulls, 0 = all nulls |
| Unique values | Both | Count of distinct non-null values | Low unique count relative to row count indicates repetitive / categorical data |
| Minimum | Numeric | Smallest parsed numeric value | Lower bound of the column range; unexpectedly low values may be errors |
| Maximum | Numeric | Largest parsed numeric value | Upper bound; unexpectedly high values may be outliers |
| Mean | Numeric | Sum of values / count | Arithmetic average; sensitive to outliers |
| Median | Numeric | Middle value of sorted list | Central tendency; robust to outliers — better than mean for skewed data |
| Mode | Numeric | Most frequently occurring value | Most common value; useful for identifying default or dominant values |
| Std Deviation | Numeric | sqrt(variance) | Average distance from the mean; higher = more spread out data |
| Variance | Numeric | Mean of squared deviations | Square of std dev; used in statistical tests and ML feature scaling |
| Sum | Numeric | Total of all non-null values | Useful for financial and count columns |
| Q1 (25th pctile) | Numeric | Value at 25% position when sorted | Lower quartile; 25% of values fall below this point |
| Q3 (75th pctile) | Numeric | Value at 75% position when sorted | Upper quartile; 75% of values fall below this point |
| IQR | Numeric | Q3 − Q1 | Interquartile range; middle 50% spread; used in box-plot outlier detection |
| Outlier count | Numeric | Values where \|x − mean\| > 2 × std dev | Count of statistically extreme values that may be errors or anomalies |
| Avg value length | Text | Mean character count of non-null values | Useful for validating field length constraints in databases |
| Top 5 values | Both | 5 most frequent non-null values with counts | Reveals dominant categories, common entries, and potential cardinality issues |
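The numeric formulas in the table above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the tool's actual implementation (the analyser runs entirely in the browser):

```python
import statistics

def profile_numeric(values, total_rows):
    """Compute the core numeric-column statistics from the table above.
    `values` may contain None entries representing empty CSV cells."""
    nums = sorted(v for v in values if v is not None)
    n = len(nums)
    null_pct = (total_rows - n) / total_rows * 100
    mean = statistics.fmean(nums)
    sd = statistics.pstdev(nums)                 # population std deviation
    q1, _, q3 = statistics.quantiles(nums, n=4)  # quartiles at 25/50/75%
    return {
        "count": n,
        "null_pct": round(null_pct, 1),
        "quality_score": round(100 - null_pct, 1),  # 100 minus null %
        "min": nums[0], "max": nums[-1],
        "mean": mean,
        "median": statistics.median(nums),
        "std_dev": sd,
        "q1": q1, "q3": q3, "iqr": q3 - q1,
        # 2-SD rule: count values more than 2 std devs from the mean
        "outliers": sum(1 for x in nums if abs(x - mean) > 2 * sd),
    }
```

Feeding the function one column with a single null and a single extreme value reproduces the behaviour described above: the null shows up in the quality score, and the extreme value is counted as an outlier.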
CSV Column Analyser Guide — Data Profiling and Descriptive Statistics
Data profiling is the process of examining a dataset to understand its structure, content, and quality before using it for analysis, reporting, or machine learning. Column-level analysis — examining each column in a CSV file individually — is the most important step in any data profiling workflow. It reveals which columns have missing values, which contain outliers, what the typical ranges and distributions are, and where data quality issues lurk that could corrupt downstream analysis.
What is exploratory data analysis (EDA) on CSV files?
Exploratory data analysis (EDA) is the practice of summarising and visualising a dataset’s key characteristics before formal modelling or analysis. For CSV files, EDA typically starts with column-level statistics: count, null rate, minimum, maximum, mean, and median for numeric columns; unique value count and most frequent values for text columns. These statistics form a data profile that guides decisions about data cleaning, feature engineering, and modelling strategy. The LazyTools CSV Column Analyser automates this entire first-pass EDA in seconds, eliminating the need to write Python pandas code or open a Jupyter notebook for initial data inspection.
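For comparison, the same first-pass profile written by hand in pandas takes a few lines (a minimal sketch with made-up sample data):

```python
import io
import pandas as pd

# Hypothetical sample data with missing cells in two columns
csv_text = """name,age,salary
alice,34,52000
bob,,48000
carol,29,
"""

df = pd.read_csv(io.StringIO(csv_text))

# describe() gives count, mean, std, min, quartiles, max for numeric columns
print(df.describe())

# Per-column null percentage — the basis of a completeness profile
print(df.isna().mean() * 100)
```

The analyser produces the equivalent summary for every column at once, without writing any code.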
How to identify missing data in a CSV file
Missing data appears as empty cells in a CSV file — cells where a value should exist but was not recorded, not collected, or was lost during export. The null count and null percentage for each column directly quantify missing data. A column with 30% null values should raise immediate questions: was data collection optional for that field? Were records incomplete at source? Is the null pattern random (MCAR), related to other variables (MAR), or systematic (MNAR)? The answer determines the appropriate imputation strategy. The data quality score in the LazyTools analyser (100 minus null %) gives an at-a-glance health check: a score below 70 typically means the column requires cleaning or imputation before use in analysis.
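The quality score and its traffic-light bands described above reduce to a couple of lines (a sketch of the stated formula, not the tool's source):

```python
def quality_score(null_count, total_rows):
    """Completeness score: 100 minus the null percentage."""
    return 100 - (null_count / total_rows * 100)

def health_band(score):
    """Traffic-light bands used in this guide: below 90 amber, below 70 red."""
    if score >= 90:
        return "green"
    if score >= 70:
        return "amber"
    return "red"
```

A column missing 30 of 100 values scores exactly 70, sitting right on the amber/red boundary where cleaning or imputation becomes necessary.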
Understanding mean vs median in CSV data analysis
The mean and median are both measures of central tendency — the typical or central value in a distribution — but they respond very differently to skewed data and outliers. The mean (average) sums all values and divides by the count. One extreme value can shift the mean dramatically. For example, in a salary dataset where most employees earn $40,000–$80,000 but one executive earns $2,000,000, the mean is pulled far above the actual typical salary. The median, the middle value in a sorted list, is unaffected by this single outlier and gives a much better picture of the typical salary. When mean and median differ significantly in your CSV column analysis, it is a strong signal that the distribution is skewed or contains outliers that require investigation.
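The salary example above is easy to verify directly (hypothetical figures matching the scenario in the text):

```python
import statistics

# Most salaries cluster between $40,000 and $80,000; one executive earns $2M
salaries = [40_000, 55_000, 60_000, 70_000, 80_000, 2_000_000]

mean = statistics.fmean(salaries)     # dragged far above the typical salary
median = statistics.median(salaries)  # unaffected by the single extreme value
```

Here the median lands at $65,000, squarely in the typical range, while the mean exceeds $380,000, a value no actual employee earns. A large gap between the two is the skew signal described above.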
How to detect outliers in a CSV column
The LazyTools CSV Column Analyser uses the two-standard-deviation rule for outlier detection: any value more than 2 standard deviations above or below the mean is flagged as an outlier. For a normal distribution, approximately 95.4% of values fall within 2 standard deviations of the mean, so about 4.6% of values would be flagged even in perfectly clean data. In practice, an outlier count significantly higher than 5% of the column’s non-null values warrants investigation. Outliers can represent data entry errors (a salary entered as 500000 instead of 50000), measurement errors (a sensor recording 9999 when it fails), or genuinely unusual cases that are valid but need separate treatment in modelling.
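The two-standard-deviation rule is straightforward to express in code (an illustrative sketch; the analyser's in-browser implementation may differ in detail):

```python
import statistics

def two_sd_outliers(values):
    """Flag values more than 2 standard deviations from the column mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [x for x in values if abs(x - mean) > 2 * sd]
```

Running it on the data-entry-error scenario from the text (a salary typed as 500000 instead of 50000) flags exactly the bad value.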
Data quality scoring for CSV columns
Data quality is multi-dimensional — it encompasses completeness (are all values present?), consistency (are values in the expected format?), accuracy (do values reflect reality?), uniqueness (are there unexpected duplicates?), and validity (do values fall within expected ranges?). The LazyTools quality score focuses on completeness — the proportion of non-null values — as it is the most universally applicable and directly measurable dimension from raw CSV data. A score of 100 means every row has a value in that column. A score of 80 means 20% of rows are missing data. Combine the quality score with unique count (to detect duplicate-heavy columns) and the outlier count (to assess range validity) for a fuller picture of column quality.
Using CSV statistics for data cleaning decisions
Column statistics directly inform data cleaning decisions. A column with 90%+ null values may be safe to drop from the dataset. A numeric column where mean and median differ by more than 50% likely has outliers that need capping or removal before regression modelling. A text column with only 5 unique values across 10,000 rows is a categorical variable that should be one-hot encoded or label-encoded for machine learning. A column where the minimum value is negative in a dataset that should only contain positive quantities (e.g. order quantities) signals a data entry or extraction error. A column with 100% unique values across all rows is a candidate for a primary key or identifier field. All of these decisions start with the column statistics the LazyTools analyser computes.
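The rules of thumb above can be turned into automated suggestions once a column profile exists. The dictionary keys below are hypothetical, chosen to mirror the statistics the analyser reports; this is a sketch, not a prescribed workflow:

```python
def cleaning_hints(col):
    """Map column statistics to rule-of-thumb cleaning suggestions.
    `col` is a dict with hypothetical keys: null_pct, mean, median,
    unique, rows."""
    hints = []
    if col["null_pct"] >= 90:
        hints.append("mostly empty — consider dropping the column")
    if col.get("mean") and col.get("median"):
        if abs(col["mean"] - col["median"]) / abs(col["median"]) > 0.5:
            hints.append("mean and median differ by >50% — check for outliers")
    if col["unique"] <= 5 and col["rows"] >= 1000:
        hints.append("low cardinality — treat as a categorical variable")
    if col["unique"] == col["rows"]:
        hints.append("all values unique — candidate identifier / primary key")
    return hints
```

Applied to a 10,000-row column with 3 unique values and a mean double its median, it produces both the categorical-encoding and outlier-check suggestions.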
Exporting CSV analysis reports for documentation
Column analysis reports are valuable documentation assets in data engineering and data science projects. A data dictionary — a structured document describing each column in a dataset, its type, typical range, null rate, and unique value count — is a standard deliverable in enterprise data projects. The LazyTools Export Report CSV function generates a machine-readable version of the analysis results: one row per column, all statistics as columns. This can be imported into Excel, appended to a project wiki, or used to generate automated data quality alerts. The Copy Markdown Table function produces a formatted markdown table ready to paste into GitHub READMEs, Confluence pages, Notion databases, or any Markdown-aware documentation system.
CSV column analyser — 10 questions answered
A CSV column analyser examines each column in a CSV file and computes descriptive statistics: min, max, mean, median, mode, standard deviation, null rate, unique count, and top values. It is the first step in data profiling and exploratory data analysis (EDA) before cleaning, modelling, or reporting.
Upload a CSV file or paste raw CSV text. The tool auto-detects the delimiter, parses all columns, and shows per-column statistics cards. The summary card shows totals across the dataset. Export the full analysis as a CSV report or copy as a markdown table.
Numeric columns: min, max, mean, median, mode, std dev, variance, sum, Q1, Q3, IQR, outlier count, null count, null %, unique count. Text columns: non-null count, null count, null %, unique count, min/max/avg value length. All columns: top 5 most frequent values with counts and percentage bars.
The quality score is 100 minus the null percentage. A score of 100 means no missing values. Below 90 (amber) means more than 10% of values are missing. Below 70 (red) indicates significant missing data requiring imputation or column removal. It gives an immediate visual health check across all columns.
An outlier is a value more than 2 standard deviations from the column mean. For a normal distribution this flags about 4.6% of values. A significantly higher outlier count may indicate data entry errors, sensor failures, or genuine anomalies that need investigation before using the data in analysis.
The mean is the arithmetic average (sum divided by count). The median is the middle value when all values are sorted. Mean is sensitive to outliers; one extreme value can shift it significantly. Median is robust to outliers and better represents the typical value in skewed distributions. When they differ significantly, the distribution is skewed or contains outliers.
Yes. Use the Paste CSV tab to paste raw CSV text directly from a spreadsheet, database query result, or clipboard. No file upload needed. The tool auto-detects the delimiter from the pasted text. This feature is unique among free CSV analysers.
Auto-detects comma, semicolon, tab, and pipe delimiters. You can also override the auto-detection manually. Covers standard CSV (comma), European Excel exports (semicolon), TSV files (tab), and pipe-delimited database exports.
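Delimiter auto-detection works by inspecting a sample of the text and scoring candidate separators. The tool does this in the browser; the same idea is available in Python's standard library via `csv.Sniffer` (shown here with made-up semicolon-delimited data, as in a European Excel export):

```python
import csv

sample = "name;age;city\nalice;34;berlin\nbob;29;paris\n"

# Sniffer guesses the dialect from the sample, restricted to the
# four candidate delimiters the analyser supports
dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
rows = list(csv.reader(sample.splitlines(), dialect))
```

Restricting the candidate set (comma, semicolon, tab, pipe) makes detection more reliable than letting the sniffer consider every character.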
Yes. Export Report CSV downloads a structured CSV with one row per column and all statistics as columns — for documentation, sharing, or importing into reporting tools. Copy Markdown Table generates a formatted markdown table for GitHub READMEs, Notion, Confluence, or any markdown-aware tool.
LazyTools CSV Column Analyser is 100% free with no signup, no account, and no data upload. Upload or paste any CSV and get instant per-column statistics. Export the full analysis as a CSV report. Your data never leaves your browser.