1+ ![PyPI - Version](https://img.shields.io/pypi/v/panda-helper)
2+ [![Download Stats](https://img.shields.io/pypi/dm/panda-helper)](https://pypistats.org/packages/panda-helper)
13![PyPI - Python Version](https://img.shields.io/pypi/pyversions/panda-helper)
2- ![Tests Status](https://github.com/ray310/Panda-Helper/actions/workflows/pytest_old.yml/badge.svg)
3- ![Lint/Format Status](https://github.com/ray310/Panda-Helper/actions/workflows/format_lint_old.yml/badge.svg)
4+ ![Tests Status](https://github.com/ray310/Panda-Helper/actions/workflows/pytest.yml/badge.svg)
5+ ![Lint/Format Status](https://github.com/ray310/Panda-Helper/actions/workflows/format_lint.yml/badge.svg)
46
57# Panda-Helper: Quickly and easily inspect data
6- Panda-Helper is a simple data-profiling utility for Pandas DataFrames and Series
8+ Panda-Helper is a simple data-profiling utility for Pandas' DataFrames and Series.
79
8- Assess data quality and usefulness with minimal effort
10+ Assess data quality and usefulness with minimal effort.
911
10- Quickly perform initial data exploration, _so you can move on to more in-depth analysis_
12+ Quickly perform initial data exploration, _so you can move on to more in-depth analysis_.
1113
1214-----
1315### DataFrame profiles:
@@ -23,7 +25,7 @@ _Vehicles passing through toll stations_
2325 ------------------------- ------------
2426 DF Shape (1586280, 6)
2527 Duplicated Rows 2184
26-
28+
2729 Column Name Data Type
2830 -------------------------- -----------
2931 Plaza ID int64
@@ -32,7 +34,7 @@ _Vehicles passing through toll stations_
3234 Direction object
3335 # Vehicles - ETC (E-ZPass) int64
3436 # Vehicles - Cash/VToll int64
35-
37+
3638 Summary of Nulls Per Row
3739 -------------------------- -----------
3840 count 1.58628e+06
@@ -53,12 +55,12 @@ _Vehicles passing through toll stations_
5355
5456-----
5557### Series profiles report the:
56- - Series data type
58+ - Series data type
5759- Count of non-null values in the series
5860- Number of unique values
5961- Count of null values
6062- Counts and frequency of the most and least common values
61- - Distribution statistics for numeric data
63+ - Distribution statistics for numeric-like data
6264
6365__Sample profile of categorical data__<br>
6466_Direction vehicles are traveling_
@@ -69,7 +71,7 @@ _Direction vehicles are traveling_
6971 Count 1586280
7072 Unique Values 2
7173 Null Values 0
72-
74+
7375 Value Count % of total
7476 ------- ------- ------------
7577 I 814100 51.32%
@@ -84,7 +86,7 @@ _Hourly vehicle counts at tolling points_
8486 Count 1586280
8587 Unique Values 8987
8688 Null Values 0
87-
89+
8890 Value Count % of total
8991 ------- ------- ------------
9092 0 3137 0.20%
@@ -112,7 +114,7 @@ _Hourly vehicle counts at tolling points_
112114 8876 1 0.00%
113115 8261 1 0.00%
114116 8694 1 0.00%
115-
117+
116118 Statistic Value
117119 ------------------------- ---------------
118120 count 1.58628e+06
@@ -141,7 +143,7 @@ __Profiling a DataFrame__<br>
141143Create the DataFrameProfile and then display it or save the profile.
142144```python
143145import pandas as pd
144- import pandahelper.reports as ph
146+ import pandahelper as ph
145147
146148data = {
147149    "user_id": [1, 2, 3, 4, 4],
@@ -158,14 +160,14 @@ df_profile
158160 ------------------------- ------
159161 DF Shape (5, 4)
160162 Obviously Duplicated Rows 1
161-
163+
162164 Column Name Data Type
163165 ------------- -----------
164166 user_id int64
165167 transaction object
166168 amount float64
167169 survey object
168-
170+
169171 Summary of Nulls Per Row
170172 -------------------------- --------
171173 count 5
@@ -183,7 +185,7 @@ df_profile
183185 median absolute deviation 1
184186 standard deviation 0.83666
185187 skew 0.512241
186-
188+
187189```python
188190df_profile.save_report("df_profile.txt")
189191```
@@ -200,13 +202,13 @@ series_profile
200202 Count 4
201203 Unique Values 3
202204 Null Values 1
203-
205+
204206 Value Count % of total
205207 ------- ------- ------------
206208 85.12 2 50.00%
207209 100 1 25.00%
208210 1400 1 25.00%
209-
211+
210212 Statistic Value
211213 ------------------------- ----------
212214 count 4
@@ -224,7 +226,7 @@ series_profile
224226 median absolute deviation 7.44
225227 standard deviation 654.998
226228 skew 1.99931
227-
229+
228230```python
229231series_profile.save_report("amount_profile.txt")
230232```
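
The headline figures in the Series profile above (count, unique values, null values, and value frequencies) can be approximated with the standard library alone. This is only a sketch and not Panda-Helper's implementation: `profile_values` is a hypothetical helper, `None` stands in for a null, and the sample list is an assumption chosen to match the counts in the `amount` profile, since the original row order is not shown.

```python
from collections import Counter


def profile_values(values):
    """Return basic profile figures for a list in which None marks a null."""
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    total = len(non_null)
    return {
        "count": total,                 # non-null values
        "unique": len(counts),          # distinct non-null values
        "nulls": len(values) - total,   # null values
        # (value, count, share of non-null values), most common first
        "frequencies": [
            (value, n, f"{n / total:.2%}") for value, n in counts.most_common()
        ],
    }


# Hypothetical sample data: two repeated amounts and one null.
amounts = [85.12, 100.0, 1400.0, 85.12, None]
profile = profile_values(amounts)

print("Count", profile["count"])
print("Unique Values", profile["unique"])
print("Null Values", profile["nulls"])
for value, n, pct in profile["frequencies"]:
    print(f"{value:>8} {n:>5} {pct:>8}")
```

Percentages here are taken over non-null values, which is consistent with the 50.00% shown for a value that appears twice among four non-null entries.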