Procedure for data analysis: some basic functions
1. df.shape: to print the number of rows and columns (shape is an attribute, so no parentheses)
2. df.head(): to print the first 5 rows
3. df.sample(5): to print a random sample of the data; 5 is the number of rows to print
4. df.info():
- to find the data type of each column
object: string
integer: int64 (8 bytes per value, the same as float64)
float: float64
- to find the number of non-null values in each column
- to find the memory space occupied by the data
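A rough sketch of the first four steps. The DataFrame here is made up for illustration (the column names and values are assumptions, not from a real dataset):

```python
import pandas as pd

# Hypothetical example data; columns and values are invented for illustration.
df = pd.DataFrame(
    {
        "name": ["Asha", "Ben", "Chen", "Dina", "Eli", "Fay"],
        "age": [23, 35, 31, 28, 40, 19],
        "salary": [42000.0, 58000.5, None, 51000.0, 73000.0, 30000.0],
    }
)

# shape is an attribute (no parentheses): (rows, columns)
print(df.shape)

# head() shows the first 5 rows by default
print(df.head())

# sample(5) picks 5 rows at random; random_state makes the pick repeatable
picked = df.sample(5, random_state=0)
print(picked)

# info() prints each column's dtype, its non-null count, and the memory usage
df.info()
```

In the info() output, the non-null count for salary should be one lower than the row count, because one value is missing.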
4. df.isnull().sum()
- to find the number of missing values in each column
5. df.describe(): to summarize the numeric columns: count, mean, standard deviation, min, quartiles, max
6. df.duplicated().sum()
- to find the number of duplicate rows in the dataset
- if there are any, use df.drop_duplicates() to remove them
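The missing-value, summary, and duplicate checks above could look like this. It is only a sketch on made-up data (the city and temp columns are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data with one missing value and one repeated row.
df = pd.DataFrame(
    {
        "city": ["Pune", "Delhi", "Pune", "Goa"],
        "temp": [31.0, None, 31.0, 29.5],
    }
)

# Number of missing values in each column
missing = df.isnull().sum()
print(missing)

# Summary statistics for the numeric columns
print(df.describe())

# Number of rows that repeat an earlier row
n_dupes = df.duplicated().sum()
print(n_dupes)

# Drop the repeats, keeping the first occurrence of each row
deduped = df.drop_duplicates()
print(deduped.shape)
```

drop_duplicates() returns a new DataFrame and leaves df unchanged, so assign the result if you want to keep it.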
7. df.corr()
to find the correlation between columns:
it gives the Pearson correlation coefficient c, with -1 <= c <= 1:
closer to +1: the two columns rise together (directly related);
closer to -1: one rises as the other falls (inversely related)
a column whose correlation with the output is very close to 0, e.g. 0.0000034, has almost no linear relationship with it, so it is a candidate for removal
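One possible way to act on the correlation step. This is a sketch, not a fixed recipe: the data, the choice of price as the output column, and the 0.05 cut-off are all assumptions for illustration. It also assumes pandas 1.5 or newer, where corr() accepts numeric_only.

```python
import pandas as pd

# Hypothetical data: "price" plays the role of the output column.
df = pd.DataFrame(
    {
        "area": [50, 60, 70, 80, 90],
        "age": [30, 25, 20, 15, 10],
        "noise": [1, -1, 0, -1, 1],
        "price": [100, 120, 140, 160, 180],
        "city": ["A", "B", "A", "B", "A"],
    }
)

# numeric_only=True skips text columns such as "city"
corr = df.corr(numeric_only=True)

# Correlation of every numeric input column with the output column
target_corr = corr["price"].drop("price")
print(target_corr)

# Columns whose correlation with the output is close to 0
threshold = 0.05  # arbitrary cut-off chosen for this example
weak = target_corr[target_corr.abs() < threshold].index.tolist()
print(weak)

# Drop the weak columns from the data
reduced = df.drop(columns=weak)
print(reduced.columns.tolist())
```

A near-zero Pearson value only rules out a straight-line relationship, so it may be worth plotting a column against the output before dropping it.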