Data Analysis of Indian Startups
Data Visualization of Startup Company Dataset
Introduction:
I took the Indian startup data from Kaggle (https://www.kaggle.com/datasets/ashishraut64/indian-startups-top-300/data).
The dataset covers 300 Indian startup companies and has the following columns:
- Company - Name of the Startup
- City - The city in which the startup was started
- Starting Year - The year in which the startup was started
- Founders - Names of the founders of the startup
- Industries - The Industrial domain in which the startup falls
- No. of Employees - Number of employees in the startup
- Funding Amount in USD - Total funding raised by the startup
- Funding Rounds - The number of times a startup has gone back to the market to raise more capital. In each round, founders trade equity in their business for capital they can use to advance the company to the next level
- No. of Investors - Number of investors in the startup.
I analyzed the dataset using Power BI. I imported the CSV file into Power BI and transformed the data in Power Query.
Transforming data includes:
- Checked if any null values exist for each column.
- Removed one row because it contained a "Not Available" value in the Number of Employees column.
- Used the split columns feature to split the Number of Employees column into two columns on the "-" delimiter. For example, if the Number of Employees column has a value of 11-50, the value 11 goes into the first new column and 50 goes into the second.
- Values like "10001+" have no upper bound, so the second column is null for those rows. I replaced those nulls with "10001" and changed the datatype of both split columns from text to number so that the two columns could be averaged.
- Created a column for the average number of employees and changed its datatype from decimal to whole number, since it represents a count of employees.
- Loaded the data into Power BI.
- Created Bins for the Starting Year Column with a Bin size of 5.
- Created groups for the Number of Investors column: 0 to 10 investors is the Low Investor group, 11 to 25 is the Medium Investor group, and more than 25 is the High Investor group.
- Created groups for the Funding Rounds column: 1 to 5 rounds is Funding A, 6 to 15 is Funding B, and 16 to 25 is Funding C.
- Then created visualizations using a Donut chart, Line chart, Area chart, Stacked bar chart and Clustered column chart.
Observations:
- The largest number of startups were started in 2015, which is the peak of the line chart.
- The Low Investor group has the highest percentage in the donut chart, which indicates that most startups in the dataset have 10 or fewer investors.
- Funding was low for the earliest starting years and began to rise in 1995. The highest peak in the area chart is at 2005, which has both the maximum funding amount and the maximum number of employees.
- Most of the startups are in early funding rounds and have fewer than 1000 employees.
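
For readers who prefer code to Power BI menus, the employee-range, binning, and grouping steps described above can be approximated in plain Python. This is only a sketch of the same logic, not what Power BI runs internally. The function names are hypothetical, the sample rows are made up, and two details are assumptions: that year bins start on multiples of 5, and that the average is rounded to the nearest whole number (Power BI's rounding may differ).

```python
from typing import Optional


def average_employees(size_range: str) -> Optional[int]:
    """Midpoint of a range such as "11-50".

    An open-ended value such as "10001+" has no upper bound, so the lower
    bound is reused, mirroring the "replace null with 10001" step.
    """
    if size_range == "Not Available":
        return None  # the post removes this row instead of keeping it
    low, _, high = size_range.rstrip("+").partition("-")
    low_n = int(low)
    high_n = int(high) if high else low_n
    return round((low_n + high_n) / 2)  # assumption: round to nearest


def year_bin(year: int, size: int = 5) -> int:
    """Lower edge of the bin, assuming bins start on multiples of `size`."""
    return (year // size) * size


def investor_group(investors: int) -> str:
    """0 to 10 is Low, 11 to 25 is Medium, more than 25 is High."""
    if investors <= 10:
        return "Low Investor"
    if investors <= 25:
        return "Medium Investor"
    return "High Investor"


def funding_group(rounds: int) -> Optional[str]:
    """1 to 5 is Funding A, 6 to 15 is Funding B, 16 to 25 is Funding C."""
    if 1 <= rounds <= 5:
        return "Funding A"
    if 6 <= rounds <= 15:
        return "Funding B"
    if 16 <= rounds <= 25:
        return "Funding C"
    return None  # outside the ranges defined in the post


# Made-up sample rows: (employees, starting year, investors, funding rounds)
sample = [
    ("11-50", 1998, 3, 2),
    ("10001+", 2011, 30, 17),
    ("Not Available", 2015, 7, 4),
    ("501-1000", 2016, 12, 8),
]

rows = [
    {
        "avg_employees": average_employees(emp),
        "year_bin": year_bin(year),
        "investor_group": investor_group(inv),
        "funding_group": funding_group(rounds),
    }
    for emp, year, inv, rounds in sample
    if emp != "Not Available"  # drop the row with no employee range
]

for row in rows:
    print(row)
```

Keeping each rule in its own small function makes it easy to adjust a threshold, for example the bin size or the investor cut-offs, without touching the rest.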