Iarampatta
New Member
- Joined
- Oct 20, 2015
- Messages
- 3
Hello,
I am working with a .csv file produced by a proprietary application that includes 4 different tables all jammed together into a single file. Here is a link to a dummy version of that .csv to highlight how the tables are arranged. The key point is that each table has a different number of rows and a different number of columns but are always separated by one blank row.
I do a pull from our platform on a weekly basis and, as the data is always updating as we add new deals to our database, the number of rows in each table will grow every time I do a download. As each table has fundamentally different information on it, I want to interrogate the data separately as four separate queries. My current process is to take the .csv and manually separate it into four .csv files which I then pull into Power BI separately and interrogate as 4 separate queries. Some examples of questions my dashboard should be able to answer include:
1) "How many M&A deals do we have on the database?" (A countrows on the first table).
2) "What proportion of deals in March were SBOs" (A countif type function of the second table)
3) A scatter plot of M&A deals by date and size
The analysis is relatively simple (even for me, a beginner), but it's the automation of loading the data into Power BI without having to split the .csv each week. It doesn't take long but introduces human error (especially if this task moves to someone else).
Is there a way for me to import the .csv into Power BI and then split the four tables into four queries? The clear delimiter here is the blank row that separates each table from each other. Obviously, the difficulty is the fact that this is a recurring exercise, so the automated solution needs to be able to deal with a different number of rows each time.
Ideally, I'd like to be able to get the dashboard to a situation where I just need to download the .csv, move it into the correct folder and overwrite the existing file, then click refresh on Power BI to get the most up-to-date stats.
I'd appreciate any steer as I know this is probably an M code solution. If you need any more detail on what I am looking to achieve, please let me know.
I am working with a .csv file produced by a proprietary application that includes 4 different tables all jammed together into a single file. Here is a link to a dummy version of that .csv to highlight how the tables are arranged. The key point is that each table has a different number of rows and a different number of columns but are always separated by one blank row.
I do a pull from our platform on a weekly basis and, as the data is always updating as we add new deals to our database, the number of rows in each table will grow every time I do a download. As each table has fundamentally different information on it, I want to interrogate the data separately as four separate queries. My current process is to take the .csv and manually separate it into four .csv files which I then pull into Power BI separately and interrogate as 4 separate queries. Some examples of questions my dashboard should be able to answer include:
1) "How many M&A deals do we have on the database?" (A countrows on the first table).
2) "What proportion of deals in March were SBOs" (A countif type function of the second table)
3) A scatter plot of M&A deals by date and size
The analysis is relatively simple (even for me, a beginner), but it's the automation of loading the data into Power BI without having to split the .csv each week. It doesn't take long but introduces human error (especially if this task moves to someone else).
Is there a way for me to import the .csv into Power BI and then split the four tables into four queries? The clear delimiter here is the blank row that separates each table from each other. Obviously, the difficulty is the fact that this is a recurring exercise, so the automated solution needs to be able to deal with a different number of rows each time.
Ideally, I'd like to be able to get the dashboard to a situation where I just need to download the .csv, move it into the correct folder and overwrite the existing file, then click refresh on Power BI to get the most up-to-date stats.
I'd appreciate any steer as I know this is probably an M code solution. If you need any more detail on what I am looking to achieve, please let me know.