Cozkincaud
New Member
- Joined
- Apr 18, 2019
- Messages
- 8
Hi,
I need help determining the best way to extract rows from an Excel r x c matrix with all columns extracted and the rows extracted according to a text value that might appear in any, but only one of the array columns.
Let me explain.
What I want to do is very simple: extract from this matrix all rows and all the columns where any column for any row has a specified text value.
The result should be a reduced version of the original r x c matrix. This reduced matrix will have fewer rows, but the same number of columns as the original matrix
While I have been using Excel for nearly forty years, until now I have not needed to analyze tables or databases. Microsoft Access provided what I needed for simple database manipulation and I used SPSS for statistical analysis.
Based on what I have Googled to date, I could do my analysis with:
My problem is I do not know enough about these options to make an informed choice, especially as I am probably going to incorporate whatever approach I choose into a VBA program. I have not written VBA for something like 15-16 years, so I want to minimize the code I have to write. I suspect the less manual input the approach I choose requires, the less VBA coding I will be writing.
My impression is:
So, if anyone could suggest how I might solve what seems to be a very simple problem (filter a matrix by a text value in any column and output the filtered rows/column to another range in the worksheet), I would very beholden to you.
My thanks in advance.
I need help determining the best way to extract rows from an Excel r x c matrix with all columns extracted and the rows extracted according to a text value that might appear in any, but only one of the array columns.
Let me explain.
- The matrix: no more than twenty columns that over time will expand to rows numbering in the triple digits. I assume this matrix should be defined and named as an Excel Table
- All values are text
- Each matrix cell has only one text value.
- Some of the cells in the matrix are blank, but I can I fill these with null characters (e.g. “”) if needed.
- The text values in each cell are unique to that row i.e. rows contain no duplicates
- However, columns will contain duplicates, but these duplicates will not necessarily be in the same column down rows. For example, the value ‘John’ may appear in r2c4, r4c6, r8c4 and r10c12. However, ‘John’ will never appear more than once in the same row.
- These duplicates do not need removing from the matrix
What I want to do is very simple: extract from this matrix all rows and all the columns where any column for any row has a specified text value.
The result should be a reduced version of the original r x c matrix. This reduced matrix will have fewer rows, but the same number of columns as the original matrix
- For example: My matrix is 20 rows by 12 columns and the text value ‘John’ appears in cells r2c4, r4c6, r8c4 and r10c12. If my filter is 'John', I want to extract rows 2, 4, 8 and 10 and all twelve (12) columns. The point of my analysis is to determine what other text values 'John' is associated with.
- The constraints are as follows:
- The number of columns and rows in the matrix matrix will expand over time (the matrix is tags extracted from Zotero, a citation/note taking software program).
- There is only one tag in each matrix cell
- Some matrix cells will not have a tag, but can have a filler value (“” or N/A) as needed.
- The row selection value/filter (‘John’ in the above example in 1.) will be different every time I query the matrix. I want to have this filter in a named cell and be able to enter it automatically. I assume the best way to do this would be i) a macro ii) VBA code. Ultimately, I hope to automate the input, formatting, and outputting of my Zotero tags in Excel with macros/VBA code.
- I want to automate this matrix querying as much as possible.
While I have been using Excel for nearly forty years, until now I have not needed to analyze tables or databases. Microsoft Access provided what I needed for simple database manipulation and I used SPSS for statistical analysis.
Based on what I have Googled to date, I could do my analysis with:
- VLOOKUP
- INDEX and MATCH
- Pivot tables
- Power Query
My problem is I do not know enough about these options to make an informed choice, especially as I am probably going to incorporate whatever approach I choose into a VBA program. I have not written VBA for something like 15-16 years, so I want to minimize the code I have to write. I suspect the less manual input the approach I choose requires, the less VBA coding I will be writing.
My impression is:
- LOOKUP, INDEX and MATCH would require quite a lot of manual input with quite complex formulas. Depending on the help fellow forum members could give me, I might need to edit these formulas every time I wanted to query the matrix with a new filter
- I have tried to query my matrix with a pivot table, but I just cannot see how I could bend a pivot table to my purpose.
- As for Power Query, as far as I can tell, this Excel feature can only modify a table by reducing the number of rows it contains. I cannot see any way of outputting any of the matrix/table rows to elsewhere in the worksheet. In contrast to this, I want to keep querying the same matrix
So, if anyone could suggest how I might solve what seems to be a very simple problem (filter a matrix by a text value in any column and output the filtered rows/column to another range in the worksheet), I would very beholden to you.
My thanks in advance.