BiochemBoi95
New Member
- Joined
- Jan 22, 2020
- Messages
- 20
- Office Version
- 365
- 2016
- Platform
- Windows
- Web
So I have DNA sequences that excel imports in this format:
Each sequence varies in length so it may not occupy the same number of rows. I want each of these blocks of text to be combined into once cell under the sequence labels you see at the top of each block instead of spread out across multiple rows as it currently is. I just showed you these two blocks but my document has many more in the same format. Then I have code for searching these long strings for a specific substring but I want that to be incorporated into the same VBA after the text has been formatted into one cell.
Here is the code that I have for the search part:
=IFERROR(MID($A$1,SMALL(IF(MID($A$1,ROW(INDIRECT("1:"&LEN($A$1)-24)),6)="GTACAA",IF(MID($A$1,ROW(INDIRECT("25:"&LEN($A$1))),6)="GGGAGG",ROW(INDIRECT("7:"&LEN($A$1)-18)))),ROWS($A$3:$A3)),18),"")
What the code is supposed to do is search for subsrings "GTACAA" and "GGGAGG" and returns the letters between these two substrings but only if the two substrings are 18 characters apart. Potentially, I would also like to exclude any returned strings if there is an N present. Ideally I'd like the returned strings for each block of text to be put off the right somewhere in a single column.
I know that was a lot so any help is greatly appreciated.
>PRS1_SeqF2 |
NNNNNNNNNNGGTTNNNNGTCNCGNCGTTGTANACGACGGCCAGCATGGAGTACAAGGGA |
GGTACTTCCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACA |
CAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACT |
CACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCT |
GCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGC |
TTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCA |
CTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTG |
AGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCA |
TAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAA |
CCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCC |
TGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGC |
GCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCT |
GGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCG |
TCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAG |
GATTAGCAGAGCGAGGTATGTAGGNCNGTGCTACAGAGTTCTTGAAGTGGTGGNCCTAAC |
TACNGCTACACTAGAAAGAANNGTATTTGGNATCTGCNCTCTGCTGAANNCANNNACNNN |
CGGAAAAAANAGTTGGGTAGCTNNTNNANTCCCGGGNNAANNAAACCACCNNGNTNNNNN |
NGNNNNNNNNNNNNNNNNTTNGCNANNCCANNNNNNNNNNNNNNCTNGNNNNNNNNNGGN |
NTNNNNANNNNANNNN |
>PRS2_SeqF2 |
NNNNNNNNNNNNTNNNNTCACGNCGTTGTAAACGACGGCCAGCATGGAGTACAAGGGAGG |
TACTTCCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACA |
ACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCA |
CATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGC |
ATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTT |
CCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACT |
CAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAG |
CAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATA |
GGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACC |
CGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTG |
TTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGC |
TTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGG |
GCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTC |
TTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGA |
TTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAANTACG |
GCTACACTAGAAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCCAGTTACCTTTCG |
GAAAAAAGAGTTNGGNAGCTCTTGNTNCCGGGCAAANNNNCNACCGCTNGGTANNNGNNG |
NNNTTTTTTTTNNTTGCNNNCNNGCANNNTNACGCCNNANNAAANNNNNNTTNNNANNNN |
Each sequence varies in length so it may not occupy the same number of rows. I want each of these blocks of text to be combined into once cell under the sequence labels you see at the top of each block instead of spread out across multiple rows as it currently is. I just showed you these two blocks but my document has many more in the same format. Then I have code for searching these long strings for a specific substring but I want that to be incorporated into the same VBA after the text has been formatted into one cell.
Here is the code that I have for the search part:
=IFERROR(MID($A$1,SMALL(IF(MID($A$1,ROW(INDIRECT("1:"&LEN($A$1)-24)),6)="GTACAA",IF(MID($A$1,ROW(INDIRECT("25:"&LEN($A$1))),6)="GGGAGG",ROW(INDIRECT("7:"&LEN($A$1)-18)))),ROWS($A$3:$A3)),18),"")
What the code is supposed to do is search for subsrings "GTACAA" and "GGGAGG" and returns the letters between these two substrings but only if the two substrings are 18 characters apart. Potentially, I would also like to exclude any returned strings if there is an N present. Ideally I'd like the returned strings for each block of text to be put off the right somewhere in a single column.
I know that was a lot so any help is greatly appreciated.