
Helitron Art - from Network Graphs of Wikipedia pages, to DNA sequences from DNA transposons, to Alg


Synopsis_

The work presented here explored the aesthetic possibilities of using Helitron Transposon DNA sequences from the maize genome as raw material for the creation of algorithmic art. Complex network graphs constructed from Wikipedia pages were re-mixed based on the post-alignment DNA sequence of a particular Helitron element belonging to the Cornucopious family. The work extends the artist's general efforts to integrate art with science, and in particular strengthens his recently developed approach of integrating genomics with algorithmic art, a novel avenue of work termed 'Geometric And Genomic AbstractionISM'.

Helitron Transposons explained: what are they?_

Eukaryotic genomes harbor transposable elements (TEs) that are capable of intragenomic multiplication by a mechanism that transfers a DNA segment from one genomic location to another. TEs can be divided into retrotransposons, which multiply via reverse transcription, and DNA transposons, which are transposed without the need for RNA intermediary molecules. DNA transposons proliferate through single- or double-stranded DNA intermediary molecules. In eukaryotes, DNA transposons can mainly be divided into three classes: (1) those that are excised as double-stranded DNA and reinserted at a different location in the genome ('cut-and-paste' transposons); (2) those that transpose via rolling-circle replication, such as Helitrons; and (3) Polintons/Mavericks, which are believed to replicate using a self-encoded DNA polymerase.

Helitrons are particularly interesting because they were discovered only in 2001, by computational means, by Kapitonov and Jurka. They lack the structural hallmarks of other DNA transposons, such as Terminal Inverted Repeats (TIRs) and Target Site Duplications (TSDs). Helitrons instead harbor conserved TC and CTAG sequence contexts at their 5' and 3' termini, respectively; palindromes (16 to 20 bp 'hairpin loops') located 10 to 15 bp upstream of the 3' terminus; and flanking A and T host nucleotides at the 5' and 3' termini, respectively. A remarkable feature of Helitrons is their capacity to capture gene sequences, which makes them a source of allelic variability and gives them evolutionary importance.
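The terminal hallmarks described above can be checked on a sequence with a few string tests. What follows is only a minimal sketch: the function name `helitron_termini` and its `has_3prime_flank` flag are hypothetical, and it assumes that the last base of the input is the flanking host T (as appears to be the case for the 3' ends listed later in this post, which finish in CTAGT). The toy input simply joins the 5' and 3' fragments reported for AC186626.4-Contig45; it is not a real element.

```python
def helitron_termini(sequence, has_3prime_flank=True):
    """Report whether a candidate sequence shows the terminal hallmarks.

    Assumption: when has_3prime_flank is True, the last base of `sequence`
    is the flanking host T and the element itself ends one base earlier.
    """
    seq = sequence.upper()
    element = seq[:-1] if has_3prime_flank else seq
    return {
        "starts_with_TC": element.startswith("TC"),
        "ends_with_CTAG": element.endswith("CTAG"),
        "flanking_3prime_T": seq.endswith("T") if has_3prime_flank else None,
    }


# 5' and 3' fragments as listed for AC186626.4-Contig45 in this post,
# joined here only to make a short toy input.
five_prime = "TCTCTACTACTACATAAGAA"
three_prime = "AAGTTGTCGTTGCAACGCACGGGCACTCACCTAGT"
report = helitron_termini(five_prime + three_prime)
print(report)
```

The check is deliberately naive: it says nothing about the hairpin or the 5' flanking A, which dedicated tools such as HelitronFinder and HelitronScanner take into account.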

Helitron transposons were proposed to move via a rolling-circle replication (RCR) because autonomous Helitrons encode RepHel protein domains related to the prokaryotic Rep protein involved in RCR. However, most Helitrons found in plant genomes are non-autonomous and thus encode non-functional RepHel proteins. Furthermore, there are Helitrons that are agenic (they don't contain captured genes) such as the elements that belong to the Cornucopious family recently discovered in the maize genome.

How did this collaboration come to be?_

The Cornucopious family of Helitron-related sequences was identified by Dr. Chunguang Du (collaborator in this project) and colleagues, who developed and used Helitron discovery algorithms such as HelitronFinder and HelitronScanner. I had the opportunity to meet Dr. Du while I was a graduate student at the Waksman Institute at Rutgers University [2006-2013]. Our labs used to have joint laboratory meetings each week, and that is when I learned about Dr. Du's research project on Helitron discovery and annotation in the maize genome. Six years later, we decided to join efforts in exploring the use of genomic/genetic knowledge pertaining to Helitrons in maize as raw material for computational art.

I've previously explored the use of maize genome concepts to create computational art: the transformation of glitch art by subjecting the image to processes analogous to those shaping the evolution of the maize genome; the creation of visual art and experimental sound addressing the reduction in genome diversity at the ba-1 gene in maize owing to plant domestication; and the auditory rendering of CG and CCG sequence context variation at the VERNALIZATION 1 gene in maize. All these projects were created as a means to develop the aesthetic, discursive and material components of a new artistic discipline termed by the author 'Geometric And Genomic AbstractionISM' (GAGAISMO). As a discipline, it encompasses and reflects upon the practice of using genome data as raw material for art, either directly by co-opting scientific principles and tools, or indirectly as an inspirational source. Just as bioinformatics includes a set of computational tools used to access and manipulate genome data for research purposes, GAGAISMO includes a set of computational expressive tools used to access and manipulate genome data for artistic purposes. It is geometric abstractionism guided by genomics and enabled by computers.

In the work presented here, I explored the possibility of using maize Helitrons as a departure point for the creation of visual works using algorithms. I was interested in utilizing network graphs and their re-mixing as primary elements for visual impact and aesthetic investigation.

Construction of network graphs from Wikipedia pages related to Helitrons_

I recently explored the use of NetworkX, a Python library for complex network construction, in conjunction with Gephi for network visualization and analysis, to visualize the application of Natural Language Processing algorithms to Facebook messages exchanged by the author with his friends during a two-year period. I then decided to keep exploring the construction and visualization of networks and to use them as raw material for visual art creation.

Node and edge data from Wikipedia were automatically collected with the use of the Wikipedia Python module. Four concepts related to Helitrons were each used as a 'seed page' from which to build a network. Wikipedia pages were treated as nodes and the links between the pages were treated as network edges. Snowball sampling was conducted to discover all nodes and edges of interest. The Wikipedia entries used as seed pages were the following: 'Transposable Element', 'Helitron', 'Vladimir Kapitonov' and 'Jerzy Jurka' (Figures 1 to 4).

Processing involved the seed node itself and its immediate neighbors (layers 0 and 1), and as a result a directed NetworkX graph was created. It is a directed graph because the edges representing HTML links are inherently directed: a link from page A to page B does not necessarily imply a reciprocal link (from page B to page A).
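The snowball sampling over layers 0 and 1 can be sketched as follows. This is an illustration rather than the script used for this work: the function name `snowball`, the `get_links` callable and the toy link table are all hypothetical. With the Wikipedia module, `get_links` could plausibly be something like `lambda title: wikipedia.page(title).links`, and the resulting edge set could be loaded into NetworkX with `networkx.DiGraph(edges)`; neither step is run here, so the example stays self-contained.

```python
def snowball(seed, get_links, max_layer=1):
    """Collect directed edges (source, target) starting from a seed page.

    Pages in layers 0..max_layer have their outgoing links followed, so
    max_layer=1 processes the seed and its immediate neighbors.
    `get_links` is any callable mapping a page title to its linked titles.
    """
    edges = set()
    processed = set()
    frontier = [seed]
    for _layer in range(max_layer + 1):
        next_frontier = []
        for page in frontier:
            if page in processed:
                continue
            processed.add(page)
            for target in get_links(page):
                edges.add((page, target))  # direction: page links to target
                if target not in processed:
                    next_frontier.append(target)
        frontier = next_frontier
    return edges


# Toy link table standing in for Wikipedia (hypothetical data).
toy_links = {
    "Helitron": ["Transposable element", "Maize"],
    "Transposable element": ["Helitron", "Genome"],
    "Maize": ["Genome"],
    "Genome": ["DNA"],
}
edges = snowball("Helitron", lambda title: toy_links.get(title, []))
print(sorted(edges))
```

In the toy run, 'Genome' is discovered as a layer 2 page, so it appears as a node but its own links are not followed.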

As a network measure, the author focused his attention on node indegree (the number of edges directed into a node), with the indegree of a node equaling the number of HTML links pointing towards the respective page. If a Wikipedia page has many links pointing to it, its content is likely to be of wide interest. Nodes with only one connection were intentionally removed from the graph to facilitate the visualization and make the graph more compact. The output of NetworkX was imported into Gephi via GraphML files and visualized accordingly (Figures 1 to 4).
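Indegree counting and the removal of weakly connected nodes can be illustrated on a plain edge list. This is a sketch under stated assumptions: 'only one connection' is read here as a total degree (incoming plus outgoing) of one, removed in a single pass, which the text does not spell out. The function names and the toy edges are hypothetical. In NetworkX the same quantities are available through `DiGraph.in_degree` and `DiGraph.degree`.

```python
from collections import Counter


def indegrees(edges):
    """Count, for every node, the number of edges pointing into it."""
    counts = Counter(target for _source, target in edges)
    for source, _target in edges:
        counts.setdefault(source, 0)  # nodes that only link out get 0
    return counts


def drop_single_connection_nodes(edges):
    """Remove nodes whose total degree (in + out) is exactly one.

    Assumption: 'one connection' means total degree 1, applied in one pass.
    """
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    keep = {node for node, d in degree.items() if d >= 2}
    return {(s, t) for s, t in edges if s in keep and t in keep}


# Toy edges (hypothetical data).
edges = {
    ("Helitron", "Genome"), ("Maize", "Genome"),
    ("Helitron", "Maize"), ("Genome", "DNA"),
}
pruned = drop_single_connection_nodes(edges)
top = indegrees(pruned).most_common(1)
print(top)
```

Here 'DNA' has a single connection and is dropped, leaving 'Genome' as the node with the highest indegree.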

Figure 1. Network visualization of Wikipedia pages relating to 'Transposable Element'. Node and label font sizes represent the indegrees. Color differences represent community structure of related pages within the network. The network graph is composed of 6,512 nodes and 26,493 edges (4.1 edges per node on average). The top 25 most connected nodes in the graph are as follows: 93 Genome - 90 Dna - 83 Genetics - 73 Gene - 71 Transposable Element - 66 Eukaryote - 60 Protein - 57 Mutation - 56 Chromosome - 54 Retrotransposon - 51 Virus - 45 Transposon - 44 Bacteria - 42 Rna - 41 P-Element - 41 Horizontal Gene Transfer - 40 Repeated Sequence (Dna) - 38 Dna Replication - 38 Prokaryote - 38 Helitron (Biology) - 38 Cell Nucleus - 38 Citeseerx - 37 Alu Element - 37 Microsatellite (Genetics) - 37 Long Terminal Repeat.

Figure 2. Network visualization of Wikipedia pages relating to 'Helitron'. Node and label font sizes represent the indegrees. Color differences represent community structure of related pages within the network. The network graph is composed of 1,528 nodes and 4,145 edges (2.7 edges per node on average). The top 25 most connected nodes in the graph are as follows: 20 Dna - 18 Gene - 17 Genome - 16 Eukaryote - 15 Protein - 14 Rna - 13 Genetics - 13 Chromosome - 12 Molecular Biology - 11 Citeseerx - 11 Virus - 11 Intron - 10 Base-Pair - 10 Nucleotide - 10 Saccharomyces Cerevisiae - 10 Evolution - 10 Enzyme - 10 Bacteria - 9 Phenotype - 9 Mutation - 9 Integrated Authority File - 9 Prokaryote - 9 Cell (Biology) - 9 Amino-Acid - 9 Organism.

Figure 3. Network visualization of Wikipedia pages relating to 'Vladimir Kapitonov'. Node and label font sizes represent the indegrees. Color differences represent community structure of related pages within the network. The network graph is composed of 1,391 nodes and 3,936 edges (2.8 edges per node on average). The top 25 most connected nodes in the graph are as follows: 18 Taxonomy (Biology) - 15 Animal - 14 Genome - 13 Wikidata - 12 Encyclopedia Of Life - 12 Integrated Authority File - 12 Wikispecies - 11 Protein - 11 National Center For Biotechnology Information - 11 Integrated Taxonomic Information System - 11 Inaturalist - 11 Eukaryote - 11 Genetics - 11 Global Biodiversity Information Facility - 10 Evolution - 10 Enzyme - 10 Chromosome - 10 World Register Of Marine Species - 10 Citeseerx - 9 Phylogenetic - 9 Model Organism - 9 Gene - 9 Virus - 9 Species - 9 Eppo Code.

Figure 4. Network visualization of Wikipedia pages relating to 'Jerzy Jurka'. Node and label font sizes represent the indegrees. Color differences represent community structure of related pages within the network. The network graph is composed of 1,020 nodes and 2,840 edges (2.8 edges per node on average). The most connected nodes in the graph are as follows: 13 Integrated Authority File - 13 Evolution - 12 Genome - 12 Genetics - 10 Transposable Element - 9 Biology - 9 Molecular Biology - 9 Population Genetics - 9 Phylogenetic - 9 Eukaryote - 9 Dna - 9 Bibliothèque Nationale De France - 8 Speciation - 8 Virtual International Authority File - 8 Worldcat Identities - 8 Oclc - 7 Gene - 7 Computational Biology - 7 Chromosome - 7 Evolutionary Biology - 7 Système Universitaire De Documentation - 7 Phenotype - 7 Wayback Machine - 7 Helitron (Biology).
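The 'edges per node' averages quoted in the captions of Figures 1 to 4 are simply the number of edges divided by the number of nodes. A short check, using only the node and edge counts reported in those captions:

```python
# (nodes, edges) as reported in the captions of Figures 1 to 4.
graphs = {
    "Transposable Element": (6512, 26493),
    "Helitron": (1528, 4145),
    "Vladimir Kapitonov": (1391, 3936),
    "Jerzy Jurka": (1020, 2840),
}

# Average number of edges per node, rounded to one decimal place.
ratios = {name: round(e / n, 1) for name, (n, e) in graphs.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio} edges per node")
```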

From Figures 1 to 4 it can be seen that the topologies of the networks differ remarkably from each other. The most complex network graph was the one related to the 'Transposable Element' Wikipedia page, with more than 6 thousand nodes. Interestingly, the Wikipedia pages of Kapitonov and Jurka (the scientists who discovered Helitrons) are quite different from each other, with Jurka's page being more detailed and informative and giving a more interesting network graph as a result. The 'Evolution' Wikipedia page appeared among the top 25 most prominent nodes of three network graphs (Helitron, Kapitonov and Jurka), suggesting the importance of this term when referencing Helitrons. Topology and aesthetic differences among the network graphs were used to compose a visual artwork relevant to Helitron Transposons in maize and to one of the scientists who contributed to their discovery and further characterization, Dr. Charles Du from Montclair State University.

Re-mixing of network graphs based on DNA multiple sequence alignment of Cornucopious family of Helitron sequences_

I previously developed a methodology to create algorithmic visual art based on multiple sequence alignment of DNA and protein data. This approach was used to re-mix the network graphs shown in Figures 1 to 4 according to the post-alignment nucleotide sequence of three Cornucopious Helitron elements (Figure 5). The DNA sequences used were kindly provided by Dr. Du and are presented below:

Accession Number for Related Data: >AC186626.4-Contig45, Gi Number:

Genome sequence: reverse complemented Genome Size: 74569

TCTCTACTACTACATAAG, 17016, 17033

GTTGT CG TTGC AA CGCA CG GGCACTCAC CTAGT

GTTGTCGTTGCAACGCACGGGCACTCACCTAGT

Found at 64560, 64592

Hairpin to End: 9978, 10010

GTTGTCGTTGCAACGCACGGGCACTCACCTAGT

CACGGGCACGCAACGTTGCTGTTG

110001110000000011100011

Helitron Sequence Location: 9978, 17032

TCTCTACTACTACATAAGAAGCTAATGTAGACGTTCACAAAAGCTTTTGGTGCACGGTTCTGCCGAGGACCTCCGCCATCAACGCTCGCGATCGGACCAATACGCGATATCACAGCACCGCCCGTTGCCGAGGCCGCCATCAACGCCATCCTCCCGCAATCCCCGCCCTCGCCACCTCCATTTCCTCAGAGGAAACGCCTTGCACAGCACGACTCCTCCCCATCTACCGCTACCTTTCTACCTACCCATCTAGCACCGCTCGCTCGCAAATCGCAATGGCGCTGGCCACCAACTCCGCCGCAGCCGCCGCGGCCGTGTCCGGCGTGGAGGCGACGATCCATTCCGCCCCCCGTAGGATTTTGCTCCTCCCGATCCAGACGTCAGATCTCGCATGGGCGGCACGCGGATCTGGTGGTTTCTCTAGAGGCGGCGGCGGCGGCCGGAACAGGGGCAGATGGCGCTCTCGGGGATGCGGGGGCTCTCTGTCTTCATCAGCGACCCTCCTCCGACGCTCCTCCGACGCTCGCCCCGCTGTGGGACCTCCGCTGGCGAGCCCCTCACCGCCGAGGGACGTCCGCTGGCGAGCAGACCAGGTCCATCATTCCCTATGCTCTTCTCCTTCACCATGGGTTAGGCAAGTTTCCTACCTCCTGGTATACGTTATAACCCTACCTCCGTATACAGTGCACCTGATACTGTATGGTTGCCAAAGGCCCGCAGCGGAAGCCTCCATGTCTCATGTATGCGCGCCGCCAGCCCCTGCCGATTGGAGACCACGCTGATTGGAGCCGCCGCCGCCGATCCGTGGCCGCCGACGCCGATCTGTGGTCGTCAACTCGTCCGTCGCCACGAGGTCGGTGGTCGTGCTCTGTTGTCACTCGTCAGCCCATCCACCATTGCTAGGTTTGTGGCCATGATCTGCCGTCTGCCCTGCCCTTCTTTCTTCTCAAAGTCTCAAACTCCTCCCGTTCTTCTCTGTTCTTCTCAGATCTCATGGCCCGCGGCAAGTTCTGTACTTCTGTTGGCGCCTGACTCTGACTGCTAATTGTTCCATTCTCTGTTCTTATTAAATCTATAGTAACCACCTTATTTTTCTTCTTGTTTATTTCTTTTCAGGGAAACCATTAAAGGGGTTTAACTTTCAATAGTATATAAATATTAATGGGTTGCCGGTATGATTTTGTTATCCCTTTTTACCTCCAGCTTGGGGCATGAAGCTACTTACATGACATATATTAAATATAAAAATTTGTACAACAATTTCATGCTAGGACAAGCGCAGCTCCTCCAAACTTGGATGAACAACAACGTAATCAGGTATCTTGTCTTCTCTTGTTTATCTGATATTCTGATGTATGTTATTCTTATTTGTTTGTTCAGTGCTTGTTTATTTGACTGCTAGAACTAGCAAAAATATTCTTCCAATGACACTATCCGTCCACAATATGCCTGGAAAAAGAGAGTACTTTGCTAGAAAAATCACCATAATGCAAGTGAATCATCAAACGAAACAACAGATTTTCGCAGAGCTATGTTAATACGGATTAAACCTGTTGTGCCTATGTTAATACCGTTCTAATTGTAAGACACTTGTCATTTTTCCCTACACCACGGTTGCAATGGAAATCGATGCGCTAATTTAGCCAAAGAGACAACTGTAGCATTGTGCTTACTACTCAAAAATAGAGACTGTACTATATATCTACAGTGCTGCACAAATCCTGACGAGCAATTTGCTGCTGCAAATTTACGTCATAATGCTCTTTAGATTAAGCAATTCTGCCCAGTTCAGGAGTTCCTATGATGCGGACATGAATGCATTCAGAGACCCTTTACAGCGAAAACACCTTATTGAAGTATCTAGCAGTACAAACTACAATCGATCATCCAGGATAGCACATTTTATTTTTTCCACATGATATCTATGTGTCATAGAGATCGTTAATCTCAGAGTCTAGACTGAACACAAGTGTTACTCGGGTCAAGCTAGATGGCATATC
CTGCCACTATTTAGCGTCTCTGCATCTAAGCCATCAAAGGTCACGAATCTCACTATCCCCGGAGATCATCCAATCCAAAAACACTTACCGTAAACTTTTGCTACCTTCAGTTCAGTAGTAATGTTGTCGTGTGTCTATACTGTACCGTTAGACACTTAGATAATACCTAAAGCTTGTGTGATCATCTGTCGTCTTACAATTCTTTTTTAAGTGGGCAGCTACTGTGCTGTGCCGCTCCCCAAGATCTGTTTGCGCTGTTCGTGCAAAGCCTGCCTACACCATTTTAGTCCCGCTGGTTTGTTTCTTTTACGAATCGTGCATGCAGTTCTGTATTTACGTGATGGTTCATGGCAAACGCTGCGTCATCATAATATAGCTACAGTACTTTGTAAGCTGTTCTTCAACTGTTCTATATTATGATTGTAAGATGCAAAATATGTGCACGCTTCATTGGTTTCGCGATTTTAAAATGTTGTTTTTTCCAGTGTGAGCGTCGTGACCAAGGACCGGCAGAATGGCAGGTGCGCAGTGCGCTCCATCTATGAGCAGCAACAGCATCAATAAGGGATGCGATTGGTATGTGGGCACAGGAGGGCGCACCATGAGGCATGAGCTGCCAGCAGCCTGATGATGCGTCGCGGCCAGTCCTCCGATTCCGTGTTCTTTGTTCCTGGTGGATGACTTCCCGAGTGGTGAGGTTGCCTGGGCCCTGGGCCAAGGGAAAATATTAACGCTGCGTTCGGTTTGAATGTTCCCTGAACTAGGTAATATAGTGATGCATGGTATGCTCCACAAATAGTCTACTAATCAAACACACTAAGCAAAGAAAAGCCATGGTAAACTTTTGCTACCTTCAGTTCAGTAGTAATGTTGTCGTGTGTCTATACTGTACCGTTAGACACTTAGATAATACCTAAAGCTTATGTGATCATCTGTCGTCTTACAATTAGGAACTAAAATATTTCAACATTTTGAGTGCTGCAAACAAAGAGTTAATTCTCATTGAACTCAAGAACAACCTGGGAACATATGAGTCCTTTTAAGATGTTGTTATCTGCTCATACCATGCAGTACTTTTCCTCAACAAGTGGTGTTGTTTTAAATACTAGAATAGAACTACATCCATGAAAACTATTACTGCTCATATCCACGACAAAGCGCCAGTGAACGTGTGTGCGACGAGGAGGGACCGCCTGCTGTTCTGAGTCCAACCAACTAGACCTACAAGTGAAAGGTATGCTATATATAATATGGTGCTATTTTGTCGTGATCCAGATTGTTATTGTAGGCCATACACAAAACGGGGACCGATAAAATAAAACCACAATTACTTCTGATTCTGAATTGGAAGCATTACATAGCTTCGAAGGTATACACTTCCCTTTTGATTCTTTGGGCAGTTCAACCACATTAGCTACTCCATCCACACTAGCTACTCCATCCTCAGATTCTAGAACTGCTCATCATTATCTTCATCCAGGTCAACACCTTTGTCATCTCTAATTGTTTTAGCGTCTTGATGCCATTGTACTCGTGGACAGGGGCTAGCTGTCTTTGTGTCATTGTACTTGTGGACATGGGCTAACTCATGTAAGTTTCAGACTCTCAGGTATTCTCTATTGCAACATTTTCAGGCTAGAGCACAGTGTTGTTACCTAATCCTTTGTTTTATTTTTTTACCTTCAAGCTCTGGAATTACATTCTCCTTAAGCATAAAAGGAATGCCACTATAATTCTGGAAACATTAGCACATAATCCTATCTTTTAAAGGGTTATACCTGGTTCTTCCTGTGCAATCTTACCAATTTTTGCATGCAATCTTCCTGGTTATACATGATAATTATGAAGAGGGACGTATGGTAGCGGGTTCCAGAGGCAGCAGACCGCTTGGTGCCTAGCTGGGGATGCAGGCGGACGTGGTGGGAGCGAGCAGCGCCGTGGTTGGTAGGACTGGCGATAGATGACAGGGAAGAACTAGCATGTTTTTTTTATATATGTCC
ATATCAATATTTTATACTAATTTTTAACTCTCACGGCAACGCAAGTATACAATGCTGCTCATCTAGGCGCCACTCACCATGCTTTGCCCTTATACTATGACTTATTAGACTGATTTGTTATTCCTGCACACAAATCTCAGGAAACCAAATGATTGAAAGACTTCAACATATATTTGCTTCATCAAGTCACACATATGGTGGTTTGACTCTAGATTTTCATTGTAATCGACTGGGTCCAACTACTTTATTTCAAGTTTTTACAATCCTAGAAACTTATCTATCAAAGTGAAATAATATGATACTCCTATCATCTCATCGGGATTTTGGCTTAAATTCTCCATTCAACTTCTGCACATTAGTAACATGCATCTTGGATGTTATAGTAATGAGAGGAGGTAGATGATATAATTTGCCTGTTCCTGTTCCAGTATAAATTCAATTCTGGTTTCTTTGGATGCTCATTTCCAGTTCAATCCTGCCCTTAGCATGGGGATGTGGTGAGGTGGTACATCCAGTCCGCGGAGCCCAGGATGAGGTGGAGCGCCGACCTCCACTGCAGCTCCGTGCAGGCCATAGACTGCCTCGGTGGCCAACACAGCACGTTCTCCTTCCTCTGCCTCTTCCCTTCATCATAAAGCCTCCCATGGTGTAATCTTGCCCTCGGTTTTTGTATGCTCATGTAGTAACATTGCCATTGAGTGATTGAGTAGAATTATTCAGTGATTTGTTCCCTATCGGTTCTAATATTGATTCCATTCTTTCTTTCGAAGAAACTATTGGTTGTGTTGTGTGCTTGTGTGTATTTCTCCTACTTCCAATCACGTTTTCAACAGCCTAGCTAACTTTTGTTTTGGCGGTGTAATAAATCGCAGAGGCTACACCAAAGCTCATTCTTCAGTTCATGGGCCATGGGCGCCAGGGGGCTCACCATATCTCATGTCAAGAGTCATCTCCAGGTTAGTTGTTTCACTTTTTCATCCATTGCCTGGCATGCACCTGTAGCTTCTCTGCCACTCATGTCTTCACAAAATCTTGTAAGCATGGCTGCCAATGTTATCTTAGCCTTTTTCTAAAAGAATTATTTTGCTTACGATTGTTATGTTTGTCCACTGTTGTCGTAACCAGTGTAGGTCCTCTAGATTTTAGCAAACCTTGAATCATTCTGCAATTCTTTTTATGGTGAGTTAACAAAAAATAATTGTTATACTGTGACGATAAAGGCGAAACAACTTTTTAGGTTCTTTTTTGTAGAATATAACAAGCTGATTTAATCTTGCAAGTTGTTCAATGTAGACCTCATACTTATCAGTTTCTTTCTTGTTGTGGGATATCTCAGATTTTATCTCAATATTGACTTTAAGTATCGTAAGCCTTTGTCAGTTTGTGGTTGTTCATTGTTGTGTCAATCTAAGAATATTGATTGCAAATTCTGTTGTTTGGACAATGAAATTTACAATATTTAGAATATATGTTAGACACTGAATACTGTAATGTTGTCTTTGAGTTGGTGGACATTGATAAAAAGGAGATGGAGATGATCAAGGACCTGCCTGAAGAACTCAAGCAGGACATCAAGCGCTACCTCTGCCTCGAGCTGGTTAAGTAGGTACAATCATTAAAGTCACTTGGGATCTTGCCTAACTTTTTTACATATGAATGTGCGGTGAGATTGTTACAATATTTTCCATTGCCAGGTCTCGCTGTTTCATGGCATGGACGACCTGATCCTGGACAACATTCATTATAGTTCTGAATGAGATATGGATTGTTCTGTTTAGGTTATGATCCTGCTATGGGACTGCTTCAGTTATGACCCAAATTCTAAGTTGTGACTATCATTGTGTGCATTTGTTTTCTTTTACTTTAGAAAAGTAGCTCATGTTCTATGTCAGCTTCTCTCAAAATCCGAAACACGTTTTTGAGGTTGGCGTAGTTGTAGTTTGGCGTTCTGATGTTTATCAATTTTGTTTTATATTTTTGCATCCAAAATTCACTTGTCT
TTCTTCATAGTCTGTTTCAAAGATGCCATGAATGTTTTAGACAATTGGTTTACAGTTGACAGACTTCTTCACTTAGCTTGTGATCTACAACTGATGTACAAGCTGACATATGTAGTTCATTTGAGTGGAACTGCAGCCGCATCAAGATTTCTTTCATTACTTCTACAACCCATTCTAAAGAGATTTTTGTTATATAGGATTCGACCGAGGATAAGCATTGGTCATATCCATGATTTTGACCACTAATATTGTTGCTTATAGTTGACAGAAGCTCTCACTTCTTCACTTCTAGATGGAGTCCTTTTGATACTTGGGGTGTTCGGTAAGTTGGTGTTCAATTGGAGTATGTTGCATAATTTCATAGCTTTCTTAATATGCCAATTCTGTTGATGGAGCTTTACCGACATGCCAATTTATAAATGGAGCATGTTGCAGTTAGATGTAGTCTATTTCAGAATTGAGTGTACAAATAGATTTTACTTATAATGTGTTGCATAATAGGACTAAGCTTTAGGGGAGTGCTTTTGTACCAATGGTAGTAATTGGTTAGTATCTTATGATCTTCATGAGAAATATGACATCGTTATATGACTGTATTTGGTAGCACCTTATGCAATTTTTTTGAATTGGCCAAGTAGTGTGGTTTCGTGCCCATAATAGAATAGTGACACTTAGTTGATCTTTTGTTATTCTTTTTCAGATGTGAAGACCAAGTAGGAGACAACTCATGGGCATAAGCATATTTTCTAGAAGAGGAGAGTAGCACTTGATGACTTTGATAGGTTCAAAGTCATGCTATCAAATGTTTAGTTCGTGTTGGCACTGTTTCTATTGCTCGCTCACACTTTTTTCTTTATGTAAACAGAGGGTTGGTGCTATCAGGTAAGAGCTCGCCAAGTTGAAGAAGGCATCCATGGCTTAATCGAGATTTATTGTTTGTATATATCTTATCATAACATTTTTACTTCGTAGCAACACATGAACATTCACCTATTTGTATATAAGTTATCATGATATTTATAAGTTGTCGTTGCAACGCACGGGCACTCACCTAGT

5' end

TCTCTACTACTACATAAGAA

AT content is: 0.7

CG content is: 0.3

3' end

AAGTTGTCGTTGCAACGCACGGGCACTCACCTAGT

AT content is: 0.457142857142857

CG content is: 0.542857142857143

Accession Number for Related Data: >AC191691.3-Contig129, Gi Number:

Genome sequence: reverse complemented Genome Size: 24998

TCTATACTACTTATTAAG, 16548, 16565

TCTCTACTACTTATTAAG, 12973, 12990

GTTTC CG TTGC AA CGCA CG GGCACTGAC CTAGT

GTTTCCGTTGCAACGCACGGGCACTGACCTAGT

Found at 18511, 18543

Hairpin to End: 6456, 6488

GTTTCCGTTGCAACGCACGGGCACTGACCTAGT

CACGGGCACGCAACGTTGCCTTTG

110011110000000011110011

Helitron Sequence Location: 6456, 12989

TCTCTACTACTTATTAAGGCAACAAGGGTAGCCTACCTCCCTAGGTTCTGTCGTTCTGCCTCTCCTTGTTATGTCTATTCCGGACTCTGACTGGTGGGCCTCCCATCTCTATATCCCTGCACATCCTTGTGGCCCACCATGTCCAGGGCATTTAACAAAAAATGGGATGTGTGGAGAGAGTGACAAGACGATAGTAGCAGCGGAGCATATAAGCCGGTGGTAGCATCGTTCGATGGTCGGTCTGGTACAAATCCTCAGTTCATAAACCGTCAAGACGAAAGCCATAGGCATTCTGGACTGGGTGCTTTGGTTCGGCGCAGATTTAATGGAAAACAACAGATCAGACGTCCTGGGCTAGATGTTTTGGTGTGGCCCAGATTTCATGAGAAATAGGTGTGTCTGCTCATCATCTTTGTGTAGGCAGCCTTCGTTCTCCATGGATGCTCACGACACAATCAGAGCAACCACTCTCCCTTTCCACGACCATGTCCTCCTCTTCTCCCCCTCAATCACGAGTTCATTATCGTCCACAACCTCCCCTCCTCAAGGACAACTGACTTTGTCGACACCCCTTCTACAGGCTCCTCAATCCCCACGCCATCATTGACCCCAACTCCAAGCTCGAGTTTGATATGCCCGATATCTTGCGTGTGTACAAGATCGACTGTGTCGAGTGCTTCGACGGCACTGAGATCGTCTTGTCGTCCCCCCAATGGCAATTCCACCAACGACGTCATGTCCAAGGACTTCACTACAGGCATCTCCTCCCACCTCTACCTACCCGCGGGGGTGGATCCTGAGAAGAAGCTCCGCATCGTTGTGTTCTTCCACGATGGTGCATTCATGGTCCACAACGCCTCCTTCCCGTTGTACCACATTTACGTCGCCTTCGTCGATGCTGCTGTGCCCACCAGCCGCTGATCGTGCTCCCTCGCTCGCGATGCTCCCCCCTGACTGGCCTCCCCCTTTGTTTATTGAACGATTCTTGTATATTATTAGGTGAGGATTGGTAAATTCATTGACTGTTCTTGCCCCCCTCTCTTAGGTTCGGGTGTTGTTGTTGCGTGCACAGGTTCAATTGTCTACACTTGTACAACAATTTAGTGAGGTTGATACTAACTGTATGCTCGCTTTGCTTTCTTTTGTGTCGCAGTTGTGTACAAGATCGATTATGCTTAATATGAAGAACTAAGGAATAGAATTCAGAGATAAAGGTTGTAATTGTTTTTGCATGTTTCATCTTCTGTATATGAATTTTGCTTGTTACATTATGTGTGTAATTGGTGTCGTTTCTGATTATTGATTGTGCCTCTTACCAGTTCGATATGTTGTCTACACTGCAACTTATCCCAACAATAAGAGTGAAAATGTGCCAATTTTATAATCTTAAATAGACATGAGACCAAGGTGAGATGTTGTTTTTTAAAAACTCACCCTCACTATTTGTTTCCACATGAGCAATTTGTATTTTAGTAGCATTTTTGCATCACACTTTATTACTATGTCTTGTGTAAGGCTCTAATTGTTGTTAAGTGTTCGTGTATGCCCAGGTGATTTTTTGTTATCTTCCTGTATGTGCAGGTTTAGTGTTCAAAATGGCAGCCTTAGGCAGGCTTGGTGGTCTTCTGAGTTAGGTTTAGGGCCACTAGTAGCTTGTCTCCTTCAGTGTTCAATGCTCCTCATGTCCACCAGGCTATTCATTGGTGGTGAGTTTTTGTTGTCTTGTTTCCTCTAATATACCAATAGTCTTCATTTATGTTTATTAGTGTTCTTTGGTTAGGTCTTGACGTGAAACTAAGACAAGCATTCAGTTAGTTTGGAGAGGTTACTAAAGATTTGTATACACATTTGAATGCTATTATGATGTGTGTTAGTGGCTCATTGAATGGTATTGATCCAATTTCCTCTACTGTTGTGTAATGGTAGATTACGAACAGCATAACAACTTCGATCATTGGTCTTTTGGTTACTGATATAGTAGTGATGTACCTTTTAGACTA
GAGGAGAAGTATGTGGTGCAAAATGTCAATTCTCCAAAAAAACAGTAACAGTAATAGAGCCTTCTTTCAGAAAATCTAGACATCATCTAAAAAATATATAGCCAAATGAGAAGGCAAAATGCCAAAAAACAAGAAGGAAATGTTTTTAGTGATATTTCAGCATCCAGCATTATATACTTCTATGTTCTCTTCCATCCACAACATTATTGAATTGTTTCTGTTATATCTTACTTTGACTAGAAGTAGGATTTCCTGATATTTTTCCTTTTCCTGCAGAACAAACAAGATACTTGGTTTTTCTGTATCTCTCATCCTCATCAATCTGGCTTCAATTATGGAGCGTGCTGATGAGAATCTCCTTCCAACAGTTTATAAGCAAGTCAGTGCAGCCTTCAATGCTGGTCCTACTGATCTAGGATATTCACCTTTGTAATGAACTTTCTGAAGTCAATAGCATCTCCCTTAGTAGGTATCCTTGCTCTGCACTATGATCGACCAACGGTGCTTGCAACAGGGATTGTTTTGACTGTTAGAGTTAGCCAGTATTTTGGGCATGTTACATTCAAGAGAGCAGTAAATGGCCTTGGGCTTGCCATTGTAATACCTGGTCTTCAGTCGTTCATTGCTGGAACTGTTATTACTGGGTTTGACACATGTTAGATTTTTTTATTACATGTGGTCGAATATGCGATACAACTAAATTACTGCATGTTATTTCACTTTTATTGTTTGAAGTATGAATTGAATGGACTGTTCGTTTATCAAATGTAATATGTTAGTGTTGCCTCCGAAACTATAACATGTTTTCGCTTTTTGGGATATATTGCTTTTATTATGTATTTACACATACTGTATATCTAAGTGCATAGCAAAGACTATGTATGTATCTAGAGAATGGAAAATGTTTTTTATAATTTAAAATGAGGGAGTATTTAACAAGGACCAATATGTCCTAAATTTTAGACTTGCCTGGGTCATTATTTTGGTTAATACATGGGCCTCATGGTGCAAGATATGGAGCTCTAACTCAGCATACAACCCACTTTTTCCTTGGATGTAACTTTTGGGTGATTTGTTTGTAAATGTCGTAAGTGAATAACTAATCACAGGAGCATAAGGTTATTCTATTTTTTTGTGTTCCTTGCTATTTACTACTCATTGTACCTATGTTGTGATGAAATGTATTCTTGCTTGCAAAGTGGACCCTTGAAGACAACAGTTACTGGTGTAGAGATGTTCAAAAAATACTGGGCCATGGAGAGGTAATGATTTTGAAAGTAGTATGTGCTAGGTTTTGTATTGTTCCATATTATCTTACATGCAACTTCATATGTGCAGGCTGGTGACAATGTTGGTCTTCTTCTTCGTGGTCTTAAGCTTGGCTAGGTTACCATACTTGTCAATCCCATTCATTGGTAAGCATCTTGAGTCTTTTGGAGCTTCGTTCTTACTATGGGTCATGCAATATTCTGGAATCAATTGCTGGCTAAAAATTTGAAAGGAAAGTTGATAGTGGAGAACCATCATTGTAATTTACTGAATGAATAATGCAAGGAAACCCATCACTCAAAAATAACCATGCAGTTGACGAGCACAACAAGTGCGAGGTTAGTGGCCTCCCTTGCCTTCTGGTCTACTCATCCAGTTCGGTGCTTCTAGTCTCCTCGCCCTATGTGATAGGTCCCACGCTAGCCTCCACCTCGGCGGCATGGTCGTGGCAACAGGAGCATCGCCATAGGCAATGACACCGCGCTCCATGTGCCCCCACGGTCGTGCCTTAACAGACTACATTAAGTAACATGAGCAACTTAATAGACTACTATCCTAAATTGAATCAAAAACACCAGTGGTATAATATCTTGTCATGATTAGTTTTGTCAAATAGTCCTCAAATACGTATCTTGAAGATATTTGTCTGCTCATTGTTTTGGGTCATATATGATGTTTTTAAGTGATGTATGCACAAACTCAGTTATGTTTTCTAATTCGCTCATACTGTG
AAACTATGCTTTTATAACTGCATAAATCCTAATCCTCATATATGTAGTCTGCCGTATGGTGGCCCTGTGAATGTTGTGACAATGTTGTTGAGTATGCTGGCCCTATGACTCCTTTGACCATGCTGGTGAGCTTGATGGCCCTATGAATCCTGTGAATCTTATAGACGATGGCCAGGTTTTTTGTGAACTCACTCATTTAGTGCAGTTTGCACTGTGATGCATAGAGCCAAGGGCCTTGCTTTGCATAGACACTGAGTTTGTGATTTTATAACTGCACAAATCCTATCCTCATGTATGCAGTCTGCCCTATGATGGCCTTGTGATGCTTGGAGATGATGAACCTGTTCTGTGAACTCGCTTATTTTGTGCGGTCTGTCCTGTGATGCATGGAGTTGAGGGCCCTACTTTTTTACAGACATTAAGTTTGTGCAGTAACACATGTAATGCTTATACATGCTAGCATTGTTTGGTTCATGCACTAATTTTGTGCAGACTTCCTTGTGATGCATATAGCTGAGGGTTCTGCTTTTGTACAGGCACTGAGTTTCTGCAGTAAGACTTGTTGTGCTTATACATGTCAGCCTTCTCCAGTTTGGACAAAGAAGTTGTATGTTGGCGTCCTTGTGATGCTTATAGATAATGCACCTATTTTATTCAGGCTGATTTGGTAATTATGTCCTGTGATGCATTTGGGGAGTTCCTTTACCTGACATAAAGATGTATGTTGAGGGCTCTTTTCTTATGAACTTAGTGAGGTTATGTGTTTTTATTATTCTTTCATCAAAGAATGTACTGATTGTGTTTCTATTTGTTAACAGATGAAGAAAAAAGCTCCCAGTTTGGACAAAGCTCTTATGCTGTGAATCACTTGTGGGAGTTATGCGCAGAGAAGTACATAGGTAGTGCTACTGCACAGTGTAATTGTTATTGCTCTGAAATGATGTAACTATTATAGGGAACTGATGCTGATCCAATTATACAGTTCCTAGAGTTGGAAAAGTAGGTAACACTTTTTGTAAAGAGCATTACATAGAACAGATTCAAAAGCTATCTGAGAAGGTAGGTATTATGCGCTTCTGGTTGCTGATTCAATTTATGGACTCTCCACATTAGAGCGAGAGTGCAGCACAACTTTCATTGTCTGCTTATTCAGATATTAGTCCTTAAGTAGATATTATTTAGATAAAAAGAAGGCATTGCAGTGTCAGGACTGATAAAATAGGGGAAAACAAAAAGATACATCACTGTCAAATCCTACAACATAACTTCTAGGAAATACTCCTTAGGAATAGGAAGAGAAATCTAAGGTTTGTACTTCTCCTAAAAATTTAGGGCTATTAACCAGCCTTTGGTTCATGCCTTCTGAACAAGCATTTTCTTTTGTTTGTTTATAGATATATGAGCCCATTACATGACCATGTTTCTGTGATGGGACAACACGGGAGTGTCTTCGTTGGACACATATTTTTAGCAGCTTTTGTTGCTCGATCTTCAAGAAACATAGGTACTATCTCTATTTTTTACTTGGCGCTCAATTGTTCAAGTGTATACTAACAAATGTCAAATAAAAAATTGGAGGGAGTACTTTTCATGGAACCTGAAACCATGGAGAGCTAGAGGTTATATTGTTTTTTCTGTATATCAAATAACCTACTCCTACCGTGCTACTTGATTAATTCATCATCTTTGGCCAATTATTTATGAGTTGTCACATGGGTGTTGTTGATAATCATGTTTTTGGAAACATCTACTTATCTCACTTTTCCCTTAAGTGCTTTGGACAACTATATGTTGGATTCTGTATAGAAGATCCTCTACAACAACAATTATGTATTGTTTGTGTTGAAATTCACCCCTGCCAATCTTTTATGATTTATAAATGTTGTGTTTACTATTGTTACCCCTGCCATTATGATGTTCCAATGTTCTAAAAAAGGTTGACTTTTCACTTGCACTAATTGTTCTTTTTGAATTTGGATCTTTGCTAGCAAGATTCTTTT
ATCATATTGATTTTGTAGCAACATACGTGCACCTTACTATATCTTTTAAGGTATATAAACATCCACTTTTAGTGATGCTAACAAAAGTATAGAATAACAATTAGAGTTTTTGCATATCAATGTATTTTATCATATTGATTTCGTAGCAACGCACGTGCATATACCTGCCTCTCCTTCCCGTCGCCGCAACACAAAGCTCCGTCGGACACCGACCCGACGCTGCTCTACTGCCTGCTACGTTGTCCCTCCGCTGGCTCCCACGTACACCTTCTCGACCGCGCGGCTACTCCACCACTATCCCTCGAAGATTGCCCCCTCCCCAAGACTGCCAACCTCCACCTCACGTCGCGTGCCTCGCAGACGGGCGACCTCGCACCGCAGGCTTGGCGCTCGACCGTGGCCCGAGTTTCGCTTCTACGTCATCCGGAAAGCTCGAGGAGGTCATCCTCCATCTCATAACAGTATAATACACATTTGCTTATAAGTTATAGTGATATTATATGTTTCCGTTGCAACGCACGGGCACTGACCTAGT

5' end

TCTCTACTACTTATTAAGGC

AT content is: 0.65

CG content is: 0.35

3' end

ATGTTTCCGTTGCAACGCACGGGCACTGACCTAGT

AT content is: 0.457142857142857

CG content is: 0.542857142857143

Accession Number for Related Data: >AC191632.3-Contig9, Gi Number:

Genome sequence: reverse complemented Genome Size: 38160

TCTCTACTAACTATTAAG, 37165, 37182

ACTTT CG TGGC AA CGCA CG GGCACACGG CTAGT

ACTTTCGTGGCAACGCACGGGCACACGGCTAGT

Found at 3725, 3757

Hairpin to End: 34404, 34436

ACTTTCGTGGCAACGCACGGGCACACGGCTAGT

CACGGGCACGCAACGGTGCTTTCA

000001111000000111100000

Helitron Sequence Location: 34404, 37181 Silico_101

TCTCTACTAACTATTAAGAGCTTATTGTAGACTGCCCCCGCCTCCCTACCCCGCCGCGACCGCTCCGCATGCCCGCCGCGAAGCGCTCGCATGCCCGCCTTGACCGCTCTGCACGCCCGCCGCGAAGCGCTCGCATGCCCGCCTTGACCGCTCCGCACGCCCGCAAAGCCCGCTGCGACCGCTCTGCCGCGAACCGCACGGCCACATCGGGTGCCGCACCACAACGGATCCTTCTTCAGCGGGATCCACCATTTAGGCCTCTGATTCGGCGCGGATTCACGGCGCGGAGCCGCCGAGGTTGCGTGCGTGATTCGGATTGCGGAGCCGTCCGATGGAGGCGCGAGCGTGCGGTCGAGGCAGCCCACGAGCCGCGCGTTGGCGTCGTCGAGCGCCTCGAGGTCCAGCCACAGCGCGGCGGCCTTGACGTCGAGCGCGGCCTGTAGGCGGTTCACCACGGCGTCGAACGACCCGCAGCAGTGCCTGGTTCTCGCGCACTTGGACCTCGAGGTGCGCCGTGAGGGCCCAACCACCGTCCGGCACGGGCGGGCCCACTGGCCCGCTCCTCGCGATCCGCTTGAGCTCCGACAGCCTCCGGAGGTGCGACACAGCGGCGGCGTCCGCATCCGGCAGGAACGTCGTGTGGGCGGCCTGCAGGTGGAGGTACGCCGCCTGGAAAGACGAGGTCGTCGCAAGCGCCGCGGCGACGGCGGCATCCTGCCCGGACGCCGCCTTCCGATCCCCTCCCCATCAGCGCTAGGGTTAGGGTTCGGCGGGTCGGGCTTCAGCACGACAACCCGCTGCCAGGCAATTACCCCATCCGCCCCCGGCGCCTGAGAGCGTGCTAACCGGTCGGCGTCCTCGTCTTCGTCCTCCTCGGCGGGGAACTCGATGGTTCGTGTCTTTCCTTTGAGTGGCCAGAGGCATTGCTGCCAGAGCCTATAAAACCATTATGCGATGAACATGATCCACCTTGTGCAAATCCTGCCCGTGCTTCAGTATCCTTATTCCTCCTGTTGTTTCCCTTGAATCACAGTACAGCATGATCAATACCAACCTATAGAAAGATAAAGGAGCTCACAAAGATATACTTATTGGATGAGGATCTGACGCTTGTCCAGGCACCTCCTTTATATGATCCATTCATAGTAACTGAAGGCACTGAGTCAGATGAACAAGACGATGAGCTATCATCCACAGTAGAGGTCCTCTTTCCTGCATGGTCATTTTGCACCTCACTACTTCCAGGAATAGTAGTTTGAGTTTCTGAAGCATCTATCTCCCAATTAACAGGGCTAGATTCACGGTCTTCAATATCAACATGAAGCACATCTGAATTATCGTCTCTACTGTCAGATATATCAGAAACTTCTTCAGGGTTGTCAGCATTTAAGGGCATCTCTTCAATTTGTCCACAAAAATCATCCAATATTCTATCATCTGAATGACTGTGGTCCATAATTATTTCCTTATTGATGTAAATGTGGCAACACTTGTCAGTTTCATGCTTTTACTAATGGAGGTCCTGTACTGTTCAAATTTAGAGAAATATTACATTTATGTTTTTGAGCACATATTGCTCCAGCTTAGATGTCTGCCAACACAAGTATTGACGTGCGGATGTTTACTTGTTTAGATCATTGAAGGATCTGGCAAGGTTGCTATTAGAGGGTGATTAATTATCTTACATTATTCCAGTTCCTTAGGAGCACTTAAGTTTTTAATAACTAATCAGGTATTCACAAGGTATGGACTCCAAAAGTTGTTTGGCCTAGACTGCTTGGCACTCACCTCAGACCAAACCCTAATTTACCCTGATCTTCCATCTAAATATGGTATGCATAATCGTTTTTAACAAGTTTGAGTCAAGGAATTTAAGAGAAACCAGTGTGGATTTTATATGTACTAACTCATGTTGTCAGTGATGATAATTTTGGTGTCTTTCTTTTGGATTCTGAGCATTGTGTTGACTGTAGTAAAGATGCATAGTCTTGCTGGGCTGT
TCGGTAGCACCTCCAGGTGTTTGGTTATGCTGGTACCTGGCAACACTCAATGGCCATGGAATTGGCAAATAGAGATCCTTGAAATGGTTGCCCACTAGGACCCTGGTAGCCAATGTTCTTGCTGCAAGCATAATGGCAGTTCTTGCTATAACATCCAAAGTGGTATGAGCGAACATTTTTTCATACAGATTCATTGTTTGCGGTTGGTATATTGACTCTTCCATACATCAGCATTTACACCTGATTTATTTAATTTTTCTTACAGGTAGATACAAAGCGATCGACAACTATTCGTAGTGGAATACAACCGACTTCCTTGGTTGCTTGAGCACAATGTCTACTTTTGATGCTGAAGTATATACAATGAGAAGAAGTGGGCAGATTGCCAGGGCCTTCATTTATGTTGCATCCACCTTCCTGCTTTTTTTCTGCTAGCACAACTCTCTTCATGGTAGGTCTGTAATGGATAGTAAATCAAAGTATTTATAGGCTACCATGGTTCAAAGTATTTATAGGCTACCATGGTCCATGGGCTAGAATAATCAGGATTAGAGATAGAAACTTATTACCAAATAGCCAATTGGTTGTAATAGTGATTCAATCAATGTATCTCGCTTTCAGGTGCCCATCGAGGATGCTACAGGCTCATCGACATCTCGGACAACACACTAACATTTTGCCACACAGATTTTTATATATTATAGTGATATTTGTTGTTTTATATTAATGTTTTATACTAATTTTTAACTTTCGTGGCAACGCACGGGCACACGGCTAGT

5' end

TCTCTACTAACTATTAAGAG

AT content is: 0.7

CG content is: 0.3

3' end

TAACTTTCGTGGCAACGCACGGGCACACGGCTAGT

AT content is: 0.428571428571429

CG content is: 0.571428571428571
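The 'AT content' and 'CG content' values listed for the 5' and 3' ends of each record are base fractions, and can be reproduced with a few lines of code. The function name `base_content` is hypothetical; the two test strings are the 5' and 3' ends listed above for AC186626.4-Contig45.

```python
def base_content(sequence):
    """Return (AT fraction, CG fraction) of a non-empty DNA string."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    cg = seq.count("C") + seq.count("G")
    total = at + cg  # characters other than A, C, G, T are ignored
    return at / total, cg / total


at_5, cg_5 = base_content("TCTCTACTACTACATAAGAA")
at_3, cg_3 = base_content("AAGTTGTCGTTGCAACGCACGGGCACTCACCTAGT")
print(round(at_5, 3), round(cg_5, 3))
print(round(at_3, 3), round(cg_3, 3))
```

The 5' end gives 14 A/T bases out of 20 (0.7) and the 3' end gives 16 out of 35 (about 0.457), in agreement with the values in the record.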

Figure 5. Multiple sequence alignment of three Helitron sequences from the maize genome. The first 1,020 nucleotides post-alignment are shown in the figure.

In order to construct a new image from the remixing of the four network graphs shown in Figures 1 to 4, the first 1,024 nucleotides of the post-alignment Helitron sequence AC191691.3-Contig129 were used, because the network graph images exported from Gephi measured 1,024 x 1,024 pixels. Each network graph was assigned to one nucleotide of the post-alignment Helitron sequence as follows:

'Vladimir Kapitonov' > nucleotide 'A'

'Jerzy Jurka' > nucleotide 'G'

'Helitron' > nucleotide 'C'

'Transposable Element' > nucleotide 'T'

The resulting image is shown in Figure 6. It clearly depicts the re-mixing/collaging nature of the new visual creation, with the collage dictated by the post-alignment Helitron nucleotide sequence that was used.
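One way such a nucleotide-driven collage could work is sketched below. It is an assumption, not a description of the method actually used: here each nucleotide position selects one pixel column, and that column is copied from the network graph image assigned to the nucleotide. Positions without an assigned image (for example alignment gaps) are assumed to stay white. Images are represented as plain lists of pixel rows so the example needs no imaging library; the function name `remix_columns` and the solid-color stand-in images are hypothetical.

```python
def remix_columns(images, sequence, mapping, background=(255, 255, 255)):
    """Build an image whose column x is copied from the image mapped to sequence[x].

    images:   dict name -> image as a list of rows, each row a list of pixels
    sequence: aligned nucleotide string, one character per pixel column
    mapping:  dict nucleotide -> image name
    Characters without a mapping (e.g. alignment gaps '-') get `background`.
    """
    height = len(next(iter(images.values())))
    result = [[background] * len(sequence) for _ in range(height)]
    for x, nucleotide in enumerate(sequence.upper()):
        name = mapping.get(nucleotide)
        if name is None:
            continue  # leave the background color in this column
        source = images[name]
        for y in range(height):
            result[y][x] = source[y][x]
    return result


# Nucleotide assignment; 'T' points to the 'Transposable Element' graph of Figure 1.
mapping = {"A": "Vladimir Kapitonov", "G": "Jerzy Jurka",
           "C": "Helitron", "T": "Transposable Element"}
# Solid-color stand-ins for the four network graph images (hypothetical data).
colors = {"Vladimir Kapitonov": (255, 0, 0), "Jerzy Jurka": (0, 255, 0),
          "Helitron": (0, 0, 255), "Transposable Element": (0, 0, 0)}
images = {name: [[color] * 5 for _ in range(2)] for name, color in colors.items()}

remixed = remix_columns(images, "TC-AG", mapping)
print(remixed[0])
```

With real 1,024 x 1,024 images and a 1,024-character sequence, the same loop would yield one column per nucleotide across the full image width.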

Figure 6. Resulting image from remixing network graphs according to the Helitron nucleotide sequence AC191691.3-Contig129 post-alignment (first 1,024 nucleotides).

Figure 6, together with Figures 1 to 4, was then arranged further after removing the white backgrounds with the Python imaging library 'Pillow'. Additional visual elements were created using Processing and added in layers to compose the final artwork. It was important to include in the final work an image of Dr. Charles Du, the scientist involved in the identification and discovery of the Cornucopious Helitron family in the maize genome. As the ultimate background for the painting, the network graph shown in Figure 2 was used to 'paint' by moving the image along the X,Y position of the mouse. The resulting artwork is shown in Figure 7.
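Removing a white background amounts to making near-white pixels transparent. The sketch below works on a plain list of RGBA tuples so that it runs without any imaging library. With Pillow, such a list could be obtained from an image opened with `Image.open(...)`, converted with `convert("RGBA")` and read with `getdata()`, then written back with `putdata()`. The function name `white_to_transparent` and the threshold of 250 are assumptions made for illustration.

```python
def white_to_transparent(pixels, threshold=250):
    """Make near-white RGBA pixels fully transparent.

    pixels:    iterable of (r, g, b, a) tuples
    threshold: assumption; a pixel counts as white when all three color
               channels are at or above this value
    """
    cleaned = []
    for r, g, b, a in pixels:
        if r >= threshold and g >= threshold and b >= threshold:
            cleaned.append((r, g, b, 0))  # alpha 0 = fully transparent
        else:
            cleaned.append((r, g, b, a))
    return cleaned


sample = [(255, 255, 255, 255), (12, 40, 200, 255), (251, 252, 250, 255)]
cleaned = white_to_transparent(sample)
print(cleaned)
```

A threshold slightly below 255 also catches the off-white pixels that anti-aliasing tends to leave around node and edge outlines.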

Figure 7. 'Helitron Art' created by Martin Calvino based on complex network graphs of Wikipedia pages and their re-mixing determined by maize Helitron DNA sequences post-alignment.

Conclusion_

The resulting artwork shown in Figure 7 exemplifies the broad possibilities of co-opting algorithmic tools used in technology and science for the creation of interesting visual art. In doing this, the artist becomes a humanizing force, translating scientific concepts into color, form and texture for emotional impact. At the same time, the artist plays a central role in portraying the scientist behind the work being explored for art creation, making him a central piece of the visual work. Interestingly, although each visual element of the painting was derived from technical/scientific information, its re-contextualization and re-mixing served a different but related purpose: art-science.

References_

Kapitonov, V and Jurka, J (2001). Rolling-circle transposons in eukaryotes. PNAS, (98) 15: 8714-8719

Feschotte, C and Pritham, EJ (2007). DNA transposons and the evolution of eukaryotic genomes. Annual Review of Genetics, 41: 331-368

Du, C., Caronna, J., He, L., Dooner, HK (2008). Computational prediction and molecular confirmation of Helitron transposons in the maize genome. BMC Genomics, 9: 51

Du, C., Fefelova, N., Caronna, J., He, L., Dooner, HK (2009). The polychromatic Helitron landscape of the maize genome. PNAS, (106) 47: 19916-19920

Xiong, W., He, L., Lai, J., Dooner, HK., Du, C (2014). HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. PNAS, (111) 28: 10263-10268

Xiong, W., Dooner, HK., Du, C (2016). Rolling-circle amplification of centromeric Helitrons in plant genomes. The Plant Journal 88: 1038-1045

Xiong, W., and Du, C (2014). Mining hidden polymorphic sequence motifs from divergent plant Helitrons. Mobile Genetic Elements, (4) 5: 1-5

Zinoviev, D (2018). Complex Network Analysis in Python. The Pragmatic Programmers (Raleigh, North Carolina)

Calvino, M (2018). https://www.martincalvino.co/single-post/2018/03/09/Auditory-perception-of-reduction-in-genome-diversity-as-consequence-of-plant-domestication

Calvino, M (2017). https://www.martincalvino.co/single-post/2017/07/05/Post-polyploidy-subgenome-evolution-of-Glitch-Art

Wikipedia Module in Python_

https://pypi.org/project/wikipedia/ (Accessed during April 8-11 of 2019)

Python Image Library_

https://pillow.readthedocs.io/en/stable/ (Accessed during April 8-11 of 2019)

Processing_

NetworkX_

Gephi_
