
Open access peer-reviewed chapter

RNA-seq – Revealing Biological Insights in Bacteria

Mariana P. Santana, Flavia F. Aburjaile, Mariana T.D. Parise, Sandeep Tiwari, Artur Silva, Vasco Azevedo and Anne Cybele Pinto

Submitted: April 1st, 2015 Reviewed: October 2nd, 2015 Published: January 14th, 2016

DOI: 10.5772/61669

Abstract

New technologies are constantly being released and the improvements therein bring advances not only to the transcriptome, the focus of this chapter, but also to diverse areas of biological research. Since the announcement and application of the RNA-seq approach, discoveries have been made in this field, but for bacterial species this progress has lagged a few years behind. However, with the application of RNA-seq derivative approaches, we can gain biological insights into the bacterial world and aspire to uncover the mysteries involving gene expression, organization and other functional genomic features.

Keywords

RNA-seq
bacteria
transcriptomics
bioinformatics analysis workflow

Mariana P. Santana
- Instituto de Ciências Biológicas-ICB/UFMG, Departamento de Biologia Geral, Pampulha, Belo Horizonte, Minas Gerais, Brasil
Flavia F. Aburjaile
- Instituto de Ciências Biológicas-ICB/UFMG, Departamento de Biologia Geral, Pampulha, Belo Horizonte, Minas Gerais, Brasil
Mariana T.D. Parise
- Instituto de Ciências Biológicas-ICB/UFMG, Departamento de Biologia Geral, Pampulha, Belo Horizonte, Minas Gerais, Brasil
Sandeep Tiwari
- Instituto de Ciências Biológicas-ICB/UFMG, Departamento de Biologia Geral, Pampulha, Belo Horizonte, Minas Gerais, Brasil
Artur Silva
- Centro de Ciências Biológicas, Departamento de Genética. Universidade Federal do Pará, Campus do Guamá, Guamá. Belém, Pará, Brasil
Vasco Azevedo*
- Instituto de Ciências Biológicas-ICB/UFMG, Departamento de Biologia Geral, Pampulha, Belo Horizonte, Minas Gerais, Brasil
Anne Cybele Pinto
- Instituto de Ciências Biológicas-ICB/UFMG, Departamento de Biologia Geral, Pampulha, Belo Horizonte, Minas Gerais, Brasil

*Address all correspondence to: vasco@icb.ufmg.br

1. Introduction

RNA-seq technology has driven advances in gene expression analysis through new-generation sequencing platforms, as they are versatile, powerful and ensure quality results with accuracy and reproducibility never reached before. This technology generates information that provides meaning to the set of transcripts (transcriptome), opening up possibilities for understanding cell behavior in different environments. RNA is an important component within the cell, since it plays different roles as a messenger, regulatory molecule and carrier; it is also essential for the maintenance of housekeeping genes [1].

In 2005, the first new generation of sequencing technology was released and has been evolving rapidly [2]. After gene expression analysis in bacteria began [3, 4] at a more accessible cost, with shorter experimental time and without probes, the technology took off and today overlaps other tools used for this purpose, such as microarray technology, which until now has been extremely useful for this type of analysis.

2. Applications of RNA-seq

Understanding the transcriptome is essential to knowledge of the functional genomics of an organism. The development of next-generation sequencing (NGS) impacts different areas, such as medical and industrial, and has gone through a revolutionary process. Different approaches, among them the RNA-seq technique, have emerged in the fields of microbiology and molecular biology in order to aid understanding and bring solutions to investigations in the bacterial domain. In this section, we will detail some applications that are part of our current context.

2.1. The medical field

The applications of these NGS technologies in medicine have allowed expansion in the fields of diagnosis, treatment and prevention, especially concerning bacterial diseases. One of their major applications has been the quantification of expression levels of each transcript under different conditions that simulate the intracellular environment. Such work has been done by Pinto et al. (2014) to understand the host–pathogen relationship [5]. Westermann et al. (2012) demonstrated the validity of this technique for the transcriptomes of pathogenic bacteria and their hosts, using dual RNA-seq, which simultaneously analyzes the gene expression of the pathogen and the host [6]. This gives us a better understanding of the systems biology involving bacteria and their hosts, helping scientists to develop drugs and vaccines.

Another field that has been explored extensively involves the metatranscriptome, as scientists have sought to comprehend the composition and regulation of microbial ecosystems [7, 8]. To pursue this, they have used the RNA-seq technique to generate, and allow the interpretation of, a large volume of very reliable data. Leimena et al. (2013) also validated the RNA-seq technique using the microbiota of a human small intestine with ileostomy. Their aim was to understand the interactions involved in this microbial ecosystem and how these relationships can be associated with disease [8]. Transcriptome analysis pipelines (see Section 5) can be used with different experimental designs and applied to many bacteria in addition to those in the medical field.

2.2. The industrial field

Industrial applications have been developed in recent years, mainly in the probiotic industry, since it benefits the world economy. Bisanz et al. (2014) used the RNA-seq technique [9] to examine the metatranscriptome of probiotic yogurt, seeking to understand the metabolic activities that allow the survival of this organism in the products. Their results show the adaptive capacity of this bacterium, as well as the variation in differential gene expression, which influences the taste or storage life of the product [9]. Studies such as these are important because they enrich the knowledge of the industrial field and open up new possibilities for an attractive area in the marketplace, which results in improvement in the quality of the product that is ultimately delivered to the consumer.

In addition to the probiotic market, another important area is the bacterial production and synthesis of biomolecules. Wiegand et al. (2013) used the RNA-seq technique to understand the regulatory RNAs in the fermentation of Bacillus licheniformis. Their study identified active genomic regions which, in turn, contribute to the efficiency and optimization of the fermentation process, which can promote the industrial production of exoenzymes and antibiotics [10].

Microorganisms produce antioxidant molecules that can be used in the pharmaceutical and cosmetic industries. They also produce other compounds, such as propionate, that are applicable in the production of chemical aids and are produced by Propionibacterium freudenreichii ssp. shermanii, which is considered valuable in the food industry [11]. In this area, the RNA-seq technology is very promising and its application can bring advances in these studies.


3. RNA-seq and derivative techniques

3.1. RNA-seq

The RNA-seq technology is able to identify all RNAs directly and quantitatively: coding and non-coding, rare and abundant, smaller and larger. This method provides information about the transcription start site (TSS) and untranslated regions (UTRs), detects unknown open reading frames (ORFs), improves the quality of genome annotation [12], and also allows the distinction between primary and processed transcripts (dRNA-seq) [13].

The major constraint is to ensure that rare transcripts are represented. In this case, the recommendation is either to increase the number of reads per library [14] or to enrich these transcripts by eliminating the ribosomal (rRNA) and transfer (tRNA) RNAs, which are abundant in cells and represent about 95% of total RNA [15].
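The effect of rRNA and tRNA abundance on the reads left for rare transcripts can be illustrated with simple arithmetic. The sketch below is only an illustration: the library size of 10 million reads and the 50% rRNA/tRNA fraction after depletion are hypothetical values chosen for the example, while the 95% figure is the one quoted above from [15].

```python
def informative_reads(total_reads: int, structural_fraction: float) -> int:
    """Reads left for mRNA/sRNA once rRNA and tRNA reads are set aside."""
    if not 0.0 <= structural_fraction <= 1.0:
        raise ValueError("structural_fraction must be between 0 and 1")
    return round(total_reads * (1.0 - structural_fraction))


# Hypothetical library of 10 million reads.
library_size = 10_000_000
undepleted = informative_reads(library_size, 0.95)  # rRNA + tRNA at about 95% [15]
depleted = informative_reads(library_size, 0.50)    # assumed fraction after depletion

print(undepleted)             # reads available without depletion
print(depleted)               # reads available with the assumed depletion
print(depleted / undepleted)  # fold gain in informative reads
```

Under these assumed numbers, depletion yields the same gain in informative reads as sequencing the undepleted library ten times deeper, which is why the two recommendations are presented as alternatives.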

Although RNA-seq is generally considered the gold standard for gene expression analysis, some researchers still find it complicated to define this technology as such. It is a method that is available on different platforms and addresses different strategies, each with advantages and disadvantages. However, the superiority of this technology over earlier ones is not questioned [16].

Despite the technological superiority, the need for biological replicates and adequate depth of sequencing remains. With both, the results achieve greater reliability and reproducibility [17]. Differentially expressed genes are better appraised when there are more biological replicates than when depth is increased with fewer replicates [18].

Transcriptomics studies have brought about a revolution in the study of the bacterial world. Different bacterial species have been targeted for RNA-seq studies [5, 13, 19, 20], and gene expression-based discovery has transformed the scientific paradigm of these organisms. The detection of an unexpected number of coding genes in Helicobacter pylori has demonstrated that, despite having a small compact genome, the transcriptome of this bacterium is extremely complex [13].

A surprising result was the detection of a large number of transcription start sites (TSS). This had never been achieved before using any technology other than derivative RNA-seq technology, such as differential RNA-seq (dRNA-seq), which differentiates primary transcripts that exhibit triphosphate ends from processed transcripts that present monophosphate ends, such as rRNAs and tRNAs. In this case, to enrich mRNA, the strategy was to treat all the RNA samples with exonuclease enzymes that degrade RNAs with monophosphate ends. This strategy identified 5'UTR ends, operons and antisense transcription, thus providing a new perception of the organization of the bacterial transcriptome and a new model for the analysis of individual genes [13].

The results obtained allow the inference of a role for 5'UTR regions. A correlation between size and cell function was proposed by the researchers, who found that larger size is related to pathogenicity [13]. These results show how little knowledge there is regarding microorganisms, believed to be the simplest form of life, yet which nevertheless prove to be more complex than previously anticipated. This leaves a lot to be discovered.
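The comparison underlying dRNA-seq, in which an exonuclease-treated library is set against an untreated one, can be sketched as follows. This is a toy illustration and not the procedure used in [13]: the function name, the thresholds `min_reads` and `min_ratio`, the pseudocount and the example counts are all assumptions, and library-size normalization is left out for brevity.

```python
def candidate_tss(treated, untreated, min_reads=10, min_ratio=2.0, pseudocount=1.0):
    """Return positions whose read-start count is enriched in the treated library."""
    if len(treated) != len(untreated):
        raise ValueError("both libraries must cover the same positions")
    hits = []
    for pos, (t, u) in enumerate(zip(treated, untreated)):
        # The pseudocount avoids division by zero at positions without reads.
        ratio = (t + pseudocount) / (u + pseudocount)
        if t >= min_reads and ratio >= min_ratio:
            hits.append(pos)
    return hits


# Hypothetical read-start counts per genomic position.
treated_starts = [0, 2, 40, 3, 1, 0, 12, 0]
untreated_starts = [0, 3, 9, 4, 1, 0, 30, 0]

print(candidate_tss(treated_starts, untreated_starts))
```

In this example only position 2 is reported: its read starts are enriched after treatment, as expected for a primary (triphosphate) 5' end, whereas position 6 loses reads after treatment, as expected for a processed end.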

An RNA-seq application that has been widely used in bacterial genomes is found in studies focused on identifying small RNAs (sRNA). These elements are regulators of various biological processes and were initially studied primarily in Escherichia coli [21]. Nonetheless, with the advances in technology, it has been possible to identify and characterize small RNAs in a variety of bacterial species [13, 22, 23]. Yan et al. (2013) identified an expression profile of sRNA in Yersinia pestis, both in vitro and in vivo. This has allowed the identification of new sRNAs and the recognition of gene expression modulation during the infection process, thus improving the understanding of the transcription regulation mechanisms of this organism [24]. The importance of studies involving sRNA also includes assistance in research related to antibiotic therapies, a field still in early development in which much knowledge remains to be exploited [25].

RNA-seq has been used in different areas and situations. Advanced studies using this technology can find details in cell expression [26]. Even with the difficulties in separating eukaryotic and prokaryotic materials, it was possible to distinguish the simultaneous expression profiles of host and pathogen responses through dual transcriptome studies. This work disclosed the host response against the bacterial infection and the virulence factors, making it possible to characterize the infectious process [27]. These studies contribute to research in the field of infection biology by examining diverse pathogens with different life cycles and methods of infection and, like metatranscriptomic studies, by providing crucial knowledge for work on diagnostics and vaccines.

After a relatively short time on the market, RNA-seq can accurately reveal structural and functional elements of bacteria. The mapping of transcripts to the genome can refine the annotation or even identify new regions, improve the quality of the studied genome compared to regions previously annotated by predictors or assembled using an ab initio approach [28, 29], and can even check the abundance of transcript expression.

Data coming from a quality genome tend to provide more promising results, responding to the biological question being investigated by researchers. In search of a quality genome, ab initio transcript assembly, or even a hybrid approach that uses both the reference genome and ab initio assembly, becomes a promising attempt to solve many problems that are encountered in the genome and are complicated to fix [28].

Pinto et al. (2012) conducted a study of Corynebacterium pseudotuberculosis adopting ab initio assembly and, therefore, were able to identify differences in the expression of active genes under different environmental conditions. This allowed them to detect new possible virulence factors involved in pathogenicity, making them targets for vaccine development, diagnosis or treatment against caseous lymphadenitis, the disease caused by this bacterium [30].

These results suggest the importance of this technology and the possibility of going further with a tool that aims to improve, and probably will expand, the field of analysis. This could bring the results increasingly closer to bacterial molecular reality.

3.2. tagRNA-seq

Bacterial RNA can be divided into two groups: primary and processed transcripts. Primary transcripts are characterized by the presence of 5'-triphosphate (5'PPP) and include messenger RNA (mRNA) and small RNAs (sRNA). Processed transcripts are those carrying 5'-monophosphate (5'P), such as mature ribosomal RNA (rRNA) and transfer RNA (tRNA).

Processed transcripts represent approximately 95% of the total bacterial transcriptome [15]. A recently developed approach called dRNA-seq [13] revolutionized the study of the primary transcripts by considering the 5' difference between the primary and the processed groups, as mentioned previously (see Section 3.1).

RNAs are very unstable and, during preparation in the "wet-lab" experiments, some transcripts are partially or totally degraded. The 5'PPP and 5'P ends are two of the mechanisms of protection against exonucleases and are the first portion of the transcripts to be degraded. During that process, information is lost and some primary transcripts end up with 5'P and are treated as processed transcripts. Consequently, they are eliminated by the dRNA-seq technique. A new methodology was created to overcome this problem by tagging and clustering the two groups together in an RNA-seq-derived approach named tagRNA-seq [31]. This technique also considers the difference between processed and primary transcripts, but instead of degrading the processed ones, two different ligation reactions are implemented with two different markers: PSS-tag (processed start site) and TSS-tag (transcription start site). They differ in their nucleotide sequence. Figure 1 briefly shows the methodology, considering the three main steps: (1) the first ligation reaction, which places the PSS-tag on the processed transcripts; (2) treatment with tobacco acid pyrophosphatase (TAP), in which the 5'PPP loses two phosphates, which allows the third step; (3) the second ligation reaction, which places the TSS-tag on the primary transcripts. After those steps are completed, the transcripts are sequenced and, due to the different markers, they can be distinguished and compared [31].

Figure 1.

The three main steps of the tagRNA-seq approach. (1) The first ligation reaction, during which the attachment of the PSS-tag (blue) to the processed transcripts (5'P) occurs. (2) Treatment with tobacco acid pyrophosphatase (TAP), turning triphosphate into monophosphate groups. (3) The second ligation, corresponding to the TSS-tag (yellow) marker on the previously 5'PPP group (primary transcripts). The different markers allow the differentiation of the triphosphate and monophosphate groups after sequencing.
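Once the libraries are sequenced, separating the two groups amounts to inspecting the tag at the start of each read. The sketch below illustrates that idea only; the two tag sequences are placeholders invented for the example and are not the sequences used in [31], and the function name is hypothetical.

```python
# Placeholder tag sequences, assumed for illustration only.
PSS_TAG = "ACGTAC"  # marks processed transcripts (originally 5'P)
TSS_TAG = "TGCATG"  # marks primary transcripts (originally 5'PPP)


def classify_read(read: str) -> tuple[str, str]:
    """Return (group, insert), where group is 'processed', 'primary' or 'untagged'."""
    if read.startswith(PSS_TAG):
        return "processed", read[len(PSS_TAG):]
    if read.startswith(TSS_TAG):
        return "primary", read[len(TSS_TAG):]
    return "untagged", read


reads = ["ACGTACGGATTC", "TGCATGCCGTAA", "GGGTTTAAACCC"]
for read in reads:
    print(classify_read(read))
```

The tag is removed before the remaining insert is mapped, so both groups can be aligned to the same reference and compared position by position.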

This methodology was first described for Enterococcus faecalis [31] and was based on another technique, 5'tagRACE [32], a 5'RACE-derived method. The results provided by tagRNA-seq improved the annotation of the E. faecalis genome by identifying or correcting several genome portions, including both non-coding and coding regions. This study also compared different libraries to prove the effectiveness of this innovative approach. With this, it provided a new method capable of differentiating primary and processed RNAs, suited to a better comprehension of the genetic information of bacteria as well as of other groups [31].

dRNA-seq and tagRNA-seq are approaches that enable a new view of the transcriptome by selecting the primary transcripts for sequencing or by differentiating the primary from the processed transcripts, for a broader insight into the transcriptome. These state-of-the-art techniques promise a better understanding of RNA structures such as TSS, 5'UTR and promoters, among others, as well as knowledge of non-annotated genes and small RNAs.

3.3. FRT-seq (flowcell reverse transcription sequencing)

Flowcell reverse transcription sequencing (FRT-seq) is a new and improved methodology, derived from the RNA-seq technology, that was created for Illumina sequencers. Unlike RNA-seq, FRT-seq does not require amplification by PCR, a step that usually introduces bias into the results by displaying an erroneous view of the quantity of some RNA species [33]. Other important features of this Illumina sequencing methodology are the ability to generate strand-specific information, the use of paired-end libraries and the need for a considerable initial amount of RNA template. PCR-free amplification is a major step towards a more comprehensive library, akin to the original one, but without the formation of intermolecular priming artefacts, among other errors. It will probably become a fairly useful technique in the near future [33, 34]. Third-generation sequencing platforms, like Nanopore and PacBio, also use amplification-free approaches. However, neither is currently being broadly used since they still exhibit sequencing errors.

FRT-seq comprises the fragmentation of the template (e.g., mRNA) followed by ligation of adapters at both the 3' and the 5' ends, which are responsible for the hybridization of the template with oligonucleotides on the flowcell surface. The next steps performed are quantification, reverse transcription and then the sequencing reaction [33, 34].

This approach can be applied to both eukaryotes and prokaryotes, although the number of published papers involving eukaryotes is more substantial. From the bacterial world, we can cite papers involving Salmonella enterica [23] and Shigella flexneri [35] in which FRT-seq was applied as a complementary approach to describe the transcriptional landscape of the species. In both cases, FRT-seq showed greater sensitivity and excellent concordance when compared to other approaches and replicates.

The S. enterica paper [23] shows that FRT-seq is as efficient as the RNA-seq and dRNA-seq techniques (Figure 2) (Table 1). Figure 2 compares nine different RNA libraries: TEX (1, 2, 3), RNA-seq (1, 2, 3, *) and FRT-seq (depleted and not depleted). TEX (libraries treated with terminator exonuclease) is a dRNA-seq methodology (see Sections 3.1 and 3.2) that, together with the first three RNA-seq biological replicates, was sequenced using a 454 (1 and 2) or an Illumina GAII (3 and FRT-seq) sequencer, and the RNA-seq* (library enriched for small RNA species) was sequenced using Illumina HiSeq. The charts relate the percentages of different RNA species and show that the FRT-seq libraries provide similar or better results than the other approaches. The data presented in Table 1 also support this claim, especially considering both the total number of reads and the uniquely mapped reads achieved using the FRT-seq libraries.

Figure 2.

Sequencing methodology comparison. Adapted from [23]. IGR – Intergenic region; TEX – libraries treated with terminator exonuclease; RNA-seq* – library enriched for small RNA species (sRNA).

Library	Sequencing technology	Description	Total number of reads	Number of reads (not mapped)	Number of reads (uniquely mapped)	Percent uniquely mapped reads [%]	Minimum fold coverage ^#
TEX_1	454	dRNA-seq library biological replicate 1	161,031	72,623	88,408	54.90	1.11
RNA-seq_1	454	RNA-seq library biological replicate 1	248,993	83,030	165,963	66.65	2.03
TEX_2	454	dRNA-seq library biological replicate 2	111,462	10,785	100,677	90.32	2.16
RNA-seq_2	454	RNA-seq library biological replicate 2	93,337	38,577	54,760	58.67	0.61
TEX_3	Illumina GAII	dRNA-seq library biological replicate 3	1,738,867	122,058	1,211,426	69.67	20.99
RNA-seq_3	Illumina GAII	RNA-seq library biological replicate 3	2,148,563	136,871	1,360,113	63.30	21.16
RNA-seq*	Illumina HiSeq	RNA-seq library biological replicate 4	3,750,797	164,658	2,596,010	69.21	25.11
FRT-seq	Illumina GAII	FRT-seq library biological replicate 5	18,563,218	4,203,715	2,456,792	13.23	16.42
FRT-seq dep	Illumina GAII	FRT-seq library biological replicate 5, rRNA depleted	24,585,564	9,652,397	4,093,744	16.65	27.77

Table 1.

Sequencing statistics. Adapted from [23]
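The percentage column of the table above follows directly from the read counts: it is the number of uniquely mapped reads divided by the total number of reads. The short sketch below recomputes it for a few rows as a consistency check; the read counts are copied from the table, while the dictionary layout and the function name are assumptions made for the example.

```python
# (total reads, uniquely mapped reads) copied from the table above.
table_rows = {
    "TEX_1": (161_031, 88_408),
    "RNA-seq_3": (2_148_563, 1_360_113),
    "FRT-seq": (18_563_218, 2_456_792),
    "FRT-seq dep": (24_585_564, 4_093_744),
}


def percent_unique(total: int, unique: int) -> float:
    """Percentage of all reads that map to a single genomic location."""
    return round(100.0 * unique / total, 2)


for library, (total, unique) in table_rows.items():
    print(f"{library}: {percent_unique(total, unique):.2f}%")
```

The recomputed values agree with the table. They also show that the FRT-seq libraries stand out in the absolute number of uniquely mapped reads rather than in the percentage.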

The S. flexneri paper [35] also reports a favourable outcome concerning FRT-seq. In fact, this approach revealed a larger gene repertoire than RNA-seq (Table 2).

	RNA-seq		FRT-seq
	Condition A	Condition B	Condition A	Condition B
Total number of mapped reads	20,099,597	22,736,494	49,925,286	47,605,241
Total number of reads mapping to genes	1,525,782	2,271,423	3,037,954	2,585,600
Reads mapping genes in sense	1,195,446	1,958,533	2,469,828	2,129,951
Reads mapping genes in antisense	330,336	312,890	568,126	455,649

Table 2.

Sequencing statistics. Adapted from [31].

The data presented in this topic demonstrate the quality of this recently published methodology and, according to the authors [33, 34], new updates are still being developed. This will probably provide an even better approach for users. The fact that this technique is only applicable to Illumina sequencers is a drawback; but, since this sequencing platform is available worldwide, this disadvantage can easily be overcome. Possibly, in the near future, it can be extended to work on other sequencing platforms. Another particularity of this technique is its efficiency with AT-rich genomes, which does not constrain its application to AT-poor genomes. This is due to the PCR-free amplification, which raises a question for other sequencers like Nanopore and PacBio. Despite these issues, this technology has a bright future and is a great advance over conventional RNA-seq.

3.4. Chromatin immunoprecipitation followed by sequencing (ChIP-seq)

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a technique for the genome-wide profiling of DNA-binding proteins, histone modifications or nucleosomes [36]. ChIP-seq has become an essential tool for studying gene regulation and epigenetic mechanisms. It offers higher resolution, less noise and greater coverage than its array-based predecessor, ChIP-chip [37, 38]. This approach has six main steps: (1) it is initiated with cell cultures that are grown under defined conditions and, when the cultures reach the desired stage of development, they are treated with formaldehyde for the cross-linking of proteins and DNA; (2) the chromatin is sheared by sonication into small fragments (200–600 bp); (3) an antibody specific to the protein is used to immunoprecipitate the DNA–protein complex; (4) the cross-links are reversed by heating; (5) the released DNA is subjected to high-throughput sequencing and (6) in silico analysis is carried out, in which the resulting sequencing reads are checked for quality and then trimmed accordingly [38–40]. The trimmed reads are then aligned to a reference genome. Afterwards, areas of enrichment in the ChIP-seq data are identified and those areas, usually called peaks, represent where the transcription factors (TF) bind throughout the genome. CisGenome, MOSAiCS and MACS are some known algorithms that have been used in bacterial ChIP-seq analysis [38, 41]. After peaks are associated with downstream genes, a number of bioinformatics analyses can be carried out, including identification and analysis of motifs, differential analysis and association with expression data for a deep understanding of the bacterial regulon. This is shown in Figure 3 [36].

Figure 3.

ChIP-seq sample preparation and analysis. Adapted from [36].
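The quality-based trimming of reads in step (6) can be illustrated with a minimal sketch. It is a simplified stand-in for dedicated trimming software, not the procedure of any cited study: it assumes FASTQ quality strings in the Phred+33 encoding, removes bases only from the 3' end, and uses a hypothetical cut-off of Q20; the function names are also assumptions.

```python
def phred33(quality: str) -> list[int]:
    """Decode a FASTQ quality string that uses the Phred+33 offset."""
    return [ord(ch) - 33 for ch in quality]


def trim_3prime(seq: str, quality: str, min_q: int = 20) -> tuple[str, str]:
    """Drop bases from the 3' end until one reaches at least min_q."""
    if len(seq) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    scores = phred33(quality)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


# Hypothetical read: the last two bases have Phred scores 2 and 0.
seq, qual = "ACGTACGTAC", "IIIIIII5#!"
print(trim_3prime(seq, qual))
```

Here the two low-quality bases at the 3' end are removed and the first eight bases are kept. Production tools usually add sliding-window averaging, adapter removal and a minimum-length filter, none of which is shown here.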

As whole-genome transcription profiling cannot reveal whether the influence of the transcription factors (TF) on RNA levels is direct or indirect, this requires identification of transcription factor binding within the appropriate promoter region. ChIP-seq provides information about where the TFs are bound. Thus, by integrating ChIP methods and transcription profiling, it is possible to identify all direct regulatory targets of a TF for a given condition. For instance, work carried out by Stringer et al. (2014) on the araC gene of Escherichia coli and Salmonella enterica has identified direct regulatory targets of AraC, including five novel target genes: ytfQ, ydeN, ydeM, ygeA and polB [42]. Although ChIP-seq has been used only sparingly to study bacterial systems, in a few species such as Vibrio harveyi, V. cholerae, Rhodobacter sphaeroides, Mycobacterium tuberculosis, S. enterica and Caulobacter crescentus [36, 37, 43–45], it has been used to identify novel regulatory interactions, even for well-studied proteins [46, 47].

ChIP-seq, in combination with RNA-seq, could be an efficient tool to get detailed information about bacterial transcription regulation and how bacteria respond to different external conditions.

3.5. RNA immunoprecipitation sequencing (RIP-seq)

RNA immunoprecipitation (RIP) is the study of intracellular RNA and protein binding; it is a tool for understanding the dynamic process of post-transcriptional regulatory networks. With this technique, an antibody is used against a protein of interest to recover the RNA species bound to the protein. Since the sequence data of the RNA species bound to a specific protein is often desired, an approach combining RNA immunoprecipitation with sequencing technology (RIP-seq) was created [48]. The main challenge of RIP-seq is the cross-linking step, which is relatively inefficient, so that only a small amount of RNA is available to construct the library [48, 49]. After that step, treatment with endonuclease elucidates the specific binding sites within the RNA, as they will be protected from digestion. This is followed by purification of the RNA–protein complexes using electrophoresis and by high-throughput sequencing [48, 50]. Finally, the data obtained from the sequencer are analyzed using bioinformatics tools. The first study using the RIP-seq-based technique was carried out on Salmonella by Sittka et al. (2008) [51]. They used the RNA-binding property of the Hfq protein in their analysis and, as a result, many new sRNAs were discovered [52]. Thus, RIP-seq could be an efficient tool for the identification of bacterial non-coding RNAs.

3.6. LEA-seq (low error amplicon sequencing)

The LEA-seq technique (low error amplicon sequencing) emerged in 2013 and was developed and patented by Gordon and Faith (2014) [53]. This method was created to improve the quality and depth of sequencing runs, since the massive amount of data produced by NGS is accompanied by a high error rate in the sequencing, due to problems with the algorithms or platform read lengths [53].

LEA-seq is a nucleic acid sequencing technique that identifies events that occur at low frequency, seeking to understand mutation events. The three basic steps for implementing this technique are: (1) linear PCR, (2) exponential PCR and (3) sequencing. This technique is performed based on bacterial 16S sequencing, in which PCR is carried out numerous times and each amplification uses specific primers for each linear molecule [53].

The LEA-seq technique is a quantitative method that has the advantage of generating and reading multiple copies of each molecule. This permits the formation of a consensus and the elimination of errors for each molecule. Other currently available techniques do not support error detection in sequencing or the identification of whether there is a real variation in the sequence of a given microorganism. The multiple sequencing provided by the LEA-seq technique supports better quality and precision about the organism.
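The consensus idea can be sketched in a few lines. This is an illustration under stated assumptions, not the published LEA-seq pipeline [53]: it assumes that each read already carries an identifier of the original molecule it was copied from, that reads of one molecule have equal length, and that a majority vote per position is an acceptable consensus rule. The identifiers, sequences, function names and the `min_copies` threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def consensus(reads: list[str]) -> str:
    """Majority base at each position; all reads must share one length."""
    if not reads or len({len(r) for r in reads}) != 1:
        raise ValueError("need at least one read, all of equal length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


def consensus_per_molecule(tagged_reads, min_copies=3):
    """Group (molecule_id, sequence) pairs and call one consensus per molecule."""
    groups = defaultdict(list)
    for molecule_id, sequence in tagged_reads:
        groups[molecule_id].append(sequence)
    # Molecules read fewer than min_copies times are discarded.
    return {m: consensus(seqs) for m, seqs in groups.items() if len(seqs) >= min_copies}


tagged = [
    ("mol1", "ACGTTGCA"), ("mol1", "ACGTTGCA"), ("mol1", "ACGATGCA"),
    ("mol2", "ACGTCGCA"), ("mol2", "ACGTCGCA"), ("mol2", "ACGTCGCA"),
    ("mol3", "TTTTTTTT"),
]
print(consensus_per_molecule(tagged))
```

In this toy data set, the single deviating base in one copy of mol1 is outvoted and treated as a sequencing error, whereas the difference between mol1 and mol2 is present in every copy of mol2 and is therefore retained as a real variation. The molecule read only once (mol3) is dropped because no consensus can be formed.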

The study by Faith et al. (2013) aimed to identify the composition of the faecal microbiota of adults and to understand the role of these bacterial species and their therapeutic potential for intestinal diseases. This technique allowed them to work with a large number of samples (over 500 isolates), as well as to achieve a fast and accurate analysis of the data [54].

Researchers have a continuing interest in improving this technique, since its high accuracy makes it usable for clinical investigation: for example, in patients with genetic or somatic mutations. LEA-seq can assist in the search for knowledge about the intestinal microbiota, as it may reveal its composition, opening up prospects for the diagnosis, treatment and prevention of gastrointestinal tract diseases.

3.7. CRISPR (clustered regularly interspaced short palindromic repeats)

Ishino et al. (1987) were the first to describe CRISPR [55]. This system has been identified in 40% of bacterial genomes so far [56] and is defined as short repetitions of grouped bases. The determination of the CRISPR locus and the characterization of adjacent genes, known as cas genes, responsible for the function of CRISPR, only occurred in 2002 [57]. The CRISPR/Cas system uses small non-coding RNAs in association with Cas proteins. Cas9 is a nuclease which cleaves DNA in the selected region, so that the CRISPR/Cas9 system can be used to edit genomes.

CRISPR/Cas activity involves three main mechanisms: (1) acquisition, the step in which the DNA fragment is inserted into the CRISPR locus in the genome of interest; (2) transcription, in which the CRISPR locus is transcribed and processed; (3) interference, in which the invading nucleic acids are eliminated. All those mechanisms contribute to bacterial persistence in the environment [58, 59]. Furthermore, CRISPR provides mechanisms to limit the spread of antibiotic resistance or virulence factors. However, Gophna et al. (2015) demonstrated that, even though there are different measurements to evaluate horizontal gene transfer, it is not possible to identify a correlation between the CRISPR/Cas system and the evolution of the species. Changes occur only at the population level [60].

RNA-seq has helped in the annotation of transcribed regions, mainly non-coding ones, and has also enabled the identification of CRISPR elements in prokaryotes [61]. The CRISPR system can also be used as a tool in studies centered on gene regulation, since this system is able to activate or repress genes.

Zoephel and Randau (2013) discuss how the structure of CRISPR can affect the maturation of RNA and, thus, influence the functionality of the CRISPR/Cas system [62]. The RNA-seq approach was used to evaluate differential gene expression in S. aureus, a pathogen of major importance. It was able to identify CRISPR elements in these strains and helped in investigating their possible role, since these regions show an adaptive response to infection [63]. Thus, we see the importance of the RNA-seq approach in expanding knowledge about function in prokaryotes.

4. RNA Sequencing Platforms

The RNA-seq approach can be applied to different next-generation sequencing platforms, and the results obtained by them are proportional to the machine capability. In Table 3, a comparison is made between some of the platforms currently most employed [64].

Company Name	Instrument	Version	Run Time (Hours)	Read Lengths (Mean)	Reads Per Run (Millions)	Applications
Illumina	HiSeq 2000	High Output	132	50	6,000	Gene expression, splice junction detection, variant calling, fusion
Illumina	HiSeq 2500	High Output	132	50	6,000	Gene expression, splice junction detection, variant calling, fusion
Illumina	MiSeq	v2 kit	39	250	30	Splice junction detection, variant calling
Life Technologies	PGM	318 Chip	7.3	176	6	Splice junction detection, variant calling
Life Technologies	Proton	Proton I Chip	2-4	81	70	Gene expression, splice junction detection, variant calling
Pacific Biosciences	RS	RS	0.5-2	1,289	0.03	Splice junction detection, variant calling, full-length gene coverage
Roche	454	GS FLX+	20	686	1	Splice junction detection

Table 3.

Different next-generation sequencing platforms in the study of RNA-seq. Adapted and modified from [64].

5. Bioinformatics Analysis

Experimental investigations in prokaryotes have been facilitated, extended and complemented using computational approaches [65]. Large amounts of data have been generated from RNA-seq experiments, and these need to be stored and analyzed using computational techniques and tools [66]. This volume of data has become a bottleneck for bioinformatics analysis and for biologists, since today's transcriptome analysis consists of both experiments and data evaluation [65]. Extracting biological information from RNA-seq datasets requires bioinformatics knowledge and tools, making the software choice an important issue for successful RNA-seq analysis [65, 67].

According to Chierico et al. (2015) [68] and Pinto et al. (2011) [67], RNA-seq can be understood as a five-step process: (1) isolation of the total RNA of the organism; (2) mRNA enrichment; (3) synthesis of cDNA; (4) NGS sequencing, which returns raw data to the (5) bioinformatics analysis [67]. A flowchart of this process can be seen in Figure 4.

This section focuses on bioinformatics analysis and the computational tools available. Based on a literature review [29, 65, 67–69], bioinformatics analysis can be understood as the extraction and classification/partitioning of biological information gleaned from raw sequencing data (Figure 5).

Figure 5.

Bioinformatics analysis workflow

5.1. Bioinformatics workflow

The quality check step aims to increase the accuracy of the results by removing sequences that may contain errors [70]; trimming sequences introduced in the library preparation step, such as adapters and poly(A) tails [71]; and removing reads with low Phred quality. The use of poor-quality databases can lead to less precise results [72]; for this reason, the quality check can drastically affect the subsequent steps.

Some RNA-seq pipelines, like READemption [71], implement a quality check which performs quality trimming, removes adapters and poly(A) tails and discards reads shorter than a given cut-off (the default cut-off is 12 nucleotides (nt)). Quality Assessment [72] evaluates the quality based on quality-graph analysis and estimated coverage. According to Backofen et al. (2014) [65], FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) is a tool commonly used to check read quality and to determine the quality profile of the reads. Software suites can also be used for this purpose: FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/) provides tools to remove sequences attached in previous steps and to perform other pre-processing strategies on raw data.
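The trimming operations listed above can be illustrated with a minimal Python sketch. It is not the implementation used by READemption, FastQC or FASTX-Toolkit; the function names, the Phred threshold of 20, the minimum poly(A) run of 5 and the example adapter sequence are all assumptions chosen for illustration. Only the 12 nt length cut-off is taken from the text.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 encoding assumed)."""
    return [ord(char) - offset for char in quality_string]


def trim_read(seq, quality_string, adapter="", min_phred=20,
              min_polya=5, min_length=12):
    """Return the trimmed sequence, or None if the read is discarded.

    All thresholds except min_length (12 nt, as quoted in the text)
    are hypothetical defaults for this sketch.
    """
    quals = phred_scores(quality_string)

    # 1. Clip low-quality bases from the 3' end.
    end = len(seq)
    while end > 0 and quals[end - 1] < min_phred:
        end -= 1
    seq = seq[:end]

    # 2. Remove the adapter and everything downstream of it.
    if adapter:
        pos = seq.find(adapter)
        if pos != -1:
            seq = seq[:pos]

    # 3. Strip a trailing poly(A) tail if it is long enough.
    tail = len(seq) - len(seq.rstrip("A"))
    if tail >= min_polya:
        seq = seq[:len(seq) - tail]

    # 4. Discard reads that became too short.
    if len(seq) < min_length:
        return None
    return seq


# Example: 16 nt insert + poly(A) tail + adapter + two low-quality bases.
read = "ACGTACGTACGTACGT" + "AAAAAAA" + "AGATCGGAAGAGC" + "GG"
quality = "I" * 36 + "##"
print(trim_read(read, quality, adapter="AGATCGGAAGAGC"))
```

The order of the steps matters in this sketch: the poly(A) tail only becomes the 3' end of the read once the adapter has been removed, which is why adapter clipping comes first.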

After the quality check, if a reference genome is available, a mapping step is performed; otherwise, de novo assembly is used. Mapping consists of producing the transcriptome map by aligning reads to a reference genome [67]. This aims to detect the correct position of the reads and to distinguish between sequencing errors and genetic variations [73]. Abundant mapping software has been released, differing in algorithms, memory management, speed and computational cost [65]. This makes the choice of a mapping tool a challenge. McClure et al. (2013) [69] compared the SOAP2, BWA, Bowtie and Bowtie2 aligners using data from 75 RNA-seq experiments. A comparison of mapping algorithms applied to Ion Torrent data can be seen in [73]. After mapping, quality is evaluated: ReadXplorer software offers quality classification of read mappings in order to provide information about the quality and quantity of each single read mapping [74]. Mapping is recommended when a high-quality genome is available as a reference. If one is unavailable, transcripts should be assembled de novo [29].
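To make the idea of placing a read on a reference while tolerating a few differences more concrete, the following Python sketch scans every position of a reference and keeps the one with the fewest mismatches. This brute-force scan is only an illustration: the aligners named above rely on indexed data structures and are far more efficient, and the function name and the mismatch limit are assumptions.

```python
def map_read(read, reference, max_mismatches=2):
    """Return (start, mismatches) of the best forward-strand placement.

    Returns None when no position has at most max_mismatches
    differences. Ties are resolved in favour of the leftmost position.
    """
    best = None
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if mismatches > max_mismatches:
            continue
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


reference = "TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGC"
exact_read = reference[10:22]          # taken directly from the reference
noisy_read = "GCTCATTCCTAG"            # same read with one substituted base
print(map_read(exact_read, reference))
print(map_read(noisy_read, reference))
```

A mismatch seen in a single read is more likely a sequencing error, whereas the same mismatch seen consistently across many reads at one position suggests a genuine variant; that distinction requires pileup information beyond this sketch.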

De novo assembly can be used when investigating poorly studied organisms [14], complex microbial communities or uncultivable organisms [29]. Both DNA and RNA must be assembled, but transcriptome assembly is significantly different from genome assembly [75]; thus, it is important to use RNA assemblers. Tjaden (2015) [29] affirms that assemblers should be specifically designed for prokaryotes, owing to the different challenges of eukaryotic and prokaryotic transcriptomes. Bacterial genomes are often denser than eukaryotic genomes because of the proximity of the genes. Neighbouring bacterial transcripts can overlap, making it difficult to identify transcript boundaries appropriately. Moreover, non-coding eukaryotic RNA models are not appropriate for detecting bacterial small regulatory RNAs [29]. An assembly comparison of three different software packages (Trinity, SOAPdenovo2 and Rockhopper 2), using data from nine different bacteria, can be seen in [29].

When reference mapping or de novo assembly is done, data can be analyzed structurally and differentially. The principal purpose of differential analysis is to determine the differences in expression among different growth conditions or treatments [76]. Several software packages have been released for this purpose, but there is no consensus about best practices, which makes it difficult to select a tool or method. Seyednasrollah et al. (2013) [76] compared eight differential expression software packages using two real, publicly available datasets. Software that analyzes differential expression can be based on the Poisson method (DEGseq and Myrna), the negative binomial method (edgeR and DESeq) or other methods [67, 76]. Pinto et al. (2011) [67] recommend using DESeq or edgeR when analyzing replicates.
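The practical difference between the two model families can be shown with a short Python sketch. Under a Poisson model the variance of a gene's counts equals its mean, whereas the negative binomial model adds a dispersion term, so that variance = mean + dispersion × mean². The function below is a simple per-gene method-of-moments estimate written for illustration only; it is an assumption of this sketch and not the estimation procedure of DESeq or edgeR, which share information across genes.

```python
from statistics import mean, variance


def dispersion_estimate(counts):
    """Method-of-moments dispersion for one gene across replicates.

    Solves variance = mean + dispersion * mean**2 for the dispersion
    and clips negative values to zero (the Poisson case).
    """
    m = mean(counts)
    if m == 0:
        return 0.0
    v = variance(counts)  # sample variance, needs at least two replicates
    return max(0.0, (v - m) / (m * m))


# Replicates that agree closely behave like Poisson counts ...
print(dispersion_estimate([10, 10, 10]))
# ... while strongly varying replicates show overdispersion.
print(dispersion_estimate([50, 100, 150]))
```

When the estimated dispersion is clearly above zero, as is common for biological replicates, a Poisson-based test would understate the variability, which is consistent with the recommendation to use negative binomial tools when replicates are analyzed.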

Transcriptome annotation and classification can be based on structural analysis, evaluating transcripts according to the genomic region with which they have been associated and in which they have been classified: protein-coding, non-coding and intergenic regions [65]. Several computational methods have been developed to predict ncRNA transcripts. Herbig and Nieselt (2011) [77] highlight the SIPHT, sRNAFinder, sRNAscanner, nocoRNAc and sRNAPredict software. nocoRNAc distinguishes itself as it is useful for predicting and characterizing ncRNAs in bacteria [77].

Assessing transcripts with respect to genomic regions relies on transcript annotation. The computational approach is convenient to use due to its speed and precision, compared to manual annotation. However, human supervision of the results is considered important in order to avoid false positives or missing features [1]. With this technique, some main structures must be detected: 5' transcript ends, 3' transcript ends, TSS and operons [1, 65].

Transcript boundaries identification

Annotation of transcript boundaries is important for operon identification and regulatory analyses [1]. Identifying the 5' UTR is not always possible; a significant number of transcripts lacking a 5' UTR were found in bacteria and called leaderless transcripts. In this situation, the translation start site and the transcription start site are at almost the same position [65]. Annotation of the 3' UTR is important in order to obtain the full analytical value of the RNA-seq data. Creecy and Conway (2015) [1] assert that the current best method for detecting 3' ends is to search for correlations between replicate data. They highlight that the software package TransTermHP can find intrinsic terminators successfully.

TSS identification

TSS annotation can aid in the annotation of ncRNAs and polycistronic transcripts [65]. According to Creecy and Conway (2015) [1], it is essential for discovering unknown transcripts and for analyzing operon, 5' UTR and promoter architecture. Although there are no well-established strategies for TSS identification, owing to scarce knowledge about transcription start sites in bacteria, developments in both computational analyses and "wet-lab" experiments have made TSS annotation more feasible [65]. TSSAR is a dRNA-seq data-based tool for rapid annotation of TSS that considers dRNA-seq library statistics [78]. According to Backofen et al. (2014) [65], its main advantage is the statistical analysis presented as an easy-to-use web service. The TSSpredator tool provides automated TSS detection and classification from RNA-seq data, performing a genome-wide comparative prediction of TSS [79]. A comparison among manual annotation, TSSpredator and TSSAR annotation can be seen in [78].
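As a rough illustration of how dRNA-seq data can point to a TSS, the Python sketch below compares, position by position, the number of reads starting in a library enriched for primary transcripts with the number starting in the untreated library. It is a deliberately naive heuristic and not the statistical model of TSSAR or the procedure of TSSpredator; the function name, the pseudocount and both thresholds are assumptions made for this example.

```python
def candidate_tss(enriched_starts, untreated_starts,
                  min_starts=10, min_enrichment=2.0):
    """Return positions whose 5' read-start counts look enriched.

    enriched_starts and untreated_starts are per-position counts of
    reads beginning at that position in the two libraries.
    """
    candidates = []
    for pos, (plus, minus) in enumerate(zip(enriched_starts,
                                            untreated_starts)):
        # A pseudocount of 1 avoids division by zero.
        ratio = (plus + 1) / (minus + 1)
        if plus >= min_starts and ratio >= min_enrichment:
            candidates.append(pos)
    return candidates


enriched = [0, 2, 40, 3, 0, 12, 0]
untreated = [0, 1, 5, 3, 0, 11, 0]
print(candidate_tss(enriched, untreated))
```

In this toy input, position 2 has many read starts and a clear enrichment, while position 5 has enough read starts but a ratio close to one, which is more suggestive of a processing site than of a primary 5' end.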

Operon identification

An operon is a cluster of genes regulated by the same regulatory sequence and co-transcribed into a single mRNA. This structure has immense biological importance, improving functional gene annotation and giving important information to studies of drug targeting, functional analyses and antibiotic resistance [80]. To handle the complexity of operon occurrence, operons should be detected using operon architecture (i.e., 5' ends and 3' ends) and should have sufficient read coverage to connect promoters and terminators. A strong indication that an operon is real is that at least 90% of its bases are covered by reads [1]. Chuang et al. (2012) [80] classify computational methods to predict operons and evaluate 15 algorithms with respect to accuracy, specificity and sensitivity.
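The coverage criterion can be expressed in a few lines of Python. The sketch assumes a per-base coverage list for one strand and reads the 90% rule as "at least 90% of the positions between promoter and terminator carry at least one read"; that reading, the function names and the half-open coordinate convention are assumptions of this example.

```python
def covered_fraction(coverage, start, end):
    """Fraction of positions in [start, end) with at least one read."""
    region = coverage[start:end]
    if not region:
        return 0.0
    return sum(1 for depth in region if depth > 0) / len(region)


def supports_operon(coverage, promoter_pos, terminator_pos,
                    min_fraction=0.9):
    """True if read coverage connects promoter and terminator."""
    fraction = covered_fraction(coverage, promoter_pos, terminator_pos)
    return fraction >= min_fraction


well_covered = [5] * 95 + [0] * 5      # 95% of bases covered
gappy = [5] * 80 + [0] * 20            # 80% of bases covered
print(supports_operon(well_covered, 0, 100))
print(supports_operon(gappy, 0, 100))
```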

5.2. RNA-seq pipeline tools

Not all pipeline tools feature the complete RNA-seq workflow described earlier. To assist with tool selection, a comparison of software functionalities was developed and is shown in Table 4. To provide additional support, important points about each software package are described below.

Tool	Quality Check	Mapping	De novo assembly	Differential analyses
Rockhopper [69]	-	x	x	x
Rockhopper 2 [29]	-	-	x	x
RNA-Rocket [81]	x	x	-	x
READemption [71]	x	x	-	x
ReadXplorer [74]	x	-	-	x

Table 4.

Software comparison.

Rockhopper is a system designed specifically for bacterial transcriptome RNA-seq data analysis. A novel approach to mapping transcripts is implemented in this software (similar to the Bowtie2 approach). Mapping normalization is performed, followed by transcript assembly, identification of transcript boundaries, quantification of transcript abundance, testing for differential gene expression and operon prediction. Analysis results are presented using the Integrative Genomics Viewer, which allows different experiments to be viewed simultaneously [69].

Rockhopper 2 is a comprehensive system focused on de novo assembly that supports differential analysis and transcript abundance quantification. According to Tjaden (2015) [29], it does not require high-performance computers and can run on personal computers. Rockhopper 2 implements a novel de novo assembly algorithm for bacterial transcriptomes. The algorithm works in two stages: (1) candidate transcripts are assembled using k-mers and (2) sequencing reads are mapped to the candidate transcripts in order to filter them down to high-quality final transcripts. Concerning differential analysis, Rockhopper 2 first normalizes each RNA-seq dataset, enabling it to compare different experiments or samples [29].
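The general idea of building a candidate transcript from k-mers can be sketched in Python as follows. This is a generic greedy extension over k-mer counts, shown only to illustrate the principle; it is not the algorithm implemented in Rockhopper 2, and the function names, the choice of k and the minimum count are assumptions. The second stage (mapping reads back to filter candidates) is not shown.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def extend_right(seed, counts, min_count=2):
    """Greedily extend a seed k-mer to the right (k must be >= 2).

    Extension stops at a dead end or when more than one sufficiently
    supported successor exists (an ambiguous branch).
    """
    k = len(seed)
    contig = seed
    used = {seed}
    while True:
        suffix = contig[-(k - 1):]
        candidates = [suffix + base for base in "ACGT"
                      if counts[suffix + base] >= min_count
                      and suffix + base not in used]
        if len(candidates) != 1:
            break
        used.add(candidates[0])
        contig += candidates[0][-1]
    return contig


transcript = "ATGGCGTACGTTAGC"
reads = [transcript[0:10], transcript[0:10],
         transcript[3:13], transcript[3:13],
         transcript[5:15], transcript[5:15],
         "ATGGCGTACC"]                 # read with an error in its last base
counts = count_kmers(reads, k=5)
print(extend_right("ATGGC", counts))
```

The k-mer created by the erroneous read occurs only once and therefore falls below the minimum count, so it does not create a branch during extension.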

RNA-Rocket aims to simplify the process of aligning RNA-seq data to a reference genome and to generate quantitative transcript profiles. It is built on Galaxy to provide the tools and services necessary to process RNA-seq data. Some of its benefits are: the possibility of sharing results across research groups; the support of batch analysis for multiple samples; and the integration of tools and projects, including data from the PATRIC platform [81].

The READemption pipeline aims to integrate individual RNA-seq analysis tasks and provides a user-friendly tool with a command line interface. This tool was primarily developed to analyze bacterial transcriptomes. In order to use the full capacity of modern computers and reduce run time, READemption offers parallel data processing. First, it performs quality trimming and removal of poly(A) tails and adapters, followed by mapping, coverage calculation, gene expression quantification, differential gene expression analysis and plotting. The software is able to analyze RNA-seq data from Illumina and 454 platforms.

ReadXplorer offers straightforward visualization and analysis functions built around its unique read mapping classification. Analyses such as TSS and operon detection, differential expression, RPKM value and read count calculations are available in ReadXplorer and can be exported to Microsoft Excel files. Read mapping classification sorts read mappings into three different classes: perfect match, best match and common match. These classifications are incorporated in all analysis functions.
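For readers unfamiliar with the RPKM value mentioned above, the standard definition (reads per kilobase of feature per million mapped reads) can be written as a short Python function. The function and parameter names are illustrative and do not reflect how ReadXplorer itself is implemented.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# 500 reads on a 1,000 bp gene in a library of 2 million mapped reads.
print(rpkm(500, 1000, 2_000_000))
```

Dividing by feature length and by library size makes values comparable between genes of different lengths and between libraries of different sequencing depths.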

5.3. Bioinformatics challenges

Through bibliographic research [29, 66, 69, 71, 82, 83], it has been concluded that bioinformatics has many challenges related to computational issues. RNA-seq experiments generate large amounts of data that must be computationally processed, analyzed, stored and retrieved using a great deal of computational power. In addition to the computational problems, it is important to take into account that not all bioinformatics researchers have extensive computational experience: this makes the lack of user-friendly tools a problem for some users and an important issue for developers. Nevertheless, powerful computers, excellent bioinformatics researchers and user-friendly tools do not guarantee successful analysis. The software selected must be appropriate to each biological question and to the organisms studied. Even with all the issues presented here, RNA-seq analysis has been very successful in recent years. This success can lead us to imagine the wonderful possibilities for RNA-seq bioinformatics analyses in the future.

References

1. Creecy JP, Conway T. Quantitative bacterial transcriptomics with RNA-seq. Curr Opin Microbiol 2015;23:133–40.
2. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005 Sep 15;437(7057):376–80.
3. Perkins TT, Kingsley RA, Fookes MC, Gardner PP, James KD, Yu L, et al. A strand-specific RNA-Seq analysis of the transcriptome of the typhoid bacillus Salmonella typhi. PLoS Genet 2009 Jul;5(7):e1000569.
4. Passalacqua KD, Varadarajan A, Ondov BD, Okou DT, Zwick ME, Bergman NH. Structure and complexity of a bacterial transcriptome. J Bacteriol 2009 May 15;191(10):3203–11.
5. Pinto AC, Sá PHCG de, Ramos RTJ, Barbosa S, Barbosa HPM, Ribeiro AC, et al. Differential transcriptional profile of Corynebacterium pseudotuberculosis in response to abiotic stresses. BMC Genomics 2014 Jan 9;15(1):14.
6. Westermann AJ, Gorski SA, Vogel J. Dual RNA-seq of pathogen and host. Nat Rev Microbiol 2012;10:618–30.
7. Macklaim JM, Fernandes AD, Bella JMD, Hammond J-A, Reid G, Gloor GB. Comparative meta-RNA-seq of the vaginal microbiota and differential expression by Lactobacillus iners in health and dysbiosis. Microbiome 2013 Apr 12;1(1):12.
8. Leimena MM, Ramiro-Garcia J, Davids M, van den Bogert B, Smidt H, Smid EJ, et al. A comprehensive metatranscriptome analysis pipeline and its validation using human small intestine microbiota datasets. BMC Genomics 2013;14:530.
9. Bisanz JE, Macklaim JM, Gloor GB, Reid G. Bacterial metatranscriptome analysis of a probiotic yogurt using an RNA-Seq approach. Int Dairy J 2014;39(2):284–92.
10. Wiegand S, Dietrich S, Hertel R, Bongaerts J, Evers S, Volland S, et al. RNA-Seq of Bacillus licheniformis: active regulatory RNA features expressed within a productive fermentation. BMC Genomics 2013;14(1):667.
11. Wang Z, Yang S-T. Propionic acid production in glycerol/glucose co-fermentation by Propionibacterium freudenreichii subsp. shermanii. Bioresour Technol 2013;137:116–23.
12. Sorek R, Cossart P. Prokaryotic transcriptomics: a new view on regulation, physiology and pathogenicity. Nat Rev Genet 2010 Jan;11(1):9–16.
13. Sharma CM, Hoffmann S, Darfeuille F, Reignier J, Findeiß S, Sittka A, et al. The primary transcriptome of the major human pathogen Helicobacter pylori. Nature 2010 Mar 11;464(7286):250–5.
14. Haas BJ, Chin M, Nusbaum C, Birren BW, Livny J. How deep is deep enough for RNA-Seq profiling of bacterial transcriptomes? BMC Genomics 2012 Dec 27;13(1):734.
15. Bischler T, Siew Tan H, Nieselt K, Sharma CM. Differential RNA-seq (dRNA-seq) for annotation of transcriptional start sites and small RNAs in Helicobacter pylori. Methods [Internet]. 2015 Jul 6 [cited 2015 Jul 6]; Available from: http://www.sciencedirect.com/science/article/pii/S1046202315002546
16. Kratz A, Carninci P. The devil in the details of RNA-seq. Nat Biotechnol 2014 Sep;32(9):882–4.
17. Sendler E, Johnson GD, Krawetz SA. Local and global factors affecting RNA sequencing analysis. Anal Biochem 2011;419(2):317–22.
18. Liu Y, Zhou J, White KP. RNA-seq differential expression studies: more sequence or more replication? Bioinformatics 2014 Feb 1;30(3):301–4.
19. Isabella VM, Clark VL. Deep sequencing-based analysis of the anaerobic stimulon in Neisseria gonorrhoeae. BMC Genomics 2011 Jan 20;12(1):51.
20. Patenge N, Pappesch R, Khani A, Kreikemeyer B. Genome-wide analyses of small non-coding RNAs in streptococci. Front Genet 2015;6:189.
21. Gottesman S, Storz G. Bacterial small RNA regulators: versatile roles and rapidly evolving variations. Cold Spring Harb Perspect Biol [Internet]. 2011 Dec [cited 2015 Jul 2];3(12). Available from: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3225950/
22. Papenfort K, Vogel J. Regulatory RNA in bacterial pathogens. Cell Host Microbe 2010 Jul 22;8(1):116–27.
23. Kröger C, Dillon SC, Cameron ADS, Papenfort K, Sivasankaran SK, Hokamp K, et al. The transcriptional landscape and small RNAs of Salmonella enterica serovar Typhimurium. Proc Natl Acad Sci 2012 May 15;109(20):E1277–86.
24. Yan Y, Su S, Meng X, Ji X, Qu Y, Liu Z, et al. Determination of sRNA expressions by RNA-seq in Yersinia pestis grown in vitro and during infection. PLoS One 2013 Sep 11;8(9):e74495.
25. Storz G, Vogel J, Wassarman KM. Regulation by small RNAs in bacteria: expanding frontiers. Mol Cell 2011 Sep 16;43(6):880–91.
26. Van Dijk EL, Auger H, Jaszczyszyn Y, Thermes C. Ten years of next-generation sequencing technology. Trends Genet 2014;30(9):418–26.
27. Humphrys MS, Creasy T, Sun Y, Shetty AC, Chibucos MC, Drabek EF, et al. Simultaneous transcriptional profiling of bacteria and their host cells. PLoS ONE 2013 Dec 4;8(12):e80597.
28. Martin JA, Wang Z. Next-generation transcriptome assembly. Nat Rev Genet 2011 Sep 7;12(10):671–82.
29. Tjaden B. De novo assembly of bacterial transcriptomes from RNA-seq data. Genome Biol 2015;16(1):1.
30. Pinto AC, Ramos RTJ, Silva WM, Rocha FS, Barbosa S, Miyoshi A, et al. The core stimulon of Corynebacterium pseudotuberculosis strain 1002 identified using ab initio methodologies. Integr Biol 2012;4(7):789.
31. Innocenti N, Golumbeanu M, d'Hérouël AF, Lacoux C, Bonnin RA, Kennedy SP, et al. Whole genome mapping of 5' RNA ends in bacteria by tagged sequencing: a comprehensive view in Enterococcus faecalis. ArXiv Prepr ArXiv14101925 [Internet]. 2014 [cited 2014 Dec 15]; Available from: http://arxiv.org/abs/1410.1925
32. Fouquier d'Herouel A, Wessner F, Halpern D, Ly-Vu J, Kennedy SP, Serror P, et al. A simple and efficient method to search for selected primary transcripts: non-coding and antisense RNAs in the human pathogen Enterococcus faecalis. Nucleic Acids Res 2011 Apr 1;39(7):e46–e46.
33. Mamanova L, Turner DJ. Low-bias, strand-specific transcriptome Illumina sequencing by on-flowcell reverse transcription (FRT-seq). Nat Protoc 2011 Nov;6(11):1736–47.
34. Mamanova L, Andrews RM, James KD, Sheridan EM, Ellis PD, Langford CF, et al. FRT-seq: amplification-free, strand-specific transcriptome sequencing. Nat Methods 2010 Feb;7(2):130–2.
35. Vergara-Irigaray M, Fookes MC, Thomson NR, Tang CM. RNA-seq analysis of the influence of anaerobiosis and FNR on Shigella flexneri. BMC Genomics 2014 Jun 6;15(1):438.
36. Myers KS, Park DM, Beauchene NA, Kiley PJ. Defining bacterial regulons using ChIP-seq methods. Methods [Internet]. 2015 [cited 2015 Jul 17]; Available from: http://www.sciencedirect.com/science/article/pii/S1046202315002285
37. Myers KS, Yan H, Ong IM, Chung D, Liang K, Tran F, et al. Genome-scale analysis of Escherichia coli FNR reveals complex features of transcription factor binding. PLoS Genet 2013;9(6):e1003565.
38. Park PJ. ChIP–seq: advantages and challenges of a maturing technology. Nat Rev Genet 2009;10(10):669–80.
39. Johnson DS, Mortazavi A, Myers RM, Wold B. Genome-wide mapping of in vivo protein-DNA interactions. Science 2007;316(5830):1497–502.
40. Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods 2007;4(8):651–7.
41. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004;5(10):R80.
42. Stringer AM, Currenti S, Bonocora RP, Baranowski C, Petrone BL, Palumbo MJ, et al. Genome-scale analyses of Escherichia coli and Salmonella enterica AraC reveal noncanonical targets and an expanded core regulon. J Bacteriol 2014;196(3):660–71.
43. Haycocks JRJ, Sharma P, Stringer AM, Wade JT, Grainger DC. The molecular basis for control of ETEC enterotoxin expression in response to environment and host. PLoS Pathog 2015 Jan 8;11(1):e1004605.
44. Singh SS, Singh N, Bonocora RP, Fitzgerald DM, Wade JT, Grainger DC. Widespread suppression of intragenic transcription initiation by H-NS. Genes Dev 2014 Feb 1;28(3):214–9.
45. Kahramanoglou C, Seshasayee ASN, Prieto AI, Ibberson D, Schmidt S, Zimmermann J, et al. Direct and indirect effects of H-NS and Fis on global gene expression control in Escherichia coli. Nucleic Acids Res 2011 Mar 1;39(6):2073–91.
46. Wade JT, Struhl K, Busby SJ, Grainger DC. Genomic analysis of protein–DNA interactions in bacteria: insights into transcription and chromosome organisation. Mol Microbiol 2007;65(1):21–6.
47. Grainger DC, Aiba H, Hurd D, Browning DF, Busby SJW. Transcription factor distribution in Escherichia coli: studies with FNR protein. Nucleic Acids Res 2007 Jan 1;35(1):269–78.
48. Chu Y, Corey DR. RNA sequencing: platform selection, experimental design, and data interpretation. Nucleic Acid Ther 2012;22(4):271–4.
49. Hafner M, Landthaler M, Burger L, Khorshid M, Hausser J, Berninger P, et al. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell 2010;141(1):129–41.
50. Chi SW, Zang JB, Mele A, Darnell RB. Argonaute HITS-CLIP decodes microRNA–mRNA interaction maps. Nature 2009;460(7254):479–86.
51. Sittka A, Lucchini S, Papenfort K, Sharma CM, Rolle K, Binnewies TT, et al. Deep sequencing analysis of small noncoding RNA and mRNA targets of the global post-transcriptional regulator, Hfq. PLoS Genet 2008;4(8):e1000163.
52. Cho S, Cho Y, Lee S, Kim J, Yum H, Kim SC, et al. Current challenges in bacterial transcriptomics. Genomics Inform 2013;11(2):76.
53. Gordon JI, Faith JJ. Methods of low error amplicon sequencing (LEA-Seq) and the use thereof [Internet]. Google Patents; 2014 [cited 2015 Jul 14]. Available from: https://www.google.com/patents/US20140357499
54. Faith JJ, Guruge JL, Charbonneau M, Subramanian S, Seedorf H, Goodman AL, et al. The long-term stability of the human gut microbiota. Science 2013;341(6141):1237439.
55. Ishino Y, Shinagawa H, Makino K, Amemura M, Nakata A. Nucleotide sequence of the iap gene, responsible for alkaline phosphatase isozyme conversion in Escherichia coli, and identification of the gene product. J Bacteriol 1987;169(12):5429–33.
56. Kunin V, Sorek R, Hugenholtz P. Evolutionary conservation of sequence and secondary structures in CRISPR repeats. Genome Biol 2007;8(4):R61.
57. Jansen R, Embden J, Gaastra W, Schouls L. Identification of genes that are associated with DNA repeats in prokaryotes. Mol Microbiol 2002;43(6):1565–75.
58. Gasiunas G, Barrangou R, Horvath P, Siksnys V. Cas9–crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc Natl Acad Sci 2012;109(39):E2579–86.
59. Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E. A programmable dual-RNA–guided DNA endonuclease in adaptive bacterial immunity. Science 2012;337(6096):816–21.
60. Gophna U, Ron EZ. Virulence and the heat shock response. Int J Med Microbiol IJMM 2003 Feb;292(7-8):453–61.
61. Heidrich N, Dugar G, Vogel J, Sharma CM. Investigating CRISPR RNA biogenesis and function using RNA-seq. CRISPR Methods Protoc 2015;1–21.
62. Zoephel J, Randau L. RNA-Seq analyses reveal CRISPR RNA processing and regulation patterns. Biochem Soc Trans 2013 Dec;41(6):1459–63.
63. Osmundson J, Dewell S, Darst SA. RNA-Seq reveals differential gene expression in Staphylococcus aureus with single-nucleotide resolution. PLoS ONE 2013 Oct 7;8(10):e76572.
64. Li S, Tighe SW, Nicolet CM, Grove D, Levy S, Farmerie W, et al. Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study. Nat Biotechnol 2014 Aug 24;32(9):915–25.
65. Backofen R, Amman F, Costa F, Findeiß S, Richter AS, Stadler PF. Bioinformatics of prokaryotic RNAs. RNA Biol 2014;11(5):470–83.
66. McGettigan PA. Transcriptomics in the RNA-seq era. Curr Opin Chem Biol 2013 Feb;17(1):4–11.
67. Pinto AC, Melo-Barbosa HP, Miyoshi A, Silva A, Azevedo V. Review application of RNA-seq to reveal the transcript profile in bacteria. Genet Mol Res 2011;10(3):1707–18.
68. Del Chierico F, Ancora M, Marcacci M, Camma C, Putignani L, Conti S. Choice of next-generation sequencing pipelines. Bacterial Pangenomics [Internet]. Springer; 2015 [cited 2015 Jul 14]. p. 31–47. Available from: http://link.springer.com/protocol/10.1007/978-1-4939-1720-4_3
69. McClure R, Balasubramanian D, Sun Y, Bobrovskyy M, Sumby P, Genco CA, et al. Computational analysis of bacterial RNA-Seq data. Nucleic Acids Res 2013 Aug 1;41(14):e140–e140.
70. De Sá PH, Veras AA, Carneiro AR, Pinheiro KC, Pinto AC, Soares SC, et al. The impact of quality filter for RNA-Seq. Gene 2015;563(2):165–71.
71. Förstner KU, Vogel J, Sharma CM. READemption–A tool for the computational analysis of deep-sequencing-based transcriptome data. Bioinformatics 2014;btu533.
72. Ramos RT, Carneiro AR, Baumbach J, Azevedo V, Schneider MP, Silva A. Analysis of quality raw data of second generation sequencers with Quality Assessment Software. BMC Res Notes 2011 Apr 18;4(1):130.
73. Caboche S, Audebert C, Lemoine Y, Hot D. Comparison of mapping algorithms used in high-throughput sequencing: application to Ion Torrent data. BMC Genomics 2014;15(1):264.
74. Hilker R, Stadermann KB, Doppmeier D, Kalinowski J, Stoye J, Straube J, et al. ReadXplorer–visualization and analysis of mapped sequences. Bioinformatics 2014;btu205.
75. Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 2013;8(8):1494–512.
76. Seyednasrollah F, Laiho A, Elo LL. Comparison of software packages for detecting differential expression in RNA-seq studies. Brief Bioinform [Internet]. 2013 Dec 2 [cited 2014 Apr 30]; Available from: http://bib.oxfordjournals.org/cgi/doi/10.1093/bib/bbt086
77. Herbig A, Nieselt K. nocoRNAc: characterization of non-coding RNAs in prokaryotes. BMC Bioinformatics 2011;12(1):40.
78. Amman F, Wolfinger MT, Lorenz R, Hofacker IL, Stadler PF, Findeiß S. TSSAR: TSS annotation regime for dRNA-seq data. BMC Bioinformatics 2014;15(1):89.
79. Dugar G, Herbig A, Förstner KU, Heidrich N, Reinhardt R, Nieselt K, et al. High-resolution transcriptome maps reveal strain-specific regulatory features of multiple Campylobacter jejuni isolates. 2013 [cited 2015 Jul 14]; Available from: http://dx.plos.org/10.1371/journal.pgen.1003495
80. Chuang L-Y, Chang H-W, Tsai J-H, Yang C-H. Features for computational operon prediction in prokaryotes. Brief Funct Genomics 2012;els024.
81. Warren AS, Aurrecoechea C, Brunk B, Desai P, Emrich S, Giraldo-Calderón GI, et al. RNA-Rocket: an RNA-Seq analysis resource for infectious disease research. Bioinformatics 2015;btv002.
82. Van Verk MC, Hickman R, Pieterse CM, Van Wees SC. RNA-Seq: revelation of the messengers. Trends Plant Sci 2013;18(4):175–9.
83. Dai L, Gao X, Guo Y, Xiao J, Zhang Z. Bioinformatics clouds for big data manipulation. Biol Direct 2012;7(1):43.

© 2016 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Source: https://www.intechopen.com/chapters/49316
