Year: 2015 | Volume: 7 | Issue: 3 | Page: 212-217
In-silico gene co-expression network analysis in Paracoccidioides brasiliensis with reference to haloacid dehalogenase superfamily hydrolase gene
Raghunath Satpathy1, VB Konkimalla2, Jagnyeswar Ratha1
1 School of Life Science, Sambalpur University, Jyoti Vihar, Burla, Odisha, India
2 School of Biological Sciences, National Institute of Science Education and Research, Bhubaneswar, Odisha, India
Date of Submission: 05-Feb-2015
Date of Decision: 20-Mar-2015
Date of Acceptance: 21-Apr-2015
Date of Web Publication: 6-Jul-2015
School of Life Science, Sambalpur University, Jyoti Vihar, Burla, Odisha
Source of Support: None, Conflict of Interest: None
| Abstract|| |
Context: Paracoccidioides brasiliensis, a dimorphic fungus, is the causative agent of paracoccidioidomycosis, a disease globally affecting millions of people. A haloacid dehalogenase (HAD) superfamily hydrolase enzyme in the fungus, in particular, is known to contribute to pathogenesis by mediating adherence to host tissue. Hence, identification of novel drug targets is essential. Aims: In-silico identification of genes co-expressed with the HAD superfamily hydrolase in P. brasiliensis during morphogenesis from mycelium to yeast, in order to identify possible genes as drug targets. Materials and Methods: In total, four datasets, each containing 4340 genes, were retrieved from the NCBI Gene Expression Omnibus (GEO) database and subjected to gene filtration. A co-expression (CE) study was then performed on each dataset individually, and the combined set of co-expressed genes was visualized in Cytoscape 2.8.3. Statistical Analysis Used: The mean and standard deviation of the HAD superfamily hydrolase gene were obtained from the expression data, and these values were subsequently used in the CE calculation together with a selected correlation power and filtering threshold. Results: The 23 genes thus obtained are common to the datasets with respect to the HAD superfamily hydrolase gene. A significant network containing a total of 7 genes was selected from the Cytoscape network visualization; 5 of these genes had no significant protein hits when the expressed sequence tags were annotated by BLASTX. For all the proteins, PSI-BLAST was performed against the human genome to assess homology. Conclusions: A gene co-expression network was obtained with respect to the HAD superfamily hydrolase gene in P. brasiliensis.
Keywords: Dimorphic fungi, drug target, gene annotation, gene co-expression network, microarray analysis
|How to cite this article:|
Satpathy R, Konkimalla V B, Ratha J. In-silico gene co-expression network analysis in Paracoccidioides brasiliensis with reference to haloacid dehalogenase superfamily hydrolase gene. J Pharm Bioall Sci 2015;7:212-7
|How to cite this URL:|
Satpathy R, Konkimalla V B, Ratha J. In-silico gene co-expression network analysis in Paracoccidioides brasiliensis with reference to haloacid dehalogenase superfamily hydrolase gene. J Pharm Bioall Sci [serial online] 2015 [cited 2019 Nov 19];7:212-7. Available from: http://www.jpbsonline.org/text.asp?2015/7/3/212/160023
The dimorphic fungus Paracoccidioides brasiliensis is the pathogenic agent of paracoccidioidomycosis. The clinical symptoms of this disease are pulmonary lesions and systemic generalized infections. Pathogenesis studies indicate that P. brasiliensis usually adheres to extracellular matrix proteins by means of its own surface molecules. The adhesion of pathogenic microorganisms to host tissues is thought to be essential for initial colonization and further dissemination. The infection, which is presumed to be acquired through inhalation of air-borne conidia produced by the mycelial form of the fungus, consequently results in severely hampered respiratory function. Yet very little information is available on the mechanisms underlying the pathogenesis of paracoccidioidomycosis or on the means by which the fungus persists in the lungs and disseminates to other organs. A recent study has indicated that the ability of P. brasiliensis to adhere to host cells and tissues could play an important role in the establishment of infection. A 32-kDa protein of the haloacid dehalogenase (HAD) superfamily of hydrolases present in the fungus is reported to be responsible for this. In addition, the mycelium-to-yeast transition of P. brasiliensis in the human host is considered a major part of the metabolism involved in pathogenesis. For treatment of the disease, drugs such as itraconazole, cotrimoxazole, trimethoprim, and amphotericin-B are commonly applied, and some other drugs are in clinical trials. However, the current drugs are observed to be incapable of reducing the infection and mortality rate in a satisfactory manner. Therefore, given the high mortality rate and degree of pathogenicity of the fungus, there is an urgent need to identify novel drug molecules as well as drug targets.
A recent approach to new drug discovery is the use of genomics technologies such as microarrays, which provide a platform for examining thousands of genes at a time. This technology has tremendous potential in the field of biological annotation in health and disease. Microarray-based studies provide essential information for biomedical experiments, such as identification of disease-causing genes in malignancies and regulatory factors in cell cycle mechanisms. They can identify genes as new and unique potential drug targets, predict drug responsiveness and ultimately lead to the development of prevention strategies for many diseases, as described by many researchers. Identifying disease-causing genes and constructing gene regulatory networks in a pathogen is essential, because such networks can be viewed in terms of, and extrapolated to, the metabolic interactions of the pathogen. Knowledge of the network related to a specific disease or cellular metabolic process is of utmost importance for controlling it. However, regulatory network reconstruction is highly challenging and requires careful analysis of large sets of experimental data as well as a thorough literature survey. A straightforward approach to reconstructing a gene regulatory network is based on co-expression (CE) analysis of transcriptomic data, popularly known as a gene co-expression network (GCN), which can be performed in-silico by analyzing data from cDNA microarrays. The HAD-like hydrolase superfamily constitutes a large group of proteins with diverse functions; its members are involved not only in the enzymatic cleavage of carbon-halogen (C-halogen) bonds but also in a variety of other functions. One such function is the adhesion of the fungal pathogen P. brasiliensis to the cellular membrane, which leads to pathogenesis as explained above.
Therefore, this study was performed to search for genes co-expressed with the HAD superfamily hydrolase gene in P. brasiliensis, with the aim of revealing a gene expression network from which potential drug candidates can be discovered. In addition, the CE network was analyzed with this gene taken as the reference, and the unknown genes were annotated using a few existing software tools.
| Subjects and Methods|| |
Retrieval of data
The microarray expression data used for the present in-silico analysis were retrieved from the Gene Expression Omnibus (GEO) database of NCBI (www.ncbi.nlm.nih.gov/geo/) for mining purposes. "P. brasiliensis" was entered in the GEO data search menu, and the data set for the transition from mycelium to yeast was obtained. The researchers who submitted these data to the GEO database designed the experiment with expressed sequence tags (ESTs) of P. brasiliensis as probes for obtaining the signal intensity values. The GEO repository is organized into three levels: platform, sample and series. A platform describes the list of elements on the array, which may comprise cDNAs, oligonucleotide probesets, ORFs, antibodies, etc., depending on the experiment. Each platform has a unique GEO accession number that starts with GPL. A sample describes the conditions under which an individual sample was handled. It also holds a unique and stable GEO accession number, which starts with GSM, and contains a table describing hybridization details and a table of experimental results for that hybridization. Similarly, a series is a group of related samples; its accession number starts with GSE, and it defines how and why the samples relate to each other. It contains tables describing extracted data, summary conclusions, etc. Only the GSM data sets with Cy3-labeled probes were taken for all experiments, along with their replica data. From these data sets, the value (Cy3/Cy5) was obtained along with the EST accession and stored separately in an Excel sheet.
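The three GEO levels described above can be told apart from the accession prefix alone. The following Python sketch illustrates this; the function name `geo_level` and the tolerance for a space between prefix and number are assumptions made for illustration and are not part of GEO or of the original analysis.

```python
import re

# Prefix-to-level mapping taken from the description of GEO above.
GEO_LEVELS = {"GPL": "platform", "GSM": "sample", "GSE": "series"}


def geo_level(accession):
    """Return the GEO level for an accession such as 'GPL2780', or None.

    Hypothetical helper: accepts an optional space between the prefix
    and the number and ignores letter case.
    """
    match = re.fullmatch(r"(GPL|GSM|GSE)\s?(\d+)", accession.strip().upper())
    if match is None:
        return None
    return GEO_LEVELS[match.group(1)]
```

For example, `geo_level("GPL 2780")` classifies the accession used in this study as a platform record.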
Experimental design and analysis
Since experimental design is a crucial step in microarray data analysis, a comprehensive, concise and logical experimental design was developed for the current study to obtain the best set of genes co-expressed with the HAD superfamily hydrolase gene. The data sets were subjected to filtering using the CoExpress tool. This is a user-friendly software tool for interactive comparison of expression profiles and building of pair-wise gene or gene/miRNA co-expression matrices. It is freely available and can be downloaded from http://bioinformatics.lu/CoExpress. The software contains an analytical pipeline for microarray data import, filtering of large numbers of genes, normalization, prediction and validation of CE, and network analysis. First, the intensity of the HAD superfamily gene was obtained from each data set, and the mean, standard deviation, etc., were calculated from the expression values. Filtering was then performed by supplying values such as the maximum expression value, average expression value and standard deviation, to facilitate obtaining the genes co-expressed with the gene under consideration. The CoExpress software calculates the CE for an input matrix in the following steps:
1. Create a CE matrix M, filled with zeros, for storing the final result.
2. For each iteration i, while i ≤ number of runs (NR), perform the following actions:
   - Randomly remove a specified percentage of the experiments from the dataset.
   - Calculate the CE matrix L for the reduced dataset.
   - Set to 0 all values of L whose absolute value is less than the bootstrapping threshold.
   - Add L to M: M = M + L.
   - Go back to step 2.
3. Calculate the average CE in M: M = M/NR.
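The bootstrapping steps listed above can be sketched in plain Python as follows. This is an illustrative sketch, not the CoExpress implementation: the function names, the default `removed_fraction` of 0.2 and the choice to score constant profiles as 0 are assumptions, and the correlation power setting is not modeled. The defaults for the number of runs (1000) and the threshold (0.9) follow the values reported in this study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant (an assumption made
    here to avoid division by zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def bootstrap_coexpression(data, n_runs=1000, removed_fraction=0.2,
                           threshold=0.9, seed=0):
    """Average thresholded correlation matrix over random sub-samples.

    data: one expression profile (list of values) per gene, with one
    value per experiment. removed_fraction is a hypothetical default.
    """
    rng = random.Random(seed)
    n_genes, n_exp = len(data), len(data[0])
    n_keep = min(n_exp, max(2, n_exp - round(removed_fraction * n_exp)))
    m = [[0.0] * n_genes for _ in range(n_genes)]      # step 1: zero matrix M
    for _ in range(n_runs):                            # step 2: NR iterations
        cols = rng.sample(range(n_exp), n_keep)        # remove some experiments
        reduced = [[row[c] for c in cols] for row in data]
        for i in range(n_genes):
            for j in range(i + 1, n_genes):
                r = pearson(reduced[i], reduced[j])    # entry of matrix L
                if abs(r) < threshold:                 # zero out weak values
                    r = 0.0
                m[i][j] += r                           # M = M + L
                m[j][i] += r
    return [[v / n_runs for v in row] for row in m]    # step 3: M = M / NR
```

Only the off-diagonal entries are filled in this sketch, so the diagonal of the returned matrix stays at 0. A pair of genes keeps a high average only if its correlation stays above the threshold in most sub-samples, which is what makes the bootstrapped value more conservative than a single correlation computed on the full data.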
The co-expressed genes obtained for all data sets were then visualized in the Cytoscape 2.8.3 software tool (www.cytoscape.org), and the network construction was examined. The common genes present in the network were annotated by performing a BLASTX search with each EST to identify the protein (http://blast.ncbi.nlm.nih.gov/Blast.cgi?). ESTs with no match were translated using the TRANSEQ module of the EMBOSS software (www.ebi.ac.uk/Tools/st/emboss_transeq). PSI-BLAST was then performed for each protein sequence against the human genome to assess homology. Finally, the nonhomologous genes in the selected network were annotated using the SMART domain prediction server (smart.embl-heidelberg.de/).
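One common way to hand a co-expression matrix to Cytoscape is the Simple Interaction Format (SIF), in which each line names a source node, an interaction type and a target node. The source does not state which import format was used in this study, so the Python sketch below is only an illustration; the function name `matrix_to_sif` and the interaction label `coexp` are hypothetical.

```python
def matrix_to_sif(names, matrix, cutoff=0.0, relation="coexp"):
    """Turn a symmetric CE matrix into tab-separated SIF lines.

    names: EST identifiers in matrix order.
    An edge is written once per pair whose absolute CE exceeds cutoff.
    The relation label is an arbitrary, hypothetical choice.
    """
    lines = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) > cutoff:
                lines.append(f"{names[i]}\t{relation}\t{names[j]}")
    return lines
```

Writing the returned lines to a text file with `"\n".join(...)` gives a file that can be imported as a network, with each EST appearing as a node.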
| Results|| |
The GEO data set entitled "transcriptome analysis of P. brasiliensis cells undergoing the mycelium-to-yeast transition" was considered for the study. The microarray data in the experiment were derived by RNA hybridization from fungal cultures at 5, 10, 24, 48, 72 and 120 h, respectively, in a temperature range from 26°C to 37°C, with the RNA obtained from the original mycelium taken as the reference (t = 0 h). Each time point was analyzed with four independent hybridizations: two experiments, labeled "Exp 1" and "Exp 2", with each chip carrying two replicas, were available for each time point in the data. For our study, the data set GPL 2780 was selected from GEO; it has a single platform and contains 48 samples (GSM files). In the current work, a total of 24 sample files were obtained for the Cy3-labeled probes across replicas and experiments. The four groups of Cy3-labeled data were named CY3_E1_R1 (for experiment 1 and replica 1), CY3_E2_R1 (for experiment 2 and replica 1), CY3_E1_R2 (for experiment 1 and replica 2) and CY3_E2_R2 (for experiment 2 and replica 2), as described in [Table 1].
With the selected GSM files, an in-silico CE analysis was performed. As a preliminary analysis, the mean and standard deviation of the expression values of the HAD superfamily hydrolase gene were obtained [Figure 1].
|Figure 1: The expression value of haloacid dehalogenase superfamily hydrolase expressed sequence tags in the different data sets under consideration: cy3_e1_r1 (standard deviation = 0.61, mean = −0.067), cy3_e2_r1 (standard deviation = 0.659, mean = −0.09), cy3_e1_r2 (standard deviation = 0.17, mean = 0.05), cy3_e2_r2 (standard deviation = 0.34, mean = 0.129)|
These values were also carefully considered when setting the gene filtering parameters. The mean of the expression values over all time points was obtained, representing the average expression of each gene. The original dataset contains 4340 genes, but after removal of ESTs that do not bear any expression value, the final input for each data set was a 3229 × 6 matrix, with ESTs as rows and expression values (Cy3/Cy5) at the different time points as columns. The parameters considered for filtration are given in [Table 2].
|Table 2: Parameters for different datasets considered for gene filtration purpose|
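The preparation of the input matrix and the per-gene summary values used for filtering can be illustrated with a short Python sketch. The function names are hypothetical, missing measurements are assumed to be represented as `None`, the sketch drops any EST with an incomplete profile (which may be stricter than the removal described in the text), and the sample standard deviation is an assumed choice because the source does not say which form was used.

```python
import statistics


def drop_unmeasured(profiles):
    """Keep only ESTs that have a value at every time point.

    profiles maps an EST accession to a list of Cy3/Cy5 values, one per
    time point, with None marking a missing measurement (an assumed
    representation).
    """
    return {
        est: values
        for est, values in profiles.items()
        if values and all(v is not None for v in values)
    }


def summarize(values):
    """Return (mean, sample standard deviation, maximum) of one profile."""
    return statistics.mean(values), statistics.stdev(values), max(values)
```

Applying `summarize` to the profile of the reference EST yields the kind of mean and standard deviation values that are reported for the HAD superfamily hydrolase gene in Figure 1.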
After filtration of the data, the Pearson correlation measure was selected, with correlation power 1 and filtering threshold 0.9, for the CE calculation in the CoExpress tool. The CE calculation was bootstrapped 1000 times for validation. The CE results were obtained for all data sets, and the interacting ESTs are given in [Table 3].
|Table 3: The results of the individual co-expressed genes (EST codes) for each data set|
Network construction and analysis of gene annotation
The CE matrix computed by the CoExpress tool was fed to the Cytoscape tool to observe the network architecture. The network displays the CE pattern of 23 different common genes that were co-expressed in all the considered data sets along with the test gene, the HAD superfamily hydrolase [Figure 2].
|Figure 2: The common network architecture of the genes that are co-expressed with haloacid dehalogenase superfamily hydrolase expressed sequence tags (ID BQ491758) when observed in Cytoscape tool|
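Selecting the genes that are co-expressed with the reference EST in every data set amounts to a set intersection over the per-data-set partner lists. The Python sketch below illustrates the idea; the function name and the input layout (a mapping from data set name to partner ESTs) are assumptions made for illustration, not the procedure documented by the authors.

```python
def common_partners(partners_by_dataset):
    """Return the ESTs that appear in every data set's partner list.

    partners_by_dataset maps a data set name (for example 'CY3_E1_R1')
    to an iterable of ESTs found co-expressed with the reference EST.
    """
    partner_sets = [set(p) for p in partners_by_dataset.values()]
    if not partner_sets:
        return set()
    return set.intersection(*partner_sets)
```

An EST that is missing from even one of the four data sets is excluded, which is why the common set can be much smaller than any individual list.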
Further annotation of the EST sequences was performed with the BLASTX tool to identify the proteins present in the organism. The results contain the major protein hits, which show ~100% similarity with P. brasiliensis Pb03. Of the 23, 9 proteins do not have significant matches in the BLASTX search, and 8 proteins were found to be nonhomologous to human by the protein PSI-BLAST search [Table 4].
| Discussion|| |
In the present analysis, the absence of homologous sequences for some ESTs indicates that these might represent genes whose function has not yet been identified. The selected pathway also contains genes linked to spermidine synthase, which is found to be an essential drug target in many organisms, such as protozoa as well as pathogenic fungi. Spermidine synthase has also been observed earlier to be an essential protein in the dimorphism of pathogenic fungi, which is related to pathogenesis. Although the protein shares homology with human, it was found to be nonessential in human, as searched in the Database of Essential Genes for Homo sapiens available at http://www.essentialgene.org/. The three-dimensional structure of spermidine synthase of P. brasiliensis Pb03 is also not present in the Protein Data Bank. Hence, the result obtained here provides an opportunity for structural modeling and robust systems biology studies to find novel drug targets by considering the pathway. Moreover, the pathway obtained in this work will lead to the discovery of suitable potential drugs against P. brasiliensis Pb03, with novel drug targets predicted by the gene expression analysis. The genes listed in the table do not show homology with human and could therefore be considered for the development of new antifungal drug targets. The SMART-based domain analysis indicates that the sequences also contain domains that are essential in function [Table 5].
|Table 5: Annotation of the EST by in-silico translation followed by SMART domain analysis software (only proteins available in the selected network are given)|
Since P. brasiliensis Pb03 infection is a medical problem, the prediction of new drug targets from sequence information is of great importance. The genes that are commonly co-expressed in the dimorphic fungus P. brasiliensis Pb03 could be considered as drug targets for paracoccidioidomycosis therapy by inhibiting the adhesion process during the establishment of pathogenesis. Previously, gene CE analysis at a particular physiological stage in Mycobacterium revealed the scope for identification of novel drug targets. Similarly, CE protein networks have also been used in an integrated manner with metabolic pathways for the identification of targets in cancer therapeutics research.
Our work suggests that in-silico analysis is a suitable strategy for discovering genes co-expressed with HAD superfamily hydrolases. This analysis also provides a valuable resource of information regarding genes responding during the mycelium-to-yeast transition in the pathogenic fungus P. brasiliensis. We analyzed the expression profiles of 4340 genes present in 4 selected data sets, among which 23 genes were found to be co-expressed under all the experimental conditions considered. Annotation of these genes and homology searches predict that many of them have strong potential for the development of anti-fungal therapy, as these genes play a significant role in fungal metabolism during human infection. The gene CE and functional analysis of the P. brasiliensis Pb03 genes described in this work may be used for further study of the morphogenesis of the pathogen. In conclusion, knowledge of the predicted genes of P. brasiliensis Pb03 will most likely facilitate the development of new therapeutics against paracoccidioidomycosis as well as other related mycoses.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
| References|| |
de Moraes-Vasconcelos D, Grumach AS, Yamaguti A, Andrade ME, Fieschi C, de Beaucoudrey L, et al. Paracoccidioides brasiliensis disseminated disease in a patient with inherited deficiency in the beta1 subunit of the interleukin (IL)-12/IL-23 receptor. Clin Infect Dis 2005;41:e31-7.
Marques SA. Paracoccidioidomycosis: Epidemiological, clinical, diagnostic and treatment up-dating. An Bras Dermatol 2013;88:700-11.
González A, Gómez BL, Diez S, Hernández O, Restrepo A, Hamilton AJ, et al. Purification and partial characterization of a Paracoccidioides brasiliensis protein with capacity to bind to extracellular matrix proteins. Infect Immun 2005;73:2486-95.
Terçarioli GR, Bagagli E, Reis GM, Theodoro RC, Bosco Sde M, Macoris SA, et al. Ecological study of Paracoccidioides brasiliensis in soil: Growth ability, conidia production and molecular detection. BMC Microbiol 2007;7:92.
Marques ER, Ferreira ME, Drummond RD, Felix JM, Menossi M, Savoldi M, et al. Identification of genes preferentially expressed in the pathogenic yeast phase of Paracoccidioides brasiliensis, using suppression subtraction hybridization and differential macroarray analysis. Mol Genet Genomics 2004;271:667-77.
Pereira LA, Báo SN, Barbosa MS, da Silva JL, Felipe MS, de Santana JM, et al. Analysis of the Paracoccidioides brasiliensis triosephosphate isomerase suggests the potential for adhesin function. FEMS Yeast Res 2007;7:1381-8.
Hernández O, Almeida AJ, Gonzalez A, Garcia AM, Tamayo D, Cano LE, et al. A 32-kilodalton hydrolase plays an important role in Paracoccidioides brasiliensis adherence to host cells and influences pathogenicity. Infect Immun 2010;78:5280-6.
Felipe MS, Andrade RV, Arraes FB, Nicola AM, Maranhão AQ, Torres FA, et al. Transcriptional profiles of the human pathogenic fungus Paracoccidioides brasiliensis in mycelium and yeast cells. J Biol Chem 2005;280:24706-14.
Martinez R, Malta MH, Verceze AV, Arantes MR. Comparative efficacy of fluconazole and amphotericin B in the parenteral treatment of experimental paracoccidioidomycosis in the rat. Mycopathologia 1999;146:131-4.
Naranjo MS, Trujillo M, Munera MI, Restrepo P, Gomez I, Restrepo A. Treatment of paracoccidioidomycosis with itraconazole. J Med Vet Mycol 1990;28:67-76.
Rogers PD, Barker KS. Evaluation of differential gene expression in fluconazole-susceptible and -resistant isolates of Candida albicans by cDNA microarray analysis. Antimicrob Agents Chemother 2002;46:3412-7.
Dowd C, Wilson IW, McFadden H. Gene expression profile changes in cotton root and hypocotyl tissues in response to infection with Fusarium oxysporum f. sp. vasinfectum. Mol Plant Microbe Interact 2004;17:654-67.
Marton MJ, DeRisi JL, Bennett HA, Iyer VR, Meyer MR, Roberts CJ, et al. Drug target validation and identification of secondary drug target effects using DNA microarrays. Nat Med 1998;4:1293-301.
Debouck C, Goodfellow PN. DNA microarrays in drug discovery and development. Nat Genet 1999;21:48-50.
Welsh JB, Sapinoso LM, Su AI, Kern SG, Wang-Rodriguez J, Moskaluk CA, et al. Analysis of gene expression identifies candidate markers and pharmacological targets in prostate cancer. Cancer Res 2001;61:5974-8.
Oda Y, Owa T, Sato T, Boucher B, Daniels S, Yamanaka H, et al. Quantitative chemical proteomics for identifying candidate drug targets. Anal Chem 2003;75:2159-65.
Wang Y, Joshi T, Zhang XS, Xu D, Chen L. Inferring gene regulatory networks from multiple microarray datasets. Bioinformatics 2006;22:2413-20.
Koonin EV, Tatusov RL. Computer analysis of bacterial haloacid dehalogenases defines a large superfamily of hydrolases with diverse specificity. Application of an iterative approach to database search. J Mol Biol 1994;244:125-32.
Caparrós-Martín JA, McCarthy-Suárez I, Culiáñez-Macià FA. HAD hydrolase function unveiled by substrate screening: Enzymatic characterization of Arabidopsis thaliana subclass I phosphosugar phosphatase AtSgpp. Planta 2013;237:943-54.
Thomson JM, Parker J, Perou CM, Hammond SM. A custom microarray platform for analysis of microRNA gene expression. Nat Methods 2004;1:47-53.
Nazarov PV, Muller A, Khutko V, Vallar L. Co-Expression Analysis of Large Microarray Data Sets Using CoExpress Software Tool.
Nunes LR, Costa de Oliveira R, Leite DB, da Silva VS, dos Reis Marques E, da Silva Ferreira ME, et al. Transcriptome analysis of Paracoccidioides brasiliensis cells undergoing mycelium-to-yeast transition. Eukaryot Cell 2005;4:2115-28.
Taylor MC, Kaur H, Blessington B, Kelly JM, Wilkinson SR. Validation of spermidine synthase as a drug target in African trypanosomes. Biochem J 2008;409:563-9.
Becker JV, Mtwisha L, Crampton BG, Stoychev S, van Brummelen AC, Reeksting S, et al. Plasmodium falciparum spermidine synthase inhibition results in unique perturbation-specific effects observed on transcript, protein and metabolite levels. BMC Genomics 2010;11:235.
Kumar R, Chadha S, Saraswat D, Bajwa JS, Li RA, Conti HR, et al. Histatin 5 uptake by Candida albicans utilizes polyamine transporters Dur3 and Dur31 proteins. J Biol Chem 2011;286:43748-58.
Valdés-Santiago L, Cervantes-Chávez JA, Ruiz-Herrera J. Ustilago maydis spermidine synthase is encoded by a chimeric gene, required for morphogenesis, and indispensable for survival in the host. FEMS Yeast Res 2009;9:923-35.
Jin Y, Bok JW, Guzman-de-Peña D, Keller NP. Requirement of spermidine for developmental transitions in Aspergillus nidulans. Mol Microbiol 2002;46:801-12.
Arraes FB, Benoliel B, Burtet RT, Costa PL, Galdino AS, Lima LH, et al. General metabolism of the dimorphic and pathogenic fungus Paracoccidioides brasiliensis. Genet Mol Res 2005;4:290-308.
Puniya BL, Kulshreshtha D, Verma SP, Kumar S, Ramachandran S. Integrated gene co-expression network analysis in the growth phase of Mycobacterium tuberculosis reveals new potential drug targets. Mol Biosyst 2013;9:2798-815.
Penrod NM, Moore JH. Influence networks based on coexpression improve drug target discovery for the development of novel cancer therapeutics. BMC Syst Biol 2014;8:12.
Chen J, Ma M, Shen N, Xi JJ, Tian W. Integration of cancer gene co-expression network and metabolic network to uncover potential cancer drug targets. J Proteome Res 2013;12:2354-64.