Frequently Asked Questions (FAQs)

Q1. What is BacEffluxPred?

BacEffluxPred is an online, open-source, two-tier server that predicts bacterial efflux proteins conferring antibiotic resistance (ARE proteins) and their associated subfamilies.

Q2. How is this server useful?

This server can be used to discriminate ARE proteins from non-ARE proteins and to classify the ARE proteins into their associated subfamilies with significant accuracy. This allows experimental biologists to cut down the time and cost of separating ARE proteins from non-ARE proteins.
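The two-step flow described above (first ARE versus non-ARE, then subfamily assignment) can be sketched as a simple cascade. This is only an illustration under stated assumptions: the function `predict_two_tier` and the two toy classifiers are hypothetical stand-ins, not the server's actual models; only the five family names come from this FAQ.

```python
from typing import Callable, Optional, Sequence

# Efflux families named in this FAQ for tier-II.
FAMILIES = ("ABC", "RND", "MATE", "MFS", "SMR")

Features = Sequence[float]


def predict_two_tier(
    features: Features,
    tier1_is_are: Callable[[Features], bool],
    tier2_family: Callable[[Features], str],
) -> Optional[str]:
    """Return the predicted family for an ARE protein, or None for non-ARE."""
    # Tier-I: stop early if the protein is predicted to be non-ARE.
    if not tier1_is_are(features):
        return None
    # Tier-II: only proteins that pass tier-I are assigned a family.
    family = tier2_family(features)
    if family not in FAMILIES:
        raise ValueError(f"unexpected family label: {family!r}")
    return family


# Hypothetical toy classifiers, used only to exercise the cascade.
def toy_tier1(features: Features) -> bool:
    return sum(features) > 0


def toy_tier2(features: Features) -> str:
    # Pick the family whose (made-up) score is the largest.
    best = max(range(len(FAMILIES)), key=lambda i: features[i])
    return FAMILIES[best]


non_are = predict_two_tier([-1.0, -1.0, -1.0, -1.0, -1.0], toy_tier1, toy_tier2)
are_family = predict_two_tier([0.1, 0.9, 0.2, 0.0, 0.3], toy_tier1, toy_tier2)
```

In a real setting the two callables would wrap trained tier-I and tier-II models; the cascade itself stays the same.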

Q3. What is the difference between already available ARE prediction algorithms (if any) and BacEffluxPred?

We did not find any computational method that can discriminate ARE from non-ARE proteins and further classify them into their associated families. Taking this into consideration, we designed a prediction method named BacEffluxPred. It was developed from experimentally determined sequences using SVM in classification mode, and it predicts bacterial efflux proteins conferring antibiotic resistance and their families with high prediction accuracy.

Q4. Which machine learning techniques have been implemented in this server?

The support vector machine (SVM) technique has been used to design the algorithm for predicting efflux proteins conferring antibiotic resistance.

Q5. Where was the data for model generation taken from?

Experimentally determined sequences of non-antibiotic resistance proteins, non-efflux proteins (non-EP) and non-efflux antibiotic resistance proteins (together, the non-ARE proteins), as well as proteins of the efflux families, were taken from public databases, i.e. UniProtKB and PATRIC, and from our in-house resource BacARscan. After removal of redundant sequences, the data comprised 1,132 non-redundant, non-fragmented non-ARE protein sequences and 210 efflux subfamily protein sequences. In tier-I, about 5/6 of the total data, i.e. 1,099 protein sequences, were used as the training dataset (Dtrain), while the remaining 243 protein sequences (about 1/6) were used as an independent dataset (Dind). Similarly, in tier-II, 178 of the 210 ARE protein sequences, which cover the ABC, RND, MATE, MFS and SMR families, were used for training, and the remaining 32 protein sequences were used as an independent dataset (Dind) for benchmarking the trained models.
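The dataset sizes quoted above can be checked with a small sketch of a train/independent split. The FAQ does not say how sequences were assigned to Dtrain and Dind, so the seeded random shuffle here is an assumption, and `split_dataset` and the placeholder sequence identifiers are hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    items: Sequence[T], n_train: int, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle items reproducibly and split them into (training, independent)."""
    if not 0 <= n_train <= len(items):
        raise ValueError("n_train must be between 0 and len(items)")
    shuffled = list(items)
    # Assumption: a plain random split; the original procedure is not stated.
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Tier-I: 1,132 non-ARE + 210 ARE sequences (placeholder identifiers).
tier1_ids = [f"seq{i}" for i in range(1132 + 210)]
d_train1, d_ind1 = split_dataset(tier1_ids, n_train=1099)

# Tier-II: the 210 ARE sequences only.
tier2_ids = [f"are{i}" for i in range(210)]
d_train2, d_ind2 = split_dataset(tier2_ids, n_train=178)

print(len(d_train1), len(d_ind1), len(d_train2), len(d_ind2))
```

A stratified split (keeping class or family proportions equal in both parts) would be a natural refinement, but whether the authors used one is not stated here.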

Q6. Which input features have been encoded for model development?

We used position-specific scoring matrices (PSSMs) generated by PSI-BLAST searches against the NR90 database for model development.
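A PSSM has one row per residue, so its size varies with protein length, whereas an SVM needs a fixed-length input. The FAQ does not state how the PSSM was turned into a feature vector; the sketch below shows one widely used option, a 400-dimensional PSSM composition with sigmoid scaling, purely as an assumption. The function names are hypothetical.

```python
import math
from typing import List, Sequence

# Column order of the 20 standard residues in a PSI-BLAST ASCII PSSM.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def sigmoid(x: float) -> float:
    """Scale a raw PSSM score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pssm_composition(sequence: str, pssm: Sequence[Sequence[float]]) -> List[float]:
    """Collapse an L x 20 PSSM into a fixed 400-value vector.

    For each of the 20 residue types, the scaled PSSM rows of all positions
    holding that residue are summed and divided by the sequence length.
    """
    if not sequence or len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must be non-empty and equally long")
    sums = [[0.0] * 20 for _ in range(20)]
    for residue, row in zip(sequence, pssm):
        if len(row) != 20:
            raise ValueError("each PSSM row must hold 20 scores")
        idx = AMINO_ACIDS.find(residue)
        if idx < 0:
            continue  # skip non-standard residues such as X
        for j, score in enumerate(row):
            sums[idx][j] += sigmoid(score)
    length = len(sequence)
    return [value / length for row in sums for value in row]


# Tiny made-up example: two alanines with all-zero PSSM rows.
vector = pssm_composition("AA", [[0.0] * 20, [0.0] * 20])
```

The resulting vector has the same length for every protein, which is what makes it usable as SVM input regardless of sequence length.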

Q7. Can I download the above-mentioned datasets?

Yes, users can easily download the training and validation datasets. We have provided all the datasets under the “Downloads” section.

Q8. Are there any other tools integrated in the server?

Yes, the BLAST tool PSI-BLAST, which searches the query protein against the NR90 database, has been incorporated in this web server.
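For readers who want to run a comparable search locally, the sketch below builds a command line for the `psiblast` program from NCBI BLAST+. The database name `nr90`, the file names, and the iteration count and E-value threshold are all assumptions for illustration; the settings used by the server are not given in this FAQ.

```python
import shlex
from typing import List


def build_psiblast_command(
    query_fasta: str,
    database: str,
    pssm_out: str,
    iterations: int = 3,      # assumed value, not taken from the FAQ
    evalue: float = 0.001,    # assumed value, not taken from the FAQ
) -> List[str]:
    """Build a psiblast argument list that writes an ASCII PSSM file."""
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-num_iterations", str(iterations),
        "-evalue", str(evalue),
        "-out_ascii_pssm", pssm_out,
    ]


# Hypothetical file and database names.
cmd = build_psiblast_command("query.fasta", "nr90", "query.pssm")
print(shlex.join(cmd))
```

With BLAST+ installed and a formatted local database available, the list can be passed to `subprocess.run(cmd, check=True)` to perform the search.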

Q9. Is this server freely available?

Yes, the server is open source and freely accessible at the URL: http://proteininformatics.org/mkumar/baceffluxpred/.