Abstract | Objectives: The first (major) aim of this study was to identify and evaluate the most important in silico databases commonly used by VarSome with regard to their basic principles of function, classification output (scores and thresholds), web link, and PMID number. The second (minor) aim was to investigate, in a pilot study, how well the classifications of as yet unidentified genomic alterations in oncological samples predicted by in silico databases agree with the classifications determined through careful literature-based single variant analysis by the molecular oncology unit of the department of pathology at the medical center in Coburg.
Materials and methods: In the first (major) part of this study, 92 VUS (variants of unknown significance) were entered into VarSome and the most important in silico databases were identified. The results are presented in two tables. The second (minor) part was conducted as a pilot study: 58 genetic variants were entered into VarSome for prediction, and the predictions (benign/pathogenic) were compared with the classifications determined by the department of pathology. Accuracy and the MCC (Matthews correlation coefficient) were determined for each in silico database. An MCC of 0.5 or higher was set as the threshold for acceptability, which is equivalent to an accuracy of >75%.
Results: 20 in silico databases were identified as the most important prediction tools commonly used by VarSome (BayesDel addAF, BayesDel noAF, DEOGEN2, MetaLR, MetaRNN, MetaSVM, REVEL, EIGEN PC, FATHMM, FATHMM-MKL, FATHMM-XF, LIST-S2, LRT, Mutation Assessor, MutationTaster2, PrimateAI, PROVEAN, SIFT, SIFT 4G). The classification output (scores and thresholds) assigned to each tool allows a deeper understanding of the predictive results. Further research is made easier by the web link and PMID number recorded for each in silico database. In the pilot study, the congruency testing was indicative, but congruency remained low. This was reflected in MCC scores that remained below the acceptability threshold of 0.5 overall.
Conclusion: The results of the first (major) part of this work enable pathologists to extend their knowledge of interpreting and validating the in silico databases that are frequently used by the molecular oncology unit of the department of pathology at the medical center in Coburg. The results of the second (minor) part, the pilot study, helped to give a sense of the reliability of in silico databases. However, careful literature-based single variant analysis cannot yet be replaced by the predictions of in silico databases. |
Abstract (Croatian) | Ciljevi: Cilj prvog (glavnog) dijela ove studije bio je identificirati i evaluirati najvažnije in silico baze podataka koje VarSome obično koristi, s obzirom na osnovna načela funkcioniranja, izlaz klasifikacije (rezultati i pragovi), web poveznicu i PMID broj. Drugi (manji) cilj ove studije bio je u pilot studiji istražiti koliko su klasifikacije još neidentificiranih genomskih promjena u onkološkim uzorcima, predviđene in silico bazama podataka, u skladu s klasifikacijama koje je pažljivom analizom pojedinačnih varijanti temeljenom na literaturi utvrdila molekularna onkologija odjela patologije u medicinskom centru u Coburgu.
Materijali i metode: U prvom (glavnom) dijelu ovog istraživanja u VarSome su unesene 92 VUS (varijante nepoznatog značaja) te su identificirane najvažnije in silico baze podataka. Rezultati su pregledno prikazani u obliku dviju tablica. Drugi (manji) dio ovog rada proveden je u obliku pilot studije. U VarSome je za predviđanje uneseno 58 genetskih varijanti. Prediktivni rezultati (benigni/patogeni) uspoređeni su s klasifikacijama koje je odredio odjel patologije. Točnost i MCC rezultat određeni su za svaku in silico bazu podataka. MCC rezultat od 0,5 ili viši postavljen je kao prag prihvatljivosti, što je ekvivalentno točnosti od >75%.
Rezultati: 20 in silico baza podataka identificirano je kao najvažniji alati za predviđanje koje VarSome obično koristi (BayesDel addAF, BayesDel noAF, DEOGEN2, MetaLR, MetaRNN, MetaSVM, REVEL, EIGEN PC, FATHMM, FATHMM-MKL, FATHMM-XF, LIST-S2, LRT, Mutation Assessor, MutationTaster2, PrimateAI, PROVEAN, SIFT, SIFT 4G). Izlaz klasifikacije (rezultati i pragovi), dodijeljen svakom alatu, omogućuje dublje razumijevanje prediktivnih rezultata. Daljnje istraživanje olakšano je web poveznicom i PMID brojem navedenim za svaku in silico bazu podataka. U pilot studiji ispitivanje kongruencije bilo je indikativno, ali je kongruencija ostala niska. To se odrazilo u MCC rezultatu, koji je u cjelini ostao ispod praga prihvatljivosti od 0,5.
Zaključci: Rezultati prvog (glavnog) dijela ovog rada omogućuju patolozima da prošire znanje o interpretaciji i validaciji in silico baza podataka koje se često koriste na molekularnoj onkologiji odjela patologije medicinskog centra u Coburgu. Rezultati drugog (manjeg) dijela, pilot studije, pridonijeli su stjecanju osjećaja za pouzdanost in silico baza podataka. Međutim, pažljiva analiza pojedinačnih varijanti temeljena na literaturi još se ne može zamijeniti predviđanjima in silico baza podataka. |
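
The accuracy and MCC (Matthews correlation coefficient) comparison described under Materials and methods can be illustrated with a minimal Python sketch. It is not the study's own code: the function names and the eight example labels are hypothetical, and the data are invented only to show the arithmetic. In this balanced example (equal numbers of benign and pathogenic variants, with the same error rate in both classes), an accuracy of 75% corresponds to an MCC of 0.5.

```python
import math


def confusion_counts(expert, predicted, positive="pathogenic"):
    """Count TP, FP, TN, FN, treating the expert classification as the reference."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(expert, predicted):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of variants on which the tool and the reference agree."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical, invented labels: 4 pathogenic and 4 benign reference calls,
# with one disagreement in each class.
expert = ["pathogenic"] * 4 + ["benign"] * 4
tool = ["pathogenic", "pathogenic", "pathogenic", "benign",
        "benign", "benign", "benign", "pathogenic"]

counts = confusion_counts(expert, tool)
print("TP, FP, TN, FN:", counts)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
print("meets MCC threshold of 0.5:", mcc(*counts) >= 0.5)
```

With unbalanced classes the two measures diverge, which is one reason to report MCC alongside accuracy for each in silico database.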