Exam (worked solutions)

Assignment-5 with Correct Answers Michigan Technological University MATH MA 5790

Grade: A+

Uploaded on: 15-04-2023

Written in: 2022/2023

Document information

Uploaded on: 15 April 2023
Number of pages: 37
Written in: 2022/2023
Type: Exam (worked solutions)
Contains: Questions and answers

Content preview

Assignment 5
Raghavendran Shankar

1. The hepatic injury data set was described in the introductory chapter and
contains 281 unique compounds, each of which has been classified as causing no
liver damage, mild damage, or severe damage (Fig. 1.2). These compounds were
analyzed with 184 biological screens (i.e., experiments) to assess each
compound’s effect on a particular biologically relevant target in the body. The
larger the value of each of these predictors, the higher the activity of the
compound. In addition to biological screens, 192 chemical fingerprint predictors
were determined for these compounds. Each of these predictors represents a
substructure (i.e., an atom or combination of atoms within the compound) and
are either counts of the number of substructures or an indicator of presence or
absence of the particular substructure. The objective of this data set is to build a
predictive model for hepatic injury so that other compounds can be screened for
the likelihood of causing hepatic injury. Start R and use these commands to load
the data:
(a) Given the classification imbalance in hepatic injury status, describe how you
would create a training and testing set.
A: We use stratified random sampling to cope with the imbalance in hepatic injury
status. The split is made within each injury class (None, Mild, Severe), so the training
and test sets keep roughly the same class proportions as the full data set. In R this can
be done with caret's createDataPartition() function.
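The idea behind a stratified split can be illustrated with a minimal sketch. It is written in plain Python rather than R and is not caret's implementation; the function name stratified_split, the 80/20 fraction, and the class counts are assumptions made up for illustration (the counts are not the real hepatic class distribution, they only sum to 281).

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices so each class contributes the same fraction to training."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)  # random sampling within one class
        n_train = round(train_fraction * len(idx))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical class counts (assumed for illustration, not the real data).
labels = ["None"] * 100 + ["Mild"] * 150 + ["Severe"] * 31

train_idx, test_idx = stratified_split(labels)
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(train_counts)  # 80 None, 120 Mild, 25 Severe
print(test_counts)   # 20 None, 30 Mild, 6 Severe
```

Because the sampling happens inside each class, the rare class (Severe in this made-up example) is guaranteed to appear in both sets, which a plain random split does not ensure.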
(b) Which classification statistic would you choose to optimize for this exercise and
why?
A: We optimize accuracy, i.e., the fraction of compounds whose injury class is predicted
correctly. It gives a single overall measure of how well a classification model performs,
and we use it to select the optimal model on the training set and to assess it on the
test set.
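As a small worked example of the statistic, the sketch below computes accuracy as the number of correct predictions divided by the total. The function name accuracy and the six labels are hypothetical and chosen only to show the arithmetic; they are not results from the assignment.

```python
def accuracy(actual, predicted):
    """Fraction of observations whose predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Made-up labels for illustration only.
actual = ["None", "Mild", "Mild", "Severe", "None", "Mild"]
predicted = ["None", "Mild", "None", "Severe", "Mild", "Mild"]

acc = accuracy(actual, predicted)
print(acc)  # 4 of 6 predictions are correct
```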

(c) Split the data into a training and a testing set, pre-process the data, and build models
described in this chapter for the biological predictors and separately for the chemical
fingerprint predictors. Which model has the best predictive ability for the biological
predictors and what is the optimal performance? Which model has the best predictive
ability for the chemical predictors and what is the optimal performance? Based on
these results, which set of predictors contains the most information about hepatic
toxicity?
A:
Biological Data:
GLM:

PLSDA:

LDA:

GLMNET:
