Knowledge-based generation of machine learning experiments: learning with DNA crystallography data.

Research output: Contribution to journal (Article)

Abstract

Though it has been possible in the past to learn to predict DNA hydration patterns from crystallographic data, there is ambiguity in the choice of training data (both in terms of the relevant set of cases and the features needed to represent them), which limits the usefulness of standard learning techniques. Thus, we have developed a knowledge-based system to generate machine learning experiments for inducing DNA hydration pattern classifiers. The system takes as input (1) a set of classified training examples described by a large set of attributes and (2) information about a set of learning experiments that have already been run. It outputs a new learning experiment, namely a (not necessarily proper) subset of the input examples represented by a new set of features. Domain-specific and domain-independent knowledge is used to suggest subsets of training examples from suspected subpopulations, transform attributes in the training data or generate new ones, and choose interesting ways to substitute one experiment's set of attributes with another. Automatic hydration pattern predictors are of both theoretical and practical interest to DNA crystallographers, because they can speed up a labor-intensive process, and because the extracted rules add to the knowledge of what determines DNA hydration.
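
The input/output contract described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the names `Experiment` and `propose_experiment`, the representation of knowledge as simple predicate functions, the candidate feature sets, and the toy attributes `conformation` and `base` are all assumptions made here for illustration. The sketch only shows the shape of the interface: classified examples plus a history of past experiments go in, and a not-yet-tried pairing of an example subset and a feature set comes out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical representation: attribute name -> value, including a "label" key.
Example = Dict[str, object]


@dataclass(frozen=True)
class Experiment:
    """A learning experiment: which examples are used and which features describe them."""
    example_ids: Tuple[int, ...]
    features: Tuple[str, ...]


def propose_experiment(
    examples: Sequence[Example],
    history: Sequence[Experiment],
    subset_rules: Sequence[Callable[[Example], bool]],
    feature_sets: Sequence[Tuple[str, ...]],
) -> Optional[Experiment]:
    """Return the first (subset, feature set) pairing that is not in the history."""
    tried = set(history)
    # The full example set is always a candidate: the output subset
    # is "not necessarily proper".
    candidates = [lambda ex: True, *subset_rules]
    for keep in candidates:
        ids = tuple(i for i, ex in enumerate(examples) if keep(ex))
        if not ids:
            continue  # a rule that selects nothing yields no experiment
        for feats in feature_sets:
            experiment = Experiment(ids, tuple(feats))
            if experiment not in tried:
                return experiment
    return None  # every candidate pairing has already been run


# Toy data, invented purely for illustration (not real crystallographic data).
examples = [
    {"conformation": "A", "base": "G", "label": "hydrated"},
    {"conformation": "B", "base": "C", "label": "dry"},
    {"conformation": "B", "base": "G", "label": "hydrated"},
]
# One assumed "suspected subpopulation" rule and two assumed feature sets.
rules = [lambda ex: ex["conformation"] == "B"]
feature_sets = [("base",), ("base", "conformation")]

first = propose_experiment(examples, [], rules, feature_sets)
second = propose_experiment(examples, [first], rules, feature_sets)
third = propose_experiment(examples, [first, second], rules, feature_sets)
```

In this sketch the history only prevents repeats. The abstract describes a richer use of knowledge (transforming attributes, generating new ones, substituting one experiment's attribute set with another), which would replace the fixed `feature_sets` list with generated candidates.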

Fingerprint

Crystallography
Learning
DNA
Machine Learning

All Science Journal Classification (ASJC) codes

  • Medicine(all)

Cite this

@article{66c2b89e22c645938cf5ac81dad33582,
title = "Knowledge-based generation of machine learning experiments: learning with DNA crystallography data.",
author = "D. Cohen and Casimir Kulikowski and Helen Berman",
year = "1993",
month = "1",
day = "1",
language = "English (US)",
volume = "1",
pages = "92--100",
journal = "Proceedings. International Conference on Intelligent Systems for Molecular Biology (ISMB)",
issn = "1553-0833",
publisher = "American Association for Artificial Intelligence (AAAI) Press",

}

TY - JOUR

T1 - Knowledge-based generation of machine learning experiments

T2 - learning with DNA crystallography data.

AU - Cohen, D.

AU - Kulikowski, Casimir

AU - Berman, Helen

PY - 1993/1/1

Y1 - 1993/1/1

UR - http://www.scopus.com/inward/record.url?scp=0027900873&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0027900873&partnerID=8YFLogxK

M3 - Article

C2 - 7584375

AN - SCOPUS:0027900873

VL - 1

SP - 92

EP - 100

JO - Proceedings. International Conference on Intelligent Systems for Molecular Biology (ISMB)

JF - Proceedings. International Conference on Intelligent Systems for Molecular Biology (ISMB)

SN - 1553-0833

ER -