Significance of similarities in patterns: An application to β interferon-related DNA on human chromosome 2

L. T. May, F. R. Landsberger, Masayori Inouye, P. B. Sehgal

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

The nucleotide sequence of a 14-kilobase (kb) region of the human β interferon (IFN-β)-related DNA locus on chromosome 2 (genomic DNA clone λB3) was determined and compared to that of the IFN-β1 gene by using the Sellers TT algorithm. This algorithm aligns segments of one sequence with similar segments in a second sequence. A strategy was developed for assessing the significance of similarities between DNA sequences based on a scheme that recognizes patterns or runs of identities within an alignment. The pattern score (Π) thus obtained is an entropy-like measure. Numerically it is a reflection of the length of the second longest run of identity in an alignment plus a correction factor due to the other shorter identity runs in the alignment. When the IFN-β1 gene is compared to a random nucleotide sequence, the distribution of Π scores in such comparisons fits a Gaussian function. This strategy has been used to identify seven segments along one strand of λB3 DNA that are related to segments in IFN-β1; these seven alignments have Π scores ≥ 3 standard deviations above the mean score obtained in comparisons between IFN-β1 and random nucleotide sequences. One of these alignments (section 7) has a Π score 8.02 standard deviations above this mean score. The likelihood of finding an alignment statement as good as that in section 7 in a random sequence the length of the human genome is approximately 10⁻⁷. Furthermore, the λB3 DNA sequence in section 7 selects the human IFN-β1 gene as the most significant alignment in computer searches of mammalian nucleotide sequence data bases.
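
The two ingredients the abstract describes, runs of identities within an alignment and a score expressed in standard deviations above the mean of random-sequence comparisons, can be illustrated with a minimal Python sketch. This is not the authors' Π score, whose formula is not given in the abstract; the function names `identity_runs` and `z_score`, the use of `-` as a gap character, and the example values are all assumptions made for illustration.

```python
import statistics


def identity_runs(aligned_a, aligned_b, gap="-"):
    """Return the lengths of consecutive identical positions in two
    equal-length aligned sequences. The gap character is an assumption."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    runs, current = [], 0
    for x, y in zip(aligned_a, aligned_b):
        if x == y and x != gap:
            current += 1          # extend the current run of identities
        else:
            if current:
                runs.append(current)
            current = 0           # a mismatch or gap ends the run
    if current:
        runs.append(current)
    return runs


def z_score(observed, null_scores):
    """Express an observed score in sample standard deviations above the
    mean of scores obtained from comparisons with random sequences."""
    mean = statistics.mean(null_scores)
    sd = statistics.stdev(null_scores)
    return (observed - mean) / sd


# Hypothetical usage: two short aligned fragments and a made-up null sample.
runs = identity_runs("ACGTAC", "ACCTAG")
score_in_sd = z_score(10, [1, 2, 3])
```

In practice the null sample would hold scores from many comparisons against random nucleotide sequences, and the observed score would be whatever pattern score is computed from the run lengths.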

Original language: English (US)
Pages (from-to): 4090-4094
Number of pages: 5
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 82
Issue number: 12
DOI: 10.1073/pnas.82.12.4090
State: Published - Jan 1 1985

Fingerprint

Chromosomes, Human, Pair 2
Human Chromosomes
Interferons
DNA
Genes
Entropy
Human Genome
Clone Cells
Databases

All Science Journal Classification (ASJC) codes

  • General

Cite this

@article{b91d204d3c2e4078a402a9f52a51ac59,
title = "Significance of similarities in patterns: An application to β interferon-related DNA on human chromosome 2",
abstract = "The nucleotide sequence of a 14-kilobase (kb) region of the human β interferon (IFN-β)-related DNA locus on chromosome 2 (genomic DNA clone λB3) was determined and compared to that of the IFN-β1 gene by using the Sellers TT algorithm. This algorithm aligns segments of one sequence with similar segments in a second sequence. A strategy was developed for assessing the significance of similarities between DNA sequences based on a scheme that recognizes patterns or runs of identities within an alignment. The pattern score (Π) thus obtained is an entropy-like measure. Numerically it is a reflection of the length of the second longest run of identity in an alignment plus a correction factor due to the other shorter identity runs in the alignment. When the IFN-β1 gene is compared to a random nucleotide sequence, the distribution of Π scores in such comparisons fits a Gaussian function. This strategy has been used to identify seven segments along one strand of λB3 DNA that are related to segments in IFN-β1; these seven alignments have Π scores ≥ 3 standard deviations above the mean score obtained in comparisons between IFN-β1 and random nucleotide sequences. One of these alignments (section 7) has a Π score 8.02 standard deviations above this mean score. The likelihood of finding an alignment statement as good as that in section 7 in a random sequence the length of the human genome is approximately $10^{-7}$. Furthermore, the λB3 DNA sequence in section 7 selects the human IFN-β1 gene as the most significant alignment in computer searches of mammalian nucleotide sequence data bases.",
author = "May, {L. T.} and Landsberger, {F. R.} and Inouye, Masayori and Sehgal, {P. B.}",
year = "1985",
month = "1",
day = "1",
doi = "10.1073/pnas.82.12.4090",
language = "English (US)",
volume = "82",
pages = "4090--4094",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "12",

}

Significance of similarities in patterns: An application to β interferon-related DNA on human chromosome 2. / May, L. T.; Landsberger, F. R.; Inouye, Masayori; Sehgal, P. B.

In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 82, No. 12, 01.01.1985, p. 4090-4094.

TY - JOUR

T1 - Significance of similarities in patterns

T2 - An application to β interferon-related DNA on human chromosome 2

AU - May, L. T.

AU - Landsberger, F. R.

AU - Inouye, Masayori

AU - Sehgal, P. B.

PY - 1985/1/1

Y1 - 1985/1/1

N2 - The nucleotide sequence of a 14-kilobase (kb) region of the human β interferon (IFN-β)-related DNA locus on chromosome 2 (genomic DNA clone λB3) was determined and compared to that of the IFN-β1 gene by using the Sellers TT algorithm. This algorithm aligns segments of one sequence with similar segments in a second sequence. A strategy was developed for assessing the significance of similarities between DNA sequences based on a scheme that recognizes patterns or runs of identities within an alignment. The pattern score (Π) thus obtained is an entropy-like measure. Numerically it is a reflection of the length of the second longest run of identity in an alignment plus a correction factor due to the other shorter identity runs in the alignment. When the IFN-β1 gene is compared to a random nucleotide sequence, the distribution of Π scores in such comparisons fits a Gaussian function. This strategy has been used to identify seven segments along one strand of λB3 DNA that are related to segments in IFN-β1; these seven alignments have Π scores ≥ 3 standard deviations above the mean score obtained in comparisons between IFN-β1 and random nucleotide sequences. One of these alignments (section 7) has a Π score 8.02 standard deviations above this mean score. The likelihood of finding an alignment statement as good as that in section 7 in a random sequence the length of the human genome is approximately 10⁻⁷. Furthermore, the λB3 DNA sequence in section 7 selects the human IFN-β1 gene as the most significant alignment in computer searches of mammalian nucleotide sequence data bases.

UR - http://www.scopus.com/inward/record.url?scp=0021839492&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0021839492&partnerID=8YFLogxK

U2 - 10.1073/pnas.82.12.4090

DO - 10.1073/pnas.82.12.4090

M3 - Article

C2 - 3858866

AN - SCOPUS:0021839492

VL - 82

SP - 4090

EP - 4094

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 12

ER -