Significance of similarities in patterns: An application to β interferon-related DNA on human chromosome 2

L. T. May, F. R. Landsberger, M. Inouye, P. B. Sehgal

Research output: Contribution to journal › Article

4 Scopus citations


The nucleotide sequence of a 14-kilobase (kb) region of the human β interferon (IFN-β)-related DNA locus on chromosome 2 (genomic DNA clone λB3) was determined and compared to that of the IFN-β1 gene by using the Sellers TT algorithm. This algorithm aligns segments of one sequence with similar segments in a second sequence. A strategy was developed for assessing the significance of similarities between DNA sequences based on a scheme that recognizes patterns or runs of identities within an alignment. The pattern score (Π) thus obtained is an entropy-like measure. Numerically it is a reflection of the length of the longest run of identity in an alignment plus a correction factor due to the other shorter identity runs in the alignment. When the IFN-β1 gene is compared to a random nucleotide sequence, the distribution of Π scores in such comparisons fits a Gaussian function. This strategy has been used to identify seven segments along one strand of λB3 DNA that are related to segments in IFN-β1; these seven alignments have Π scores ≥ 3 standard deviations above the mean score obtained in comparisons between IFN-β1 and random nucleotide sequences. One of these alignments (section 7) has a Π score 8.02 standard deviations above this mean score. The likelihood of finding an alignment statement as good as that in section 7 in a random sequence the length of the human genome is approximately 10⁻⁷. Furthermore, the λB3 DNA sequence in section 7 selects the human IFN-β1 gene as the most significant alignment in computer searches of mammalian nucleotide sequence databases.
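The general idea of scoring an alignment by its runs of identities, and then expressing that score in standard deviations above a random-sequence background, can be illustrated with a minimal Python sketch. The abstract does not give the formula for Π or the details of the Sellers TT algorithm, so neither is reproduced here. As a loudly labeled stand-in, the sketch uses only the longest identity run as the statistic and a simple position-by-position comparison against random sequences; the function names `identity_runs`, `longest_run`, `random_background`, and `z_score` are hypothetical.

```python
import math
import random
from itertools import groupby


def identity_runs(a, b):
    """Return the lengths of maximal runs of identical positions in two
    equal-length aligned strings. Gap characters ('-') never count as
    identities."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = [x == y and x != "-" for x, y in zip(a, b)]
    return [sum(1 for _ in grp) for is_match, grp in groupby(matches) if is_match]


def longest_run(a, b):
    """Stand-in statistic (NOT the paper's pattern score): the length of
    the longest identity run, or 0 if there are no identities."""
    return max(identity_runs(a, b), default=0)


def random_background(query, n, seed=0):
    """Stand-in background (assumption): longest-run values for `query`
    compared position by position with n uniform random sequences."""
    rng = random.Random(seed)
    return [
        longest_run(query, "".join(rng.choice("ACGT") for _ in query))
        for _ in range(n)
    ]


def z_score(value, background):
    """Number of sample standard deviations by which `value` exceeds the
    mean of `background`."""
    mean = sum(background) / len(background)
    var = sum((x - mean) ** 2 for x in background) / (len(background) - 1)
    return (value - mean) / math.sqrt(var)
```

Under these assumptions, `z_score(longest_run(a, b), random_background(a, 1000))` gives a rough analogue of the "standard deviations above the mean" figure quoted in the abstract, although the actual Π score also includes a correction for the shorter runs.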

Original language: English (US)
Pages (from-to): 4090-4094
Number of pages: 5
Journal: Proceedings of the National Academy of Sciences of the United States of America
Issue number: 12
State: Published - Jan 1 1985
Externally published: Yes


All Science Journal Classification (ASJC) codes

  • General
